Code is read much more often than it is written, so plan accordingly.

"Even if you don't intend anybody else to read your code, there's still a very good chance that somebody will have to stare at your code and figure out what it does: That person is probably going to be you, twelve months from now."

—Raymond Chen

One obvious place to start when writing queries is capitalization and whitespace. Because whitespace has no meaning in SQL, we can use it to help convey the structure of a complex query.

A few tips to help make our queries more readable:

* If a `SELECT` statement has more than one column, put each on a new line, indented from the `SELECT` keyword.
* Always **capitalize SQL function names and keywords**.
* Put each **clause** of our query on a **new line**.
* Use indenting to make subqueries appear logically separate.
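
As a rough illustration of these tips, the sketch below runs the same query twice against a throwaway in-memory SQLite database: once crammed onto a single line, and once formatted. The `track` table here and its three rows are made up for the example and are not the Chinook data.

```python
import sqlite3

# Throwaway in-memory database; the table and rows are invented for illustration.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE track (name TEXT, genre TEXT, milliseconds INTEGER)")
demo.executemany(
    "INSERT INTO track VALUES (?, ?, ?)",
    [("Song A", "Rock", 200000), ("Song B", "Rock", 180000), ("Song C", "Jazz", 240000)],
)

# Hard to scan: everything on one line, lowercase keywords.
unformatted = "select genre, count(*) n_tracks, sum(milliseconds) / 1000 seconds from track group by genre order by genre"

# Same query: capitalized keywords, one column per line, one clause per line.
formatted = """
SELECT
    genre,
    COUNT(*) n_tracks,
    SUM(milliseconds) / 1000 seconds
FROM track
GROUP BY genre
ORDER BY genre
"""

rows_unformatted = demo.execute(unformatted).fetchall()
rows_formatted = demo.execute(formatted).fetchall()
print(rows_formatted)
```

Both strings are the same query as far as SQLite is concerned; only the human reader benefits from the second form.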

Another important consideration when writing readable queries is the use of **alias names** and **shortcuts**. **Aliases** should be clear; a common convention is to use the **first letter** of the table name.

If we work in a team, we might consider adopting a [SQL style guide](https://www.sqlstyle.guide/), but remember that readability is more important than consistency. If we have a complex query and we think breaking the style guide will make it more readable, we should do it.

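When constructing complex queries, it's often useful to build an intermediate table and then query it to produce our final results. Nesting that intermediate table as a subquery in the middle of the main query can make the whole thing hard to read.

One way to alleviate this is to use a **WITH** clause. **WITH** clauses allow us to define one or more named subqueries before the start of the main query.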

The syntax for the `WITH` clause is relatively straightforward.

`WITH [alias_name] AS ([subquery])
SELECT [main_query]`
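
To make the template concrete, here is a minimal sketch that fills it in against an in-memory SQLite database. The `invoice` table, its rows, and the `customer_totals` alias are assumptions made up for this example.

```python
import sqlite3

# In-memory database with a tiny, invented invoice table.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE invoice (customer_id INTEGER, total REAL)")
demo.executemany(
    "INSERT INTO invoice VALUES (?, ?)",
    [(1, 10.0), (1, 5.0), (2, 20.0)],
)

query = """
WITH customer_totals AS (
    -- The named subquery: one row per customer with their total spend.
    SELECT
        customer_id,
        SUM(total) total_spent
    FROM invoice
    GROUP BY customer_id
)
-- The main query reads from the named subquery as if it were a table.
SELECT
    customer_id,
    total_spent
FROM customer_totals
WHERE total_spent > 12
ORDER BY customer_id
"""
big_spenders = demo.execute(query).fetchall()
print(big_spenders)
```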

In [24]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import sqlite3 as sql

In [25]:
conn = sql.connect("chinook.db")

def read_query(q):
    return pd.read_sql_query(q, conn)

In [26]:
# Create a query that shows summary data for every playlist in the Chinook database:
# Use a WITH clause to create a named subquery with the following info:
# The unique ID for the playlist.
# The name of the playlist.
# The name of each track from the playlist.
# The length of each track in seconds.
# Our final table should have the following columns, in order:
# playlist_id - the unique ID for the playlist.
# playlist_name - The name of the playlist.
# number_of_tracks - A count of the number of tracks in the playlist.
# length_seconds - The sum of the length of the playlist in seconds.
# The results should be sorted by playlist_id in ascending order.

q = """
WITH playlist_info AS (
    SELECT
        pl.playlist_id playlist_id,
        pl.name playlist_name,
        t.name track_name,
        (t.milliseconds / 1000) length_seconds
    FROM playlist pl
    LEFT JOIN playlist_track plt ON pl.playlist_id = plt.playlist_id
    LEFT JOIN track t ON plt.track_id = t.track_id
)
SELECT
    playlist_id,
    playlist_name,
    COUNT(track_name) number_of_tracks,
    SUM(length_seconds) length_seconds
FROM playlist_info
GROUP BY 1, 2
ORDER BY 1
"""

read_query(q)

Unnamed: 0,playlist_id,playlist_name,number_of_tracks,length_seconds
0,1,Music,3290,876049.0
1,2,Movies,0,
2,3,TV Shows,213,500987.0
3,4,Audiobooks,0,
4,5,90’s Music,1477,397970.0
5,6,Audiobooks,0,
6,7,Movies,0,
7,8,Music,3290,876049.0
8,9,Music Videos,1,294.0
9,10,TV Shows,213,500987.0


When we use the `WITH` clause, we're creating a temporary named subquery that we can use only within that query. But what if we find ourselves using the same `WITH` with lots of different queries? It would be nice to permanently define a subquery that we can use again and again.

We do this by creating a `view`, which we can then use in all future queries. An easy way to think of this is that the `WITH` clause creates a temporary view. The syntax for creating a `view` is:

`CREATE VIEW database.view_name AS
    SELECT * FROM database.table;`

We'll be specifying the database name using `[database name].[view or table name]` syntax instead of just `[view or table name]`. We need this qualified form for any views because we have [manually attached the database](https://sqlite.org/lang_attach.html). If we're working with SQLite on our local machine, we don't need to specify the database name.
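
As a small sketch of what "attaching" means, the snippet below attaches a second in-memory database under the name `chinook` and then refers to its objects with the qualified `chinook.` prefix. The table contents and the `customer_names` view are invented for the example; a real setup would attach a database file instead.

```python
import sqlite3

demo = sqlite3.connect(":memory:")

# Attach a second (also in-memory) database under the schema name "chinook".
demo.execute("ATTACH DATABASE ':memory:' AS chinook")

# Objects in the attached database are addressed as chinook.<name>.
demo.execute("CREATE TABLE chinook.customer (customer_id INTEGER, first_name TEXT)")
demo.execute("INSERT INTO chinook.customer VALUES (1, 'Ada'), (2, 'Grace')")

# The view lives in the attached database and reads from a table in that same database.
demo.execute(
    "CREATE VIEW chinook.customer_names AS "
    "SELECT customer_id, first_name FROM chinook.customer"
)

attached_rows = demo.execute(
    "SELECT * FROM chinook.customer_names ORDER BY customer_id"
).fetchall()
print(attached_rows)
```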

Here's an example of how to create a view called `customer_2`, identical to the existing customer table:

`CREATE VIEW chinook.customer_2 AS
    SELECT * FROM chinook.customer;`
    
If we wanted to modify this view, and tried to redefine it, we'd get an error:

`CREATE VIEW chinook.customer_2 AS
    SELECT
        customer_id,
        first_name || last_name name,
        phone,
        email,
        support_rep_id
    FROM chinook.customer;`
    
Error: table customer_2 already exists

If we wish to redefine a view, we first have to delete, or drop, the existing view:

`DROP VIEW chinook.customer_2;`
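
The create, fail, drop, and recreate sequence can be sketched end to end with Python's `sqlite3` module. This uses an in-memory database with a one-row, made-up `customer` table rather than Chinook, and it omits the database prefix because nothing is attached. The exact wording of the error message may differ between SQLite versions.

```python
import sqlite3

demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE customer (customer_id INTEGER, first_name TEXT, last_name TEXT)")
demo.execute("INSERT INTO customer VALUES (1, 'Ada', 'Lovelace')")
demo.execute("CREATE VIEW customer_2 AS SELECT * FROM customer")

new_definition = """
CREATE VIEW customer_2 AS
    SELECT
        customer_id,
        first_name || ' ' || last_name name
    FROM customer
"""

# Redefining an existing view raises an error...
try:
    demo.execute(new_definition)
    redefine_failed = False
except sqlite3.OperationalError as err:
    redefine_failed = True
    print("Error:", err)

# ...so drop the old view first, then create the new one.
demo.execute("DROP VIEW customer_2")
demo.execute(new_definition)

view_rows = demo.execute("SELECT * FROM customer_2").fetchall()
print(view_rows)
```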

In [33]:
# Create a view called customer_gt_90_dollars:
# The view should contain the columns from customers, in their original order.
# The view should contain only customers who have purchased more than $90 in tracks from the store.
# After the SQL query that creates the view, write a second query to display your newly created view: SELECT * FROM chinook.customer_gt_90_dollars;.
# Make sure you use a semicolon (;) to indicate the end of each query.


q = """
CREATE VIEW customer_gt_90_dollars AS
    SELECT c.*
    FROM customer c
    LEFT JOIN invoice inv ON inv.customer_id = c.customer_id
    GROUP BY c.customer_id
    HAVING SUM(inv.total) > 90;
"""

# A bare string literal is never sent to SQLite, so execute the statement explicitly.
conn.execute(q)
conn.commit()

In [37]:
read_query("""SELECT * from customer_gt_90_dollars;""")

Unnamed: 0,customer_id,first_name,last_name,company,address,city,state,country,postal_code,phone,fax,email,support_rep_id
0,1,Luís,Gonçalves,Embraer - Empresa Brasileira de Aeronáutica S.A.,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,+55 (12) 3923-5555,+55 (12) 3923-5566,luisg@embraer.com.br,3
1,3,François,Tremblay,,1498 rue Bélanger,Montréal,QC,Canada,H2G 1A7,+1 (514) 721-4711,,ftremblay@gmail.com,3
2,5,František,Wichterlová,JetBrains s.r.o.,Klanova 9/506,Prague,,Czech Republic,14700,+420 2 4172 5555,+420 2 4172 5555,frantisekw@jetbrains.com,4
3,6,Helena,Holý,,Rilská 3174/6,Prague,,Czech Republic,14300,+420 2 4177 0449,,hholy@gmail.com,5
4,13,Fernanda,Ramos,,Qe 7 Bloco G,Brasília,DF,Brazil,71020-677,+55 (61) 3363-5547,+55 (61) 3363-7855,fernadaramos4@uol.com.br,4
5,17,Jack,Smith,Microsoft Corporation,1 Microsoft Way,Redmond,WA,USA,98052-8300,+1 (425) 882-8080,+1 (425) 882-8081,jacksmith@microsoft.com,5
6,20,Dan,Miller,,541 Del Medio Avenue,Mountain View,CA,USA,94040-111,+1 (650) 644-3358,,dmiller@comcast.com,4
7,21,Kathy,Chase,,801 W 4th Street,Reno,NV,USA,89503,+1 (775) 223-7665,,kachase@hotmail.com,5
8,22,Heather,Leacock,,120 S Orange Ave,Orlando,FL,USA,32801,+1 (407) 999-7788,,hleacock@gmail.com,4
9,30,Edward,Francis,,230 Elgin Street,Ottawa,ON,Canada,K2P 1L7,+1 (613) 234-3322,,edfrancis@yachoo.ca,3


In [None]:
# Alternative to the query above, using the [database name].[view name] syntax

"""CREATE VIEW chinook.customer_gt_90_dollars AS 
    SELECT
        c.*
    FROM chinook.invoice i
    INNER JOIN chinook.customer c ON i.customer_id = c.customer_id
    GROUP BY 1
    HAVING SUM(i.total) > 90;"""

"""SELECT * FROM chinook.customer_gt_90_dollars;"""

In [36]:
# "Drop View customer_gt_90_dollars"

In [42]:
# customers that live in the USA.

q = """
CREATE VIEW customer_usa AS
    SELECT *
    FROM customer
    WHERE country = 'USA';
"""

# Execute the statement; a bare string literal would not create the view.
conn.execute(q)
conn.commit()

# Alternate

# CREATE VIEW chinook.customer_usa AS 
#      SELECT * FROM chinook.customer
#      WHERE country = 'USA';

In [43]:
q = """Select * From customer_usa"""
read_query(q)

Unnamed: 0,customer_id,first_name,last_name,company,address,city,state,country,postal_code,phone,fax,email,support_rep_id
0,16,Frank,Harris,Google Inc.,1600 Amphitheatre Parkway,Mountain View,CA,USA,94043-1351,+1 (650) 253-0000,+1 (650) 253-0000,fharris@google.com,4
1,17,Jack,Smith,Microsoft Corporation,1 Microsoft Way,Redmond,WA,USA,98052-8300,+1 (425) 882-8080,+1 (425) 882-8081,jacksmith@microsoft.com,5
2,18,Michelle,Brooks,,627 Broadway,New York,NY,USA,10012-2612,+1 (212) 221-3546,+1 (212) 221-4679,michelleb@aol.com,3
3,19,Tim,Goyer,Apple Inc.,1 Infinite Loop,Cupertino,CA,USA,95014,+1 (408) 996-1010,+1 (408) 996-1011,tgoyer@apple.com,3
4,20,Dan,Miller,,541 Del Medio Avenue,Mountain View,CA,USA,94040-111,+1 (650) 644-3358,,dmiller@comcast.com,4
5,21,Kathy,Chase,,801 W 4th Street,Reno,NV,USA,89503,+1 (775) 223-7665,,kachase@hotmail.com,5
6,22,Heather,Leacock,,120 S Orange Ave,Orlando,FL,USA,32801,+1 (407) 999-7788,,hleacock@gmail.com,4
7,23,John,Gordon,,69 Salem Street,Boston,MA,USA,2113,+1 (617) 522-1333,,johngordon22@yahoo.com,4
8,24,Frank,Ralston,,162 E Superior Street,Chicago,IL,USA,60611,+1 (312) 332-3232,,fralston@gmail.com,3
9,25,Victor,Stevens,,319 N. Frances Street,Madison,WI,USA,53703,+1 (608) 257-0597,,vstevens@yahoo.com,5


Whereas regular joins combine columns, the `UNION` operator combines rows from **tables** and/or **views**.

The syntax for the `UNION` operator is composed of two or more `SELECT` statements:

`[select_statement_one]
UNION
[select_statement_two]`

Rather than using the `ON` keyword to match rows, the statements before and after `UNION` must have the **same number of columns**, with **compatible types** in the same order. For example, `FLOAT` and `INT` are compatible types, but `FLOAT` and `TEXT` are not.

Because we created `customer_usa` and `customer_gt_90_dollars` with identical **column names**, **order**, and **type** as customer, we can safely use `UNION`.
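
The column-count requirement is easy to check with a small sketch. The two tables below (`usa` and `big_spenders`) are hypothetical stand-ins for the views, populated with made-up rows in an in-memory database. Note that `UNION` also removes duplicate rows, so a customer present on both sides appears only once.

```python
import sqlite3

demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE usa (customer_id INTEGER, name TEXT)")
demo.execute("CREATE TABLE big_spenders (customer_id INTEGER, name TEXT)")
demo.execute("INSERT INTO usa VALUES (1, 'Ada'), (2, 'Grace')")
demo.execute("INSERT INTO big_spenders VALUES (2, 'Grace'), (3, 'Linus')")

# Same number of columns in the same order: rows are stacked and duplicates removed.
union_rows = demo.execute("""
    SELECT customer_id, name FROM usa
    UNION
    SELECT customer_id, name FROM big_spenders
    ORDER BY customer_id
""").fetchall()
print(union_rows)

# A different number of columns on each side is rejected.
try:
    demo.execute(
        "SELECT customer_id, name FROM usa UNION SELECT customer_id FROM big_spenders"
    )
    mismatch_failed = False
except sqlite3.OperationalError as err:
    mismatch_failed = True
    print("Error:", err)
```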

In [44]:
# Identify customers who are in the USA or have spent more than $90,
# using UNION on the customer_usa and customer_gt_90_dollars views.

q = """
SELECT * FROM customer_usa
UNION
SELECT * FROM customer_gt_90_dollars
"""
read_query(q)


Unnamed: 0,customer_id,first_name,last_name,company,address,city,state,country,postal_code,phone,fax,email,support_rep_id
0,1,Luís,Gonçalves,Embraer - Empresa Brasileira de Aeronáutica S.A.,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,+55 (12) 3923-5555,+55 (12) 3923-5566,luisg@embraer.com.br,3
1,3,François,Tremblay,,1498 rue Bélanger,Montréal,QC,Canada,H2G 1A7,+1 (514) 721-4711,,ftremblay@gmail.com,3
2,5,František,Wichterlová,JetBrains s.r.o.,Klanova 9/506,Prague,,Czech Republic,14700,+420 2 4172 5555,+420 2 4172 5555,frantisekw@jetbrains.com,4
3,6,Helena,Holý,,Rilská 3174/6,Prague,,Czech Republic,14300,+420 2 4177 0449,,hholy@gmail.com,5
4,13,Fernanda,Ramos,,Qe 7 Bloco G,Brasília,DF,Brazil,71020-677,+55 (61) 3363-5547,+55 (61) 3363-7855,fernadaramos4@uol.com.br,4
5,16,Frank,Harris,Google Inc.,1600 Amphitheatre Parkway,Mountain View,CA,USA,94043-1351,+1 (650) 253-0000,+1 (650) 253-0000,fharris@google.com,4
6,17,Jack,Smith,Microsoft Corporation,1 Microsoft Way,Redmond,WA,USA,98052-8300,+1 (425) 882-8080,+1 (425) 882-8081,jacksmith@microsoft.com,5
7,18,Michelle,Brooks,,627 Broadway,New York,NY,USA,10012-2612,+1 (212) 221-3546,+1 (212) 221-4679,michelleb@aol.com,3
8,19,Tim,Goyer,Apple Inc.,1 Infinite Loop,Cupertino,CA,USA,95014,+1 (408) 996-1010,+1 (408) 996-1011,tgoyer@apple.com,3
9,20,Dan,Miller,,541 Del Medio Avenue,Mountain View,CA,USA,94040-111,+1 (650) 644-3358,,dmiller@comcast.com,4


We just successfully used `UNION`. Two other operators work in a similar way: `INTERSECT` and `EXCEPT`. Combined, these three operators allow us to perform set operations in SQL.

| Operator | What it Does | Python Equivalent |
|---|---|---|
| `UNION` | Selects rows that occur in either statement. | `or` |
| `INTERSECT` | Selects rows that occur in both statements. | `and` |
| `EXCEPT` | Selects rows that occur in the first statement, but don't occur in the second statement. | `and not` |

Both the syntax and the rules about column number and ordering of similar types are the same for `INTERSECT` and `EXCEPT` as they are for `UNION`.
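
One way to build intuition for the three operators is to compare them with Python's own set operations. The sketch below uses two tiny made-up tables, `a` and `b`, in an in-memory database; the `ids` helper is a hypothetical convenience written just for this example.

```python
import sqlite3

demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE a (id INTEGER)")
demo.execute("CREATE TABLE b (id INTEGER)")
demo.executemany("INSERT INTO a VALUES (?)", [(1,), (2,), (3,)])
demo.executemany("INSERT INTO b VALUES (?)", [(2,), (3,), (4,)])


def ids(operator):
    """Run `SELECT id FROM a <operator> SELECT id FROM b` and return the ids as a set."""
    sql = f"SELECT id FROM a {operator} SELECT id FROM b"
    return {row[0] for row in demo.execute(sql)}


# The same data as plain Python sets, for comparison.
set_a, set_b = {1, 2, 3}, {2, 3, 4}

results = {op: ids(op) for op in ("UNION", "INTERSECT", "EXCEPT")}
print(results)
```

Here `UNION` matches `set_a | set_b`, `INTERSECT` matches `set_a & set_b`, and `EXCEPT` matches `set_a - set_b`.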


In [45]:
# customers who are in the USA and have spent more than $90

q = """SELECT * FROM customer_usa
INTERSECT
SELECT * FROM customer_gt_90_dollars;"""

read_query(q)

Unnamed: 0,customer_id,first_name,last_name,company,address,city,state,country,postal_code,phone,fax,email,support_rep_id
0,17,Jack,Smith,Microsoft Corporation,1 Microsoft Way,Redmond,WA,USA,98052-8300,+1 (425) 882-8080,+1 (425) 882-8081,jacksmith@microsoft.com,5
1,20,Dan,Miller,,541 Del Medio Avenue,Mountain View,CA,USA,94040-111,+1 (650) 644-3358,,dmiller@comcast.com,4
2,21,Kathy,Chase,,801 W 4th Street,Reno,NV,USA,89503,+1 (775) 223-7665,,kachase@hotmail.com,5
3,22,Heather,Leacock,,120 S Orange Ave,Orlando,FL,USA,32801,+1 (407) 999-7788,,hleacock@gmail.com,4


In [46]:
# customers who are in the USA and have not spent more than $90

q = """SELECT * FROM customer_usa
    EXCEPT
    SELECT * FROM customer_gt_90_dollars;"""

read_query(q)

Unnamed: 0,customer_id,first_name,last_name,company,address,city,state,country,postal_code,phone,fax,email,support_rep_id
0,16,Frank,Harris,Google Inc.,1600 Amphitheatre Parkway,Mountain View,CA,USA,94043-1351,+1 (650) 253-0000,+1 (650) 253-0000,fharris@google.com,4
1,18,Michelle,Brooks,,627 Broadway,New York,NY,USA,10012-2612,+1 (212) 221-3546,+1 (212) 221-4679,michelleb@aol.com,3
2,19,Tim,Goyer,Apple Inc.,1 Infinite Loop,Cupertino,CA,USA,95014,+1 (408) 996-1010,+1 (408) 996-1011,tgoyer@apple.com,3
3,23,John,Gordon,,69 Salem Street,Boston,MA,USA,2113,+1 (617) 522-1333,,johngordon22@yahoo.com,4
4,24,Frank,Ralston,,162 E Superior Street,Chicago,IL,USA,60611,+1 (312) 332-3232,,fralston@gmail.com,3
5,25,Victor,Stevens,,319 N. Frances Street,Madison,WI,USA,53703,+1 (608) 257-0597,,vstevens@yahoo.com,5
6,26,Richard,Cunningham,,2211 W Berry Street,Fort Worth,TX,USA,76110,+1 (817) 924-7272,,ricunningham@hotmail.com,4
7,27,Patrick,Gray,,1033 N Park Ave,Tucson,AZ,USA,85719,+1 (520) 622-4200,,patrick.gray@aol.com,4
8,28,Julia,Barnett,,302 S 700 E,Salt Lake City,UT,USA,84102,+1 (801) 531-7272,,jubarnett@gmail.com,5


The results of `UNION`, `INTERSECT` and `EXCEPT` conform to the 'everything in SQL is a table' idea: each one returns a table. That means the results of these operations can be used in subqueries and joined to other tables for more complex analysis.

In [68]:
# Query that works out how many customers who are in the USA and have purchased more than $90 are assigned to each sales support agent.
# For the purposes of this exercise, no two employees have the same name.
# Our result should have the following columns, in order:
# employee_name - The first_name and last_name of the employee separated by a space, e.g. Luke Skywalker.
# customers_usa_gt_90 - The number of customers assigned to that employee that are both from the USA and have purchased more than $90 worth of tracks.
# The result should include all employees with the title "Sales Support Agent", but not employees with any other title.
# Order our results by the employee_name column.

q = """
WITH customers_usa_gt_90 AS (
    SELECT * FROM customer_usa
    INTERSECT
    SELECT * FROM customer_gt_90_dollars
)
SELECT
    e.first_name || ' ' || e.last_name employee_name,
    COUNT(c.customer_id) customers_usa_gt_90
FROM employee e
LEFT JOIN customers_usa_gt_90 c ON e.employee_id = c.support_rep_id
WHERE e.title = 'Sales Support Agent'
GROUP BY 1
ORDER BY 1
"""
read_query(q)

Unnamed: 0,employee_name,customers_usa_gt_90
0,Jane Peacock,0
1,Margaret Park,2
2,Steve Johnson,2


`WITH` clauses allow us to define one or more named subqueries. To define more than one, we use a single `WITH` clause and multiple, comma-separated alias/subquery pairs:

`WITH
    [alias_name] AS ([subquery]),
    [alias_name_2] AS ([subquery_2]),
    [alias_name_3] AS ([subquery_3])
SELECT [main_query]`

While each subquery can be independent, we can actually use the result of the first subquery in subsequent subqueries, and so on. This can be a useful way of building readable complex queries.
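
A minimal sketch of one named subquery building on another is shown below. The `invoice` table, its rows, and the alias names `customer_totals` and `country_totals` are all invented for this example.

```python
import sqlite3

demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE invoice (customer_id INTEGER, country TEXT, total REAL)")
demo.executemany(
    "INSERT INTO invoice VALUES (?, ?, ?)",
    [(1, "India", 10.0), (1, "India", 15.0), (2, "India", 5.0), (3, "Chile", 30.0)],
)

query = """
WITH
    customer_totals AS (
        -- First subquery: total spend per customer.
        SELECT
            customer_id,
            country,
            SUM(total) total_spent
        FROM invoice
        GROUP BY customer_id, country
    ),
    country_totals AS (
        -- Second subquery: reads from the one defined just above it.
        SELECT
            country,
            COUNT(*) n_customers,
            MAX(total_spent) top_customer_total
        FROM customer_totals
        GROUP BY country
    )
SELECT *
FROM country_totals
ORDER BY country
"""
country_rows = demo.execute(query).fetchall()
print(country_rows)
```

Each step has a name and a single job, which tends to be easier to follow than the same logic written as nested subqueries.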

In [84]:
# query that uses multiple named subqueries in a WITH clause to gather total sales data on customers from India:
# The first named subquery should return all customers that are from India.
# The second named subquery should calculate the sum total for every customer.
# The main query should join the two named subqueries, resulting in the following final columns:
# customer_name - The first_name and last_name of the customer, separated by a space, eg Luke Skywalker.
# total_purchases - The total amount spent on purchases by that customer.
# The results should be sorted by the customer_name column in alphabetical order.

q = """
WITH
    customer_india AS (
        SELECT *
        FROM customer
        WHERE country = 'India'
    ),
    sales_per_customer AS (
        SELECT
            customer_id,
            SUM(total) total
        FROM invoice
        GROUP BY 1
    )
SELECT
    c.first_name || ' ' || c.last_name customer_name,
    s.total
FROM customer_india c
LEFT JOIN sales_per_customer s ON s.customer_id = c.customer_id
ORDER BY 1
"""

read_query(q)
        
       

Unnamed: 0,customer_name,total
0,Manoj Pareek,111.87
1,Puja Srivastava,71.28


#### We will be writing a query to find the customer from each country that has spent the most money at our store.

In [101]:
q = """
WITH all_customer AS (
    SELECT
        c.country,
        c.first_name || ' ' || c.last_name customer_name,
        SUM(inv.total) total_purchased
    FROM customer c
    INNER JOIN invoice inv ON inv.customer_id = c.customer_id
    GROUP BY 2
)
-- SQLite fills the bare customer_name column from the row that holds MAX(total_purchased);
-- other databases may reject this or handle it differently.
SELECT
    country,
    customer_name,
    MAX(total_purchased) total_purchased
FROM all_customer
GROUP BY 1
ORDER BY 1
"""

read_query(q)

Unnamed: 0,country,customer_name,total_purchased
0,Argentina,Diego Gutiérrez,39.6
1,Australia,Mark Taylor,81.18
2,Austria,Astrid Gruber,69.3
3,Belgium,Daan Peeters,60.39
4,Brazil,Luís Gonçalves,108.9
5,Canada,François Tremblay,99.99
6,Chile,Luis Rojas,97.02
7,Czech Republic,František Wichterlová,144.54
8,Denmark,Kara Nielsen,37.62
9,Finland,Terhi Hämäläinen,79.2


In [1]:
# Alternative to the query above, using multiple named subqueries

q = """WITH
    customer_country_purchases AS
        (
         SELECT
             i.customer_id,
             c.country,
             SUM(i.total) total_purchases
         FROM invoice i
         INNER JOIN customer c ON i.customer_id = c.customer_id
         GROUP BY 1, 2
        ),
    country_max_purchase AS
        (
         SELECT
             country,
             MAX(total_purchases) max_purchase
         FROM customer_country_purchases
         GROUP BY 1
        ),
    country_best_customer AS
        (
         SELECT
            cmp.country,
            cmp.max_purchase,
            (
             SELECT ccp.customer_id
             FROM customer_country_purchases ccp
             WHERE ccp.country = cmp.country AND cmp.max_purchase = ccp.total_purchases
            ) customer_id
         FROM country_max_purchase cmp
        )
SELECT
    cbc.country country,
    c.first_name || " " || c.last_name customer_name,
    cbc.max_purchase total_purchased
FROM customer c
INNER JOIN country_best_customer cbc ON cbc.customer_id = c.customer_id
ORDER BY 1 ASC"""