# 1. Introduction

In the previous two lessons, we've learned a lot about joining data. We've gone from creating basic joins between two tables to making complex joins using multiple tables, subqueries, unusual join types and aggregate functions.

In this lesson, we're going to continue to practice constructing complex joins, while also learning how to:

* Build and format queries for readability
* Create named subqueries and views
* Combine data using set operations

As in the previous lesson, we'll be working with the Chinook database. Its schema is provided again below for easy reference.

![](https://s3.amazonaws.com/dq-content/190/chinook-schema.svg)

# 2. Writing Readable Queries

"Code is read much more often than it is written, so plan accordingly.

"Even if you don't intend anybody else to read your code, there's still a very good chance that somebody will have to stare at your code and figure out what it does: That person is probably going to be you, twelve months from now."

—Raymond Chen

Often quoted and paraphrased, this philosophy is especially important when writing SQL, where queries can quickly become visually complex. Writing queries that are easy to understand takes a little extra time now, but it saves time when you come back to old queries, and it helps your colleagues when you're working in a data team.

One obvious place to start is capitalization and whitespace. Because SQL ignores extra whitespace and line breaks between keywords and identifiers, we can use them to convey the structure of a complex query. Let's compare the same query written twice, first without whitespace and capitalization:

    select ta.artist_name artist, count(*) tracks_sold from invoice_line il
    inner join (select t.track_id, ar.name artist_name from track t
    inner join album al on al.album_id = t.album_id
    inner join artist ar on ar.artist_id = al.artist_id) ta
    on ta.track_id = il.track_id group by 1 order by 2 desc limit 10;
    
And now, with whitespace and capitalization:

    SELECT
        ta.artist_name artist,
        COUNT(*) tracks_sold
    FROM invoice_line il
    INNER JOIN (
                SELECT
                    t.track_id,
                    ar.name artist_name
                FROM track t
                INNER JOIN album al ON al.album_id = t.album_id
                INNER JOIN artist ar ON ar.artist_id = al.artist_id
               ) ta
               ON ta.track_id = il.track_id
    GROUP BY 1
    ORDER BY 2 DESC
    LIMIT 10;
    
As you can see, a little time put into whitespace and capitalization pays off. A few tips to help make your queries more readable:

* If a SELECT statement has more than one column, put each on a new line, indented from the SELECT keyword.
* Always capitalize SQL function names and keywords.
* Put each clause of your query on a new line.
* Use indenting to make subqueries appear logically separate.
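
To see that these tips change only how a query reads, not what it returns, here is a minimal sketch in Python using the standard-library sqlite3 module. The tiny track table and its rows are made up for illustration and are not the Chinook data.

```python
import sqlite3

# Hypothetical miniature table, invented for this sketch (not the Chinook data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE track (track_id INTEGER PRIMARY KEY, name TEXT, genre TEXT);
    INSERT INTO track VALUES
        (1, 'Song A', 'Rock'),
        (2, 'Song B', 'Rock'),
        (3, 'Song C', 'Jazz');
""")

# The same query, once crammed onto one line and once laid out per the tips.
compact = "select genre, count(*) tracks from track group by 1 order by 2 desc limit 10;"

formatted = """
    SELECT
        genre,
        COUNT(*) tracks
    FROM track
    GROUP BY 1
    ORDER BY 2 DESC
    LIMIT 10;
"""

compact_rows = conn.execute(compact).fetchall()
formatted_rows = conn.execute(formatted).fetchall()

print(compact_rows == formatted_rows)  # True
print(formatted_rows)                  # [('Rock', 2), ('Jazz', 1)]
```

Both versions return the same rows, so the only cost of the formatted version is a few extra lines.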

In [7]:
%%capture

%load_ext sql

%sql sqlite:///chinook.db

# 3. The With Clause

In [30]:
%%sql

WITH playlist_info AS
    (
     SELECT
         p.playlist_id,
         p.name AS playlist_name,
         t.name AS track_name,
         (t.milliseconds / 1000.0) AS length_seconds
     FROM playlist AS p
     LEFT JOIN playlist_track AS pt ON pt.playlist_id = p.playlist_id
     LEFT JOIN track AS t ON t.track_id = pt.track_id
    )
SELECT
    playlist_id,
    playlist_name,
    COUNT(track_name) AS number_of_tracks,
    SUM(length_seconds) AS length_seconds
FROM playlist_info
GROUP BY playlist_id, playlist_name
ORDER BY playlist_id ASC

 * sqlite:///chinook.db
Done.


playlist_id,playlist_name,number_of_tracks,length_seconds
1,Music,3290,877683.0829999988
2,Movies,0,
3,TV Shows,213,501094.95700000005
4,Audiobooks,0,
5,90’s Music,1477,398705.153
6,Audiobooks,0,
7,Movies,0,
8,Music,3290,877683.0829999988
9,Music Videos,1,294.294
10,TV Shows,213,501094.95700000005
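
As a rough illustration of how a WITH clause behaves, the sketch below builds a cut-down version of the playlist tables in an in-memory SQLite database. The rows and the name playlist_counts are assumptions made for this example. It runs a query with a named subquery, then shows that the name is gone once that query finishes.

```python
import sqlite3

# Hypothetical cut-down tables, invented for this sketch (not the Chinook data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE playlist (playlist_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE playlist_track (playlist_id INTEGER, track_id INTEGER);
    INSERT INTO playlist VALUES (1, 'Music'), (2, 'Movies');
    INSERT INTO playlist_track VALUES (1, 10), (1, 11);
""")

with_query = """
    WITH playlist_counts AS (
        SELECT
            p.playlist_id,
            p.name AS playlist_name,
            COUNT(pt.track_id) AS number_of_tracks
        FROM playlist AS p
        LEFT JOIN playlist_track AS pt ON pt.playlist_id = p.playlist_id
        GROUP BY p.playlist_id, p.name
    )
    SELECT playlist_name, number_of_tracks
    FROM playlist_counts
    ORDER BY playlist_id;
"""
rows = conn.execute(with_query).fetchall()
print(rows)  # [('Music', 2), ('Movies', 0)]

# The named subquery only exists inside the statement that defines it,
# so a separate query cannot see it.
try:
    conn.execute("SELECT * FROM playlist_counts;")
    name_survived = True
except sqlite3.OperationalError:
    name_survived = False
print(name_survived)  # False
```

The LEFT JOIN is what keeps the empty playlist in the result with a count of 0, which matches the empty playlists in the output above.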


# 4. Creating Views

When we use the WITH clause, we're creating a temporary named subquery that we can use only within that query. But what if we find ourselves using the same WITH clause in lots of different queries? It would be nice to permanently define a subquery that we can use again and again.

We do this by creating a view, which we can then use in all future queries. An easy way to think of this is that the WITH clause creates a temporary view. The syntax for creating a view is:

    CREATE VIEW database.view_name AS
        SELECT * FROM database.table;
        
In these examples, we specify the database name using [database name].[view or table name] syntax instead of just [view or table name]. You'll need this syntax whenever you work with views here, because we have manually attached the database. If you're working with SQLite on your local machine, or in one of our Jupyter projects, you don't need to specify the database name, as in the following example:

    CREATE VIEW view_name AS
        SELECT * FROM table;
        
Here's an example of how to create a view called customer_2, identical to the existing customer table:

    CREATE VIEW chinook.customer_2 AS
        SELECT * FROM chinook.customer;
        
If we wanted to modify this view and tried to redefine it, we'd get an error.

To redefine a view, we first have to delete, or drop, the existing view:

    DROP VIEW chinook.customer_2;
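
The sequence above can be tried out with a small sketch like the following, which uses Python's sqlite3 module and an in-memory database. The customer table here is a made-up, three-column stand-in, and the view is created without a database prefix, as you would on a local machine.

```python
import sqlite3

# Hypothetical stand-in for the customer table, invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, first_name TEXT, country TEXT);
    INSERT INTO customer VALUES (1, 'Ana', 'Brazil'), (2, 'Bob', 'USA');
    CREATE VIEW customer_2 AS
        SELECT * FROM customer;
""")

# Trying to redefine an existing view fails.
try:
    conn.execute("CREATE VIEW customer_2 AS SELECT customer_id FROM customer;")
    redefine_error = None
except sqlite3.OperationalError as exc:
    redefine_error = str(exc)
print(redefine_error)  # an "already exists" message; exact wording depends on the SQLite version

# Dropping the view first lets us create it again with a new definition.
conn.execute("DROP VIEW customer_2;")
conn.execute("CREATE VIEW customer_2 AS SELECT customer_id, first_name FROM customer;")
rows = conn.execute("SELECT * FROM customer_2 ORDER BY customer_id;").fetchall()
print(rows)  # [(1, 'Ana'), (2, 'Bob')]
```

Using DROP VIEW IF EXISTS instead of DROP VIEW makes the same step safe to rerun when the view has not been created yet.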

In [94]:
%%sql

DROP VIEW IF EXISTS customer_gt_90_dollars;

 * sqlite:///chinook.db
Done.


[]

In [95]:
%%sql

CREATE VIEW customer_gt_90_dollars AS
    SELECT c.*
    FROM customer AS c
    INNER JOIN invoice AS i ON i.customer_id = c.customer_id
    GROUP BY c.customer_id
    HAVING SUM(i.total) > 90;

SELECT * FROM customer_gt_90_dollars
LIMIT 10;

 * sqlite:///chinook.db
Done.
Done.


customer_id,first_name,last_name,company,address,city,state,country,postal_code,phone,fax,email,support_rep_id
1,Luís,Gonçalves,Embraer - Empresa Brasileira de Aeronáutica S.A.,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,+55 (12) 3923-5555,+55 (12) 3923-5566,luisg@embraer.com.br,3
3,François,Tremblay,,1498 rue Bélanger,Montréal,QC,Canada,H2G 1A7,+1 (514) 721-4711,,ftremblay@gmail.com,3
5,František,Wichterlová,JetBrains s.r.o.,Klanova 9/506,Prague,,Czech Republic,14700,+420 2 4172 5555,+420 2 4172 5555,frantisekw@jetbrains.com,4
6,Helena,Holý,,Rilská 3174/6,Prague,,Czech Republic,14300,+420 2 4177 0449,,hholy@gmail.com,5
13,Fernanda,Ramos,,Qe 7 Bloco G,Brasília,DF,Brazil,71020-677,+55 (61) 3363-5547,+55 (61) 3363-7855,fernadaramos4@uol.com.br,4
17,Jack,Smith,Microsoft Corporation,1 Microsoft Way,Redmond,WA,USA,98052-8300,+1 (425) 882-8080,+1 (425) 882-8081,jacksmith@microsoft.com,5
20,Dan,Miller,,541 Del Medio Avenue,Mountain View,CA,USA,94040-111,+1 (650) 644-3358,,dmiller@comcast.com,4
21,Kathy,Chase,,801 W 4th Street,Reno,NV,USA,89503,+1 (775) 223-7665,,kachase@hotmail.com,5
22,Heather,Leacock,,120 S Orange Ave,Orlando,FL,USA,32801,+1 (407) 999-7788,,hleacock@gmail.com,4
30,Edward,Francis,,230 Elgin Street,Ottawa,ON,Canada,K2P 1L7,+1 (613) 234-3322,,edfrancis@yachoo.ca,3


# 5. Combining Rows With Union

In [96]:
%%sql

CREATE VIEW customer_usa AS
    SELECT * FROM customer
    WHERE country = 'USA';

 * sqlite:///chinook.db
Done.


[]

In [112]:
%%sql

SELECT * FROM customer_gt_90_dollars

UNION

SELECT * FROM customer_usa
LIMIT 5

 * sqlite:///chinook.db
Done.


customer_id,first_name,last_name,company,address,city,state,country,postal_code,phone,fax,email,support_rep_id
1,Luís,Gonçalves,Embraer - Empresa Brasileira de Aeronáutica S.A.,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,+55 (12) 3923-5555,+55 (12) 3923-5566,luisg@embraer.com.br,3
3,François,Tremblay,,1498 rue Bélanger,Montréal,QC,Canada,H2G 1A7,+1 (514) 721-4711,,ftremblay@gmail.com,3
5,František,Wichterlová,JetBrains s.r.o.,Klanova 9/506,Prague,,Czech Republic,14700,+420 2 4172 5555,+420 2 4172 5555,frantisekw@jetbrains.com,4
6,Helena,Holý,,Rilská 3174/6,Prague,,Czech Republic,14300,+420 2 4177 0449,,hholy@gmail.com,5
13,Fernanda,Ramos,,Qe 7 Bloco G,Brasília,DF,Brazil,71020-677,+55 (61) 3363-5547,+55 (61) 3363-7855,fernadaramos4@uol.com.br,4


# 6. Combining Rows Using Intersect and Except

The three scenarios we discussed at the start of the previous screen were:

* Customers who are in the USA or have spent more than \$90
* Customers who are in the USA and have spent more than \$90
* Customers who are in the USA and have not spent more than \$90

We just used UNION for the first scenario, but what about the other two? Two other operators will help us with these: INTERSECT and EXCEPT. Together, these three operators allow us to perform set operations in SQL. Here's a diagram and explanation of how they compare with UNION.

![](https://s3.amazonaws.com/dq-content/190/set_operations.svg)
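
To make the three operators concrete, here is a small sketch using Python's sqlite3 module. The customers and invoice totals are invented for this example. The two views are defined the same way as the customer_gt_90_dollars and customer_usa views from earlier, only over this toy data.

```python
import sqlite3

# Hypothetical toy data, invented for this sketch (not the Chinook data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, first_name TEXT, country TEXT);
    CREATE TABLE invoice (invoice_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customer VALUES
        (1, 'Ana', 'Brazil'), (2, 'Bob', 'USA'), (3, 'Cat', 'USA'), (4, 'Dan', 'Canada');
    INSERT INTO invoice VALUES
        (1, 1, 60.0), (2, 1, 50.0),   -- Ana: 110 in total
        (3, 2, 95.0),                 -- Bob: 95 in total
        (4, 3, 20.0),                 -- Cat: 20 in total
        (5, 4, 10.0);                 -- Dan: 10 in total

    CREATE VIEW customer_gt_90_dollars AS
        SELECT c.*
        FROM customer AS c
        INNER JOIN invoice AS i ON i.customer_id = c.customer_id
        GROUP BY c.customer_id
        HAVING SUM(i.total) > 90;

    CREATE VIEW customer_usa AS
        SELECT * FROM customer
        WHERE country = 'USA';
""")

def names(operator):
    """Combine the two views with the given set operator and return first names."""
    query = f"""
        SELECT first_name FROM customer_usa
        {operator}
        SELECT first_name FROM customer_gt_90_dollars
        ORDER BY first_name;
    """
    return [row[0] for row in conn.execute(query)]

print(names("UNION"))      # in the USA or spent more than $90: ['Ana', 'Bob', 'Cat']
print(names("INTERSECT"))  # in the USA and spent more than $90: ['Bob']
print(names("EXCEPT"))     # in the USA and not spent more than $90: ['Cat']
```

Note that EXCEPT depends on the order of the two queries: swapping them would instead return customers who spent more than \$90 and are not in the USA.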

The results of UNION, INTERSECT and EXCEPT conform to the 'everything in SQL is a table' concept we learned in the SQL fundamentals course. The results of these operations can be used in subqueries and joined to other tables for more complex analysis. Let's look at a scenario where we'll need to join the results of a set operation to another table:



In [129]:
%%sql


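WITH customer_usa_gt_90 AS (
    SELECT * FROM customer_gt_90_dollars
    INTERSECT
    SELECT * FROM customer_usa
)
-- Assumed main query, inferred from the employee_name output below.
SELECT DISTINCT (e.first_name || ' ' || e.last_name) AS employee_name
FROM customer_usa_gt_90 AS c
LEFT JOIN employee AS e ON e.employee_id = c.support_rep_id
WHERE e.title = 'Sales Support Agent'
ORDER BY employee_name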

 * sqlite:///chinook.db
Done.


employee_name
Jane Peacock
Margaret Park
Steve Johnson


In [131]:
%%sql

SELECT DISTINCT (e.first_name || ' ' || e.last_name) AS employee_name
FROM customer_gt_90_dollars AS c90
LEFT JOIN employee AS e ON e.employee_id = c90.support_rep_id
WHERE e.title = 'Sales Support Agent'

INTERSECT

SELECT DISTINCT (e.first_name || ' ' || e.last_name) AS employee_name
FROM customer_usa AS cusa
LEFT JOIN employee AS e ON e.employee_id = cusa.support_rep_id
WHERE e.title = 'Sales Support Agent'
ORDER BY employee_name

 * sqlite:///chinook.db
Done.


employee_name
Jane Peacock
Margaret Park
Steve Johnson
