<div align="right" style=" font-size: 80%; text-align: center; margin: 0 auto">
<img src="https://raw.githubusercontent.com/Explore-AI/Pictures/master/ExploreAI_logos/Logo blue_dark.png"  style="width:25px" align="right"/>
</div>

# Join statements
© ExploreAI Academy

In this exercise, we will explore different SQL `JOIN` statements to cross-examine data held in multiple tables of a dataset.

> ⚠️ In this exercise, we will query a sample SQLite database file called Chinook to gain some insight into the data. Ensure that you have downloaded the database file, `chinook.db`.

## Learning objectives

In this train, we will learn how to:
- Join multiple tables.
- Find common information between tables.
- Use a `LEFT JOIN` to check for missing information.
- Use `CROSS JOIN` to find all the possible combinations of required table rows.

First, let's load our sample database:


In [1]:
# Load and activate the SQL extension so that we can execute SQL in a Jupyter notebook
%load_ext sql

# Load the Chinook database stored on your local machine.
# Make sure the file is saved in the same folder as this notebook.
%sql sqlite:///chinook.db

'Connected: @chinook.db'
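
Before querying, it can help to confirm that the connection points at the downloaded Chinook file, because SQLite will normally create a new, empty database when the named file does not exist. The sketch below uses Python's built-in `sqlite3` module. `list_tables` is a hypothetical helper, and the in-memory tables are stand-ins for `chinook.db`, not its real schema.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in a SQLite connection, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# Stand-in database for illustration only. With the real file, we would
# connect with sqlite3.connect("chinook.db") instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE albums (AlbumId INTEGER PRIMARY KEY, Title TEXT)")
conn.execute("CREATE TABLE tracks (TrackId INTEGER PRIMARY KEY, Name TEXT)")

tables = list_tables(conn)
print(tables)
```

An empty list would suggest that the notebook connected to a newly created, empty file rather than the downloaded database.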

<div align="center" style=" font-size: 80%; text-align: center; margin: 0 auto">
<img src="https://raw.githubusercontent.com/Explore-AI/Pictures/master/The%20chinook%20database%20ERD.jpeg"/>
<br>
<br>
    <em>Figure 1: The Chinook database ERD</em>
</div>

## Overview
Write and run queries that answer each of the following exercises. Compare your queries with the solutions at the end of this notebook.


### Exercise 1

Sometimes artists add a title track to their albums. This is a track that has the same title as the album. Write a query that returns albums that have a title track.
Return the `AlbumId` and `Title` columns from the `albums` table and the `Name` column from the `tracks` table, for rows where the album `Title` matches the track `Name`.
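
To see how an `INNER JOIN` on a text match behaves, here is a minimal sketch on made-up data, run through Python's `sqlite3` module. The album and track names are invented for illustration, and the tables are simplified versions of the Chinook ones.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE albums (AlbumId INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE tracks (TrackId INTEGER PRIMARY KEY, Name TEXT, AlbumId INTEGER);
    INSERT INTO albums VALUES (1, 'Night Drive'), (2, 'Open Road');
    INSERT INTO tracks VALUES
        (10, 'Night Drive', 1),  -- title track of album 1
        (11, 'Headlights', 1),
        (12, 'Dust', 2);         -- album 2 has no title track
""")

# An INNER JOIN keeps only the album/track pairs that satisfy the ON condition.
title_tracks = conn.execute("""
    SELECT a.AlbumId, a.Title, t.Name
    FROM albums AS a
    INNER JOIN tracks AS t
        ON a.Title = t.Name
""").fetchall()
print(title_tracks)
```

Only album 1 is returned, because album 2 has no track whose name equals its title.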

In [2]:
%%sql

SELECT
    Title, albums.AlbumId, Tracks.Name
FROM
    albums
INNER JOIN
    Tracks
ON
    albums.Title = Tracks.Name;

 * sqlite:///chinook.db
Done.


Title,AlbumId,Name
Balls to the Wall,2,Balls to the Wall
Restless and Wild,3,Restless and Wild
Let There Be Rock,4,Let There Be Rock
Master Of Puppets,152,Master Of Puppets
Out Of Exile,11,Out Of Exile
Black Sabbath,16,Black Sabbath
Body Count,18,Body Count
Chemical Wedding,19,Chemical Wedding
Prenda Minha,21,Prenda Minha
Minha Historia,23,Minha Historia


### Exercise 2

Suppose that in the previous exercise, we were additionally interested in knowing who the artists of the listed albums are. Write a query that can achieve this.

In [22]:
%%sql

SELECT
    Title, albums.AlbumId, Tracks.Name, artists.Name
FROM
    albums
INNER JOIN
    Tracks
ON
    albums.Title = Tracks.Name
INNER JOIN
    artists
ON
    artists.ArtistId = albums.ArtistId;

 * sqlite:///chinook.db
Done.


Title,AlbumId,Name,Name_1
Balls to the Wall,2,Balls to the Wall,Accept
Restless and Wild,3,Restless and Wild,Accept
Let There Be Rock,4,Let There Be Rock,AC/DC
Out Of Exile,11,Out Of Exile,Audioslave
Black Sabbath,16,Black Sabbath,Black Sabbath
Black Sabbath,16,Black Sabbath,Black Sabbath
Body Count,18,Body Count,Body Count
Chemical Wedding,19,Chemical Wedding,Bruce Dickinson
Prenda Minha,21,Prenda Minha,Caetano Veloso
Minha Historia,23,Minha Historia,Chico Buarque


In [3]:
%%sql

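SELECT
    Name, Tracks.TrackId, invoice_items.TrackId
FROM
    Tracks
LEFT JOIN
    invoice_items
-- Exploratory: with != in the join condition, each track is paired with
-- every invoice item that belongs to a different track.
ON
    Tracks.TrackId != invoice_items.TrackId
LIMIT
    3;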

 * sqlite:///chinook.db
Done.


Name,TrackId,TrackId_1
For Those About To Rock (We Salute You),1,2
For Those About To Rock (We Salute You),1,2
For Those About To Rock (We Salute You),1,3


### Exercise 3

One use case for a `LEFT JOIN` is checking for missing information. In this case, try to find out which media items have not been bought yet (i.e. do not appear as an item in any invoice).
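
The sketch below illustrates why a `LEFT JOIN` suits this task: every row of the left table is kept, and the right-hand columns are filled with `NULL` (`None` in Python) when there is no match. The data is made up, and `invoice_items` is reduced to three columns for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tracks (TrackId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE invoice_items (
        InvoiceLineId INTEGER PRIMARY KEY,
        InvoiceId INTEGER,
        TrackId INTEGER
    );
    INSERT INTO tracks VALUES (1, 'A'), (2, 'B'), (3, 'C');
    -- Track 1 was sold twice, track 3 once, track 2 never.
    INSERT INTO invoice_items VALUES (100, 7, 1), (101, 8, 1), (102, 8, 3);
""")

rows = conn.execute("""
    SELECT t.TrackId, ii.InvoiceId
    FROM tracks AS t
    LEFT JOIN invoice_items AS ii
        ON t.TrackId = ii.TrackId
    ORDER BY t.TrackId, ii.InvoiceId
""").fetchall()
print(rows)
```

Track 2 still appears in the result, paired with `None`, while track 1 appears once per invoice line.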

In [9]:
%%sql

SELECT
    Tracks.TrackId, invoice_items.InvoiceId
FROM
    Tracks
LEFT JOIN
    invoice_items
ON
    Tracks.TrackId = invoice_items.TrackId
LIMIT
    10;


 * sqlite:///chinook.db
Done.


TrackId,InvoiceId
1,108.0
6,2.0
7,
8,2.0
8,214.0
9,108.0
9,319.0
10,2.0
11,
12,2.0


In [4]:
%%sql

SELECT
    *
FROM
    Tracks
LIMIT 5;

 * sqlite:///chinook.db
Done.


TrackId,Name,AlbumId,MediaTypeId,GenreId,Composer,Milliseconds,Bytes,UnitPrice
1,For Those About To Rock (We Salute You),1,1,1,"Angus Young, Malcolm Young, Brian Johnson",343719,11170334,0.99
2,Balls to the Wall,2,2,1,,342562,5510424,0.99
3,Fast As a Shark,3,2,1,"F. Baltes, S. Kaufman, U. Dirkscneider & W. Hoffman",230619,3990994,0.99
4,Restless and Wild,3,2,1,"F. Baltes, R.A. Smith-Diesel, S. Kaufman, U. Dirkscneider & W. Hoffman",252051,4331779,0.99
5,Princess of the Dawn,3,2,1,Deaffy & R.A. Smith-Diesel,375418,6290521,0.99


### Exercise 4

In the results, the tracks that have a value of `None` (i.e. `NULL`) for `InvoiceId` are the ones that have not been purchased yet. Add a `WHERE` clause to only focus on these 'unpopular' tracks.
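
One detail worth checking on toy data is where the `NULL` test goes. The sketch below, using invented rows and simplified tables, compares the filter in the `WHERE` clause with the same test placed inside the `ON` condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tracks (TrackId INTEGER PRIMARY KEY);
    CREATE TABLE invoice_items (InvoiceId INTEGER, TrackId INTEGER);
    INSERT INTO tracks VALUES (1), (2), (3);
    INSERT INTO invoice_items VALUES (7, 1), (8, 3);  -- track 2 was never sold
""")

# Filter in WHERE: applied after the join, so only unmatched tracks survive.
unsold = conn.execute("""
    SELECT t.TrackId
    FROM tracks AS t
    LEFT JOIN invoice_items AS ii
        ON t.TrackId = ii.TrackId
    WHERE ii.InvoiceId IS NULL
    ORDER BY t.TrackId
""").fetchall()

# Same test inside ON: it only restricts which invoice rows may match,
# and the LEFT JOIN still returns every track.
test_in_on = conn.execute("""
    SELECT t.TrackId
    FROM tracks AS t
    LEFT JOIN invoice_items AS ii
        ON t.TrackId = ii.TrackId AND ii.InvoiceId IS NULL
    ORDER BY t.TrackId
""").fetchall()

print(unsold)
print(test_in_on)
```

Only the `WHERE` version isolates the unsold track, which is why the exercise asks for a `WHERE` clause.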

In [29]:
%%sql

SELECT
    Tracks.TrackId, invoice_items.InvoiceId
FROM
    Tracks
LEFT JOIN
    invoice_items
ON
    Tracks.TrackId = invoice_items.TrackId
WHERE
    invoice_items.InvoiceId IS NULL;

 * sqlite:///chinook.db
Done.


TrackId,InvoiceId
7,
11,
17,
18,
22,
23,
27,
29,
33,
34,


### Exercise 5

Let's suppose that, as part of a new business strategy, Chinook wants to develop new product categories for their media items that are based on genre and media type. Write a query that will list all possible product categories (i.e. all possible genre and media type combinations).
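
As a rough illustration of what "all possible combinations" means, the sketch below builds two small stand-in tables (three genres and two media types, with names borrowed from the Chinook output) and compares a `CROSS JOIN` with a `UNION`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE genres (GenreId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE media_types (MediaTypeId INTEGER PRIMARY KEY, Name TEXT);
    INSERT INTO genres VALUES (1, 'Rock'), (2, 'Jazz'), (3, 'Metal');
    INSERT INTO media_types VALUES (1, 'MPEG audio file'), (2, 'AAC audio file');
""")

# CROSS JOIN pairs every genre with every media type: 3 x 2 = 6 rows.
combos = conn.execute("""
    SELECT g.Name, m.Name
    FROM genres AS g
    CROSS JOIN media_types AS m
    ORDER BY g.GenreId, m.MediaTypeId
""").fetchall()

# UNION stacks the two lists of names instead: 3 + 2 = 5 rows.
stacked = conn.execute("""
    SELECT Name FROM genres
    UNION
    SELECT Name FROM media_types
""").fetchall()

print(len(combos), len(stacked))
```

The row count of a `CROSS JOIN` is the product of the two table sizes, so it grows quickly on larger tables.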

In [39]:
%%sql

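SELECT
    *
FROM
    media_types
UNION
SELECT
    *
FROM
    genres
-- First attempt: UNION stacks the rows of the two tables on top of each
-- other. It does not pair each media type with each genre.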

 * sqlite:///chinook.db
Done.


MediaTypeId,Name
1,MPEG audio file
1,Rock
2,Jazz
2,Protected AAC audio file
3,Metal
3,Protected MPEG-4 video file
4,Alternative & Punk
4,Purchased AAC audio file
5,AAC audio file
5,Rock And Roll


In [42]:
%%sql

SELECT
    *
FROM
    media_types
CROSS JOIN
    genres

 * sqlite:///chinook.db
Done.


MediaTypeId,Name,GenreId,Name_1
1,MPEG audio file,1,Rock
1,MPEG audio file,2,Jazz
1,MPEG audio file,3,Metal
1,MPEG audio file,4,Alternative & Punk
1,MPEG audio file,5,Rock And Roll
1,MPEG audio file,6,Blues
1,MPEG audio file,7,Latin
1,MPEG audio file,8,Reggae
1,MPEG audio file,9,Pop
1,MPEG audio file,10,Soundtrack


## Solutions

### Exercise 1

In [None]:
%%sql

SELECT 
    a.AlbumId, 
    a.Title AS "Album Title", 
    t.Name AS "Track Name"
FROM 
    albums AS a
INNER JOIN 
    tracks AS t
    ON a.Title = t.Name
LIMIT 10;  -- Remove this line to see the full query output

### Exercise 2

In [23]:
%%sql

SELECT 
    a.AlbumId, 
    a.Title AS "Album Title", 
    t.Name AS "Track Name", 
    ar.Name AS "Artist Name"
FROM 
    albums AS a
INNER JOIN 
    tracks AS t
    ON a.Title = t.Name
INNER JOIN 
    artists AS ar
    ON ar.ArtistId = a.ArtistId
LIMIT 10;  -- Remove this line to see the full query output

 * sqlite:///chinook.db
Done.


AlbumId,Album Title,Track Name,Artist Name
2,Balls to the Wall,Balls to the Wall,Accept
3,Restless and Wild,Restless and Wild,Accept
4,Let There Be Rock,Let There Be Rock,AC/DC
11,Out Of Exile,Out Of Exile,Audioslave
16,Black Sabbath,Black Sabbath,Black Sabbath
16,Black Sabbath,Black Sabbath,Black Sabbath
18,Body Count,Body Count,Body Count
19,Chemical Wedding,Chemical Wedding,Bruce Dickinson
21,Prenda Minha,Prenda Minha,Caetano Veloso
23,Minha Historia,Minha Historia,Chico Buarque


This query returns `AlbumId` and `Title` from the `albums` table, `Name` from the `tracks` table, and `Name` from the `artists` table. Rows are matched where the album `Title` equals the track `Name`, and each album is linked to its artist where `artists.ArtistId` equals `albums.ArtistId`.
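
Because the first join condition compares only `Title` and `Name`, an album row is repeated whenever more than one track shares its title, including tracks that sit on a different album. The sketch below shows this on invented data. The extra condition `t.AlbumId = a.AlbumId` is an optional refinement that is not part of the exercise, and the artist and album names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artists (ArtistId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE albums (AlbumId INTEGER PRIMARY KEY, Title TEXT, ArtistId INTEGER);
    CREATE TABLE tracks (TrackId INTEGER PRIMARY KEY, Name TEXT, AlbumId INTEGER);
    INSERT INTO artists VALUES (1, 'The Examples'), (2, 'Placeholder Trio');
    INSERT INTO albums VALUES (1, 'Night Drive', 1), (2, 'Live Set', 2);
    INSERT INTO tracks VALUES
        (10, 'Night Drive', 1),  -- title track on its own album
        (11, 'Night Drive', 2),  -- same name, but on a different album
        (12, 'Encore', 2);
""")

# Two chained INNER JOINs; {extra} is a slot for an optional second condition.
query = """
    SELECT a.AlbumId, t.TrackId, ar.Name
    FROM albums AS a
    INNER JOIN tracks AS t
        ON a.Title = t.Name {extra}
    INNER JOIN artists AS ar
        ON ar.ArtistId = a.ArtistId
    ORDER BY t.TrackId
"""

by_name_only = conn.execute(query.format(extra="")).fetchall()
same_album = conn.execute(
    query.format(extra="AND t.AlbumId = a.AlbumId")
).fetchall()

print(by_name_only)
print(same_album)
```

Matching on the name alone returns album 1 twice, while the stricter condition keeps only the track that belongs to that album.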

### Exercise 3

In [5]:
%%sql

SELECT 
    t.TrackId, 
    ii.InvoiceId
FROM 
    tracks AS t
LEFT JOIN
    invoice_items AS ii
    ON t.TrackId = ii.TrackId
LIMIT 10;  -- Remove this line to see the full query output

 * sqlite:///chinook.db
Done.


TrackId,InvoiceId
1,108.0
6,2.0
7,
8,2.0
8,214.0
9,108.0
9,319.0
10,2.0
11,
12,2.0


### Exercise 4

In [None]:
%%sql

SELECT 
    t.TrackId, 
    ii.InvoiceId
FROM 
    tracks AS t
LEFT JOIN 
    invoice_items AS ii
    ON t.TrackId = ii.TrackId
WHERE ii.InvoiceId IS NULL
LIMIT 10;  -- Remove this line to see the full query output

### Exercise 5

In [37]:
%%sql

SELECT 
    g.Name AS "Genre", 
    m.Name AS "Media Type"
FROM 
    genres AS g
CROSS JOIN 
    media_types AS m
LIMIT 10;  -- Remove this line to see the full query output

 * sqlite:///chinook.db
Done.


Genre,Media Type
Rock,MPEG audio file
Rock,Protected AAC audio file
Rock,Protected MPEG-4 video file
Rock,Purchased AAC audio file
Rock,AAC audio file
Jazz,MPEG audio file
Jazz,Protected AAC audio file
Jazz,Protected MPEG-4 video file
Jazz,Purchased AAC audio file
Jazz,AAC audio file
