# SQL joins - Album invoices analysis

### Starting SQL

In [1]:
%%capture
%load_ext sql
%sql sqlite:///chinook.db

### Overview of tables

In [2]:
%%sql
SELECT
    name,
    type
FROM sqlite_master
WHERE type IN ('table', 'view');

 * sqlite:///chinook.db
Done.


name,type
album,table
artist,table
customer,table
employee,table
genre,table
invoice,table
invoice_line,table
media_type,table
playlist,table
playlist_track,table


### Test query

We run a quick test query to make sure the connection works before starting the analysis.

In [3]:
%%sql
SELECT
    c.customer_id,
    c.first_name,
    c.last_name,
    i.total
FROM customer c LEFT JOIN invoice i ON c.customer_id = i.customer_id
LIMIT 5

 * sqlite:///chinook.db
Done.


customer_id,first_name,last_name,total
1,Luís,Gonçalves,8.91
1,Luís,Gonçalves,5.94
1,Luís,Gonçalves,8.91
1,Luís,Gonçalves,13.86
1,Luís,Gonçalves,5.94
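
The repeated customer rows above come from the join: a LEFT JOIN returns one row per matching invoice and still keeps customers that have no invoice at all. A minimal sketch of that behavior, using Python's built-in `sqlite3` module and a hypothetical two-customer toy database (not the Chinook data):

```python
import sqlite3

# Hypothetical toy data for illustration only, not the Chinook database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE invoice (invoice_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customer VALUES (1, 'Ana'), (2, 'Ben');
    -- Only customer 1 has invoices.
    INSERT INTO invoice VALUES (10, 1, 8.91), (11, 1, 5.94);
""")

rows = conn.execute("""
    SELECT c.customer_id, c.first_name, i.total
    FROM customer c
    LEFT JOIN invoice i ON c.customer_id = i.customer_id
    ORDER BY c.customer_id, i.invoice_id
""").fetchall()

# Customer 1 appears once per invoice; customer 2 appears once with a NULL total.
for row in rows:
    print(row)
```

With an INNER JOIN, the row for the customer without invoices would be dropped instead.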


### Tracks sold per genre

We first want to look at the most popular genres in the USA.

In [4]:
%%sql
WITH 
    usa AS
        (
        SELECT * FROM invoice
        WHERE billing_country = 'USA'
        ),
    tot_tracks_sold AS
        (
        SELECT
        SUM(il.quantity) quantity_tot
        FROM invoice_line il 
        LEFT JOIN invoice i ON il.invoice_id = i.invoice_id
        WHERE i.billing_country = 'USA'
        )
    

SELECT
    t.genre_id,
    g.name genre_name,
    SUM(il.quantity) tracks_sold,
    CAST(SUM(il.quantity) as Float) / CAST((SELECT
                                            * FROM tot_tracks_sold
                                           ) as Float) percentage
FROM invoice_line il 
LEFT JOIN track t ON il.track_id = t.track_id
INNER JOIN usa ON usa.invoice_id = il.invoice_id
LEFT JOIN genre g ON g.genre_id = t.genre_id

GROUP BY 1
ORDER BY 3 DESC
LIMIT 10


 * sqlite:///chinook.db
Done.


genre_id,genre_name,tracks_sold,percentage
1,Rock,561,0.5337773549000951
4,Alternative & Punk,130,0.1236917221693625
3,Metal,124,0.1179828734538534
14,R&B/Soul,53,0.0504281636536631
6,Blues,36,0.0342530922930542
23,Alternative,35,0.033301617507136
9,Pop,22,0.0209324452901998
7,Latin,22,0.0209324452901998
17,Hip Hop/Rap,20,0.0190294957183634
2,Jazz,14,0.0133206470028544


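Rock, Alternative & Punk, and Metal are the top three genres in the USA, with Rock clearly leading at about 53% of tracks sold. Note that the `percentage` column is a fraction of 1, not a value out of 100.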
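
The share calculation divides each genre's track count by the overall total, with a cast so that SQLite does not perform integer division. A small sketch of the same idea, assuming a hypothetical single-table `sale` dataset in place of the Chinook joins:

```python
import sqlite3

# Hypothetical toy data: one row per sale line, already labeled with a genre.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sale (genre TEXT, quantity INTEGER);
    INSERT INTO sale VALUES ('Rock', 3), ('Rock', 2), ('Metal', 3), ('Pop', 2);
""")

rows = conn.execute("""
    SELECT
        genre,
        SUM(quantity) AS tracks_sold,
        -- Cast to FLOAT so the division is not truncated to an integer.
        CAST(SUM(quantity) AS FLOAT) / (SELECT SUM(quantity) FROM sale) AS share
    FROM sale
    GROUP BY genre
    ORDER BY tracks_sold DESC, genre
""").fetchall()

for genre, tracks_sold, share in rows:
    print(genre, tracks_sold, round(share * 100, 1))
```

The shares sum to 1 across all genres; multiplying by 100 gives a percentage.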

### Checking employee performance

Now we want to see if there are any employees who are performing particularly well.

In [5]:
%%sql
SELECT
    e.first_name || ' ' || e.last_name employee,
    e.hire_date,
    ROUND(SUM(i.total),2) total_sales
FROM employee e
LEFT JOIN customer c ON e.employee_id = c.support_rep_id
LEFT JOIN invoice i ON i.customer_id = c.customer_id

GROUP BY 1,2
HAVING total_sales IS NOT NULL

 * sqlite:///chinook.db
Done.


employee,hire_date,total_sales
Jane Peacock,2017-04-01 00:00:00,1731.51
Margaret Park,2017-05-03 00:00:00,1584.0
Steve Johnson,2017-10-17 00:00:00,1393.92


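Total sales are fairly close across the three employees. Jane Peacock leads with 1,731.51 and Steve Johnson trails with 1,393.92, but the ranking matches the hire dates (April, May and October 2017), so the gap may largely reflect time on the job rather than performance.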
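
Because the query starts from `employee` and uses LEFT JOINs, employees with no customers are kept and their summed sales are NULL. One way to drop them is an explicit NULL test in HAVING. The sketch below illustrates this on a hypothetical toy schema (table contents and names such as 'Rep A' are made up):

```python
import sqlite3

# Hypothetical toy data: one employee with customers, one without.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, support_rep_id INTEGER);
    CREATE TABLE invoice (invoice_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO employee VALUES (1, 'Rep A'), (2, 'Manager B');
    INSERT INTO customer VALUES (100, 1);
    INSERT INTO invoice VALUES (1000, 100, 9.5), (1001, 100, 0.5);
""")

query = """
    SELECT e.name, ROUND(SUM(i.total), 2) AS total_sales
    FROM employee e
    LEFT JOIN customer c ON e.employee_id = c.support_rep_id
    LEFT JOIN invoice i ON i.customer_id = c.customer_id
    GROUP BY e.name
    {having}
    ORDER BY e.name
"""

# Without a filter, the employee with no customers shows up with NULL sales.
all_rows = conn.execute(query.format(having="")).fetchall()
# SUM over no matching invoices is NULL, so IS NOT NULL removes that employee.
with_sales = conn.execute(
    query.format(having="HAVING SUM(i.total) IS NOT NULL")
).fetchall()

print(all_rows)
print(with_sales)
```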

### Sales by country

Next, we look at which countries generate the most sales. Countries with only one customer are grouped together as 'Other'.

In [6]:
%%sql

WITH 
    merge AS
        (
        SELECT 
            *
        FROM invoice i
        LEFT JOIN customer c ON i.customer_id = c.customer_id
        ),
    T1 AS
        (
        SELECT  
            country,
            CASE
                WHEN COUNT(DISTINCT(customer_id)) = 1 THEN 'Other'
                ELSE country
                END
                AS country_2,             
            COUNT(DISTINCT(customer_id)) customers,
            SUM(total) total_sales,
            COUNT(DISTINCT(invoice_id)) invoices,
            SUM(total) / COUNT(DISTINCT(invoice_id)) average_order,
            SUM(total) / COUNT(DISTINCT(customer_id)) customer_lifetime_value
        FROM merge 
        GROUP BY 1
        ORDER BY total_sales DESC
        ),
    T1_2 AS
        (
        SELECT 
            country_2,
            SUM(customers) customers,
            SUM(total_sales) total_sales,
            SUM(total_sales) / SUM(invoices) average_order,
            SUM(total_sales) / SUM(customers) customer_lifetime_value,
            CASE
                WHEN country_2 = 'Other' THEN 1
                ELSE 0
                END
                AS sort
        FROM T1
        GROUP BY country_2
        ORDER BY total_sales DESC    
        ),
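    T1_3 AS
        (
        SELECT * FROM T1_2
        )


-- ORDER BY inside a CTE is not guaranteed to carry over, so sort in the outer query
SELECT * FROM T1_3
ORDER BY sort ASC, total_sales DESC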


 * sqlite:///chinook.db
Done.


country_2,customers,total_sales,average_order,customer_lifetime_value,sort
USA,13,1040.4899999999998,7.942671755725189,80.0376923076923,0
Canada,8,535.5900000000001,7.047236842105265,66.94875000000002,0
Brazil,5,427.68000000000006,7.011147540983608,85.53600000000002,0
France,5,389.0699999999999,7.781399999999998,77.81399999999998,0
Germany,4,334.62,8.161463414634147,83.655,0
Czech Republic,2,273.24000000000007,9.108000000000002,136.62000000000003,0
United Kingdom,3,245.52,8.768571428571429,81.84,0
Portugal,2,185.13,6.383793103448276,92.565,0
India,2,183.15,8.72142857142857,91.575,0
Other,15,1094.9399999999998,7.4485714285714275,72.996,1


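Among individual countries, the USA clearly has the highest total sales (about 1,040 from 13 customers), followed by Canada (about 536). The Czech Republic has the highest customer lifetime value (136.62), though this is based on only 2 customers.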
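
The 'Other' grouping works in two steps: label each single-customer country as 'Other', then aggregate again by that label and sort 'Other' last regardless of its total. A compact sketch of the pattern, assuming a hypothetical flat `sales` table rather than the Chinook `invoice` and `customer` join:

```python
import sqlite3

# Hypothetical toy data: USA has two customers, Chile and Peru have one each.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (country TEXT, customer_id INTEGER, total REAL);
    INSERT INTO sales VALUES
        ('USA', 1, 10.0), ('USA', 2, 5.0),
        ('Chile', 3, 20.0),
        ('Peru', 4, 1.0);
""")

rows = conn.execute("""
    WITH per_country AS (
        SELECT
            CASE
                WHEN COUNT(DISTINCT customer_id) = 1 THEN 'Other'
                ELSE country
            END AS country_group,
            COUNT(DISTINCT customer_id) AS customers,
            SUM(total) AS total_sales
        FROM sales
        GROUP BY country
    )
    SELECT
        country_group,
        SUM(customers) AS customers,
        SUM(total_sales) AS total_sales
    FROM per_country
    GROUP BY country_group
    -- Sort key 0/1 pins 'Other' to the bottom even when its total is largest.
    ORDER BY CASE WHEN country_group = 'Other' THEN 1 ELSE 0 END, total_sales DESC
""").fetchall()

for row in rows:
    print(row)
```

Putting the ORDER BY on the outermost SELECT is what makes the row order reliable.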

### Album vs Individual track

We now compare invoices that contain a whole album with invoices made up of individual tracks, to see which kind of purchase is more common.

In [7]:
%%sql

SELECT
il.invoice_line_id,
il.invoice_id,
c.last_name,
t.track_id,
t.name,
a.album_id,
a.title

FROM invoice_line il
LEFT JOIN invoice i ON il.invoice_id = i.invoice_id
LEFT JOIN track t ON t.track_id = il.track_id
LEFT JOIN album a ON a.album_id = t.album_id
LEFT JOIN customer c ON c.customer_id = i.customer_id

LIMIT 10

 * sqlite:///chinook.db
Done.


invoice_line_id,invoice_id,last_name,track_id,name,album_id,title
1,1,Brooks,1158,Right Next Door to Hell,91,Use Your Illusion I
2,1,Brooks,1159,Dust N' Bones,91,Use Your Illusion I
3,1,Brooks,1160,Live and Let Die,91,Use Your Illusion I
4,1,Brooks,1161,Don't Cry (Original),91,Use Your Illusion I
5,1,Brooks,1162,Perfect Crime,91,Use Your Illusion I
6,1,Brooks,1163,You Ain't the First,91,Use Your Illusion I
7,1,Brooks,1164,Bad Obsession,91,Use Your Illusion I
8,1,Brooks,1165,Back off Bitch,91,Use Your Illusion I
9,1,Brooks,1166,Double Talkin' Jive,91,Use Your Illusion I
10,1,Brooks,1167,November Rain,91,Use Your Illusion I

Each invoice is then classified as an album purchase or not. We take the invoice's first track, look up the album that track belongs to, and check that the album's track list and the invoice's track list match exactly, using EXCEPT in both directions.


In [8]:
%%sql

WITH 
    merge AS
        (
        SELECT
            il.invoice_line_id,
            il.invoice_id,
            c.last_name,
            t.track_id,
            t.name,
            a.album_id,
            a.title
        FROM invoice_line il
        LEFT JOIN invoice i ON il.invoice_id = i.invoice_id
        LEFT JOIN track t ON t.track_id = il.track_id
        LEFT JOIN album a ON a.album_id = t.album_id
        LEFT JOIN customer c ON c.customer_id = i.customer_id
        ),
    album_track1 AS
        (
        SELECT 
            album_id,
            MIN(track_id) a_track1
        FROM track
        GROUP BY album_id
        ORDER BY 2
        ),
    ifs AS
        (
        SELECT
            invoice_id,
            MIN(track_id) first_track_id
        FROM invoice_line
        GROUP BY invoice_id
        ORDER BY 2
        ),
    T1 AS
        (
        SELECT
            ifs.*,
            CASE
                WHEN
                    (
                    SELECT t.track_id FROM track t   
                    WHERE t.album_id = (
                                        SELECT t2.album_id FROM track t2
                                        WHERE t2.track_id = ifs.first_track_id
                                        ) 
                        
                    EXCEPT
                        
                    SELECT il2.track_id FROM invoice_line il2
                    WHERE ifs.invoice_id = il2.invoice_id
                    ) IS NULL            
            
                AND
            
                    (
                    SELECT il2.track_id FROM invoice_line il2
                    WHERE ifs.invoice_id = il2.invoice_id
                     
                    EXCEPT
                     
                    SELECT t.track_id FROM track t   
                    WHERE t.album_id = (
                                        SELECT t2.album_id FROM track t2
                                        WHERE t2.track_id = ifs.first_track_id
                                        ) 
                        
                    ) IS NULL
                THEN 'yes'
                ELSE 'no'
                END
                AS album_purchase
            FROM ifs
        ),
    T2 AS
        (
        SELECT
        COUNT(invoice_id) number_of_invoices,
        album_purchase,
        CAST(COUNT(invoice_id) AS FLOAT) / (SELECT COUNT(*) FROM T1) percentage
        FROM T1
        GROUP BY album_purchase
        )
    
SELECT * FROM T2

    




 * sqlite:///chinook.db
Done.


number_of_invoices,album_purchase,percentage
500,no,0.8143322475570033
114,yes,0.1856677524429967


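Individual-track purchases are far more common: they account for about 81% of invoices (500 of 614), while whole-album purchases make up the remaining 19% (114 invoices). Note that these are shares of invoices, not of revenue.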
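
The album test relies on comparing two sets of track IDs in both directions: the album must have no track missing from the invoice, and the invoice must have no track outside the album. The sketch below reproduces that logic on hypothetical toy data. It uses `NOT EXISTS` around each `EXCEPT` instead of the scalar-subquery `IS NULL` test in the query above, which is a substitution made for this illustration.

```python
import sqlite3

# Hypothetical toy data: album 1 has tracks 1-3, album 2 has tracks 4-5.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE track (track_id INTEGER PRIMARY KEY, album_id INTEGER);
    CREATE TABLE invoice_line (invoice_id INTEGER, track_id INTEGER);
    INSERT INTO track VALUES (1, 1), (2, 1), (3, 1), (4, 2), (5, 2);
    INSERT INTO invoice_line VALUES
        (10, 1), (10, 2), (10, 3),          -- all of album 1
        (11, 1), (11, 2),                   -- part of album 1
        (12, 4), (12, 5),                   -- all of album 2
        (13, 1), (13, 2), (13, 3), (13, 4); -- album 1 plus one extra track
""")

rows = conn.execute("""
    WITH first_track AS (
        SELECT invoice_id, MIN(track_id) AS first_track_id
        FROM invoice_line
        GROUP BY invoice_id
    )
    SELECT
        ft.invoice_id,
        CASE
            -- No album track is missing from the invoice ...
            WHEN NOT EXISTS (
                SELECT t.track_id FROM track t
                WHERE t.album_id = (SELECT t2.album_id FROM track t2
                                    WHERE t2.track_id = ft.first_track_id)
                EXCEPT
                SELECT il.track_id FROM invoice_line il
                WHERE il.invoice_id = ft.invoice_id
            )
            -- ... and no invoice track falls outside the album.
            AND NOT EXISTS (
                SELECT il.track_id FROM invoice_line il
                WHERE il.invoice_id = ft.invoice_id
                EXCEPT
                SELECT t.track_id FROM track t
                WHERE t.album_id = (SELECT t2.album_id FROM track t2
                                    WHERE t2.track_id = ft.first_track_id)
            )
            THEN 'yes'
            ELSE 'no'
        END AS album_purchase
    FROM first_track ft
    ORDER BY ft.invoice_id
""").fetchall()

for row in rows:
    print(row)
```

Invoice 13 shows why the second comparison matters: it contains every track of album 1, but the extra track means it is not a pure album purchase.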