# Solutions

1. [Intro to Databases and SQL](#1.-Intro-to-Databases-and-SQL)
2. [The SELECT Statement](#2.-The-SELECT-Statement)
3. [GROUP BY and JOIN Clauses](#3.-GROUP-BY-and-JOIN-Clauses)

## 1. Intro to Databases and SQL

![1]

[1]: images/prof_student_class.png

### Exercise 1

<span style="color:green; font-size:16px">In words, describe the relationship between the professor and class tables.</span>

**Answer** - Each class is taught by one and only one professor. Some professors may not be present in the class table. Other professors may appear one or more times in the class table.

### Exercise 2

<span style="color:green; font-size:16px">In words, describe the relationship between the class and students_in_class tables.</span>

**Answer** - Each class has at least one student_id in it. Each class in the students_in_class table is mapped to exactly one row in the class table.

### Exercise 3

<span style="color:green; font-size:16px">What are the minimum and maximum numbers of professors each student can have?</span>

**Answer** - Since a student is not guaranteed to appear in the students_in_class table, the minimum is 0. There is no maximum, as a student may appear any number of times in the students_in_class table and every class is guaranteed to have exactly one professor.
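
The reasoning in the three answers above can be checked on a toy database. The sketch below is only an illustration: the column names, the sample rows, and the existence of a separate `student` table are assumptions, since the real schema is only shown in the diagram. It counts distinct professors per student with two left joins, so a student with no enrollments gets 0.

```python
import sqlite3

# Hypothetical miniature schema; names and rows are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE professor (professor_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE class (class_id INTEGER PRIMARY KEY, professor_id INTEGER NOT NULL);
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE students_in_class (class_id INTEGER, student_id INTEGER);

INSERT INTO professor VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Alan');
INSERT INTO class VALUES (10, 1), (11, 1), (12, 2);
INSERT INTO student VALUES (100, 'Sam'), (101, 'Kim'), (102, 'Lee');
INSERT INTO students_in_class VALUES (10, 100), (11, 100), (12, 100), (10, 101);
""")

# LEFT JOINs keep students without any class; count(DISTINCT ...) ignores NULLs.
rows = conn.execute("""
SELECT s.student_id, count(DISTINCT c.professor_id) AS num_professors
FROM student AS s
    LEFT JOIN students_in_class AS sc ON s.student_id = sc.student_id
    LEFT JOIN class AS c ON sc.class_id = c.class_id
GROUP BY s.student_id
ORDER BY s.student_id
""").fetchall()
print(rows)
```

In this toy data, professor 3 teaches no class, professor 1 teaches two, and student 102 is enrolled in nothing, which mirrors the cases described in the answers.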

## 2. The SELECT Statement

In [1]:
import pandas as pd

### Exercise 1

<span style="color:green; font-size:16px">Create a variable called `CS_CHINOOK` and assign it the value of the connection string. Use it to read in all of the columns of the tracks table.</span>

In [2]:
CS_CHINOOK = 'sqlite:///../data/databases/chinook.db'
sql = """
SELECT *
FROM tracks
"""
pd.read_sql(sql, CS_CHINOOK).head()

Unnamed: 0,TrackId,Name,AlbumId,MediaTypeId,GenreId,Composer,Milliseconds,Bytes,UnitPrice
0,1,For Those About To Rock (We Salute You),1,1,1,"Angus Young, Malcolm Young, Brian Johnson",343719,11170334,0.99
1,2,Balls to the Wall,2,2,1,,342562,5510424,0.99
2,3,Fast As a Shark,3,2,1,"F. Baltes, S. Kaufman, U. Dirkscneider & W. Ho...",230619,3990994,0.99
3,4,Restless and Wild,3,2,1,"F. Baltes, R.A. Smith-Diesel, S. Kaufman, U. D...",252051,4331779,0.99
4,5,Princess of the Dawn,3,2,1,Deaffy & R.A. Smith-Diesel,375418,6290521,0.99


### Exercise 2

<span style="color:green; font-size:16px">Select the name and composer columns from the tracks table, returning the first five records.</span>

In [3]:
sql = """
SELECT name, composer
FROM tracks
LIMIT 5
"""
pd.read_sql(sql, CS_CHINOOK)

Unnamed: 0,Name,Composer
0,For Those About To Rock (We Salute You),"Angus Young, Malcolm Young, Brian Johnson"
1,Balls to the Wall,
2,Fast As a Shark,"F. Baltes, S. Kaufman, U. Dirkscneider & W. Ho..."
3,Restless and Wild,"F. Baltes, R.A. Smith-Diesel, S. Kaufman, U. D..."
4,Princess of the Dawn,Deaffy & R.A. Smith-Diesel


### Exercise 3

<span style="color:green; font-size:16px">Find the number of unique composers and unit prices in the tracks table.</span>

In [4]:
sql = """
SELECT 
    count(distinct composer) as num_unique_composer, 
    count(distinct UnitPrice) as num_unique_unitprice
FROM tracks
"""
pd.read_sql(sql, CS_CHINOOK)

Unnamed: 0,num_unique_composer,num_unique_unitprice
0,852,2


### Exercise 4

<span style="color:green; font-size:16px">Find the unique unit prices in the tracks table.</span>

In [5]:
sql = """
SELECT DISTINCT unitprice
FROM tracks
"""
pd.read_sql(sql, CS_CHINOOK)

Unnamed: 0,UnitPrice
0,0.99
1,1.99


### Exercise 5

<span style="color:green; font-size:16px">Count the total number of records and the number of non-missing values of composer in the tracks table.</span>

In [6]:
sql = """
SELECT count(*), count(composer)
FROM tracks
"""
pd.read_sql(sql, CS_CHINOOK)

Unnamed: 0,count(*),count(composer)
0,3503,2525

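The difference between `count(*)`, `count(column)`, and `count(DISTINCT column)` is easiest to see on a handful of rows. This is a minimal sketch on an in-memory table (the `demo` table and its rows are made up, not part of Chinook): `count(*)` counts rows, `count(composer)` skips NULLs, and `count(DISTINCT composer)` counts unique non-NULL values.

```python
import sqlite3

# Made-up table with one missing composer and one repeated composer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (name TEXT, composer TEXT)")
conn.executemany(
    "INSERT INTO demo VALUES (?, ?)",
    [("a", "X"), ("b", None), ("c", "X"), ("d", "Y")],
)

total, non_missing, unique = conn.execute("""
SELECT count(*), count(composer), count(DISTINCT composer)
FROM demo
""").fetchone()
print(total, non_missing, unique)
```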

### Exercise 6

<span style="color:green; font-size:16px">Return the first five records in the tracks table where composer is missing.</span>

In [7]:
sql = """
SELECT *
FROM tracks
WHERE composer IS NULL
LIMIT 5
"""
pd.read_sql(sql, CS_CHINOOK)

Unnamed: 0,TrackId,Name,AlbumId,MediaTypeId,GenreId,Composer,Milliseconds,Bytes,UnitPrice
0,2,Balls to the Wall,2,2,1,,342562,5510424,0.99
1,63,Desafinado,8,1,2,,185338,5990473,0.99
2,64,Garota De Ipanema,8,1,2,,285048,9348428,0.99
3,65,Samba De Uma Nota Só (One Note Samba),8,1,2,,137273,4535401,0.99
4,66,Por Causa De Você,8,1,2,,169900,5536496,0.99


### Exercise 7

<span style="color:green; font-size:16px">Filter the tracks table where unit price is 1.99. Return the first five after the 100th.</span>

In [8]:
sql = """
SELECT *
FROM tracks
WHERE unitprice = 1.99
LIMIT 5 OFFSET 100
"""
pd.read_sql(sql, CS_CHINOOK)

Unnamed: 0,TrackId,Name,AlbumId,MediaTypeId,GenreId,Composer,Milliseconds,Bytes,UnitPrice
0,2919,Born to Run,230,3,19,,2618619,213772057,1.99
1,2920,Three Minutes,231,3,19,,2763666,531556853,1.99
2,2921,Exodus (Part 1),230,3,19,,2620747,213107744,1.99
3,2922,"Live Together, Die Alone, Pt. 1",231,3,21,,2478041,457364940,1.99
4,2923,Exodus (Part 2) [Season Finale],230,3,19,,2605557,208667059,1.99

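`LIMIT 5 OFFSET 100` skips the first 100 matching rows and returns the next five. The sketch below shows the same idea on a made-up `numbers` table. It also adds an `ORDER BY`, which the solution above does not use; without one, SQL does not guarantee which rows count as "the first 100".

```python
import sqlite3

# Made-up table holding the integers 1 through 10.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE numbers (n INTEGER)")
conn.executemany("INSERT INTO numbers VALUES (?)", [(i,) for i in range(1, 11)])

# Skip the first five rows in sorted order, then return the next three.
page = conn.execute("""
SELECT n
FROM numbers
ORDER BY n
LIMIT 3 OFFSET 5
""").fetchall()
print(page)
```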

### Exercise 8

<span style="color:green; font-size:16px">Compute the minutes and seconds of each song in the tracks table as separate columns. Return the song name and milliseconds along with the other two columns naming them appropriately.</span>

In [9]:
sql = """
SELECT name,
        milliseconds,
        Milliseconds / 1000 / 60 AS minutes,
        Milliseconds / 1000 % 60 AS seconds
FROM tracks
"""
pd.read_sql(sql, CS_CHINOOK).head(10)

Unnamed: 0,Name,Milliseconds,minutes,seconds
0,For Those About To Rock (We Salute You),343719,5,43
1,Balls to the Wall,342562,5,42
2,Fast As a Shark,230619,3,50
3,Restless and Wild,252051,4,12
4,Princess of the Dawn,375418,6,15
5,Put The Finger On You,205662,3,25
6,Let's Get It Up,233926,3,53
7,Inject The Venom,210834,3,30
8,Snowballed,203102,3,23
9,Evil Walks,263497,4,23

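The minutes and seconds expressions rely on integer division: `/` between two integers truncates in SQLite, and `/` and `%` are evaluated left to right. As a quick check, the sketch below repeats the arithmetic in plain Python (the helper name `minutes_seconds` is made up) and compares it with SQLite for the first track's duration.

```python
import sqlite3

def minutes_seconds(milliseconds):
    """Split a duration in milliseconds into whole minutes and leftover seconds."""
    total_seconds = milliseconds // 1000  # drop the fractional second
    return divmod(total_seconds, 60)      # (minutes, seconds)

# Same arithmetic in SQLite, using the literal value of the first track.
conn = sqlite3.connect(":memory:")
sql_result = conn.execute("SELECT 343719 / 1000 / 60, 343719 / 1000 % 60").fetchone()

print(minutes_seconds(343719), sql_result)
```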

### Exercise 9

<span style="color:green; font-size:16px">Select all the records between three and four minutes in length from the tracks table.</span>

In [10]:
sql = """
SELECT *
FROM tracks
WHERE milliseconds between 3 * 60 * 1000 and 4 * 60 * 1000
"""
pd.read_sql(sql, CS_CHINOOK).head()

Unnamed: 0,TrackId,Name,AlbumId,MediaTypeId,GenreId,Composer,Milliseconds,Bytes,UnitPrice
0,3,Fast As a Shark,3,2,1,"F. Baltes, S. Kaufman, U. Dirkscneider & W. Ho...",230619,3990994,0.99
1,6,Put The Finger On You,1,1,1,"Angus Young, Malcolm Young, Brian Johnson",205662,6713451,0.99
2,7,Let's Get It Up,1,1,1,"Angus Young, Malcolm Young, Brian Johnson",233926,7636561,0.99
3,8,Inject The Venom,1,1,1,"Angus Young, Malcolm Young, Brian Johnson",210834,6852860,0.99
4,9,Snowballed,1,1,1,"Angus Young, Malcolm Young, Brian Johnson",203102,6599424,0.99

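Note that `BETWEEN` is inclusive on both ends, so the query above also returns tracks that are exactly three or exactly four minutes long. A small sketch on literal values, with no table needed:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SQLite returns 1 for true and 0 for false.
lower, upper, above = conn.execute("""
SELECT 180000 BETWEEN 180000 AND 240000,
       240000 BETWEEN 180000 AND 240000,
       240001 BETWEEN 180000 AND 240000
""").fetchone()
print(lower, upper, above)
```

If the endpoints should be excluded, one option is to write the condition with `>` and `<` instead.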

### Exercise 10

<span style="color:green; font-size:16px">How many records from the tracks table are under 30 seconds in length?</span>

In [11]:
sql = """
SELECT count(*)
FROM tracks
WHERE milliseconds < 30000
"""
pd.read_sql(sql, CS_CHINOOK)

Unnamed: 0,count(*)
0,8


### Exercise 11

<span style="color:green; font-size:16px">How many records from the tracks table are under 30 seconds or more than 10 minutes in length?</span>

In [12]:
sql = """
SELECT count(*)
FROM tracks
WHERE milliseconds < 30000 or milliseconds > 600000
"""
pd.read_sql(sql, CS_CHINOOK)

Unnamed: 0,count(*)
0,268


### Exercise 12

<span style="color:green; font-size:16px">Calculate the average unit price for songs greater than 10 minutes in length from the tracks table.</span>

In [13]:
sql = """
SELECT avg(unitprice)
FROM tracks
WHERE milliseconds > 600000
"""
pd.read_sql(sql, CS_CHINOOK)

Unnamed: 0,avg(unitprice)
0,1.801538


### Exercise 13

<span style="color:green; font-size:16px">Select tracks with TrackId of 10, 100, or 1000.</span>

In [14]:
sql = """
SELECT *
FROM tracks
WHERE trackid in (10, 100, 1000)
"""
pd.read_sql(sql, CS_CHINOOK)

Unnamed: 0,TrackId,Name,AlbumId,MediaTypeId,GenreId,Composer,Milliseconds,Bytes,UnitPrice
0,10,Evil Walks,1,1,1,"Angus Young, Malcolm Young, Brian Johnson",263497,8611245,0.99
1,100,Out Of Exile,11,1,4,"Cornell, Commerford, Morello, Wilk",291291,9506571,0.99
2,1000,What If I Do?,80,1,1,"Dave Grohl, Taylor Hawkins, Nate Mendel, Chris...",302994,9929799,0.99


### Exercise 14

<span style="color:green; font-size:16px">Select all customers from France and Portugal.</span>

In [15]:
sql = """
SELECT *
FROM customers
WHERE country in ('France', 'Portugal')
"""
pd.read_sql(sql, CS_CHINOOK)

Unnamed: 0,CustomerId,FirstName,LastName,Company,Address,City,State,Country,PostalCode,Phone,Fax,Email,SupportRepId
0,34,João,Fernandes,,Rua da Assunção 53,Lisbon,,Portugal,,+351 (213) 466-111,,jfernandes@yahoo.pt,4
1,35,Madalena,Sampaio,,"Rua dos Campeões Europeus de Viena, 4350",Porto,,Portugal,,+351 (225) 022-448,,masampaio@sapo.pt,4
2,39,Camille,Bernard,,"4, Rue Milton",Paris,,France,75009.0,+33 01 49 70 65 65,,camille.bernard@yahoo.fr,4
3,40,Dominique,Lefebvre,,"8, Rue Hanovre",Paris,,France,75002.0,+33 01 47 42 71 71,,dominiquelefebvre@gmail.com,4
4,41,Marc,Dubois,,"11, Place Bellecour",Lyon,,France,69002.0,+33 04 78 30 30 30,,marc.dubois@hotmail.com,5
5,42,Wyatt,Girard,,"9, Place Louis Barthou",Bordeaux,,France,33000.0,+33 05 56 96 96 96,,wyatt.girard@yahoo.fr,3
6,43,Isabelle,Mercier,,"68, Rue Jouvence",Dijon,,France,21000.0,+33 03 80 73 66 99,,isabelle_mercier@apple.fr,3


### Exercise 15

<span style="color:green; font-size:16px">Find the top 10 invoices by total.</span>

In [16]:
sql = """
SELECT *
FROM invoices
ORDER BY total DESC
LIMIT 10
"""
pd.read_sql(sql, CS_CHINOOK)

Unnamed: 0,InvoiceId,CustomerId,InvoiceDate,BillingAddress,BillingCity,BillingState,BillingCountry,BillingPostalCode,Total
0,404,6,2013-11-13 00:00:00,Rilská 3174/6,Prague,,Czech Republic,14300,25.86
1,299,26,2012-08-05 00:00:00,2211 W Berry Street,Fort Worth,TX,USA,76110,23.86
2,96,45,2010-02-18 00:00:00,Erzsébet krt. 58.,Budapest,,Hungary,H-1073,21.86
3,194,46,2011-04-28 00:00:00,3 Chatham Street,Dublin,Dublin,Ireland,,21.86
4,89,7,2010-01-18 00:00:00,"Rotenturmstraße 4, 1010 Innere Stadt",Vienne,,Austria,1010,18.86
5,201,25,2011-05-29 00:00:00,319 N. Frances Street,Madison,WI,USA,53703,18.86
6,88,57,2010-01-13 00:00:00,"Calle Lira, 198",Santiago,,Chile,,17.91
7,306,5,2012-09-05 00:00:00,Klanova 9/506,Prague,,Czech Republic,14700,16.86
8,313,43,2012-10-06 00:00:00,"68, Rue Jouvence",Dijon,,France,21000,16.86
9,103,24,2010-03-21 00:00:00,162 E Superior Street,Chicago,IL,USA,60611,15.86


### Exercise 16

<span style="color:green; font-size:16px">Sort the invoices table by BillingCountry and within that by Total from greatest to least.</span>

In [17]:
sql = """
SELECT *
FROM invoices
ORDER BY billingcountry, total DESC
"""
pd.read_sql(sql, CS_CHINOOK).head()

Unnamed: 0,InvoiceId,CustomerId,InvoiceDate,BillingAddress,BillingCity,BillingState,BillingCountry,BillingPostalCode,Total
0,348,56,2013-03-10 00:00:00,307 Macacha Güemes,Buenos Aires,,Argentina,1106,13.86
1,403,56,2013-11-08 00:00:00,307 Macacha Güemes,Buenos Aires,,Argentina,1106,8.91
2,164,56,2010-12-17 00:00:00,307 Macacha Güemes,Buenos Aires,,Argentina,1106,5.94
3,142,56,2010-09-14 00:00:00,307 Macacha Güemes,Buenos Aires,,Argentina,1106,3.96
4,119,56,2010-06-12 00:00:00,307 Macacha Güemes,Buenos Aires,,Argentina,1106,1.98


### Exercise 17

<span style="color:green; font-size:16px">Find all tracks that have a name beginning or ending in 'X'.</span>

In [18]:
sql = """
SELECT *
FROM tracks
WHERE name like 'X%' or name like '%X'
"""
pd.read_sql(sql, CS_CHINOOK).head(20)

Unnamed: 0,TrackId,Name,AlbumId,MediaTypeId,GenreId,Composer,Milliseconds,Bytes,UnitPrice
0,52,Man In The Box,7,1,1,"Jerry Cantrell, Layne Staley",286641,9310272,0.99
1,159,FX,17,1,3,"Tony Iommi, Bill Ward, Geezer Butler, Ozzy Osb...",103157,3331776,0.99
2,361,X-9 2001,32,1,10,,273920,9310370,0.99
3,977,Xote Dos Milagres,78,1,7,,269557,8897778,0.99
4,1593,Bonzo's Montreux,128,1,1,John Bonham,258925,8557447,0.99
5,1996,Heart-Shaped Box,163,1,1,Kurt Cobain,281887,9210982,0.99
6,2410,Xanadu,196,1,1,Geddy Lee And Alex Lifeson/Geddy Lee And Neil ...,667428,21753168,0.99
7,2416,The Temples Of Syrinx,196,1,1,Geddy Lee And Alex Lifeson/Geddy Lee And Neil ...,133459,4360163,0.99
8,2642,Twentienth Century Fox,214,1,1,"Robby Krieger, Ray Manzarek, John Densmore, Ji...",153913,5069211,0.99
9,2748,Squeeze Box,221,1,1,Pete Townshend,161280,5256508,0.99

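The result includes names such as "Man In The Box" that end in a lowercase x. That is because SQLite's `LIKE` is case-insensitive for ASCII letters by default. The sketch below contrasts it with `GLOB`, which is case-sensitive and uses `*` as its wildcard; it is shown here only as one possible alternative if an uppercase-only match is wanted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# 1 means the pattern matched, 0 means it did not.
like_lower, glob_lower, glob_upper = conn.execute("""
SELECT 'Man In The Box' LIKE '%X',
       'Man In The Box' GLOB '*X',
       'FX' GLOB '*X'
""").fetchone()
print(like_lower, glob_lower, glob_upper)
```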

### Exercise 18

<span style="color:green; font-size:16px">Find all tracks that have the word 'smith' anywhere in the composer.</span>

In [19]:
sql = """
SELECT *
FROM tracks
WHERE composer like '%smith%'
"""
pd.read_sql(sql, CS_CHINOOK).head(10)

Unnamed: 0,TrackId,Name,AlbumId,MediaTypeId,GenreId,Composer,Milliseconds,Bytes,UnitPrice
0,4,Restless and Wild,3,2,1,"F. Baltes, R.A. Smith-Diesel, S. Kaufman, U. D...",252051,4331779,0.99
1,5,Princess of the Dawn,3,2,1,Deaffy & R.A. Smith-Diesel,375418,6290521,0.99
2,186,Killing Floor,19,1,3,Adrian Smith,269557,8854240,0.99
3,191,Machine Men,19,1,3,Adrian Smith,341655,11138147,0.99
4,1221,2 Minutes To Midnight,95,1,3,Adrian Smith/Bruce Dickinson,337423,5400576,0.99
5,1226,Can I Play With Madness,96,1,3,Adrian Smith/Bruce Dickinson/Steve Harris,282488,4521984,0.99
6,1229,The Evil That Men Do,96,1,3,Adrian Smith/Bruce Dickinson/Steve Harris,325929,5216256,0.99
7,1235,The Wicker Man,97,1,1,Adrian Smith/Bruce Dickinson/Steve Harris,275539,11022464,0.99
8,1241,The Fallen Angel,97,1,1,Adrian Smith/Steve Harris,240718,9629824,0.99
9,1245,Wildest Dreams,98,1,13,Adrian Smith/Steve Harris,232777,9312384,0.99


### Exercise 19

<span style="color:green; font-size:16px">Calculate the average bytes per millisecond for all tracks. Make sure to use true division. Round to one decimal place.</span>

In [20]:
sql = """
SELECT round(avg(bytes / cast(milliseconds as float)), 1) AS avg_bytes_per_ms
FROM tracks
"""
pd.read_sql(sql, CS_CHINOOK)

Unnamed: 0,avg_bytes_per_ms
0,40.5

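The cast matters because dividing one integer by another in SQLite truncates the result. A minimal sketch on literal values shows the difference, and that a float literal such as `2.0` has the same effect as the cast:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

int_div, cast_div, literal_div, rounded = conn.execute("""
SELECT 7 / 2,
       7 / CAST(2 AS FLOAT),
       7 / 2.0,
       round(10 / 3.0, 1)
""").fetchone()
print(int_div, cast_div, literal_div, rounded)
```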

### Exercise 20

<span style="color:green; font-size:16px">Return the five longest names in the tracks table.</span>

In [21]:
sql = """
SELECT name
FROM tracks
ORDER BY length(name) DESC
LIMIT 5
"""
pd.read_sql(sql, CS_CHINOOK)

Unnamed: 0,Name
0,Homecoming / The Death Of St. Jimmy / East 12t...
1,Symphony No. 3 Op. 36 for Orchestra and Sopran...
2,Jesus Of Suburbia / City Of The Damned / I Don...
3,"The Nutcracker, Op. 71a, Act II: Scene 14: Pas..."
4,Blind Curve: Vocal Under A Bloodlight / Passin...


### Exercise 21

<span style="color:green; font-size:16px">Use the [SQLite math functions page][0] to calculate the area of a circle with radius of 17.</span>


[0]: https://www.sqlite.org/lang_mathfunc.html

In [22]:
sql = """
SELECT pi() * pow(17, 2)
"""
pd.read_sql(sql, CS_CHINOOK)

Unnamed: 0,"pi() * pow(17, 2)"
0,907.920277

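The math functions are available only in SQLite 3.35.0 or later, and only when the library was compiled with `SQLITE_ENABLE_MATH_FUNCTIONS`, so the query above can fail on some installations. The sketch below is one possible workaround, not part of the original solution: it tries the SQL version and falls back to Python's `math` module if SQLite reports an error.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")

try:
    # Works only if this SQLite build includes the math functions.
    (area,) = conn.execute("SELECT pi() * pow(17, 2)").fetchone()
except sqlite3.OperationalError:
    # Fallback: the same formula computed in Python.
    area = math.pi * 17 ** 2

print(round(area, 6))
```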

### Exercise 22

<span style="color:green; font-size:16px">Count the number of customers that do not have a company name.</span>

In [23]:
sql = """
SELECT count(*)
FROM customers
WHERE company IS NULL
"""
pd.read_sql(sql, CS_CHINOOK)

Unnamed: 0,count(*)
0,49


## 3. GROUP BY and JOIN Clauses

In [24]:
import pandas as pd
CS_CHINOOK = 'sqlite:///../data/databases/chinook.db'
tracks = pd.read_sql('tracks', CS_CHINOOK)
artists = pd.read_sql('artists', CS_CHINOOK)
genres = pd.read_sql('genres', CS_CHINOOK)
invoices = pd.read_sql('invoices', CS_CHINOOK)
invoice_items = pd.read_sql('invoice_items', CS_CHINOOK)
customers = pd.read_sql('customers', CS_CHINOOK)
employees = pd.read_sql('employees', CS_CHINOOK)

In [25]:
artists.head()

Unnamed: 0,ArtistId,Name
0,1,AC/DC
1,2,Accept
2,3,Aerosmith
3,4,Alanis Morissette
4,5,Alice In Chains


In [26]:
invoices.head(3)

Unnamed: 0,InvoiceId,CustomerId,InvoiceDate,BillingAddress,BillingCity,BillingState,BillingCountry,BillingPostalCode,Total
0,1,2,2009-01-01,Theodor-Heuss-Straße 34,Stuttgart,,Germany,70174,1.98
1,2,4,2009-01-02,Ullevålsveien 14,Oslo,,Norway,171,3.96
2,3,8,2009-01-03,Grétrystraat 63,Brussels,,Belgium,1000,5.94


In [27]:
invoice_items.head(3)

Unnamed: 0,InvoiceLineId,InvoiceId,TrackId,UnitPrice,Quantity
0,1,1,2,0.99,1
1,2,1,4,0.99,1
2,3,2,6,0.99,1


In [28]:
tracks.head(3)

Unnamed: 0,TrackId,Name,AlbumId,MediaTypeId,GenreId,Composer,Milliseconds,Bytes,UnitPrice
0,1,For Those About To Rock (We Salute You),1,1,1,"Angus Young, Malcolm Young, Brian Johnson",343719,11170334,0.99
1,2,Balls to the Wall,2,2,1,,342562,5510424,0.99
2,3,Fast As a Shark,3,2,1,"F. Baltes, S. Kaufman, U. Dirkscneider & W. Ho...",230619,3990994,0.99


In [29]:
customers.head(3)

Unnamed: 0,CustomerId,FirstName,LastName,Company,Address,City,State,Country,PostalCode,Phone,Fax,Email,SupportRepId
0,1,Luís,Gonçalves,Embraer - Empresa Brasileira de Aeronáutica S.A.,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,+55 (12) 3923-5555,+55 (12) 3923-5566,luisg@embraer.com.br,3
1,2,Leonie,Köhler,,Theodor-Heuss-Straße 34,Stuttgart,,Germany,70174,+49 0711 2842222,,leonekohler@surfeu.de,5
2,3,François,Tremblay,,1498 rue Bélanger,Montréal,QC,Canada,H2G 1A7,+1 (514) 721-4711,,ftremblay@gmail.com,3


### Exercise 1

<span style="color:green; font-size:16px">Find the grand total of invoices by billingcountry. Order the results by this total from greatest to least.</span>

In [30]:
sql = """
SELECT billingcountry, sum(total) as total
FROM invoices
GROUP BY billingcountry
ORDER BY total DESC
"""
pd.read_sql(sql, CS_CHINOOK).head(10)

Unnamed: 0,BillingCountry,total
0,USA,523.06
1,Canada,303.96
2,France,195.1
3,Brazil,190.1
4,Germany,156.48
5,United Kingdom,112.86
6,Czech Republic,90.24
7,Portugal,77.24
8,India,75.26
9,Chile,46.62


### Exercise 2

<span style="color:green; font-size:16px">Find the count, min, max, and avg milliseconds by genreid in the tracks table. Convert the milliseconds to minutes. Order the results by count descending.</span>

In [31]:
# note how 1000.0 is used so that the first division is true division
sql = """
SELECT genreid, 
       count(milliseconds) as count,
       round(min(milliseconds / 1000.0 / 60), 2) as min_time,
       round(max(milliseconds / 1000.0 / 60), 2) as max_time,
       round(avg(milliseconds / 1000.0 / 60), 2) as avg_time
FROM tracks
GROUP BY genreid
ORDER BY count DESC
"""
pd.read_sql(sql, CS_CHINOOK).head(10)

Unnamed: 0,GenreId,count,min_time,max_time,avg_time
0,1,1297,0.02,26.87,4.73
1,7,579,0.55,9.05,3.88
2,3,374,0.7,13.61,5.16
3,4,332,0.08,9.31,3.91
4,2,130,2.11,15.13,4.86
5,19,93,20.63,88.12,35.75
6,6,81,2.25,9.83,4.51
7,24,74,0.86,9.94,4.9
8,21,64,1.88,84.81,42.92
9,14,61,2.12,6.97,3.67


### Exercise 3

<span style="color:green; font-size:16px">Using the invoice_items table, count the times each trackid and invoiceid combination appears. This number should be one, as an invoice should not have multiple instances of the same trackid. Can you verify that there is in fact at most one?</span>

In [32]:
# A simple group by to get the count by trackid and invoiceid
sql = """
SELECT trackid, invoiceid, count(*) as ct
FROM invoice_items
GROUP BY trackid, invoiceid
"""
pd.read_sql(sql, CS_CHINOOK).head()

Unnamed: 0,TrackId,InvoiceId,ct
0,1,108,1
1,2,1,1
2,2,214,1
3,3,319,1
4,4,1,1


In [33]:
# use having clause to see if any counts are greater than 1
sql = """
SELECT trackid, invoiceid, count(*) as ct
FROM invoice_items
GROUP BY trackid, invoiceid
HAVING ct > 1
"""
pd.read_sql(sql, CS_CHINOOK).head()

Unnamed: 0,TrackId,InvoiceId,ct

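An empty result confirms that no combination appears more than once. To see what the `HAVING` filter would return if a duplicate did exist, here is a small sketch on a made-up `invoice_items` table that deliberately contains one repeated (invoiceid, trackid) pair. It filters on `count(*)` directly rather than on the alias, which should also work in databases that do not accept aliases in `HAVING`.

```python
import sqlite3

# Made-up rows; the last two are an intentional duplicate.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoice_items (invoiceid INTEGER, trackid INTEGER)")
conn.executemany(
    "INSERT INTO invoice_items VALUES (?, ?)",
    [(1, 2), (1, 4), (2, 6), (2, 6)],
)

duplicates = conn.execute("""
SELECT trackid, invoiceid, count(*) AS ct
FROM invoice_items
GROUP BY trackid, invoiceid
HAVING count(*) > 1
""").fetchall()
print(duplicates)
```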

### Exercise 4

<span style="color:green; font-size:16px">Calculate the total revenue of each track using the invoice_items table. Return the top five tracks by revenue.</span>

In [34]:
# sum unitprice per track, then keep the five largest totals
sql = """
SELECT trackid, sum(unitprice) as revenue
FROM invoice_items
GROUP BY trackid
ORDER BY revenue DESC
LIMIT 5
"""
pd.read_sql(sql, CS_CHINOOK)

Unnamed: 0,TrackId,revenue
0,2832,3.98
1,2850,3.98
2,2868,3.98
3,3177,3.98
4,3200,3.98


### Exercise 5

<span style="color:green; font-size:16px">Create a table with the track name and genre name (not genreid).</span>

In [35]:
sql = """
SELECT 
    t.name as track_name, 
    g.name as genre_name
FROM tracks as t 
    LEFT JOIN genres as g ON t.genreid = g.genreid
"""
pd.read_sql(sql, CS_CHINOOK).head()

Unnamed: 0,track_name,genre_name
0,For Those About To Rock (We Salute You),Rock
1,Balls to the Wall,Rock
2,Fast As a Shark,Rock
3,Restless and Wild,Rock
4,Princess of the Dawn,Rock

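A `LEFT JOIN` keeps every track even when it has no matching genre, whereas an inner join would drop such tracks. The sketch below illustrates the difference on made-up `tracks` and `genres` tables in which one track has a NULL genreid; whether the real tracks table contains such rows is not checked here.

```python
import sqlite3

# Made-up tables: track 2 has no genre.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE genres (genreid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE tracks (trackid INTEGER PRIMARY KEY, name TEXT, genreid INTEGER);
INSERT INTO genres VALUES (1, 'Rock');
INSERT INTO tracks VALUES (1, 'A', 1), (2, 'B', NULL);
""")

query = """
SELECT t.name AS track_name, g.name AS genre_name
FROM tracks AS t
    {join} genres AS g ON t.genreid = g.genreid
ORDER BY t.trackid
"""
left_rows = conn.execute(query.format(join="LEFT JOIN")).fetchall()
inner_rows = conn.execute(query.format(join="INNER JOIN")).fetchall()
print(left_rows)
print(inner_rows)
```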

### Exercise 6

<span style="color:green; font-size:16px">Create a table with the track name, genre name, album title, and artist title. You will need to join four tables together.</span>

In [36]:
sql = """
SELECT 
    t.name as track_name, 
    g.name as genre_name,
    a.title as album_title,
    ar.name as artist_title
FROM tracks as t 
    LEFT JOIN genres as g ON t.genreid = g.genreid
    LEFT JOIN albums as a ON t.albumid = a.albumid
    LEFT JOIN artists as ar ON a.artistid = ar.artistid
"""
pd.read_sql(sql, CS_CHINOOK).head()

Unnamed: 0,track_name,genre_name,album_title,artist_title
0,For Those About To Rock (We Salute You),Rock,For Those About To Rock We Salute You,AC/DC
1,Balls to the Wall,Rock,Balls to the Wall,Accept
2,Fast As a Shark,Rock,Restless and Wild,Accept
3,Restless and Wild,Rock,Restless and Wild,Accept
4,Princess of the Dawn,Rock,Restless and Wild,Accept


### Exercise 7

<span style="color:green; font-size:16px">For tracks less than two minutes in length, count the occurrence of each media type. Make sure to use the media type name.</span>

In [37]:
sql = """
SELECT 
    mt.name, count(*)
FROM tracks as t
    LEFT JOIN media_types as mt ON t.mediatypeid = mt.mediatypeid
WHERE t.milliseconds < 2 * 60 * 1000
GROUP BY mt.mediatypeid
"""
pd.read_sql(sql, CS_CHINOOK)

Unnamed: 0,Name,count(*)
0,MPEG audio file,85
1,Protected AAC audio file,5
2,Protected MPEG-4 video file,1
3,Purchased AAC audio file,2
