# Intro to SQL, Part 2
**Learning Objectives:**
- Practice using functions and aggregate GROUP BY queries, especially MIN(), MAX(), AVG(), COUNT(), SUM()
- Continue practicing SELECT statements

❗**TODO:** Add a Markdown block below. Put the names of both partners there.

Below is the entity-relationship diagram (ERD) for the database we will practice with (chinook.db).

![chinook_schematic.jpeg](attachment:chinook_schematic.jpeg)

## Setup

*This is already done for you.* First, we install the Python requirements from **requirements.txt**. There are a lot of libraries, but most importantly, this installs the **ipython-sql** library, which enables SQL execution in Jupyter Notebooks, and an older version of SQLAlchemy (1.4.46) that works with Codespaces.

❗ The command `%load_ext sql` is used to activate SQL in Jupyter.

❗ The command `%sql sqlite:///<database_name>.db` is used to select the working database. (Note the 3 slashes!)



In [1]:
# Start the Jupyter SQL engine, connecting to a SQLite database 
%reload_ext sql 
%sql sqlite:///chinook.db

## Exercises - GROUP BY

#### Ex. 1 - COUNT()
How many tracks did each composer write? Show the composer's name and the number of tracks (from the tracks table).


In [2]:
%%sql
SELECT Composer, COUNT(*) as NumberOfTracks
FROM Tracks
WHERE Composer IS NOT NULL
GROUP BY Composer
ORDER BY NumberOfTracks Desc;

 * sqlite:///chinook.db
Done.


Composer,NumberOfTracks
Steve Harris,80
U2,44
Jagger/Richards,35
Billy Corgan,31
Kurt Cobain,26
Bill Berry-Peter Buck-Mike Mills-Michael Stipe,25
The Tea Party,24
Miles Davis,23
Gilberto Gil,23
Chris Cornell,23
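
The `WHERE Composer IS NOT NULL` filter matters because of how COUNT() treats NULLs. The block below is a minimal sketch, assuming Python's built-in `sqlite3` module and a hypothetical three-row in-memory table (not the real Chinook data). It shows that `COUNT(*)` counts every row, `COUNT(Composer)` skips NULLs, and an unfiltered GROUP BY puts the NULL composers into a group of their own.

```python
import sqlite3

# Hypothetical toy data standing in for the tracks table (not real Chinook rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (Name TEXT, Composer TEXT)")
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?)",
    [("Song A", "U2"), ("Song B", "U2"), ("Song C", None)],
)

# COUNT(*) counts every row; COUNT(Composer) ignores rows where Composer is NULL.
all_rows, with_composer = conn.execute(
    "SELECT COUNT(*), COUNT(Composer) FROM tracks"
).fetchone()
print(all_rows, with_composer)  # 3 2

# Without a WHERE filter, the NULL composers become one group of their own.
groups = conn.execute(
    "SELECT Composer, COUNT(*) FROM tracks GROUP BY Composer ORDER BY Composer"
).fetchall()
print(groups)  # [(None, 1), ('U2', 2)]
```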


**Tip:** To handle NULL values, use the [COALESCE function](https://www.postgresqltutorial.com/postgresql-tutorial/postgresql-coalesce/), which returns its first non-NULL argument. The linked tutorial is written for PostgreSQL, but COALESCE behaves the same way in SQLite. For example:
`coalesce(composer, 'Unknown')`

In [3]:
%%sql
-- run this cell to try it out!
SELECT trackid, name, composer, coalesce(composer, 'Unknown') 
FROM tracks
WHERE composer IS null; 
 

 * sqlite:///chinook.db
Done.


TrackId,Name,Composer,"coalesce(composer, 'Unknown')"
2,Balls to the Wall,,Unknown
63,Desafinado,,Unknown
64,Garota De Ipanema,,Unknown
65,Samba De Uma Nota Só (One Note Samba),,Unknown
66,Por Causa De Você,,Unknown
67,Ligia,,Unknown
68,Fotografia,,Unknown
69,Dindi (Dindi),,Unknown
70,Se Todos Fossem Iguais A Você (Instrumental),,Unknown
71,Falando De Amor,,Unknown


#### Ex. 2 
How many songs did each composer write in each genre? Show missing composers as "Unknown".

_(hint: group by both composer and genre)_


In [6]:
%%sql
SELECT COALESCE(Composer, 'Unknown') AS ComposerName,
genres.Name AS GenreName,
COUNT(*) AS NumberOfSongs
FROM tracks
JOIN genres ON tracks.GenreID = genres.GenreID
GROUP BY ComposerName, GenreName
ORDER BY ComposerName, GenreName;


 * sqlite:///chinook.db
Done.


ComposerName,GenreName,NumberOfSongs
"A. F. Iommi, W. Ward, T. Butler, J. Osbourne",Metal,3
A. Jamal,Jazz,1
A.Bouchard/J.Bouchard/S.Pearlman,Metal,1
A.Isbell/A.Jones/O.Redding,Blues,1
AC/DC,Rock,8
Aaron Copland,Classical,1
Aaron Goldberg,Jazz,1
Ace Frehley,Rock,2
"Acyi Marques/Arlindo Bruz/Braço, Beto Sem/Zeca Pagodinho",Latin,1
Acyr Marques/Arlindo Cruz/Franco,Latin,1
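
Grouping by two columns is easier to check by hand on a small table. This is a minimal sketch, assuming Python's built-in `sqlite3` module and a hypothetical in-memory `tracks` table that stores the genre name directly (the real Chinook schema needs the JOIN to `genres` shown above). Each distinct (composer, genre) pair becomes one output row, and COALESCE folds the NULL composers into "Unknown" before grouping.

```python
import sqlite3

# Hypothetical toy data; the Genre column replaces the JOIN used in the real query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (Name TEXT, Composer TEXT, Genre TEXT)")
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?, ?)",
    [
        ("Song A", "U2", "Rock"),
        ("Song B", "U2", "Rock"),
        ("Song C", "U2", "Pop"),
        ("Song D", None, "Rock"),
        ("Song E", None, "Jazz"),
    ],
)

# One output row per distinct (ComposerName, Genre) combination.
rows = conn.execute(
    """
    SELECT COALESCE(Composer, 'Unknown') AS ComposerName,
           Genre,
           COUNT(*) AS NumberOfSongs
    FROM tracks
    GROUP BY ComposerName, Genre
    ORDER BY ComposerName, Genre
    """
).fetchall()

for row in rows:
    print(row)
# ('U2', 'Pop', 1)
# ('U2', 'Rock', 2)
# ('Unknown', 'Jazz', 1)
# ('Unknown', 'Rock', 1)
```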


#### Ex. 3 - SUM()
Which 5 composers wrote the most music, measured by total track length? Show the time in minutes.

In [None]:
%%sql
SELECT Composer, SUM(Milliseconds) / 60000.0 AS TotalTimeInMinutes
FROM tracks
WHERE Composer IS NOT NULL
GROUP BY Composer
ORDER BY TotalTimeInMinutes DESC
LIMIT 5;
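
The query divides by `60000.0` rather than `60000` for a reason: in SQLite, dividing one integer by another truncates the result. The following is a minimal sketch, assuming Python's built-in `sqlite3` module and a hypothetical two-row in-memory table, that compares the two forms.

```python
import sqlite3

# Hypothetical toy data: two tracks of 45 seconds each (90,000 ms in total).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (Name TEXT, Milliseconds INTEGER)")
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?)",
    [("Song A", 45000), ("Song B", 45000)],
)

# Integer / integer truncates; integer / real keeps the fractional part.
int_minutes, real_minutes = conn.execute(
    "SELECT SUM(Milliseconds) / 60000, SUM(Milliseconds) / 60000.0 FROM tracks"
).fetchone()
print(int_minutes, real_minutes)  # 1 1.5
```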

#### Ex. 4 - MAX()
What is the longest song? Show the track name, composer, and time in minutes.

In [9]:
%%sql
SELECT
Name AS TrackName,
Composer,
Milliseconds / 60000.0 AS TimeInMinutes
FROM tracks
ORDER BY Milliseconds DESC
LIMIT 1;

 * sqlite:///chinook.db
Done.


TrackName,Composer,TimeInMinutes
Occupation / Precipice,,88.11588333333333
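
The query above finds the longest track by sorting and taking one row. Since this exercise is about MAX(), here is an alternative sketch that uses MAX() in a subquery, assuming Python's built-in `sqlite3` module and a hypothetical in-memory table (not the real Chinook rows). One difference in behavior: if several tracks tie for the longest, this form returns all of them, while `LIMIT 1` returns only one.

```python
import sqlite3

# Hypothetical toy data standing in for the tracks table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (Name TEXT, Composer TEXT, Milliseconds INTEGER)")
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?, ?)",
    [("Short", "A", 120000), ("Long", None, 300000), ("Medium", "B", 180000)],
)

# The subquery computes the maximum length; the outer query fetches the matching row(s).
longest = conn.execute(
    """
    SELECT Name AS TrackName,
           Composer,
           Milliseconds / 60000.0 AS TimeInMinutes
    FROM tracks
    WHERE Milliseconds = (SELECT MAX(Milliseconds) FROM tracks)
    """
).fetchall()
print(longest)  # [('Long', None, 5.0)]
```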


#### Ex. 5 - HAVING
Show how many songs each composer wrote in each genre, excluding unknown composers and those who wrote fewer than 5 songs.

In [11]:
%%sql
SELECT Composer, genres.Name AS Genre,
COUNT(*) AS NumberOfSongs
FROM tracks
JOIN genres ON tracks.GenreID = genres.GenreID
WHERE Composer IS NOT NULL
GROUP BY Composer, Genre
HAVING COUNT(*) >= 5
ORDER BY Composer, Genre;

 * sqlite:///chinook.db
Done.


Composer,Genre,NumberOfSongs
AC/DC,Rock,8
"Adam Clayton, Bono, Larry Mullen & The Edge",Rock,11
"Adam Clayton, Bono, Larry Mullen, The Edge",Rock,11
Adrian Smith,Metal,5
Adrian Smith/Bruce Dickinson,Metal,5
Adrian Smith/Bruce Dickinson/Steve Harris,Metal,5
Alanis Morissette & Glenn Ballard,Rock,13
Alex Van Halen/David Lee Roth/Edward Van Halen/Michael Anthony,Rock,7
"Angus Young, Malcolm Young, Brian Johnson",Rock,10
"Anthony Kiedis, Flea, John Frusciante, and Chad Smith",Rock,16
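
The query above uses both WHERE and HAVING, and they act at different stages: WHERE removes individual rows before grouping, while HAVING removes whole groups after the aggregate has been computed. The block below is a minimal sketch, assuming Python's built-in `sqlite3` module, a hypothetical in-memory table, and a threshold of 3 instead of 5 to keep the data small.

```python
import sqlite3

# Hypothetical toy data: U2 has 3 tracks, AC/DC has 1, and 3 tracks have no composer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (Name TEXT, Composer TEXT)")
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?)",
    [
        ("Song A", "U2"), ("Song B", "U2"), ("Song C", "U2"),
        ("Song D", "AC/DC"),
        ("Song E", None), ("Song F", None), ("Song G", None),
    ],
)

# WHERE drops the NULL-composer rows first; HAVING then drops groups with fewer than 3 tracks.
filtered = conn.execute(
    """
    SELECT Composer, COUNT(*) AS NumberOfSongs
    FROM tracks
    WHERE Composer IS NOT NULL
    GROUP BY Composer
    HAVING COUNT(*) >= 3
    """
).fetchall()
print(filtered)  # [('U2', 3)]
```

Without the WHERE clause, the NULL group would also pass the HAVING test here, because it contains 3 rows.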


## Practice - Write your Own!
Using any of the other tables in Chinook, write 3 meaningful queries that use GROUP BY to show statistics about the data. 

Please include the question your query is designed to answer.  

#### Question 1
...

In [12]:
%%sql
SELECT BillingCountry AS Country,
ROUND(SUM(Total), 2) AS TotalSales
FROM invoices
GROUP BY BillingCountry
ORDER BY TotalSales DESC;

 * sqlite:///chinook.db
Done.


Country,TotalSales
USA,523.06
Canada,303.96
France,195.1
Brazil,190.1
Germany,156.48
United Kingdom,112.86
Czech Republic,90.24
Portugal,77.24
India,75.26
Chile,46.62


#### Question 2
...

In [14]:
%%sql
SELECT genres.Name AS Genre,
ROUND(AVG(tracks.Milliseconds) / 60000.0, 2) AS AverageTrackLengthMinutes
FROM tracks
JOIN genres ON tracks.GenreID = genres.GenreID
GROUP BY genres.Name
ORDER BY AverageTrackLengthMinutes DESC;

 * sqlite:///chinook.db
Done.


Genre,AverageTrackLengthMinutes
Sci Fi & Fantasy,48.53
Science Fiction,43.76
Drama,42.92
TV Shows,35.75
Comedy,26.42
Metal,5.16
Electronica/Dance,5.05
Heavy Metal,4.96
Classical,4.9
Jazz,4.86


#### Question 3
...

In [None]:
%%sql 
-- Write your query here

### Reflection and Questions

What remaining or new questions do you have?

_response_


### Submission: Commit and Push your Completed Exercises