# SQL Exploration

This notebook walks through introductory, intermediate, and advanced SQL queries. We use SQLite to execute the queries and display the results in pandas DataFrames.

## The Data

We will explore the Chinook Database, which is a sample database representing a digital media store.  The following image shows the schema for the tables in the database:

![image](images/ChinookDatabaseSchema.png)

```python
import pandas as pd
import sqlite3

# Connect to the Chinook SQLite database file.
con = sqlite3.connect('ChinookDatabase1.4_Sqlite/Chinook_Sqlite.sqlite')
```

# INTRODUCTORY QUERIES

### 1) How many customers are there in the database?

### 2) List the unique job titles from the Employee table.

### 3) What is the average invoice total? 

### 4) How many customers are located in the United States?

### 5) How many songs have the word "Love" in the title?
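One possible approach is a `LIKE` pattern with wildcards on both sides. The sketch below runs against a tiny in-memory stand-in for the `Track` table, with made-up rows, so it does not depend on the Chinook file. The connection name `demo` is only illustrative.

```python
import sqlite3

# Tiny in-memory stand-in for the Chinook Track table (rows are made up).
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE Track (TrackId INTEGER PRIMARY KEY, Name TEXT)")
demo.executemany(
    "INSERT INTO Track (Name) VALUES (?)",
    [("Love Song",), ("Glove Box",), ("Hate Mail",), ("All You Need Is LOVE",)],
)

# % on both sides matches the substring anywhere in the title.
# SQLite's LIKE is case-insensitive for ASCII, and it also matches "Glove".
query = "SELECT COUNT(*) FROM Track WHERE Name LIKE '%Love%'"
love_count = demo.execute(query).fetchone()[0]
print(love_count)
```

In the notebook itself, the same SQL string could be passed to `pd.read_sql_query(query, con)` to get the result as a DataFrame.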

### 6) List all songs that are longer than 6 minutes.  For clarity, create an additional column that shows song length in seconds.
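A minimal sketch of a computed column, assuming track length is stored in a `Milliseconds` column as in the Chinook schema. The sample rows are invented for illustration.

```python
import sqlite3

# In-memory stand-in for the Track table (rows are made up).
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE Track (TrackId INTEGER PRIMARY KEY, Name TEXT, Milliseconds INTEGER)"
)
demo.executemany(
    "INSERT INTO Track (Name, Milliseconds) VALUES (?, ?)",
    [("Short", 200000), ("Exactly Six", 360000), ("Long", 400500)],
)

# 6 minutes = 6 * 60 * 1000 ms; dividing by 1000.0 forces a real-valued result.
query = """
SELECT Name, Milliseconds, Milliseconds / 1000.0 AS Seconds
FROM Track
WHERE Milliseconds > 6 * 60 * 1000
ORDER BY Milliseconds DESC
"""
long_tracks = demo.execute(query).fetchall()
print(long_tracks)
```

Dividing by the integer `1000` instead would truncate the result to whole seconds, because SQLite performs integer division when both operands are integers.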

### 7) Which song has the largest file size?

### 8) How many tracks do not list a composer?
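Missing values are tested with `IS NULL`, not `= NULL`. The sketch below uses an invented in-memory `Track` table and assumes that a missing composer is stored as `NULL`; an empty string would not be counted.

```python
import sqlite3

# In-memory stand-in for the Track table (rows are made up).
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE Track (TrackId INTEGER PRIMARY KEY, Name TEXT, Composer TEXT)"
)
demo.executemany(
    "INSERT INTO Track (Name, Composer) VALUES (?, ?)",
    [("A", "Someone"), ("B", None), ("C", None), ("D", "")],
)

# IS NULL matches only true NULLs; track "D" has an empty string, not NULL.
query = "SELECT COUNT(*) FROM Track WHERE Composer IS NULL"
no_composer = demo.execute(query).fetchone()[0]
print(no_composer)
```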

### 9) List all invoices from March 2011.
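A sketch of a month filter, assuming `InvoiceDate` is stored as ISO-style text (`YYYY-MM-DD HH:MM:SS`), so that plain string comparison orders dates correctly. The rows are made up.

```python
import sqlite3

# In-memory stand-in for the Invoice table (rows are made up).
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE Invoice (InvoiceId INTEGER PRIMARY KEY, InvoiceDate TEXT, Total REAL)"
)
demo.executemany(
    "INSERT INTO Invoice (InvoiceDate, Total) VALUES (?, ?)",
    [
        ("2011-02-28 00:00:00", 1.98),
        ("2011-03-01 00:00:00", 3.96),
        ("2011-03-31 00:00:00", 5.94),
        ("2011-04-01 00:00:00", 8.91),
    ],
)

# Half-open range: from the first day of March up to, but excluding, April 1.
query = """
SELECT InvoiceId, InvoiceDate, Total
FROM Invoice
WHERE InvoiceDate >= '2011-03-01' AND InvoiceDate < '2011-04-01'
ORDER BY InvoiceDate
"""
march_invoices = demo.execute(query).fetchall()
print(march_invoices)
```

An alternative under the same assumption is `strftime('%Y-%m', InvoiceDate) = '2011-03'`, which reads more directly but applies a function to every row.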

### 10) List all invoices from France, sorted by invoice total in descending order.

# INTERMEDIATE QUERIES

### 1) List the maximum invoice total for each country.

### 2) List the maximum invoice total for each year.

### 3) List the maximum invoice total for each year in each country.

### 4) Refine question 3 to only include invoice totals greater than 7 dollars.
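One way to read this refinement is as a `HAVING` filter on the grouped maximum. The sketch below groups an invented `Invoice` table by year and billing country, assuming dates are stored as ISO-style text so that `strftime` can extract the year.

```python
import sqlite3

# In-memory stand-in for the Invoice table (rows are made up).
demo = sqlite3.connect(":memory:")
demo.execute(
    """CREATE TABLE Invoice (
           InvoiceId INTEGER PRIMARY KEY,
           InvoiceDate TEXT,
           BillingCountry TEXT,
           Total REAL)"""
)
demo.executemany(
    "INSERT INTO Invoice (InvoiceDate, BillingCountry, Total) VALUES (?, ?, ?)",
    [
        ("2009-01-01 00:00:00", "USA", 5.94),
        ("2009-06-01 00:00:00", "USA", 13.86),
        ("2009-03-01 00:00:00", "France", 1.98),
        ("2010-02-01 00:00:00", "France", 8.91),
        ("2010-05-01 00:00:00", "USA", 3.96),
    ],
)

# WHERE filters rows before grouping; HAVING filters the groups afterwards.
query = """
SELECT strftime('%Y', InvoiceDate) AS Year,
       BillingCountry,
       MAX(Total) AS MaxTotal
FROM Invoice
GROUP BY Year, BillingCountry
HAVING MAX(Total) > 7
ORDER BY Year, BillingCountry
"""
max_totals = demo.execute(query).fetchall()
print(max_totals)
```

For this particular question, filtering with `WHERE Total > 7` before grouping would return the same rows, since a group's maximum exceeds 7 exactly when at least one of its invoices does.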

### 5) List each track name along with the corresponding genre name.

### 6) List the longest track for each genre.
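A possible pattern for "top row per group" is to join the table back to a grouped subquery. This sketch uses invented `Genre` and `Track` rows; if two tracks tie for longest in a genre, both are returned.

```python
import sqlite3

# In-memory stand-ins for the Genre and Track tables (rows are made up).
demo = sqlite3.connect(":memory:")
demo.executescript(
    """
    CREATE TABLE Genre (GenreId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Track (TrackId INTEGER PRIMARY KEY, Name TEXT,
                        GenreId INTEGER, Milliseconds INTEGER);
    INSERT INTO Genre VALUES (1, 'Rock'), (2, 'Jazz');
    INSERT INTO Track (Name, GenreId, Milliseconds) VALUES
        ('R1', 1, 200000), ('R2', 1, 300000), ('J1', 2, 250000);
    """
)

# The subquery finds each genre's maximum length; the join keeps matching tracks.
query = """
SELECT g.Name AS Genre, t.Name AS Track, t.Milliseconds
FROM Track AS t
JOIN Genre AS g ON g.GenreId = t.GenreId
JOIN (SELECT GenreId, MAX(Milliseconds) AS MaxMs
      FROM Track
      GROUP BY GenreId) AS m
  ON m.GenreId = t.GenreId AND m.MaxMs = t.Milliseconds
ORDER BY g.Name
"""
longest = demo.execute(query).fetchall()
print(longest)
```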

### 7) List all employee/customer relationships.  Your results should include 2 columns.  The first column should state the employee's first and last name.  The second column should state the customer's first and last name.  If an employee does not serve as a support representative to any customers, the corresponding customer name entry should be null.  If an employee serves as a support representative to multiple customers, each individual relationship should be returned.
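The requirement that employees without customers still appear suggests a `LEFT JOIN` from `Employee` to `Customer`. The sketch below assumes the link is the `SupportRepId` column on `Customer`, as in the Chinook schema, and uses made-up names.

```python
import sqlite3

# In-memory stand-ins for the Employee and Customer tables (rows are made up).
demo = sqlite3.connect(":memory:")
demo.executescript(
    """
    CREATE TABLE Employee (EmployeeId INTEGER PRIMARY KEY,
                           FirstName TEXT, LastName TEXT);
    CREATE TABLE Customer (CustomerId INTEGER PRIMARY KEY,
                           FirstName TEXT, LastName TEXT, SupportRepId INTEGER);
    INSERT INTO Employee VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
    INSERT INTO Customer VALUES (1, 'Cy', 'Fox', 1), (2, 'Di', 'Orr', 1);
    """
)

# LEFT JOIN keeps every employee; || concatenates and yields NULL if any part is NULL.
query = """
SELECT e.FirstName || ' ' || e.LastName AS Employee,
       c.FirstName || ' ' || c.LastName AS Customer
FROM Employee AS e
LEFT JOIN Customer AS c ON c.SupportRepId = e.EmployeeId
ORDER BY e.EmployeeId, c.CustomerId
"""
pairs = demo.execute(query).fetchall()
print(pairs)
```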

### 8) List the total number of tracks from the genre "Classical" included in each playlist.

### 9) List all employee/supervisor relationships.  Your results should include 2 columns.  The first column should state the employee's first and last name.  The second column should state the supervisor's first and last name.  If the employee does not have a supervisor, the corresponding supervisor name entry should be null.
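Because supervisors are themselves employees, one approach is a self join: the `Employee` table is joined to itself under two aliases. The sketch assumes the supervisor link is the `ReportsTo` column, as in the Chinook schema, with invented rows.

```python
import sqlite3

# In-memory stand-in for the Employee table (rows are made up).
demo = sqlite3.connect(":memory:")
demo.executescript(
    """
    CREATE TABLE Employee (EmployeeId INTEGER PRIMARY KEY,
                           FirstName TEXT, LastName TEXT, ReportsTo INTEGER);
    INSERT INTO Employee VALUES (1, 'Ann', 'Lee', NULL), (2, 'Bob', 'Ray', 1);
    """
)

# Alias e is the employee, alias s is the supervisor row that e.ReportsTo points to.
query = """
SELECT e.FirstName || ' ' || e.LastName AS Employee,
       s.FirstName || ' ' || s.LastName AS Supervisor
FROM Employee AS e
LEFT JOIN Employee AS s ON s.EmployeeId = e.ReportsTo
ORDER BY e.EmployeeId
"""
reporting = demo.execute(query).fetchall()
print(reporting)
```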

### 10) Update the Heavy Metal Classic playlist to include every track in the Heavy Metal genre that it does not already contain.  Do not allow tracks to be duplicated in the playlist.
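A sketch of one way to do this: an `INSERT ... SELECT` guarded by `NOT EXISTS`, so that only missing tracks are added. It assumes playlist membership lives in a `PlaylistTrack` link table, as in the Chinook schema, and runs on made-up rows in memory so the real database is not modified.

```python
import sqlite3

# In-memory stand-ins for the relevant tables (rows are made up).
demo = sqlite3.connect(":memory:")
demo.executescript(
    """
    CREATE TABLE Genre (GenreId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Track (TrackId INTEGER PRIMARY KEY, Name TEXT, GenreId INTEGER);
    CREATE TABLE Playlist (PlaylistId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE PlaylistTrack (PlaylistId INTEGER, TrackId INTEGER,
                                PRIMARY KEY (PlaylistId, TrackId));
    INSERT INTO Genre VALUES (1, 'Heavy Metal'), (2, 'Rock');
    INSERT INTO Track VALUES (1, 'M1', 1), (2, 'M2', 1), (3, 'R1', 2);
    INSERT INTO Playlist VALUES (1, 'Heavy Metal Classic');
    INSERT INTO PlaylistTrack VALUES (1, 1);
    """
)

# Pair the playlist with every Heavy Metal track, skipping pairs already present.
insert_sql = """
INSERT INTO PlaylistTrack (PlaylistId, TrackId)
SELECT p.PlaylistId, t.TrackId
FROM Playlist AS p
CROSS JOIN Track AS t
JOIN Genre AS g ON g.GenreId = t.GenreId
WHERE p.Name = 'Heavy Metal Classic'
  AND g.Name = 'Heavy Metal'
  AND NOT EXISTS (SELECT 1
                  FROM PlaylistTrack AS pt
                  WHERE pt.PlaylistId = p.PlaylistId
                    AND pt.TrackId = t.TrackId)
"""
demo.execute(insert_sql)
demo.execute(insert_sql)  # running it again adds nothing

playlist_tracks = [
    row[0]
    for row in demo.execute(
        "SELECT TrackId FROM PlaylistTrack WHERE PlaylistId = 1 ORDER BY TrackId"
    )
]
print(playlist_tracks)
```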

# ADVANCED QUERIES

### 1) Return a table that states the number of tracks that are less than 3 minutes long, between 3 and 5 minutes long, and more than 5 minutes long.
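Bucketing like this can be sketched with a `CASE` expression that is then grouped on. The boundary handling here is an assumption: tracks of exactly 3 or exactly 5 minutes fall into the middle bucket. The rows and bucket labels are invented.

```python
import sqlite3

# In-memory stand-in for the Track table (rows are made up).
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE Track (TrackId INTEGER PRIMARY KEY, Milliseconds INTEGER)")
demo.executemany(
    "INSERT INTO Track (Milliseconds) VALUES (?)",
    [(100000,), (180000,), (240000,), (300000,), (400000,)],
)

# 3 minutes = 180000 ms and 5 minutes = 300000 ms; CASE takes the first matching branch.
query = """
SELECT CASE
           WHEN Milliseconds < 180000 THEN 'under 3 min'
           WHEN Milliseconds <= 300000 THEN '3 to 5 min'
           ELSE 'over 5 min'
       END AS LengthBucket,
       COUNT(*) AS Tracks
FROM Track
GROUP BY LengthBucket
ORDER BY MIN(Milliseconds)
"""
buckets = demo.execute(query).fetchall()
print(buckets)
```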

### 2) Return a list of all `InvoiceId` values where the invoice total is greater than the average total for all invoices.
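An aggregate cannot appear directly in `WHERE`, so a scalar subquery is one common way to compare each row against the overall average. The sketch uses three made-up invoices.

```python
import sqlite3

# In-memory stand-in for the Invoice table (rows are made up).
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE Invoice (InvoiceId INTEGER PRIMARY KEY, Total REAL)")
demo.executemany("INSERT INTO Invoice (Total) VALUES (?)", [(1.0,), (2.0,), (9.0,)])

# The subquery is evaluated once and yields a single value (here 4.0).
query = """
SELECT InvoiceId
FROM Invoice
WHERE Total > (SELECT AVG(Total) FROM Invoice)
ORDER BY InvoiceId
"""
above_average = [row[0] for row in demo.execute(query)]
print(above_average)
```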

### 3) What proportion of customers are located in the United States?
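In SQLite a comparison evaluates to 1 or 0, so averaging it gives a proportion directly. This sketch assumes the country is spelled `'USA'` in the `Country` column; the rows are invented.

```python
import sqlite3

# In-memory stand-in for the Customer table (rows are made up).
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE Customer (CustomerId INTEGER PRIMARY KEY, Country TEXT)")
demo.executemany(
    "INSERT INTO Customer (Country) VALUES (?)",
    [("USA",), ("USA",), ("Canada",), ("France",)],
)

# (Country = 'USA') is 1 for matching rows and 0 otherwise; AVG returns a float.
query = "SELECT AVG(Country = 'USA') AS UsProportion FROM Customer"
us_proportion = demo.execute(query).fetchone()[0]
print(us_proportion)
```

A more portable formulation divides two counts, for example `SUM(CASE WHEN Country = 'USA' THEN 1.0 ELSE 0 END) / COUNT(*)`.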

### 4) Return all columns of the customer table for customers that reside in the country that has the greatest number of invoices.
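One reading of this question, taken here as an assumption, is that an invoice's country is its `BillingCountry`. A scalar subquery can then pick the top country, and the outer query filters customers by it. The tables below are reduced, made-up versions of `Customer` and `Invoice`.

```python
import sqlite3

# In-memory stand-ins for the Customer and Invoice tables (rows are made up).
demo = sqlite3.connect(":memory:")
demo.executescript(
    """
    CREATE TABLE Customer (CustomerId INTEGER PRIMARY KEY,
                           FirstName TEXT, Country TEXT);
    CREATE TABLE Invoice (InvoiceId INTEGER PRIMARY KEY,
                          CustomerId INTEGER, BillingCountry TEXT);
    INSERT INTO Customer VALUES (1, 'Ann', 'USA'), (2, 'Bob', 'Canada'), (3, 'Cy', 'USA');
    INSERT INTO Invoice (CustomerId, BillingCountry) VALUES
        (1, 'USA'), (3, 'USA'), (2, 'Canada');
    """
)

# The subquery ranks countries by invoice count and keeps only the first one.
query = """
SELECT *
FROM Customer
WHERE Country = (SELECT BillingCountry
                 FROM Invoice
                 GROUP BY BillingCountry
                 ORDER BY COUNT(*) DESC
                 LIMIT 1)
ORDER BY CustomerId
"""
top_country_customers = demo.execute(query).fetchall()
print(top_country_customers)
```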

### 5) What were the 10 most purchased songs during the year in which total revenue was the highest?
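This question combines two steps: find the year with the highest revenue, then rank tracks by purchases within that year. A common table expression is one way to keep the steps separate. The sketch assumes purchases are recorded in `InvoiceLine` with a `Quantity` column, as in the Chinook schema, and that dates are ISO-style text. All rows are made up.

```python
import sqlite3

# In-memory stand-ins for Track, Invoice, and InvoiceLine (rows are made up).
demo = sqlite3.connect(":memory:")
demo.executescript(
    """
    CREATE TABLE Track (TrackId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Invoice (InvoiceId INTEGER PRIMARY KEY,
                          InvoiceDate TEXT, Total REAL);
    CREATE TABLE InvoiceLine (InvoiceLineId INTEGER PRIMARY KEY,
                              InvoiceId INTEGER, TrackId INTEGER, Quantity INTEGER);
    INSERT INTO Track VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO Invoice VALUES
        (1, '2009-01-01 00:00:00', 2.0),
        (2, '2010-01-01 00:00:00', 5.0),
        (3, '2010-06-01 00:00:00', 3.0);
    INSERT INTO InvoiceLine (InvoiceId, TrackId, Quantity) VALUES
        (1, 1, 1), (2, 1, 1), (2, 2, 1), (3, 2, 1), (1, 3, 1);
    """
)

# Step 1 (BestYear): the year whose invoices sum to the highest total.
# Step 2: count purchases per track, restricted to invoices from that year.
query = """
WITH BestYear AS (
    SELECT strftime('%Y', InvoiceDate) AS Year
    FROM Invoice
    GROUP BY Year
    ORDER BY SUM(Total) DESC
    LIMIT 1
)
SELECT t.Name, SUM(il.Quantity) AS Purchases
FROM InvoiceLine AS il
JOIN Invoice AS i ON i.InvoiceId = il.InvoiceId
JOIN Track AS t ON t.TrackId = il.TrackId
WHERE strftime('%Y', i.InvoiceDate) = (SELECT Year FROM BestYear)
GROUP BY t.TrackId, t.Name
ORDER BY Purchases DESC, t.Name
LIMIT 10
"""
top_tracks = demo.execute(query).fetchall()
print(top_tracks)
```

With `LIMIT 10`, ties at the cutoff are broken by track name here; that tie-breaking rule is a choice made for this sketch, not something the question specifies.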