# SQL querying and selecting data (exercises)

## Preparation

For this section you need the `chinook.db` database file and a working `%sql` magic.  
If you don't have them, please go back to the [previous section](connect_to_database.ipynb) and follow the instructions.  
The following code should not produce any errors:

In [None]:
%load_ext sql
%sql sqlite:///chinook.db

## Exercise: biggest tracks

Print (select) the 10 biggest `tracks`, ranked by the size given in the `Bytes` column.


In [None]:
%sql SELECT * FROM tracks ORDER BY Bytes DESC LIMIT 10

## Exercise: simple filtering

Write a statement to get `tracks` with the `AlbumId` equal to `1` and a length (`Milliseconds` column) greater than 200,000 milliseconds.

In [None]:
%sql SELECT * FROM tracks WHERE AlbumId = 1 AND Milliseconds > 200000

## Exercise: filter with `IN`

Return `customers` whose `State` is `FL` (Florida), `WA` (Washington), or `CA` (California).  
Use `IN`, not a chain of `OR` conditions.

In [None]:
%sql SELECT * FROM customers WHERE State IN ('FL', 'WA', 'CA')
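
To see what `IN` does, the sketch below compares it with the equivalent chain of `OR` comparisons, and with an `AND` chain that can never match. It uses Python's built-in `sqlite3` module and a small in-memory table with made-up rows (an assumption for illustration, not the real `chinook.db` data).

```python
import sqlite3

# Hypothetical toy table, not the real chinook data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (FirstName TEXT, State TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Ana", "FL"), ("Ben", "NY"), ("Cleo", "CA"), ("Dan", None), ("Eve", "WA")],
)

# IN matches a row when State equals any value in the list.
in_rows = conn.execute(
    "SELECT FirstName FROM customers "
    "WHERE State IN ('FL', 'WA', 'CA') ORDER BY FirstName"
).fetchall()

# The same filter written as a chain of OR comparisons.
or_rows = conn.execute(
    "SELECT FirstName FROM customers "
    "WHERE State = 'FL' OR State = 'WA' OR State = 'CA' ORDER BY FirstName"
).fetchall()

# An AND chain asks one value to equal three different strings at once.
and_rows = conn.execute(
    "SELECT FirstName FROM customers "
    "WHERE State = 'FL' AND State = 'WA' AND State = 'CA'"
).fetchall()

print(in_rows)             # [('Ana',), ('Cleo',), ('Eve',)]
print(in_rows == or_rows)  # True
print(and_rows)            # []
```

The row with a missing `State` is not returned by any of the three queries.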

## Exercise: filter for numbers in range

Find `invoices` whose `Total` is between 14.96 and 18.86. Use `BETWEEN`.  
Sort the output by increasing `Total`. Show only these columns: `InvoiceId`, `BillingAddress`, `Total`.

In [None]:
%sql SELECT InvoiceId, BillingAddress, Total FROM invoices WHERE Total BETWEEN 14.96 AND 18.86 ORDER BY Total ASC
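
`BETWEEN` includes both endpoints, so it behaves like `>= low AND <= high`. A minimal sketch of that equivalence, using Python's built-in `sqlite3` module and a made-up in-memory `invoices` table (the rows are hypothetical, chosen so that two totals sit exactly on the bounds):

```python
import sqlite3

# Hypothetical toy table, not the real chinook data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (InvoiceId INTEGER, Total REAL)")
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?)",
    [(1, 13.86), (2, 14.96), (3, 16.5), (4, 18.86), (5, 21.86)],
)

# BETWEEN keeps rows equal to either bound.
between_rows = conn.execute(
    "SELECT InvoiceId FROM invoices "
    "WHERE Total BETWEEN 14.96 AND 18.86 ORDER BY Total"
).fetchall()

# The same range written with explicit comparisons.
explicit_rows = conn.execute(
    "SELECT InvoiceId FROM invoices "
    "WHERE Total >= 14.96 AND Total <= 18.86 ORDER BY Total"
).fetchall()

print(between_rows)                   # [(2,), (3,), (4,)]
print(between_rows == explicit_rows)  # True
```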

## Exercise: filter partially matching words

Find the `tracks` whose `Name` contains a substring of the form: `Br` (two letters), then any single character, then `wn` (two letters).

In [None]:
%sql SELECT * FROM tracks WHERE Name LIKE '%Br_wn%'
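
In a `LIKE` pattern, `_` stands for exactly one character and `%` for any run of characters, including none. The sketch below tries the pattern `'%Br_wn%'` on a few made-up track names (hypothetical data in an in-memory table, accessed through Python's built-in `sqlite3` module) to show which ones match.

```python
import sqlite3

# Hypothetical toy names, not the real chinook tracks.
names = ["Brown Sugar", "Mr. Brewn", "Brwn", "Broken", "Charlie Brawn Jr."]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (Name TEXT)")
conn.executemany("INSERT INTO tracks VALUES (?)", [(n,) for n in names])

# '_' must consume exactly one character between 'Br' and 'wn';
# the surrounding '%' allow anything before and after.
matched = [
    row[0]
    for row in conn.execute(
        "SELECT Name FROM tracks WHERE Name LIKE '%Br_wn%' ORDER BY Name"
    )
]

print(matched)  # ['Brown Sugar', 'Charlie Brawn Jr.', 'Mr. Brewn']
```

`Brwn` is rejected because nothing sits between `Br` and `wn`, and `Broken` is rejected because it does not contain `wn` at all.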

## Exercise: filtering missing values

Find the `customers` who do not have phone numbers. In the result show only the name and the (missing) phone number.

In [None]:
%sql SELECT FirstName, LastName, Phone FROM customers WHERE Phone IS NULL
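
Missing values have to be tested with `IS NULL`; a comparison such as `Phone = NULL` never evaluates to true, so it returns no rows. A small sketch of the difference, using Python's built-in `sqlite3` module and a made-up in-memory `customers` table (the names and phone number are hypothetical):

```python
import sqlite3

# Hypothetical toy table, not the real chinook data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (FirstName TEXT, Phone TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Ana", "+1 555 0100"), ("Ben", None), ("Cleo", None)],  # None becomes NULL
)

# Comparing with '=' against NULL yields NULL, which WHERE treats as not true.
equals_null = conn.execute(
    "SELECT FirstName FROM customers WHERE Phone = NULL"
).fetchall()

# IS NULL is the dedicated test for missing values.
is_null = conn.execute(
    "SELECT FirstName FROM customers WHERE Phone IS NULL ORDER BY FirstName"
).fetchall()

print(equals_null)  # []
print(is_null)      # [('Ben',), ('Cleo',)]
```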

## Exercise: from the database to a Python list

Create a Python variable `bs` to be a list containing all `tracks` sizes as provided in the `Bytes` column.  
Print the `type` of the `bs` variable. Print the first 10 elements of `bs`.

In [None]:
import sqlalchemy as sa
import pandas as pd 
engine = sa.create_engine("sqlite:///chinook.db")

In [None]:
# Read the Bytes column into a DataFrame, then convert it to a plain Python list
df = pd.read_sql("SELECT Bytes FROM tracks", con=engine)
bs = df["Bytes"].tolist()
print(type(bs))
print(bs[:10])
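
As an alternative that does not go through Pandas, a query result can be turned into a list directly. The following is only a sketch under stated assumptions: it uses Python's built-in `sqlite3` module and a tiny in-memory `tracks` table with made-up sizes instead of `chinook.db`, and the name `bs_plain` is hypothetical.

```python
import sqlite3

# Hypothetical toy table with made-up sizes, not the real chinook data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (TrackId INTEGER, Bytes INTEGER)")
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?)",
    [(1, 1000), (2, 2500), (3, 4000)],
)

# fetchall() returns a list of one-element tuples, one per row.
rows = conn.execute("SELECT Bytes FROM tracks ORDER BY TrackId").fetchall()

# Unpack the single column of each row into a flat list of integers.
bs_plain = [row[0] for row in rows]

print(type(bs_plain))  # <class 'list'>
print(bs_plain[:10])   # [1000, 2500, 4000]
```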

## Exercise: from the database to a Pandas data frame

Create a Python variable `df` holding a Pandas `DataFrame` with two columns corresponding to the `Milliseconds` and `Bytes` columns of the `tracks` table. Print `df`.  
You will likely need to:
- Import the `pandas` package.
- Use the `read_sql` function from `pandas`.
- Create a separate connection `engine` with `create_engine`.

In [None]:
engine = sa.create_engine("sqlite:///chinook.db")
df = pd.read_sql("SELECT Milliseconds, Bytes FROM tracks", con=engine)
df