# SQL in Python
<img src=".\img\image-4.png" alt="EDA Path"
    title="A typical EDA path" width="600" height="300" />
## Connecting to the database

### Documentation for the sqlite3 library we are going to use:
https://docs.python.org/3/library/sqlite3.html


In [4]:
# Import the libraries
import pandas as pd
import sqlite3

In [5]:
# Connect to the chinook.db database
path = 'chinook.db'
connection = sqlite3.connect(path)

# Get a cursor, which we will use to run the queries
curs = connection.cursor()  # the cursor sends each query to the database and brings back the results

In [6]:
# Create a simple query
query = """
SELECT *
FROM genres
"""

In [7]:
# Run the query with the cursor created earlier
my_query = curs.execute(query)
my_query

<sqlite3.Cursor at 0x23c0addac70>

In [8]:
my_query.fetchall()  # Returns the data as a list of tuples

[(1, 'Rock'),
 (2, 'Jazz'),
 (3, 'Metal'),
 (4, 'Alternative & Punk'),
 (5, 'Rock And Roll'),
 (6, 'Blues'),
 (7, 'Latin'),
 (8, 'Reggae'),
 (9, 'Pop'),
 (10, 'Soundtrack'),
 (11, 'Bossa Nova'),
 (12, 'Easy Listening'),
 (13, 'Heavy Metal'),
 (14, 'R&B/Soul'),
 (15, 'Electronica/Dance'),
 (16, 'World'),
 (17, 'Hip Hop/Rap'),
 (18, 'Science Fiction'),
 (19, 'TV Shows'),
 (20, 'Sci Fi & Fantasy'),
 (21, 'Drama'),
 (22, 'Comedy'),
 (23, 'Alternative'),
 (24, 'Classical'),
 (25, 'Opera')]
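
To make the cursor's behaviour concrete, here is a minimal sketch that runs against a throwaway in-memory database instead of `chinook.db`. The table contents and the names `demo_conn` and `demo_curs` are assumptions made up for illustration; only the `sqlite3` calls themselves are the standard library API.

```python
import sqlite3

# In-memory database used only for illustration (not chinook.db)
demo_conn = sqlite3.connect(":memory:")
demo_curs = demo_conn.cursor()

demo_curs.execute("CREATE TABLE genres (GenreId INTEGER PRIMARY KEY, Name TEXT)")
demo_curs.executemany(
    "INSERT INTO genres (GenreId, Name) VALUES (?, ?)",
    [(1, "Rock"), (2, "Jazz"), (3, "Metal")],
)

demo_curs.execute("SELECT * FROM genres ORDER BY GenreId")

# Column names are the first item of each entry in cursor.description
col_names = [d[0] for d in demo_curs.description]

first_row = demo_curs.fetchone()   # a single tuple
remaining = demo_curs.fetchall()   # list of tuples with the rows not yet fetched
exhausted = demo_curs.fetchall()   # nothing is left to fetch, so this is []

print(col_names, first_row, remaining, exhausted)
demo_conn.close()
```

The point to notice is that fetching consumes the result: once `fetchall()` has returned the rows, calling it again on the same cursor gives an empty list until the query is executed again.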

In [9]:
# This function runs the query and returns the data as a pandas DataFrame
def sql_query(query):
    
    curs.execute(query)

    datos_query = curs.fetchall() 

    col_names = [description[0] for description in curs.description]

    return pd.DataFrame(datos_query, columns= col_names)

In [10]:
sql_query(query)

Unnamed: 0,GenreId,Name
0,1,Rock
1,2,Jazz
2,3,Metal
3,4,Alternative & Punk
4,5,Rock And Roll
5,6,Blues
6,7,Latin
7,8,Reggae
8,9,Pop
9,10,Soundtrack


# We can also get the same result directly with pandas (below)

In [11]:
pd.read_sql_query(query, connection)  # Here we skip the cursor: we pass the query and the connection object directly

# The query is the specific statement that we send to SQL

Unnamed: 0,GenreId,Name
0,1,Rock
1,2,Jazz
2,3,Metal
3,4,Alternative & Punk
4,5,Rock And Roll
5,6,Blues
6,7,Latin
7,8,Reggae
8,9,Pop
9,10,Soundtrack


## Now we can start the chinook exercises:
Before querying a database we need to know what it contains, and the best way to find out is to look at its **data model**

![imagen](./img/chinook_data_model.png)

### 1. Invoices of customers from Brazil: customer name, invoice Id, invoice date and billing country

In [12]:
query = """
SELECT *
FROM customers, invoices
WHERE customers.CustomerId = invoices.CustomerId
"""
# CustomerId is the primary key of customers; to relate the two tables, that key must be present in both (in invoices it acts as a foreign key).

pd.read_sql_query(query, connection)

# IMPORTANT: this performs a JOIN of the 2 tables and shows the intersection, i.e. only the rows whose CustomerId is present in both tables

Unnamed: 0,CustomerId,FirstName,LastName,Company,Address,City,State,Country,PostalCode,Phone,...,SupportRepId,InvoiceId,CustomerId.1,InvoiceDate,BillingAddress,BillingCity,BillingState,BillingCountry,BillingPostalCode,Total
0,1,Luís,Gonçalves,Embraer - Empresa Brasileira de Aeronáutica S.A.,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,+55 (12) 3923-5555,...,3,98,1,2010-03-11 00:00:00,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,3.98
1,1,Luís,Gonçalves,Embraer - Empresa Brasileira de Aeronáutica S.A.,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,+55 (12) 3923-5555,...,3,121,1,2010-06-13 00:00:00,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,3.96
2,1,Luís,Gonçalves,Embraer - Empresa Brasileira de Aeronáutica S.A.,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,+55 (12) 3923-5555,...,3,143,1,2010-09-15 00:00:00,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,5.94
3,1,Luís,Gonçalves,Embraer - Empresa Brasileira de Aeronáutica S.A.,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,+55 (12) 3923-5555,...,3,195,1,2011-05-06 00:00:00,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,0.99
4,1,Luís,Gonçalves,Embraer - Empresa Brasileira de Aeronáutica S.A.,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,+55 (12) 3923-5555,...,3,316,1,2012-10-27 00:00:00,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,1.98
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
407,59,Puja,Srivastava,,"3,Raj Bhavan Road",Bangalore,,India,560001,+91 080 22289999,...,3,45,59,2009-07-08 00:00:00,"3,Raj Bhavan Road",Bangalore,,India,560001,5.94
408,59,Puja,Srivastava,,"3,Raj Bhavan Road",Bangalore,,India,560001,+91 080 22289999,...,3,97,59,2010-02-26 00:00:00,"3,Raj Bhavan Road",Bangalore,,India,560001,1.99
409,59,Puja,Srivastava,,"3,Raj Bhavan Road",Bangalore,,India,560001,+91 080 22289999,...,3,218,59,2011-08-20 00:00:00,"3,Raj Bhavan Road",Bangalore,,India,560001,1.98
410,59,Puja,Srivastava,,"3,Raj Bhavan Road",Bangalore,,India,560001,+91 080 22289999,...,3,229,59,2011-09-30 00:00:00,"3,Raj Bhavan Road",Bangalore,,India,560001,13.86


In [13]:
query = """
SELECT *
FROM customers c, invoices i
WHERE c.CustomerId = i.CustomerId
"""
# Same as above, but using aliases for customers (c) and invoices (i)

pd.read_sql_query(query, connection)

# IMPORTANT: this performs a JOIN of the 2 tables and shows the intersection, i.e. only the rows whose CustomerId is present in both tables

Unnamed: 0,CustomerId,FirstName,LastName,Company,Address,City,State,Country,PostalCode,Phone,...,SupportRepId,InvoiceId,CustomerId.1,InvoiceDate,BillingAddress,BillingCity,BillingState,BillingCountry,BillingPostalCode,Total
0,1,Luís,Gonçalves,Embraer - Empresa Brasileira de Aeronáutica S.A.,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,+55 (12) 3923-5555,...,3,98,1,2010-03-11 00:00:00,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,3.98
1,1,Luís,Gonçalves,Embraer - Empresa Brasileira de Aeronáutica S.A.,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,+55 (12) 3923-5555,...,3,121,1,2010-06-13 00:00:00,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,3.96
2,1,Luís,Gonçalves,Embraer - Empresa Brasileira de Aeronáutica S.A.,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,+55 (12) 3923-5555,...,3,143,1,2010-09-15 00:00:00,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,5.94
3,1,Luís,Gonçalves,Embraer - Empresa Brasileira de Aeronáutica S.A.,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,+55 (12) 3923-5555,...,3,195,1,2011-05-06 00:00:00,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,0.99
4,1,Luís,Gonçalves,Embraer - Empresa Brasileira de Aeronáutica S.A.,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,+55 (12) 3923-5555,...,3,316,1,2012-10-27 00:00:00,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,1.98
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
407,59,Puja,Srivastava,,"3,Raj Bhavan Road",Bangalore,,India,560001,+91 080 22289999,...,3,45,59,2009-07-08 00:00:00,"3,Raj Bhavan Road",Bangalore,,India,560001,5.94
408,59,Puja,Srivastava,,"3,Raj Bhavan Road",Bangalore,,India,560001,+91 080 22289999,...,3,97,59,2010-02-26 00:00:00,"3,Raj Bhavan Road",Bangalore,,India,560001,1.99
409,59,Puja,Srivastava,,"3,Raj Bhavan Road",Bangalore,,India,560001,+91 080 22289999,...,3,218,59,2011-08-20 00:00:00,"3,Raj Bhavan Road",Bangalore,,India,560001,1.98
410,59,Puja,Srivastava,,"3,Raj Bhavan Road",Bangalore,,India,560001,+91 080 22289999,...,3,229,59,2011-09-30 00:00:00,"3,Raj Bhavan Road",Bangalore,,India,560001,13.86
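
The comma-plus-`WHERE` form used so far is an implicit inner join. The sketch below, on a small made-up in-memory database (table contents are assumptions for illustration, not chinook data), checks that it returns the same rows as an explicit `JOIN ... ON`, and shows what happens if the `WHERE` condition is forgotten.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (CustomerId INTEGER PRIMARY KEY, FirstName TEXT);
CREATE TABLE invoices (InvoiceId INTEGER PRIMARY KEY, CustomerId INTEGER, Total REAL);
INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cleo');
INSERT INTO invoices VALUES (10, 1, 3.98), (11, 1, 0.99), (12, 2, 5.94);
""")

# Implicit join: tables separated by a comma, condition in WHERE
implicit = conn.execute("""
    SELECT c.FirstName, i.InvoiceId
    FROM customers c, invoices i
    WHERE c.CustomerId = i.CustomerId
    ORDER BY i.InvoiceId
""").fetchall()

# Explicit join: same condition, written in the ON clause
explicit = conn.execute("""
    SELECT c.FirstName, i.InvoiceId
    FROM customers c
    JOIN invoices i ON c.CustomerId = i.CustomerId
    ORDER BY i.InvoiceId
""").fetchall()

# Without the WHERE condition the comma form is a Cartesian product: 3 x 3 rows
cartesian = conn.execute("SELECT COUNT(*) FROM customers, invoices").fetchone()[0]

print(implicit == explicit, cartesian)
conn.close()
```

In this toy data the customer without invoices ('Cleo') is absent from both results, which is the "intersection" behaviour described in the comments above.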


In [14]:
query = """
SELECT c.*, InvoiceDate
FROM customers c, invoices i
WHERE c.CustomerId = i.CustomerId
"""

# Now we take all the columns from customers and only the InvoiceDate column from invoices

pd.read_sql_query(query, connection)


Unnamed: 0,CustomerId,FirstName,LastName,Company,Address,City,State,Country,PostalCode,Phone,Fax,Email,SupportRepId,InvoiceDate
0,1,Luís,Gonçalves,Embraer - Empresa Brasileira de Aeronáutica S.A.,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,+55 (12) 3923-5555,+55 (12) 3923-5566,luisg@embraer.com.br,3,2010-03-11 00:00:00
1,1,Luís,Gonçalves,Embraer - Empresa Brasileira de Aeronáutica S.A.,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,+55 (12) 3923-5555,+55 (12) 3923-5566,luisg@embraer.com.br,3,2010-06-13 00:00:00
2,1,Luís,Gonçalves,Embraer - Empresa Brasileira de Aeronáutica S.A.,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,+55 (12) 3923-5555,+55 (12) 3923-5566,luisg@embraer.com.br,3,2010-09-15 00:00:00
3,1,Luís,Gonçalves,Embraer - Empresa Brasileira de Aeronáutica S.A.,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,+55 (12) 3923-5555,+55 (12) 3923-5566,luisg@embraer.com.br,3,2011-05-06 00:00:00
4,1,Luís,Gonçalves,Embraer - Empresa Brasileira de Aeronáutica S.A.,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,+55 (12) 3923-5555,+55 (12) 3923-5566,luisg@embraer.com.br,3,2012-10-27 00:00:00
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
407,59,Puja,Srivastava,,"3,Raj Bhavan Road",Bangalore,,India,560001,+91 080 22289999,,puja_srivastava@yahoo.in,3,2009-07-08 00:00:00
408,59,Puja,Srivastava,,"3,Raj Bhavan Road",Bangalore,,India,560001,+91 080 22289999,,puja_srivastava@yahoo.in,3,2010-02-26 00:00:00
409,59,Puja,Srivastava,,"3,Raj Bhavan Road",Bangalore,,India,560001,+91 080 22289999,,puja_srivastava@yahoo.in,3,2011-08-20 00:00:00
410,59,Puja,Srivastava,,"3,Raj Bhavan Road",Bangalore,,India,560001,+91 080 22289999,,puja_srivastava@yahoo.in,3,2011-09-30 00:00:00


In [15]:
query = """
SELECT c.FirstName||' '||c.LastName as 'Full Name', i.InvoiceId, i.InvoiceDate, i.BillingCountry
FROM customers c
JOIN invoices i ON c.CustomerId = i.CustomerId
WHERE c.Country = 'Brazil'
"""

# This time we do it with the JOIN clause

pd.read_sql_query(query, connection).head()  # head() shows only the first rows

# The same can be done with WHERE, adding as many conditions as we like, combined with AND and OR (on one line or one per line).


Unnamed: 0,Full Name,InvoiceId,InvoiceDate,BillingCountry
0,Luís Gonçalves,98,2010-03-11 00:00:00,Brazil
1,Luís Gonçalves,121,2010-06-13 00:00:00,Brazil
2,Luís Gonçalves,143,2010-09-15 00:00:00,Brazil
3,Luís Gonçalves,195,2011-05-06 00:00:00,Brazil
4,Luís Gonçalves,316,2012-10-27 00:00:00,Brazil
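
When the country is not fixed in advance, it is usually safer to pass it as a parameter than to paste it into the SQL string. The following is a minimal sketch on a made-up in-memory database; `invoices_for_country` is a hypothetical helper, not part of the notebook, and the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (CustomerId INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT, Country TEXT);
CREATE TABLE invoices (InvoiceId INTEGER PRIMARY KEY, CustomerId INTEGER, BillingCountry TEXT);
INSERT INTO customers VALUES (1, 'Ana', 'Silva', 'Brazil'), (2, 'Ben', 'Roy', 'Canada');
INSERT INTO invoices VALUES (10, 1, 'Brazil'), (11, 2, 'Canada'), (12, 1, 'Brazil');
""")


def invoices_for_country(connection, country):
    """Hypothetical helper: invoices of the customers from one country."""
    query = """
        SELECT c.FirstName || ' ' || c.LastName AS full_name, i.InvoiceId
        FROM customers c
        JOIN invoices i ON c.CustomerId = i.CustomerId
        WHERE c.Country = ?
        ORDER BY i.InvoiceId
    """
    # The value is bound to the ? placeholder by sqlite3, not formatted into the string
    return connection.execute(query, (country,)).fetchall()


brazil_rows = invoices_for_country(conn, "Brazil")
print(brazil_rows)
conn.close()
```

`pd.read_sql_query` also accepts a `params` argument, so the same placeholder style can be used when loading the result straight into a DataFrame.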


### 2. Invoices of customers from Brazil

In [16]:
query = """
SELECT i.*
FROM customers c
JOIN invoices i ON c.CustomerId = i.CustomerId
WHERE c.country = 'Brazil'
"""

# Select everything (i.*) from the invoices table, as long as the customer's country is Brazil.

pd.read_sql_query(query, connection).head()

Unnamed: 0,InvoiceId,CustomerId,InvoiceDate,BillingAddress,BillingCity,BillingState,BillingCountry,BillingPostalCode,Total
0,98,1,2010-03-11 00:00:00,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,3.98
1,121,1,2010-06-13 00:00:00,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,3.96
2,143,1,2010-09-15 00:00:00,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,5.94
3,195,1,2011-05-06 00:00:00,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,0.99
4,316,1,2012-10-27 00:00:00,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,1.98


### 3. Show each invoice associated with each sales agent, with the agent's full name.

In [17]:
query = """
SELECT i.*, e.FirstName||' '||e.LastName as 'Full Name'
FROM employees e
JOIN customers c ON e.EmployeeId = c.SupportRepId
JOIN invoices i ON c.CustomerId = i.CustomerId 
"""


pd.read_sql_query(query, connection)

Unnamed: 0,InvoiceId,CustomerId,InvoiceDate,BillingAddress,BillingCity,BillingState,BillingCountry,BillingPostalCode,Total,Full Name
0,98,1,2010-03-11 00:00:00,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,3.98,Jane Peacock
1,121,1,2010-06-13 00:00:00,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,3.96,Jane Peacock
2,143,1,2010-09-15 00:00:00,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,5.94,Jane Peacock
3,195,1,2011-05-06 00:00:00,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,0.99,Jane Peacock
4,316,1,2012-10-27 00:00:00,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,1.98,Jane Peacock
...,...,...,...,...,...,...,...,...,...,...
407,88,57,2010-01-13 00:00:00,"Calle Lira, 198",Santiago,,Chile,,17.91,Steve Johnson
408,217,57,2011-08-20 00:00:00,"Calle Lira, 198",Santiago,,Chile,,1.98,Steve Johnson
409,240,57,2011-11-22 00:00:00,"Calle Lira, 198",Santiago,,Chile,,3.96,Steve Johnson
410,262,57,2012-02-24 00:00:00,"Calle Lira, 198",Santiago,,Chile,,5.94,Steve Johnson


In [18]:
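# ANOTHER WAY: omit the second JOIN, list invoices after a comma and combine both conditions with AND.
# SQLite accepts this form; the two-JOIN version above is the more conventional spelling.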
query = """
SELECT i.*, e.FirstName||' '||e.LastName as 'Full Name'
FROM employees e
JOIN customers c, invoices i ON e.EmployeeId = c.SupportRepId AND i.CustomerId = c.CustomerId
"""


pd.read_sql_query(query, connection)

Unnamed: 0,InvoiceId,CustomerId,InvoiceDate,BillingAddress,BillingCity,BillingState,BillingCountry,BillingPostalCode,Total,Full Name
0,98,1,2010-03-11 00:00:00,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,3.98,Jane Peacock
1,121,1,2010-06-13 00:00:00,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,3.96,Jane Peacock
2,143,1,2010-09-15 00:00:00,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,5.94,Jane Peacock
3,195,1,2011-05-06 00:00:00,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,0.99,Jane Peacock
4,316,1,2012-10-27 00:00:00,"Av. Brigadeiro Faria Lima, 2170",São José dos Campos,SP,Brazil,12227-000,1.98,Jane Peacock
...,...,...,...,...,...,...,...,...,...,...
407,88,57,2010-01-13 00:00:00,"Calle Lira, 198",Santiago,,Chile,,17.91,Steve Johnson
408,217,57,2011-08-20 00:00:00,"Calle Lira, 198",Santiago,,Chile,,1.98,Steve Johnson
409,240,57,2011-11-22 00:00:00,"Calle Lira, 198",Santiago,,Chile,,3.96,Steve Johnson
410,262,57,2012-02-24 00:00:00,"Calle Lira, 198",Santiago,,Chile,,5.94,Steve Johnson


### 4. For each invoice, show the customer name, the country, the agent name and the total

In [19]:
query = """
SELECT c.FirstName||' '||c.LastName as 'Full Name Cliente', c.Country, e.FirstName||' '||e.LastName as 'Full Name Agente', i.Total
FROM invoices i
JOIN customers c ON c.CustomerId = i.CustomerId
JOIN employees e ON c.SupportRepId = e.EmployeeId
"""
pd.read_sql_query(query, connection)

Unnamed: 0,Full Name Cliente,Country,Full Name Agente,Total
0,Luís Gonçalves,Brazil,Jane Peacock,3.98
1,Luís Gonçalves,Brazil,Jane Peacock,3.96
2,Luís Gonçalves,Brazil,Jane Peacock,5.94
3,Luís Gonçalves,Brazil,Jane Peacock,0.99
4,Luís Gonçalves,Brazil,Jane Peacock,1.98
...,...,...,...,...
407,Puja Srivastava,India,Jane Peacock,5.94
408,Puja Srivastava,India,Jane Peacock,1.99
409,Puja Srivastava,India,Jane Peacock,1.98
410,Puja Srivastava,India,Jane Peacock,13.86


### 5. Show each invoice line item with the name of the song.

In [20]:
query = """
SELECT *
FROM invoice_items it
JOIN tracks t ON t.TrackId = it.TrackId
"""
pd.read_sql_query(query, connection)

Unnamed: 0,InvoiceLineId,InvoiceId,TrackId,UnitPrice,Quantity,TrackId.1,Name,AlbumId,MediaTypeId,GenreId,Composer,Milliseconds,Bytes,UnitPrice.1
0,1,1,2,0.99,1,2,Balls to the Wall,2,2,1,,342562,5510424,0.99
1,2,1,4,0.99,1,4,Restless and Wild,3,2,1,"F. Baltes, R.A. Smith-Diesel, S. Kaufman, U. D...",252051,4331779,0.99
2,3,2,6,0.99,1,6,Put The Finger On You,1,1,1,"Angus Young, Malcolm Young, Brian Johnson",205662,6713451,0.99
3,4,2,8,0.99,1,8,Inject The Venom,1,1,1,"Angus Young, Malcolm Young, Brian Johnson",210834,6852860,0.99
4,5,2,10,0.99,1,10,Evil Walks,1,1,1,"Angus Young, Malcolm Young, Brian Johnson",263497,8611245,0.99
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2235,2236,411,3136,0.99,1,3136,Looking For Love,141,1,3,Sykes,391941,12769847,0.99
2236,2237,411,3145,0.99,1,3145,Sweet Lady Luck,141,1,3,Vandenberg,273737,8919163,0.99
2237,2238,411,3154,0.99,1,3154,Feirinha da Pavuna/Luz do Repente/Bagaço da La...,248,1,7,"Arlindo Cruz/Franco/Marquinhos PQD/Negro, Jove...",107206,3593684,0.99
2238,2239,411,3163,0.99,1,3163,Samba pras moças,248,1,7,Grazielle/Roque Ferreira,152816,5121366,0.99


In [21]:
query = """
SELECT it.InvoiceId, t.Name as 'Song Name'
FROM invoice_items it
JOIN tracks t ON t.TrackId = it.TrackId
"""
pd.read_sql_query(query, connection)

Unnamed: 0,InvoiceId,Song Name
0,1,Balls to the Wall
1,1,Restless and Wild
2,2,Put The Finger On You
3,2,Inject The Venom
4,2,Evil Walks
...,...,...
2235,411,Looking For Love
2236,411,Sweet Lady Luck
2237,411,Feirinha da Pavuna/Luz do Repente/Bagaço da La...
2238,411,Samba pras moças


### 6. Show all songs with their name, format, album and genre.

In [22]:
query = """
SELECT t.Name as 'Song Name', mt.Name as 'Format', a.Title as 'Album', g.Name as 'Genre'
FROM tracks t
JOIN albums a ON a.AlbumId = t.AlbumId
JOIN genres g ON g.GenreId = t.GenreId
JOIN media_types mt ON mt.MediaTypeId = t.MediaTypeId
"""
pd.read_sql_query(query, connection)

Unnamed: 0,Song Name,Format,Album,Genre
0,For Those About To Rock (We Salute You),MPEG audio file,For Those About To Rock We Salute You,Rock
1,Balls to the Wall,Protected AAC audio file,Balls to the Wall,Rock
2,Fast As a Shark,Protected AAC audio file,Restless and Wild,Rock
3,Restless and Wild,Protected AAC audio file,Restless and Wild,Rock
4,Princess of the Dawn,Protected AAC audio file,Restless and Wild,Rock
...,...,...,...,...
3498,Pini Di Roma (Pinien Von Rom) \ I Pini Della V...,Protected AAC audio file,Respighi:Pines of Rome,Classical
3499,"String Quartet No. 12 in C Minor, D. 703 ""Quar...",Protected AAC audio file,Schubert: The Late String Quartets & String Qu...,Classical
3500,"L'orfeo, Act 3, Sinfonia (Orchestra)",Protected AAC audio file,Monteverdi: L'Orfeo,Classical
3501,"Quintet for Horn, Violin, 2 Violas, and Cello ...",Protected AAC audio file,Mozart: Chamber Music,Classical


### 7. Show how many songs are in each playlist, along with the name of each playlist.

### Borja's solution: count the rows per ID. This avoids the problem of repeated playlist names, such as "Music", because each playlist has a different Id.

In [23]:
query = """
SELECT pl.Name as 'Playlist',COUNT(*) as 'Nº Canciones'
FROM playlists pl
JOIN playlist_track pt ON pt.playlistId = pl.playlistId
GROUP BY pl.PlaylistID
"""
pd.read_sql_query(query, connection)

Unnamed: 0,Playlist,Nº Canciones
0,Music,3290
1,TV Shows,213
2,90’s Music,1477
3,Music,3290
4,Music Videos,1
5,TV Shows,213
6,Brazilian Music,39
7,Classical,75
8,Classical 101 - Deep Cuts,25
9,Classical 101 - Next Steps,25
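
The difference between grouping by the Id and grouping by the name can be reproduced on a tiny made-up database. This is only a sketch: the playlists and track links below are invented, with two playlists deliberately sharing the name 'Music'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE playlists (PlaylistId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE playlist_track (PlaylistId INTEGER, TrackId INTEGER);
INSERT INTO playlists VALUES (1, 'Music'), (2, 'Grunge'), (8, 'Music');
INSERT INTO playlist_track VALUES (1, 1), (1, 2), (1, 3), (8, 1), (8, 2), (8, 3), (2, 1);
""")

# Grouping by the primary key keeps the two 'Music' playlists apart
by_id = conn.execute("""
    SELECT pl.PlaylistId, pl.Name, COUNT(*)
    FROM playlists pl
    JOIN playlist_track pt ON pt.PlaylistId = pl.PlaylistId
    GROUP BY pl.PlaylistId
    ORDER BY pl.PlaylistId
""").fetchall()

# Grouping by name merges them into one row and adds their counts together
by_name = conn.execute("""
    SELECT pl.Name, COUNT(*)
    FROM playlists pl
    JOIN playlist_track pt ON pt.PlaylistId = pl.PlaylistId
    GROUP BY pl.Name
    ORDER BY pl.Name
""").fetchall()

print(by_id)
print(by_name)
conn.close()
```

This mirrors what the chinook outputs show: 'Music' appears twice with 3290 songs when grouped by Id, and once with 6580 when grouped by name.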


In [199]:
query = """
SELECT pl.Name as 'Nombre Playlist', COUNT(*) as 'N° Canciones'
FROM playlist_track pt
LEFT JOIN  playlists pl  ON pl.PlaylistId = pt.PlaylistId
GROUP BY pl.Name
"""

pd.read_sql_query(query, connection)   # If I remove the LEFT JOIN, 2 playlists appear

Unnamed: 0,Nombre Playlist,N° Canciones
0,90’s Music,1477
1,Brazilian Music,39
2,Classical,75
3,Classical 101 - Deep Cuts,25
4,Classical 101 - Next Steps,25
5,Classical 101 - The Basics,25
6,Grunge,15
7,Heavy Metal Classic,26
8,Music,6580
9,Music Videos,1


In [None]:
## The only difference between these exercises is that above I start from playlist_track, so the LEFT JOIN does not bring in the playlists that have no tracks. Below, in contrast, the query starts from playlists, so those playlists do appear.

In [197]:
query = """
SELECT pl.Name as 'Nombre Playlist', COUNT(*) as 'N° Canciones'
FROM playlists pl
LEFT JOIN playlist_track pt ON pt.PlaylistId = pl.PlaylistId
GROUP BY pl.Name
"""

pd.read_sql_query(query, connection)

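# With the LEFT JOIN starting from playlists, the Audiobooks and Movies playlists appear even though they have no songs.
# Each one shows 2 because COUNT(*) also counts the NULL-padded row of an empty playlist, and there are two playlists with each of those names.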

Unnamed: 0,Nombre Playlist,N° Canciones
0,90’s Music,1477
1,Audiobooks,2
2,Brazilian Music,39
3,Classical,75
4,Classical 101 - Deep Cuts,25
5,Classical 101 - Next Steps,25
6,Classical 101 - The Basics,25
7,Grunge,15
8,Heavy Metal Classic,26
9,Movies,2
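
The count of 2 for playlists without songs comes from how `COUNT(*)` interacts with a `LEFT JOIN`. As a sketch under made-up data (two empty playlists named 'Movies' and one 'Music' playlist with two tracks), counting a column from the right-hand table instead gives 0 for the empty ones:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE playlists (PlaylistId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE playlist_track (PlaylistId INTEGER, TrackId INTEGER);
INSERT INTO playlists VALUES (1, 'Music'), (2, 'Movies'), (7, 'Movies');
INSERT INTO playlist_track VALUES (1, 1), (1, 2);
""")

rows = conn.execute("""
    SELECT pl.Name,
           COUNT(*)          AS row_count,    -- counts NULL-padded rows too
           COUNT(pt.TrackId) AS track_count   -- skips rows where TrackId is NULL
    FROM playlists pl
    LEFT JOIN playlist_track pt ON pt.PlaylistId = pl.PlaylistId
    GROUP BY pl.Name
    ORDER BY pl.Name
""").fetchall()

print(rows)
conn.close()
```

Each empty playlist still contributes one row to the join (with `NULL` in the `playlist_track` columns), so `COUNT(*)` reports one per playlist, while `COUNT(pt.TrackId)` ignores those `NULL` values.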


In [None]:
## Proof of concept for LEFT JOIN: my understanding is that playlists contains two playlist names ('Audiobooks' and 'Movies') that have no TrackId at all

In [121]:
query = """
SELECT *
FROM playlists pl
LEFT JOIN playlist_track pt ON pt.PlaylistId = pl.PlaylistId
"""

pd.read_sql_query(query, connection)

Unnamed: 0,PlaylistId,Name,PlaylistId.1,TrackId
0,1,Music,1.0,1.0
1,1,Music,1.0,2.0
2,1,Music,1.0,3.0
3,1,Music,1.0,4.0
4,1,Music,1.0,5.0
...,...,...,...,...
8714,17,Heavy Metal Classic,17.0,2094.0
8715,17,Heavy Metal Classic,17.0,2095.0
8716,17,Heavy Metal Classic,17.0,2096.0
8717,17,Heavy Metal Classic,17.0,3290.0


In [138]:
query = """
SELECT *
FROM playlists
"""
pd.read_sql_query(query, connection)

Unnamed: 0,PlaylistId,Name
0,1,Music
1,2,Movies
2,3,TV Shows
3,4,Audiobooks
4,5,90’s Music
5,6,Audiobooks
6,7,Movies
7,8,Music
8,9,Music Videos
9,10,TV Shows


### 8. Show how much each employee has sold.

In [145]:
query = """
SELECT e.FirstName||' '||e.LastName as 'Empleado', COUNT(*) as 'Vendidos'
FROM employees e
JOIN customers c ON e.EmployeeId = c.SupportRepId
JOIN invoices i ON c.CustomerId = i.CustomerId
WHERE e.Title = 'Sales Support Agent'
GROUP BY Empleado
"""

pd.read_sql_query(query, connection)

Unnamed: 0,Empleado,Vendidos
0,Jane Peacock,146
1,Margaret Park,140
2,Steve Johnson,126


In [25]:
query = """
SELECT e.FirstName||' '||e.LastName as 'Empleado', SUM(Quantity) as 'Canciones Vendidas'
FROM employees e
JOIN customers c ON e.EmployeeId = c.SupportRepId
JOIN invoices i ON c.CustomerId = i.CustomerId
JOIN invoice_items ii ON ii.invoiceId = i.InvoiceId
GROUP BY e.EmployeeId
"""

pd.read_sql_query(query, connection)

Unnamed: 0,Empleado,Canciones Vendidas
0,Jane Peacock,796
1,Margaret Park,760
2,Steve Johnson,684


In [26]:
query = """
SELECT e.FirstName, sum(i.Total)
FROM invoices i
JOIN customers c ON i.CustomerId = c.CustomerId
JOIN employees e ON c.SupportRepId = e.EmployeeId
GROUP BY e.FirstName
"""
pd.read_sql_query(query, connection)

Unnamed: 0,FirstName,sum(i.Total)
0,Jane,833.04
1,Margaret,775.4
2,Steve,720.16


### 9. Which sales agent sold the most in 2009?

Unnamed: 0,Empleado,Cantidad vendida en 2009
0,Margaret Park,30
1,Steve Johnson,28
2,Jane Peacock,25


In [37]:
# THIS IS THE CORRECT RESULT

query = """
SELECT e.FirstName||' '||e.LastName as 'Employee', SUM(i.Total) as 'Ventas'
FROM employees e
JOIN customers c ON e.EmployeeId = c.SupportRepId
JOIN invoices i ON c.CustomerId = i.CustomerId
WHERE i.InvoiceDate LIKE '2009%'
GROUP BY employeeId
ORDER BY Ventas desc
LIMIT 1
"""
# LIMIT 1 keeps only the first row

pd.read_sql_query(query, connection)

Unnamed: 0,Employee,Ventas
0,Steve Johnson,164.34
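
Filtering with `LIKE '2009%'` works here because the dates are stored as text starting with the year. As a sketch of two alternatives, on made-up invoice rows (the data and variable names are assumptions for illustration), the same year can be selected with SQLite's `strftime` or with a half-open date range:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoices (InvoiceId INTEGER PRIMARY KEY, InvoiceDate TEXT, Total REAL);
INSERT INTO invoices VALUES
    (1, '2009-01-01 00:00:00', 1.98),
    (2, '2009-12-31 00:00:00', 3.96),
    (3, '2010-01-01 00:00:00', 5.94);
""")


def invoice_ids(where_clause):
    """Return the InvoiceIds matching a WHERE clause written for this demo."""
    sql = f"SELECT InvoiceId FROM invoices WHERE {where_clause} ORDER BY InvoiceId"
    return [row[0] for row in conn.execute(sql)]


with_like = invoice_ids("InvoiceDate LIKE '2009%'")
with_strftime = invoice_ids("strftime('%Y', InvoiceDate) = '2009'")
with_range = invoice_ids("InvoiceDate >= '2009-01-01' AND InvoiceDate < '2010-01-01'")

print(with_like, with_strftime, with_range)
conn.close()
```

All three select the same invoices in this toy data; the range form compares the text values directly, which works because the `YYYY-MM-DD` layout sorts chronologically.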


### 10. Which 3 groups have sold the most?

In [None]:
# This is the one we did in class (the query below was left unfinished)
# C
query = """
SELECT *
FROM 
"""

pd.read_sql_query(query, connection)

In [43]:
query = """
SELECT ar.Name, COUNT(*)
FROM artists ar
JOIN albums al ON al.ArtistId = ar.ArtistId
JOIN tracks t ON t.AlbumId = al.AlbumId
JOIN invoice_items it ON it.TrackId = t.TrackId
JOIN invoices iv ON iv.InvoiceId = it.InvoiceId
GROUP BY ar.Name
ORDER BY COUNT(*) desc
LIMIT 3
"""

pd.read_sql_query(query, connection)

Unnamed: 0,Name,COUNT(*)
0,Iron Maiden,140
1,U2,107
2,Metallica,91


In [2]:
import numpy as np
n = np.array([[1,2,3]])
n.shape

(1, 3)

### 11. Show how many Rock songs are in each playlist

In [45]:
# SOLVED IN CLASS - see the one below.

query = """
SELECT pl.Name, COUNT(t.Name) as 'N° canciones'
FROM genres g
JOIN tracks t ON g.GenreId = t.GenreId
JOIN playlist_track pt ON pt.TrackId = t.TrackId
JOIN playlists pl ON pl.PlaylistId = pt.PlaylistId
WHERE g.Name = 'Rock'
GROUP BY pl.Name
"""

pd.read_sql_query(query, connection)



Unnamed: 0,Name,N° canciones
0,90’s Music,621
1,Grunge,14
2,Heavy Metal Classic,9
3,Music,2594


In [47]:
# SOLVED IN CLASS --> The correct approach is to group by the ID: since it is the PRIMARY KEY, each group is guaranteed to be a distinct playlist.

query = """
SELECT pl.Name, COUNT(pl.PlaylistId) as 'N° canciones'
FROM genres g
JOIN tracks t ON g.GenreId = t.GenreId
JOIN playlist_track pt ON pt.TrackId = t.TrackId
JOIN playlists pl ON pl.PlaylistId = pt.PlaylistId
WHERE g.Name = 'Rock'
GROUP BY pl.PlaylistId
"""

pd.read_sql_query(query, connection)

Unnamed: 0,Name,N° canciones
0,Music,1297
1,90’s Music,621
2,Music,1297
3,Grunge,14
4,Heavy Metal Classic,9


In [184]:
# MY OWN SOLUTION

query = """
SELECT pl.Name, COUNT(*)
FROM genres g
JOIN tracks t ON g.GenreId = t.GenreId
JOIN playlist_track pt ON pt.TrackId = t.TrackId
JOIN playlists pl ON pl.PlaylistId = pt.PlaylistId
WHERE g.Name = 'Rock'
GROUP BY pl.Name
"""

pd.read_sql_query(query, connection)

Unnamed: 0,Name,COUNT(*)
0,90’s Music,621
1,Grunge,14
2,Heavy Metal Classic,9
3,Music,2594


### 12. Show a table with all songs and their invoice Id(s), whether or not they have ever been sold.

In [50]:
# SOLVED IN CLASS

query = """
SELECT t.Name as 'Song', ii.InvoiceId as 'Fact. Num'
FROM tracks t
LEFT JOIN invoice_items ii ON t.TrackId = ii.TrackId

"""
# LEFT JOIN is used to keep the songs that have never been sold. We take the WHOLE tracks table (it stays fixed) and attach invoice_items to it.
# In the resulting table, songs that were sold more than once are repeated (like the 2nd song), and where no match is found we get a NaN (as at the end of the column).

pd.read_sql_query(query, connection)

Unnamed: 0,Song,Fact. Num
0,For Those About To Rock (We Salute You),108.0
1,Balls to the Wall,1.0
2,Balls to the Wall,214.0
3,Fast As a Shark,319.0
4,Restless and Wild,1.0
...,...,...
3754,"String Quartet No. 12 in C Minor, D. 703 ""Quar...",108.0
3755,"String Quartet No. 12 in C Minor, D. 703 ""Quar...",319.0
3756,"L'orfeo, Act 3, Sinfonia (Orchestra)",
3757,"Quintet for Horn, Violin, 2 Violas, and Cello ...",
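
To see where those missing values come from, here is a minimal sketch on a made-up in-memory database (three invented tracks, one of them never sold). With plain `sqlite3`, the unmatched side of a `LEFT JOIN` comes back as `None`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tracks (TrackId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE invoice_items (InvoiceLineId INTEGER PRIMARY KEY, InvoiceId INTEGER, TrackId INTEGER);
INSERT INTO tracks VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO invoice_items VALUES (1, 10, 1), (2, 11, 1), (3, 10, 2);
""")

rows = conn.execute("""
    SELECT t.Name, ii.InvoiceId
    FROM tracks t
    LEFT JOIN invoice_items ii ON t.TrackId = ii.TrackId
    ORDER BY t.TrackId, ii.InvoiceId
""").fetchall()

# Track 'A' was sold twice, so it appears twice; track 'C' was never sold
print(rows)
conn.close()
```

When the same result is loaded into a DataFrame, pandas represents the missing value as `NaN`, which is why the invoice numbers in the output above are shown as floats such as `108.0`.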


### 13. How many artists have no album?

In [56]:
query = """
SELECT COUNT(*) as 'Artistas sin albums'
FROM artists ar
LEFT JOIN albums al ON ar.ArtistId = al.ArtistId
WHERE al.Title IS NULL

"""

pd.read_sql_query(query, connection)

# I use LEFT JOIN because a plain JOIN keeps only the intersection and leaves no empty values. We want the join to include ALL the artists (left) against the list of albums (right), whether or not they have a match.

Unnamed: 0,Artistas sin albums
0,71
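
The `LEFT JOIN ... IS NULL` pattern is one way to find rows without a match; `NOT EXISTS` is another. The sketch below compares both on a made-up in-memory database (three invented artists, only one of them with albums). It tests `al.AlbumId`, the primary key, rather than `al.Title`; this is a choice made for the example, since a key column can only be `NULL` when there was no match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE artists (ArtistId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE albums (AlbumId INTEGER PRIMARY KEY, Title TEXT, ArtistId INTEGER);
INSERT INTO artists VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO albums VALUES (1, 'X', 1), (2, 'Y', 1);
""")

# Anti-join: keep only the artists whose album side is empty after the LEFT JOIN
left_join_count = conn.execute("""
    SELECT COUNT(*)
    FROM artists ar
    LEFT JOIN albums al ON ar.ArtistId = al.ArtistId
    WHERE al.AlbumId IS NULL
""").fetchone()[0]

# Same question asked with a correlated subquery
not_exists_count = conn.execute("""
    SELECT COUNT(*)
    FROM artists ar
    WHERE NOT EXISTS (SELECT 1 FROM albums al WHERE al.ArtistId = ar.ArtistId)
""").fetchone()[0]

print(left_join_count, not_exists_count)
conn.close()
```

Both queries count the two artists without albums in this toy data.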
