
# SQL query examples integrated with Python and Pandas
We will use sample data from the [MySQL Tutorial](http://www.mysqltutorial.org/mysql-sample-database.aspx) site.

To recreate the database in SQLite without Python, just run the file containing the SQL commands from the operating system's command line:

```shell
sqlite3 database.db < sample-database-dump.sql
```

The file `database.db` will be created and populated with the tables and data specified in `sample-database-dump.sql`.

If we were dealing with a more robust DBMS (database management system), such as MariaDB/MySQL, Oracle or DB2, the command above would be different and would need to include:

* the hostname or IP address of the database server
* the access username and password
* the name of the database we are going to operate on

SQLite, however, is much simpler and more didactic, and it operates on a local file.

## Viewing the database, its tables and data

![database example](http://www.mysqltutorial.org/wp-content/uploads/2009/12/MySQL-Sample-Database-Schema.png)

Each DBMS has its own tooling for browsing data. MariaDB/MySQL has the popular [phpMyAdmin](https://www.phpmyadmin.net), Oracle and DB2 have their own proprietary tools, and SQLite has [DB Browser for SQLite](https://sqlitebrowser.org) and an online tool called [SQLite Online](https://sqliteonline.com).

To browse the data in the SQLite database we have just created, use one of these three options:

* Use the `sqlite3` command.
* Install **DB Browser for SQLite** and open the file.
* Upload the file to the **SQLiteOnline.com** site.

#### A database is a place to store data. A relational database management system (`Relational Database Management System`, [`RDBMS`](https://www.dofactory.com/sql)) stores data in tables.

## SQLite.

#### First, let's use the structure and data from the file `sample-database-dump.sql` to create a database in the file `database.db`.

In [1]:
import pandas as pd
import sqlite3

db = sqlite3.connect('database.db')

#### Let's run an `SQL` script with the [`.executescript()`](https://docs.python.org/3/library/sqlite3.html#sqlite3.Cursor.executescript) method.

In [2]:
script = 'sample-database-dump.sql'

db.cursor().executescript(open(script).read())

<sqlite3.Cursor at 0x7f10d0f958f0>

#### Create a connection to our database.

In [3]:
conn = sqlite3.connect('database.db')
curs = conn.cursor()
#conn.close()

#### We create a new table.

In [4]:
#curs.execute("""DROP TABLE countries;""")

curs.execute("""CREATE TABLE IF NOT EXISTS countries (

key INT PRIMARY KEY,

name TEXT UNIQUE,

founding_year INT,

capital TEXT

);""")


conn.commit()

#### We can show the table we created with the [`SELECT`](https://towardsdatascience.com/data-science-lesson-2-selecting-data-using-sql-3aaf8258619d) command. The [`.read_sql_query()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_sql_query.html) method returns a DataFrame corresponding to the result set of the query.

In [5]:
query = "SELECT * FROM countries;"

queryResult = pd.read_sql_query(query, db)

queryResult.head()

Unnamed: 0,key,name,founding_year,capital


#### Let's run a query with the [`.execute()`](https://docs.python.org/3/library/sqlite3.html#sqlite3.Cursor.execute) method, which executes a single SQL statement, here against [`sqlite_master`](https://wiki.tcl-lang.org/page/sqlite_master).

In [6]:
res = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table';")
print(type(res))
for name in res:
    print (name[0])

<class 'sqlite3.Cursor'>
customers
employees
offices
orderdetails
orders
payments
productlines
products
countries


#### We can turn the `cursor` `res` into a DataFrame using the [`DataFrame.from_records()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.from_records.html) method, which converts a structured or record `ndarray`, or a sequence of tuples such as the one returned by `fetchall()`, into a `DataFrame`.

In [7]:
res = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table';")

cols = [column[0] for column in res.description]

results = pd.DataFrame.from_records(data = res.fetchall(), 
                                    columns = cols
                                   )
results

Unnamed: 0,name
0,customers
1,employees
2,offices
3,orderdetails
4,orders
5,payments
6,productlines
7,products
8,countries


#### Then we insert some values into the table we created.

In [8]:
curs.execute("""INSERT INTO countries  

(key, name, founding_year, capital)

VALUES 


(1, 'BRASIL', 1500, 'SALVADOR')
;""")

conn.commit()

query = "SELECT * FROM countries;"

queryResult = pd.read_sql_query(query,db)

queryResult.head()

Unnamed: 0,key,name,founding_year,capital
0,1,BRASIL,1500,SALVADOR


#### Let's insert more values into the table.

In [9]:
curs.execute("""INSERT INTO countries  

(key, name, founding_year, capital)

VALUES 

(2, 'MÉXICO', 1519, 'CIDADE DO MÉXICO'),
(3, 'ARGENTINA', 1516, 'BUENOS AIRES')
;""")

conn.commit()

#### We can see the table below.

In [10]:
query = "SELECT * FROM countries;"

queryResult = pd.read_sql_query(query,db)

queryResult.head()

Unnamed: 0,key,name,founding_year,capital
0,1,BRASIL,1500,SALVADOR
1,2,MÉXICO,1519,CIDADE DO MÉXICO
2,3,ARGENTINA,1516,BUENOS AIRES
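
The inserts above embed the values directly in the SQL string. As a minimal sketch, assuming the same `countries` schema but a throwaway in-memory database (the names `demo_conn` and `rows` are illustrative and not part of the notebook), the same rows can be inserted with `?` placeholders and `executemany()`, which leaves quoting to `sqlite3`:

```python
import sqlite3

# In-memory database, so this example does not touch database.db
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("""CREATE TABLE countries (
    key INT PRIMARY KEY,
    name TEXT UNIQUE,
    founding_year INT,
    capital TEXT
);""")

rows = [
    (1, "BRASIL", 1500, "SALVADOR"),
    (2, "MÉXICO", 1519, "CIDADE DO MÉXICO"),
    (3, "ARGENTINA", 1516, "BUENOS AIRES"),
]

# Used as a context manager, the connection commits on success
# and rolls back if an exception is raised inside the block.
with demo_conn:
    demo_conn.executemany(
        "INSERT INTO countries (key, name, founding_year, capital) "
        "VALUES (?, ?, ?, ?);",
        rows,
    )

inserted = demo_conn.execute(
    "SELECT key, name FROM countries ORDER BY key;"
).fetchall()
print(inserted)
```

Placeholders matter most when the values come from user input, since they are never interpreted as SQL.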


#### We insert one more value into the `countries` table.

In [11]:
curs.execute("""INSERT INTO countries  

(key, name, founding_year, capital)

VALUES 


(4, 'URUGUAI', 1680, 'MONTEVIDÉU')
;""")

conn.commit()

query = "SELECT * FROM countries;"

queryResult = pd.read_sql_query(query,db)

queryResult.head()

Unnamed: 0,key,name,founding_year,capital
0,1,BRASIL,1500,SALVADOR
1,2,MÉXICO,1519,CIDADE DO MÉXICO
2,3,ARGENTINA,1516,BUENOS AIRES
3,4,URUGUAI,1680,MONTEVIDÉU


#### Let's insert a row with a missing value (no `key`) into the table.

In [12]:
curs.execute("""INSERT INTO countries  

(name, founding_year, capital)

VALUES 

('VENEZUELA', 1500.0, 'SALVADOR')
;""")

conn.commit()

#### And we look at the result.

In [13]:
query = "SELECT * FROM countries;"

queryResult = pd.read_sql_query(query,db)

queryResult.head()

Unnamed: 0,key,name,founding_year,capital
0,1.0,BRASIL,1500,SALVADOR
1,2.0,MÉXICO,1519,CIDADE DO MÉXICO
2,3.0,ARGENTINA,1516,BUENOS AIRES
3,4.0,URUGUAI,1680,MONTEVIDÉU
4,,VENEZUELA,1500,SALVADOR
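
The `key` column is now displayed as `1.0`, `2.0`, and so on, because the new row has no key: pandas represents the missing value as `NaN`, which forces the column to a floating-point type. SQLite accepted the row because a column declared `INT PRIMARY KEY` is not an alias for the rowid and, unless it is also declared `NOT NULL`, it accepts `NULL`. The sketch below contrasts this with `INTEGER PRIMARY KEY`, using an in-memory database and two hypothetical tables, `t_int` and `t_integer`, that are not part of the sample database:

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")

# Same column name, two different declarations
demo_conn.execute("CREATE TABLE t_int (key INT PRIMARY KEY, name TEXT);")
demo_conn.execute("CREATE TABLE t_integer (key INTEGER PRIMARY KEY, name TEXT);")

# Insert a row without providing the key in either table
demo_conn.execute("INSERT INTO t_int (name) VALUES ('VENEZUELA');")
demo_conn.execute("INSERT INTO t_integer (name) VALUES ('VENEZUELA');")

int_key = demo_conn.execute("SELECT key FROM t_int;").fetchone()[0]
integer_key = demo_conn.execute("SELECT key FROM t_integer;").fetchone()[0]

# INT PRIMARY KEY stores NULL; INTEGER PRIMARY KEY gets an automatic value
print(int_key, integer_key)
```

If every row should receive a key automatically, declaring the column as `INTEGER PRIMARY KEY` is one option worth considering.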


#### We can also update values with the [`UPDATE`](https://www.sqlitetutorial.net/sqlite-update/) command.

In [14]:
curs.execute("""UPDATE countries  

SET capital = 'BRASÍLIA'

WHERE

name = 'BRASIL' 

;""")
conn.commit()

query = "SELECT * FROM countries;"

queryResult = pd.read_sql_query(query,db)

queryResult.head()

Unnamed: 0,key,name,founding_year,capital
0,1.0,BRASIL,1500,BRASÍLIA
1,2.0,MÉXICO,1519,CIDADE DO MÉXICO
2,3.0,ARGENTINA,1516,BUENOS AIRES
3,4.0,URUGUAI,1680,MONTEVIDÉU
4,,VENEZUELA,1500,SALVADOR


#### Let's update the table again, this time without a `WHERE` clause, so every row is changed.

In [15]:
curs.execute("""UPDATE countries  

SET capital = 'ASSUNCAO'

;""")
conn.commit()

query = "SELECT * FROM countries;"

queryResult = pd.read_sql_query(query,db)

queryResult.head()

Unnamed: 0,key,name,founding_year,capital
0,1.0,BRASIL,1500,ASSUNCAO
1,2.0,MÉXICO,1519,ASSUNCAO
2,3.0,ARGENTINA,1516,ASSUNCAO
3,4.0,URUGUAI,1680,ASSUNCAO
4,,VENEZUELA,1500,ASSUNCAO
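
Because an `UPDATE` without `WHERE` rewrites every row, it can help to check how many rows a statement touched. A minimal sketch, assuming a simplified two-column `countries` table in an in-memory database (the names `demo_conn`, `with_where` and `without_where` are illustrative), reads the cursor's `rowcount` attribute after each update:

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE countries (name TEXT, capital TEXT);")
demo_conn.executemany(
    "INSERT INTO countries VALUES (?, ?);",
    [
        ("BRASIL", "SALVADOR"),
        ("MÉXICO", "CIDADE DO MÉXICO"),
        ("URUGUAI", "MONTEVIDÉU"),
    ],
)

# With WHERE: only the matching row is modified
cur = demo_conn.execute(
    "UPDATE countries SET capital = 'BRASÍLIA' WHERE name = 'BRASIL';"
)
with_where = cur.rowcount

# Without WHERE: every row in the table is modified
cur = demo_conn.execute("UPDATE countries SET capital = 'ASSUNCAO';")
without_where = cur.rowcount

print(with_where, without_where)
```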


#### Or set a value to `NULL`.

In [16]:
curs.execute("""UPDATE countries  

SET capital = NULL

WHERE

name = 'ARGENTINA'

;""")
conn.commit()

query = "SELECT * FROM countries;"

queryResult = pd.read_sql_query(query,db)

queryResult.head()

Unnamed: 0,key,name,founding_year,capital
0,1.0,BRASIL,1500,ASSUNCAO
1,2.0,MÉXICO,1519,ASSUNCAO
2,3.0,ARGENTINA,1516,
3,4.0,URUGUAI,1680,ASSUNCAO
4,,VENEZUELA,1500,ASSUNCAO


#### We can delete a row.

In [17]:
curs.execute("""DELETE FROM countries  

WHERE

name = 'ARGENTINA'

;""")
conn.commit()

query = "SELECT * FROM countries;"

queryResult = pd.read_sql_query(query,db)

queryResult.head()

Unnamed: 0,key,name,founding_year,capital
0,1.0,BRASIL,1500,ASSUNCAO
1,2.0,MÉXICO,1519,ASSUNCAO
2,4.0,URUGUAI,1680,ASSUNCAO
3,,VENEZUELA,1500,ASSUNCAO


#### Or delete all rows.

In [18]:
curs.execute("""DELETE FROM countries  

;""")
conn.commit()

query = "SELECT * FROM countries;"

queryResult = pd.read_sql_query(query,db)

queryResult.head()

Unnamed: 0,key,name,founding_year,capital
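
A `DELETE` only becomes permanent once it is committed. As a sketch, assuming the default transaction handling of Python's `sqlite3` module (where a `DELETE` implicitly opens a transaction) and a one-column in-memory table used only for illustration, `rollback()` restores the rows as long as `commit()` has not been called:

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE countries (name TEXT);")
demo_conn.executemany(
    "INSERT INTO countries VALUES (?);", [("BRASIL",), ("URUGUAI",)]
)
demo_conn.commit()

# Delete everything, but do not commit yet
demo_conn.execute("DELETE FROM countries;")
rows_before_rollback = demo_conn.execute(
    "SELECT COUNT(*) FROM countries;"
).fetchone()[0]

# Undo the uncommitted DELETE
demo_conn.rollback()
rows_after_rollback = demo_conn.execute(
    "SELECT COUNT(*) FROM countries;"
).fetchone()[0]

print(rows_before_rollback, rows_after_rollback)
```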


#### It is also possible to drop a table.

In [19]:
curs.execute("""
DROP TABLE countries;""")

conn.commit()

#### And we check the tables in the database again.

In [20]:
res = conn.execute("SELECT name FROM sqlite_master WHERE type='table';")
for name in res:
    print (name[0])

customers
employees
offices
orderdetails
orders
payments
productlines
products


## Let's focus on reading data from a relational database using SQL

The most common operation on a database is reading data, and for that we need to write a request, better known as a QUERY.

A query can be broken into parts:
1. SELECT = starts the selection of the data we are going to read
2. FROM = selects the tables of interest
3. WHERE = declares the conditions for reading the data
4. GROUP BY = groups the data by a column
5. Post-processing = we can sort (ORDER BY) or set limits (LIMIT)
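
The parts listed above can be sketched in a single query. The example assumes a reduced `customers` table with made-up rows in an in-memory database; only the column names `customerName`, `country` and `creditLimit` are borrowed from the sample database:

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE customers (customerName TEXT, country TEXT, creditLimit REAL);"
)
demo_conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?);",
    [
        ("A", "France", 100.0),
        ("B", "France", 0.0),
        ("C", "USA", 50.0),
        ("D", "USA", 70.0),
        ("E", "USA", 10.0),
        ("F", "Norway", 30.0),
    ],
)

query = """
SELECT country, COUNT(*) AS total   -- 1. what to read
FROM customers                      -- 2. which table
WHERE creditLimit > 0               -- 3. conditions on individual rows
GROUP BY country                    -- 4. one output row per country
ORDER BY total DESC, country        -- 5. post-processing: sort ...
LIMIT 2;                            --    ... and keep the first two rows
"""

result = demo_conn.execute(query).fetchall()
print(result)
```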



## Let's build our query step by step

In [21]:
query = "SELECT * FROM customers;"

queryResult = pd.read_sql_query(query,db)

queryResult.head()

Unnamed: 0,customerNumber,customerName,contactLastName,contactFirstName,phone,addressLine1,addressLine2,city,state,postalCode,country,salesRepEmployeeNumber,creditLimit
0,103,Atelier graphique,Schmitt,Carine,40.32.2555,"54, rue Royale",,Nantes,,44000,France,1370.0,21000
1,112,Signal Gift Stores,King,Jean,7025551838,8489 Strong St.,,Las Vegas,NV,83030,USA,1166.0,71800
2,114,"Australian Collectors, Co.",Ferguson,Peter,03 9520 4555,636 St Kilda Road,Level 3,Melbourne,Victoria,3004,Australia,1611.0,117300
3,119,La Rochelle Gifts,Labrune,Janine,40.67.8555,"67, rue des Cinquante Otages",,Nantes,,44000,France,1370.0,118200
4,121,Baane Mini Imports,Bergulfsen,Jonas,07-98 9555,Erling Skakkes gate 78,,Stavern,,4110,Norway,1504.0,81700


#### It is possible to read only specific columns of a table.

In [22]:
query = "SELECT customerName, phone FROM customers;"

queryResult = pd.read_sql_query(query,db)

queryResult.head()

Unnamed: 0,customerName,phone
0,Atelier graphique,40.32.2555
1,Signal Gift Stores,7025551838
2,"Australian Collectors, Co.",03 9520 4555
3,La Rochelle Gifts,40.67.8555
4,Baane Mini Imports,07-98 9555


## SQLite with Pandas

###  <span style = "color:blue">Independent practice.</span>

#### Let's write some queries, but first, let's get to know our tables.

In [23]:
#customers
#employees
#offices
#orderdetails
#orders
#payments
#productlines
#products

#query = """SELECT Distinct jobTitle
query = """SELECT *

    FROM orderdetails AS o
    
    ;"""

queryResult = pd.read_sql_query(query,db)

queryResult.head()
#queryResult.shape

Unnamed: 0,orderNumber,productCode,quantityOrdered,priceEach,orderLineNumber
0,10100,S18_1749,30,136.0,3
1,10100,S18_2248,50,55.09,2
2,10100,S18_4409,22,75.46,4
3,10100,S24_3969,49,35.29,1
4,10101,S18_2325,25,108.06,4
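
Besides reading the first rows, a table's structure can be inspected without fetching any data. A minimal sketch, assuming a reduced `orderdetails` table in an in-memory database (the column types here are illustrative and may differ from the sample dump), uses SQLite's `PRAGMA table_info`:

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE orderdetails ("
    "orderNumber INT, productCode TEXT, quantityOrdered INT);"
)

# Each returned row is (cid, name, type, notnull, dflt_value, pk);
# keep only the column name and its declared type.
columns = [
    (row[1], row[2])
    for row in demo_conn.execute("PRAGMA table_info(orderdetails);")
]
print(columns)
```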


#### Apply the [`SELECT DISTINCT`](https://www.sqlitetutorial.net/sqlite-select-distinct) command to see only the distinct values of the `country` column.

In [25]:
query = """SELECT DISTINCT country

    FROM customers
    
    ;"""

queryResult = pd.read_sql_query(query,db)

queryResult.head()

Unnamed: 0,country
0,France
1,USA
2,Australia
3,Norway
4,Poland


####  <span style = "color:red">Original code.</span>
<!--- 
query = """SELECT DISTINCT country 
          
          FROM customers ;"""

query2 = pd.read_sql_query(query,db)

query2.head()
-->

#### We can also apply a condition when reading, for example including only the rows where `country = 'Australia'` in the `customers` table.

In [26]:
query = """SELECT *

    FROM customers

    WHERE
    country = 'Australia'
    
    ;"""

queryResult = pd.read_sql_query(query,db)

queryResult.head()

Unnamed: 0,customerNumber,customerName,contactLastName,contactFirstName,phone,addressLine1,addressLine2,city,state,postalCode,country,salesRepEmployeeNumber,creditLimit
0,114,"Australian Collectors, Co.",Ferguson,Peter,03 9520 4555,636 St Kilda Road,Level 3,Melbourne,Victoria,3004,Australia,1611,117300
1,276,"Anna’s Decorations, Ltd",O’Hara,Anna,02 9936 8555,201 Miller Street,Level 15,North Sydney,NSW,2060,Australia,1611,107800
2,282,Souveniers And Things Co.,Huxley,Adrian,+61 2 9495 8555,Monitor Money Building,815 Pacific Hwy,Chatswood,NSW,2067,Australia,1611,93300
3,333,"Australian Gift Network, Co",Calaghan,Ben,61-7-3844-6555,31 Duncan St. West End,,South Brisbane,Queensland,4101,Australia,1611,51600
4,471,"Australian Collectables, Ltd",Clenahan,Sean,61-9-3844-6555,7 Allen Street,,Glen Waverly,Victoria,3150,Australia,1611,60300


####  <span style = "color:red">Original code.</span>
<!--- 
query = """SELECT * 
          
          FROM customers 
          
          WHERE country = 'Australia' ;"""

query2 = pd.read_sql_query(query,db)

query2.head()
-->

#### And now a compound condition with `country = 'Australia'` and `country = 'Poland'`.

In [27]:
query = """SELECT *

    FROM customers

    WHERE
    country = 'Australia' AND
    country = 'Poland'
    
    ;"""

queryResult = pd.read_sql_query(query,db)

queryResult.head()

Unnamed: 0,customerNumber,customerName,contactLastName,contactFirstName,phone,addressLine1,addressLine2,city,state,postalCode,country,salesRepEmployeeNumber,creditLimit


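Nothing is returned, because a customer cannot be from two countries at the same time: each row has a single `country` value, so two equality conditions joined by `AND` can never both be true.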

####  <span style = "color:red">Original code.</span>
<!--- 
query = """SELECT * 
          
          FROM customers 
          
          WHERE country = 'Norway' OR country = 'Poland';"""

query2 = pd.read_sql_query(query,db)

query2.head()
-->

#### Reproduce the problem above using the `IN` operator from `SQL`.

In [28]:
query = """SELECT *

    FROM customers

    WHERE
    country in ('Australia', 'Poland')
    
    ;"""

queryResult = pd.read_sql_query(query,db)

queryResult.head()

Unnamed: 0,customerNumber,customerName,contactLastName,contactFirstName,phone,addressLine1,addressLine2,city,state,postalCode,country,salesRepEmployeeNumber,creditLimit
0,114,"Australian Collectors, Co.",Ferguson,Peter,03 9520 4555,636 St Kilda Road,Level 3,Melbourne,Victoria,3004,Australia,1611.0,117300
1,125,Havel & Zbyszek Co,Piestrzeniewicz,Zbyszek,(26) 642-7555,ul. Filtrowa 68,,Warszawa,,01-012,Poland,,0
2,276,"Anna’s Decorations, Ltd",O’Hara,Anna,02 9936 8555,201 Miller Street,Level 15,North Sydney,NSW,2060,Australia,1611.0,107800
3,282,Souveniers And Things Co.,Huxley,Adrian,+61 2 9495 8555,Monitor Money Building,815 Pacific Hwy,Chatswood,NSW,2067,Australia,1611.0,93300
4,333,"Australian Gift Network, Co",Calaghan,Ben,61-7-3844-6555,31 Duncan St. West End,,South Brisbane,Queensland,4101,Australia,1611.0,51600


Now rows are returned, because the query looks for rows where the country is Australia OR Poland.
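
The difference between `AND`, `OR` and `IN` can be checked on a tiny table. This is a sketch with three made-up customers in an in-memory database; the helper `count_rows` is hypothetical and only accepts hard-coded condition strings:

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE customers (customerName TEXT, country TEXT);")
demo_conn.executemany(
    "INSERT INTO customers VALUES (?, ?);",
    [("A", "Australia"), ("B", "Poland"), ("C", "France")],
)


def count_rows(condition):
    """Count the rows matching a trusted, hard-coded WHERE condition."""
    sql = f"SELECT COUNT(*) FROM customers WHERE {condition};"
    return demo_conn.execute(sql).fetchone()[0]


# A single row can never satisfy both equalities at once
n_and = count_rows("country = 'Australia' AND country = 'Poland'")
# OR and IN both accept rows from either country
n_or = count_rows("country = 'Australia' OR country = 'Poland'")
n_in = count_rows("country IN ('Australia', 'Poland')")

print(n_and, n_or, n_in)
```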

####  <span style = "color:red">Original code.</span>
<!--- 
query = """SELECT * 
          
          FROM customers 
          
          WHERE country IN ('Norway', 'Poland');"""

query2 = pd.read_sql_query(query,db)

query2.head()
-->

#### Now look for the records in the city of 'Strasbourg', in France.

In [31]:
query = """SELECT *

    FROM customers

    WHERE
    country = 'France' AND
    city = 'Strasbourg'
    
    ;"""

queryResult = pd.read_sql_query(query,db)

queryResult.head()

Unnamed: 0,customerNumber,customerName,contactLastName,contactFirstName,phone,addressLine1,addressLine2,city,state,postalCode,country,salesRepEmployeeNumber,creditLimit
0,209,Mini Caravy,Citeaux,Frédérique,88.60.1555,"24, place Kléber",,Strasbourg,,67000,France,1370,53800


####  <span style = "color:red">Original code.</span>
<!--- 
query = """SELECT * 
          
          FROM customers 
          
          WHERE country = 'France' AND city = 'Strasbourg' ;"""

query2 = pd.read_sql_query(query, db)

query2.head()
-->

#### Get the columns `'productName'`, `'productLine'`, `'buyPrice'`, `'orderNumber'`, `'priceEach'` from the tables `'orderdetails'` and `'products'`, for the case where the product line `productLine` is `'Motorcycles'`.

In [36]:
query = """SELECT p.productName,
          p.productLine,
          p.buyPrice,
          o.orderNumber, 
          o.priceEach
          

    FROM orderdetails AS o, products AS p

    WHERE
    p.productLine = 'Motorcycles'
    
    ;"""

queryResult = pd.read_sql_query(query,db)

queryResult.head()

Unnamed: 0,productName,productLine,buyPrice,orderNumber,priceEach
0,1936 Harley Davidson El Knucklehead,Motorcycles,24.23,10100,136.0
1,1957 Vespa GS150,Motorcycles,32.95,10100,136.0
2,1960 BSA Gold Star DBD34,Motorcycles,37.32,10100,136.0
3,1969 Harley Davidson Ultimate Chopper,Motorcycles,48.81,10100,136.0
4,1974 Ducati 350 Mk3 Desmo,Motorcycles,56.13,10100,136.0
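
Note that every row in the output above carries the same `orderNumber` and `priceEach`. Listing two tables in `FROM` without a condition relating them yields every combination of their rows (a Cartesian product). A minimal sketch of the difference, using reduced versions of `products` and `orderdetails` with made-up rows in an in-memory database:

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE products (productCode TEXT, productLine TEXT);")
demo_conn.execute("CREATE TABLE orderdetails (orderNumber INT, productCode TEXT);")
demo_conn.executemany(
    "INSERT INTO products VALUES (?, ?);",
    [("P1", "Motorcycles"), ("P2", "Motorcycles"), ("P3", "Planes")],
)
demo_conn.executemany(
    "INSERT INTO orderdetails VALUES (?, ?);",
    [(1, "P1"), (2, "P3"), (3, "P3")],
)

# No join condition: each motorcycle is paired with every order line
cross = demo_conn.execute("""
    SELECT p.productCode, o.orderNumber
    FROM orderdetails AS o, products AS p
    WHERE p.productLine = 'Motorcycles';
""").fetchall()

# Join condition: only order lines that refer to the product are kept
joined = demo_conn.execute("""
    SELECT p.productCode, o.orderNumber
    FROM orderdetails AS o
    INNER JOIN products AS p ON o.productCode = p.productCode
    WHERE p.productLine = 'Motorcycles';
""").fetchall()

print(len(cross), joined)
```

The same reasoning applies to any query that reads from two related tables: the relating column should appear in an `ON` clause or in the `WHERE` clause.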


####  <span style = "color:red">Original code.</span>
<!--- 
query = """SELECT
    p.productName,
    p.productLine,
    p.buyPrice,
    o.orderNumber,
    o.priceEach
    
    FROM orderdetails AS o INNER JOIN products AS p
    ON o.productCode = p.productCode

    WHERE productLine = 'Motorcycles'
    
    ;"""

queryResult = pd.read_sql_query(query,db)

queryResult.head()
-->

#### Select the columns `'requiredDate'`, `'shippedDate'`, `'priceEach'`, `'quantityOrdered'` from the tables `'orderdetails'` and `'orders'`.

In [38]:
query = """SELECT o.requiredDate,
                  o.shippedDate,
                  od.priceEach,
                  od.quantityOrdered 
          
          FROM orders AS o, orderdetails AS od
;"""

queryResult = pd.read_sql_query(query,db)

queryResult.head()

Unnamed: 0,requiredDate,shippedDate,priceEach,quantityOrdered
0,2003-01-13,2003-01-10,136.0,30
1,2003-01-13,2003-01-10,55.09,50
2,2003-01-13,2003-01-10,75.46,22
3,2003-01-13,2003-01-10,35.29,49
4,2003-01-13,2003-01-10,108.06,25


####  <span style = "color:red">Original code.</span>
<!--- 
query = """SELECT

    o.requiredDate,
    o.shippedDate,
    od.priceEach,
    od.quantityOrdered
    
    
    FROM orderdetails AS od INNER JOIN orders AS o
    
    WHERE
    o.orderNumber = od.orderNumber
    ;"""

queryResult = pd.read_sql_query(query,db)

queryResult.head()
-->

#### Select the columns `'orderNumber'`, `'orderDate'`, `'requiredDate'`, `'shippedDate'`, `'paymentDate'`, `'amount'` from the tables `'orders'` and `'payments'`. Sort the results by the order date `'orderDate'` in descending order.

In [40]:
query = """SELECT o.orderNumber,
                  o.orderDate,
                  o.requiredDate,
                  o.shippedDate,
                  p.paymentDate,
                  p.amount 
          
          FROM orders AS o, payments AS p
;"""

queryResult = pd.read_sql_query(query,db)

queryResult.head()

Unnamed: 0,orderNumber,orderDate,requiredDate,shippedDate,paymentDate,amount
0,10100,2003-01-06,2003-01-13,2003-01-10,2004-10-19,6066.78
1,10100,2003-01-06,2003-01-13,2003-01-10,2003-06-05,14571.44
2,10100,2003-01-06,2003-01-13,2003-01-10,2004-12-18,1676.14
3,10100,2003-01-06,2003-01-13,2003-01-10,2004-12-17,14191.12
4,10100,2003-01-06,2003-01-13,2003-01-10,2003-06-06,32641.98


####  <span style = "color:red">Original code.</span>
<!--- 
query = """SELECT
    o.orderNumber,
    o.orderDate,
    o.requiredDate,
    o.shippedDate,
    p.paymentDate,
    p.amount
    
    FROM 
    orders AS o INNER JOIN payments AS p
    
    WHERE
    o.customerNumber = p.customerNumber
    
    ORDER BY
    o.orderDate DESC;"""

queryResult = pd.read_sql_query(query,db)

queryResult.head()
-->

#### Find the information on `'city'`, `'state'`, `'jobTitle'`, `'lastName'` and `'firstName'`, from the tables `'offices'` and `'employees'`, for the employee whose `'jobTitle'` is 'Sale Manager (EMEA)'.

In [42]:
query = """SELECT o.city,
                  o.state,
                  e.jobTitle,
                  e.firstName,
                  e.lastName

          FROM offices AS o, employees AS e

          WHERE jobTitle = 'Sale Manager (EMEA)'
;"""

queryResult = pd.read_sql_query(query, db)

queryResult.head()

Unnamed: 0,city,state,jobTitle,firstName,lastName
0,San Francisco,CA,Sale Manager (EMEA),Gerard,Bondur
1,Boston,MA,Sale Manager (EMEA),Gerard,Bondur
2,NYC,NY,Sale Manager (EMEA),Gerard,Bondur
3,Paris,,Sale Manager (EMEA),Gerard,Bondur
4,Tokyo,Chiyoda-Ku,Sale Manager (EMEA),Gerard,Bondur


####  <span style = "color:red">Original code.</span>
<!--- 
query = """SELECT
    o.city,
    o.state,
    e.jobTitle,
    e.lastName,
    e.firstName
    
    
    FROM 
    offices AS o INNER JOIN employees AS e
    ON o.officeCode = e.officeCode
    
    WHERE
    jobTitle = 'President'
;"""

queryResult = pd.read_sql_query(query,db)

queryResult.head()
-->

#### Count the product records per vendor `'productVendor'` with the [`COUNT`](https://www.sqlitetutorial.net/sqlite-count-function/) function and show the vendor `'productVendor'`, grouping the results by vendor.

In [48]:
query = """SELECT p.productVendor,
                  COUNT(p.productVendor) AS Total
                  

    FROM products AS p

    GROUP BY
    p.productVendor

    ORDER BY
    Total DESC
;"""

queryResult = pd.read_sql_query(query,db)

queryResult.head()

Unnamed: 0,productVendor,Total
0,Classic Metal Creations,10
1,Carousel DieCast Legends,9
2,Exoto Designs,9
3,Gearbox Collectibles,9
4,Highway 66 Mini Classics,9


####  <span style = "color:red">Original code.</span>
<!--- 
query = """SELECT
    p.productVendor AS Vendedor,
    COUNT(p.productVendor) AS 'TotalRegistro'
    
    FROM 
    products AS p
    
    GROUP BY
    p.productVendor
    
    ORDER BY
    TotalRegistro;"""

queryResult = pd.read_sql_query(query,db)

queryResult.head()
-->

## Full list of aggregations

<img src="list_aggregation.png">
          

#### Use the [`HAVING`](https://www.sqlitetutorial.net/sqlite-having/) command to apply the condition `'quantityOrdered > 10'`, selecting the product name `'productName'`, product line `'productLine'` and product description `'productDescription'` columns. Compute the difference between the [manufacturer's suggested retail price](https://www.investopedia.com/terms/m/manufacturers-suggested-retail-price-msrp.asp) `'MSRP'` and the buy price `'buyPrice'`, and sort the output by that value. Group the data by product line, considering only the results where the quantity ordered `'quantityOrdered'` exceeds 10 units. Limit the output to the first three results.

In [57]:
query = """SELECT p.productName,
                  p.productLine,
                  p.productDescription,
                  (p.MSRP - p.buyPrice) AS valueDifference

            FROM products AS p, orderdetails AS od

            GROUP BY
            p.productLine

            HAVING 
            od.quantityOrdered > 10

            ORDER BY
            valueDifference DESC

            LIMIT 3
;"""

queryResult = pd.read_sql_query(query, db)

queryResult

Unnamed: 0,productName,productLine,productDescription,valueDifference
0,1982 Camaro Z28,Classic Cars,Features include opening and closing doors. Co...,54.62
1,Diamond T620 Semi-Skirted Tanker,Trucks and Buses,This limited edition model is licensed and per...,47.46
2,2002 Yamaha YZR M1,Motorcycles,"Features rotating wheels , working kick stand....",47.19
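
In the query above, `od.quantityOrdered` inside `HAVING` is not aggregated, so SQLite evaluates it on a single, arbitrarily chosen row of each group. `HAVING` is more commonly applied to an aggregate of the group. A minimal sketch of that usage, assuming a reduced `orderdetails` table with made-up quantities in an in-memory database:

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE orderdetails (productCode TEXT, quantityOrdered INT);"
)
demo_conn.executemany(
    "INSERT INTO orderdetails VALUES (?, ?);",
    [("P1", 5), ("P1", 4), ("P2", 20), ("P3", 6), ("P3", 7)],
)

# WHERE filters rows before grouping; HAVING filters the groups afterwards,
# so it can test an aggregate such as SUM().
result = demo_conn.execute("""
    SELECT productCode, SUM(quantityOrdered) AS totalOrdered
    FROM orderdetails
    GROUP BY productCode
    HAVING SUM(quantityOrdered) > 10
    ORDER BY totalOrdered DESC;
""").fetchall()
print(result)
```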


####  <span style = "color:red">Original code.</span>
<!--- 
query = """SELECT
    p.productName AS nomeProduto,
    p.productLine AS Linha,
    p.productDescription AS descricao,
    (p.MSRP - p.buyPrice) AS sugeridoMenosCobrado
    
    FROM 
    orderdetails AS o INNER JOIN products AS p
    
    WHERE
    o.productCode = p.productCode
    
    GROUP BY
    p.productLine
    
    HAVING
    o.quantityOrdered > 10
    
    ORDER BY
    sugeridoMenosCobrado
    
    LIMIT 
    3
    ;"""

queryResult = pd.read_sql_query(query,db)

queryResult
-->

#### Use the [`SELECT DISTINCT`](https://www.sqlitetutorial.net/sqlite-select-distinct) command to check whether there are duplicate values in the `'productCode'` column of the `'orderDetails'` table.

In [68]:
query = """SELECT productCode

    FROM orderDetails
    
    ORDER BY
    productCode
    ;"""

queryResult = pd.read_sql_query(query,db)

queryResult.head()

Unnamed: 0,productCode
0,S10_1678
1,S10_1678
2,S10_1678
3,S10_1678
4,S10_1678


In [70]:
#Total number of values in productCode
len(queryResult)

2996

In [73]:
query = """SELECT DISTINCT productCode

    FROM orderDetails

    ORDER BY
    productCode
    
    ;"""

queryResult = pd.read_sql_query(query,db)

queryResult.head()

Unnamed: 0,productCode
0,S10_1678
1,S10_1949
2,S10_2016
3,S10_4698
4,S10_4757


In [72]:
#Number of unique values in productCode
len(queryResult)

109
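
The two counts above can also be computed by the database itself. As a sketch, assuming a one-column `orderdetails` table with a few made-up product codes in an in-memory database, `COUNT(*)` and `COUNT(DISTINCT ...)` return both numbers in one query, and `GROUP BY` with `HAVING` lists which values repeat:

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE orderdetails (productCode TEXT);")
demo_conn.executemany(
    "INSERT INTO orderdetails VALUES (?);",
    [
        ("S10_1678",), ("S10_1678",), ("S10_1678",),
        ("S10_1949",), ("S10_1949",),
        ("S10_2016",),
    ],
)

# Total rows versus distinct product codes
total, unique = demo_conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT productCode) FROM orderdetails;"
).fetchone()

# Product codes that occur more than once, with their frequency
duplicated = demo_conn.execute("""
    SELECT productCode, COUNT(*) AS occurrences
    FROM orderdetails
    GROUP BY productCode
    HAVING COUNT(*) > 1
    ORDER BY productCode;
""").fetchall()

print(total, unique, duplicated)
```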

####  <span style = "color:red">Original code.</span>
<!--- 
query = """SELECT DISTINCT productCode

FROM 
orderDetails;"""

queryResult = pd.read_sql_query(query,db)

queryResult.head()
-->

## SubQuery

#### We often need to read records that satisfy certain conditions, but in some cases those conditions must be applied to another table. In cases like this, the way to go is to use subqueries.

1. Example syntax:
        SELECT column-names
        FROM table-name1
        WHERE value IN (SELECT column-name
                   FROM table-name2)

#### Let's first look at the tables we are going to work with.

In [74]:
query = """SELECT *

FROM offices;"""

queryResult = pd.read_sql_query(query,db)

queryResult.head()

Unnamed: 0,officeCode,city,phone,addressLine1,addressLine2,state,country,postalCode,territory
0,1,San Francisco,+1 650 219 4782,100 Market Street,Suite 300,CA,USA,94080,
1,2,Boston,+1 215 837 0825,1550 Court Place,Suite 102,MA,USA,02107,
2,3,NYC,+1 212 555 3000,523 East 53rd Street,apt. 5A,NY,USA,10022,
3,4,Paris,+33 14 723 4404,43 Rue Jouffroy D’abbans,,,France,75017,EMEA
4,5,Tokyo,+81 33 224 5000,4-1 Kioicho,,Chiyoda-Ku,Japan,102-8578,Japan


In [75]:
query = """SELECT *

FROM employees;"""

queryResult = pd.read_sql_query(query,db)

queryResult.head()

Unnamed: 0,employeeNumber,lastName,firstName,extension,email,officeCode,reportsTo,jobTitle
0,1002,Murphy,Diane,x5800,dmurphy@classicmodelcars.com,1,,President
1,1056,Patterson,Mary,x4611,mpatterso@classicmodelcars.com,1,1002.0,VP Sales
2,1076,Firrelli,Jeff,x9273,jfirrelli@classicmodelcars.com,1,1002.0,VP Marketing
3,1088,Patterson,William,x4871,wpatterson@classicmodelcars.com,6,1056.0,Sales Manager (APAC)
4,1102,Bondur,Gerard,x5408,gbondur@classicmodelcars.com,4,1056.0,Sale Manager (EMEA)


In [76]:
query = """SELECT *
    
    FROM 
    employees
    
    WHERE
    officeCode in (SELECT officeCode
    
    FROM offices
    
    WHERE
    city = 'London');"""

queryResult = pd.read_sql_query(query,db)

queryResult

Unnamed: 0,employeeNumber,lastName,firstName,extension,email,officeCode,reportsTo,jobTitle
0,1501,Bott,Larry,x2311,lbott@classicmodelcars.com,7,1102,Sales Rep
1,1504,Jones,Barry,x102,bjones@classicmodelcars.com,7,1102,Sales Rep


In [77]:
query = """SELECT officeCode
    
    FROM offices
    
    WHERE
    city = 'London';"""

queryResult = pd.read_sql_query(query,db)

queryResult

Unnamed: 0,officeCode
0,7
