# INNER JOIN & LEFT JOIN

> The defining feature of SQL is arguably the join, because joining tables is what relational databases are really designed for. Although there are several types of join, none is used as often as INNER JOIN and LEFT JOIN. This section covers both operators.

# SET UP

In [None]:
import sqlite3
import pandas as pd
import urllib.request

# download SQLite database and connect to it
urllib.request.urlretrieve("https://github.com/thomasnield/anaconda_intro_to_sql/blob/main/company_operations.db?raw=true", "company_operations.db")
conn = sqlite3.connect('company_operations.db')

In [None]:
%%capture
%load_ext sql

In [None]:
%sql sqlite:///company_operations.db

# Primary and foreign keys

* Display the first 5 rows of `CUSTOMER` and `CUSTOMER_ORDER`. What do they have in common?

In [None]:
sql_customer = """
SELECT * FROM CUSTOMER
LIMIT 5
"""
pd.read_sql(sql_customer,conn)

Unnamed: 0,CUSTOMER_ID,CUSTOMER_NAME,ADDRESS,CITY,STATE,ZIP,CATEGORY
0,1,Alpha Medical,18745 Train Dr,Dallas,TX,75021,INDUSTRIAL
1,2,Oak Cliff Base,2379 Cliff Ave,Abbevile,LA,70510,GOVERNMENT
2,3,Sports Unlimited,1605 Station Dr,Alexandrai,LA,71301,COMMERCIAL
3,4,Riley Sporting Goods,9854 Firefly Blvd,Austin,TX,78701,COMMERCIAL
4,5,Lite Industrial,462 Roadrunner Blvd,Houston,TX,77254,INDUSTRIAL


In [None]:
sql_customer_order = """
SELECT * FROM CUSTOMER_ORDER
LIMIT 5
"""
pd.read_sql(sql_customer_order,conn)

Unnamed: 0,CUSTOMER_ORDER_ID,CUSTOMER_ID,ORDER_DATE,PRODUCT_ID,QUANTITY,RUSH_SHIP
0,1,9,2021-01-01,7,20,0
1,2,5,2021-01-01,15,110,0
2,3,3,2021-01-01,4,120,0
3,4,6,2021-01-01,7,200,0
4,5,2,2021-01-01,3,60,0


Both tables have a CUSTOMER_ID column, and you can probably infer what it represents in each table. In the CUSTOMER table, a unique CUSTOMER_ID is assigned to each customer record. In the CUSTOMER_ORDER table, that same CUSTOMER_ID is used to assign each order to a given customer.

This makes CUSTOMER the parent table, with CUSTOMER_ID as its primary key. CUSTOMER_ORDER is the child table, with CUSTOMER_ID as a foreign key. You can think of the parent table as supplying data to the child table, from the primary key to the foreign key.

A primary key cannot contain duplicate values, which makes sense because no two customers should share the same CUSTOMER_ID. A foreign key column, however, can contain the same value many times, because a given customer can have several orders. This is a classic one-to-many relationship.

These relationships are designed to be joined, and they are a fundamental use case for INNER JOIN and LEFT JOIN.
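To make the primary key and foreign key rules concrete, here is a minimal sketch on a throwaway in-memory database. The table definitions and constraints below are assumptions made for illustration: the tutorial does not show how `company_operations.db` actually declares its keys. The sketch also assumes a SQLite build with foreign key support, which only enforces foreign keys once `PRAGMA foreign_keys = ON` is set.

```python
import sqlite3

# Throwaway in-memory database; the schema below is a hypothetical
# simplification, not the real definition used by company_operations.db.
conn_demo = sqlite3.connect(":memory:")
conn_demo.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn_demo.execute("""
    CREATE TABLE CUSTOMER (
        CUSTOMER_ID   INTEGER PRIMARY KEY,
        CUSTOMER_NAME TEXT NOT NULL
    )
""")
conn_demo.execute("""
    CREATE TABLE CUSTOMER_ORDER (
        CUSTOMER_ORDER_ID INTEGER PRIMARY KEY,
        CUSTOMER_ID       INTEGER NOT NULL REFERENCES CUSTOMER (CUSTOMER_ID),
        QUANTITY          INTEGER NOT NULL
    )
""")
conn_demo.execute("INSERT INTO CUSTOMER VALUES (1, 'Alpha Medical')")
conn_demo.execute("INSERT INTO CUSTOMER VALUES (2, 'Oak Cliff Base')")

# One-to-many: the same CUSTOMER_ID may appear on several orders.
conn_demo.executemany(
    "INSERT INTO CUSTOMER_ORDER VALUES (?, ?, ?)",
    [(1, 2, 60), (2, 2, 40)],
)

# A duplicate primary key value is rejected.
try:
    conn_demo.execute("INSERT INTO CUSTOMER VALUES (2, 'Duplicate')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# An order that points at a non-existent customer is rejected as well.
try:
    conn_demo.execute("INSERT INTO CUSTOMER_ORDER VALUES (3, 99, 10)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

orders_for_customer_2 = conn_demo.execute(
    "SELECT COUNT(*) FROM CUSTOMER_ORDER WHERE CUSTOMER_ID = 2"
).fetchone()[0]
print(duplicate_rejected, orphan_rejected, orders_for_customer_2)
```

The parent table refuses a second customer with ID 2, while the child table happily stores two orders for that same customer.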

# INNER JOIN

INNER JOIN is the most common type of join in SQL. It combines two or more tables on one or more columns. In our example, it would be useful to make the CUSTOMER_ORDER records more descriptive by bringing in the CUSTOMER information alongside each CUSTOMER_ORDER record. An INNER JOIN can do this, as shown below.

In [None]:
sql = """
SELECT
CUSTOMER_ORDER_ID,
CUSTOMER.CUSTOMER_ID,
CUSTOMER_NAME,
ADDRESS,
CITY,
STATE,
ZIP,
ORDER_DATE,
PRODUCT_ID,
QUANTITY
FROM CUSTOMER INNER JOIN CUSTOMER_ORDER
ON CUSTOMER.CUSTOMER_ID = CUSTOMER_ORDER.CUSTOMER_ID
"""

pd.read_sql(sql,conn)

Unnamed: 0,CUSTOMER_ORDER_ID,CUSTOMER_ID,CUSTOMER_NAME,ADDRESS,CITY,STATE,ZIP,ORDER_DATE,PRODUCT_ID,QUANTITY
0,1,9,Dent Research,392 45th St,Waco,TX,76700,2021-01-01,7,20
1,2,5,Lite Industrial,462 Roadrunner Blvd,Houston,TX,77254,2021-01-01,15,110
2,3,3,Sports Unlimited,1605 Station Dr,Alexandrai,LA,71301,2021-01-01,4,120
3,4,6,Prairie Sports Center,689 Stadium Way,Tulsa,OK,74101,2021-01-01,7,200
4,5,2,Oak Cliff Base,2379 Cliff Ave,Abbevile,LA,70510,2021-01-01,3,60
...,...,...,...,...,...,...,...,...,...,...
1185,1994,9,Dent Research,392 45th St,Waco,TX,76700,2021-03-31,4,70
1186,1995,5,Lite Industrial,462 Roadrunner Blvd,Houston,TX,77254,2021-03-31,8,140
1187,1996,10,Gamma Solutions,2752 27th St,Phoenix,AZ,85001,2021-03-31,7,80
1188,1997,9,Dent Research,392 45th St,Waco,TX,76700,2021-03-31,6,20


In [None]:
%%sql
SELECT
CUSTOMER_ORDER_ID,
CUSTOMER.CUSTOMER_ID,
CUSTOMER_NAME,
ADDRESS,
CITY,
STATE,
ZIP,
ORDER_DATE,
PRODUCT_ID,
QUANTITY
FROM CUSTOMER INNER JOIN CUSTOMER_ORDER
ON CUSTOMER.CUSTOMER_ID = CUSTOMER_ORDER.CUSTOMER_ID
LIMIT 5

 * sqlite:///company_operations.db
Done.


CUSTOMER_ORDER_ID,CUSTOMER_ID,CUSTOMER_NAME,ADDRESS,CITY,STATE,ZIP,ORDER_DATE,PRODUCT_ID,QUANTITY
1,9,Dent Research,392 45th St,Waco,TX,76700,2021-01-01,7,20
2,5,Lite Industrial,462 Roadrunner Blvd,Houston,TX,77254,2021-01-01,15,110
3,3,Sports Unlimited,1605 Station Dr,Alexandrai,LA,71301,2021-01-01,4,120
4,6,Prairie Sports Center,689 Stadium Way,Tulsa,OK,74101,2021-01-01,7,200
5,2,Oak Cliff Base,2379 Cliff Ave,Abbevile,LA,70510,2021-01-01,3,60


* INNER JOIN WRITTEN WITH A WHERE CLAUSE

In [None]:
%%sql
SELECT
CUSTOMER_ORDER_ID,
CUSTOMER.CUSTOMER_ID,
CUSTOMER_NAME,
ADDRESS,
CITY,
STATE,
ZIP,
ORDER_DATE,
PRODUCT_ID,
QUANTITY
FROM CUSTOMER , CUSTOMER_ORDER
WHERE CUSTOMER.CUSTOMER_ID = CUSTOMER_ORDER.CUSTOMER_ID
LIMIT 5

 * sqlite:///company_operations.db
Done.


CUSTOMER_ORDER_ID,CUSTOMER_ID,CUSTOMER_NAME,ADDRESS,CITY,STATE,ZIP,ORDER_DATE,PRODUCT_ID,QUANTITY
1,9,Dent Research,392 45th St,Waco,TX,76700,2021-01-01,7,20
2,5,Lite Industrial,462 Roadrunner Blvd,Houston,TX,77254,2021-01-01,15,110
3,3,Sports Unlimited,1605 Station Dr,Alexandrai,LA,71301,2021-01-01,4,120
4,6,Prairie Sports Center,689 Stadium Way,Tulsa,OK,74101,2021-01-01,7,200
5,2,Oak Cliff Base,2379 Cliff Ave,Abbevile,LA,70510,2021-01-01,3,60
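The two queries above return the same rows. The sketch below checks that equivalence on a small in-memory database with made-up rows (the data is an assumption, not taken from `company_operations.db`), and also shows what the comma syntax produces when the WHERE condition is forgotten: every customer paired with every order.

```python
import sqlite3

# Hypothetical miniature tables, for illustration only.
conn_demo = sqlite3.connect(":memory:")
conn_demo.executescript("""
    CREATE TABLE CUSTOMER (CUSTOMER_ID INTEGER PRIMARY KEY, CUSTOMER_NAME TEXT);
    CREATE TABLE CUSTOMER_ORDER (
        CUSTOMER_ORDER_ID INTEGER PRIMARY KEY,
        CUSTOMER_ID INTEGER,
        QUANTITY INTEGER
    );
    INSERT INTO CUSTOMER VALUES
        (1, 'Alpha Medical'), (2, 'Oak Cliff Base'), (3, 'Sports Unlimited');
    INSERT INTO CUSTOMER_ORDER VALUES (1, 3, 120), (2, 2, 60), (3, 2, 40);
""")

# Explicit syntax: the join condition lives in the ON clause.
explicit_rows = conn_demo.execute("""
    SELECT CUSTOMER_ORDER_ID, CUSTOMER.CUSTOMER_ID, CUSTOMER_NAME, QUANTITY
    FROM CUSTOMER INNER JOIN CUSTOMER_ORDER
    ON CUSTOMER.CUSTOMER_ID = CUSTOMER_ORDER.CUSTOMER_ID
    ORDER BY CUSTOMER_ORDER_ID
""").fetchall()

# Implicit syntax: comma-separated tables, join condition in WHERE.
implicit_rows = conn_demo.execute("""
    SELECT CUSTOMER_ORDER_ID, CUSTOMER.CUSTOMER_ID, CUSTOMER_NAME, QUANTITY
    FROM CUSTOMER, CUSTOMER_ORDER
    WHERE CUSTOMER.CUSTOMER_ID = CUSTOMER_ORDER.CUSTOMER_ID
    ORDER BY CUSTOMER_ORDER_ID
""").fetchall()

# Without the WHERE condition the comma syntax yields the Cartesian product.
cross_count = conn_demo.execute(
    "SELECT COUNT(*) FROM CUSTOMER, CUSTOMER_ORDER"
).fetchone()[0]

print(explicit_rows == implicit_rows, len(explicit_rows), cross_count)
```

Keeping the join condition in an ON clause makes it harder to drop by accident, which is one reason the explicit form is often preferred.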


# LEFT JOIN

What happens if some CUSTOMER records have no CUSTOMER_ORDER records? Do they show up in an INNER JOIN? For example, "Alpha Medical", with a CUSTOMER_ID of 1, has no orders. Does it appear in our INNER JOIN query? Let's add a WHERE condition to find out.

In [None]:
%%sql
SELECT
CUSTOMER_ORDER_ID,
CUSTOMER.CUSTOMER_ID,
CUSTOMER_NAME,
ADDRESS,
CITY,
STATE,
ZIP,
ORDER_DATE,
PRODUCT_ID,
QUANTITY
FROM CUSTOMER INNER JOIN CUSTOMER_ORDER
ON CUSTOMER.CUSTOMER_ID = CUSTOMER_ORDER.CUSTOMER_ID

WHERE CUSTOMER.CUSTOMER_ID = 1


 * sqlite:///company_operations.db
Done.


CUSTOMER_ORDER_ID,CUSTOMER_ID,CUSTOMER_NAME,ADDRESS,CITY,STATE,ZIP,ORDER_DATE,PRODUCT_ID,QUANTITY


We do indeed get an empty result. But look at what happens if we change the INNER JOIN to a LEFT JOIN (also written LEFT OUTER JOIN; the two keywords name the same operation).

In [None]:
%%sql
SELECT
CUSTOMER_ORDER_ID,
CUSTOMER.CUSTOMER_ID,
CUSTOMER_NAME,
ADDRESS,
CITY,
STATE,
ZIP,
ORDER_DATE,
PRODUCT_ID,
QUANTITY
FROM CUSTOMER LEFT JOIN CUSTOMER_ORDER
ON CUSTOMER.CUSTOMER_ID = CUSTOMER_ORDER.CUSTOMER_ID

WHERE CUSTOMER.CUSTOMER_ID = 1


 * sqlite:///company_operations.db
Done.


CUSTOMER_ORDER_ID,CUSTOMER_ID,CUSTOMER_NAME,ADDRESS,CITY,STATE,ZIP,ORDER_DATE,PRODUCT_ID,QUANTITY
,1,Alpha Medical,18745 Train Dr,Dallas,TX,75021,,,


Notice how "Alpha Medical" now appears with a placeholder record even though it has no CUSTOMER_ORDER records. All of its CUSTOMER_ORDER fields are NULL (which Pandas displays as None or NaN, depending on the column type) because there were no CUSTOMER_ORDER records to join and fill in that information. The LEFT JOIN added this placeholder record so that the customer is not dropped.

In other words, a LEFT JOIN includes every record from the "left" table even when there are no matching records in the "right" table. "Left" means the table literally written to the left of the LEFT JOIN operator. This means the order in which you declare the tables in your FROM clause matters with a LEFT JOIN.
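A small sketch of why table order matters with LEFT JOIN, using two hypothetical in-memory tables (the rows are invented for illustration). With CUSTOMER on the left, the customer without orders is kept and its order column comes back as `None`, which is how Python's `sqlite3` module represents NULL. With CUSTOMER_ORDER on the left, that customer disappears.

```python
import sqlite3

# Hypothetical miniature tables, for illustration only.
conn_demo = sqlite3.connect(":memory:")
conn_demo.executescript("""
    CREATE TABLE CUSTOMER (CUSTOMER_ID INTEGER PRIMARY KEY, CUSTOMER_NAME TEXT);
    CREATE TABLE CUSTOMER_ORDER (
        CUSTOMER_ORDER_ID INTEGER PRIMARY KEY,
        CUSTOMER_ID INTEGER
    );
    INSERT INTO CUSTOMER VALUES (1, 'Alpha Medical'), (2, 'Oak Cliff Base');
    INSERT INTO CUSTOMER_ORDER VALUES (1, 2), (2, 2);
""")

# CUSTOMER is the left table: every customer is kept.
customer_on_left = conn_demo.execute("""
    SELECT CUSTOMER.CUSTOMER_ID, CUSTOMER_NAME, CUSTOMER_ORDER_ID
    FROM CUSTOMER LEFT JOIN CUSTOMER_ORDER
    ON CUSTOMER.CUSTOMER_ID = CUSTOMER_ORDER.CUSTOMER_ID
    ORDER BY CUSTOMER.CUSTOMER_ID, CUSTOMER_ORDER_ID
""").fetchall()

# CUSTOMER_ORDER is the left table: every order is kept instead.
order_on_left = conn_demo.execute("""
    SELECT CUSTOMER.CUSTOMER_ID, CUSTOMER_NAME, CUSTOMER_ORDER_ID
    FROM CUSTOMER_ORDER LEFT JOIN CUSTOMER
    ON CUSTOMER.CUSTOMER_ID = CUSTOMER_ORDER.CUSTOMER_ID
    ORDER BY CUSTOMER_ORDER_ID
""").fetchall()

print(customer_on_left)
print(order_on_left)
```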

> There is also a RIGHT JOIN (or RIGHT OUTER JOIN) operator, which reverses the direction and includes every record from the right table even when there is nothing to join in the left table. It is rarely used, because anything a RIGHT JOIN can do can also be done with a LEFT JOIN by swapping the table order. There is also a FULL OUTER JOIN, which includes all records in both directions, but it is rarely used as well. SQLite versions before 3.39.0 do not support RIGHT JOIN or FULL OUTER JOIN; support for both was added in version 3.39.0.
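Where RIGHT JOIN or FULL OUTER JOIN is unavailable, both can be expressed with LEFT JOIN alone. The sketch below is one common way to do it, shown on invented in-memory rows. It deliberately includes an order whose CUSTOMER_ID (99) matches no customer, a contrived case added only so that both directions have an unmatched row.

```python
import sqlite3

# Hypothetical miniature tables; order 2 points at a non-existent customer on purpose.
conn_demo = sqlite3.connect(":memory:")
conn_demo.executescript("""
    CREATE TABLE CUSTOMER (CUSTOMER_ID INTEGER PRIMARY KEY, CUSTOMER_NAME TEXT);
    CREATE TABLE CUSTOMER_ORDER (
        CUSTOMER_ORDER_ID INTEGER PRIMARY KEY,
        CUSTOMER_ID INTEGER
    );
    INSERT INTO CUSTOMER VALUES (1, 'Alpha Medical'), (2, 'Oak Cliff Base');
    INSERT INTO CUSTOMER_ORDER VALUES (1, 2), (2, 99);
""")

# "CUSTOMER RIGHT JOIN CUSTOMER_ORDER" rewritten as a LEFT JOIN with the tables swapped.
right_join_emulated = conn_demo.execute("""
    SELECT CUSTOMER.CUSTOMER_ID, CUSTOMER_NAME, CUSTOMER_ORDER_ID
    FROM CUSTOMER_ORDER LEFT JOIN CUSTOMER
    ON CUSTOMER.CUSTOMER_ID = CUSTOMER_ORDER.CUSTOMER_ID
    ORDER BY CUSTOMER_ORDER_ID
""").fetchall()

# FULL OUTER JOIN emulated as: all customers (with or without orders),
# plus the orders that matched no customer.
full_outer_emulated = conn_demo.execute("""
    SELECT CUSTOMER.CUSTOMER_ID, CUSTOMER_NAME, CUSTOMER_ORDER_ID
    FROM CUSTOMER LEFT JOIN CUSTOMER_ORDER
    ON CUSTOMER.CUSTOMER_ID = CUSTOMER_ORDER.CUSTOMER_ID
    UNION ALL
    SELECT CUSTOMER.CUSTOMER_ID, CUSTOMER_NAME, CUSTOMER_ORDER_ID
    FROM CUSTOMER_ORDER LEFT JOIN CUSTOMER
    ON CUSTOMER.CUSTOMER_ID = CUSTOMER_ORDER.CUSTOMER_ID
    WHERE CUSTOMER.CUSTOMER_ID IS NULL
""").fetchall()

print(right_join_emulated)
print(full_outer_emulated)
```

The `WHERE CUSTOMER.CUSTOMER_ID IS NULL` filter in the second half keeps matched rows from being counted twice.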








As we will see, this is useful when building reports later, because we probably want to include customers that have no orders. Another common use of LEFT JOIN is finding parent records that have no children, such as CUSTOMER records with no CUSTOMER_ORDER records. We can do this by filtering for CUSTOMER_ORDER fields that are NULL: these fields are normally never NULL, and become NULL only as a result of the LEFT JOIN.

* Which customers have not placed any orders? List their customer IDs and names.

|    |   CUSTOMER_ID | CUSTOMER_NAME   |
|---:|--------------:|:----------------|
|  0 |             1 | Alpha Medical   |

In [None]:
#SYLVIE
sql = """
SELECT
      CUSTOMER.CUSTOMER_ID,
      CUSTOMER_NAME
FROM CUSTOMER
LEFT JOIN CUSTOMER_ORDER
ON CUSTOMER.CUSTOMER_ID = CUSTOMER_ORDER.CUSTOMER_ID
WHERE CUSTOMER_ORDER.CUSTOMER_ID IS NULL """
pd.read_sql(sql,conn)

Unnamed: 0,CUSTOMER_ID,CUSTOMER_NAME
0,1,Alpha Medical
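The same question can also be phrased with a `NOT EXISTS` subquery instead of a LEFT JOIN filtered on NULL. This is an alternative formulation, not the approach used in the cell above; the sketch checks on invented in-memory rows that both forms return the same customer.

```python
import sqlite3

# Hypothetical miniature tables, for illustration only.
conn_demo = sqlite3.connect(":memory:")
conn_demo.executescript("""
    CREATE TABLE CUSTOMER (CUSTOMER_ID INTEGER PRIMARY KEY, CUSTOMER_NAME TEXT);
    CREATE TABLE CUSTOMER_ORDER (
        CUSTOMER_ORDER_ID INTEGER PRIMARY KEY,
        CUSTOMER_ID INTEGER
    );
    INSERT INTO CUSTOMER VALUES
        (1, 'Alpha Medical'), (2, 'Oak Cliff Base'), (3, 'Sports Unlimited');
    INSERT INTO CUSTOMER_ORDER VALUES (1, 3), (2, 2), (3, 2);
""")

# LEFT JOIN, then keep only the rows where no order was matched.
left_join_rows = conn_demo.execute("""
    SELECT CUSTOMER.CUSTOMER_ID, CUSTOMER_NAME
    FROM CUSTOMER LEFT JOIN CUSTOMER_ORDER
    ON CUSTOMER.CUSTOMER_ID = CUSTOMER_ORDER.CUSTOMER_ID
    WHERE CUSTOMER_ORDER.CUSTOMER_ID IS NULL
""").fetchall()

# Same question asked directly: customers for which no order exists.
not_exists_rows = conn_demo.execute("""
    SELECT CUSTOMER_ID, CUSTOMER_NAME
    FROM CUSTOMER
    WHERE NOT EXISTS (
        SELECT 1 FROM CUSTOMER_ORDER
        WHERE CUSTOMER_ORDER.CUSTOMER_ID = CUSTOMER.CUSTOMER_ID
    )
""").fetchall()

print(left_join_rows, not_exists_rows)
```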


# JOINING multiple tables

What if we wanted to add PRODUCT information to our CUSTOMER_ORDER records in addition to the CUSTOMER information?

We can do that with a second join. Let's look at the PRODUCT table and note that it uses a PRODUCT_ID, which also exists in the CUSTOMER_ORDER table as a foreign key.








In [None]:
sql ="""
SELECT * FROM PRODUCT
LIMIT 3
"""
pd.read_sql(sql,conn)

Unnamed: 0,PRODUCT_ID,PRODUCT_NAME,PRODUCT_GROUP,PRICE
0,1,Eagle Kit,ALPHA,120
1,2,Hawkeye Cam,ALPHA,80
2,3,Sparrow Blade,BETA,40


In [None]:
sql ="""
SELECT * FROM CUSTOMER_ORDER
LIMIT 3
"""
pd.read_sql(sql,conn)

Unnamed: 0,CUSTOMER_ORDER_ID,CUSTOMER_ID,ORDER_DATE,PRODUCT_ID,QUANTITY,RUSH_SHIP
0,1,9,2021-01-01,7,20,0
1,2,5,2021-01-01,15,110,0
2,3,3,2021-01-01,4,120,0


In [None]:
sql ="""
SELECT * FROM CUSTOMER
LIMIT 3
"""
pd.read_sql(sql,conn)

Unnamed: 0,CUSTOMER_ID,CUSTOMER_NAME,ADDRESS,CITY,STATE,ZIP,CATEGORY
0,1,Alpha Medical,18745 Train Dr,Dallas,TX,75021,INDUSTRIAL
1,2,Oak Cliff Base,2379 Cliff Ave,Abbevile,LA,70510,GOVERNMENT
2,3,Sports Unlimited,1605 Station Dr,Alexandrai,LA,71301,COMMERCIAL


Reproduce this result with a SQL query:



|    |   CUSTOMER_ORDER_ID |   CUSTOMER_ID | CUSTOMER_NAME         | ADDRESS             | CITY       | STATE   |   ZIP | ORDER_DATE   |   PRODUCT_ID |   QUANTITY |   PRICE |
|---:|--------------------:|--------------:|:----------------------|:--------------------|:-----------|:--------|------:|:-------------|-------------:|-----------:|--------:|
|  0 |                   1 |             9 | Dent Research         | 392 45th St         | Waco       | TX      | 76700 | 2021-01-01   |            7 |         20 |      56 |
|  1 |                   2 |             5 | Lite Industrial       | 462 Roadrunner Blvd | Houston    | TX      | 77254 | 2021-01-01   |           15 |        110 |      40 |
|  2 |                   3 |             3 | Sports Unlimited      | 1605 Station Dr     | Alexandrai | LA      | 71301 | 2021-01-01   |            4 |        120 |      40 |
|  3 |                   4 |             6 | Prairie Sports Center | 689 Stadium Way     | Tulsa      | OK      | 74101 | 2021-01-01   |            7 |        200 |      56 |
|  4 |                   5 |             2 | Oak Cliff Base        | 2379 Cliff Ave      | Abbevile   | LA      | 70510 | 2021-01-01   |            3 |         60 |      40 |

In [None]:
# INNER JOIN across 3 tables (implicit syntax: comma-separated FROM with WHERE conditions)

sql = """
SELECT
    CUSTOMER_ORDER_ID,
    CUSTOMER.CUSTOMER_ID,
    CUSTOMER_NAME,
    ADDRESS,
    CITY,
    STATE,
    ZIP,
    ORDER_DATE,
    CUSTOMER_ORDER.PRODUCT_ID,
    PRODUCT.PRODUCT_NAME,
    QUANTITY,
    PRICE
FROM CUSTOMER, CUSTOMER_ORDER, PRODUCT
WHERE CUSTOMER_ORDER.CUSTOMER_ID = CUSTOMER.CUSTOMER_ID
AND CUSTOMER_ORDER.PRODUCT_ID = PRODUCT.PRODUCT_ID
 """
pd.read_sql(sql,conn)

Unnamed: 0,CUSTOMER_ORDER_ID,CUSTOMER_ID,CUSTOMER_NAME,ADDRESS,CITY,STATE,ZIP,ORDER_DATE,PRODUCT_ID,PRODUCT_NAME,QUANTITY,PRICE
0,1,9,Dent Research,392 45th St,Waco,TX,76700,2021-01-01,7,Vulture X,20,56
1,2,5,Lite Industrial,462 Roadrunner Blvd,Houston,TX,77254,2021-01-01,15,Kriket Light XL,110,40
2,3,3,Sports Unlimited,1605 Station Dr,Alexandrai,LA,71301,2021-01-01,4,Raven Klaw,120,40
3,4,6,Prairie Sports Center,689 Stadium Way,Tulsa,OK,74101,2021-01-01,7,Vulture X,200,56
4,5,2,Oak Cliff Base,2379 Cliff Ave,Abbevile,LA,70510,2021-01-01,3,Sparrow Blade,60,40
...,...,...,...,...,...,...,...,...,...,...,...,...
1185,1994,9,Dent Research,392 45th St,Waco,TX,76700,2021-03-31,4,Raven Klaw,70,40
1186,1995,5,Lite Industrial,462 Roadrunner Blvd,Houston,TX,77254,2021-03-31,8,Roadrunner Pro,140,70
1187,1996,10,Gamma Solutions,2752 27th St,Phoenix,AZ,85001,2021-03-31,7,Vulture X,80,56
1188,1997,9,Dent Research,392 45th St,Waco,TX,76700,2021-03-31,6,Owl NV,20,100


In [None]:
# LEFT JOIN across 3 tables

sql = """
SELECT
    CUSTOMER_ORDER_ID,
    CUSTOMER.CUSTOMER_ID,
    CUSTOMER_NAME,
    ADDRESS,
    CITY,
    STATE,
    ZIP,
    ORDER_DATE,
    CUSTOMER_ORDER.PRODUCT_ID,
    PRODUCT.PRODUCT_NAME,
    QUANTITY,
    PRICE
FROM CUSTOMER LEFT JOIN CUSTOMER_ORDER
ON CUSTOMER_ORDER.CUSTOMER_ID = CUSTOMER.CUSTOMER_ID
LEFT JOIN PRODUCT
ON CUSTOMER_ORDER.PRODUCT_ID = PRODUCT.PRODUCT_ID
 """
pd.read_sql(sql,conn)

Unnamed: 0,CUSTOMER_ORDER_ID,CUSTOMER_ID,CUSTOMER_NAME,ADDRESS,CITY,STATE,ZIP,ORDER_DATE,PRODUCT_ID,PRODUCT_NAME,QUANTITY,PRICE
0,,1,Alpha Medical,18745 Train Dr,Dallas,TX,75021,,,,,
1,5.0,2,Oak Cliff Base,2379 Cliff Ave,Abbevile,LA,70510,2021-01-01,3.0,Sparrow Blade,60.0,40.0
2,16.0,2,Oak Cliff Base,2379 Cliff Ave,Abbevile,LA,70510,2021-01-01,13.0,Natterjack Light,60.0,40.0
3,13.0,2,Oak Cliff Base,2379 Cliff Ave,Abbevile,LA,70510,2021-01-01,15.0,Kriket Light XL,10.0,40.0
4,129.0,2,Oak Cliff Base,2379 Cliff Ave,Abbevile,LA,70510,2021-01-07,13.0,Natterjack Light,40.0,40.0
...,...,...,...,...,...,...,...,...,...,...,...,...
1186,1989.0,10,Gamma Solutions,2752 27th St,Phoenix,AZ,85001,2021-03-31,2.0,Hawkeye Cam,90.0,80.0
1187,1975.0,10,Gamma Solutions,2752 27th St,Phoenix,AZ,85001,2021-03-31,2.0,Hawkeye Cam,170.0,80.0
1188,1980.0,10,Gamma Solutions,2752 27th St,Phoenix,AZ,85001,2021-03-31,4.0,Raven Klaw,30.0,40.0
1189,1996.0,10,Gamma Solutions,2752 27th St,Phoenix,AZ,85001,2021-03-31,7.0,Vulture X,80.0,56.0
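When chaining joins like this, the type of the second join matters too. The sketch below, on invented in-memory rows, compares LEFT JOIN followed by LEFT JOIN with LEFT JOIN followed by INNER JOIN. In the second case the customer without orders is lost again, because its NULL PRODUCT_ID matches no product. The `{second_join}` placeholder and `query_template` name are just a convenience of this sketch.

```python
import sqlite3

# Hypothetical miniature tables, for illustration only.
conn_demo = sqlite3.connect(":memory:")
conn_demo.executescript("""
    CREATE TABLE CUSTOMER (CUSTOMER_ID INTEGER PRIMARY KEY, CUSTOMER_NAME TEXT);
    CREATE TABLE PRODUCT (PRODUCT_ID INTEGER PRIMARY KEY, PRODUCT_NAME TEXT, PRICE INTEGER);
    CREATE TABLE CUSTOMER_ORDER (
        CUSTOMER_ORDER_ID INTEGER PRIMARY KEY,
        CUSTOMER_ID INTEGER,
        PRODUCT_ID INTEGER,
        QUANTITY INTEGER
    );
    INSERT INTO CUSTOMER VALUES (1, 'Alpha Medical'), (2, 'Oak Cliff Base');
    INSERT INTO PRODUCT VALUES (3, 'Sparrow Blade', 40);
    INSERT INTO CUSTOMER_ORDER VALUES (5, 2, 3, 60);
""")

# Same query twice; only the type of the second join changes.
query_template = """
    SELECT CUSTOMER.CUSTOMER_ID, CUSTOMER_NAME, PRODUCT_NAME
    FROM CUSTOMER LEFT JOIN CUSTOMER_ORDER
    ON CUSTOMER_ORDER.CUSTOMER_ID = CUSTOMER.CUSTOMER_ID
    {second_join} JOIN PRODUCT
    ON CUSTOMER_ORDER.PRODUCT_ID = PRODUCT.PRODUCT_ID
    ORDER BY CUSTOMER.CUSTOMER_ID
"""
left_then_left = conn_demo.execute(
    query_template.format(second_join="LEFT")
).fetchall()
left_then_inner = conn_demo.execute(
    query_template.format(second_join="INNER")
).fetchall()

print(left_then_left)   # customer 1 is kept, with a NULL product
print(left_then_inner)  # customer 1 is dropped by the INNER JOIN
```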


# JOINS AND AGGREGATION

* To get the total revenue per customer, first add a PRICE * QUANTITY expression and call it REVENUE.
* Then apply the SUM() function to that expression and add a GROUP BY clause to group by the CUSTOMER attributes.

In [None]:
sql = """
SELECT CUSTOMER.CUSTOMER_ID,
CUSTOMER_NAME,
PRICE * QUANTITY AS REVENUE
FROM CUSTOMER LEFT JOIN CUSTOMER_ORDER
ON CUSTOMER_ORDER.CUSTOMER_ID = CUSTOMER.CUSTOMER_ID
LEFT JOIN PRODUCT
ON CUSTOMER_ORDER.PRODUCT_ID = PRODUCT.PRODUCT_ID

"""
pd.read_sql(sql,conn)

Unnamed: 0,CUSTOMER_ID,CUSTOMER_NAME,REVENUE
0,1,Alpha Medical,
1,2,Oak Cliff Base,3600.0
2,2,Oak Cliff Base,4800.0
3,2,Oak Cliff Base,8400.0
4,2,Oak Cliff Base,9600.0
...,...,...,...
1186,10,Gamma Solutions,400.0
1187,10,Gamma Solutions,2800.0
1188,10,Gamma Solutions,3600.0
1189,10,Gamma Solutions,5600.0


In [None]:
sql = """
SELECT CUSTOMER.CUSTOMER_ID,
CUSTOMER_NAME,
SUM(PRICE * QUANTITY) AS TOTAL_REVENUE
FROM CUSTOMER LEFT JOIN CUSTOMER_ORDER
ON CUSTOMER_ORDER.CUSTOMER_ID = CUSTOMER.CUSTOMER_ID
LEFT JOIN PRODUCT
ON CUSTOMER_ORDER.PRODUCT_ID = PRODUCT.PRODUCT_ID
GROUP BY CUSTOMER.CUSTOMER_ID, CUSTOMER_NAME
ORDER BY TOTAL_REVENUE DESC
--LIMIT 1
"""
pd.read_sql(sql,conn)

Unnamed: 0,CUSTOMER_ID,CUSTOMER_NAME,TOTAL_REVENUE
0,4,Riley Sporting Goods,1012460.0
1,8,Allen Stadium,918920.0
2,9,Dent Research,896720.0
3,7,Facility 95,865410.0
4,6,Prairie Sports Center,779870.0
5,5,Lite Industrial,679680.0
6,10,Gamma Solutions,673500.0
7,3,Sports Unlimited,671070.0
8,2,Oak Cliff Base,664660.0
9,1,Alpha Medical,


Write a query for each case:
* **Q1** Return CUSTOMER_ID, CUSTOMER_NAME and TOTAL_REVENUE for customers that have placed at least one order
* **Q2** Return CUSTOMER_ID, CUSTOMER_NAME and TOTAL_REVENUE for all customers, replacing NULL (shown as NaN by Pandas) with 0

|    |   CUSTOMER_ID | CUSTOMER_NAME         |   TOTAL_REVENUE |
|---:|--------------:|:----------------------|----------------:|
|  0 |             4 | Riley Sporting Goods  |         1012460 |
|  1 |             8 | Allen Stadium         |          918920 |
|  2 |             9 | Dent Research         |          896720 |
|  3 |             7 | Facility 95           |          865410 |
|  4 |             6 | Prairie Sports Center |          779870 |
|  5 |             5 | Lite Industrial       |          679680 |
|  6 |            10 | Gamma Solutions       |          673500 |
|  7 |             3 | Sports Unlimited      |          671070 |
|  8 |             2 | Oak Cliff Base        |          664660 |
|  9 |             1 | Alpha Medical         |               0 |

In [None]:
# Q1
sql = """
SELECT CUSTOMER.CUSTOMER_ID,
CUSTOMER_NAME,
SUM(PRICE * QUANTITY) AS TOTAL_REVENUE
FROM CUSTOMER INNER JOIN CUSTOMER_ORDER
ON CUSTOMER_ORDER.CUSTOMER_ID = CUSTOMER.CUSTOMER_ID
LEFT JOIN PRODUCT
ON CUSTOMER_ORDER.PRODUCT_ID = PRODUCT.PRODUCT_ID
GROUP BY CUSTOMER.CUSTOMER_ID, CUSTOMER_NAME
ORDER BY TOTAL_REVENUE DESC
--LIMIT 1
"""
pd.read_sql(sql,conn)

Unnamed: 0,CUSTOMER_ID,CUSTOMER_NAME,TOTAL_REVENUE
0,4,Riley Sporting Goods,1012460
1,8,Allen Stadium,918920
2,9,Dent Research,896720
3,7,Facility 95,865410
4,6,Prairie Sports Center,779870
5,5,Lite Industrial,679680
6,10,Gamma Solutions,673500
7,3,Sports Unlimited,671070
8,2,Oak Cliff Base,664660


In [None]:
#Q2
sql = """
SELECT CUSTOMER.CUSTOMER_ID,
CUSTOMER_NAME,
COALESCE(SUM(PRICE * QUANTITY),0) AS TOTAL_REVENUE
FROM CUSTOMER LEFT JOIN CUSTOMER_ORDER
ON CUSTOMER_ORDER.CUSTOMER_ID = CUSTOMER.CUSTOMER_ID
LEFT JOIN PRODUCT
ON CUSTOMER_ORDER.PRODUCT_ID = PRODUCT.PRODUCT_ID
GROUP BY CUSTOMER.CUSTOMER_ID, CUSTOMER_NAME
ORDER BY TOTAL_REVENUE DESC
--LIMIT 1
"""
pd.read_sql(sql,conn)

Unnamed: 0,CUSTOMER_ID,CUSTOMER_NAME,TOTAL_REVENUE
0,4,Riley Sporting Goods,1012460
1,8,Allen Stadium,918920
2,9,Dent Research,896720
3,7,Facility 95,865410
4,6,Prairie Sports Center,779870
5,5,Lite Industrial,679680
6,10,Gamma Solutions,673500
7,3,Sports Unlimited,671070
8,2,Oak Cliff Base,664660
9,1,Alpha Medical,0
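Q2 works because `SUM()` returns NULL when a group contains no non-NULL values, and `COALESCE` returns its first non-NULL argument. A minimal sketch on invented in-memory rows (the data and the `query_template` helper are assumptions of this example) compares the two expressions side by side.

```python
import sqlite3

# Hypothetical miniature tables, for illustration only.
conn_demo = sqlite3.connect(":memory:")
conn_demo.executescript("""
    CREATE TABLE CUSTOMER (CUSTOMER_ID INTEGER PRIMARY KEY, CUSTOMER_NAME TEXT);
    CREATE TABLE PRODUCT (PRODUCT_ID INTEGER PRIMARY KEY, PRODUCT_NAME TEXT, PRICE INTEGER);
    CREATE TABLE CUSTOMER_ORDER (
        CUSTOMER_ORDER_ID INTEGER PRIMARY KEY,
        CUSTOMER_ID INTEGER,
        PRODUCT_ID INTEGER,
        QUANTITY INTEGER
    );
    INSERT INTO CUSTOMER VALUES (1, 'Alpha Medical'), (2, 'Oak Cliff Base');
    INSERT INTO PRODUCT VALUES (3, 'Sparrow Blade', 40);
    INSERT INTO CUSTOMER_ORDER VALUES (5, 2, 3, 60), (16, 2, 3, 10);
""")

# Same aggregation twice; only the revenue expression changes.
query_template = """
    SELECT CUSTOMER.CUSTOMER_ID, CUSTOMER_NAME, {revenue_expr} AS TOTAL_REVENUE
    FROM CUSTOMER LEFT JOIN CUSTOMER_ORDER
    ON CUSTOMER_ORDER.CUSTOMER_ID = CUSTOMER.CUSTOMER_ID
    LEFT JOIN PRODUCT
    ON CUSTOMER_ORDER.PRODUCT_ID = PRODUCT.PRODUCT_ID
    GROUP BY CUSTOMER.CUSTOMER_ID, CUSTOMER_NAME
    ORDER BY CUSTOMER.CUSTOMER_ID
"""
raw_totals = conn_demo.execute(
    query_template.format(revenue_expr="SUM(PRICE * QUANTITY)")
).fetchall()
filled_totals = conn_demo.execute(
    query_template.format(revenue_expr="COALESCE(SUM(PRICE * QUANTITY), 0)")
).fetchall()

print(raw_totals)     # the customer without orders has a NULL (None) total
print(filled_totals)  # COALESCE replaces that NULL with 0
```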


# EXERCISE
Give the total revenue per product

|    |   PRODUCT_ID | PRODUCT_NAME     |   TOTAL_REVENUE |
|---:|-------------:|:-----------------|----------------:|
|  0 |           12 | Wolverine Kit    |         1101000 |
|  1 |            6 | Owl NV           |          924000 |
|  2 |            1 | Eagle Kit        |          772800 |
|  3 |            2 | Hawkeye Cam      |          630400 |
|  4 |            8 | Roadrunner Pro   |          615300 |
|  5 |           14 | Grasshopper Pro  |          460000 |
|  6 |            7 | Vulture X        |          425040 |
|  7 |            4 | Raven Klaw       |          353600 |
|  8 |           13 | Natterjack Light |          352000 |
|  9 |           15 | Kriket Light XL  |          351600 |
| 10 |            3 | Sparrow Blade    |          339200 |
| 11 |           10 | Emu Handheld     |          272300 |
| 12 |           11 | Pelican Handheld |          199500 |
| 13 |            9 | Falcon Tracker   |          183800 |
| 14 |            5 | Kriket Light     |          181750 |

# SOLUTION
 * Start by selecting PRODUCT_ID, PRODUCT_NAME, QUANTITY and PRICE

In [None]:
sql = """
SELECT PRODUCT.PRODUCT_ID,
      PRODUCT_NAME,
      QUANTITY,
      PRICE
FROM PRODUCT LEFT JOIN CUSTOMER_ORDER
ON PRODUCT.PRODUCT_ID = CUSTOMER_ORDER.PRODUCT_ID
"""
pd.read_sql(sql,conn)

Unnamed: 0,PRODUCT_ID,PRODUCT_NAME,QUANTITY,PRICE
0,1,Eagle Kit,10,120
1,1,Eagle Kit,10,120
2,1,Eagle Kit,10,120
3,1,Eagle Kit,10,120
4,1,Eagle Kit,10,120
...,...,...,...,...
1185,15,Kriket Light XL,190,40
1186,15,Kriket Light XL,190,40
1187,15,Kriket Light XL,200,40
1188,15,Kriket Light XL,200,40


* Add each product's revenue to the SELECT

In [None]:
sql = """
SELECT PRODUCT.PRODUCT_ID,
      PRODUCT_NAME,
      QUANTITY,
      PRICE,
      QUANTITY * PRICE AS REVENUE
FROM PRODUCT LEFT JOIN CUSTOMER_ORDER
ON PRODUCT.PRODUCT_ID = CUSTOMER_ORDER.PRODUCT_ID
"""
pd.read_sql(sql,conn)

Unnamed: 0,PRODUCT_ID,PRODUCT_NAME,QUANTITY,PRICE,REVENUE
0,1,Eagle Kit,10,120,1200
1,1,Eagle Kit,10,120,1200
2,1,Eagle Kit,10,120,1200
3,1,Eagle Kit,10,120,1200
4,1,Eagle Kit,10,120,1200
...,...,...,...,...,...
1185,15,Kriket Light XL,190,40,7600
1186,15,Kriket Light XL,190,40,7600
1187,15,Kriket Light XL,200,40,8000
1188,15,Kriket Light XL,200,40,8000


* Aggregate with SUM() to get the final answer

In [None]:
sql = """
SELECT PRODUCT.PRODUCT_ID,
      PRODUCT_NAME,
      SUM(QUANTITY * PRICE) AS TOTAL_REVENUE
FROM PRODUCT LEFT JOIN CUSTOMER_ORDER
ON PRODUCT.PRODUCT_ID = CUSTOMER_ORDER.PRODUCT_ID
GROUP BY PRODUCT.PRODUCT_ID, PRODUCT_NAME
ORDER BY TOTAL_REVENUE DESC
"""
pd.read_sql(sql,conn)

Unnamed: 0,PRODUCT_ID,PRODUCT_NAME,TOTAL_REVENUE
0,12,Wolverine Kit,1101000
1,6,Owl NV,924000
2,1,Eagle Kit,772800
3,2,Hawkeye Cam,630400
4,8,Roadrunner Pro,615300
5,14,Grasshopper Pro,460000
6,7,Vulture X,425040
7,4,Raven Klaw,353600
8,13,Natterjack Light,352000
9,15,Kriket Light XL,351600
