# Project: a study of food products

## 1 - Context

You are carrying out an assignment for "UFC-Que Choisir". This association has asked you to study the quality of the food products that brands make available to consumers. The study is meant to help consumers make their choices.

The goal is to analyze brands on at least the following criteria: nutritional quality, environmental impact, and the share of organic products.


To do this, you propose using data from the Yuka app.

Here is a list of the questions the association is asking:

- Which brands play along and display the Nutri-Score? Which ones do not?

- Which brands rely most heavily on harmful additives?

- Which brands offer products with the best nutritional quality? The worst?

- Which brands play along and display the Eco-Score? Which ones do not?

- Which brands appear to be the most environmentally friendly? The least?

- Which brands offer mostly organic products?

- Is there a correlation between a product being organic and its nutritional quality?

- Is there a correlation between a product being organic and its environmental impact?

- Is there a correlation between nutritional quality and environmental impact?

- Which brands should be recommended?

- Which brands should definitely not be recommended?
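
The first question in the list (which brands display the Nutri-Score) can be approached as a per-brand share of products whose `nutriscore_grade` is filled in. The sketch below runs on a tiny made-up table. The column names follow the Yuka files, while the helper name `display_rate_by_brand` and the `min_products` threshold are illustrative assumptions. Because `brands_tags` can hold several comma-separated tags, each tag is counted separately here, which is one possible convention among others.

```python
import pandas as pd

def display_rate_by_brand(df, grade_col="nutriscore_grade", min_products=2):
    """Share of products, per brand tag, whose grade column is filled in (hypothetical helper)."""
    exploded = (
        df.dropna(subset=["brands_tags"])
          # one row per brand tag, e.g. "crous-resto,crous" becomes two rows
          .assign(brand=lambda d: d["brands_tags"].str.split(","))
          .explode("brand", ignore_index=True)
          .assign(has_grade=lambda d: d[grade_col].notna())
    )
    summary = exploded.groupby("brand")["has_grade"].agg(
        n_products="size", display_rate="mean"
    )
    # brands with very few products give unstable rates, so filter them out
    summary = summary[summary["n_products"] >= min_products]
    return summary.sort_values("display_rate", ascending=False)

# Made-up rows, for illustration only
toy = pd.DataFrame({
    "brands_tags": ["danone", "danone", "crous-resto,crous", "crous", None],
    "nutriscore_grade": ["a", None, "d", None, "b"],
})
rates = display_rate_by_brand(toy)
print(rates)
```

The same helper could be reused for the Eco-Score with `grade_col="ecoscore_grade"`, after replacing the `unknown` placeholder seen in that column with a missing value.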




## 2 - Data

Three files from the Yuka app are to be used for the analysis:

- Agriculture_biologique_final.csv

- Qualite nutritionnelle_final.csv

- Impact environnemental_final.csv

The files can be downloaded from this link:

https://www.dropbox.com/sh/pwsv4coi2sbbhyo/AABJ81-xWu3K2Cl0DOCmwbGsa?dl=0

## 3 - Instructions

Work in groups of three.

For the continuous assessment (CC), each group must hand in at the end of the sessions:

- a report summarizing the analysis carried out and the conclusions reached

- a cleaned and commented notebook

## 4 - Constraints!!!

Your work must include both univariate and multivariate analyses.
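
As a rough illustration of the two kinds of analysis required, the sketch below computes a univariate summary (the distribution of one grade column) and a bivariate test, the simplest multivariate case (a chi-square test of independence between organic status and Nutri-Score). The data is made up, and the chi-square test is an assumption: it is one option among several for two categorical variables, not a method prescribed by the assignment.

```python
import pandas as pd
from scipy import stats

# Made-up rows, for illustration only
toy = pd.DataFrame({
    "est_bio": [True, True, True, False, False, False, False, True],
    "nutriscore_grade": ["a", "a", "b", "d", "e", "d", "b", "a"],
})

# Univariate: relative frequency of each Nutri-Score grade
grade_share = toy["nutriscore_grade"].value_counts(normalize=True).sort_index()
print(grade_share)

# Bivariate: contingency table of organic status vs grade, then chi-square test
table = pd.crosstab(toy["est_bio"], toy["nutriscore_grade"])
chi2, p_value, dof, expected = stats.chi2_contingency(table)
print(table)
print(f"chi2={chi2:.2f}, dof={dof}, p={p_value:.3f}")
```

With so few rows the p-value is not meaningful; on the real data, the expected counts in `expected` are worth checking before interpreting the test.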


## 5 - Setting up the environment

In [1]:
# Import pandas
import pandas as pd

# Import matplotlib's pyplot interface
import matplotlib.pyplot as plt

# Import NumPy
import numpy as np

# Statistical tests from SciPy
from scipy import stats

## 6 - Your turn

Use the lab exercise covered in class as a starting point.

In [2]:
# The three files are tab-separated despite their .csv extension
bio = pd.read_csv("Agriculture_biologique_final.csv", sep ="\t", low_memory = False)
qnutri = pd.read_csv("Qualite nutritionnelle_final.csv", sep ="\t", low_memory = False)
ienvi = pd.read_csv("Impact environnemental_final.csv", sep ="\t", low_memory = False)
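
Barcodes such as `0000000000017` carry leading zeros, which are lost if pandas infers a numeric type for `code`. The minimal sketch below uses an in-memory tab-separated sample, not the real files, to show how the `dtype` argument of `read_csv` forces the column to be read as text. Whether the real files need this depends on their content, so treat it as a precaution rather than a required fix.

```python
import io
import pandas as pd

# Made-up tab-separated sample mimicking the layout of the Yuka files
sample = (
    "code\tproduct_name\tbrands_tags\n"
    "0000000000017\tVitória crackers\t\n"
    "0000000000100\tmoutarde au moût de raisin\tcourte-paille\n"
)

# Default type inference: the all-digit column becomes an integer column
inferred = pd.read_csv(io.StringIO(sample), sep="\t")

# Forcing str keeps the barcode exactly as written in the file
as_text = pd.read_csv(io.StringIO(sample), sep="\t", dtype={"code": str})

print(inferred["code"].tolist())
print(as_text["code"].tolist())
```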

In [3]:
qnutri

Unnamed: 0,code,product_name,quantity,brands_tags,countries_tags,serving_size,image_url,ingredients_tags,nutriscore_grade,energy-kcal_100g,fat_100g,saturated-fat_100g,sugars_100g,proteins_100g,salt_100g,nb_nocif
0,00000000000026772226,Skyr,480 g,danone,en:france,,https://images.openfoodfacts.org/images/produc...,,a,57.0,0.2,0.10,3.9,10.0,0.09,0.0
1,0000000000017,Vitória crackers,,,en:france,,https://images.openfoodfacts.org/images/produc...,,,375.0,7.0,3.08,15.0,7.8,1.40,0.0
2,0000000000031,Cacao,130 g,,en:france,,https://images.openfoodfacts.org/images/produc...,,,,,,,,,0.0
3,0000000000100,moutarde au moût de raisin,100g,courte-paille,en:france,,https://images.openfoodfacts.org/images/produc...,fr:eau-graines-de-teguments-de-moutarde-vinaig...,d,,8.2,2.20,22.0,5.1,4.60,0.0
4,0000000000123,Sauce Sweety chili 0%,,,en:france,,https://images.openfoodfacts.org/images/produc...,,,21.0,0.0,0.00,0.4,0.2,2.04,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
932861,9999991953895,Crème de Marrons,,,en:france,,https://images.openfoodfacts.org/images/produc...,,,,,,,,,0.0
932862,9999992756068,Steak haché,500 g,,en:france,,https://images.openfoodfacts.org/images/produc...,,,,,,,,,0.0
932863,9999992756112,Steak haché,1 kg,,en:france,,https://images.openfoodfacts.org/images/produc...,,,196.0,14.0,6.20,0.0,19.0,0.19,0.0
932864,999999999,Thé noir BIO Darjeeling,,pages,en:france,,,,,,,,,,,0.0


In [4]:
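# Keep only rows that have both a brand and a product name
qnutri_sansna = qnutri[qnutri['brands_tags'].notna()]
qnutri_sansna = qnutri_sansna[qnutri_sansna['product_name'].notna()]
# Remove rows that are exact duplicates across all columns
qnutri_sansna_duplicate = qnutri_sansna.drop_duplicates()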
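
The two `notna()` filters followed by `drop_duplicates()` can also be written as one chained expression with `dropna(subset=...)`. The sketch below checks on made-up rows that both forms give the same result; the helper name `clean` is hypothetical.

```python
import pandas as pd

def clean(df, required=("brands_tags", "product_name")):
    """Drop rows missing any required column, then drop exact duplicate rows (hypothetical helper)."""
    return df.dropna(subset=list(required)).drop_duplicates()

# Made-up rows, for illustration only
toy = pd.DataFrame({
    "code": ["1", "2", "3", "3", "4"],
    "product_name": ["Skyr", "Cacao", "Tarte", "Tarte", None],
    "brands_tags": ["danone", None, "crous", "crous", "lindt"],
})

# Step-by-step form, as in the notebook cell
step_by_step = toy[toy["brands_tags"].notna()]
step_by_step = step_by_step[step_by_step["product_name"].notna()].drop_duplicates()

# Chained form
chained = clean(toy)
print(chained)
```

Reusing one helper for the nutrition and environment tables would keep the cleaning rules identical across files.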

In [5]:
qnutri_sansna.isna().sum()

code                       0
product_name               0
quantity              218560
brands_tags                0
countries_tags             0
serving_size          371014
image_url              44793
ingredients_tags      211980
nutriscore_grade      215168
energy-kcal_100g      101616
fat_100g               81497
saturated-fat_100g     81139
sugars_100g            80248
proteins_100g          79666
salt_100g              87962
nb_nocif                   0
dtype: int64

In [6]:
qnutri_sansna.duplicated().sum()

31967

In [7]:
qnutri_sansna_duplicate

Unnamed: 0,code,product_name,quantity,brands_tags,countries_tags,serving_size,image_url,ingredients_tags,nutriscore_grade,energy-kcal_100g,fat_100g,saturated-fat_100g,sugars_100g,proteins_100g,salt_100g,nb_nocif
0,00000000000026772226,Skyr,480 g,danone,en:france,,https://images.openfoodfacts.org/images/produc...,,a,57.0,0.2,0.10,3.9,10.0,0.090,0.0
3,0000000000100,moutarde au moût de raisin,100g,courte-paille,en:france,,https://images.openfoodfacts.org/images/produc...,fr:eau-graines-de-teguments-de-moutarde-vinaig...,d,,8.2,2.20,22.0,5.1,4.600,0.0
15,0000000001199,Solène céréales poulet,,crous,en:france,,https://images.openfoodfacts.org/images/produc...,"en:antioxidant,en:colour,en:tomato,en:vegetabl...",,219.0,5.9,0.50,1.7,9.7,0.464,0.0
16,0000000001281,Tarte noix de coco,,"crous-resto,crous",en:france,,https://images.openfoodfacts.org/images/produc...,,d,381.0,22.0,15.50,21.9,4.6,0.100,0.0
20,0000000001663,Crème dessert chocolat,,ferme-de-la-fremondiere,en:france,,https://images.openfoodfacts.org/images/produc...,"en:whole-milk,en:dairy,en:milk,en:sugar,en:add...",,0.0,0.0,0.00,0.0,0.0,0.000,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
932782,99665555,La parisienne à poêler,,bonduelle,en:france,,https://images.openfoodfacts.org/images/produc...,,,52.0,0.6,0.10,2.1,1.7,0.470,0.0
932784,997046,Chocolat noir patissier,,monoprix-bio,en:france,,https://images.openfoodfacts.org/images/produc...,,,578.0,39.0,24.00,46.0,5.1,0.000,0.0
932792,998042,Saveur ABRICIT,,gerble,en:france,,https://images.openfoodfacts.org/images/produc...,,,45.0,1.8,0.01,1.5,9.7,0.130,0.0
932803,9990000,Lindt pâte à tartiner,,lindt,en:france,,https://images.openfoodfacts.org/images/produc...,,,,,,,,,0.0


In [8]:
bio.isna().sum()

product_name      7359
brands_tags          0
serving_size    377744
est_bio              0
dtype: int64

In [9]:
bio_sansna = bio[bio['product_name'].notna()]

In [10]:
bio_sansna.isna().sum()

product_name         0
brands_tags          0
serving_size    371014
est_bio              0
dtype: int64

In [11]:
bio_sansna.duplicated().sum()

59913

In [12]:
bio_sansna_duplicate = bio_sansna.drop_duplicates()

In [13]:
bio_sansna_duplicate

Unnamed: 0,product_name,brands_tags,serving_size,est_bio
0,Skyr,danone,,False
1,moutarde au moût de raisin,courte-paille,,False
2,Solène céréales poulet,crous,,False
3,Tarte noix de coco,"crous-resto,crous",,False
4,Crème dessert chocolat,ferme-de-la-fremondiere,,False
...,...,...,...,...
463912,Nutra'cake framboise,delical,,False
463915,Chocolat noir patissier,monoprix-bio,,True
463917,Saveur ABRICIT,gerble,,False
463918,Lindt pâte à tartiner,lindt,,False


In [14]:
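# One row per (product_name, brands_tags) pair; first() keeps, for each column, the first non-null value in the group
bio_sansna_duplicate_gb = bio_sansna_duplicate.groupby(by=["product_name", "brands_tags"]).first().reset_index()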
bio_sansna_duplicate_gb

Unnamed: 0,product_name,brands_tags,serving_size,est_bio
0,,m-s,,False
1,18 marrons glacés,motta,,False
2,6 Crêpes de sarrasin,les-delices-de-landeleau,,False
3,Biscuit Tablette Chocolat au Lait bio,"u-bio,u",25 g,True
4,Boletus,coop,10 g,False
...,...,...,...,...
388821,자연은 튼튼 (Jayeon-eun Teunteun),"자연은,jayeon-eun",serving,False
388822,🍇 Raisins sultanines,rapunzel,,True
388823,🍚Riz au lait🥛,la-fermiere,160g,False
388824,🐰 Lait du pays Alpin,milka,100g,False
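
One detail of `groupby(...).first()` matters here: for each group it returns the first non-null value of every column, so the resulting row can combine values taken from several original rows. `drop_duplicates(subset=..., keep="first")` instead keeps one original row intact. The toy rows below are made up to show the difference.

```python
import pandas as pd

# Made-up rows: two "Skyr / danone" entries, only the second has a serving size
toy = pd.DataFrame({
    "product_name": ["Skyr", "Skyr", "Skyr"],
    "brands_tags": ["danone", "danone", "yoplait"],
    "serving_size": [None, "140g", None],
    "est_bio": [False, False, False],
})
keys = ["product_name", "brands_tags"]

# first() skips nulls column by column within each group
via_groupby = toy.groupby(by=keys).first().reset_index()

# drop_duplicates keeps the first row of each key as it is, nulls included
via_dedup = toy.drop_duplicates(subset=keys, keep="first").reset_index(drop=True)

print(via_groupby)
print(via_dedup)
```

Which behavior is preferable depends on the question: filling gaps is convenient for `serving_size`, but for `est_bio` it means a product name shared by an organic and a non-organic variant of the same brand collapses to a single value.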


In [15]:
bio_sansna_duplicate_gb.duplicated().sum()

0

In [16]:
bio_sansna_duplicate[bio_sansna_duplicate["product_name"]== "Skyr"]

Unnamed: 0,product_name,brands_tags,serving_size,est_bio
0,Skyr,danone,,False
5361,Skyr,liberte-canada,serving,False
15774,Skyr,danone,140g,False
22749,Skyr,siggi-s,140g,False
51822,Skyr,danone,1 pot (140g),False
52248,Skyr,danone,5 cuillères à café,False
151598,Skyr,yoplait,,False
185983,Skyr,logismose,,False
229277,Skyr,"les-2-vaches,danone",120g,True
268429,Skyr,puffy-s,,True


In [17]:
ienvi

Unnamed: 0,code,product_name,quantity,brands_tags,countries_tags,serving_size,image_url,ecoscore_grade,est_plastique,est_palm,est_cocoa
0,00000000000026772226,Skyr,480 g,danone,en:france,,https://images.openfoodfacts.org/images/produc...,d,False,False,False
1,0000000000017,Vitória crackers,,,en:france,,https://images.openfoodfacts.org/images/produc...,unknown,False,False,False
2,0000000000031,Cacao,130 g,,en:france,,https://images.openfoodfacts.org/images/produc...,unknown,False,False,False
3,0000000000100,moutarde au moût de raisin,100g,courte-paille,en:france,,https://images.openfoodfacts.org/images/produc...,c,False,False,False
4,0000000000123,Sauce Sweety chili 0%,,,en:france,,https://images.openfoodfacts.org/images/produc...,unknown,False,False,False
...,...,...,...,...,...,...,...,...,...,...,...
932861,9999991953895,Crème de Marrons,,,en:france,,https://images.openfoodfacts.org/images/produc...,unknown,False,False,False
932862,9999992756068,Steak haché,500 g,,en:france,,https://images.openfoodfacts.org/images/produc...,e,False,False,False
932863,9999992756112,Steak haché,1 kg,,en:france,,https://images.openfoodfacts.org/images/produc...,unknown,False,False,False
932864,999999999,Thé noir BIO Darjeeling,,pages,en:france,,,unknown,False,False,False


In [18]:
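# Same cleaning as for the nutrition table: require a brand and a product name
ienvi_sansna = ienvi[ienvi['brands_tags'].notna()]
ienvi_sansna = ienvi_sansna[ienvi_sansna['product_name'].notna()]
# Remove rows that are exact duplicates across all columns
ienvi_sansna_duplicate = ienvi_sansna.drop_duplicates()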

In [19]:
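# Merge on the barcode (`code`). The other columns shared by both tables (product_name, brands_tags, ...)
# get a _DROP suffix on the right-hand side and are then filtered out.
df_total = qnutri_sansna_duplicate.merge(ienvi_sansna_duplicate,on=['code'], how="inner", suffixes=('', '_DROP')).filter(regex='^(?!.*_DROP)')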
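
Before relying on the inner join on `code`, it can help to see how many barcodes appear in only one of the two tables. The sketch below uses made-up frames and the `indicator` and `validate` options of `DataFrame.merge`. `validate="one_to_one"` raises a `MergeError` if a barcode is repeated on either side; that uniqueness is an assumption about the data that may not hold for the real files.

```python
import pandas as pd

# Made-up frames standing in for the cleaned nutrition and environment tables
nutri = pd.DataFrame({"code": ["001", "002", "003"],
                      "nutriscore_grade": ["a", "d", None]})
envi = pd.DataFrame({"code": ["001", "003", "004"],
                     "ecoscore_grade": ["d", "unknown", "b"]})

# Outer join with an indicator column: shows where each barcode was found
coverage = nutri.merge(envi, on="code", how="outer",
                       indicator=True, validate="one_to_one")
print(coverage["_merge"].value_counts())

# Rows present in both tables are exactly what an inner join would keep
matched = coverage[coverage["_merge"] == "both"].drop(columns="_merge")
print(matched)
```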


In [20]:
df_total

Unnamed: 0,code,product_name,quantity,brands_tags,countries_tags,serving_size,image_url,ingredients_tags,nutriscore_grade,energy-kcal_100g,fat_100g,saturated-fat_100g,sugars_100g,proteins_100g,salt_100g,nb_nocif,ecoscore_grade,est_plastique,est_palm,est_cocoa
0,00000000000026772226,Skyr,480 g,danone,en:france,,https://images.openfoodfacts.org/images/produc...,,a,57.0,0.2,0.10,3.9,10.0,0.090,0.0,d,False,False,False
1,0000000000100,moutarde au moût de raisin,100g,courte-paille,en:france,,https://images.openfoodfacts.org/images/produc...,fr:eau-graines-de-teguments-de-moutarde-vinaig...,d,,8.2,2.20,22.0,5.1,4.600,0.0,c,False,False,False
2,0000000001199,Solène céréales poulet,,crous,en:france,,https://images.openfoodfacts.org/images/produc...,"en:antioxidant,en:colour,en:tomato,en:vegetabl...",,219.0,5.9,0.50,1.7,9.7,0.464,0.0,unknown,False,False,False
3,0000000001281,Tarte noix de coco,,"crous-resto,crous",en:france,,https://images.openfoodfacts.org/images/produc...,,d,381.0,22.0,15.50,21.9,4.6,0.100,0.0,unknown,True,False,False
4,0000000001663,Crème dessert chocolat,,ferme-de-la-fremondiere,en:france,,https://images.openfoodfacts.org/images/produc...,"en:whole-milk,en:dairy,en:milk,en:sugar,en:add...",,0.0,0.0,0.00,0.0,0.0,0.000,0.0,unknown,False,False,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
424597,99665555,La parisienne à poêler,,bonduelle,en:france,,https://images.openfoodfacts.org/images/produc...,,,52.0,0.6,0.10,2.1,1.7,0.470,0.0,unknown,False,False,False
424598,997046,Chocolat noir patissier,,monoprix-bio,en:france,,https://images.openfoodfacts.org/images/produc...,,,578.0,39.0,24.00,46.0,5.1,0.000,0.0,unknown,False,False,False
424599,998042,Saveur ABRICIT,,gerble,en:france,,https://images.openfoodfacts.org/images/produc...,,,45.0,1.8,0.01,1.5,9.7,0.130,0.0,unknown,False,False,False
424600,9990000,Lindt pâte à tartiner,,lindt,en:france,,https://images.openfoodfacts.org/images/produc...,,,,,,,,,0.0,unknown,False,False,False


In [21]:
df_total.duplicated().sum()

4

In [22]:
df_total = df_total.drop_duplicates()

In [23]:
# Add est_bio by joining on (product_name, brands_tags): the organic table has no barcode column
df_total = df_total.merge(bio_sansna_duplicate_gb, on=["product_name","brands_tags"], how="inner", suffixes=('', '_DROP')).filter(regex='^(?!.*_DROP)').drop_duplicates()
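
This second merge joins on product name and brand rather than on barcode, so several barcodes can match the same row of the organic table. A hedged sketch on made-up data: `validate="many_to_one"` asserts that the right-hand table has one row per key, and comparing row counts before and after shows whether the inner join dropped any products.

```python
import pandas as pd

# Made-up frames: two barcodes share the same name and brand
products = pd.DataFrame({
    "code": ["001", "002", "003"],
    "product_name": ["Skyr", "Skyr", "Cacao"],
    "brands_tags": ["danone", "danone", "lindt"],
})
organic = pd.DataFrame({
    "product_name": ["Skyr"],
    "brands_tags": ["danone"],
    "est_bio": [False],
})

# many_to_one: raises MergeError if the organic table repeats a key
merged = products.merge(organic, on=["product_name", "brands_tags"],
                        how="inner", validate="many_to_one")

# Products without a match in the organic table are silently dropped by the inner join
n_lost = len(products) - len(merged)
print(merged)
print(f"{n_lost} product(s) had no match in the organic table")
```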


In [24]:
df_total

Unnamed: 0,code,product_name,quantity,brands_tags,countries_tags,serving_size,image_url,ingredients_tags,nutriscore_grade,energy-kcal_100g,...,saturated-fat_100g,sugars_100g,proteins_100g,salt_100g,nb_nocif,ecoscore_grade,est_plastique,est_palm,est_cocoa,est_bio
0,00000000000026772226,Skyr,480 g,danone,en:france,,https://images.openfoodfacts.org/images/produc...,,a,57.0,...,0.10,3.9,10.0,0.09,0.0,d,False,False,False,False
1,03414569,Skyr,,danone,en:france,140g,https://images.openfoodfacts.org/images/produc...,"en:skimmed-milk,en:dairy,en:milk,en:lactic-fer...",a,57.0,...,0.10,3.9,10.0,0.09,0.0,b,False,False,False,False
2,04319111,Skyr,825 g,danone,en:france,,https://images.openfoodfacts.org/images/produc...,"en:skimmed-milk,en:dairy,en:milk,fr:ferments-l...",,,...,,3.9,10.0,0.09,0.0,unknown,True,False,False,False
3,1033097270864,Skyr,4,danone,en:france,,,,,,...,,,,,0.0,d,False,False,False,False
4,3033491270864,Skyr,2 x 140 g,danone,"en:france,en:switzerland",1 pot (140g),https://images.openfoodfacts.org/images/produc...,"en:skimmed-milk,en:dairy,en:milk,en:lactic-fer...",a,57.0,...,0.10,3.9,10.0,0.10,0.0,b,True,False,False,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
424593,996299394093,Nutra'cake framboise,,delical,en:france,,https://images.openfoodfacts.org/images/produc...,,,381.0,...,6.70,29.0,15.6,0.80,0.0,unknown,False,False,False,False
424594,997046,Chocolat noir patissier,,monoprix-bio,en:france,,https://images.openfoodfacts.org/images/produc...,,,578.0,...,24.00,46.0,5.1,0.00,0.0,unknown,False,False,False,True
424595,998042,Saveur ABRICIT,,gerble,en:france,,https://images.openfoodfacts.org/images/produc...,,,45.0,...,0.01,1.5,9.7,0.13,0.0,unknown,False,False,False,False
424596,9990000,Lindt pâte à tartiner,,lindt,en:france,,https://images.openfoodfacts.org/images/produc...,,,,...,,,,,0.0,unknown,False,False,False,False


In [25]:
df_total.duplicated().sum()

0