<span style="font-size:3em;"> Analysis of the global plant-based meat market </span> 

This analysis of the global plant-based meat market considers three axes for each country: water shortages, meat consumption relative to production, and the percentage of vegetarians. These axes are combined into a coefficient that predicts which countries are best suited to the market.

This notebook covers the cleaning and preparation of the water-shortage data.
The resulting CSV file, along with the other parts of the analysis, was used to build a Tableau dashboard, available here:
https://public.tableau.com/app/profile/rossana.coro/viz/VegetalMeatProject/Accueil
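
The exact formula behind the coefficient is not given in this notebook. As a rough illustration only, one way to combine three axes into a single score is to rescale each to the [0, 1] range and average them. Everything in the sketch below is an assumption: the toy values, the column names, the equal weights, and the idea that a higher value on each axis means a better fit.

```python
import pandas as pd

# Hypothetical toy values; none of these numbers come from the analysis.
toy = pd.DataFrame(
    {
        "Location_code": ["AAA", "BBB", "CCC"],
        "water_stress": [0.9, 0.5, 0.1],               # higher = more stress
        "consumption_to_production": [1.4, 1.0, 0.6],  # consumption / production
        "vegetarian_share": [0.05, 0.10, 0.20],        # fraction of vegetarians
    }
).set_index("Location_code")


def min_max(series: pd.Series) -> pd.Series:
    """Rescale a column linearly so its minimum is 0 and its maximum is 1."""
    return (series - series.min()) / (series.max() - series.min())


# Assumption: equal weights, and higher is better on every axis.
normalized = toy.apply(min_max)
toy["coefficient"] = normalized.mean(axis=1)

# Countries ordered from most to least suited under these assumptions.
ranking = toy["coefficient"].sort_values(ascending=False)
print(ranking)
```

Different weights or a different direction for any axis would change the ranking, so this is a template for the idea rather than the method used for the dashboard.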

In [94]:
import pandas as pd
import plotly.io as pio
import plotly.express as px
import plotly.graph_objects as go
import numpy as np
import seaborn as sns

In [95]:
# Load the Excel file with the water-stress data per country
dataset_brut = pd.read_excel("/Users/losed/OneDrive/Bureau/project_demo_day/stress_water_per_country.xlsx")
dataset_brut.head()

Unnamed: 0,gid_0,name_0,indicator_name,weight,score,score_ranked,cat,label,un_region,wb_region
0,ARE,United Arab Emirates,bws,Dom,5.0,5.0,4.0,Extremely High (>80%),Asia,Middle East & North Africa
1,ARE,United Arab Emirates,bws,Irr,5.0,5.0,4.0,Extremely High (>80%),Asia,Middle East & North Africa
2,BHR,Bahrain,bws,One,5.0,4.5,4.0,Extremely High (>80%),Asia,Middle East & North Africa
3,BHR,Bahrain,bws,Tot,5.0,4.5,4.0,Extremely High (>80%),Asia,Middle East & North Africa
4,BHR,Bahrain,bws,Dom,5.0,5.0,4.0,Extremely High (>80%),Asia,Middle East & North Africa


In [96]:
dataset = dataset_brut.drop(columns=["indicator_name","weight","score","score_ranked","cat"])
dataset.head()

Unnamed: 0,gid_0,name_0,label,un_region,wb_region
0,ARE,United Arab Emirates,Extremely High (>80%),Asia,Middle East & North Africa
1,ARE,United Arab Emirates,Extremely High (>80%),Asia,Middle East & North Africa
2,BHR,Bahrain,Extremely High (>80%),Asia,Middle East & North Africa
3,BHR,Bahrain,Extremely High (>80%),Asia,Middle East & North Africa
4,BHR,Bahrain,Extremely High (>80%),Asia,Middle East & North Africa


In [97]:
dataset = dataset.rename(columns={"gid_0": "Location_code", "name_0": "Country",  "label": "Water_lack_risk_level", "wb_region": "World_region","un_region": "Continent"})
dataset.head()

Unnamed: 0,Location_code,Country,Water_lack_risk_level,Continent,World_region
0,ARE,United Arab Emirates,Extremely High (>80%),Asia,Middle East & North Africa
1,ARE,United Arab Emirates,Extremely High (>80%),Asia,Middle East & North Africa
2,BHR,Bahrain,Extremely High (>80%),Asia,Middle East & North Africa
3,BHR,Bahrain,Extremely High (>80%),Asia,Middle East & North Africa
4,BHR,Bahrain,Extremely High (>80%),Asia,Middle East & North Africa
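
After the `weight` column is dropped, each country still appears on several rows, as the preview above shows. The sketch below counts the rows per country and collapses them to a single label per country. It uses hypothetical toy rows (the `NoData` row for `BHR` is invented for illustration), and keeping the most frequent label is an assumption, not necessarily the rule used for the dashboard.

```python
import pandas as pd

# Hypothetical toy rows mimicking the cleaned frame (several rows per country).
toy = pd.DataFrame(
    {
        "Location_code": ["ARE", "ARE", "BHR", "BHR", "BHR"],
        "Water_lack_risk_level": [
            "Extremely High (>80%)",
            "Extremely High (>80%)",
            "Extremely High (>80%)",
            "NoData",  # invented row, for illustration only
            "Extremely High (>80%)",
        ],
    }
)

# How many rows does each country contribute?
rows_per_country = toy.groupby("Location_code").size()

# Assumption: keep the most frequent label for each country.
most_common = toy.groupby("Location_code")["Water_lack_risk_level"].agg(
    lambda labels: labels.mode().iloc[0]
)

print(rows_per_country)
print(most_common)
```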


In [98]:
country_water = dataset['Location_code'].unique()
print(sorted(country_water))

['AFG', 'AGO', 'ALB', 'AND', 'ARE', 'ARG', 'ARM', 'ATG', 'AUS', 'AUT', 'AZE', 'BDI', 'BEL', 'BEN', 'BFA', 'BGD', 'BGR', 'BHR', 'BHS', 'BIH', 'BLR', 'BLZ', 'BOL', 'BRA', 'BRB', 'BRN', 'BTN', 'BWA', 'CAF', 'CAN', 'CHE', 'CHL', 'CHN', 'CIV', 'CMR', 'COD', 'COG', 'COL', 'COM', 'CPV', 'CRI', 'CUB', 'CYP', 'CZE', 'DEU', 'DJI', 'DMA', 'DNK', 'DOM', 'DZA', 'ECU', 'EGY', 'ERI', 'ESP', 'EST', 'ETH', 'FIN', 'FJI', 'FRA', 'FSM', 'GAB', 'GBR', 'GEO', 'GHA', 'GIN', 'GMB', 'GNB', 'GNQ', 'GRC', 'GRD', 'GTM', 'GUY', 'HND', 'HRV', 'HTI', 'HUN', 'IDN', 'IND', 'IRL', 'IRN', 'IRQ', 'ISL', 'ISR', 'ITA', 'JAM', 'JOR', 'JPN', 'KAZ', 'KEN', 'KGZ', 'KHM', 'KIR', 'KNA', 'KOR', 'KWT', 'LAO', 'LBN', 'LBR', 'LBY', 'LCA', 'LIE', 'LKA', 'LSO', 'LTU', 'LUX', 'LVA', 'MAR', 'MCO', 'MDA', 'MDG', 'MEX', 'MHL', 'MKD', 'MLI', 'MLT', 'MMR', 'MNE', 'MNG', 'MOZ', 'MRT', 'MUS', 'MWI', 'MYS', 'NAM', 'NER', 'NGA', 'NIC', 'NLD', 'NOR', 'NPL', 'NRU', 'NZL', 'OMN', 'PAK', 'PAN', 'PER', 'PHL', 'PLW', 'PNG', 'POL', 'PRK', 'PRT', 'PRY'

['AFG', 'AGO', 'ALB', 'AND', 'ARE', 'ARM', 'ATG', 'AUT', 'AZE', 'BDI', 'BEL', 'BEN', 'BFA', 'BGD', 'BGR', 'BHR', 'BHS', 'BIH', 'BLR', 'BLZ', 'BOL',  'BRB', 'BRN', 'BTN', 'BWA', 'CAF', 'CIV', 'CMR', 'COD', 'COG', 'COM', 'CPV', 'CRI', 'CUB', 'CYP', 'CZE', 'DEU', 'DJI', 'DMA', 'DNK', 'DOM', 'DZA', 'ECU', 'ERI', 'ESP', 'EST', 'FIN', 'FJI', 'FRA', 'FSM', 'GAB', 'GEO', 'GHA', 'GIN', 'GMB', 'GNB', 'GNQ', 'GRC', 'GRD', 'GTM', 'GUY', 'HND', 'HRV', 'HTI', 'HUN', 'KEN', 'KGZ', 'KHM', 'KIR', 'KNA', 'KWT', 'LAO', 'LBN', 'LBR', 'LBY', 'LCA', 'LIE', 'LKA', 'LSO', 'LTU', 'LUX', 'LVA', 'MAR', 'MCO', 'MDA', 'MDG', 'NAM', 'NER', 'NGA', 'NIC', 'NLD', 'NPL', 'NRU', 'OMN','PAN','PLW', 'PNG', 'POL', 'PRK', 'PRT', 'QAT', 'ROU', 'RWA','SDN', 'SEN', 'SGP', 'SLB', 'SLE', 'SLV', 'SMR', 'SOM', 'SRB', 'SSD', 'STP', 'SUR', 'SVK', 'SVN', 'SWE', 'SWZ', 'SYC', 'SYR', 'TCD', 'TGO', 'TJK', 'TKM', 'TLS', 'TON', 'TTO', 'TUN',  'TUV', 'TZA', 'UGA', 'URY', 'UZB', 'VCT', 'VEN', 'VUT', 'WSM', 'YEM','ZMB', 'ZWE']

In [99]:
country_list = ['AFR', 'ARG', 'ASP', 'AUS', 'BRA', 'BRICS', 'CAN', 'CHE', 'CHL', 'CHN', 'COL', 'DVD', 'DVG', 'EGY', 'ETH', 'EUN', 'EUR', 'GBR', 'IDN', 'IND', 'IRN', 'ISR', 'JPN', 'KAZ', 'KOR', 'LAC', 'MEX', 'MYS', 'NGA', 'NOA', 'NOR', 'NZL', 'OCD', 'OECD', 'PAK', 'PER', 'PHL', 'PRY', 'RUS', 'SAU', 'THA', 'TUR', 'UKR', 'USA', 'VNM', 'WLD', 'ZAF']

dataset = dataset[dataset['Location_code'].isin(country_list)]
dataset.shape

(455, 5)

In [110]:
dataset['Location_code'].unique()


array(['SAU', 'ISR', 'EGY', 'IRN', 'ZAF', 'CHL', 'MEX', 'VNM', 'UKR',
       'IDN', 'PER', 'IND', 'PAK', 'USA', 'THA', 'TUR', 'RUS', 'GBR',
       'PHL', 'NGA', 'CHN', 'ETH', 'PRY', 'ARG', 'AUS', 'NOR', 'BRA',
       'KAZ', 'MYS', 'KOR', 'NZL', 'JPN', 'CAN', 'CHE', 'COL'],
      dtype=object)

In [113]:
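# np.unique returns the distinct codes already in sorted order
location = np.unique(dataset['Location_code'])

# Redundant after np.unique, kept to make the ordering explicit
sorted_location = np.sort(location)

print(sorted_location)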

['ARG' 'AUS' 'BRA' 'CAN' 'CHE' 'CHL' 'CHN' 'COL' 'EGY' 'ETH' 'GBR' 'IDN'
 'IND' 'IRN' 'ISR' 'JPN' 'KAZ' 'KOR' 'MEX' 'MYS' 'NGA' 'NOR' 'NZL' 'PAK'
 'PER' 'PHL' 'PRY' 'RUS' 'SAU' 'THA' 'TUR' 'UKR' 'USA' 'VNM' 'ZAF']
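
The filter keeps 35 of the 47 codes in `country_list`. A set difference shows which requested codes have no match in the water-stress data. The two lists below are copied from the cells above; the variable names `requested`, `found`, and `missing` are introduced here for illustration.

```python
# Codes requested in country_list (copied from the cell above).
requested = ['AFR', 'ARG', 'ASP', 'AUS', 'BRA', 'BRICS', 'CAN', 'CHE', 'CHL',
             'CHN', 'COL', 'DVD', 'DVG', 'EGY', 'ETH', 'EUN', 'EUR', 'GBR',
             'IDN', 'IND', 'IRN', 'ISR', 'JPN', 'KAZ', 'KOR', 'LAC', 'MEX',
             'MYS', 'NGA', 'NOA', 'NOR', 'NZL', 'OCD', 'OECD', 'PAK', 'PER',
             'PHL', 'PRY', 'RUS', 'SAU', 'THA', 'TUR', 'UKR', 'USA', 'VNM',
             'WLD', 'ZAF']

# Codes that remain after filtering (copied from the printed output above).
found = ['ARG', 'AUS', 'BRA', 'CAN', 'CHE', 'CHL', 'CHN', 'COL', 'EGY', 'ETH',
         'GBR', 'IDN', 'IND', 'IRN', 'ISR', 'JPN', 'KAZ', 'KOR', 'MEX', 'MYS',
         'NGA', 'NOR', 'NZL', 'PAK', 'PER', 'PHL', 'PRY', 'RUS', 'SAU', 'THA',
         'TUR', 'UKR', 'USA', 'VNM', 'ZAF']

# Requested codes with no matching row in the water-stress data.
missing = sorted(set(requested) - set(found))
print(missing)
# ['AFR', 'ASP', 'BRICS', 'DVD', 'DVG', 'EUN', 'EUR', 'LAC', 'NOA', 'OCD', 'OECD', 'WLD']
```

The unmatched codes appear to be regional or group aggregates (for example `WLD` and `OECD`) rather than single countries, which would explain why they have no row in a per-country dataset.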


In [114]:
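# Without parentheses, `dataset.head` returns the bound method, whose repr
# prints the whole frame; call `dataset.head()` to show only the first rows.
dataset.head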

<bound method NDFrame.head of      Location_code       Country  Water_lack_risk_level Continent  \
42             SAU  Saudi Arabia  Extremely High (>80%)      Asia   
43             SAU  Saudi Arabia  Extremely High (>80%)      Asia   
44             SAU  Saudi Arabia  Extremely High (>80%)      Asia   
45             ISR        Israel  Extremely High (>80%)      Asia   
46             SAU  Saudi Arabia  Extremely High (>80%)      Asia   
...            ...           ...                    ...       ...   
2236           SAU  Saudi Arabia                 NoData      Asia   
2237           SAU  Saudi Arabia                 NoData      Asia   
2238           SAU  Saudi Arabia                 NoData      Asia   
2239           SAU  Saudi Arabia                 NoData      Asia   
2240           SAU  Saudi Arabia                 NoData      Asia   

                    World_region  
42    Middle East & North Africa  
43    Middle East & North Africa  
44    Middle East & North Africa  
4