# Effects of climate change on agriculture

We'll be visualizing trends in food production across different countries since 1960s to 2020 to assess the impact of climate change on different food crops.

In [48]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

## Introduction
(will insert some introduction here)



## Step 1: Downloading Data
We have acquired our dataset from AgroSphere (a project under NASA), let us start visualizing the data on emmissions.
We have 4 files (CSV) in the /data folder, as described:
+ countries.csv - Contains unique country codes which are used in emmissions and food production data
+ emissionAll.csv - Contains data from emissions from various sources (only agriculture.
+ FAOcrops.csv - Contains data from food productions (weight of crop in tonnes) of different countries.
+ yield.csv - Contains data from food productions (area of cultivated land in Ha) of different countries.

In [71]:
countries = pd.DataFrame(pd.read_csv('data/countries.csv'))
emissionall = pd.DataFrame(pd.read_csv('data/emissionAll.csv'))
cropsyield = pd.DataFrame(pd.read_csv('data/FAOcrops.csv'))
areayield = pd.DataFrame(pd.read_csv('data/yield.csv'))

## Step 2: Cleaning and selecting the data
The data is very big and unclean. It contains many information that we don't need so we'll be selecting only what we need and dropping the rest.

In [26]:
countries = countries.drop(['code2','Numeric code','lat','lon','iconCode'], axis=1)
countries.head(10)

Unnamed: 0,country,code3
0,Albania,ALB
1,Algeria,DZA
2,American Samoa,ASM
3,Andorra,AND
4,Angola,AGO
5,Anguilla,AIA
6,Antarctica,ATA
7,Antigua and Barbuda,ATG
8,Argentina,ARG
9,Armenia,ARM


So, in the previous cell we removed all label that we didn't need and now we have only two columns - the country names and the code with us. This, looks good!

In [27]:
countryincludes = ['AFG','AUS','AUT','BGD','BRA','CHL','CHN','COG','FRA','GHA','IND','IDN','IRQ','JPN','JOR','MEX','NZL','KOR','REU','GBR','USA','VNM']
len(countryincludes)

22

This is the list of countries only which we would be focusing the project on - 22 for now. Let us find the country in our data.

In [28]:
countries[countries.code3.isin(countryincludes)]

Unnamed: 0,country,code3
11,Australia,AUS
12,Austria,AUT
16,Bangladesh,BGD
28,Brazil,BRA
41,Chile,CHL
42,China,CHN
47,Congo,COG
71,France,FRA
79,Ghana,GHA
98,India,IND


Now let us try our hands on the areayield dataset.
Let us first know the crops (unique values) and then shall we select what we need to use in future.

In [34]:
cropsyield.typeName.unique()

array(['Almonds, with shell Production in tonnes',
       'Anise, badian, fennel, coriander Production in tonnes',
       'Apples Production in tonnes', 'Apricots Production in tonnes',
       'Barley Production in tonnes', 'Berries nes Production in tonnes',
       'Cotton lint Production in tonnes',
       'Cottonseed Production in tonnes', 'Figs Production in tonnes',
       'Fruit, citrus nes Production in tonnes',
       'Fruit, fresh nes Production in tonnes',
       'Fruit, stone nes Production in tonnes',
       'Grapes Production in tonnes', 'Linseed Production in tonnes',
       'Maize Production in tonnes',
       'Melons, other (inc.cantaloupes) Production in tonnes',
       'Millet Production in tonnes', 'Nuts, nes Production in tonnes',
       'Olives Production in tonnes', 'Oranges Production in tonnes',
       'Peaches and nectarines Production in tonnes',
       'Pears Production in tonnes', 'Pistachios Production in tonnes',
       'Plums and sloes Production in tonne

Now that we'll select only 10 crops to move forward with (This will make our work easier). If we want more we can change them later.

In [30]:
cropsincludes = ['Apples Production in tonnes',
                'Maize Production in tonnes',
                'Fruit, fresh nes Production in tonnes',
                'Potatoes Production in tonnes',
                'Rice, paddy Production in tonnes',
               'Vegetables Primary Production in tonnes',
               'Pulses, Total Production in tonnes',
               'Poppy seed Production in tonnes',
               'Sugar beet Production in tonnes',
               'Coffee, green Production in tonnes']
print("Total number of crops = ",len(cropsincludes))
cropsincludes

Total number of crops =  10


['Apples Production in tonnes',
 'Maize Production in tonnes',
 'Fruit, fresh nes Production in tonnes',
 'Potatoes Production in tonnes',
 'Rice, paddy Production in tonnes',
 'Vegetables Primary Production in tonnes',
 'Pulses, Total Production in tonnes',
 'Poppy seed Production in tonnes',
 'Sugar beet Production in tonnes',
 'Coffee, green Production in tonnes']

In [66]:
selectcrops = cropsyield[cropsyield.typeName.isin(cropsincludes)].copy()
selectcrops.index = range(len(selectcrops))
selectcrops = selectarea[selectarea.code3.isin(countryincludes)]
## We also need to eliminate data from countries we don't need!
selectcrops

Unnamed: 0,code3,typeName,1961,1962,1963,1964,1965,1966,1967,1968,...,2005,2006,2007,2008,2009,2010,2011,2012,2013,2014
0,AFG,Apples yield in hg/ha,68018.0,68018.0,68018.0,78298.0,82258.0,83212.0,90196.0,93311.0,...,72672.0,75000.0,85000.0,81919.0,85105.0,70000.0,70000.0,76519.0,76005.0,73000.0
1,AFG,"Fruit, fresh nes yield in hg/ha",51153.0,50109.0,52129.0,60544.0,63984.0,66607.0,71047.0,75299.0,...,99235.0,103347.0,107886.0,120000.0,150000.0,120000.0,70000.0,73690.0,74902.0,75000.0
2,AFG,Maize yield in hg/ha,14000.0,14000.0,14260.0,14257.0,14400.0,14400.0,14144.0,17064.0,...,12069.0,26204.0,26277.0,26277.0,21429.0,16448.0,16400.0,21986.0,21972.0,24882.0
3,AFG,Potatoes yield in hg/ha,86667.0,76667.0,81333.0,86000.0,88000.0,90667.0,98000.0,100000.0,...,150000.0,150000.0,150400.0,140000.0,140000.0,120000.0,100000.0,109524.0,131960.0,136054.0
4,AFG,"Rice, paddy yield in hg/ha",15190.0,15190.0,15190.0,17273.0,17273.0,15180.0,19223.0,19515.0,...,30313.0,33750.0,32471.0,32211.0,32250.0,32308.0,32000.0,24390.0,24980.0,24409.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1103,VNM,"Fruit, fresh nes yield in hg/ha",86000.0,86000.0,86000.0,86000.0,88095.0,88095.0,88095.0,88095.0,...,122222.0,121739.0,117391.0,117021.0,117441.0,117445.0,117367.0,116667.0,116667.0,117521.0
1104,VNM,Maize yield in hg/ha,11230.0,11987.0,9307.0,12436.0,11503.0,9589.0,11000.0,10962.0,...,35979.0,37310.0,39259.0,31753.0,40137.0,40899.0,43128.0,43019.0,44354.0,44140.0
1105,VNM,Potatoes yield in hg/ha,133333.0,150000.0,166667.0,166667.0,137500.0,150000.0,180000.0,170000.0,...,105714.0,105714.0,103333.0,105556.0,104865.0,120448.0,137810.0,146352.0,135800.0,140954.0
1106,VNM,"Rice, paddy yield in hg/ha",18966.0,19937.0,21400.0,19441.0,19414.0,18079.0,19159.0,17095.0,...,48891.0,48943.0,49869.0,52336.0,52372.0,53416.0,55383.0,56353.0,55728.0,57538.0


**Wooho! Great so far.**

Now, we'll go forward with this new data with selected parameters. And do similiar things with areayield.

In [35]:
areayield.typeName.unique()

array(['Almonds, with shell yield in hg/ha',
       'Anise, badian, fennel, coriander yield in hg/ha',
       'Apples yield in hg/ha', 'Apricots yield in hg/ha',
       'Barley yield in hg/ha', 'Berries nes yield in hg/ha',
       'Figs yield in hg/ha', 'Fruit, citrus nes yield in hg/ha',
       'Fruit, fresh nes yield in hg/ha',
       'Fruit, stone nes yield in hg/ha', 'Grapes yield in hg/ha',
       'Linseed yield in hg/ha', 'Maize yield in hg/ha',
       'Melons, other (inc.cantaloupes) yield in hg/ha',
       'Millet yield in hg/ha', 'Nuts, nes yield in hg/ha',
       'Olives yield in hg/ha', 'Oranges yield in hg/ha',
       'Peaches and nectarines yield in hg/ha', 'Pears yield in hg/ha',
       'Pistachios yield in hg/ha', 'Plums and sloes yield in hg/ha',
       'Potatoes yield in hg/ha', 'Pulses, nes yield in hg/ha',
       'Rice, paddy yield in hg/ha', 'Seed cotton yield in hg/ha',
       'Sesame seed yield in hg/ha', 'Sugar beet yield in hg/ha',
       'Sugar cane yield in hg

In [39]:
areaincludes = ['Apples yield in hg/ha',
                'Maize yield in hg/ha',
                'Fruit, fresh nes yield in hg/ha',
                'Potatoes yield in hg/ha',
                'Rice, paddy yield in hg/ha',
               'Vegetables Primary yield in hg/ha',
               'Pulses, Total yield in hg/ha',
               'Poppy seed yield in hg/ha',
               'Sugar beet yield in hg/ha',
               'Coffee, green yield in hg/ha']
print("Total number of crops = ", len(areaincludes))
areaincludes

Total number of crops =  10


['Apples yield in hg/ha',
 'Maize yield in hg/ha',
 'Fruit, fresh nes yield in hg/ha',
 'Potatoes yield in hg/ha',
 'Rice, paddy yield in hg/ha',
 'Vegetables Primary yield in hg/ha',
 'Pulses, Total yield in hg/ha',
 'Poppy seed yield in hg/ha',
 'Sugar beet yield in hg/ha',
 'Coffee, green yield in hg/ha']

In [67]:
selectarea = areayield[areayield.typeName.isin(areaincludes)]
selectarea.index = range(len(selectarea))
selectarea = selectarea[selectarea.code3.isin(countryincludes)]
selectarea

Unnamed: 0,code3,typeName,1961,1962,1963,1964,1965,1966,1967,1968,...,2005,2006,2007,2008,2009,2010,2011,2012,2013,2014
0,AFG,Apples yield in hg/ha,68018.0,68018.0,68018.0,78298.0,82258.0,83212.0,90196.0,93311.0,...,72672.0,75000.0,85000.0,81919.0,85105.0,70000.0,70000.0,76519.0,76005.0,73000.0
1,AFG,"Fruit, fresh nes yield in hg/ha",51153.0,50109.0,52129.0,60544.0,63984.0,66607.0,71047.0,75299.0,...,99235.0,103347.0,107886.0,120000.0,150000.0,120000.0,70000.0,73690.0,74902.0,75000.0
2,AFG,Maize yield in hg/ha,14000.0,14000.0,14260.0,14257.0,14400.0,14400.0,14144.0,17064.0,...,12069.0,26204.0,26277.0,26277.0,21429.0,16448.0,16400.0,21986.0,21972.0,24882.0
3,AFG,Potatoes yield in hg/ha,86667.0,76667.0,81333.0,86000.0,88000.0,90667.0,98000.0,100000.0,...,150000.0,150000.0,150400.0,140000.0,140000.0,120000.0,100000.0,109524.0,131960.0,136054.0
4,AFG,"Rice, paddy yield in hg/ha",15190.0,15190.0,15190.0,17273.0,17273.0,15180.0,19223.0,19515.0,...,30313.0,33750.0,32471.0,32211.0,32250.0,32308.0,32000.0,24390.0,24980.0,24409.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1103,VNM,"Fruit, fresh nes yield in hg/ha",86000.0,86000.0,86000.0,86000.0,88095.0,88095.0,88095.0,88095.0,...,122222.0,121739.0,117391.0,117021.0,117441.0,117445.0,117367.0,116667.0,116667.0,117521.0
1104,VNM,Maize yield in hg/ha,11230.0,11987.0,9307.0,12436.0,11503.0,9589.0,11000.0,10962.0,...,35979.0,37310.0,39259.0,31753.0,40137.0,40899.0,43128.0,43019.0,44354.0,44140.0
1105,VNM,Potatoes yield in hg/ha,133333.0,150000.0,166667.0,166667.0,137500.0,150000.0,180000.0,170000.0,...,105714.0,105714.0,103333.0,105556.0,104865.0,120448.0,137810.0,146352.0,135800.0,140954.0
1106,VNM,"Rice, paddy yield in hg/ha",18966.0,19937.0,21400.0,19441.0,19414.0,18079.0,19159.0,17095.0,...,48891.0,48943.0,49869.0,52336.0,52372.0,53416.0,55383.0,56353.0,55728.0,57538.0


We are done with cleaning and selecting the crop production and yield for the crops (10) we need and for the countries (22) we need.

**Now let us come to cleaning emissions data.**

In [70]:
emissionall

Unnamed: 0,code3,typeName,1961,1962,1963,1964,1965,1966,1967,1968,...,2005,2006,2007,2008,2009,2010,2011,2012,2013,2014
0,AFG,Cereals excluding rice Emissions (CO2eq) gigag...,402.2165,408.3269,385.7396,406.7923,410.0940,392.3671,410.4523,415.6880,...,540.4174,513.3178,558.5772,399.9802,622.9562,569.4834,502.1896,930.8129,765.7361,754.5742
1,AFG,"Rice, paddy Emissions (CO2eq) gigagrams",665.5675,665.5675,665.5675,699.3576,699.3576,703.5647,660.1999,662.6412,...,528.8122,541.7134,564.2252,626.8125,664.4793,689.6050,703.0602,752.4373,708.0396,747.5037
2,AFG,"Meat, cattle Emissions (CO2eq) gigagrams",1576.9262,1791.9616,1806.2973,1842.1366,1813.4652,1892.3115,1820.6330,1829.9512,...,804.9492,938.9879,972.6768,1035.7538,1018.5510,1270.8592,1235.7367,891.6801,856.5577,875.2105
3,AFG,"Milk, whole fresh cow Emissions (CO2eq) gigagrams",1172.1377,1172.1377,1306.0963,1306.0963,1456.7998,1607.5032,1774.9514,1808.4411,...,4353.6544,4688.5509,5023.4474,5525.7922,5525.7922,6530.4817,6363.0334,6697.9299,6764.9092,6912.2253
4,AFG,"Meat, goat Emissions (CO2eq) gigagrams",717.8808,693.8384,607.4048,543.4226,454.2740,454.2740,428.6811,407.9935,...,1008.1470,969.5444,666.9083,879.9692,757.1233,970.1842,1116.4903,1068.7168,1026.0620,1032.4997
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1982,VNM,"Meat, buffalo Emissions (CO2eq) gigagrams",3959.0307,3987.9648,4135.1052,4197.7369,4118.5210,4239.9031,4054.4779,4059.4179,...,5099.0251,5096.3871,5228.2493,5055.8798,5034.5320,5019.3593,4728.2541,4579.7252,4461.0354,4375.0942
1983,VNM,"Milk, whole fresh buffalo Emissions (CO2eq) gi...",14.1142,14.1142,14.1142,14.1142,17.6427,17.6427,21.1713,21.1713,...,56.4568,57.0570,58.2210,56.4568,58.2210,56.4568,56.4568,56.4568,54.6925,56.5855
1984,VNM,"Meat, chicken Emissions (CO2eq) gigagrams",85.7415,94.4821,135.6879,107.8011,124.8662,109.4660,107.8011,118.6229,...,440.9317,395.5095,446.1885,504.0848,566.0580,630.9963,648.5550,635.7599,678.4771,711.6590
1985,VNM,"Eggs, hen, in shell Emissions (CO2eq) gigagrams",84.5900,86.3898,88.1896,91.1892,95.9886,98.9883,101.9879,95.9886,...,287.9659,331.1368,305.9638,311.9631,383.9545,399.5527,419.9503,425.9496,428.9492,449.9276


First let us segregate country wise.

In [78]:
countryemission = emissionall[emissionall.code3.isin(countryincludes)]
countryemission

Unnamed: 0,code3,typeName,1961,1962,1963,1964,1965,1966,1967,1968,...,2005,2006,2007,2008,2009,2010,2011,2012,2013,2014
0,AFG,Cereals excluding rice Emissions (CO2eq) gigag...,402.2165,408.3269,385.7396,406.7923,410.0940,392.3671,410.4523,415.6880,...,540.4174,513.3178,558.5772,399.9802,622.9562,569.4834,502.1896,930.8129,765.7361,754.5742
1,AFG,"Rice, paddy Emissions (CO2eq) gigagrams",665.5675,665.5675,665.5675,699.3576,699.3576,703.5647,660.1999,662.6412,...,528.8122,541.7134,564.2252,626.8125,664.4793,689.6050,703.0602,752.4373,708.0396,747.5037
2,AFG,"Meat, cattle Emissions (CO2eq) gigagrams",1576.9262,1791.9616,1806.2973,1842.1366,1813.4652,1892.3115,1820.6330,1829.9512,...,804.9492,938.9879,972.6768,1035.7538,1018.5510,1270.8592,1235.7367,891.6801,856.5577,875.2105
3,AFG,"Milk, whole fresh cow Emissions (CO2eq) gigagrams",1172.1377,1172.1377,1306.0963,1306.0963,1456.7998,1607.5032,1774.9514,1808.4411,...,4353.6544,4688.5509,5023.4474,5525.7922,5525.7922,6530.4817,6363.0334,6697.9299,6764.9092,6912.2253
4,AFG,"Meat, goat Emissions (CO2eq) gigagrams",717.8808,693.8384,607.4048,543.4226,454.2740,454.2740,428.6811,407.9935,...,1008.1470,969.5444,666.9083,879.9692,757.1233,970.1842,1116.4903,1068.7168,1026.0620,1032.4997
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1982,VNM,"Meat, buffalo Emissions (CO2eq) gigagrams",3959.0307,3987.9648,4135.1052,4197.7369,4118.5210,4239.9031,4054.4779,4059.4179,...,5099.0251,5096.3871,5228.2493,5055.8798,5034.5320,5019.3593,4728.2541,4579.7252,4461.0354,4375.0942
1983,VNM,"Milk, whole fresh buffalo Emissions (CO2eq) gi...",14.1142,14.1142,14.1142,14.1142,17.6427,17.6427,21.1713,21.1713,...,56.4568,57.0570,58.2210,56.4568,58.2210,56.4568,56.4568,56.4568,54.6925,56.5855
1984,VNM,"Meat, chicken Emissions (CO2eq) gigagrams",85.7415,94.4821,135.6879,107.8011,124.8662,109.4660,107.8011,118.6229,...,440.9317,395.5095,446.1885,504.0848,566.0580,630.9963,648.5550,635.7599,678.4771,711.6590
1985,VNM,"Eggs, hen, in shell Emissions (CO2eq) gigagrams",84.5900,86.3898,88.1896,91.1892,95.9886,98.9883,101.9879,95.9886,...,287.9659,331.1368,305.9638,311.9631,383.9545,399.5527,419.9503,425.9496,428.9492,449.9276


Fine, now let us create a new array that contains cumulative figures of emissions for every country for every year.