
This is your final assignment. You'll get a lot of freedom in doing this assignment but that also means you have to make choices and explain the reasoning behind those choices in your report.

For this assignment you can use any dataset you can find from the [Our World in Data website](https://ourworldindata.org/).

Please formulate an answer to the following three questions in your report.

* **What is the biggest predictor of a large CO2 output per capita of a country?**
* **Which countries are making the biggest strides in decreasing CO2 output?**
* **Which non-fossil fuel energy technology will have the best price in the future?**




---


**1: Biggest predictor of CO2 output**

To determine this you may want to consider things like GDP per capita, diets, number of cars per capita, various energy sources, mobility and other factors.

Your answer can also be a specific combination of certain factors.


---


**2: Biggest strides in decreasing CO2 output**

You'll need to find the relative CO2 output for each country to be able to calculate this. But countries can have growing and shrinking populations too, so it's probably a good idea to take this into account as well.
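One way to take population into account is to compare CO2 per capita at the start and end of a period. The sketch below is only an illustration: the two countries and all numbers are made up, and `per_capita_change` is a hypothetical helper name, not something prescribed by the assignment.

```python
import pandas as pd

# Hypothetical toy data: total emissions in million tonnes and population
# for two made-up countries in two years. These are NOT real figures.
df = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "year": [2000, 2020, 2000, 2020],
    "co2": [100.0, 90.0, 50.0, 60.0],
    "population": [10_000_000, 12_000_000, 5_000_000, 5_000_000],
})

def per_capita_change(data, start_year, end_year):
    """Percentage change in CO2 per capita between two years, per country."""
    # Convert million tonnes to tonnes, then divide by population
    data = data.assign(co2_pc=data["co2"] * 1_000_000 / data["population"])
    wide = data.pivot(index="country", columns="year", values="co2_pc")
    change = (wide[end_year] - wide[start_year]) / wide[start_year] * 100
    # Most negative first: the biggest relative decrease ends up on top
    return change.sort_values()

change = per_capita_change(df, 2000, 2020)
print(change)
```

In this toy example country A emits less in total while its population grows, so its per capita output falls by a quarter; country B has a stable population and rising emissions, so its per capita output rises.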


---


**3: Best future price for non-fossil fuel energy**

To be able to predict prices you'll probably need to use linear regression over the various non-fossil fuel options.
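A minimal sketch of that idea, assuming a table of yearly prices per technology is available. The price series below are made up for illustration, and `predict_price` is a hypothetical helper. It fits a straight line per technology with `np.polyfit` and extrapolates it to a future year.

```python
import numpy as np

# Hypothetical price series in USD per MWh. These numbers are made up for
# illustration and are NOT taken from Our World in Data.
years = np.array([2015, 2016, 2017, 2018, 2019, 2020])
prices = {
    "solar": np.array([100.0, 90.0, 80.0, 70.0, 60.0, 50.0]),
    "wind": np.array([80.0, 76.0, 72.0, 68.0, 64.0, 60.0]),
}

def predict_price(years, values, target_year):
    """Fit a straight line to (year, price) and evaluate it at target_year."""
    base = years.min()  # shift years to start at 0 for a better-conditioned fit
    slope, intercept = np.polyfit(years - base, values, deg=1)
    return slope * (target_year - base) + intercept

forecast = {name: predict_price(years, values, 2022) for name, values in prices.items()}
cheapest = min(forecast, key=forecast.get)
print(forecast, cheapest)
```

Keep in mind that a straight line extrapolated far enough will cross zero, so a linear forecast is only plausible for a few years beyond the data.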


---


**Submitting your Assignment**

Once you're done with this module, you can go to the next item where you'll be able to submit your assignment.

Please submit both your written report and all notebooks you've created while writing the report. Make sure everything works before submitting.



# Thoughts

Find out what countries do:
* Did they make any pledges?
* What kind of energy are they using now?


What contributes to CO2 output?

What are the prices for the different types of non-fossil energy?

What happens when a country outsources everything?

How about CO2 vs the other GHGs (greenhouse gases)? Are there countries where CO2 falls but the others rise?


# What contributes to CO2 output?


In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%load_ext google.colab.data_table

In [36]:
path_base = "https://raw.githubusercontent.com/MevrouwHelderder/final_assignment/main/"

paths = {
    "co2_greenhouse" : path_base + "owid-co2-data.csv",
    "energy_total" : path_base + "owid-energy-data.csv",
    "net_zero_pledge" : path_base + "net-zero-targets.csv",
    "agri_land_use_total" : path_base + "total-agricultural-area-over-the-long-term.csv",
    "grazing" : path_base + "grazing-land-use-over-the-long-term.csv",
    "cropland" : path_base + "cropland-use-over-the-long-term.csv"
}

dataframes = {}

for key, value in paths.items():
  dataframes[key] = pd.read_csv(value)

energy_total_df = dataframes["energy_total"]
co2_greenhouse_df = dataframes["co2_greenhouse"]
net_zero_pledge_df = dataframes["net_zero_pledge"]
agri_land_use_total_df = dataframes["agri_land_use_total"]
grazing_df = dataframes["grazing"]
cropland_df = dataframes["cropland"]

# CO2 output

First let's look at what countries produce, import and export, and how that changed through the years.

* **co2**: annual total production-based CO₂ emissions. Measured in million tonnes.
* **co2_growth_abs**: annual growth of production-based CO₂ emissions. Measured in million tonnes.
* **co2_growth_prct**: annual percentage growth of production-based CO₂ emissions. Measured in percent.
* **co2_per_capita**: annual total production-based CO₂ emissions per capita. Measured in tonnes per person.

* **trade_co2**: annual net carbon dioxide (CO₂) emissions embedded in trade, i.e. the net of emissions imported and exported via traded goods. Positive = net importer of CO₂ emissions; negative = net exporter. Measured in million tonnes.
* **consumption_co2**: annual consumption-based CO₂ emissions: production-based emissions minus emissions embedded in exports, plus emissions embedded in imports. Consumption > production = net importer of CO₂ emissions; consumption < production = net exporter. Measured in million tonnes.
* **consumption_co2_per_capita**: annual consumption-based CO₂ emissions per capita. Measured in tonnes per person.
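The relation between these columns can be checked with a little arithmetic: consumption-based emissions should be roughly `co2 + trade_co2`. The sketch below uses three rows of 2020 values copied from this notebook's own output; the column names `co2_plus_trade`, `difference` and `role` are illustrative additions, not columns of the dataset.

```python
import pandas as pd

# 2020 values copied from the output tables in this notebook (million tonnes)
sample = pd.DataFrame({
    "country": ["China", "United States", "Singapore"],
    "co2": [10956.213, 4715.691, 29.909],
    "trade_co2": [-922.811, 481.707, 113.336],
    "consumption_co2": [10033.401, 5197.398, 143.245],
})

# Production plus net emissions embedded in trade
sample["co2_plus_trade"] = sample["co2"] + sample["trade_co2"]

# Absolute gap to the reported consumption figure (rounding differences only)
sample["difference"] = (sample["consumption_co2"] - sample["co2_plus_trade"]).abs()

# A positive trade balance means the country is a net importer of emissions
sample["role"] = sample["trade_co2"].apply(
    lambda t: "net importer" if t > 0 else "net exporter"
)
print(sample)
```

For these rows the two figures agree to within rounding, and the sign of `trade_co2` marks China as a net exporter and the United States and Singapore as net importers of emissions.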

Notes for later, also interesting:
* What do we see when we compare rich and poor countries (the columns regarding GDP)?
* What do we see when we look at the CO2 per unit of energy?

In [86]:
columns = [
    "country", "year", "iso_code", "population",
    "co2", "co2_growth_abs", "co2_growth_prct", "co2_per_capita",
    "trade_co2", "consumption_co2", "consumption_co2_per_capita",
]
explore_co2 = co2_greenhouse_df.loc[:, columns]
explore_co2 = explore_co2.set_index("year")
explore_co2.index = pd.to_datetime(explore_co2.index, format="%Y", errors="coerce")

# Keep only rows that have an iso_code; this drops aggregates such as continents
explore_co2 = explore_co2[explore_co2["iso_code"].notna()]


# 2020 is the latest year for which trade data is available. Let's look at the top ten for each measurement:

In [92]:
# Total production, not corrected for trade:
top_co2_2020 = explore_co2.loc["2020-01-01 00:00:00",].sort_values(by="co2", ascending = False).head(10)
top_co2_2020

| year | country | iso_code | population | co2 | co2_growth_abs | co2_growth_prct | co2_per_capita | trade_co2 | consumption_co2 | consumption_co2_per_capita |
|---|---|---|---|---|---|---|---|---|---|---|
| 2020-01-01 | China | CHN | 1424930000.0 | 10956.213 | 215.217 | 2.004 | 7.689 | -922.811 | 10033.401 | 7.041 |
| 2020-01-01 | United States | USA | 335942000.0 | 4715.691 | -543.453 | -10.333 | 14.037 | 481.707 | 5197.398 | 15.471 |
| 2020-01-01 | India | IND | 1396387000.0 | 2445.012 | -181.447 | -6.908 | 1.751 | -168.131 | 2276.881 | 1.631 |
| 2020-01-01 | Russia | RUS | 145617300.0 | 1624.221 | -68.142 | -4.026 | 11.154 | -264.073 | 1360.149 | 9.341 |
| 2020-01-01 | Japan | JPN | 125244800.0 | 1042.224 | -63.791 | -5.768 | 8.321 | 144.917 | 1187.141 | 9.479 |
| 2020-01-01 | Iran | IRN | 87290190.0 | 729.978 | 27.02 | 3.844 | 8.363 | -71.532 | 658.446 | 7.543 |
| 2020-01-01 | Saudi Arabia | SAU | 35997110.0 | 661.193 | 4.711 | 0.718 | 18.368 | -3.249 | 657.944 | 18.278 |
| 2020-01-01 | Germany | DEU | 83328990.0 | 639.381 | -67.769 | -9.583 | 7.673 | 130.111 | 769.492 | 9.234 |
| 2020-01-01 | Indonesia | IDN | 271858000.0 | 609.786 | -49.65 | -7.529 | 2.243 | 14.441 | 624.227 | 2.296 |
| 2020-01-01 | South Korea | KOR | 51844690.0 | 597.634 | -48.468 | -7.502 | 11.527 | 61.863 | 659.497 | 12.721 |


In [93]:
# Total production per capita, not corrected for trade:
top_co2_pc_2020 = explore_co2.loc["2020-01-01 00:00:00",].sort_values(by="co2_per_capita", ascending = False).head(10)
top_co2_pc_2020

| year | country | iso_code | population | co2 | co2_growth_abs | co2_growth_prct | co2_per_capita | trade_co2 | consumption_co2 | consumption_co2_per_capita |
|---|---|---|---|---|---|---|---|---|---|---|
| 2020-01-01 | Qatar | QAT | 2760390.0 | 92.861 | -8.292 | -8.197 | 33.64 | -21.722 | 71.139 | 25.771 |
| 2020-01-01 | Bahrain | BHR | 1477478.0 | 37.603 | -0.211 | -0.558 | 25.451 | -15.773 | 21.83 | 14.775 |
| 2020-01-01 | Brunei | BRN | 441736.0 | 10.553 | 0.066 | 0.626 | 23.89 | -0.592 | 9.961 | 22.55 |
| 2020-01-01 | Trinidad and Tobago | TTO | 1518142.0 | 35.756 | -5.012 | -12.294 | 23.553 | -9.564 | 26.192 | 17.253 |
| 2020-01-01 | Kuwait | KWT | 4360451.0 | 99.779 | -4.582 | -4.39 | 22.883 | -2.839 | 96.94 | 22.232 |
| 2020-01-01 | United Arab Emirates | ARE | 9287286.0 | 199.084 | -9.411 | -4.514 | 21.436 | -14.75 | 184.335 | 19.848 |
| 2020-01-01 | New Caledonia | NCL | 286412.0 | 5.387 | 0.268 | 5.234 | 18.807 |  |  |  |
| 2020-01-01 | Saudi Arabia | SAU | 35997108.0 | 661.193 | 4.711 | 0.718 | 18.368 | -3.249 | 657.944 | 18.278 |
| 2020-01-01 | Oman | OMN | 4543406.0 | 72.506 | 0.329 | 0.456 | 15.959 | -13.765 | 58.742 | 12.929 |
| 2020-01-01 | Australia | AUS | 25670052.0 | 399.922 | -16.434 | -3.947 | 15.579 | -45.388 | 354.534 | 13.811 |


In [94]:
# Total co2 for own consumption: 
top_co2_consumer_2020 = explore_co2.loc["2020-01-01 00:00:00",].sort_values(by="consumption_co2", ascending = False).head(10)
top_co2_consumer_2020

| year | country | iso_code | population | co2 | co2_growth_abs | co2_growth_prct | co2_per_capita | trade_co2 | consumption_co2 | consumption_co2_per_capita |
|---|---|---|---|---|---|---|---|---|---|---|
| 2020-01-01 | China | CHN | 1424930000.0 | 10956.213 | 215.217 | 2.004 | 7.689 | -922.811 | 10033.401 | 7.041 |
| 2020-01-01 | United States | USA | 335942000.0 | 4715.691 | -543.453 | -10.333 | 14.037 | 481.707 | 5197.398 | 15.471 |
| 2020-01-01 | India | IND | 1396387000.0 | 2445.012 | -181.447 | -6.908 | 1.751 | -168.131 | 2276.881 | 1.631 |
| 2020-01-01 | Russia | RUS | 145617300.0 | 1624.221 | -68.142 | -4.026 | 11.154 | -264.073 | 1360.149 | 9.341 |
| 2020-01-01 | Japan | JPN | 125244800.0 | 1042.224 | -63.791 | -5.768 | 8.321 | 144.917 | 1187.141 | 9.479 |
| 2020-01-01 | Germany | DEU | 83328990.0 | 639.381 | -67.769 | -9.583 | 7.673 | 130.111 | 769.492 | 9.234 |
| 2020-01-01 | South Korea | KOR | 51844690.0 | 597.634 | -48.468 | -7.502 | 11.527 | 61.863 | 659.497 | 12.721 |
| 2020-01-01 | Iran | IRN | 87290190.0 | 729.978 | 27.02 | 3.844 | 8.363 | -71.532 | 658.446 | 7.543 |
| 2020-01-01 | Saudi Arabia | SAU | 35997110.0 | 661.193 | 4.711 | 0.718 | 18.368 | -3.249 | 657.944 | 18.278 |
| 2020-01-01 | Indonesia | IDN | 271858000.0 | 609.786 | -49.65 | -7.529 | 2.243 | 14.441 | 624.227 | 2.296 |


In [95]:
# Total co2 for own consumption per capita: 
top_co2_consumer_pc_2020 = explore_co2.loc["2020-01-01 00:00:00",].sort_values(by="consumption_co2_per_capita", ascending = False).head(10)
top_co2_consumer_pc_2020

| year | country | iso_code | population | co2 | co2_growth_abs | co2_growth_prct | co2_per_capita | trade_co2 | consumption_co2 | consumption_co2_per_capita |
|---|---|---|---|---|---|---|---|---|---|---|
| 2020-01-01 | Qatar | QAT | 2760390.0 | 92.861 | -8.292 | -8.197 | 33.64 | -21.722 | 71.139 | 25.771 |
| 2020-01-01 | Singapore | SGP | 5909874.0 | 29.909 | -0.007 | -0.024 | 5.061 | 113.336 | 143.245 | 24.238 |
| 2020-01-01 | Brunei | BRN | 441736.0 | 10.553 | 0.066 | 0.626 | 23.89 | -0.592 | 9.961 | 22.55 |
| 2020-01-01 | Kuwait | KWT | 4360451.0 | 99.779 | -4.582 | -4.39 | 22.883 | -2.839 | 96.94 | 22.232 |
| 2020-01-01 | United Arab Emirates | ARE | 9287286.0 | 199.084 | -9.411 | -4.514 | 21.436 | -14.75 | 184.335 | 19.848 |
| 2020-01-01 | Saudi Arabia | SAU | 35997108.0 | 661.193 | 4.711 | 0.718 | 18.368 | -3.249 | 657.944 | 18.278 |
| 2020-01-01 | Trinidad and Tobago | TTO | 1518142.0 | 35.756 | -5.012 | -12.294 | 23.553 | -9.564 | 26.192 | 17.253 |
| 2020-01-01 | Malta | MLT | 515364.0 | 1.6 | -0.05 | -3.009 | 3.104 | 6.962 | 8.561 | 16.612 |
| 2020-01-01 | United States | USA | 335942016.0 | 4715.691 | -543.453 | -10.333 | 14.037 | 481.707 | 5197.398 | 15.471 |
| 2020-01-01 | Belgium | BEL | 11561716.0 | 90.368 | -9.065 | -9.116 | 7.816 | 87.591 | 177.959 | 15.392 |


# Interesting!
This shows that countries that produce a lot of CO2 do not necessarily consume a lot, and vice versa.
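To make that gap concrete, one option is to divide consumption-based by production-based emissions per capita. The sketch below does this for a handful of rows copied from the 2020 tables above; the column name `consumption_to_production` is an illustrative choice, not a column of the dataset.

```python
import pandas as pd

# 2020 per capita values copied from the tables above (tonnes per person)
rows = pd.DataFrame({
    "country": ["Qatar", "Bahrain", "Singapore", "Malta", "Belgium"],
    "co2_per_capita": [33.64, 25.451, 5.061, 3.104, 7.816],
    "consumption_co2_per_capita": [25.771, 14.775, 24.238, 16.612, 15.392],
})

# Ratio above 1: the country consumes more CO2 than it produces (net importer)
# Ratio below 1: the country produces more CO2 than it consumes (net exporter)
rows["consumption_to_production"] = (
    rows["consumption_co2_per_capita"] / rows["co2_per_capita"]
)

ranked = rows.sort_values("consumption_to_production", ascending=False)
ranked = ranked.reset_index(drop=True)
print(ranked)
```

In this small sample Malta and Singapore consume several times what they produce per person, while Qatar and Bahrain produce more than they consume.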

Next step: which countries improved the most and which worsened the most over the past x years (50?)