# Downloading Data
The following code reads the data from the 'owid-co2-data.csv' file. Please make sure to place the file in the same directory as this notebook.
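
If the file is missing or a column has been renamed, the cells below fail with errors that are hard to trace. The following is a minimal sketch of an optional check. The helper `load_co2_data` and the list `REQUIRED_COLUMNS` are hypothetical names introduced here, and the list simply collects the columns this notebook uses.

```python
from pathlib import Path

import pandas as pd

# Columns referenced by the cells in this notebook.
REQUIRED_COLUMNS = ["country", "year", "co2", "co2_per_capita",
                    "share_global_co2", "population", "gdp"]


def load_co2_data(path="owid-co2-data.csv", required=REQUIRED_COLUMNS):
    """Hypothetical helper: read the CSV and fail early with a clear message."""
    csv_path = Path(path)
    if not csv_path.is_file():
        raise FileNotFoundError(
            f"Expected {csv_path.name} in {csv_path.resolve().parent}"
        )
    data = pd.read_csv(csv_path)
    # Report every missing column at once instead of failing on the first use.
    missing = [col for col in required if col not in data.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    return data
```

With the file in place, `df = load_co2_data()` would give the same frame as the plain `pd.read_csv` call.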

In [None]:
import pandas as pd

df = pd.read_csv('owid-co2-data.csv')

# Adding further imports that your notebook needs


In [None]:
import matplotlib.pyplot as plt
import seaborn as sns


## Q1. The growth of CO2 emissions

*How have CO2 emissions grown over time?*

The dataset contains entries for countries grouped into four income groups: low, lower-middle, upper-middle, and high. i) How have the total annual CO2 emissions of each income group varied over the years 1950 to 2021? ii) How has the per capita CO2 emissions (i.e., the emissions per person) of these groups varied over the same period? Comment on how the contribution of each income group is different when measured in terms of total emissions and per capita emissions.

(You can find the Persian version of the question in the project file)




In [None]:

# 1) Total annual CO2 emissions of each income group
sns.set_theme()
income_groups=['Low-income countries','Lower-middle-income countries','Upper-middle-income countries','High-income countries']
income_country=df[df['country'].isin(income_groups)]

plt.figure(figsize=(10,5))
a = sns.lineplot(data=income_country, x='year', y='co2', hue='country')
a.set_xlim(1950,2021)
plt.legend(loc='upper left')
plt.title("Total annual CO2 emissions by income group, 1950-2021")
plt.xlabel("Year")
plt.ylabel("CO2 emissions (million tonnes)")
plt.show()

# 2) Per capita CO2 emissions of each income group
sns.set_theme()
income_groups=['Low-income countries','Lower-middle-income countries','Upper-middle-income countries','High-income countries']
income_country=df[df['country'].isin(income_groups)]

plt.figure(figsize=(10,5))
a = sns.lineplot(data=income_country, x='year', y='co2_per_capita', hue='country')
a.set_xlim(1950,2021)
plt.legend(loc='upper left')
plt.title("CO2 emissions per capita by income group, 1950-2021")
plt.xlabel("Year")
plt.ylabel("CO2 emissions per capita (tonnes per person)")
plt.show()



### Comment

The first graph shows the total CO2 emissions of the four income groups. By this measure, high-income countries appear to have reduced their emissions significantly after 2010. The second graph, however, shows that per capita emissions in high-income countries are still very high, even though they have also decreased since 2010.
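
To put a number on "reduced since 2010", the change between two years can be computed directly instead of being read off the graph. The following is a minimal sketch. The helper `percent_change` is a hypothetical name, and the `toy` frame holds made-up numbers that only illustrate the call.

```python
import pandas as pd


def percent_change(data, column, start_year, end_year):
    """Percent change of `column` per country between two years.

    Assumes each (year, country) pair appears at most once in `data`.
    """
    wide = data.pivot(index="year", columns="country", values=column)
    return (wide.loc[end_year] / wide.loc[start_year] - 1) * 100


# Made-up numbers, only to show the call; not values from the dataset.
toy = pd.DataFrame({
    "country": ["Group A", "Group A", "Group B", "Group B"],
    "year": [2010, 2021, 2010, 2021],
    "co2": [100.0, 90.0, 50.0, 75.0],
    "co2_per_capita": [10.0, 8.0, 2.0, 2.5],
})

print(percent_change(toy, "co2", 2010, 2021))
print(percent_change(toy, "co2_per_capita", 2010, 2021))
```

In this notebook the same call could be applied to the filtered frame, for example `percent_change(income_country, "co2", 2010, 2021)`.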


## Q2. The share of CO2 emissions by country over time (Continued)

Compare the share of global carbon emissions for the top 5 emitting countries as a proportion of the total world emissions. Make separate plots for the years 1960, 1990, and 2020. Make a similar set of plots but normalized in a way that accounts for the population of each country. Note, all the plots should appear in the same figure.

Comment on how the top 5 emitting countries have changed over time and how the top 5 emitting countries change when you normalize for population.

(Add your solution below. Add further markdown and code cells as needed.)

You can find the Persian version of the question in the project file.




In [None]:

sns.set_theme()

not_countries=['Africa','World', 'Non-OECD (GCP)', 'Asia', 'Asia (GCP)', 'Upper-middle-income countries', 'High-income countries', 'OECD (GCP)',  'Europe', 'Europe (GCP)', 'Asia (excl. China and India)', 'North America', 'North America (GCP)', 'Lower-middle-income countries',  'European Union (28)', 'Europe (excl. EU-27)', 'European Union (27) (GCP)', 'European Union (27)', 'Europe (excl. EU-28)', 'Middle East (GCP)',  'Africa (GCP)', 'International transport', 'North America (excl. USA)', 'South America', 'South America (GCP)',]

# Create subplots for a 2x3 grid
fig, axs = plt.subplots(2, 3, figsize=(18, 12))

# Select the years to compare and total CO2 and population per country in those years
years_of_interest = [1960, 1990, 2020]
filtered_data = df[df['year'].isin(years_of_interest) & ~df['country'].isin(not_countries)]
grouped_data = filtered_data.groupby(['year', 'country']).agg({'co2': 'sum', 'population': 'sum'}).reset_index()
top5_emitters = {}

for i, year in enumerate(years_of_interest):
    top5_emitters[year] = grouped_data[grouped_data['year'] == year].nlargest(5, 'co2')

    # Plot on the corresponding subplot in the first row
    sns.barplot(data=top5_emitters[year], x='country', y='co2', ax=axs[0, i], color='skyblue')
    axs[0, i].set_title(f"Top 5 CO2 Emission Countries in {year}")
    axs[0, i].set_xlabel("Country")
    axs[0, i].set_ylabel("CO2 Emission")

    # Filter for the top 5 countries per capita for the current year (aggregates excluded)
    df_year = df[(df['year'] == year) & ~df['country'].isin(not_countries)].copy()
    # Share of global CO2 per million inhabitants
    df_year['co2_per_person'] = df_year['share_global_co2'] / df_year['population'] * 10**6

    top_countries_year_per_capita = df_year.nlargest(5, 'co2_per_person')

    # Plot on the corresponding subplot in the second row
    sns.barplot(data=top_countries_year_per_capita, x='country', y='co2_per_person', ax=axs[1, i], color='purple')
    axs[1, i].set_title(f"Top 5 CO2 Emission Countries per Capita in {year}")
    axs[1, i].set_xlabel("Country")
    axs[1, i].set_ylabel("CO2 Emission per Capita")

# Adjust layout and display the plot
plt.tight_layout()
plt.show()




### Comment

The bar plots for the top 5 CO2-emitting countries in 1960, 1990, and 2020 demonstrate the changing landscape of emissions. It's evident that China's and India's contributions have significantly increased over the years. When considering emissions per capita, the rankings also shift, highlighting the importance of normalizing for population size.
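
The question asks for each country's share of total world emissions. One way to check the plotted values is to divide each country's `co2` by the `World` row for the same year. The following is a minimal sketch. The helper `top_emitter_shares` is a hypothetical name, it assumes the frame contains a `World` row for the requested year, and the `toy` frame holds made-up numbers.

```python
import pandas as pd


def top_emitter_shares(data, year, aggregates, n=5):
    """Top-n countries by CO2 in `year`, with their share of the 'World' row.

    `aggregates` should list every non-country entry, including 'World'.
    """
    rows = data[data["year"] == year]
    world_total = rows.loc[rows["country"] == "World", "co2"].iloc[0]
    countries = rows[~rows["country"].isin(aggregates)]
    top = countries.nlargest(n, "co2")[["country", "co2"]].copy()
    top["share_percent"] = top["co2"] / world_total * 100
    return top.reset_index(drop=True)


# Made-up numbers, only to show the call; not values from the dataset.
toy = pd.DataFrame({
    "country": ["World", "Region", "A", "B", "C", "D"],
    "year": [2020] * 6,
    "co2": [200.0, 140.0, 80.0, 60.0, 40.0, 20.0],
})

print(top_emitter_shares(toy, 2020, aggregates=["World", "Region"], n=2))
```

In this notebook the call would be `top_emitter_shares(df, year, not_countries)` for each year of interest.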

## Q3. The development of wealth inequality over time (Continued)

Make a plot that compares the distribution of GDP per capita across the countries of the world at 10-year intervals from 1950 to 2020. Comment on how the distribution has changed over time.

You can find the Persian version of the question in the project file.



In [None]:

sns.set_theme()

not_countries=['Africa','World', 'Non-OECD (GCP)', 'Asia', 'Asia (GCP)', 'Upper-middle-income countries', 'High-income countries', 'OECD (GCP)',  'Europe', 'Europe (GCP)', 'Asia (excl. China and India)', 'North America', 'North America (GCP)', 'Lower-middle-income countries',  'European Union (28)', 'Europe (excl. EU-27)', 'European Union (27) (GCP)', 'European Union (27)', 'Europe (excl. EU-28)', 'Middle East (GCP)',  'Africa (GCP)', 'International transport', 'North America (excl. USA)', 'South America', 'South America (GCP)',]
years_of_interest = [1950, 1960, 1970, 1980, 1990, 2000, 2010, 2018]

# Filter data for the specified years and countries (copy so a column can be added safely)
filtered_data = df[(df['year'].isin(years_of_interest)) & (~df['country'].isin(not_countries))].copy()

# Calculate GDP per capita
filtered_data['gdp_per_capita'] = filtered_data['gdp'] / filtered_data['population']

# Set the plot style once, then create the figure for the horizontal violin plots
sns.set_theme(style="whitegrid", font_scale=1.2)
plt.figure(figsize=(16, 10))

# Define a common x-axis range
x_axis_range = (-25000, 150000)

for i, year in enumerate(years_of_interest):
    plt.subplot(2, 4, i + 1)
    # Keep this year's rows with a non-negative GDP per capita
    year_data = filtered_data[(filtered_data['year'] == year) & (filtered_data['gdp_per_capita'] >= 0)]
    sns.violinplot(x='gdp_per_capita', y='year', data=year_data, orient='h', linewidth=1.5, width=0.8, inner='quartile', color='lightgreen')
    plt.title(f'GDP Per Capita Distribution - {year}', fontsize=14)
    plt.xlabel('GDP Per Capita', fontsize=12)
    plt.ylabel('Year', fontsize=12)
    plt.xlim(x_axis_range)  # Set common x-axis range

plt.tight_layout()
plt.show()



### Comment

The violin plots showcase the distribution of GDP per capita across countries at 10-year intervals from 1950 to 2018. Over time, there is a noticeable shift towards higher GDP per capita, indicating overall economic growth. However, there are still variations in wealth among countries, and the plots highlight changes in the distribution.
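
The shift described above can also be summarized numerically, for example with a few quantiles per year and the ratio between the upper and lower one as a rough measure of spread. The following is a minimal sketch. The helper `gdp_per_capita_quantiles`, the chosen quantiles, and the `spread_ratio` column are assumptions introduced here, and the `toy` frame holds made-up numbers.

```python
import pandas as pd


def gdp_per_capita_quantiles(data, years, quantiles=(0.1, 0.5, 0.9)):
    """Quantiles of GDP per capita across countries, one row per year."""
    subset = data[data["year"].isin(years)].copy()
    subset["gdp_per_capita"] = subset["gdp"] / subset["population"]
    table = (
        subset.groupby("year")["gdp_per_capita"]
        .quantile(list(quantiles))
        .unstack()
    )
    # Ratio of the highest to the lowest requested quantile as a rough spread measure.
    table["spread_ratio"] = table[max(quantiles)] / table[min(quantiles)]
    return table


# Made-up numbers, only to show the call; not values from the dataset.
toy = pd.DataFrame({
    "country": ["A", "B", "C", "A", "B", "C"],
    "year": [1950, 1950, 1950, 2018, 2018, 2018],
    "gdp": [1000.0, 2000.0, 3000.0, 2000.0, 4000.0, 12000.0],
    "population": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
})

print(gdp_per_capita_quantiles(toy, [1950, 2018]))
```

A rising median together with a rising `spread_ratio` would indicate growth alongside widening inequality, while a falling ratio would indicate convergence.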
