# 🌍 EDA on Global CO₂ Emissions (2019–2020)

An exploratory data analysis (EDA) of worldwide CO₂ emissions for 2019 and 2020, using a small sample dataset built for demonstration. The notebook covers the global trend, breakdowns by country, and emissions by source (coal, oil, gas, and cement).

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
# Sample data for demonstration
data = {
    "country": ["World", "World", "China", "China", "United States", "United States"],
    "year": [2019, 2020, 2019, 2020, 2019, 2020],
    "co2": [36000, 34000, 10000, 9800, 5000, 4800],
    "co2_per_capita": [4.7, 4.5, 7.1, 6.9, 15.5, 15.0],
    "coal_co2": [12000, 11000, 7000, 6800, 1000, 900],
    "oil_co2": [14000, 13000, 2000, 1900, 2500, 2400],
    "gas_co2": [8000, 7500, 700, 680, 1200, 1150],
    "cement_co2": [2000, 1800, 300, 290, 300, 280]
}
df = pd.DataFrame(data)
df
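
Before plotting, it can help to check whether the four source columns add up to the `co2` total in each row. The snippet below is a minimal sketch of that check. It rebuilds the same sample values so it runs on its own, and the names `source_cols`, `check`, `sources_sum`, and `residual` are illustrative, not part of the original notebook. Because the values are demonstration data, a non-zero residual here does not necessarily mean anything; in a real dataset it could point to sources that are not listed.

```python
import pandas as pd

# Same sample values as the demonstration DataFrame above.
df = pd.DataFrame({
    "country": ["World", "World", "China", "China", "United States", "United States"],
    "year": [2019, 2020, 2019, 2020, 2019, 2020],
    "co2": [36000, 34000, 10000, 9800, 5000, 4800],
    "coal_co2": [12000, 11000, 7000, 6800, 1000, 900],
    "oil_co2": [14000, 13000, 2000, 1900, 2500, 2400],
    "gas_co2": [8000, 7500, 700, 680, 1200, 1150],
    "cement_co2": [2000, 1800, 300, 290, 300, 280],
})

# Hypothetical helper names, introduced only for this check.
source_cols = ["coal_co2", "oil_co2", "gas_co2", "cement_co2"]

check = df[["country", "year", "co2"]].copy()
check["sources_sum"] = df[source_cols].sum(axis=1)      # row-wise sum of the four sources
check["residual"] = check["co2"] - check["sources_sum"]  # part of the total not covered by them
print(check)
```

In this sample the 2019 rows add up exactly, while the 2020 rows leave a small residual, so any share computed against `co2` will not sum to 100% for 2020.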

## 📈 Global CO₂ Emissions Trend

In [None]:
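# Keep only the aggregate "World" rows
world_data = df[df["country"] == "World"]
plt.figure(figsize=(10, 5))
sns.lineplot(data=world_data, x="year", y="co2", marker="o")
# Show whole years on the x-axis instead of fractional ticks
plt.xticks(world_data["year"])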
plt.title("Global CO₂ Emissions Trend (2019–2020)")
plt.xlabel("Year")
plt.ylabel("CO₂ Emissions (million tonnes)")
plt.grid(True)
plt.tight_layout()
plt.show()
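
The line chart shows the direction of the change but not its size. As a rough sketch, the year-over-year percentage change can be computed by pivoting the data so that each year becomes a column. The block repeats the relevant sample values to stay self-contained, and `pivot` and `pct` are illustrative names rather than variables from the notebook.

```python
import pandas as pd

# Same sample values as the notebook's demonstration data (totals only).
df = pd.DataFrame({
    "country": ["World", "World", "China", "China", "United States", "United States"],
    "year": [2019, 2020, 2019, 2020, 2019, 2020],
    "co2": [36000, 34000, 10000, 9800, 5000, 4800],
})

# One row per country, one column per year.
pivot = df.pivot(index="country", columns="year", values="co2")

# Percentage change from 2019 to 2020 for each country.
pct = (pivot[2020] - pivot[2019]) / pivot[2019] * 100
print(pct.round(2))
```

With these sample values the change is roughly -5.6% for the world, -2% for China, and -4% for the United States.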

## 🧱 Emissions by Source: United States

In [None]:
# Select the United States rows and keep only the per-source columns
usa = df[df['country'] == 'United States']
usa_sectors = usa[['year', 'coal_co2', 'oil_co2', 'gas_co2', 'cement_co2']].set_index('year')
# Stacked bars: one bar per year, one segment per source
usa_sectors.plot(kind='bar', stacked=True, figsize=(10, 6), rot=0)
plt.title('USA CO₂ Emissions by Source (2019–2020)')
plt.ylabel('CO₂ Emissions (million tonnes)')
plt.xlabel('Year')
plt.tight_layout()
plt.show()
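
A stacked bar chart makes the totals easy to read, but the relative weight of each source is clearer as a percentage. The sketch below computes each source's share of the four listed sources for the United States. It assumes the shares should be taken relative to the sum of those four columns rather than the `co2` column, and `shares` is an illustrative name.

```python
import pandas as pd

# United States rows from the sample data, per-source columns only.
usa = pd.DataFrame({
    "year": [2019, 2020],
    "coal_co2": [1000, 900],
    "oil_co2": [2500, 2400],
    "gas_co2": [1200, 1150],
    "cement_co2": [300, 280],
}).set_index("year")

# Divide each row by its own total so every year sums to 100%.
shares = usa.div(usa.sum(axis=1), axis=0) * 100
print(shares.round(1))
```

In the sample, oil accounts for about half of the listed US emissions in both years, with gas second and coal third.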

## 🧠 Key Insights
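- In the sample data, global CO₂ emissions fall from 36,000 to 34,000 million tonnes between 2019 and 2020 (about 5.6%), a drop that coincides with the COVID-19 pandemic.
- Of the two countries in the sample, China has the higher total emissions (9,800 million tonnes in 2020), followed by the United States (4,800 million tonnes).
- Oil and gas are the largest sources of CO₂ in the US (2,500 and 1,200 million tonnes in 2019), ahead of coal and cement.
- CO₂ per capita is highest in the United States (15.0 in 2020), compared with 6.9 for China and 4.5 for the world as a whole.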
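
The country comparisons in the list above are not backed by a cell in the notebook, so here is a minimal sketch of how they could be checked against the sample data. It excludes the aggregate `World` row when ranking totals, since that row is not a country. The names `countries`, `latest`, `total_rank`, and `top_per_capita` are illustrative.

```python
import pandas as pd

# Same sample values as the notebook's demonstration data.
df = pd.DataFrame({
    "country": ["World", "World", "China", "China", "United States", "United States"],
    "year": [2019, 2020, 2019, 2020, 2019, 2020],
    "co2": [36000, 34000, 10000, 9800, 5000, 4800],
    "co2_per_capita": [4.7, 4.5, 7.1, 6.9, 15.5, 15.0],
})

# Rank individual countries by total emissions in the most recent year.
countries = df[df["country"] != "World"]
latest = countries[countries["year"] == countries["year"].max()]
total_rank = latest.sort_values("co2", ascending=False)["country"].tolist()

# Region with the highest per-capita value across all rows, World included.
top_per_capita = df.loc[df["co2_per_capita"].idxmax(), "country"]

print("Ranking by total CO2:", total_rank)
print("Highest per capita:", top_per_capita)
```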