# Climate Data Exploratory Data Analysis

## Introduction
This notebook contains an exploratory data analysis of climate data from 1900 to 2023. The dataset includes global temperatures, CO2 concentration, sea level rise, and Arctic ice area.

Your task is to perform a comprehensive EDA following the requirements in the README.md file.

In [11]:
# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Set plot styling
sns.set_style('whitegrid')
sns.set_palette('viridis')
%matplotlib inline

## 1. Data Preparation

Load the climate data and perform necessary cleaning and aggregation.

In [21]:
# Load the dataset
df = pd.read_csv('data/Climate_Change_Indicators.csv')  # Relative path to the dataset file

# Display the first few rows of the dataset
df.head()

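|         | Year | Global Average Temperature (°C) | CO2 Concentration (ppm) | Sea Level Rise (mm) | Arctic Ice Area (million km²) |
|---------|------|---------------------------------|-------------------------|---------------------|-------------------------------|
| 1048571 | 1969 | 13.65                           | 312.51                  | 110.22              | 9.42                          |
| 1048572 | 1919 | 14.92                           | 348.21                  | 125.24              | 6.23                          |
| 1048573 | 1984 | 14.81                           | 337.78                  | 23.67               | 7.74                          |
| 1048574 | 1953 | 15.5                            | 342.91                  | 12.27               | 3.38                          |
| 1048575 | 1930 | 13.1                            | 398.55                  | 106.72              | 8.28                          |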


In [17]:
# Check for missing values and basic information about the dataset
print("Dataset Information:")
print(df.info())
print("\nMissing Values:")
print(df.isnull().sum())

Dataset Information:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1048576 entries, 0 to 1048575
Data columns (total 5 columns):
 #   Column                           Non-Null Count    Dtype  
---  ------                           --------------    -----  
 0   Year                             1048576 non-null  int64  
 1   Global Average Temperature (°C)  1048576 non-null  float64
 2   CO2 Concentration (ppm)          1048576 non-null  float64
 3   Sea Level Rise (mm)              1048576 non-null  float64
 4   Arctic Ice Area (million km²)    1048576 non-null  float64
dtypes: float64(4), int64(1)
memory usage: 40.0 MB
None

Missing Values:
Year                               0
Global Average Temperature (°C)    0
CO2 Concentration (ppm)            0
Sea Level Rise (mm)                0
Arctic Ice Area (million km²)      0
dtype: int64
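
The `info()` output reports 1,048,576 rows for a year range of 1900 to 2023, so each year must appear many times. Before aggregating, it can help to look at how many rows each year contributes and whether any rows are exact duplicates. The following is a minimal sketch on a small synthetic frame; `demo` and its values are made up for illustration, and in the notebook the same calls would be applied to `df`.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for df (hypothetical values; only the column names match)
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "Year": rng.integers(1900, 2024, size=5_000),  # upper bound is exclusive
    "Global Average Temperature (°C)": rng.uniform(13.0, 16.0, size=5_000),
})

# Number of rows recorded for each year, in chronological order
rows_per_year = demo["Year"].value_counts().sort_index()

# Number of rows that are identical to an earlier row in every column
n_duplicates = int(demo.duplicated().sum())

print(rows_per_year.describe())
print("Duplicated rows:", n_duplicates)
```

A strongly uneven `rows_per_year` would suggest that sum-based aggregates are not comparable across years.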


In [57]:
# Aggregate data by year to create a 124-year time series (1900-2023)
# Confirm the year range before grouping
print(df['Year'].max())
print(df['Year'].min())
yearly_data = df.groupby('Year').agg({
    'Global Average Temperature (°C)': 'mean',  # Average temperature per year
    'CO2 Concentration (ppm)': 'max',          # Maximum CO₂ per year
    'Sea Level Rise (mm)': 'sum',              # Total sea level rise per year
    'Arctic Ice Area (million km²)': 'min'     # Minimum Arctic ice per year
}).reset_index()
yearly_data.head(10)

2023
1900


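|   | Year | Global Average Temperature (°C) | CO2 Concentration (ppm) | Sea Level Rise (mm) | Arctic Ice Area (million km²) |
|---|------|---------------------------------|-------------------------|---------------------|-------------------------------|
| 0 | 1900 | 14.506663                       | 420.0                   | 1254856.35          | 3.0                           |
| 1 | 1901 | 14.485343                       | 419.98                  | 1271685.95          | 3.0                           |
| 2 | 1902 | 14.476262                       | 419.99                  | 1294551.2           | 3.0                           |
| 3 | 1903 | 14.49236                        | 420.0                   | 1252604.15          | 3.0                           |
| 4 | 1904 | 14.494241                       | 420.0                   | 1266961.48          | 3.0                           |
| 5 | 1905 | 14.486222                       | 419.99                  | 1273751.24          | 3.0                           |
| 6 | 1906 | 14.50161                        | 419.96                  | 1261048.78          | 3.0                           |
| 7 | 1907 | 14.507352                       | 419.99                  | 1258936.21          | 3.0                           |
| 8 | 1908 | 14.489932                       | 420.0                   | 1270480.52          | 3.0                           |
| 9 | 1909 | 14.52432                        | 419.94                  | 1273541.21          | 3.0                           |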
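
In the ten rows shown above, the per-year maximum CO2 stays at or just under 420 and the per-year minimum ice area is always 3.0, while the sea level sum grows with the number of rows in each year. These statistics therefore carry little year-to-year information. One possible alternative, sketched here on synthetic data, is to take the mean of every variable. The `demo` frame and its value ranges are assumptions for illustration only, and whether the mean is the right choice depends on the requirements in the README.

```python
import numpy as np
import pandas as pd

cols = [
    "Global Average Temperature (°C)",
    "CO2 Concentration (ppm)",
    "Sea Level Rise (mm)",
    "Arctic Ice Area (million km²)",
]

# Synthetic stand-in for df: uniform noise within hypothetical ranges
rng = np.random.default_rng(1)
n = 10_000
demo = pd.DataFrame({"Year": rng.integers(1900, 2024, size=n)})
for col, (low, high) in zip(cols, [(13, 16), (280, 420), (0, 250), (3, 10)]):
    demo[col] = rng.uniform(low, high, size=n)

# One row per year, each variable averaged over that year's records
yearly_mean = demo.groupby("Year", as_index=False)[cols].mean()

print(yearly_mean.shape)
print(yearly_mean.head())
```

Unlike a sum, the mean does not depend on how many records a year happens to contain.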


## 2. Univariate Analysis

Analyze each climate variable independently.
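
As a starting point, a univariate pass might combine descriptive statistics with a histogram and a box plot per variable. The sketch below is not a reference solution: `yearly_demo` is synthetic, and `summarize` and `plot_distribution` are hypothetical helper names introduced here. In the notebook they would be applied to `yearly_data`.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic yearly series (hypothetical values, 124 rows for 1900-2023)
rng = np.random.default_rng(2)
years = np.arange(1900, 2024)
yearly_demo = pd.DataFrame({
    "Year": years,
    "Global Average Temperature (°C)": rng.normal(14.5, 0.1, size=years.size),
    "CO2 Concentration (ppm)": rng.normal(350.0, 1.0, size=years.size),
})


def summarize(series):
    """Return describe() output extended with skewness and kurtosis."""
    stats = series.describe()
    stats["skew"] = series.skew()
    stats["kurtosis"] = series.kurt()
    return stats


def plot_distribution(data, column):
    """Draw a histogram and a box plot of one column side by side."""
    fig, (ax_hist, ax_box) = plt.subplots(1, 2, figsize=(10, 4))
    ax_hist.hist(data[column], bins=20)
    ax_hist.set_xlabel(column)
    ax_hist.set_ylabel("Count")
    ax_box.boxplot(data[column])
    ax_box.set_ylabel(column)
    fig.suptitle(f"Distribution of {column}")
    return fig


temp_stats = summarize(yearly_demo["Global Average Temperature (°C)"])
print(temp_stats)
fig = plot_distribution(yearly_demo, "Global Average Temperature (°C)")
```

Looping over the value columns and calling both helpers would cover every variable in turn.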

In [None]:
# TODO: Perform univariate analysis for each climate variable
# Include descriptive statistics and appropriate visualizations
# Your code here


## 3. Bivariate Analysis

Explore relationships between pairs of climate variables.
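
A bivariate pass could start from a Pearson correlation matrix and a least-squares line for one pair of variables. The sketch below uses synthetic data in which temperature is deliberately constructed to depend on CO2, so the numbers it prints say nothing about the real dataset. `yearly_demo` is a made-up stand-in for `yearly_data`.

```python
import numpy as np
import pandas as pd

# Synthetic data with a built-in linear relationship (illustration only)
rng = np.random.default_rng(3)
years = np.arange(1900, 2024)
co2 = np.linspace(300.0, 420.0, years.size) + rng.normal(0, 2.0, years.size)
temp = 13.5 + 0.01 * (co2 - 300.0) + rng.normal(0, 0.1, years.size)
yearly_demo = pd.DataFrame({
    "Year": years,
    "CO2 Concentration (ppm)": co2,
    "Global Average Temperature (°C)": temp,
})

# Pearson correlation between every pair of value columns
corr_matrix = yearly_demo.drop(columns="Year").corr(method="pearson")

# Correlation coefficient for a single pair
pair_r = yearly_demo["CO2 Concentration (ppm)"].corr(
    yearly_demo["Global Average Temperature (°C)"]
)

# Least-squares line: temp is approximated by slope * co2 + intercept
slope, intercept = np.polyfit(co2, temp, deg=1)

print(corr_matrix.round(3))
print(f"r = {pair_r:.3f}, slope = {slope:.4f} °C per ppm")
```

A scatter plot of the pair with the fitted line overlaid is a natural companion to these numbers.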

In [None]:
# TODO: Perform bivariate analysis
# Include correlation analysis and appropriate visualizations
# Your code here

## 4. Multivariate Analysis

Investigate relationships among three or more variables.
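
One way to compare several variables at once is to standardize each to z-scores, which removes the differences in units, and then average the z-scores by decade. This is a sketch under stated assumptions rather than the required approach: `yearly_demo` is synthetic, and the `Decade` column is a helper introduced here.

```python
import numpy as np
import pandas as pd

# Synthetic yearly series (hypothetical values, 124 rows for 1900-2023)
rng = np.random.default_rng(4)
years = np.arange(1900, 2024)
yearly_demo = pd.DataFrame({
    "Year": years,
    "Global Average Temperature (°C)": rng.normal(14.5, 0.1, years.size),
    "CO2 Concentration (ppm)": rng.normal(350.0, 1.0, years.size),
    "Sea Level Rise (mm)": rng.normal(150.0, 1.0, years.size),
})
value_cols = [c for c in yearly_demo.columns if c != "Year"]

# z-score each variable: subtract its mean, divide by its standard deviation
values = yearly_demo[value_cols]
standardized = (values - values.mean()) / values.std()

# Label each year with the decade it belongs to (1900, 1910, ..., 2020)
standardized["Decade"] = (yearly_demo["Year"] // 10) * 10

# Mean z-score per decade for every variable
decade_profile = standardized.groupby("Decade")[value_cols].mean()
print(decade_profile.round(2))
```

The resulting table has one row per decade and one column per variable, so it can be passed to `sns.heatmap` to show all variables in a single figure.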

In [None]:
# TODO: Perform multivariate analysis
# Create advanced visualizations showing multiple variables
# Your code here

## 5. Conclusions and Insights

Summarize your findings and discuss their implications.

TODO: Write your conclusions here.