# An Analysis of Paleo to Present Climate Change Data

<details>

<summary> Student Details </summary>

| Student| Details |
| -------- | -------- |
| Course: | KDATG_L08_Y1 |
| Author: | Rebecca Hannah Quinn |
| Student Number: | G00425671 |

</details>

---

## Project Goals

To analyse CO2 vs temperature anomaly from 800,000 years ago to the present day.
To examine the change in polar ice coverage alongside this.
To examine the changes in Ireland and Irish climate change signals.

After gathering this data I will fuse and analyse it using pandas dataframes and export the results to CSV and JSON formats.

I will analyse the data, the trends, and the relationships between them, including but not limited to temporal leads, lags, and frequencies. I will also use synthesised data to predict the global temperature anomaly over the next few decades, compare the result with published models of the same to see whether atmospheric CO2 trends continue, and comment on the accelerated warming based on the latest features in temperature, polar ice coverage, and ocean and sea levels.




<details>

<summary> Table of Contents </summary>

[INTRODUCTION](#01)

[PRE-PROCESSING](#02)

[ANALYSIS](#03)

[PREDICTIONS](#04)

[FURTHER COMMENT](#05)

[PLOTS](#06)

</details>

---


## Introduction


Studies of the Earth's climate have produced data reaching as far back as 800,000 years by extracting core samples from deep beneath the ice sheets of Greenland and Antarctica. These samples hold detailed information on the air temperature and CO2 levels trapped within them. Current polar records demonstrate a close association between atmospheric carbon dioxide and temperature in the natural world. In essence, when one increases, the other follows.

However, there is still some uncertainty about which occurred first: a spike in temperature or a spike in CO2. The most extensive records to date on a significant change in Earth's climate came from the EPICA Dome C ice core on the Antarctic Plateau. The data, which covered the end of the last ice age, between 20,000 and 10,000 years ago, indicated that CO2 levels could have lagged behind rising global temperatures by as much as 1,400 years.
[1]: https://www.scientificamerican.com/article/ice-core-data-help-solve/#:~:text=Scientists%20use%20air%20trapped%20in,than%20the%20ice%20surrounding%20them.


<a id="01i">

## Pre-processing

</a>





### Import Packages

In [93]:
#importing packages required for analysis and visualization
import requests
from io import StringIO
from IPython.display import display
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

### Initial Adjustments

#### CO2 Levels Data

##### Data Collection and Reading in the Data

The first dataset used, `epica8kyr1`, is an updated version of the atmospheric CO2 composite data obtained from the revised EPICA Dome C and Antarctic ice cores. The previous version, from Lüthi et al. (2008), contained analytical bias and lower quality data, which this new version improves on. The age unit is years before present (yr BP), where present refers to 1950 AD. The negative values are later converted to year values starting at year 0, so that the data can be merged and tidied for use in later plots.
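As a minimal sketch of the yr BP convention described above, assuming only that "present" is fixed at 1950 AD, a calendar year can be recovered as 1950 minus the age. The helper name `bp_to_calendar_year` and the sample ages are hypothetical and are not taken from the dataset. It illustrates the convention only and is not the conversion applied in the cells that follow.

```python
import pandas as pd

PRESENT_YEAR = 1950  # "present" in the yr BP convention


def bp_to_calendar_year(age_bp):
    """Convert ages in years before present (1950) to calendar years.

    Positive ages lie before 1950, negative ages lie after it.
    """
    return PRESENT_YEAR - age_bp


# hypothetical sample ages in yr BP, not real ice-core values
ages = pd.Series([-51, 0, 1950, 800000])
calendar_years = bp_to_calendar_year(ages)
print(calendar_years.tolist())
```

Because the helper is plain arithmetic, it works on a single number and on a whole pandas column alike.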

In [94]:
epica1col = ["Year", "co2", "sigma"]
epica8kyr1 = pd.read_excel("https://www.ncei.noaa.gov/pub/data/paleo/icecore/antarctica/antarctica2015co2.xls", sheet_name="CO2 Composite", skiprows=range(0, 15), names=epica1col)
epica8kyr1.reset_index(drop=True, inplace=True)

#show column index numbers to ensure working with correct info
for idx, label in enumerate(epica8kyr1.columns):
    print(f"Column {label} is at index {idx}")

#keep only rows with a numeric year; the column is already numeric, so the .str accessor cannot be used on it
epica8kyr1["Year"] = pd.to_numeric(epica8kyr1["Year"], errors="coerce")
epica8kyr1 = epica8kyr1.dropna(subset=["Year"]).copy()
epica8kyr1["Year"] = epica8kyr1["Year"].astype(int)

#shift negative years up by the size of the most negative year, so the most negative year becomes 0
minyear = epica8kyr1["Year"].min()

def convertyear(year):
    if year < 0:
        return year + abs(minyear)
    return year

epica8kyr1["Year"] = epica8kyr1["Year"].apply(convertyear)

epica8kyr1.to_csv("epica8kyr1.csv")

Column Year is at index 0
Column co2 is at index 1
Column sigma is at index 2

In [None]:
epica2col = ["date", "co2", "unc"]
epica8kyr2 = pd.read_csv("https://gml.noaa.gov/webdata/ccgg/trends/co2/co2_annmean_mlo.csv", skiprows=44, sep=",", names=epica2col)

epica8kyr2.to_csv("epica8kyr2.csv")

##### Cleanup of Data

Here we rename the `date` column of the second dataset to `year` and store its values as strings, so that the two datasets can be merged seamlessly.

In [None]:
epica8kyr2new = epica8kyr2.rename(columns={"date": "year"})
epica8kyr2.reset_index(drop=True, inplace=True)
epica8kyr2new["year"] = epica8kyr2new["year"].astype(str)
epica8kyr2new.to_csv("epica8kyr2new.csv")

##### Merging Data

###### Merging both CO2 datasets for plotting

In [None]:
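#give the ice core year column the same name as the modern record
epica8kyr1_renamed = epica8kyr1.rename(columns={"Year": "year"})

#stack each dataset once; adding head() and tail() again would duplicate rows
merge_epica1 = pd.concat([epica8kyr1_renamed, epica8kyr2new], axis=0, ignore_index=True)
#store the year as a number so that later statistics and plots can use it
merge_epica1["year"] = pd.to_numeric(merge_epica1["year"])

merge_epica1.to_csv("epica_merge1.csv", index = False)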
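As a rough sketch of the merging step, the example below stacks two small stand-in tables after giving them the same column names and a numeric year. The frames `ice_core` and `modern` and all of their values are hypothetical. Dropping duplicates and sorting are assumptions about what a tidy merged record would need, not steps taken from the notebook.

```python
import pandas as pd

# hypothetical stand-ins for the ice-core and modern CO2 tables
ice_core = pd.DataFrame({"Year": [1900, 1950], "co2": [296.0, 311.0]})
modern = pd.DataFrame({"year": ["1959", "1960"], "co2": [316.0, 317.0]})

# give both frames the same column names and a numeric year
ice_core = ice_core.rename(columns={"Year": "year"})
modern = modern.assign(year=pd.to_numeric(modern["year"]))

# stack the frames once each, then remove repeated years and sort by time
combined = (
    pd.concat([ice_core, modern], ignore_index=True)
    .drop_duplicates(subset="year")
    .sort_values("year")
    .reset_index(drop=True)
)
print(combined)
```

Aligning the column names before the concatenation keeps the years in a single column instead of two half-empty ones.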

#### Collection of CH4

In [None]:
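ch4columns = ["Year", "Epica Dome C, Antarctica"]
ch4read = pd.read_csv("https://climatechange.chicago.gov/sites/production/files/2016-08/ghg-concentrations_fig-2.csv", skiprows=range(0, 5), header=1, usecols=[0, 1], names=ch4columns)

#coerce the year column to numbers and drop rows that are not numeric
#(str.isnumeric would also discard negative years)
ch4read["Year"] = pd.to_numeric(ch4read["Year"], errors="coerce")
ch4read = ch4read.dropna(subset=["Year"]).copy()
ch4read["Year"] = ch4read["Year"].astype(int)

#shift negative years up by the size of the most negative year, so the most negative year becomes 0
minyear = ch4read["Year"].min()

def convertyear(year):
    if year < 0:
        return year + abs(minyear)
    return year

ch4read["Year"] = ch4read["Year"].apply(convertyear)
ch4read.head()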
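When filtering a year column down to its numeric rows, note that `str.isnumeric` accepts only strings made up entirely of digits, so negative years and decimals are rejected along with genuine text. The sketch below compares it with `pd.to_numeric` using `errors="coerce"` on a hypothetical column; the values are made up for illustration.

```python
import pandas as pd

# hypothetical raw year column with a header remnant, a negative year and a decimal
raw = pd.Series(["Year", "-50", "1200", "13.5"])

# str.isnumeric is True only for strings made up entirely of digits
digit_only = raw.str.isnumeric()
print(digit_only.tolist())  # [False, False, True, False]

# to_numeric turns unparseable entries into NaN, which can then be dropped
parsed = pd.to_numeric(raw, errors="coerce").dropna()
print(parsed.tolist())  # [-50.0, 1200.0, 13.5]
```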

In [None]:
ch4read.info()

In [None]:
ch4read.describe()

In [None]:
noaach4 = "https://gml.noaa.gov/webdata/ccgg/trends/ch4/ch4_annmean_gl.txt"

response = requests.get(noaach4)

if response.status_code == 200:
    text = response.text
    noaach4df = pd.read_csv(StringIO(text), delimiter="\t", skiprows=range(0, 44), header=0)
    noaach4df.to_csv("noaach4.csv", index=True)



---


In [None]:
noaach4df.head()

In [None]:
noaach4df.info()


In [None]:
noaach4df.describe()


---


In [None]:
epicadeut = "https://www.ncei.noaa.gov/pub/data/paleo/icecore/antarctica/epica_domec/edc3deuttemp2007.txt"

response = requests.get(epicadeut)
#stop here if the download failed, otherwise epicadeutdf would be undefined below
response.raise_for_status()

#raw string so that \s+ reaches the regex engine unchanged
epicadeutdf = pd.read_csv(StringIO(response.text), sep=r"\s+", skiprows=range(0, 89), header=0)

epicadeutdf.rename(columns={"Age": "Year"}, inplace=True)

epicadeutdf["Year"] = epicadeutdf["Year"].astype(int)

#shift negative years up by the size of the most negative year, so the most negative year becomes 0
minyear = epicadeutdf["Year"].min()

def convertyear(year):
    if year < 0:
        return year + abs(minyear)
    return year

epicadeutdf["Year"] = epicadeutdf["Year"].apply(convertyear)
epicadeutdf.to_csv("epicadeut.csv", index=True)
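The whitespace-delimited parsing used for this file can be tried offline on a small excerpt. The sketch below assumes a layout with the same five column names; the text `sample_text` and every value in it are made up and are not rows from the EPICA file.

```python
from io import StringIO

import pandas as pd

# hypothetical excerpt in a whitespace-delimited layout; all values are made up
sample_text = """Bag ztop Age Deuterium Temperature
1 0.00 -50.0 -390.0 0.5
2 0.55 -43.5 -391.2 0.3
3 1.10 -37.4 -389.9 0.6
"""

# r"\s+" is a raw string, so the backslash reaches the regex engine unchanged
sample_df = pd.read_csv(StringIO(sample_text), sep=r"\s+", header=0)
sample_df = sample_df.rename(columns={"Age": "Year"})
print(sample_df.dtypes)
```

Wrapping the text in `StringIO` lets `read_csv` treat a string as if it were a file, which is the same pattern the download cell relies on.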


In [None]:
epicadeutdf.head(30)

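| | Bag | ztop | Age | Deuterium | Temperature |
| -------- | -------- | -------- | -------- | -------- | -------- |
| 0 | 1 | 0.0 | -50.0 | | |
| 1 | 2 | 0.55 | -43.54769 | | |
| 2 | 3 | 1.1 | -37.41829 | | |
| 3 | 4 | 1.65 | -31.61153 | | |
| 4 | 5 | 2.2 | -24.51395 | | |
| 5 | 6 | 2.75 | -17.73776 | | |
| 6 | 7 | 3.3 | -10.95945 | | |
| 7 | 8 | 3.85 | -3.20879 | | |
| 8 | 9 | 4.4 | 5.48176 | | |
| 9 | 10 | 4.95 | 13.52038 | | |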


In [None]:
epicadeutdf.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5800 entries, 0 to 5799
Data columns (total 5 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   Bag          5800 non-null   int64  
 1   ztop         5800 non-null   float64
 2   Age          5800 non-null   float64
 3   Deuterium    5788 non-null   float64
 4   Temperature  5785 non-null   float64
dtypes: float64(4), int64(1)
memory usage: 226.7 KB



https://www.ncei.noaa.gov/access/monitoring/climate-at-a-glance/global/time-series/antarctic/land_ocean/12/11/1850-2023/data.csv




---

#### Irish Climate Change

https://www.met.ie/climate/available-data/historical-data

#### Exploratory Data Analysis/Initial Exploration

In [None]:
print(merge_epica1.head())

In [None]:
print(merge_epica1.describe())

In [None]:
#info() prints its own summary and returns None, so it is not wrapped in print()
merge_epica1.info()

In [None]:
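#numeric_only skips text columns, which would otherwise raise an error in recent pandas versions
print(merge_epica1.corr(numeric_only=True))
#source: https://www.geeksforgeeks.org/python-pandas-dataframe-corr/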

#### Calculating Statistical Measures

In [None]:
mean_value = merge_epica1["co2"].mean()
print(mean_value)

In [None]:
median_value = merge_epica1["co2"].median()
print(median_value)

In [None]:
std_deviation = merge_epica1["co2"].std()
print(std_deviation)

In [None]:
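#the year may be stored as text, so convert it to a number before correlating
correlation = merge_epica1["co2"].corr(pd.to_numeric(merge_epica1["year"], errors="coerce"))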
print(correlation)
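To make the correlation figure less of a black box, the sketch below computes Pearson's r by hand and compares it with the pandas `Series.corr` result, which uses the Pearson method by default. The small table `df` holds hypothetical values that are not taken from the merged dataset.

```python
import numpy as np
import pandas as pd

# hypothetical values, not taken from the merged dataset
df = pd.DataFrame({
    "year": [2000, 2001, 2002, 2003, 2004],
    "co2": [369.0, 371.0, 373.0, 376.0, 377.0],
})

# centre both columns on their means
x = df["year"] - df["year"].mean()
y = df["co2"] - df["co2"].mean()

# Pearson r: summed cross products divided by the product of the two spreads
manual_r = (x * y).sum() / np.sqrt((x ** 2).sum() * (y ** 2).sum())
pandas_r = df["co2"].corr(df["year"])

print(round(manual_r, 4), round(pandas_r, 4))
```

A value close to 1 here only says that the two columns rise together over the sampled range; it says nothing about which one drives the other.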

#### Initial Visualization

In [None]:
#Matplotlib Line plot
plt.figure(figsize=(18, 16))
#convert the year to a number so that it is plotted on a continuous axis
plt.plot(pd.to_numeric(merge_epica1["year"], errors="coerce"), merge_epica1["co2"], color="blue")
plt.xlabel("Year")
plt.ylabel("CO2")
plt.xticks(np.arange(0, 2024, step=50))
plt.title("CO2 vs Time")
plt.tight_layout()
plt.savefig("lineplot1.png")

In [None]:
#SNS Lineplot
#style
sns.set_style("whitegrid")
sns.set_context("paper")  # Adjust context to paper for smaller font sizes

#size
plt.figure(figsize=(16, 12))

#lineplot
sns.lineplot(data=merge_epica1, x="year", y="co2", color="blue")

#labels
plt.xlabel("Year", fontsize=12)
plt.ylabel("CO2 Levels", fontsize=12)
plt.title("CO2 Levels Over Time", fontsize=14)
#ticks spacing, rotation and font size
plt.xticks(range(0, 2024, 25), rotation=45, fontsize=10)
plt.yticks(fontsize=10)


plt.tight_layout()  #additional spacing
plt.savefig("snslineplot.png")



---


<a id="02i">

## Analysis

</a>

### Trends


### Relationships
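One way the lead and lag question raised in the introduction could be examined is to shift one series against the other and keep the shift with the highest correlation. The sketch below is a minimal illustration on synthetic data only: the series `temperature` and `co2`, the helper `best_lag`, and the built-in delay of 5 samples are all hypothetical, and nothing here is a result from the ice-core records.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# synthetic "temperature" signal and a "co2" signal that follows it 5 samples later
n = 300
true_lag = 5
t = np.arange(n)
temperature = pd.Series(np.sin(2 * np.pi * t / 40) + 0.5 * rng.standard_normal(n))
co2 = temperature.shift(true_lag) + 0.1 * rng.standard_normal(n)


def best_lag(leader, follower, max_lag):
    """Return the shift of `leader` (in samples) that correlates best with `follower`."""
    # Series.corr ignores the NaN values that shifting introduces at the start
    scores = {lag: follower.corr(leader.shift(lag)) for lag in range(max_lag + 1)}
    return max(scores, key=scores.get), scores


lag, scores = best_lag(temperature, co2, max_lag=20)
print(lag)
```

On real records the two series would first need a common, evenly spaced time axis, since a shift counted in samples only corresponds to a shift in years when the spacing is constant.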



---

<a id="03i">

## Predictions

</a>

### Synthetic Data


### Comparisons with Published Climate Models

In [None]:
#SECTION 3 - PYTHON CELL



---

<a id="04i">

## Further Comments

</a>



In [None]:
#SECTION 4 - PYTHON CELL



---

<a id="05i">

## Plots

</a>



In [None]:
#SECTION 5 - PYTHON CELL



---

<a id="07i">

## REFERENCES

</a>


---