# An Analysis of Paleo to Present Climate Change Data

<details>

<summary> Student Details </summary>

| Student| Details |
| -------- | -------- |
| Course: | KDATG_L08_Y1 |
| Author: | Rebecca Hannah Quinn |
| Student Number: | G00425671 |

</details>

---

## Project Goals

To analyse CO2 vs temperature anomaly from 800,000 years ago to the present day.
To examine the change in polar-ice coverage alongside this.
To examine the changes in Ireland and Irish climate change signals.

I will fuse and analyse the gathered data using pandas DataFrames and export the results to CSV and JSON formats.

I will analyse the data, its trends, and the relationships between the series, including but not limited to temporal leads, lags, and frequencies. I will also use synthesised data to predict the global temperature anomaly over the next few decades, assuming atmospheric CO2 trends continue, and compare the result with published models. Finally, I will comment on accelerated warming based on the latest temperature, polar-ice coverage, and ocean and sea-level data.




<details>

<summary> Table of Contents </summary>

[INTRODUCTION](#01)

[PRE-PROCESSING](#02)

[ANALYSIS](#03)

[PREDICTIONS](#04)

[FURTHER COMMENT](#05)

[PLOTS](#06)

</details>

---


## Introduction


Studies of the Earth's climate have produced data reaching back as far as 800,000 years, obtained by extracting core samples from deep beneath the ice sheets of Greenland and Antarctica. Air trapped within these samples carries detailed information on past air temperature and CO2 levels. Current polar records demonstrate a close association between atmospheric carbon dioxide and temperature in the natural world. In essence, when one increases, the other follows.

However, there is still some uncertainty about which occurred first: a spike in temperature or in CO2. Until now, the most extensive record of a significant change in Earth's climate came from the EPICA Dome C ice core on the Antarctic Plateau. The data, which covered the end of the last ice age, between 20,000 and 10,000 years ago, indicated that CO2 levels could have lagged behind rising global temperatures by as much as 1,400 years. [1]

[1]: https://www.scientificamerican.com/article/ice-core-data-help-solve/#:~:text=Scientists%20use%20air%20trapped%20in,than%20the%20ice%20surrounding%20them.


<a id="01i">

## Pre-processing

</a>





### Import Packages

In [None]:
#importing packages required for analysis and visualization
import requests
from io import StringIO
from IPython.display import display
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

### Initial Adjustments

#### CO2 Levels Data

##### Data Collection and Reading in the Data

The first dataset, `epica8kyr1`, is an updated version of the atmospheric CO2 composite obtained from the revised EPICA Dome C and Antarctic ice cores. It improves on the previous version of Luthi et al. (2008), which contained analytical bias and lower-quality data. The age unit is years before present (yr BP), where present refers to 1950 AD. The code below converts each age to a calendar year (1950 minus the age, so dates before year 0 come out negative) so that the data can be merged cleanly and used in later plots.

In [None]:
# Column names for the CO2 composite sheet
epica1col = ["year", "co2", "sigma"]
# Reading the legacy .xls format requires the xlrd package to be installed
epica8kyr1 = pd.read_excel("https://www.ncei.noaa.gov/pub/data/paleo/icecore/antarctica/antarctica2015co2.xls", sheet_name="CO2 Composite", skiprows=59, names=epica1col)

# Convert ages in years before present (1950) to calendar years
epica8kyr1["year"] = epica8kyr1["year"].astype(int)
epica8kyr1["year"] = 1950 - epica8kyr1["year"]

epica8kyr1.to_csv("epica8kyr1.csv", index=False)
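
As a quick sanity check on the year conversion, the following minimal sketch applies the same formula to a few hand-picked ages. The helper name `bp_to_calendar_year` is hypothetical and not part of the notebook; it only wraps `1950 - age`, and the ages are toy values rather than rows from the EPICA file.

```python
import pandas as pd

def bp_to_calendar_year(age_bp, present=1950):
    """Hypothetical helper: convert ages in years before present (BP) to calendar years."""
    return present - age_bp

# Toy ages in yr BP, chosen by hand (not taken from the EPICA file)
ages = pd.Series([0, 1950, 20000, 800000])
years = bp_to_calendar_year(ages)

print(years.tolist())  # [1950, 0, -18050, -798050]
```

Negative results are simply years before year 0, which keeps the whole record on one numeric axis for merging and plotting.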

In [None]:
# Annual mean CO2 measured at the Mauna Loa Observatory (NOAA GML)
epica2col = ["year", "co2", "unc"]
epica8kyr2 = pd.read_csv("https://gml.noaa.gov/webdata/ccgg/trends/co2/co2_annmean_mlo.csv", skiprows=44, sep=",", names=epica2col)

epica8kyr2.to_csv("epica8kyr2.csv")

##### Mauna Loa Observatory, 1960 to Present.

Additional CO2 Data

In [381]:
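epica3col = ["year", "co2", "unc"]
# NOTE: this URL is NOAA's global annual-mean CH4 file, so the "co2" column
# actually holds CH4 values (ppb); the column name is left unchanged here
epica8kyr3url = "https://gml.noaa.gov/webdata/ccgg/trends/ch4/ch4_annmean_gl.txt"

response = requests.get(epica8kyr3url)
response.raise_for_status()  # stop here if the download failed

# Fixed-width file: skip the header block and apply our own column names
epica8kyr3 = pd.read_fwf(StringIO(response.text), names=epica3col, skiprows=45, header=0)
epica8kyr3.to_csv("epica8kyr3.csv", index=False)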



In [382]:
epica8kyr3.columns
epica8kyr3.head()

| year | co2 | unc |
| -------- | -------- | -------- |
| 1984 | 1644.68 | 0.67 |
| 1985 | 1657.29 | 0.59 |
| 1986 | 1670.09 | 0.74 |
| 1987 | 1682.71 | 0.49 |
| 1988 | 1693.13 | 0.67 |


##### Cleanup of Data

Here we reset the index of the second dataset and store its "year" column as text before saving a cleaned copy, so that the datasets can be merged seamlessly later. The column is cast back to an integer immediately before the merge so that both datasets share a key of the same type.

In [None]:
epica8kyr2.reset_index(drop=True, inplace=True)
epica8kyr2["year"] = epica8kyr2["year"].astype(str)
epica8kyr2.to_csv("epica8kyr2new.csv")

##### Merging Data

###### Merging both CO2 datasets for plotting

In [None]:
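# Cast "year" back to int so both frames share the same key type
epica8kyr2["year"] = epica8kyr2["year"].astype(int)
# Outer merge keeps every year from both datasets; the suffixes are appended
# directly to the shared column name, giving "co2epica8kyr1" and "co2epica8kyr2"
mergeepica1 = pd.merge(epica8kyr1, epica8kyr2, on="year", how="outer", suffixes=("epica8kyr1", "epica8kyr2"))

mergeepica1.to_csv("epica_merge1.csv", index=True)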
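
To see what the outer merge does to column names and row coverage, here is a minimal sketch on two toy tables. The frames `ice` and `mlo` and their values are illustrative stand-ins, not the real datasets; `indicator=True` is an optional extra (not used in the notebook) that records which source each row came from.

```python
import pandas as pd

# Toy stand-ins for the ice-core and Mauna Loa tables (illustrative values only)
ice = pd.DataFrame({"year": [1900, 1950, 1960], "co2": [296.0, 311.0, 317.0], "sigma": [1.0, 1.2, 0.9]})
mlo = pd.DataFrame({"year": [1960, 1961], "co2": [316.9, 317.6], "unc": [0.12, 0.12]})

merged = pd.merge(
    ice, mlo, on="year", how="outer",
    suffixes=("epica8kyr1", "epica8kyr2"),
    indicator=True,  # adds a "_merge" column: left_only / right_only / both
)

print(merged.columns.tolist())
print(merged["_merge"].value_counts())
```

Only the column shared by both tables ("co2") receives a suffix; "sigma" and "unc" keep their names because they appear in one table each.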

In [None]:
print(mergeepica1.head())
print(mergeepica1.describe())
mergeepica1.info()  # info() prints its own summary and returns None
print(mergeepica1.corr())
# Reference: https://www.geeksforgeeks.org/python-pandas-dataframe-corr/

In [373]:
#additional merge

finalmerge = pd.merge(mergeepica1, epica8kyr3, on="year", how="outer", suffixes=("mergeepica1", "epica8kyr3"))
finalmerge = finalmerge.sort_values(by="year", ascending=False)
finalmerge.to_csv("epicafinalmerge.csv", index=True)

#### Calculating Statistical Measures


In [None]:
mean_value = mergeepica1["co2epica8kyr1"].mean()
print(mean_value)

median_value = mergeepica1["co2epica8kyr1"].median()
print(median_value)

std_deviation = mergeepica1["co2epica8kyr1"].std()
print(std_deviation)

correlation = mergeepica1["co2epica8kyr1"].corr(mergeepica1["year"])
print(correlation)



#### Initial Visualization


In [None]:
#Matplotlib line plot
plt.figure(figsize=(18, 16))
plt.plot(mergeepica1["year"], mergeepica1["co2epica8kyr1"], color="blue")
plt.xlabel("Year")
plt.ylabel("CO2 (ppm)")
plt.title("CO2 vs Time")
plt.tight_layout()
plt.savefig("lineplot1.png")

In [None]:
#SNS Lineplot
#style
sns.set_style("whitegrid")
sns.set_context("paper")  # Adjust context to paper for smaller font sizes

#size
plt.figure(figsize=(16, 12))

#lineplot
sns.lineplot(data=mergeepica1, x="year", y="co2epica8kyr1", color="blue")

#labels
plt.xlabel("Year", fontsize=12)
plt.ylabel("CO2 Levels", fontsize=12)
plt.title("CO2 Levels Over Time", fontsize=14)

#ticks: rotation and font size
plt.xticks(rotation=45, fontsize=10)
plt.yticks(fontsize=10)


plt.tight_layout()  #additional spacing
plt.savefig("snslineplotnew.png")

#### Collection of CH4 Data

In [None]:
colnames = ["year", "mean", "unc"]
noaach4 = "https://gml.noaa.gov/webdata/ccgg/trends/ch4/ch4_annmean_gl.txt"

response = requests.get(noaach4)
response.raise_for_status()  # stop here if the download failed

# Fixed-width columns: year, global annual mean CH4, uncertainty
noaach4df = pd.read_fwf(StringIO(response.text), names=colnames, widths=[6, 12, 8], skiprows=44, header=0)
noaach4df.to_csv("noaach4.csv", index=True)
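
The hard-coded `skiprows` value depends on the exact length of the file header. A possible alternative, sketched below, skips header lines by their comment marker instead. This assumes the header lines of the NOAA text file begin with `#`, which should be checked against the downloaded file; `parse_annual_means` is a hypothetical helper and the sample text is a hand-made stand-in for the real file.

```python
from io import StringIO
import pandas as pd

# Hand-made sample in the assumed layout: '#' header lines, then whitespace-separated rows
sample = """# header line one
# year     mean      unc
  1984  1644.68     0.67
  1985  1657.29     0.59
"""

def parse_annual_means(text, names=("year", "mean", "unc")):
    """Hypothetical helper: parse whitespace-separated rows, skipping '#' comment lines."""
    return pd.read_csv(StringIO(text), comment="#", sep=r"\s+", names=list(names))

df = parse_annual_means(sample)
print(df)
```

With this approach the same call keeps working if lines are added to or removed from the header block, provided the comment-marker assumption holds.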


---


In [None]:
noaach4df.head()

In [None]:
noaach4df.info()

In [None]:
noaach4df.describe()

##### CH4 Plots

In [None]:
noaach4df.columns

In [None]:

#SNS Lineplot
#style
sns.set_style("whitegrid")
sns.set_context("paper")  # Adjust context to paper for smaller font sizes

#size
plt.figure(figsize=(16, 12))

#lineplot
sns.lineplot(data=noaach4df, x="year", y="mean", color="blue")

#labels
plt.xlabel("Year", fontsize=12)
plt.ylabel("CH4", fontsize=12)
plt.title("CH4 Levels", fontsize=14)

#ticks: rotation and font size
plt.xticks(rotation=45, fontsize=10)
plt.yticks(fontsize=10)


plt.tight_layout()  #additional spacing
plt.savefig("ch4levels2.png")



---


#### Temperatures

In [None]:
epicadeut = "https://www.ncei.noaa.gov/pub/data/paleo/icecore/antarctica/epica_domec/edc3deuttemp2007.txt"

response = requests.get(epicadeut)
response.raise_for_status()  # stop here if the download failed

# Whitespace-separated table; the first row after the skipped header block holds the column names
epicadeutdf = pd.read_csv(StringIO(response.text), sep=r"\s+", skiprows=89, header=0)

epicadeutdf.rename(columns={"Age": "Year"}, inplace=True)

# Convert ages in years before 1950 to calendar years
epicadeutdf["Year"] = epicadeutdf["Year"].astype(int)
epicadeutdf["Year"] = 1950 - epicadeutdf["Year"]

epicadeutdf.to_csv("epicadeut.csv", index=True)


In [None]:
epicadeutdf.head(15)

In [None]:
epicadeutdf.info()

In this dataset, ages are given on the EDC3 age scale (years before 1950), and the temperature estimate is the difference from the average of the last 1000 years.
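
The temperature estimate is expressed as a difference from the average of the last 1000 years. The EPICA file already supplies that value, so the minimal sketch below only illustrates the definition on toy numbers. The column names `age_bp` and `temp_c` are hypothetical and the temperatures are invented for the example.

```python
import pandas as pd

# Toy absolute temperatures in degC (invented values, not from the EDC3 file)
df = pd.DataFrame({
    "age_bp": [100, 500, 900, 5000, 20000],
    "temp_c": [-54.0, -55.0, -56.0, -54.5, -64.0],
})

# Baseline: mean over the samples younger than 1000 years BP
baseline = df.loc[df["age_bp"] < 1000, "temp_c"].mean()

# Anomaly: each value minus that baseline
df["anomaly"] = df["temp_c"] - baseline

print(baseline)
print(df)
```

A negative anomaly therefore means colder than the recent 1000-year average, and a positive one means warmer.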


---


In [None]:
noaatemp = pd.read_csv("https://www.ncei.noaa.gov/access/monitoring/climate-at-a-glance/global/time-series/antarctic/land_ocean/12/11/1850-2023/data.csv", skiprows=range(0,4))

noaatemp.to_csv("noaatemp.csv", index=True)
noaatemp.head(10)


---

#### Irish Climate Change

##### Yearly Rainfall 1711 - 2016

In [None]:
colnames = ["Year", "Month", "Median Rainfall"]
irelandrain = pd.read_csv("DATA Files/IOI_1711_SERIES.CSV", names=colnames, header=0) 
irelandrain.info()
print(irelandrain.columns)

In [None]:
irelandrain.tail() #check final year entry

In [None]:
irelandrain.head()

irelandrain["Median Rainfall"] = irelandrain["Median Rainfall"].astype(float)

# Sum the monthly medians within each year to get one total per year
raindfnew = (
    irelandrain.groupby("Year", as_index=False)["Median Rainfall"]
    .sum()
    .rename(columns={"Median Rainfall": "Total Median"})
)

raindfnew.to_csv("yearlyrain.csv", index=False)
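
Yearly totals are only comparable if every year contains all twelve months. Whether the rainfall file has any incomplete years is not known here, so the following is just a sketch of one way to check, using a toy table in the same shape as `irelandrain` with invented values.

```python
import pandas as pd

# Toy monthly table in the same shape as irelandrain (invented values)
rain = pd.DataFrame({
    "Year": [2015] * 12 + [2016] * 3,
    "Month": list(range(1, 13)) + [1, 2, 3],
    "Median Rainfall": [100.0] * 12 + [90.0] * 3,
})

# Count the distinct months present in each year and flag years with fewer than 12
months_per_year = rain.groupby("Year")["Month"].nunique()
incomplete = months_per_year[months_per_year < 12].index.tolist()

print(incomplete)
```

Any year listed in `incomplete` could then be dropped or noted before its total is plotted.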


In [None]:
raindfread = pd.read_csv("yearlyrain.csv")

# set_context ignores figure.figsize, so the size is set on the figure itself
sns.set_context("paper", rc={"lines.linewidth": 0.5})
plt.figure(figsize=(10, 22))
plot = sns.lineplot(data=raindfread, x="Year", y="Total Median")
plot.set_xlabel("Year")
plot.set_ylabel("Median Rainfall (mm)")

##### Yearly Temperatures



---


<a id="02i">

## Analysis

</a>

### Trends
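
One simple way to quantify a trend is a least-squares straight line through an annual series. The sketch below is only an illustration on a toy series with a known slope; it is an assumption about how the trend analysis might be approached, not a result from the datasets above.

```python
import numpy as np

# Toy annual series: a perfectly linear rise of 2 units per year (illustrative only)
years = np.arange(2000, 2010)
values = 370.0 + 2.0 * (years - 2000)

# Centre the years so the fit stays numerically well behaved
x = years - years.mean()

# Least-squares straight line: the slope is the trend per year
slope, intercept = np.polyfit(x, values, deg=1)
print(round(slope, 3))
```

On real data the slope would be reported together with the period it was fitted over, since the CO2 record is far from linear across 800,000 years.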


### Relationships
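
The project goals mention temporal leads and lags. One possible approach, sketched here under the assumption that both series have first been placed on a common, regular time grid, is to shift one series against the other and keep the shift with the highest correlation. The helper `best_lag` is hypothetical and the two series are synthetic.

```python
import numpy as np
import pandas as pd

# Synthetic series on a regular grid: 'b' is a copy of 'a' delayed by 3 steps
rng = np.random.default_rng(0)
a = pd.Series(rng.normal(size=200)).rolling(5, min_periods=1).mean()
b = a.shift(3)

def best_lag(x, y, max_lag=10):
    """Hypothetical helper: shift of x (in steps) that correlates best with y."""
    corrs = {lag: x.shift(lag).corr(y) for lag in range(-max_lag, max_lag + 1)}
    return max(corrs, key=corrs.get)

print(best_lag(a, b))
```

A positive result means the second series follows the first by that many steps. The ice-core records are irregularly sampled, so they would need interpolating onto a shared grid before a comparison like this is meaningful.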



---

<a id="03i">

## Predictions

</a>

### Synthetic Data
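
As a starting point for synthesising future values, the sketch below fits a straight line to a toy anomaly series and extends it with random noise of the same scatter. This is a deliberately simple assumption (linear extrapolation is not a climate model), and every number in it is invented for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Toy historical anomalies in degC: linear rise plus noise (invented values)
hist_years = np.arange(1980, 2024)
hist = 0.02 * (hist_years - 1980) + rng.normal(scale=0.1, size=hist_years.size)

# Fit a straight line and measure the scatter of the data around it
x0 = hist_years.mean()
slope, intercept = np.polyfit(hist_years - x0, hist, deg=1)
resid_std = np.std(hist - (slope * (hist_years - x0) + intercept))

# Synthetic future values: extrapolated line plus noise with the same scatter
future_years = np.arange(2024, 2051)
noise = rng.normal(scale=resid_std, size=future_years.size)
synthetic = slope * (future_years - x0) + intercept + noise

future = pd.DataFrame({"year": future_years, "anomaly": synthetic})
print(future.head())
```

Fixing the random seed keeps the synthetic series reproducible, which makes later comparison with published projections easier to repeat.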


### Comparisons with Published Climate Models

In [None]:
#SECTION 3 - PYTHON CELL



---

<a id="04i">

## Further Comments

</a>



In [None]:
#SECTION 4 - PYTHON CELL



---

<a id="05i">

## Plots

</a>



In [None]:
#SECTION 5 - PYTHON CELL



---

<a id="07i">

## REFERENCES

</a>


---