# PfDA Assignment 2 2023

## An analysis of paleo-present climate data 

# Table of Contents
1. [Introduction](#overview)  
    - [Problem Statement](#problem-statement)

<a id="overview"></a>

# 1. Introduction and Project Overview: 

This notebook contains my submission for the Programming for Data Analysis module (2023) at ATU, taken as part of the Higher Diploma in Computing and Data Analytics.

<a id="problem-statement"></a>
## Problem statement:

- Analyse CO2 vs Temperature Anomaly from 800kyrs – present. 
- Examine one other (paleo/modern) feature (e.g. CH4 or polar ice-coverage) 
- Examine Irish context: climate change signals (see the Maynooth study, "The emergence of a climate change signal in long-term Irish meteorological observations", ScienceDirect: https://www.sciencedirect.com/science/article/pii/S2212094723000610#bib13) 
- Fuse and analyse data from various data sources and format fused data set as a pandas dataframe and export to csv and json formats 
- For all of the above variables, analyse the data, the trends and the relationships between them (temporal leads/lags/frequency analysis). 
- Predict global temperature anomaly over next few decades (synthesise data) and compare to published climate models if atmospheric CO2 trends continue  
- Comment on accelerated warming based on very latest features (e.g. temperature/polar ice-coverage) 

Use a Jupyter notebook for your analysis and track your progress using GitHub. 

Use an academic referencing style 

# Background

## Importing Python libraries and modules

In [1]:
# Importing libraries and modules necessary for this task
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import xlrd
import datetime as dt



# CO2 Data

In [2]:
# Read in the CO2 data from IPCC xls file, skipping the first 14 rows. 
# https://agupubs.onlinelibrary.wiley.com/action/downloadSupplement?doi=10.1002%2F2014GL061957&file=grl52461-sup-0003-supplementary.xls

co2_IPCC = pd.read_excel('data/CO2_IPCC.xls', sheet_name='CO2 Composite', skiprows=range(14))

In [3]:
co2_IPCC.head()

| | Gasage (yr BP) | CO2 (ppmv) | sigma mean CO2 (ppmv) |
|---|---|---|---|
| 0 | -51.03 | 368.022488 | 0.060442 |
| 1 | -48.0 | 361.780737 | 0.37 |
| 2 | -46.279272 | 359.647793 | 0.098 |
| 3 | -44.405642 | 357.10674 | 0.159923 |
| 4 | -43.08 | 353.946685 | 0.043007 |


In [4]:
co2_IPCC.describe()

| | Gasage (yr BP) | CO2 (ppmv) | sigma mean CO2 (ppmv) |
|---|---|---|---|
| count | 1901.0 | 1901.0 | 1901.0 |
| mean | 242810.270113 | 235.566624 | 1.340519 |
| std | 274261.195468 | 35.902698 | 0.924188 |
| min | -51.03 | 173.71362 | 0.01 |
| 25% | 14606.209 | 204.826743 | 0.639335 |
| 50% | 74525.645 | 232.456008 | 1.073871 |
| 75% | 504177.187879 | 257.93 | 1.8 |
| max | 805668.868405 | 368.022488 | 9.96 |



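Gas ages in the ice-core record are given in years Before Present (BP), a scale on which "present" is fixed at 1950 (https://en.wikipedia.org/wiki/Before_Present).

To make the data easier to relate to and to compare with the other datasets, I will convert the 'Gasage (yr BP)' column to calendar years by subtracting each age from 1950, so that it matches the year format used in the other datasets.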


In [6]:
years = 1950 - co2_IPCC['Gasage (yr BP)']

In [None]:
# Add the calendar years as a new "Year" column at the end of the dataframe
co2_IPCC.loc[:, "Year"] = years
print(co2_IPCC)

      Gasage (yr BP)   CO2 (ppmv)  sigma mean CO2 (ppmv)           Year
0          -51.030000  368.022488               0.060442    2001.030000
1          -48.000000  361.780737               0.370000    1998.000000
2          -46.279272  359.647793               0.098000    1996.279272
3          -44.405642  357.106740               0.159923    1994.405642
4          -43.080000  353.946685               0.043007    1993.080000
...               ...         ...                    ...            ...
1896    803925.284376  202.921723               2.064488 -801975.284376
1897    804009.870607  207.498645               0.915083 -802059.870607
1898    804522.674630  204.861938               1.642851 -802572.674630
1899    805132.442334  202.226839               0.689587 -803182.442334
1900    805668.868405  207.285440               2.202808 -803718.868405

[1901 rows x 4 columns]
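As a quick check of the conversion above, the same arithmetic can be run on a small hand-made frame. This is only a sketch: `sample` is a hypothetical three-row frame whose gas ages are copied from the printed output, not the full dataset.

```python
import pandas as pd

# Hypothetical three-row sample; the gas ages are copied from the output above.
sample = pd.DataFrame({"Gasage (yr BP)": [-51.03, -48.0, 805668.868405]})

# "Present" on the BP scale is 1950, so calendar year = 1950 - age.
# Negative ages are therefore after 1950, and very large ages give
# large negative years.
sample["Year"] = 1950 - sample["Gasage (yr BP)"]

print(sample)
```

The first and last rows should agree with rows 0 and 1900 of the full output, which confirms that the sign of the subtraction is the right way round.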


In [None]:
# Read in the annual mean CO2 data from the Mauna Loa .csv file, skipping the first 43 rows. 
# https://gml.noaa.gov/ccgg/trends/data.html

co2_maunaloa = pd.read_csv('data/co2_mauna_loa.csv', skiprows=range(43))
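
Counting header rows by hand, as with `skiprows=range(43)` above, breaks if the provider adds or removes a metadata line. A possible alternative is sketched below. It assumes that every metadata line in the file starts with `#`, which has not been verified against `data/co2_mauna_loa.csv`. The in-memory file `raw` and the name `demo` are hypothetical, and the data rows are copied from the notebook output.

```python
import io
import pandas as pd

# Hypothetical miniature file: the '#' metadata lines are an ASSUMED layout,
# the three data rows are copied from the notebook output.
raw = io.StringIO(
    "# Assumed metadata line 1\n"
    "# Assumed metadata line 2\n"
    "year,mean,unc\n"
    "2020,414.21,0.12\n"
    "2021,416.41,0.12\n"
    "2022,418.53,0.12\n"
)

# comment="#" makes pandas ignore fully commented lines, however many there are,
# so the first remaining line is used as the header row.
demo = pd.read_csv(raw, comment="#")
print(demo)
```

If the real file does not mark its metadata this way, the explicit `skiprows` count remains the safer choice.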

In [None]:
co2_maunaloa.head()

co2_maunaloa.tail()

| | year | mean | unc |
|---|---|---|---|
| 59 | 2018 | 408.72 | 0.12 |
| 60 | 2019 | 411.65 | 0.12 |
| 61 | 2020 | 414.21 | 0.12 |
| 62 | 2021 | 416.41 | 0.12 |
| 63 | 2022 | 418.53 | 0.12 |


In [None]:
co2_maunaloa.describe()

| | year | mean | unc |
|---|---|---|---|
| count | 64.0 | 64.0 | 64.0 |
| mean | 1990.5 | 358.293437 | 0.12 |
| std | 18.618987 | 30.580414 | 9.791247000000001e-17 |
| min | 1959.0 | 315.98 | 0.12 |
| 25% | 1974.75 | 330.895 | 0.12 |
| 50% | 1990.5 | 355.075 | 0.12 |
| 75% | 2006.25 | 382.5725 | 0.12 |
| max | 2022.0 | 418.53 | 0.12 |
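
With both CO2 records loaded, one way they might be fused onto a common `Year` axis and exported is sketched below. It is a minimal illustration and not the final method for this notebook. The frames `ice` and `mlo` are hypothetical miniatures built from values in the outputs above, and the column names `CO2_ppm` and `source` are my own assumptions. The sketch does not resolve the period where the two records overlap (1959 to 2001).

```python
import pandas as pd

# Hypothetical miniature versions of the two frames; values copied from the outputs above.
ice = pd.DataFrame({
    "Year": [2001.03, 1998.0, -803718.868405],
    "CO2 (ppmv)": [368.022488, 361.780737, 207.285440],
})
mlo = pd.DataFrame({"year": [2020, 2021, 2022], "mean": [414.21, 416.41, 418.53]})

# Give both frames the same column names and tag each row with its origin.
ice_std = ice.rename(columns={"CO2 (ppmv)": "CO2_ppm"}).assign(source="ice_core")
mlo_std = mlo.rename(columns={"year": "Year", "mean": "CO2_ppm"}).assign(source="mauna_loa")

# Stack the two records and order them chronologically.
co2_all = (
    pd.concat([ice_std, mlo_std], ignore_index=True)
    .sort_values("Year")
    .reset_index(drop=True)
)

# to_csv() and to_json() return strings when no path is given;
# pass a file path instead to write the fused data to disk.
csv_text = co2_all.to_csv(index=False)
json_text = co2_all.to_json(orient="records")
print(co2_all)
```

Keeping a `source` column makes it possible to filter or de-duplicate the overlapping years later without losing track of which instrument each value came from.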


## Resources

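- Maynooth study, "The emergence of a climate change signal in long-term Irish meteorological observations": https://www.sciencedirect.com/science/article/pii/S2212094723000610#bib13
- xlrd documentation: https://xlrd.readthedocs.io/en/latest/
- NOAA Global Monitoring Laboratory, Mauna Loa CO2 data: https://gml.noaa.gov/ccgg/trends/data.html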
