# My Programming for Data Analytics Project

**By Joanne Feeney**
***

For this project, I will:

- Analyse CO2 vs temperature anomaly from 800 kyr ago to the present.
- Examine one other (paleo/modern) feature (e.g. CH4 or polar ice coverage).
- Examine the Irish context, e.g. climate change signals (see the Maynooth study: The emergence of a climate change signal in long-term Irish meteorological observations, ScienceDirect).
- Fuse and analyse data from various data sources, format the fused data set as a pandas DataFrame, and export it to CSV and JSON formats.
- For all of the above variables, analyse the data, the trends and the relationships between them (temporal leads/lags/frequency analysis).
- Predict global temperature anomaly over next few decades (synthesise data) and compare to published climate models if atmospheric CO2 trends continue.
- Comment on accelerated warming based on the very latest features (e.g. temperature or polar ice coverage).

Importing the packages that I will use in this notebook.

In [1]:
# Imports
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

Reading in the data provided by the lecturer and skipping rows that are not required.
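As a small illustration of what `skiprows` does, the sketch below reads from an in-memory string instead of a file. The file contents and the number of metadata lines are made up for the example; the real files in this notebook skip 6 rows.

```python
import io
import pandas as pd

# Hypothetical file contents: two metadata lines, then the real header and data.
raw_text = (
    "Source: example ice-core file\n"
    "Units: ppmv\n"
    "Depth (m),CO2 (ppmv)\n"
    "102.83,280.4\n"
    "106.89,274.9\n"
)

# skiprows=2 discards the two metadata lines, so the third line becomes the header.
demo = pd.read_csv(io.StringIO(raw_text), skiprows=2)

print(demo.columns.tolist())
print(demo.shape)
```

If the wrong number of rows is skipped, a metadata line or a data row ends up as the header, so it is worth checking `demo.columns` (or `df.head()`) after each read.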

Luthi et al. data:

In [2]:
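# Read the first Luthi et al. CO2 file into df1, skipping the first 6 rows.
# Forward slashes avoid invalid escape sequences such as "\C" in the path.
df1 = pd.read_csv("data/CO2_data_from_Luthi_et_al_2008.csv", skiprows=6)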

In [3]:
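# Read the second Luthi et al. CO2 file into df2, skipping the first 6 rows.
# Forward slashes avoid invalid escape sequences such as "\C" in the path.
df2 = pd.read_csv("data/CO2_data_from_Luthi_et_al_2008_2.csv", skiprows=6)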

In [4]:
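# Read the third Luthi et al. CO2 file into df3, skipping the first 6 rows.
# Forward slashes avoid invalid escape sequences such as "\C" in the path.
df3 = pd.read_csv("data/CO2_data_from_Luthi_et_al_2008_3.csv", skiprows=6)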

Jouzel data:

In [5]:
# Read the Jouzel temperature file into df4.
# Forward slashes avoid invalid escape sequences such as "\T" in the path.
df4 = pd.read_csv("data/Temperature_data_from_Jouzel.csv")

df4.head()

| Unnamed: 0 | bag | ztop | EDC3béta | AICC2012 | deutfinal | temp | acc-EDC3beta |
|---|---|---|---|---|---|---|---|
| 0 | 1 | 0.0 | -50.0 | -55.0 | | 218.3 | 3.105321 |
| 1 | 2 | 0.55 | -43.55977 | -50.513333 | | 218.3 | 3.104805 |
| 2 | 3 | 1.1 | -37.44019 | -46.026667 | | 218.3 | 3.104404 |
| 3 | 4 | 1.65 | -31.64156 | -41.54 | | 218.3 | 3.104025 |
| 4 | 5 | 2.2 | -24.55278 | -34.516667 | | 218.3 | 3.103453 |


NOAA data:

In [6]:
# Read the NOAA file into df5, skipping the first 6 rows.
# Forward slashes avoid invalid escape sequences such as "\T" in the path.
df5 = pd.read_csv("data/Temperature_data_from_NOAA_2.csv", skiprows=6)

df5.head()

Unnamed: 0,Depth (m),"Gasage (EDC3, yr BP)","Gasage (AICC2012, yr BP)",CO2 (ppmv),sigma mean CO2 (ppmv),Depth (m).1,"Gasage (EDC3, yr BP).1","Gasage (AICC2012, yr BP).1",CO2 (ppmv).1,Depth (m).2,...,CO2 (ppmv).19,sigma mean CO2 (ppmv).16,Depth (m).19,"Gasage (AICC2012, yr BP).13",corrected CO2 (ppmv),analytical sigma mean CO2 (ppmv),Correcting Factor (ppmv),lower bound (2 sigma) of correction F. (ppmv),upper bound (2 sigma) of correction F. (ppmv),Unnamed: 108
0,102.83,137.0,350.11,280.4,1.8,149.1,2690.0,,284.7,380.82,...,267.9,3.37,2950.53,562654.67,234.07,0.94,0.0,0.0,0.45,
1,106.89,268.0,486.69,274.9,0.7,173.1,3897.0,3661.93,272.7,382.42,...,265.45,1.43,2951.82,563135.78,240.11,2.13,0.0,0.0,0.49,
2,107.2,279.0,501.2,277.9,0.7,177.4,4124.0,3746.63,268.1,382.76,...,268.86,1.42,2952.92,563536.65,242.29,0.51,0.0,0.0,0.52,
3,110.25,395.0,539.65,279.1,1.3,228.6,6735.0,6449.18,262.2,383.54,...,263.95,1.85,2954.02,563928.77,245.69,1.77,0.0,0.0,0.56,
4,110.5,404.0,539.89,281.9,1.1,250.3,7873.0,7567.35,254.5,385.33,...,270.6,3.85,2955.12,564311.43,245.81,0.49,0.0,0.0,0.59,


The Luthi et al. datasets have matching column names, so I merge them first. The NOAA and Jouzel data do not share those column names, so it may not be possible to merge them in the same way.
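When several frames share the same column names, one alternative to a key-based merge is to stack them row-wise with `pd.concat`. The sketch below uses small made-up frames (`part_a`, `part_b` and `other` are hypothetical stand-ins, not the project files) and also shows what happens when the column names do not match. It is offered as an option to consider, not as the method used in this notebook.

```python
import pandas as pd

# Toy stand-ins for two files with identical column names (values are illustrative).
part_a = pd.DataFrame({"Depth (m)": [102.83, 106.89], "CO2 (ppmv)": [280.4, 274.9]})
part_b = pd.DataFrame({"Depth (m)": [149.1, 173.1], "CO2 (ppmv)": [284.7, 272.7]})

# Stack the rows; ignore_index=True renumbers the rows 0..n-1.
stacked = pd.concat([part_a, part_b], ignore_index=True)

# A frame with different columns can still be concatenated:
# columns missing from either side are filled with NaN.
other = pd.DataFrame({"temp": [218.3]})
mixed = pd.concat([stacked, other], ignore_index=True)

print(stacked.shape)
print(mixed.columns.tolist())
```

Stacking keeps one row per original measurement, whereas a merge pairs rows up by key, so the two operations answer different questions.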

In [7]:
# Outer merge of df1 and df2 on the shared CO2 column (Python for MBAs)
df_fuse1 = pd.merge(df1,
                    df2,
                    on='CO2 (ppmv)',
                    how='outer')

[1]

In [8]:
# Outer merge of the previous result with df3 on the shared CO2 column
df_fuse2 = pd.merge(df_fuse1,
                    df3,
                    on='CO2 (ppmv)',
                    how='outer')

In [9]:
# Outer merge of the previous result with the NOAA data (df5) on the CO2 column
df_fuse3 = pd.merge(df_fuse2,
                    df5,
                    on='CO2 (ppmv)',
                    how='outer')

MemoryError: Unable to allocate 12.5 GiB for an array with shape (103, 16341406) and data type float64
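A likely reason for a result this large is that the merge key is a measured value that repeats: when a key occurs m times on the left and n times on the right, the merge produces m × n rows for that key. The sketch below reproduces the effect on small made-up frames (`left`, `right` and their values are hypothetical) and estimates the row count before merging. This is an assumption about the cause, not a confirmed diagnosis of the frames above.

```python
import pandas as pd

# Toy frames in which the key value 280.0 repeats on both sides.
left = pd.DataFrame({"CO2 (ppmv)": [280.0, 280.0, 280.0, 275.0], "a": [1, 2, 3, 4]})
right = pd.DataFrame({"CO2 (ppmv)": [280.0, 280.0, 270.0], "b": [10, 20, 30]})

# Estimate the size of an outer merge from the key frequencies alone.
left_counts = left["CO2 (ppmv)"].value_counts()
right_counts = right["CO2 (ppmv)"].value_counts()
shared = left_counts.index.intersection(right_counts.index)
matched_rows = int((left_counts[shared] * right_counts[shared]).sum())
left_only = int(left_counts.drop(shared).sum())
right_only = int(right_counts.drop(shared).sum())
expected_rows = matched_rows + left_only + right_only

merged = pd.merge(left, right, on="CO2 (ppmv)", how="outer")
print(expected_rows, len(merged))

# validate= makes pandas raise instead of silently multiplying rows.
try:
    pd.merge(left, right, on="CO2 (ppmv)", how="outer", validate="one_to_one")
    duplicate_keys_detected = False
except pd.errors.MergeError:
    duplicate_keys_detected = True
print(duplicate_keys_detected)
```

pandas also matches missing key values with each other during a merge, and `value_counts()` ignores them by default, so a key column containing many NaN entries would need to be counted separately in an estimate like this.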

In [None]:
df_fuse2.head()

Now that I have fused some of the data, I can compare columns and decide what information I require for this project. F. Parrenin et al.'s paper [2] shows that the EDC3 ages, and the variants of them in the other datasets, are given in years before AD 1950, which clears things up for me.
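Since these ages count years before AD 1950, they can be converted to calendar years with `year = 1950 - age`. The sketch below applies this to a few illustrative age values; the frame `ages` and the new column name `Year (CE)` are hypothetical and only used for the example.

```python
import pandas as pd

# Illustrative ages in years before AD 1950; a negative age lies after 1950.
ages = pd.DataFrame({"Gasage (EDC3, yr BP)": [137.0, 268.0, -50.0]})

# Convert to calendar years: an age of 0 corresponds to AD 1950.
ages["Year (CE)"] = 1950 - ages["Gasage (EDC3, yr BP)"]

print(ages)
```

Putting every dataset on the same calendar-year axis in this way is one possible route to comparing records that do not share column names.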

### Conclusion
***

### References

[1] Python for MBAs, Griffel & Guetta, Columbia Business School Publishing, 2021, eBook Academic Collection (EBSCOhost), (https://web.s.ebscohost.com/ehost/ebookviewer/ebook/ZTAwMHh3d19fMjQ1ODcyM19fQU41?sid=9d53254f-59d9-4f57-baa5-1b1ed8837cce@redis&vid=3&format=EB), chapter 7.6 JOINS IN PANDAS, last accessed 20/12/23

[2] The EDC3 chronology for the EPICA Dome C ice core, F. Parrenin et al., 2007 (https://cp.copernicus.org/articles/3/485/2007/cp-3-485-2007.pdf), last accessed 20/12/23

***
## The End