# The Warming World: Time Series Analysis of Temperature Patterns in Developed Regions.

![Image](image.png)


# Overview
Developed nations, characterized by advanced industrialization and higher living standards, have played a significant role in shaping the trajectory of global climate change. This project examines temperature rise within these countries.
Through data-driven analysis, we aim to clarify the patterns, causes, and consequences of temperature increases in these regions. We also examine atmospheric carbon dioxide, which is closely tied to global warming and whose concentration has been rising steadily.
By combining these two aspects, we aim to provide a holistic view of climate change in developed nations.


# Business Understanding

In a world where environmental sustainability has become a central concern for governments, industries, and communities, the business implications of climate change are profound.
Companies and organizations operating within developed countries are increasingly aware that they must address climate-related challenges, not only for ethical reasons but also to ensure long-term viability and growth.

Rising temperatures in developed regions bring consequences including more frequent extreme weather events, disruptions to supply chains, increased energy costs, and shifts in consumer behavior. By examining temperature trends in these regions, businesses can better assess the specific risks they face and develop strategies to mitigate them. Additionally, understanding the dynamics of atmospheric carbon dioxide allows companies to anticipate regulatory changes and emissions reduction requirements.


## Objective:
Our main objective for this research is to conduct a comprehensive time series analysis of temperature data in developed regions.
### Goals
- Identify long-term temperature trends and variations within these regions.
- Detect seasonal and annual patterns in temperature fluctuations, and improve the accuracy of seasonal pattern detection.
- Assess the implications of temperature rise on ecosystems, agriculture, and public health within developed regions.
- Develop a predictive model to estimate future temperature trends.
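As a rough illustration of the first two goals, the sketch below estimates a linear trend and a monthly seasonal profile from a monthly series. The function name `trend_and_seasonality` and the synthetic data are assumptions made for this example; they are not part of the project code, and a linear fit is only one of several possible trend models.

```python
import numpy as np
import pandas as pd

def trend_and_seasonality(series: pd.Series):
    """Return (slope_per_year, monthly_climatology) for a monthly series.

    slope_per_year: least-squares linear trend, in series units per year.
    monthly_climatology: mean detrended value per calendar month (1-12).
    """
    series = series.dropna()
    # Elapsed time in fractional years since the first observation.
    years = np.asarray((series.index - series.index[0]).days, dtype=float) / 365.25
    slope, intercept = np.polyfit(years, series.to_numpy(), deg=1)
    # Remove the fitted line, then average what is left by calendar month.
    detrended = series - (slope * years + intercept)
    climatology = detrended.groupby(detrended.index.month).mean()
    return slope, climatology

# Synthetic data (illustrative only): 0.02 degrees/year warming plus an annual cycle.
idx = pd.date_range("1960-01-01", periods=600, freq="MS")
t = np.arange(600) / 12.0
synthetic = pd.Series(10 + 0.02 * t + 8 * np.sin(2 * np.pi * (t - 0.3)), index=idx)

slope, climatology = trend_and_seasonality(synthetic)
```

On the synthetic series the fitted slope should land near the 0.02 used to generate it, and the climatology should peak in the summer months. On real city data the same function would be applied per city, for example after selecting one city and indexing by date.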


## Problem statement
Climate change is an urgent global challenge with far-reaching consequences, and it is particularly significant in developed countries, where industrialization and high carbon emissions have been prevalent.
There remains a significant knowledge gap regarding how shifting temperatures impact industries, ecosystems, and public health in these regions.
This knowledge gap makes it difficult for businesses, policymakers, and the general public to make informed decisions and adapt effectively to the impacts of climate change in developed nations.

The ultimate goal is to develop a predictive model for estimating future temperature trends, providing actionable insights to stakeholders and helping them proactively prepare for climate challenges in developed nations.

# Data Understanding

## CO2 Data (c02):

**Date:** The date of the CO2 measurement.

**Decimal Date:** A numerical representation of the date.

**Average:** The average CO2 concentration in parts per million (ppm) for the given date.

**Interpolated:** Interpolated CO2 concentration.

**Trend:** The trend in CO2 concentration.

**Number of Days:** The number of days associated with the measurement.

The CO2 data consists of 727 entries and provides monthly measurements of CO2 concentration dating back to 1958.

## Temperature Data (temp):

**dt:** The date of the temperature measurement.

**AverageTemperature:** The average temperature for the given date and location.

**AverageTemperatureUncertainty:** Uncertainty associated with the temperature measurement.

**City:** The city where the temperature measurement was taken.

**Country:** The country where the temperature measurement was taken.

**Latitude:** The latitude coordinates of the location.

**Longitude:** The longitude coordinates of the location.

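The temperature data consists of 239,177 entries and provides monthly temperature measurements for various cities and countries. The first rows (Abidjan) start in 1849, while the merged output further down shows records for some cities as early as 1754.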

## Merged Data (merged_data_with_c02):

**Date:** The date associated with CO2 measurements (from c02 data).

**Decimal Date:** A numerical representation of the date (from c02 data).

**Average:** The average CO2 concentration in parts per million (ppm) for the given date (from c02 data).

**Interpolated:** Interpolated CO2 concentration (from c02 data).

**Trend:** The trend in CO2 concentration (from c02 data).

**Number of Days:** The number of days associated with the measurement (from c02 data).

**AverageTemperature:** The average temperature for the given date and location (from temp data).

**AverageTemperatureUncertainty:** Uncertainty associated with the temperature measurement (from temp data).

**City:** The city where the temperature measurement was taken (from temp data).

**Country:** The country where the temperature measurement was taken (from temp data).

**Latitude:** The latitude coordinates of the location (from temp data).

**Longitude:** The longitude coordinates of the location (from temp data).


The merged data combines CO2 concentration data with temperature data by matching the CO2 Date column against the temperature dt column. It contains 66,760 rows once rows with missing CO2 data are filtered out, and 66,607 rows after the remaining missing values are dropped.
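The merge-then-filter logic can be sketched on small toy frames. The frames `co2_toy` and `temp_toy` are illustrative stand-ins, not the real files, and `indicator=True` is an optional extra used here only to show where each row came from.

```python
import pandas as pd

# Toy stand-ins for c02 and temp (illustrative values only).
co2_toy = pd.DataFrame({
    "Date": pd.to_datetime(["1958-03-01", "1958-04-01", "2018-05-01"]),
    "Average": [315.71, 317.45, 411.24],
})
temp_toy = pd.DataFrame({
    "dt": pd.to_datetime(["1849-01-01", "1958-03-01", "1958-03-01", "1958-04-01"]),
    "AverageTemperature": [26.704, 28.449, 19.916, None],
    "City": ["Abidjan", "Abidjan", "Addis Abeba", "Abidjan"],
})

# Outer merge keeps every row from both sides; the `_merge` column records
# whether a row is 'both', 'left_only' (CO2 only) or 'right_only' (temperature only).
outer = pd.merge(co2_toy, temp_toy, left_on="Date", right_on="dt",
                 how="outer", indicator=True)

# Keep rows that carry a CO2 value: matched rows plus CO2-only months.
with_co2 = outer[outer["Average"].notna()]

# Dropping the remaining missing values leaves only complete, matched rows.
complete = with_co2.dropna().drop(columns=["dt", "_merge"])
```

Filtering on CO2 and then dropping all remaining missing values ends up keeping only dates present in both sources with a recorded temperature, which is why the row count shrinks twice.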

## Key Observations:

- The CO2 data provides atmospheric CO2 concentrations over time, with measurements starting in 1958.
- The temperature data covers a wide range of locations and dates. The first rows start in 1849, and some cities have records as early as 1754.
- The merged data combines CO2 and temperature data, which makes it possible to explore how temperature variations relate to CO2 concentration.
- There are missing values in the merged data, particularly in the "AverageTemperature" and "AverageTemperatureUncertainty" columns. These need to be handled during analysis.
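One simple way to start exploring the CO2 and temperature relationship is to average both per calendar year and compute their correlation. The sketch below uses a synthetic frame with the same column names as the merged data; the helper name `annual_co2_temperature` and all values are assumptions for illustration, and a correlation on its own says nothing about causation.

```python
import numpy as np
import pandas as pd

def annual_co2_temperature(df: pd.DataFrame) -> pd.DataFrame:
    """Average CO2 ('Average') and 'AverageTemperature' per calendar year."""
    return (
        df.assign(Year=df["Date"].dt.year)
          .groupby("Year")[["Average", "AverageTemperature"]]
          .mean()
    )

# Synthetic stand-in for the merged data: 20 years of monthly rows where
# temperature rises weakly with CO2 plus random noise.
dates = pd.date_range("1960-01-01", periods=240, freq="MS")
rng = np.random.default_rng(0)
co2 = 315 + 0.1 * np.arange(240)
toy = pd.DataFrame({
    "Date": dates,
    "Average": co2,
    "AverageTemperature": 15 + 0.01 * (co2 - 315) + rng.normal(0, 0.05, 240),
})

yearly = annual_co2_temperature(toy)
# Pearson correlation between the yearly means.
r = yearly["Average"].corr(yearly["AverageTemperature"])
```

Averaging by year removes most of the seasonal cycle, so the correlation reflects the longer-term co-movement rather than month-to-month swings.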

In [1]:
import pandas as pd

In [2]:
c02 = pd.read_csv('co2-mm-mlo_csv.csv')
c02.head()

Unnamed: 0,Date,Decimal Date,Average,Interpolated,Trend,Number of Days
0,1958-03-01,1958.208,315.71,315.71,314.62,-1
1,1958-04-01,1958.292,317.45,317.45,315.29,-1
2,1958-05-01,1958.375,317.5,317.5,314.71,-1
3,1958-06-01,1958.458,-99.99,317.1,314.85,-1
4,1958-07-01,1958.542,315.86,315.86,314.98,-1
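Row 3 of the output above shows an Average of -99.99 while Interpolated holds a plausible 317.1, which suggests -99.99 is a placeholder for a missing monthly mean. A minimal sketch of handling it, assuming that interpretation holds, is to convert the placeholder to NaN and fall back to the interpolated value:

```python
import numpy as np
import pandas as pd

# A few rows copied from the c02.head() output.
sample = pd.DataFrame({
    "Date": ["1958-05-01", "1958-06-01", "1958-07-01"],
    "Average": [317.5, -99.99, 315.86],
    "Interpolated": [317.5, 317.1, 315.86],
})

# Mark the placeholder as missing so it cannot distort means or plots.
sample["Average"] = sample["Average"].replace(-99.99, np.nan)

# Optionally fall back to the interpolated value where the average is missing.
filled = sample["Average"].fillna(sample["Interpolated"])
```

Note that c02.info() reports no nulls in Average precisely because the placeholder is a number, so a null check alone will not catch it.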


In [3]:
c02.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 727 entries, 0 to 726
Data columns (total 6 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   Date            727 non-null    object 
 1   Decimal Date    727 non-null    float64
 2   Average         727 non-null    float64
 3   Interpolated    727 non-null    float64
 4   Trend           727 non-null    float64
 5   Number of Days  727 non-null    int64  
dtypes: float64(4), int64(1), object(1)
memory usage: 34.2+ KB


In [4]:
temp = pd.read_csv('TemperaturesByMajor.csv')

In [5]:
temp.head()


Unnamed: 0,dt,AverageTemperature,AverageTemperatureUncertainty,City,Country,Latitude,Longitude
0,1849-01-01,26.704,1.435,Abidjan,Côte D'Ivoire,5.63N,3.23W
1,1849-02-01,27.434,1.362,Abidjan,Côte D'Ivoire,5.63N,3.23W
2,1849-03-01,28.101,1.612,Abidjan,Côte D'Ivoire,5.63N,3.23W
3,1849-04-01,26.14,1.387,Abidjan,Côte D'Ivoire,5.63N,3.23W
4,1849-05-01,25.427,1.2,Abidjan,Côte D'Ivoire,5.63N,3.23W


In [6]:
temp.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 239177 entries, 0 to 239176
Data columns (total 7 columns):
 #   Column                         Non-Null Count   Dtype  
---  ------                         --------------   -----  
 0   dt                             239177 non-null  object 
 1   AverageTemperature             228175 non-null  float64
 2   AverageTemperatureUncertainty  228175 non-null  float64
 3   City                           239177 non-null  object 
 4   Country                        239177 non-null  object 
 5   Latitude                       239177 non-null  object 
 6   Longitude                      239177 non-null  object 
dtypes: float64(2), object(5)
memory usage: 12.8+ MB
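The info output shows Latitude and Longitude stored as text (for example 5.63N and 3.23W). If numeric coordinates are needed later, one possible conversion is sketched below. The helper `parse_coordinate` is a hypothetical name, and the sign convention (south and west negative) is an assumption, although a common one.

```python
import pandas as pd

def parse_coordinate(value: str) -> float:
    """Convert strings such as '5.63N' or '3.23W' to signed decimal degrees."""
    magnitude, hemisphere = float(value[:-1]), value[-1].upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"Unexpected hemisphere suffix in {value!r}")
    # South and west are negative by convention.
    return -magnitude if hemisphere in ("S", "W") else magnitude

# Usage on a small Series; with the real frame this would be
# temp['Latitude'].map(parse_coordinate), and likewise for Longitude.
coords = pd.Series(["5.63N", "3.23W", "80.50W"]).map(parse_coordinate)
```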


In [7]:
c02['Date'] = pd.to_datetime(c02['Date'])
temp['dt'] = pd.to_datetime(temp['dt'])


In [8]:
merged_data = pd.merge(c02, temp, left_on='Date', right_on='dt', how='outer')


In [9]:
merged_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 239237 entries, 0 to 239236
Data columns (total 13 columns):
 #   Column                         Non-Null Count   Dtype         
---  ------                         --------------   -----         
 0   Date                           66760 non-null   datetime64[ns]
 1   Decimal Date                   66760 non-null   float64       
 2   Average                        66760 non-null   float64       
 3   Interpolated                   66760 non-null   float64       
 4   Trend                          66760 non-null   float64       
 5   Number of Days                 66760 non-null   float64       
 6   dt                             239177 non-null  datetime64[ns]
 7   AverageTemperature             228175 non-null  float64       
 8   AverageTemperatureUncertainty  228175 non-null  float64       
 9   City                           239177 non-null  object        
 10  Country                        239177 non-null  object        
 11  

In [10]:
merged_data.head()

Unnamed: 0,Date,Decimal Date,Average,Interpolated,Trend,Number of Days,dt,AverageTemperature,AverageTemperatureUncertainty,City,Country,Latitude,Longitude
0,1958-03-01,1958.208,315.71,315.71,314.62,-1.0,1958-03-01,28.449,0.214,Abidjan,Côte D'Ivoire,5.63N,3.23W
1,1958-03-01,1958.208,315.71,315.71,314.62,-1.0,1958-03-01,19.916,0.554,Addis Abeba,Ethiopia,8.84N,38.11E
2,1958-03-01,1958.208,315.71,315.71,314.62,-1.0,1958-03-01,27.582,0.371,Ahmadabad,India,23.31N,72.52E
3,1958-03-01,1958.208,315.71,315.71,314.62,-1.0,1958-03-01,12.665,0.53,Aleppo,Syria,36.17N,37.79E
4,1958-03-01,1958.208,315.71,315.71,314.62,-1.0,1958-03-01,17.68,0.392,Alexandria,Egypt,31.35N,30.16E


In [11]:
merged_data.tail()

Unnamed: 0,Date,Decimal Date,Average,Interpolated,Trend,Number of Days,dt,AverageTemperature,AverageTemperatureUncertainty,City,Country,Latitude,Longitude
239232,NaT,,,,,,1754-12-01,-0.752,1.669,New York,United States,40.99N,74.56W
239233,NaT,,,,,,1754-12-01,3.653,2.811,Paris,France,49.03N,2.45E
239234,NaT,,,,,,1754-12-01,4.437,2.257,Rome,Italy,42.59N,13.09E
239235,NaT,,,,,,1754-12-01,-3.179,3.144,Saint Petersburg,Russia,60.27N,29.19E
239236,NaT,,,,,,1754-12-01,-4.544,1.983,Toronto,Canada,44.20N,80.50W


In [12]:
merged_data.isnull().sum()

Date                             172477
Decimal Date                     172477
Average                          172477
Interpolated                     172477
Trend                            172477
Number of Days                   172477
dt                                   60
AverageTemperature                11062
AverageTemperatureUncertainty     11062
City                                 60
Country                              60
Latitude                             60
Longitude                            60
dtype: int64

In [13]:
# Filter rows where 'Average' column is not NaN
merged_data_with_c02 = merged_data[~merged_data['Average'].isna()]

# Now, merged_data_with_c02 contains only rows with CO2 data


In [14]:
merged_data_with_c02.head()

Unnamed: 0,Date,Decimal Date,Average,Interpolated,Trend,Number of Days,dt,AverageTemperature,AverageTemperatureUncertainty,City,Country,Latitude,Longitude
0,1958-03-01,1958.208,315.71,315.71,314.62,-1.0,1958-03-01,28.449,0.214,Abidjan,Côte D'Ivoire,5.63N,3.23W
1,1958-03-01,1958.208,315.71,315.71,314.62,-1.0,1958-03-01,19.916,0.554,Addis Abeba,Ethiopia,8.84N,38.11E
2,1958-03-01,1958.208,315.71,315.71,314.62,-1.0,1958-03-01,27.582,0.371,Ahmadabad,India,23.31N,72.52E
3,1958-03-01,1958.208,315.71,315.71,314.62,-1.0,1958-03-01,12.665,0.53,Aleppo,Syria,36.17N,37.79E
4,1958-03-01,1958.208,315.71,315.71,314.62,-1.0,1958-03-01,17.68,0.392,Alexandria,Egypt,31.35N,30.16E


In [15]:
merged_data_with_c02.tail()

Unnamed: 0,Date,Decimal Date,Average,Interpolated,Trend,Number of Days,dt,AverageTemperature,AverageTemperatureUncertainty,City,Country,Latitude,Longitude
66755,2018-05-01,2018.375,411.24,411.24,407.91,24.0,NaT,,,,,,
66756,2018-06-01,2018.458,410.79,410.79,408.49,29.0,NaT,,,,,,
66757,2018-07-01,2018.542,408.71,408.71,408.32,27.0,NaT,,,,,,
66758,2018-08-01,2018.625,406.99,406.99,408.9,30.0,NaT,,,,,,
66759,2018-09-01,2018.708,405.51,405.51,409.02,29.0,NaT,,,,,,


In [16]:
# Reassign rather than using inplace=True on a filtered slice,
# which is what triggers pandas' SettingWithCopyWarning.
merged_data_with_c02 = merged_data_with_c02.dropna()


In [17]:
# 'dt' duplicates 'Date' after the merge; reassign instead of dropping in place.
merged_data_with_c02 = merged_data_with_c02.drop(columns=['dt'])

merged_data_with_c02.info()


<class 'pandas.core.frame.DataFrame'>
Index: 66607 entries, 0 to 66696
Data columns (total 12 columns):
 #   Column                         Non-Null Count  Dtype         
---  ------                         --------------  -----         
 0   Date                           66607 non-null  datetime64[ns]
 1   Decimal Date                   66607 non-null  float64       
 2   Average                        66607 non-null  float64       
 3   Interpolated                   66607 non-null  float64       
 4   Trend                          66607 non-null  float64       
 5   Number of Days                 66607 non-null  float64       
 6   AverageTemperature             66607 non-null  float64       
 7   AverageTemperatureUncertainty  66607 non-null  float64       
 8   City                           66607 non-null  object        
 9   Country                        66607 non-null  object        
 10  Latitude                       66607 non-null  object        
 11  Longitude           


In [18]:
merged_data_with_c02.head()


Unnamed: 0,Date,Decimal Date,Average,Interpolated,Trend,Number of Days,AverageTemperature,AverageTemperatureUncertainty,City,Country,Latitude,Longitude
0,1958-03-01,1958.208,315.71,315.71,314.62,-1.0,28.449,0.214,Abidjan,Côte D'Ivoire,5.63N,3.23W
1,1958-03-01,1958.208,315.71,315.71,314.62,-1.0,19.916,0.554,Addis Abeba,Ethiopia,8.84N,38.11E
2,1958-03-01,1958.208,315.71,315.71,314.62,-1.0,27.582,0.371,Ahmadabad,India,23.31N,72.52E
3,1958-03-01,1958.208,315.71,315.71,314.62,-1.0,12.665,0.53,Aleppo,Syria,36.17N,37.79E
4,1958-03-01,1958.208,315.71,315.71,314.62,-1.0,17.68,0.392,Alexandria,Egypt,31.35N,30.16E
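The merged frame still contains cities from all regions (the first rows are Abidjan, Addis Abeba, and so on), while the analysis targets developed regions. A filter along the following lines could narrow it down. The `DEVELOPED_COUNTRIES` set is a hypothetical, incomplete placeholder, since the source does not state which countries it treats as developed.

```python
import pandas as pd

# Hypothetical, incomplete list used only for illustration; the source does
# not define which countries count as "developed".
DEVELOPED_COUNTRIES = {"United States", "France", "Italy", "Canada"}

def filter_developed(df: pd.DataFrame, countries=DEVELOPED_COUNTRIES) -> pd.DataFrame:
    """Keep only rows whose Country is in the given set (returns a copy)."""
    return df[df["Country"].isin(countries)].copy()

# Toy rows with cities that appear in the outputs of this notebook.
toy = pd.DataFrame({
    "City": ["Abidjan", "New York", "Paris", "Toronto"],
    "Country": ["Côte D'Ivoire", "United States", "France", "Canada"],
    "AverageTemperature": [28.449, -0.752, 3.653, -4.544],
})
developed = filter_developed(toy)
```

Returning a copy keeps later in-place edits on the filtered frame from raising SettingWithCopyWarning.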


In [20]:
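# Note: without parentheses this only displays the bound method (and the frame's repr);
# call merged_data_with_c02.value_counts() to actually count distinct rows.
merged_data_with_c02.value_counts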


<bound method DataFrame.value_counts of             Date  Decimal Date  Average  Interpolated   Trend  Number of Days  \
0     1958-03-01      1958.208   315.71        315.71  314.62            -1.0   
1     1958-03-01      1958.208   315.71        315.71  314.62            -1.0   
2     1958-03-01      1958.208   315.71        315.71  314.62            -1.0   
3     1958-03-01      1958.208   315.71        315.71  314.62            -1.0   
4     1958-03-01      1958.208   315.71        315.71  314.62            -1.0   
...          ...           ...      ...           ...     ...             ...   
66662 2013-09-01      2013.708   393.45        393.45  396.99            27.0   
66664 2013-09-01      2013.708   393.45        393.45  396.99            27.0   
66671 2013-09-01      2013.708   393.45        393.45  396.99            27.0   
66683 2013-09-01      2013.708   393.45        393.45  396.99            27.0   
66696 2013-09-01      2013.708   393.45        393.45  396.99        