# FAOSTAT Temperature Change
- Data description
  1) The FAOSTAT Temperature Change domain disseminates statistics of mean surface temperature change by country, with annual updates. The current dissemination covers the period 1961–2023. Statistics are available for monthly, seasonal and annual mean temperature anomalies, i.e., temperature change with respect to a baseline climatology corresponding to the period 1951–1980. The standard deviation of the temperature change of the baseline methodology is also available. Data are based on the publicly available GISTEMP data, the Global Surface Temperature Change data distributed by the National Aeronautics and Space Administration Goddard Institute for Space Studies (NASA-GISS).

- Content

 1) Number of countries/areas covered: 190 countries and 37 other territorial entities (as of 2019).
 2) Time coverage: 1961-2023
 3) Periodicity: Monthly, Seasonal, Yearly
 4) Base period: 1951-1980
 5) Unit of Measure: degrees Celsius (°C)
 6) Reference period: Months, Seasons, Meteorological year

- Inspiration

  1) Climate change is one of the most important issues facing the world in this technological era, and the historical record of temperature change is among the clearest evidence of it. You can investigate whether there is any hope of stopping global warming :)
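
The anomaly definition in the data description (temperature change relative to a 1951–1980 baseline climatology) can be illustrated with a minimal sketch. The series below is invented for illustration, and the calculation is a simplified annual version, not the actual GISTEMP or FAOSTAT procedure.

```python
import pandas as pd

# Hypothetical absolute mean temperatures (°C) for one location; values are made up.
years = list(range(1951, 1991))
absolute = pd.Series([14.0 + 0.01 * (y - 1951) for y in years], index=years)

# Baseline climatology: the mean over the 1951-1980 base period.
baseline = absolute.loc[1951:1980].mean()

# Anomaly = observed value minus the baseline mean.
anomaly = absolute - baseline

print(round(baseline, 3))
print(anomaly.loc[[1961, 1990]].round(3))
```

By construction, the anomalies average to zero over the base period, so positive values mean "warmer than the 1951–1980 average".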

In [133]:
# import packages
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
plt.style.use('dark_background')
%matplotlib inline

In [136]:
df = pd.read_csv('FAOSTAT_data_1-10-2022.csv')

In [6]:
# display the DataFrame (first and last rows, all columns)
df

Unnamed: 0,Domain Code,Domain,Area Code (FAO),Area,Element Code,Element,Months Code,Months,Year Code,Year,Unit,Value,Flag,Flag Description
0,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1961,1961,?C,0.746,Fc,Calculated data
1,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1962,1962,?C,0.009,Fc,Calculated data
2,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1963,1963,?C,2.695,Fc,Calculated data
3,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1964,1964,?C,-5.277,Fc,Calculated data
4,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1965,1965,?C,1.827,Fc,Calculated data
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
229920,ET,Temperature change,181,Zimbabwe,7271,Temperature change,7020,Meteorological year,2016,2016,?C,1.470,Fc,Calculated data
229921,ET,Temperature change,181,Zimbabwe,7271,Temperature change,7020,Meteorological year,2017,2017,?C,0.443,Fc,Calculated data
229922,ET,Temperature change,181,Zimbabwe,7271,Temperature change,7020,Meteorological year,2018,2018,?C,0.747,Fc,Calculated data
229923,ET,Temperature change,181,Zimbabwe,7271,Temperature change,7020,Meteorological year,2019,2019,?C,1.359,Fc,Calculated data


In [137]:
# making a copy of the data 
dfCopy = df.copy()
dfCopy

Unnamed: 0,Domain Code,Domain,Area Code (FAO),Area,Element Code,Element,Months Code,Months,Year Code,Year,Unit,Value,Flag,Flag Description
0,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1961,1961,?C,0.746,Fc,Calculated data
1,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1962,1962,?C,0.009,Fc,Calculated data
2,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1963,1963,?C,2.695,Fc,Calculated data
3,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1964,1964,?C,-5.277,Fc,Calculated data
4,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1965,1965,?C,1.827,Fc,Calculated data
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
229920,ET,Temperature change,181,Zimbabwe,7271,Temperature change,7020,Meteorological year,2016,2016,?C,1.470,Fc,Calculated data
229921,ET,Temperature change,181,Zimbabwe,7271,Temperature change,7020,Meteorological year,2017,2017,?C,0.443,Fc,Calculated data
229922,ET,Temperature change,181,Zimbabwe,7271,Temperature change,7020,Meteorological year,2018,2018,?C,0.747,Fc,Calculated data
229923,ET,Temperature change,181,Zimbabwe,7271,Temperature change,7020,Meteorological year,2019,2019,?C,1.359,Fc,Calculated data


In [10]:
# display number of rows and columns
rows, columns = dfCopy.shape
print(f"Rows: {rows}, columns: {columns}")

Rows: 229925, columns: 14


In [11]:
dfCopy.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 229925 entries, 0 to 229924
Data columns (total 14 columns):
 #   Column            Non-Null Count   Dtype  
---  ------            --------------   -----  
 0   Domain Code       229925 non-null  object 
 1   Domain            229925 non-null  object 
 2   Area Code (FAO)   229925 non-null  int64  
 3   Area              229925 non-null  object 
 4   Element Code      229925 non-null  int64  
 5   Element           229925 non-null  object 
 6   Months Code       229925 non-null  int64  
 7   Months            229925 non-null  object 
 8   Year Code         229925 non-null  int64  
 9   Year              229925 non-null  int64  
 10  Unit              229925 non-null  object 
 11  Value             222012 non-null  float64
 12  Flag              229925 non-null  object 
 13  Flag Description  229925 non-null  object 
dtypes: float64(1), int64(5), object(8)
memory usage: 24.6+ MB


In [12]:
# descriptive statistics of the numerical columns
dfCopy.describe()

Unnamed: 0,Area Code (FAO),Element Code,Months Code,Year Code,Year,Value
count,229925.0,229925.0,229925.0,229925.0,229925.0,222012.0
mean,130.647689,7271.0,7009.882353,1991.306248,1991.306248,0.492626
std,76.809008,0.0,6.037955,17.333252,17.333252,1.036364
min,1.0,7271.0,7001.0,1961.0,1961.0,-9.303
25%,64.0,7271.0,7005.0,1976.0,1976.0,-0.071
50%,131.0,7271.0,7009.0,1992.0,1992.0,0.414
75%,194.0,7271.0,7016.0,2006.0,2006.0,0.999
max,351.0,7271.0,7020.0,2020.0,2020.0,11.759


In [24]:
# dfCopy.loc[dfCopy.duplicated()]
# checking for duplicates: keep only the first row for each combination of the key columns
dfCopy.loc[~dfCopy.duplicated(subset=['Domain', 'Area', 'Element', 'Months', 'Year', 'Flag Description'])].reset_index(drop=True)

Unnamed: 0,Domain Code,Domain,Area Code (FAO),Area,Element Code,Element,Months Code,Months,Year Code,Year,Unit,Value,Flag,Flag Description
0,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1961,1961,?C,0.746,Fc,Calculated data
1,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1962,1962,?C,0.009,Fc,Calculated data
2,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1963,1963,?C,2.695,Fc,Calculated data
3,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1964,1964,?C,-5.277,Fc,Calculated data
4,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1965,1965,?C,1.827,Fc,Calculated data
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
229920,ET,Temperature change,181,Zimbabwe,7271,Temperature change,7020,Meteorological year,2016,2016,?C,1.470,Fc,Calculated data
229921,ET,Temperature change,181,Zimbabwe,7271,Temperature change,7020,Meteorological year,2017,2017,?C,0.443,Fc,Calculated data
229922,ET,Temperature change,181,Zimbabwe,7271,Temperature change,7020,Meteorological year,2018,2018,?C,0.747,Fc,Calculated data
229923,ET,Temperature change,181,Zimbabwe,7271,Temperature change,7020,Meteorological year,2019,2019,?C,1.359,Fc,Calculated data
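
The duplicate check above displays the de-duplicated rows; counting the duplicates directly can be easier to read. A minimal sketch on a made-up frame (the `toy` data and the shorter key are assumptions, not the notebook's data):

```python
import pandas as pd

# Toy frame with one repeated (Area, Months, Year) combination; values are made up.
toy = pd.DataFrame({
    "Area": ["Kenya", "Kenya", "Kenya"],
    "Months": ["January", "January", "February"],
    "Year": [1961, 1961, 1961],
    "Value": [0.476, 0.476, 0.145],
})

key = ["Area", "Months", "Year"]

# duplicated() marks every row that repeats an earlier key combination.
n_duplicates = toy.duplicated(subset=key).sum()

# drop_duplicates() keeps the first occurrence of each key combination.
deduplicated = toy.drop_duplicates(subset=key).reset_index(drop=True)

print(n_duplicates, len(deduplicated))
```

If the count is zero on the real data, the de-duplicated frame has the same number of rows as the original.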


In [181]:
# checking the unique values of a column
dfCopy['Flag Description'].unique()
# dfCopy['Flag Description'].nunique() # returns the number of unique values in the column

# dfCopy['Area Code (FAO)'].drop_duplicates()

array(['Calculated data', 'Data not available'], dtype=object)
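
One plausible follow-up is to check whether the missing `Value` entries line up with the 'Data not available' flag. The sketch below uses a small invented frame; whether the real data behaves this way is an assumption to verify.

```python
import numpy as np
import pandas as pd

# Toy data mimicking the two flag descriptions; values are made up.
toy = pd.DataFrame({
    "Value": [0.746, np.nan, 1.827, np.nan],
    "Flag Description": ["Calculated data", "Data not available",
                         "Calculated data", "Data not available"],
})

# Count missing values per flag description.
missing_by_flag = toy["Value"].isna().groupby(toy["Flag Description"]).sum()
print(missing_by_flag)
```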

In [165]:
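# replacing the "?" with "-" in the Months column (e.g. in the season labels)
def ReplaceValue(x):
    # regex=False makes pandas treat "?" as a literal character, not a regex quantifier
    return x.str.replace("?", "-", regex=False)
dfCopy['Months'] = ReplaceValue(dfCopy['Months'])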

In [166]:
# using a lambda function to replace the "?" in the Unit column with the degree sign
dfCopy['Unit'] = dfCopy['Unit'].apply(lambda x: str(x).replace("?", "°"))
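
A likely reason for the stray "?" characters is a lossy encoding step somewhere before the CSV was saved; that is an assumption, as is the en dash in the season label below. The sketch shows how such characters can turn into "?", and why "?" needs literal (non-regex) matching. In pandas, passing `regex=False` to `Series.str.replace` requests that literal matching.

```python
import re

# Encoding to ASCII with errors="replace" substitutes "?" for unsupported characters.
unit = "°C".encode("ascii", errors="replace").decode("ascii")
season = "Dec–Jan–Feb".encode("ascii", errors="replace").decode("ascii")
print(unit, season)

# As a regular expression, a bare "?" is a quantifier with nothing to repeat.
try:
    re.compile("?")
    pattern_ok = True
except re.error:
    pattern_ok = False

# Plain string replacement treats "?" literally.
fixed = season.replace("?", "-")
print(pattern_ok, fixed)
```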

In [98]:
# checking the datatype of a column
dfCopy['Flag Description'].dtype

dtype('O')

In [168]:
# checking for Null values
dfCopy.isna().sum()

Domain Code         0
Domain              0
Area Code (FAO)     0
Area                0
Element Code        0
Element             0
Months Code         0
Months              0
Year Code           0
Year                0
Unit                0
Value               0
Flag                0
Flag Description    0
dtype: int64

In [114]:
# Looking at the rows with null values in the Value column
dfCopy[dfCopy['Value'].isna()]

# filling the null values with zero to simplify the analysis
# (note: a zero is then treated as a real "no change" observation in later aggregates)
dfCopy['Value'] = dfCopy['Value'].fillna(0).astype(float)
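
Filling missing anomalies with zero is a simplification that can shift aggregates. A small sketch with made-up numbers shows the effect on a mean; leaving the values as NaN is one alternative, since pandas skips NaN in most aggregations by default.

```python
import numpy as np
import pandas as pd

# Made-up anomalies with one missing observation.
values = pd.Series([1.0, np.nan, 2.0])

mean_skipping_nan = values.mean()            # NaN is ignored by default
mean_zero_filled = values.fillna(0).mean()   # the zero counts as an observation

print(mean_skipping_nan, mean_zero_filled)
```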

In [170]:
# dropping column Year Code
dfCopy = dfCopy.drop(labels='Year Code', axis=1)

# Exploratory Data Analysis
- Performing EDA to find insights and visualize the data better
- Correlation on numerical values - discover how the numerical values correlate with each other
- Aggregation - aggregate the data to understand the values in depth (mean, count, sum, max, min)
- Groupby - group related data for better aggregation and insight
- Data query - filter the data when needed
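
The EDA steps listed above can be sketched on a tiny invented frame. The column names mirror the dataset, but the areas and values are placeholders.

```python
import pandas as pd

# Placeholder data; "A" and "B" stand in for real areas.
toy = pd.DataFrame({
    "Area": ["A", "A", "B", "B"],
    "Year": [1961, 2019, 1961, 2019],
    "Value": [0.1, 1.3, -0.2, 1.1],
})

# Groupby + aggregation: summary statistics of Value per year.
per_year = toy.groupby("Year")["Value"].agg(["mean", "min", "max", "count"])

# Data query: keep only rows above a threshold.
warm = toy.query("Value > 1.0")

# Correlation between two numerical columns.
corr = toy["Year"].corr(toy["Value"])

print(per_year)
print(len(warm), round(corr, 3))
```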

In [188]:
dfCopy['Element'].unique()

array(['Temperature change'], dtype=object)

In [195]:
# Dropping columns that are not needed for the analysis
dfCopy = dfCopy.drop(columns=['Domain Code', 'Domain', 'Area Code (FAO)', 'Element Code', 'Months Code'])

In [196]:
dfCopy

Unnamed: 0,Area,Element,Months,Year,Unit,Value,Flag,Flag Description
0,Afghanistan,Temperature change,January,1961,°C,0.746,Fc,Calculated data
1,Afghanistan,Temperature change,January,1962,°C,0.009,Fc,Calculated data
2,Afghanistan,Temperature change,January,1963,°C,2.695,Fc,Calculated data
3,Afghanistan,Temperature change,January,1964,°C,-5.277,Fc,Calculated data
4,Afghanistan,Temperature change,January,1965,°C,1.827,Fc,Calculated data
...,...,...,...,...,...,...,...,...
229920,Zimbabwe,Temperature change,Meteorological year,2016,°C,1.470,Fc,Calculated data
229921,Zimbabwe,Temperature change,Meteorological year,2017,°C,0.443,Fc,Calculated data
229922,Zimbabwe,Temperature change,Meteorological year,2018,°C,0.747,Fc,Calculated data
229923,Zimbabwe,Temperature change,Meteorological year,2019,°C,1.359,Fc,Calculated data


In [197]:
# working with Kenya 
Country = dfCopy[dfCopy['Area'] == 'Kenya']
Country

Unnamed: 0,Area,Element,Months,Year,Unit,Value,Flag,Flag Description
109480,Kenya,Temperature change,January,1961,°C,0.476,Fc,Calculated data
109481,Kenya,Temperature change,January,1962,°C,-0.942,Fc,Calculated data
109482,Kenya,Temperature change,January,1963,°C,-0.334,Fc,Calculated data
109483,Kenya,Temperature change,January,1964,°C,-0.690,Fc,Calculated data
109484,Kenya,Temperature change,January,1965,°C,-0.747,Fc,Calculated data
...,...,...,...,...,...,...,...,...
110495,Kenya,Temperature change,Meteorological year,2016,°C,1.259,Fc,Calculated data
110496,Kenya,Temperature change,Meteorological year,2017,°C,1.512,Fc,Calculated data
110497,Kenya,Temperature change,Meteorological year,2018,°C,0.635,Fc,Calculated data
110498,Kenya,Temperature change,Meteorological year,2019,°C,1.611,Fc,Calculated data


In [198]:
# New DataFrame with only the needed columns (copy to avoid SettingWithCopyWarning on later edits)
dfKenya = Country[['Area', 'Element', 'Months', 'Year', 'Value']].copy()
dfKenya

Unnamed: 0,Area,Element,Months,Year,Value
109480,Kenya,Temperature change,January,1961,0.476
109481,Kenya,Temperature change,January,1962,-0.942
109482,Kenya,Temperature change,January,1963,-0.334
109483,Kenya,Temperature change,January,1964,-0.690
109484,Kenya,Temperature change,January,1965,-0.747
...,...,...,...,...,...
110495,Kenya,Temperature change,Meteorological year,2016,1.259
110496,Kenya,Temperature change,Meteorological year,2017,1.512
110497,Kenya,Temperature change,Meteorological year,2018,0.635
110498,Kenya,Temperature change,Meteorological year,2019,1.611


In [218]:
# renumber the rows from 0 without keeping the old index as a column
dfKenya = dfKenya.reset_index(drop=True)

In [214]:
# keeping only the monthly rows: exclude the seasonal and annual aggregates
periods_to_ignore = ['Dec-Jan-Feb', 'Mar-Apr-May', 'Jun-Jul-Aug', 'Sep-Oct-Nov', 'Meteorological year']
dfKenya = dfKenya[~dfKenya['Months'].isin(periods_to_ignore)]
print(dfKenya)

                  Element    Months  Year  Value
Area                                            
Kenya  Temperature change   January  1961  0.476
Kenya  Temperature change   January  1962 -0.942
Kenya  Temperature change   January  1963 -0.334
Kenya  Temperature change   January  1964 -0.690
Kenya  Temperature change   January  1965 -0.747
...                   ...       ...   ...    ...
Kenya  Temperature change  December  2016  1.261
Kenya  Temperature change  December  2017  0.899
Kenya  Temperature change  December  2018  1.404
Kenya  Temperature change  December  2019  0.968
Kenya  Temperature change  December  2020  1.960

[720 rows x 4 columns]


In [219]:
# dropping some columns
dfKenya = dfKenya.drop(columns=['Element', 'Area'])

In [230]:
# renumber the rows from 0 after filtering
dfKenya = dfKenya.reset_index(drop=True)

In [251]:
dfKenya

Unnamed: 0,Months,Year,Value
0,January,1961,0.476
1,January,1962,-0.942
2,January,1963,-0.334
3,January,1964,-0.690
4,January,1965,-0.747
...,...,...,...
715,December,2016,1.261
716,December,2017,0.899
717,December,2018,1.404
718,December,2019,0.968


In [253]:
# build a datetime column from the year and month name (first day of each month)
dfKenya['Date'] = pd.to_datetime(dfKenya['Year'].astype(str) + '-' + dfKenya['Months'], format='%Y-%B')
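
The date construction can be checked on a few rows in isolation. This sketch assumes English month names and an English locale, since `%B` matches full month names.

```python
import pandas as pd

# A few hypothetical year/month pairs in the same layout as dfKenya.
toy = pd.DataFrame({
    "Year": [1961, 1961, 2020],
    "Months": ["January", "February", "December"],
})

# "%Y-%B" parses strings such as "1961-January"; the day defaults to 1.
toy["Date"] = pd.to_datetime(toy["Year"].astype(str) + "-" + toy["Months"], format="%Y-%B")
print(toy)
```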

In [265]:
# Rename Value column to Temperature
dfKenya.rename(columns={'Value':'Temperature'},inplace=True)

In [None]:
dfKenya.set_index('Date', inplace=True)

In [267]:
dfKenya.sort_index(inplace=True)

In [268]:
df_final = dfKenya
df_final

Date,Months,Year,Temperature
1961-01-01,January,1961,0.476
1961-02-01,February,1961,0.145
1961-03-01,March,1961,0.517
1961-04-01,April,1961,0.421
1961-05-01,May,1961,0.845
...,...,...,...
2020-08-01,August,2020,1.557
2020-09-01,September,2020,1.447
2020-10-01,October,2020,1.643
2020-11-01,November,2020,1.508
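
With a date-indexed monthly series like `df_final`, one possible next step is to smooth or aggregate it before plotting. The sketch below uses a synthetic stand-in frame (`toy`, with invented values) rather than the Kenya data.

```python
import pandas as pd

# Synthetic monthly series standing in for df_final; the values are invented.
dates = pd.date_range("2018-01-01", periods=24, freq="MS")
toy = pd.DataFrame({"Temperature": [0.5] * 12 + [1.5] * 12}, index=dates)

# Annual mean anomaly: group the monthly values by calendar year.
annual = toy.groupby(toy.index.year)["Temperature"].mean()

# 12-month rolling mean to smooth month-to-month variability.
toy["Rolling12"] = toy["Temperature"].rolling(window=12).mean()

print(annual)
print(toy["Rolling12"].dropna().head())
```

The first eleven rolling values are NaN because a full 12-month window is not yet available.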
