# FAOSTAT Temperature Change
- Data description
  1) The FAOSTAT Temperature Change domain disseminates statistics of mean surface temperature change by country, with annual updates. The      current dissemination covers the period 1961–2023. Statistics are available for monthly, seasonal and annual mean temperature    anomalies, i.e., temperature change with respect to a baseline climatology, corresponding to the period 1951–1980. The standard deviation of the temperature change of the baseline methodology is also available. Data are based on the publicly available GISTEMP data, the Global Surface Temperature Change data distributed by the National Aeronautics and Space Administration Goddard Institute for Space Studies (NASA-GISS).

- Content

 1) Code - Number of countries/areas covered: In 2019: 190 countries and 37 other territorial entities.
 2) Time coverage: 1961-2023
 3) Periodicity: Monthly, Seasonal, Yearly
 4) Base period: 1951-1980
 5) Unit of Measure: Celsius degrees °C
 6) Reference period: Months, Seasons, Meteorological year

- Inspiration

  1) Climate change is one of the important issues that face the world in this technological era. The best proof of this situation is the       historical temperature change. You can investigate if any hope there is for stopping global warming :)

In [67]:
# import packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
from statsmodels.tsa.statespace.sarimax import SARIMAX
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
plt.style.use('dark_background')
%matplotlib inline

In [8]:
df = pd.read_csv('FAOSTAT_data_1-10-2022.csv')

In [9]:
# display all the columns
df

Unnamed: 0,Domain Code,Domain,Area Code (FAO),Area,Element Code,Element,Months Code,Months,Year Code,Year,Unit,Value,Flag,Flag Description
0,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1961,1961,?C,0.746,Fc,Calculated data
1,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1962,1962,?C,0.009,Fc,Calculated data
2,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1963,1963,?C,2.695,Fc,Calculated data
3,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1964,1964,?C,-5.277,Fc,Calculated data
4,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1965,1965,?C,1.827,Fc,Calculated data
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
229920,ET,Temperature change,181,Zimbabwe,7271,Temperature change,7020,Meteorological year,2016,2016,?C,1.470,Fc,Calculated data
229921,ET,Temperature change,181,Zimbabwe,7271,Temperature change,7020,Meteorological year,2017,2017,?C,0.443,Fc,Calculated data
229922,ET,Temperature change,181,Zimbabwe,7271,Temperature change,7020,Meteorological year,2018,2018,?C,0.747,Fc,Calculated data
229923,ET,Temperature change,181,Zimbabwe,7271,Temperature change,7020,Meteorological year,2019,2019,?C,1.359,Fc,Calculated data


In [21]:
# making a copy of the data 
dfCopy = df.copy()
dfCopy

Unnamed: 0,Domain Code,Domain,Area Code (FAO),Area,Element Code,Element,Months Code,Months,Year Code,Year,Unit,Value,Flag,Flag Description
0,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1961,1961,?C,0.746,Fc,Calculated data
1,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1962,1962,?C,0.009,Fc,Calculated data
2,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1963,1963,?C,2.695,Fc,Calculated data
3,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1964,1964,?C,-5.277,Fc,Calculated data
4,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1965,1965,?C,1.827,Fc,Calculated data
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
229920,ET,Temperature change,181,Zimbabwe,7271,Temperature change,7020,Meteorological year,2016,2016,?C,1.470,Fc,Calculated data
229921,ET,Temperature change,181,Zimbabwe,7271,Temperature change,7020,Meteorological year,2017,2017,?C,0.443,Fc,Calculated data
229922,ET,Temperature change,181,Zimbabwe,7271,Temperature change,7020,Meteorological year,2018,2018,?C,0.747,Fc,Calculated data
229923,ET,Temperature change,181,Zimbabwe,7271,Temperature change,7020,Meteorological year,2019,2019,?C,1.359,Fc,Calculated data


In [11]:
# display number of rows and columns
rows, columns = dfCopy.shape
print(f"Rows: {rows}, columns: {columns}")

Rows: 229925, columns: 14


In [12]:
dfCopy.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 229925 entries, 0 to 229924
Data columns (total 14 columns):
 #   Column            Non-Null Count   Dtype  
---  ------            --------------   -----  
 0   Domain Code       229925 non-null  object 
 1   Domain            229925 non-null  object 
 2   Area Code (FAO)   229925 non-null  int64  
 3   Area              229925 non-null  object 
 4   Element Code      229925 non-null  int64  
 5   Element           229925 non-null  object 
 6   Months Code       229925 non-null  int64  
 7   Months            229925 non-null  object 
 8   Year Code         229925 non-null  int64  
 9   Year              229925 non-null  int64  
 10  Unit              229925 non-null  object 
 11  Value             222012 non-null  float64
 12  Flag              229925 non-null  object 
 13  Flag Description  229925 non-null  object 
dtypes: float64(1), int64(5), object(8)
memory usage: 24.6+ MB


In [13]:
# descriptive
dfCopy.describe()

Unnamed: 0,Area Code (FAO),Element Code,Months Code,Year Code,Year,Value
count,229925.0,229925.0,229925.0,229925.0,229925.0,222012.0
mean,130.647689,7271.0,7009.882353,1991.306248,1991.306248,0.492626
std,76.809008,0.0,6.037955,17.333252,17.333252,1.036364
min,1.0,7271.0,7001.0,1961.0,1961.0,-9.303
25%,64.0,7271.0,7005.0,1976.0,1976.0,-0.071
50%,131.0,7271.0,7009.0,1992.0,1992.0,0.414
75%,194.0,7271.0,7016.0,2006.0,2006.0,0.999
max,351.0,7271.0,7020.0,2020.0,2020.0,11.759


In [14]:
# dfCopy.loc[dfCopy.duplicated()]
# checking duplicate on each column
dfCopy.loc[~dfCopy.duplicated(subset=['Domain', 'Area', 'Element', 'Months', 'Year', 'Flag Description'])].reset_index(drop=True)

Unnamed: 0,Domain Code,Domain,Area Code (FAO),Area,Element Code,Element,Months Code,Months,Year Code,Year,Unit,Value,Flag,Flag Description
0,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1961,1961,?C,0.746,Fc,Calculated data
1,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1962,1962,?C,0.009,Fc,Calculated data
2,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1963,1963,?C,2.695,Fc,Calculated data
3,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1964,1964,?C,-5.277,Fc,Calculated data
4,ET,Temperature change,2,Afghanistan,7271,Temperature change,7001,January,1965,1965,?C,1.827,Fc,Calculated data
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
229920,ET,Temperature change,181,Zimbabwe,7271,Temperature change,7020,Meteorological year,2016,2016,?C,1.470,Fc,Calculated data
229921,ET,Temperature change,181,Zimbabwe,7271,Temperature change,7020,Meteorological year,2017,2017,?C,0.443,Fc,Calculated data
229922,ET,Temperature change,181,Zimbabwe,7271,Temperature change,7020,Meteorological year,2018,2018,?C,0.747,Fc,Calculated data
229923,ET,Temperature change,181,Zimbabwe,7271,Temperature change,7020,Meteorological year,2019,2019,?C,1.359,Fc,Calculated data


In [10]:
# checking the columns unique values
dfCopy['Flag Description'].unique()
# dfCopy['Flag Description'].nunique() # returns the number of unique values on the column

# dfCopy['Area Code (FAO)'].drop_duplicates()

array(['Calculated data', 'Data not available'], dtype=object)

In [22]:
# replacing the ? to - i in the data 
def ReplaceValue(x):
    return x.str.replace("?", "-")
dfCopy['Months'] = ReplaceValue(dfCopy['Months'])

In [23]:
# using lambda function to replace the ? on the Unit
dfCopy['Unit'] = dfCopy['Unit'].apply(lambda x: str(x).replace("?", "°"))

In [17]:
# checking on the datatype of the columns
dfCopy['Flag Description'].dtype

dtype('O')

In [26]:
# checking for Null values
dfCopy.isna().sum()

Domain Code         0
Domain              0
Area Code (FAO)     0
Area                0
Element Code        0
Element             0
Months Code         0
Months              0
Year Code           0
Year                0
Unit                0
Value               0
Flag                0
Flag Description    0
dtype: int64

In [25]:
# Looking ata the null values on the column
dfCopy[dfCopy['Value'].isna()]

# filiing the null values with zero for better Analysis of the data
dfCopy['Value']= dfCopy['Value'].fillna(dfCopy['Value'].mean()).astype(float)

In [27]:
# dropping column Year Code
dfCopy = dfCopy.drop(labels='Year Code', axis=1)

# Extrapolatory Data Analysis
- Performing EDA to find insight - better visulazation on the data
- Correlation on Numerical Values - Discover how the numerical values correlate with each other
- Aggregation - Performing Aggreation on the data to better understand values indepth(mean,count,sum,max,min)
- Groupby - Group related data to perform better agg and insight
- Data Query - filter out the data when needed

In [28]:
dfCopy['Element'].unique()

array(['Temperature change'], dtype=object)

In [29]:
# Dropping Columns
dfCopy = dfCopy.drop(columns=['Area Code (FAO)', 'Element Code', 'Months Code'], axis=1)

In [30]:
dfCopy

Unnamed: 0,Domain Code,Domain,Area,Element,Months,Year,Unit,Value,Flag,Flag Description
0,ET,Temperature change,Afghanistan,Temperature change,January,1961,°C,0.746,Fc,Calculated data
1,ET,Temperature change,Afghanistan,Temperature change,January,1962,°C,0.009,Fc,Calculated data
2,ET,Temperature change,Afghanistan,Temperature change,January,1963,°C,2.695,Fc,Calculated data
3,ET,Temperature change,Afghanistan,Temperature change,January,1964,°C,-5.277,Fc,Calculated data
4,ET,Temperature change,Afghanistan,Temperature change,January,1965,°C,1.827,Fc,Calculated data
...,...,...,...,...,...,...,...,...,...,...
229920,ET,Temperature change,Zimbabwe,Temperature change,Meteorological year,2016,°C,1.470,Fc,Calculated data
229921,ET,Temperature change,Zimbabwe,Temperature change,Meteorological year,2017,°C,0.443,Fc,Calculated data
229922,ET,Temperature change,Zimbabwe,Temperature change,Meteorological year,2018,°C,0.747,Fc,Calculated data
229923,ET,Temperature change,Zimbabwe,Temperature change,Meteorological year,2019,°C,1.359,Fc,Calculated data


In [31]:
# working with Kenya 
Country = dfCopy[dfCopy['Area'] == 'Kenya']
Country

Unnamed: 0,Domain Code,Domain,Area,Element,Months,Year,Unit,Value,Flag,Flag Description
109480,ET,Temperature change,Kenya,Temperature change,January,1961,°C,0.476,Fc,Calculated data
109481,ET,Temperature change,Kenya,Temperature change,January,1962,°C,-0.942,Fc,Calculated data
109482,ET,Temperature change,Kenya,Temperature change,January,1963,°C,-0.334,Fc,Calculated data
109483,ET,Temperature change,Kenya,Temperature change,January,1964,°C,-0.690,Fc,Calculated data
109484,ET,Temperature change,Kenya,Temperature change,January,1965,°C,-0.747,Fc,Calculated data
...,...,...,...,...,...,...,...,...,...,...
110495,ET,Temperature change,Kenya,Temperature change,Meteorological year,2016,°C,1.259,Fc,Calculated data
110496,ET,Temperature change,Kenya,Temperature change,Meteorological year,2017,°C,1.512,Fc,Calculated data
110497,ET,Temperature change,Kenya,Temperature change,Meteorological year,2018,°C,0.635,Fc,Calculated data
110498,ET,Temperature change,Kenya,Temperature change,Meteorological year,2019,°C,1.611,Fc,Calculated data


In [32]:
# New Dataframe
dfKenya = Country[['Area', 'Element', 'Months', 'Year', 'Value']]
dfKenya

Unnamed: 0,Area,Element,Months,Year,Value
109480,Kenya,Temperature change,January,1961,0.476
109481,Kenya,Temperature change,January,1962,-0.942
109482,Kenya,Temperature change,January,1963,-0.334
109483,Kenya,Temperature change,January,1964,-0.690
109484,Kenya,Temperature change,January,1965,-0.747
...,...,...,...,...,...
110495,Kenya,Temperature change,Meteorological year,2016,1.259
110496,Kenya,Temperature change,Meteorological year,2017,1.512
110497,Kenya,Temperature change,Meteorological year,2018,0.635
110498,Kenya,Temperature change,Meteorological year,2019,1.611


In [53]:
dfKenya.reset_index(inplace=True)

In [34]:
# ignoring some rows
columns_ignore = ['Dec-Jan-Feb', 'Mar-Apr-May', 'Jun-Jul-Aug', 'Sep-Oct-Nov', 'Meteorological year']
dfKenya = dfKenya[~dfKenya['Months'].isin(columns_ignore)]
print(dfKenya)

      index   Area             Element    Months  Year  Value
0    109480  Kenya  Temperature change   January  1961  0.476
1    109481  Kenya  Temperature change   January  1962 -0.942
2    109482  Kenya  Temperature change   January  1963 -0.334
3    109483  Kenya  Temperature change   January  1964 -0.690
4    109484  Kenya  Temperature change   January  1965 -0.747
..      ...    ...                 ...       ...   ...    ...
715  110195  Kenya  Temperature change  December  2016  1.261
716  110196  Kenya  Temperature change  December  2017  0.899
717  110197  Kenya  Temperature change  December  2018  1.404
718  110198  Kenya  Temperature change  December  2019  0.968
719  110199  Kenya  Temperature change  December  2020  1.960

[720 rows x 6 columns]


In [35]:
# dropping some columns
dfKenya = dfKenya.drop(columns=['Element', 'Area'])

In [None]:
dfKenya.reset_index('Months', inplace=True)

In [44]:
dfKenya

Unnamed: 0_level_0,level_0,index,Months,Year,Temperature
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
1961-01-01,0,109480,January,1961,0.476
1962-01-01,1,109481,January,1962,-0.942
1963-01-01,2,109482,January,1963,-0.334
1964-01-01,3,109483,January,1964,-0.690
1965-01-01,4,109484,January,1965,-0.747
...,...,...,...,...,...
2016-12-01,715,110195,December,2016,1.261
2017-12-01,716,110196,December,2017,0.899
2018-12-01,717,110197,December,2018,1.404
2019-12-01,718,110198,December,2019,0.968


In [41]:
dfKenya['Date'] = pd.to_datetime(dfKenya['Year'].astype(str) + '-' + dfKenya['Months'], format='%Y-%B')

In [42]:
# Rename Value column to Temperature
dfKenya.rename(columns={'Value':'Temperature'},inplace=True)

In [None]:
dfKenya.set_index('Date', inplace=True)

In [58]:
dfKenya.sort_index(inplace=True)

In [61]:
df_final = dfKenya
df_final

Unnamed: 0_level_0,level_0,index,Months,Year,Temperature
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
1961-01-01,0,109480,January,1961,0.476
1961-02-01,60,109540,February,1961,0.145
1961-03-01,120,109600,March,1961,0.517
1961-04-01,180,109660,April,1961,0.421
1961-05-01,240,109720,May,1961,0.845
...,...,...,...,...,...
2020-08-01,479,109959,August,2020,1.557
2020-09-01,539,110019,September,2020,1.447
2020-10-01,599,110079,October,2020,1.643
2020-11-01,659,110139,November,2020,1.508


In [62]:
df_final = df_final.drop(columns=['level_0','index'], axis=1)

In [78]:
df_final

Unnamed: 0_level_0,Months,Year,Temperature
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
1961-01-01,January,1961,0.476
1961-02-01,February,1961,0.145
1961-03-01,March,1961,0.517
1961-04-01,April,1961,0.421
1961-05-01,May,1961,0.845
...,...,...,...
2020-08-01,August,2020,1.557
2020-09-01,September,2020,1.447
2020-10-01,October,2020,1.643
2020-11-01,November,2020,1.508


In [83]:
# Train/Test Split
train = df_final[df_final.index < pd.Timestamp('2010-01-01')]
test = df_final[df_final.index >= pd.Timestamp('2010-01-01')]

# SARIMA MOdel
def train_sarima(data):
    model = SARIMAX(data, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
    return model.fit()

sarima_models = {}
for month in df_final['Months'].unique():
    data_month = train[train['Months'] == month]['Temperature']
    try:
        model =  train_sarima(data_month)
        sarima_models[month] = model
        print(f'SARIMA model trained successfully for {month}')
    except Exception as e:
        print(f'Error training SARIMA model for {month}: {e}')

# Generate SARIMA forecasts
forecast_sarima_values = {}
for month in df_final['Months'].unique():
    if month in sarima_models:
        try:
            forecast = forecast_sarima(sarima_models[month],forecast_steps)
            forecast_lstm_values[month] = forecast
            print(f'SARIMA forecast generated successfully for {month}')
        except Exception as e:
            print(f'Error generated SARIMA forcast for {month}: {e}')
    else:
        print(f'No SARIMA model found for {month}')

# Generate LSTM forecasts
forecast_lstm_values = {}
for month in df_final['Months'].unique():
    lstm_model, scaler = lstm_models[month]
    data_month  = df_final[df_final['Months'] == month]['Temperature']
    try:
        forecast_lstm_values[month] = forecast_lstm(lstm_model, scaler, data_month, forecast_steps)
        print(f'LSTM forecast generated successfully for {month}')
    except Exception as e:
        print(f'Error generating LSTM forecast for {month}: {e}')

# Debugging: check keys in forecasting dictionaries
print('SARIMA forecast keys:', forecast_sarima_values.keys())
print('LSTM forecast Keys: ', forecast_lstm_values.keys())


# LSTM Model
def train_lstm(data):
    scaler = MinMaxScaler()
    data_scaled = scaler.fit_transform(data.values.reshape(-1, 1))

    X, y = [], []
    for i in range(len(data_scaled)-12):
        X.append(data_scaled[i:i+12])
        y.append(data_scaled[i+12])
    X, y = np.array(X), np.array(y)

    model = Sequential()
    model.add(LSTM(50, activation='relu', input_shape=(12, 1)))
    model.add(Dense(1))
    model.compile(optimizer='adam', loss='mse')
    model.fit(X, y, epochs=100, verbose=0)

    return model, scaler

lstm_models = {}
for month in df_final['Months'].unique():
    data_month = train[train['Months'] == month]['Temperature']
    lstm_models[month] = train_lstm(data_month)

# Forecasting sarima
def forecast_sarima(model, steps):
    try:
        forecast = model.forecast(steps=steps)
        return forecast
    except Exception as e:
        print(f'Error in forecast_sarima: {e}')
        return None

# Forecasting lstm
def forecast_lstm(model, scaler, data, steps):
    data_scaled = scaler.transform(data.values.reshape(-1, 1))
    input_data = data_scaled[-12:].reshape(1, 12, 1)
    
    forecast_scaled = []
    for _ in range(steps):
        forecast_scaled.append(model.predict(input_data)[0,0])
        input_data = np.append(input_data[:, 1:, :], forecast_scaled[-1].reshape(1, 1, 1), axis=1)
        forecast = scaler.inverse_transform(np.array(forecast_scaled).reshape(-1, 1))
    return forecast.flatten()

forecast_steps = 60
forecast_sarima_values = {}
forecast_lstm_values = {}

for month in df_final['Months'].unique():
    sarima_model = sarima_models[month]
    lstm_model, scaler = lstm_models[month]
    data_month = df_final[df_final['Months'] == month]['Temperature']
    
    forecast_lstm_values[month] = forecast_sarima(sarima_model, forecast_steps)
    forecast_lstm_values[month] = forecast_lstm(lstm_model, scaler, data_month, forecast_steps)

  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.


RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            5     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  8.59177D-01    |proj g|=  2.65458D-01

At iterate    5    f=  6.60068D-01    |proj g|=  3.10461D-02

At iterate   10    f=  6.44012D-01    |proj g|=  4.83103D-02

At iterate   15    f=  6.41222D-01    |proj g|=  1.42865D-02

At iterate   20    f=  6.40808D-01    |proj g|=  4.91006D-03

At iterate   25    f=  6.40775D-01    |proj g|=  2.26193D-03

At iterate   30    f=  6.40769D-01    |proj g|=  2.28410D-03

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nac

  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.


RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            5     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  8.52190D-01    |proj g|=  3.03454D-01

At iterate    5    f=  5.67562D-01    |proj g|=  2.61594D-02

At iterate   10    f=  5.52905D-01    |proj g|=  1.32687D-02

At iterate   15    f=  5.49414D-01    |proj g|=  1.02840D-02

At iterate   20    f=  5.48920D-01    |proj g|=  5.58446D-03

At iterate   25    f=  5.48891D-01    |proj g|=  1.04995D-02

At iterate   30    f=  5.48878D-01    |proj g|=  1.26061D-03

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nac

  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.



At iterate   20    f=  6.90404D-01    |proj g|=  1.00510D-02

At iterate   25    f=  6.90007D-01    |proj g|=  1.98688D-03

At iterate   30    f=  6.89983D-01    |proj g|=  1.97989D-03

At iterate   35    f=  6.89982D-01    |proj g|=  2.63318D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    5     38     49      1     0     0   6.499D-05   6.900D-01
  F =  0.68998143384343436     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
SARIMA model trained successfully for March
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            5     M =           10

At X0         0 

  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  warn('Non-invertible starting MA parameters found.'
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.



At iterate   20    f=  8.73171D-01    |proj g|=  3.77558D-03

At iterate   25    f=  8.73101D-01    |proj g|=  6.90805D-04

At iterate   30    f=  8.73092D-01    |proj g|=  1.83801D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    5     30     42      1     0     0   1.838D-04   8.731D-01
  F =  0.87309228660356997     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
SARIMA model trained successfully for April
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            5     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  1.

  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  warn('Non-invertible starting MA parameters found.'
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.



At iterate   10    f=  6.23250D-01    |proj g|=  1.06698D-02

At iterate   15    f=  6.19178D-01    |proj g|=  9.48952D-03

At iterate   20    f=  6.18855D-01    |proj g|=  3.73106D-03

At iterate   25    f=  6.18799D-01    |proj g|=  4.05160D-03



   evaluations in the last line search.  Termination
   may possibly be caused by a bad search direction.
  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.



           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    5     28     52      1     0     0   5.304D-03   6.188D-01
  F =  0.61879305957922326     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
SARIMA model trained successfully for May
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            5     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  5.88150D-01    |proj g|=  2.47020D-01

At iterate    5    f=  5.43236D-01    |proj g|=  4.90348D-02

At iterate   10    f=  4.91397D-01    |proj g|=  1.39358D-01

At iterate   15    f=  4.88

  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.



At iterate    5    f=  5.67002D-01    |proj g|=  1.10131D-01

At iterate   10    f=  5.44959D-01    |proj g|=  6.08384D-03

At iterate   15    f=  5.43001D-01    |proj g|=  9.94392D-03

At iterate   20    f=  5.42775D-01    |proj g|=  3.82093D-03

At iterate   25    f=  5.42754D-01    |proj g|=  5.89072D-04

At iterate   30    f=  5.42742D-01    |proj g|=  5.98028D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    5     34     48      1     0     0   1.357D-04   5.427D-01
  F =  0.54274219904212406     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
SARIMA model trained successfully for July
RUNNING 

  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.



At iterate   15    f=  4.46934D-01    |proj g|=  4.25772D-03

At iterate   20    f=  4.46883D-01    |proj g|=  8.49388D-04

At iterate   25    f=  4.46806D-01    |proj g|=  8.72731D-03

At iterate   30    f=  4.46788D-01    |proj g|=  3.33219D-03

At iterate   35    f=  4.46785D-01    |proj g|=  2.10340D-03



   evaluations in the last line search.  Termination
   may possibly be caused by a bad search direction.
  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.



           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    5     36     61      1     0     0   2.103D-03   4.468D-01
  F =  0.44678509818029438     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
SARIMA model trained successfully for August
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            5     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  5.45565D-01    |proj g|=  2.39271D-01

At iterate    5    f=  4.37845D-01    |proj g|=  3.87428D-02

At iterate   10    f=  4.29047D-01    |proj g|=  3.49361D-02

At iterate   15    f=  4

  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.



At iterate    5    f=  5.33563D-01    |proj g|=  4.21034D-02

At iterate   10    f=  5.10925D-01    |proj g|=  3.35341D-02

At iterate   15    f=  5.08051D-01    |proj g|=  5.26404D-03

At iterate   20    f=  5.07766D-01    |proj g|=  2.36468D-03

At iterate   25    f=  5.07725D-01    |proj g|=  5.54366D-03

At iterate   30    f=  5.07716D-01    |proj g|=  1.38412D-03

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    5     30     43      1     0     0   1.384D-03   5.077D-01
  F =  0.50771616005300491     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
SARIMA model trained successfully for October
RUNNI

  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.



At iterate   15    f=  5.47070D-01    |proj g|=  2.73596D-02

At iterate   20    f=  5.46486D-01    |proj g|=  2.45098D-02

At iterate   25    f=  5.46420D-01    |proj g|=  2.62535D-03

At iterate   30    f=  5.46405D-01    |proj g|=  9.85683D-04

At iterate   35    f=  5.46403D-01    |proj g|=  7.93909D-04

At iterate   40    f=  5.46401D-01    |proj g|=  1.50920D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    5     42     60      1     0     0   7.564D-05   5.464D-01
  F =  0.54640134908130789     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
SARIMA model trained successfully for November
RUNN

  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  warn('Non-invertible starting MA parameters found.'
  warn('Too few observations to estimate starting parameters%s.'
 This problem is unconstrained.



At iterate   25    f=  6.75639D-01    |proj g|=  1.52051D-03

At iterate   30    f=  6.75631D-01    |proj g|=  4.22707D-03

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    5     34     48      1     0     0   6.517D-05   6.756D-01
  F =  0.67563015185474351     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH             
SARIMA model trained successfully for December
SARIMA forecast generated successfully for January
SARIMA forecast generated successfully for February
SARIMA forecast generated successfully for March
SARIMA forecast generated successfully for April
SARIMA forecast generated successfully for May
SARI

  super().__init__(**kwargs)
  super().__init__(**kwargs)
  super().__init__(**kwargs)
  super().__init__(**kwargs)
  super().__init__(**kwargs)
  super().__init__(**kwargs)
  super().__init__(**kwargs)
  super().__init__(**kwargs)
  super().__init__(**kwargs)
  super().__init__(**kwargs)
  super().__init__(**kwargs)
  super().__init__(**kwargs)


[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 272ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 64ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 83ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 76ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 80ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 55ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 70ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 92ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 63ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 54ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 49ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 50ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 74ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 7

In [82]:
# Functio to plot Forecast
def plot_forecast(actual, sarima_forecast, lstm_forecast, month):
    plt.figure(figsize=(10, 8))
    plt.plot(actual.index, actual.values, label='Actual', color='black')
    plt.plot(actual.index[-1] + pd.DateOffset(months=1) + np.arange(1, forecast_steps+1), 
            sarima_forecast, label='SARIMA Forecast', linestyle='--', color='red')
    plt.title(f'Temperature change Forecast for {month} in Kenya')
    plt.xlabel('Year')
    plt.ylabel('Temperature Change')
    plt.legend()
    plt.grid(True)
    plt.show()

# Debugging: checking keys in forecast Dictionaries
print(f'SARIMA Forecast Keys: {forecast_sarima_values.keys()}')
print(f'LSTM Forecast Keys: {forecast_lstm_values.keys()}')
print(f'Unique Months in df_final: {df_final['Months'].unique()}')


# Visualizing forecast for each month
for month in df_final['Months'].unique():
    if month in forecast_sarima_values and month in forecast_lstm_values:
        actual_month = df_final[df_final['Months'] == month]['Temperature']
        plot_forecast(actual_month, forecast_sarima_values[month], forecast_lstm_values[month], month)
    else:
        print(f'Forecast Not found in month: {month}')
    

SARIMA Forecast Keys: dict_keys([])
LSTM Forecast Keys: dict_keys(['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December'])
Unique Months in df_final: ['January' 'February' 'March' 'April' 'May' 'June' 'July' 'August'
 'September' 'October' 'November' 'December']
Forecast Not found in month: January
Forecast Not found in month: February
Forecast Not found in month: March
Forecast Not found in month: April
Forecast Not found in month: May
Forecast Not found in month: June
Forecast Not found in month: July
Forecast Not found in month: August
Forecast Not found in month: September
Forecast Not found in month: October
Forecast Not found in month: November
Forecast Not found in month: December
