<a href="https://colab.research.google.com/github/AmeerTechsoft/Data-science-project/blob/main/IoT_Weather_Predictor_.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **IoT Temperature and Humidity Predictor using AutoTS Algorithm**
AutoTS is an automated time series forecasting library for Python that uses a genetic algorithm to search over models, transformations, and parameters for a given dataset. It builds on common scientific Python libraries such as pandas and scikit-learn and provides a simple interface for training and evaluating time series models.

AutoTS can handle various time series forecasting tasks, including univariate, multivariate, and forecasting with missing values. It also provides a range of models, including classical statistical models such as ARIMA and exponential smoothing, as well as machine learning models such as Random Forests and XGBoost.

One of the key benefits of AutoTS is its ability to automatically select the best model parameters without requiring manual tuning. This is accomplished by using a genetic algorithm to search for the best model parameters based on a given set of evaluation metrics. This makes it particularly useful for data scientists and analysts who may not have deep expertise in time series modeling but still need to generate accurate forecasts.

Overall, AutoTS is a powerful and flexible tool for time series forecasting that can help users save time and effort in selecting and training models.

In [None]:
# Installing AutoTS
!pip install autots

In [None]:
# Declaring libraries

import pandas as pd
import matplotlib.pyplot as plt
from autots import AutoTS
from autots.datasets import load_daily

In [None]:
CHANNEL_ID = 2039086

# Load data from ThingSpeak channel as .CSV into dataframe
data = pd.read_csv(f'https://api.thingspeak.com/channels/{CHANNEL_ID}/feeds.csv?&results=10000')

# Shows what the dataframe contains
data.info()

In [None]:
data


In [None]:
# Convert field1 and field2 to float; unparseable entries become NaN

data['field1'] = pd.to_numeric(data['field1'], errors='coerce').astype(float)

data['field2'] = pd.to_numeric(data['field2'], errors='coerce').astype(float)
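
The effect of `errors='coerce'` in the conversion above can be seen on a small toy series. This is only a sketch: the values below are made up and do not come from the ThingSpeak channel.

```python
import pandas as pd

# Hypothetical raw readings as they might arrive from a CSV export
raw = pd.Series(["27.5", "28.1", "error", None], name="field1")

# errors="coerce" turns anything that cannot be parsed into NaN
clean = pd.to_numeric(raw, errors="coerce")

print(clean.isna().sum())       # number of unusable readings
print(clean.dropna().tolist())  # readings that survived the conversion
```

Counting the NaNs right after the conversion is what makes the later `isna().sum()` check meaningful.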

In [None]:
# Check for empty fields

data.isna().sum()

In [None]:
# Dropping rows with empty fields
data.dropna(inplace=True)

In [None]:
# Check for empty fields
data.isna().sum()

In [None]:
import pytz

timezone_wca = pytz.timezone('Africa/Lagos')


# Convert timestamp to datetime format
data['created_at'] = pd.to_datetime(data['created_at']).dt.tz_convert(timezone_wca)

# Set timestamp as index
data.set_index('created_at', inplace=True)
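
As a rough illustration of the timezone step, the sketch below converts two made-up timestamps to West Africa Time. It assumes the feed reports `created_at` in UTC; the timestamps themselves are hypothetical.

```python
import pandas as pd

# Hypothetical UTC timestamps in ISO 8601 form
stamps = pd.Series(["2023-03-24T19:00:00Z", "2023-03-24T23:30:00Z"])

# Parse as timezone-aware UTC, then convert to Africa/Lagos (UTC+1, no DST)
utc = pd.to_datetime(stamps, utc=True)
local = utc.dt.tz_convert("Africa/Lagos")

# Month-day hour labels of the kind used on the plot axes
labels = local.dt.strftime("%m-%d %H").tolist()
print(labels)
```

Note that the second reading crosses midnight once it is converted, so it lands on the next local day.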


In [None]:
data.info()

In [None]:
# Converting the data to hourly means to make the series regular;
# hours without readings are filled with the last known value

data_hourly = data.resample("H").mean().ffill()
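
To make the resampling step concrete, here is a minimal sketch on three made-up readings with irregular timestamps. The `"60min"` rule is used here instead of `"H"` only to sidestep alias differences between pandas versions.

```python
import pandas as pd

# Hypothetical irregular sensor readings: two in the 10:00 hour, none at 11:00
readings = pd.Series(
    [20.0, 22.0, 25.0],
    index=pd.to_datetime(
        ["2023-03-24 10:05", "2023-03-24 10:35", "2023-03-24 12:10"]
    ),
    name="field1",
)

# Average within each hour, then carry the last value into empty hours
hourly = readings.resample("60min").mean().ffill()
print(hourly)
```

The empty 11:00 bucket inherits the 10:00 mean, which is what forward filling means in this pipeline.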

In [None]:
# Select temperature data and humidity data from dataframe
temp_data = data_hourly['field1']
hum_data = data_hourly['field2']

In [None]:
data_hourly.info()

## The **AutoTS** object is an instance of the `AutoTS` class from the AutoTS forecasting library. It automatically searches for the best forecasting model for a given time series dataset by trying different combinations of models and parameters and evaluating them with several metrics.



### In this notebook, the **temp_model** and **hum_model** objects are created using AutoTS with the following parameters:

*   **forecast_length**: the number of time steps to forecast, set to 6 so that one forecast every 4 hours covers a full day.
*   **frequency**: the frequency of the time series, set to 4 hours (`'4H'`).
*   **prediction_interval**: the width of the prediction interval, set to 0.7 for a 70% interval around the forecasts (some cells in this notebook use 0.9 instead).
*   **ensemble**: the type of ensemble method to use, set to 'simple'.
*   **model_list**: the list of models to use, set to 'multivariate' to use models that can handle multiple input variables.
*   **transformer_list**: the list of transformers to use, set to 'superfast' for faster computation.
*   **drop_most_recent**: the number of most recent time steps to drop from the input data, set to 1.
*   **max_generations**: the maximum number of generations for the genetic algorithm that searches for the best models and parameters, set to 5.
*   **num_validations**: the number of cross-validation rounds to run, set to 1.
*   **models_to_validate**: the proportion of models passed on to validation during the search, set to 0.2.
*   **n_jobs**: the number of CPU cores to use for parallel computation, set to 100.
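
The relationship between **forecast_length** and **frequency** is plain arithmetic: the horizon is their product. The sketch below checks this for the two combinations used in this notebook. The helper `forecast_timestamps` and the starting timestamp are hypothetical and are not part of AutoTS.

```python
from datetime import datetime, timedelta


def forecast_timestamps(last_observation, forecast_length, step_hours):
    """Timestamps of the forecast steps that follow the last observation."""
    step = timedelta(hours=step_hours)
    return [last_observation + step * (i + 1) for i in range(forecast_length)]


# Hypothetical last observation; the date is an assumption
last = datetime(2023, 3, 24, 20)

for length, hours in [(6, 4), (12, 2)]:
    stamps = forecast_timestamps(last, length, hours)
    print(length, "steps x", hours, "h ->", stamps[-1] - last)
```

Both settings cover one day, so changing the frequency without adjusting the forecast length would silently change the horizon.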


In [None]:
# Dropping the unused column "entry_id"
data_hourly = data_hourly.drop(['entry_id'], axis=1)

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense, LSTM
from keras.callbacks import EarlyStopping
import requests
from autots import AutoTS
import pickle


# Split data into training and testing sets
train_size = int(len(data_hourly) * 0.7)
train = data_hourly[:train_size]
test = data_hourly[train_size:]
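
The split above is chronological rather than random, which matters for time series because the test period must come after the training period. A minimal sketch on a made-up ten-row frame:

```python
import pandas as pd

# Hypothetical hourly series with 10 observations
idx = pd.date_range("2023-03-24", periods=10, freq="60min")
frame = pd.DataFrame({"field1": range(10)}, index=idx)

# First 70% of the rows for training, the remainder for testing
train_size = int(len(frame) * 0.7)
train_part = frame.iloc[:train_size]
test_part = frame.iloc[train_size:]

print(len(train_part), len(test_part))
print(train_part.index.max() < test_part.index.min())
```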

In [None]:
train['field1'].plot(figsize=(12,5))
test['field1'].plot(figsize=(12, 5))
plt.show();

In [None]:
train_temp = pd.DataFrame(train['field1'])
train_temp

In [None]:
# model for temp
model = AutoTS(
    forecast_length=12,  # 12 steps at 2-hour frequency cover one day
    frequency='2H',
    prediction_interval=0.9,
    ensemble='simple',
    model_list='multivariate',
    transformer_list='superfast',
    drop_most_recent=1,
    max_generations=5,
    num_validations=1,
    models_to_validate=0.2,
    n_jobs=100,
)

In [None]:
model.fit(train)

In [None]:
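# predict() forecasts the steps that follow the training data; its first
# argument is forecast_length, so the test frame must not be passed in
model.predict().forecast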

In [None]:
# info() must be called; printing the bare attribute only shows the method object
train.info()
test.info()

In [None]:
# model for temp
temp_model = AutoTS(
    forecast_length=6,  # Forecasting every 4 hours for a day, so 6 forecasts
    frequency='4H',
    prediction_interval=0.9,
    ensemble='simple',
    model_list='multivariate',
    transformer_list='superfast',
    drop_most_recent=1,
    max_generations=5,
    num_validations=1,
    models_to_validate=0.2,
    n_jobs=100,
)


# model for humidity
hum_model = AutoTS(
    forecast_length=6,  # Forecasting every 4 hours for a day, so 6 forecasts
    frequency='4H',
    prediction_interval=0.9,
    ensemble='simple',
    model_list='multivariate',
    transformer_list='superfast',
    drop_most_recent=1,
    max_generations=5,
    num_validations=1,
    models_to_validate=0.2,
    n_jobs=100,
)


#**Temperature forecast**

###**temp_model.fit(temp_data)**
fits **temp_model** to **temp_data**. During fitting, AutoTS trains the candidate models and selects the one that performs best for the given data and parameters.

###**temp_forecast = temp_model.predict()**
uses the fitted **temp_model** to predict future values and returns a prediction object covering the specified **forecast_length**; its `forecast` attribute holds the point forecasts. Here the forecast length is 6, so the result contains 6 temperature forecasts spaced 4 hours apart.

In [None]:
# Fit on the temperature series described above; fit() returns the fitted AutoTS object
temp_modelling = temp_model.fit(temp_data)
# temp_forecast = temp_model.predict().forecast

In [None]:
temp_modelling

In [None]:
test

In [None]:
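# AutoTS records accuracy metrics for every model it tries during fit();
# results("validation") returns them aggregated over the validation rounds
metrics = temp_model.results("validation")
print(metrics)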

In [None]:
# The first argument of predict() is forecast_length, not input data
temp_forecast = temp_modelling.predict().forecast

In [None]:
temp_forecast

#**Humidity forecast**
###**hum_model.fit(hum_data)**
fits **hum_model** to **hum_data**. During fitting, AutoTS trains the candidate models and selects the one that performs best for the given data and parameters.

###**hum_forecast = hum_model.predict()**
uses the fitted **hum_model** to predict future values and returns a prediction object covering the specified **forecast_length**; its `forecast` attribute holds the point forecasts. Here the forecast length is 6, so the result contains 6 humidity forecasts spaced 4 hours apart.

In [None]:
hum_model.fit(hum_data)
hum_forecast = hum_model.predict().forecast


#**Plotting the Forecasted Temperature and Humidity**

In [None]:
# Plot actual and forecasted temperature and humidity
fig, ax = plt.subplots(figsize=(50, 15))
ax.plot(temp_data.index, temp_data, label='Temperature Actual')
ax.plot(temp_forecast.index, temp_forecast, label='Temperature Forecast')
ax.plot(hum_data.index, hum_data, label='Humidity Actual')
ax.plot(hum_forecast.index, hum_forecast, label='Humidity Forecast')
ax.legend()
plt.show()

#**Plot upper and lower forecast bounds as shaded region**

In [None]:
# Plot upper and lower forecast bounds as shaded regions
fig, ax = plt.subplots(figsize=(20, 6))
# Call predict() once per model and reuse the result for both bounds
temp_pred = temp_model.predict()
hum_pred = hum_model.predict()
temp_up, temp_low = temp_pred.upper_forecast.squeeze(), temp_pred.lower_forecast.squeeze()
hum_up, hum_low = hum_pred.upper_forecast.squeeze(), hum_pred.lower_forecast.squeeze()
# Labels are needed so that ax.legend() has entries to show
ax.fill_between(temp_up.index, temp_low.values, temp_up.values, alpha=0.6, label='Temperature interval')
ax.fill_between(hum_up.index, hum_low.values, hum_up.values, alpha=0.6, label='Humidity interval')

ax.legend()
plt.show()
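
One way to sanity-check the shaded bounds, once real observations for the forecast period are available, is to measure how many of them fall inside the interval. The sketch below uses made-up numbers, and `interval_coverage` is a hypothetical helper, not an AutoTS function. With `prediction_interval=0.9`, a coverage far below 0.9 on enough data would suggest the interval is too narrow.

```python
import pandas as pd


def interval_coverage(actual, lower, upper):
    """Fraction of actual values that fall inside [lower, upper]."""
    inside = (actual >= lower) & (actual <= upper)
    return float(inside.mean())


# Hypothetical observations and forecast bounds on a shared index
actual = pd.Series([1.0, 2.0, 3.0, 4.0])
lower = pd.Series([0.0, 2.5, 2.0, 3.0])
upper = pd.Series([2.0, 3.0, 4.0, 5.0])

coverage = interval_coverage(actual, lower, upper)
print(coverage)
```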

##**The Date and Time format is**

###**03-24 20**
###**MM-DD HH**


---

###**03**: MONTH
###**24**: DAY
###**20**: HOUR



#STOP

#TEST

In [None]:
hum_forecast

In [None]:
import pickle

# Save the fitted temperature and humidity models to disk
with open('temp_model.pkl', 'wb') as file:
    pickle.dump(temp_model, file)

with open('hum_model.pkl', 'wb') as file:
    pickle.dump(hum_model, file)
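
The save step can be checked with a round trip: write the object, read it back, and compare. The sketch below uses a plain dictionary as a stand-in for a fitted model, since any picklable object behaves the same way; the file is written to a temporary folder so nothing is left behind.

```python
import os
import pickle
import tempfile

# Stand-in for a fitted model (hypothetical contents)
fake_model = {"name": "temp_model", "forecast_length": 6, "frequency": "4H"}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "temp_model.pkl")
    with open(path, "wb") as file:
        pickle.dump(fake_model, file)
    with open(path, "rb") as file:
        restored = pickle.load(file)

print(restored == fake_model)
```

Pickle files should only be loaded from trusted sources, and a pickled model is generally safest to reload with the same library versions that wrote it.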

In [None]:
hum_model

In [None]:
import os

# Combine the predictions into a single DataFrame
preds = pd.merge(temp_forecast, hum_forecast, left_index=True, right_index=True)

# Export the predictions to a CSV file, creating the output folder if needed
os.makedirs('predicted', exist_ok=True)
preds.to_csv('predicted/predictions.csv')
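
A small sketch of the merge-and-export step on made-up forecasts. The two frames share a timestamp index, so merging on the index places temperature and humidity side by side. The output goes to a temporary folder here; `to_csv` does not create missing folders on its own.

```python
import os
import tempfile

import pandas as pd

# Hypothetical 4-hourly forecasts sharing one index
idx = pd.date_range("2023-03-25 00:00", periods=3, freq="240min")
temp_fc = pd.DataFrame({"field1": [27.0, 26.5, 28.0]}, index=idx)
hum_fc = pd.DataFrame({"field2": [70.0, 72.0, 65.0]}, index=idx)

# Align the two forecasts on their timestamps
merged = pd.merge(temp_fc, hum_fc, left_index=True, right_index=True)

with tempfile.TemporaryDirectory() as tmp:
    out_dir = os.path.join(tmp, "predicted")
    os.makedirs(out_dir, exist_ok=True)
    out_path = os.path.join(out_dir, "predictions.csv")
    merged.to_csv(out_path)
    written = pd.read_csv(out_path, index_col=0)

print(merged)
```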

In [None]:
temp_forecast = temp_forecast.rename(columns={"field1": "field3"})
hum_forecast = hum_forecast.rename(columns={"field2": "field4"})

In [None]:
hum_forecast.info()

In [None]:
import requests

# Set ThingSpeak parameters
api_key = '2QE7XJFM47BKNNE6'
url = 'https://api.thingspeak.com/update'

# Get the predicted values
# temp_forecast = temp_model.predict().forecast
# hum_forecast = hum_model.predict().forecast


# Construct the payload with only the last forecasted value of each series
payload = {'api_key': api_key,
           'field3': float(temp_forecast.iloc[-1, 0]),
           'field4': float(hum_forecast.iloc[-1, 0])}

# Send the data to ThingSpeak
response = requests.post(url, data=payload)

# Print the response
print(response.content)

print (temp_forecast)
print (hum_forecast)
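
The payload sent to ThingSpeak needs one scalar per field rather than a whole DataFrame. The sketch below builds such a payload from made-up forecasts without making any network call. `build_payload` is a hypothetical helper and `YOUR_WRITE_API_KEY` is a placeholder.

```python
import pandas as pd


def build_payload(api_key, temp_forecast, hum_forecast):
    """Build an update payload from the last forecast row of each frame."""
    return {
        "api_key": api_key,
        "field3": float(temp_forecast.iloc[-1, 0]),
        "field4": float(hum_forecast.iloc[-1, 0]),
    }


# Hypothetical single-column forecast frames
idx = pd.date_range("2023-03-25", periods=3, freq="240min")
temp_fc = pd.DataFrame({"field3": [27.0, 26.5, 28.0]}, index=idx)
hum_fc = pd.DataFrame({"field4": [70.0, 72.0, 65.0]}, index=idx)

payload = build_payload("YOUR_WRITE_API_KEY", temp_fc, hum_fc)
print(payload)
```

Selecting by position with `iloc[-1, 0]` works whether or not the forecast columns have already been renamed.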



In [None]:
plt.figure(figsize=(20, 6))
plt.plot(temp_data.index, temp_data, label='Actual temperature')
plt.plot(temp_forecast.index, temp_forecast, label='Predicted temperature')
plt.legend()
plt.title('Temperature forecast using AutoTS')
plt.xlabel('Date')
plt.ylabel('Temperature')
plt.show()


In [None]:
pip freeze


In [None]:
# Declaring libraries

import pandas as pd
import matplotlib.pyplot as plt
from autots import AutoTS
from autots.datasets import load_daily
import pytz


CHANNEL_ID = 2039086

# Load data from ThingSpeak channel as .CSV into dataframe
data = pd.read_csv(f'https://api.thingspeak.com/channels/{CHANNEL_ID}/feeds.csv?&results=500')

# Convert field1 and field2 to float; unparseable entries become NaN

data['field1'] = pd.to_numeric(data['field1'], errors='coerce').astype(float)

data['field2'] = pd.to_numeric(data['field2'], errors='coerce').astype(float)

data.dropna(inplace=True)

timezone_wca = pytz.timezone('Africa/Lagos')


# Convert timestamp to datetime format
data['created_at'] = pd.to_datetime(data['created_at']).dt.tz_convert(timezone_wca)

# Set timestamp as index
data.set_index('created_at', inplace=True)
data_hourly = data.resample("H").mean().ffill()

# Select temperature data and humidity data from dataframe
temp_data = data_hourly['field1']
hum_data = data_hourly['field2']

# model for temp
temp_model = AutoTS(
    forecast_length=6,  # Forecasting every 4 hours for a day, so 6 forecasts
    frequency='4H',
    prediction_interval=0.7,
    ensemble='simple',
    model_list='multivariate',
    transformer_list='superfast',
    drop_most_recent=1,
    max_generations=5,
    num_validations=1,
    models_to_validate=0.2,
    n_jobs=100,
)


# model for humidity
hum_model = AutoTS(
    forecast_length=6,  # Forecasting every 4 hours for a day, so 6 forecasts
    frequency='4H',
    prediction_interval=0.7,
    ensemble='simple',
    model_list='multivariate',
    transformer_list='superfast',
    drop_most_recent=1,
    max_generations=5,
    num_validations=1,
    models_to_validate=0.2,
    n_jobs=100,

)

temp_model.fit(temp_data)
temp_forecast = temp_model.predict().forecast

hum_model.fit(hum_data)
hum_forecast = hum_model.predict().forecast

import os

# Combine the predictions into a single DataFrame
preds = pd.merge(temp_forecast, hum_forecast, left_index=True, right_index=True)

# Export the predictions to a CSV file, creating the output folder if needed
os.makedirs('predicted', exist_ok=True)
preds.to_csv('predicted/predictions.csv')




In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense, LSTM
from keras.callbacks import EarlyStopping
import requests
from autots import AutoTS
import pickle

# Set up ThingSpeak API key and channel ID
api_key = 'LXOTOE6HKVWNHZ63'
channel_id = '2039086'

# Get data from ThingSpeak
url = 'https://api.thingspeak.com/channels/{}/feeds.json?api_key={}&results=1000'.format(channel_id, api_key)
response = requests.get(url)
data = response.json()['feeds']
df = pd.DataFrame(data)
df['created_at'] = pd.to_datetime(df['created_at'])
df.set_index('created_at', inplace=True)
# Keep only the two sensor fields and convert them to float
df = df[['field1', 'field2']].apply(pd.to_numeric, errors='coerce').astype(float)
# Drop incomplete rows after the conversion, so coerced NaNs are removed
# and empty unused fields cannot wipe out valid rows
df.dropna(inplace=True)
df.columns = ['temperature', 'humidity']
df = df.resample("4H").mean().ffill()

# Normalize data
scaler = MinMaxScaler()
df_scaled = pd.DataFrame(scaler.fit_transform(df), columns=df.columns, index=df.index)

# Split data into training and testing sets
train_size = int(len(df_scaled) * 0.7)
train = df_scaled[:train_size]
test = df_scaled[train_size:]

# Set up AutoTS
# model = AutoTS(
#     forecast_length=6,
#     frequency='4H',
#     ensemble='stacked',
#     max_generations=10,
#     validation_method='backwards',
#     verbose=2
# )
model = AutoTS(
    forecast_length=6,
    frequency='4H',
    ensemble='all',
    max_generations=6,
    validation_method='Seasonal 364',
    verbose=2,
    min_allowed_train_percent=0.8,
    prediction_interval=0.9
)
# Train model on training data
model = model.fit(train)

# Make predictions on testing data
# predictions = model.predict(test)
# predictions = pd.DataFrame(predictions, columns=test.columns, index=test.index)

# # Plot actual values and predicted values
# plt.figure(figsize=(10, 6))
# sns.lineplot(data=test, dashes=False)
# sns.lineplot(data=predictions, dashes=True)
# plt.legend(['Actual Temperature', 'Actual Humidity', 'Predicted Temperature', 'Predicted Humidity'])
# plt.xlabel('Date')
# plt.ylabel('Normalized Value')
# plt.title('Actual vs. Predicted Temperature and Humidity')
# plt.show()

# # Save model
# filename = 'autots_model.sav'
# pickle.dump(model, open(filename, 'wb'))

# # Load model
# loaded_model = pickle.load(open(filename, 'rb'))

# # Use loaded model to make predictions
# new_predictions = loaded_model.predict(test)
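
The normalization in the cell above maps each column to the range 0 to 1, which is what `MinMaxScaler` does with its default settings. A model fitted on scaled data also forecasts in scaled units, so forecasts have to be mapped back before they can be read as degrees or percent. The sketch below reproduces the formula with plain pandas on made-up values.

```python
import pandas as pd

# Hypothetical readings
df_toy = pd.DataFrame(
    {"temperature": [20.0, 25.0, 30.0], "humidity": [40.0, 60.0, 80.0]}
)

# Min-max scaling: (x - min) / (max - min), column by column
col_min, col_max = df_toy.min(), df_toy.max()
scaled = (df_toy - col_min) / (col_max - col_min)

# Inverse transform: back to the original units
restored = scaled * (col_max - col_min) + col_min

print(scaled)
print(restored)
```

With the scikit-learn scaler used in the notebook, the equivalent step is `scaler.inverse_transform`.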


In [None]:
test.index = pd.to_datetime(test.index)

In [None]:
!pip install --upgrade pandas scikit-learn


In [None]:
model

In [None]:
import pickle

# assume your model is named 'model'
with open('model.pkl', 'wb') as file:
    pickle.dump(model, file)


In [None]:
# Forecast the 6 steps that follow the training data
predictions = model.predict(forecast_length=6).forecast
# Compare against the test rows that share the forecast's timestamps
actual = test.reindex(predictions.index)

# Plot actual values and predicted values
plt.figure(figsize=(10, 6))
sns.lineplot(data=actual, dashes=False)
sns.lineplot(data=predictions, dashes=True)
plt.legend(['Actual Temperature', 'Actual Humidity', 'Predicted Temperature', 'Predicted Humidity'])
plt.xlabel('Date')
plt.ylabel('Normalized Value')
plt.title('Actual vs. Predicted Temperature and Humidity')
plt.show()

# # Save model
# filename = 'autots_model.sav'
# pickle.dump(model, open(filename, 'wb'))

# # Load model
# loaded_model = pickle.load(open(filename, 'rb'))

# # Use loaded model to make predictions
# new_predictions = loaded_model.predict(test)

In [None]:
predictions

In [None]:
import pandas as pd
from autots import AutoTS

# Load data from Thingspeak channel
api_key = 'LXOTOE6HKVWNHZ63'
channel_id = '2039086'


df = pd.read_csv(f'https://api.thingspeak.com/channels/{channel_id}/feeds.csv?api_key={api_key}&results=1000')

# Convert timestamp to datetime format
df['created_at'] = pd.to_datetime(df['created_at'])
df['field1'] = pd.to_numeric(df['field1'], errors='coerce').astype(float)

df['field2'] = pd.to_numeric(df['field2'], errors='coerce').astype(float)

# Set timestamp as index
df = df.set_index('created_at')

# Split data into training and testing sets
forecast_length = 6
min_allowed_train_percent = 0.9
train_length = int(len(df) * min_allowed_train_percent)
train = df.iloc[:train_length]
test = df.iloc[train_length:train_length+forecast_length]


# Instantiate AutoTS model
model = AutoTS(
    forecast_length=forecast_length,
    frequency='H',
    prediction_interval=0.9,
    ensemble='all',
    max_generations=3,
    num_validations=1,
    validation_method='backwards',
    model_list=['ARIMA', 'Prophet', 'VAR', 'Theta',]
)

# Train model on training data
model = model.fit(train)

# Forecast the next `forecast_length` steps after the training data.
# The first argument of predict() is forecast_length, so test is not passed in,
# and the forecast attribute is already a DataFrame.
predictions = model.predict().forecast

# Save model
model.save('model.pkl')


In [None]:
import requests
import json
import pandas as pd
import pickle
from datetime import datetime, timedelta
import pytz

# Load temperature and humidity models
with open('temp_model.pkl', 'rb') as f:
    temp_model = pickle.load(f)

with open('hum_model.pkl', 'rb') as f:
    hum_model = pickle.load(f)

# Collect data from ThingSpeak API
CHANNEL_ID = 2039086

# Load data from ThingSpeak channel as .CSV into dataframe
data = pd.read_csv(f'https://api.thingspeak.com/channels/{CHANNEL_ID}/feeds.csv?&results=500')

data['field1'] = pd.to_numeric(data['field1'], errors='coerce').astype(float)

data['field2'] = pd.to_numeric(data['field2'], errors='coerce').astype(float)
data.dropna(inplace=True)
timezone_wca = pytz.timezone('Africa/Lagos')


# Convert timestamp to datetime format
data['created_at'] = pd.to_datetime(data['created_at']).dt.tz_convert(timezone_wca)

# Set timestamp as index
data.set_index('created_at', inplace=True)


data_hourly = data.resample("H").mean().ffill()

temp = data['field1']
hum = data['field2']

# Generate timestamp for current day at 00:00
now = datetime.now()
today = datetime(now.year, now.month, now.day)
timestamps = [today + timedelta(hours=i) for i in range(0, 24, 4)]

# Create DataFrame with features for temperature and humidity models
# temp_features = pd.DataFrame({'hour': [ts.hour for ts in timestamps],
#                               'month': [ts.month for ts in timestamps],
#                               'weekday': [ts.weekday() for ts in timestamps],
#                               'temp_1': temp,
#                               'hum_1': hum})
# hum_features = pd.DataFrame({'hour': [ts.hour for ts in timestamps],
#                              'month': [ts.month for ts in timestamps],
#                              'weekday': [ts.weekday() for ts in timestamps],
#                              'temp_1': temp,
#                              'hum_1': hum})

# Make predictions using the models
# temp_preds = temp_model.predict(temp).forecast
# hum_preds = hum_model.predict(hum).forecast

# Make forecasts with the models loaded above. The first argument of predict()
# is forecast_length, not input data: each model forecasts the steps that
# follow the data it was fitted on.
temp_preds = temp_model.predict().forecast
hum_preds = hum_model.predict().forecast

# Evaluate the results
# actuals = data['target_variable']
# mae = mean_absolute_error(actuals, temp_preds)
# mse = mean_squared_error(actuals, temp_preds)
# mae = mean_absolute_error(actuals, hum_preds)
# mse = mean_squared_error(actuals, hum_preds)
# # rmse = np.sqrt(mse)
# # r2 = r2_score(actuals, forecasts)
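
The evaluation sketched in the comments above needs actual values on the same timestamps as the forecast. Assuming such values are available, the mean absolute error and root mean squared error can be computed directly with pandas. The numbers below are made up, and `mae_rmse` is a hypothetical helper.

```python
import math

import pandas as pd


def mae_rmse(actual, forecast):
    """Mean absolute error and root mean squared error on the shared index."""
    actual, forecast = actual.align(forecast, join="inner")
    errors = actual - forecast
    mae = float(errors.abs().mean())
    rmse = math.sqrt(float((errors ** 2).mean()))
    return mae, rmse


# Hypothetical actuals and forecasts on the same 4-hourly timestamps
idx = pd.date_range("2023-03-25", periods=3, freq="240min")
actual = pd.Series([20.0, 22.0, 24.0], index=idx)
forecast = pd.Series([21.0, 22.0, 22.0], index=idx)

mae, rmse = mae_rmse(actual, forecast)
print(mae, rmse)
```

Aligning on the index first avoids silently comparing values from different timestamps.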


In [None]:
temp_preds