<a href="https://colab.research.google.com/github/1MuhammadFarhanAslam/ML-Projects/blob/main/Weather_forecasting.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Overview**
*Weather forecasting is the task of predicting weather conditions for a given location and time. Using historical weather data and a forecasting algorithm, we can predict conditions for the next n days.*

*To forecast weather with Python, we need a dataset of historical weather observations for a particular location.*

*The dataset used here covers 1st January 2013 to 24th April 2017 for the city of Delhi, India. It has 4 parameters:*
**meantemp, humidity, wind_speed, meanpressure.**

# **Mounting Google Drive**

In [None]:
from google.colab import drive

# Mount Google Drive
drive.mount('/content/drive')

# **Connect Google Colab to Kaggle through the Kaggle API**

**To connect Kaggle datasets to Google Colab, you need to follow these steps:**

* 1: Install the Kaggle library in Google Colab by running the following command

In [None]:
!pip install kaggle

**Go to the Kaggle website (https://www.kaggle.com) and sign in to your account (or create a new account if you don't have one).**

*Navigate to the dataset you want to use in your Colab notebook.*

*Click on the "Copy API command" button below the dataset description. This will copy the command to download the dataset using the Kaggle API.*

*In your Colab notebook, import the necessary libraries and set up the Kaggle API by running the following code*

In [None]:
import os
import json

# Upload your Kaggle API key file (kaggle.json) to Colab using the file upload feature
from google.colab import files
files.upload()

# Read the contents of the kaggle.json file
with open('kaggle.json', 'r') as file:
    kaggle_json = json.load(file)
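
Before moving the key file, it can help to check that it has the expected shape. The sketch below assumes the file stores the account name under `username` and the secret under `key`; the helper name `check_kaggle_credentials` is hypothetical and not part of the Kaggle library. It runs against a throwaway file with placeholder values, not real credentials.

```python
import json
import os
import tempfile

def check_kaggle_credentials(path):
    """Return the expected fields that are missing or empty in the key file."""
    with open(path, "r") as f:
        creds = json.load(f)
    # Assumption: the key file uses the field names "username" and "key"
    return [field for field in ("username", "key") if not creds.get(field)]

# Demonstration with a temporary file holding placeholder values
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "kaggle.json")
    with open(demo_path, "w") as f:
        json.dump({"username": "example_user", "key": ""}, f)
    missing = check_kaggle_credentials(demo_path)

print(missing)  # ['key']
```

In the notebook, calling `check_kaggle_credentials('kaggle.json')` right after the upload should return an empty list if both fields are present.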

## Important about Kaggle API Security

**The command !chmod 600 ~/.kaggle/kaggle.json restricts who can access the kaggle.json file.**

*In Linux-based systems, including Google Colab, file permissions are commonly written as a three-digit octal number: the first digit represents the owner's permissions, the second digit represents the group's permissions, and the third digit represents other users' permissions.*

**Here's a breakdown of what chmod 600 does:**

* ***6 means the owner (the user who uploaded the kaggle.json file) has read and write permissions (4 for read plus 2 for write) but no execute permission (which would add 1). The two 0s mean the group and other users have no permission to read, write, or execute the file.***

* ***Setting the mode to 600 therefore ensures that only the owner of the file (the user who uploaded the kaggle.json file) can read or modify it, and no other users (group or others) can access it.***

* **This step is important to maintain the security of your Kaggle API key, as it contains sensitive information and should not be accessible to other users of the system.**
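
The permission digits described above can be inspected from Python with the standard `os` and `stat` modules. This is a minimal sketch on a temporary stand-in file (not the real kaggle.json), and it assumes a POSIX system such as the Linux machine behind Colab.

```python
import os
import stat
import tempfile

# Create a throwaway file standing in for kaggle.json
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    demo_path = tmp.name

# Owner: read (4) + write (2) = 6; group: 0; others: 0
os.chmod(demo_path, 0o600)

# Keep only the permission bits of the file mode
mode = stat.S_IMODE(os.stat(demo_path).st_mode)
owner_bits = (mode >> 6) & 0o7   # first octal digit
group_bits = (mode >> 3) & 0o7   # second octal digit
other_bits = mode & 0o7          # third octal digit

# Human-readable form, as printed by `ls -l`
mode_string = stat.filemode(os.stat(demo_path).st_mode)

print(oct(mode), owner_bits, group_bits, other_bits, mode_string)
os.remove(demo_path)
```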

In [None]:
import shutil

# Move the uploaded kaggle.json file to the directory the Kaggle API reads from
os.makedirs('/root/.kaggle', exist_ok=True)
shutil.move('kaggle.json', '/root/.kaggle/kaggle.json')  # unlike os.rename, also works across filesystems

# Set the appropriate permissions for the Kaggle API key file
os.chmod('/root/.kaggle/kaggle.json', 0o600)

**or**

In [None]:
import os

# Specify the path to the kaggle.json file
kaggle_json_path = os.path.join(os.path.expanduser("~"), ".kaggle", "kaggle.json")

# Check if the kaggle.json file already exists
if os.path.exists(kaggle_json_path):
    print("kaggle.json file already exists.")
else:
    # Create ~/.kaggle if it does not exist; -p avoids an error if it already does
    !mkdir -p ~/.kaggle
    # Move the uploaded kaggle.json into ~/.kaggle/
    !mv kaggle.json ~/.kaggle/
    # Restrict the key file to owner read/write only
    !chmod 600 ~/.kaggle/kaggle.json
    print("kaggle.json file moved and permissions set successfully.")


**Verifying Kaggle API**

In [None]:
# Verify the Kaggle API is working
!kaggle datasets list

**If the Kaggle API is working correctly, you can download the dataset by running the copied API command in your Colab notebook:**

In [None]:
!kaggle datasets download --force sumanthvrao/daily-climate-time-series-data

**The dataset is downloaded as a ZIP file. You can extract it with the following code:**

In [None]:
import zipfile

# Specify the path to the ZIP file
zip_file_path = '/content/daily-climate-time-series-data.zip'

# creating directory to unzip dataset
!mkdir -p /content/daily-climate-time-series-data

# Specify the target directory to extract the files
target_directory = '/content/daily-climate-time-series-data'

# Open the ZIP file
with zipfile.ZipFile(zip_file_path, 'r') as zip_ref:
    # Extract all the files to the target directory
    zip_ref.extractall(target_directory)

print("ZIP file extracted successfully.")


In [None]:
import os

# Specify the directory path
directory_path = '/content/daily-climate-time-series-data'

# Create the directory if it doesn't already exist
if not os.path.exists(directory_path):
    os.makedirs(directory_path)
    print(f"Directory '{directory_path}' created successfully.")
else:
    print(f"Directory '{directory_path}' already exists.")


**Importing essential libraries**

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px

In [None]:
train = pd.read_csv("/content/daily-climate-time-series-data/DailyDelhiClimateTrain.csv")
test = pd.read_csv('/content/daily-climate-time-series-data/DailyDelhiClimateTest.csv')
print(train.head())

In [None]:
print(test.head())

In [None]:
print(train.shape)
print(test.shape)

In [None]:
print(train.describe())

In [None]:
print(test.describe())

In [None]:
train.info()

In [None]:
test.info()

**The date column in this dataset is stored as text rather than as a datetime type. Here’s how we can convert it and extract the year and month from the date column.**

In [None]:
train['date'] = pd.to_datetime(train['date'], format='%Y-%m-%d')
train['year'] = train['date'].dt.year
train['month'] = train['date'].dt.month
print(train.head())

In [None]:
test['date'] = pd.to_datetime(test['date'], format='%Y-%m-%d')
test['year'] = test['date'].dt.year
test['month'] = test['date'].dt.month
print(test.head())

In [None]:
train.tail(5)

In [None]:
test.tail(5)

In [None]:
print(train.info())

In [None]:
print(test.info())

**Let’s have a look at the mean temperature in Delhi over the years**

In [None]:
figure = px.line(train, x="date", 
                 y="meantemp", 
                 title='Mean Temperature in Delhi Over the Years')
figure.show()

**Using matplotlib**

In [None]:
fig, ax = plt.subplots(figsize=(14, 4))  # Adjust the size of the figure as needed

# Plot the data
ax.plot(train['date'], train['meantemp'])

# Set the title, x-label, and y-label for the plot
ax.set_title('Mean Temperature in Delhi Over the Years')
ax.set_xlabel('Date')
ax.set_ylabel('Mean Temperature')

# Add a background color
ax.axhspan(min(train['meantemp']), max(train['meantemp']), facecolor='lightgray', alpha=0.3)
ax.axvspan(min(train['date']), max(train['date']), facecolor='lightgray', alpha=0.3)

plt.show()


**Now let’s have a look at the humidity in Delhi over the years**

In [None]:
figure = px.line(train, x="date", 
                 y="humidity", 
                 title='Humidity in Delhi Over the Years')
figure.show()

**Now let’s have a look at the wind speed in Delhi over the years**

In [None]:
figure = px.line(train, x="date", 
                 y="wind_speed", 
                 title='Wind Speed in Delhi Over the Years')
figure.show()
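
Seasonal patterns in a plot like this can also be checked numerically by averaging the values per year and calendar month. The sketch below uses a small made-up frame in place of `train` (the numbers are illustrative, not taken from the Delhi dataset), and the helper name `monthly_mean` is hypothetical.

```python
import pandas as pd

def monthly_mean(df, value_col):
    """Average value_col per year and month, returned as a year x month table."""
    dates = pd.to_datetime(df["date"])
    with_parts = df.assign(year=dates.dt.year, month=dates.dt.month)
    return with_parts.pivot_table(index="year", columns="month",
                                  values=value_col, aggfunc="mean")

# Made-up values for illustration only
demo = pd.DataFrame({
    "date": ["2013-01-01", "2013-01-02", "2013-08-01", "2014-01-01"],
    "wind_speed": [2.0, 4.0, 9.0, 5.0],
})

table = monthly_mean(demo, "wind_speed")
print(table)
```

With the notebook's data, `monthly_mean(train, "wind_speed")` would give one row per year, which makes it easier to compare the same months across years.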

**Analysis**

Until 2015, wind speed was higher during the monsoon (August and September) and the retreating monsoon (December and January). After 2015, wind speed showed no such anomalies during the monsoon.

Now let’s have a look at the relationship between temperature and humidity:

In [None]:
figure = px.scatter(data_frame = train, x="humidity",
                    y="meantemp", size="meantemp", 
                    trendline="ols", 
                    title = "Relationship Between Temperature and Humidity")
figure.show()

*There’s a negative correlation between temperature and humidity in Delhi. Higher temperatures tend to coincide with lower humidity, and lower temperatures with higher humidity. A correlation alone does not show that one causes the other.*
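
The strength and direction of such a relationship can be quantified with the Pearson correlation coefficient, which pandas computes through `Series.corr`. A minimal sketch on made-up values (not the Delhi data):

```python
import pandas as pd

# Made-up values for illustration only
demo = pd.DataFrame({
    "meantemp": [10.0, 15.0, 20.0, 25.0, 30.0],
    "humidity": [90.0, 80.0, 75.0, 60.0, 50.0],
})

# Series.corr uses the Pearson method by default; the result lies in [-1, 1]
r = demo["meantemp"].corr(demo["humidity"])
print(round(r, 3))
```

For the notebook's data the equivalent call would be `train["meantemp"].corr(train["humidity"])`; a value close to -1 indicates a strong negative linear relationship.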

**Now let’s have a look at the temperature change in Delhi over the years**

In [None]:
from matplotlib import style
print(style.available)

In [None]:
plt.style.use('fivethirtyeight')
plt.figure(figsize=(16, 6))
plt.title("Temperature Change in Delhi Over the Years")
sns.lineplot(data = train, x='month', y='meantemp', hue='year')
plt.show()

***Now let’s move to the task of weather forecasting. I will be using the Prophet model, developed at Facebook, for this task. Prophet is a widely used library for time series forecasting that models trend and seasonality.***

In [None]:
!pip install prophet

***The Prophet model expects the time column to be named “ds” and the target column to be named “y”. So let’s rename the columns accordingly.***

In [None]:
train_data = train.rename(columns = {'date' : 'ds', 'meantemp':'y'})
train_data

In [None]:
test_data = test.rename(columns = {'date' : 'ds', 'meantemp':'y'})
test_data
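
The renamed frames above still carry the other weather columns. As a hedged alternative, the sketch below keeps only the two columns Prophet requires and makes sure `ds` is a datetime. The helper name `to_prophet_frame` and the demo values are made up for illustration.

```python
import pandas as pd

def to_prophet_frame(df, date_col="date", target_col="meantemp"):
    """Return a two-column frame named the way Prophet expects: ds and y."""
    out = df[[date_col, target_col]].rename(columns={date_col: "ds", target_col: "y"})
    out["ds"] = pd.to_datetime(out["ds"])  # ensure a datetime dtype
    return out

# Made-up values for illustration only
demo = pd.DataFrame({
    "date": ["2013-01-01", "2013-01-02"],
    "meantemp": [10.0, 7.5],
    "humidity": [84.5, 92.0],
})

prophet_demo = to_prophet_frame(demo)
print(prophet_demo.dtypes)
```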

# **model.fit() and model.predict()**

In [None]:
from prophet import Prophet

# Create and fit the Prophet model
model = Prophet()
model.fit(train_data)

# Create future dates to forecast
future_dates = model.make_future_dataframe(periods=len(test_data))

# Make predictions
predictions = model.predict(future_dates)


In [None]:
predictions

In [None]:
# Merge predictions with actual values from the test dataset
forecasted_data = predictions[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].merge(test_data, on='ds', how='left')
forecasted_data

In [None]:
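# Print forecast vs. actual values for the forecast horizon only:
# skip the training rows and the final forecast date (rows 1462:1575 for this dataset)
n_train = len(train_data)
print(forecasted_data[['ds', 'yhat', 'y', 'yhat_lower', 'yhat_upper']].iloc[n_train:n_train + len(test_data) - 1])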
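
Once forecasts and actual values sit side by side, the forecast error can be summarised with standard metrics. The sketch below computes the mean absolute error (MAE) and root mean squared error (RMSE) over rows where an actual value exists. The helper name `forecast_errors` is hypothetical and the demo numbers are made up, so they say nothing about how well the model in this notebook performs.

```python
import numpy as np
import pandas as pd

def forecast_errors(df, actual_col="y", predicted_col="yhat"):
    """Return (MAE, RMSE) over rows where an actual value is available."""
    scored = df.dropna(subset=[actual_col])
    errors = scored[predicted_col] - scored[actual_col]
    mae = float(errors.abs().mean())
    rmse = float(np.sqrt((errors ** 2).mean()))
    return mae, rmse

# Made-up values for illustration only; the last row has no actual value
demo = pd.DataFrame({
    "yhat": [20.0, 22.0, 25.0, 30.0],
    "y": [21.0, 20.0, 25.0, None],
})

mae, rmse = forecast_errors(demo)
print(mae, rmse)
```

Applied to the merged frame, this would be `forecast_errors(forecasted_data)`; rows without a matching actual value are dropped before scoring.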

In [None]:
# Visualize the forecasted_data
from prophet.plot import plot_plotly, plot_components_plotly

plot_plotly(model, predictions)

# **Let's make predictions for the next 365 days and check the trend in weather conditions**

In [None]:
from prophet import Prophet

# Create and fit the Prophet model
model = Prophet()
model.fit(train_data)

# Create future dates to forecast
future_dates365 = model.make_future_dataframe(periods=365)  # Adjust the number of forecast steps as needed

# Make predictions
predictions365 = model.predict(future_dates365)


In [None]:
predictions365

In [None]:
# Print the forecasted values
print(predictions365[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail(365))

In [None]:
# Visualize the forecasted values
from prophet.plot import plot_plotly, plot_components_plotly
plot_plotly(model, predictions365)