<a href="https://colab.research.google.com/github/Rachitha2908/MachineLearning2023/blob/main/weatherforecasting.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Weather Forecasting**

Weather forecasting is the task of predicting weather conditions for a given location and time. Using historical weather data and a forecasting algorithm, we can predict weather conditions for the next n days.

To forecast weather using Python, we need a dataset containing historical weather data for a particular location. I found a dataset on Kaggle with daily weather data for New Delhi, which we can use for this task. You can download the dataset from here: https://www.kaggle.com/datasets/sumanthvrao/daily-climate-time-series-data

https://thecleverprogrammer.com/2022/10/17/weather-forecasting-using-python/

In [None]:
! pip install kaggle

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [None]:
! mkdir -p ~/.kaggle


In [None]:
! cp kaggle.json ~/.kaggle/

In [None]:
! chmod 600 ~/.kaggle/kaggle.json

In [None]:
! kaggle datasets download sumanthvrao/daily-climate-time-series-data

daily-climate-time-series-data.zip: Skipping, found more recently modified local copy (use --force to force download)


In [None]:
! unzip -o /content/daily-climate-time-series-data.zip

Archive:  /content/daily-climate-time-series-data.zip

In [None]:
# Analyzing weather data
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px

data = pd.read_csv("/content/DailyDelhiClimateTrain.csv")
print(data.head())

         date   meantemp   humidity  wind_speed  meanpressure
0  2013-01-01  10.000000  84.500000    0.000000   1015.666667
1  2013-01-02   7.400000  92.000000    2.980000   1017.800000
2  2013-01-03   7.166667  87.000000    4.633333   1018.666667
3  2013-01-04   8.666667  71.333333    1.233333   1017.166667
4  2013-01-05   6.000000  86.833333    3.700000   1016.500000


In [None]:
#Let’s have a look at the descriptive statistics of this data before moving forward:
print(data.describe()) 

          meantemp     humidity   wind_speed  meanpressure
count  1462.000000  1462.000000  1462.000000   1462.000000
mean     25.495521    60.771702     6.802209   1011.104548
std       7.348103    16.769652     4.561602    180.231668
min       6.000000    13.428571     0.000000     -3.041667
25%      18.857143    50.375000     3.475000   1001.580357
50%      27.714286    62.625000     6.221667   1008.563492
75%      31.305804    72.218750     9.238235   1014.944901
max      38.714286   100.000000    42.220000   7679.333333
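The `meanpressure` column in the summary above has a minimum of about -3.04 and a maximum of about 7679.33, which look more like recording errors than real pressure readings. A minimal sketch for flagging such rows follows. The helper name `flag_implausible_pressure` and the 900 to 1100 bounds are assumptions chosen for illustration; they are not part of the original notebook or the dataset documentation.

```python
import pandas as pd


def flag_implausible_pressure(df, low=900.0, high=1100.0):
    """Return the rows whose meanpressure lies outside [low, high].

    The default bounds are illustrative assumptions, not documented limits.
    """
    mask = (df["meanpressure"] < low) | (df["meanpressure"] > high)
    return df[mask]


# Tiny stand-in frame; in the notebook you would pass `data` instead.
sample = pd.DataFrame(
    {
        "date": ["2013-01-01", "2013-01-02", "2013-01-03"],
        "meanpressure": [1015.67, -3.04, 7679.33],
    }
)
suspicious = flag_implausible_pressure(sample)
print(suspicious)
```

Whether to drop, clip, or keep the flagged rows is a separate decision. Since the forecast later in this notebook uses only `meantemp`, these values do not affect it.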


In [None]:
#Now let’s have a look at the information about all the columns in the dataset:
print(data.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1462 entries, 0 to 1461
Data columns (total 5 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   date          1462 non-null   object 
 1   meantemp      1462 non-null   float64
 2   humidity      1462 non-null   float64
 3   wind_speed    1462 non-null   float64
 4   meanpressure  1462 non-null   float64
dtypes: float64(4), object(1)
memory usage: 57.2+ KB
None


The date column in this dataset does not have a datetime data type; we will convert it when required. Before plotting, let’s check the data for missing values and duplicate rows. After that, we will look at the mean temperature in Delhi over the years using the plotly.express module (usually imported as px), which contains functions that can create entire figures at once.

In [None]:
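# True if any cell in the DataFrame is missing
check_for_null = data.isnull().values.any()
print(check_for_null)
# Boolean mask marking each missing cell
check_for_individual_null = data.isnull()
print(check_for_individual_null)
# True if any row is an exact duplicate of an earlier row
check_for_duplicate = data.duplicated().any()
print(check_for_duplicate)
# Drop rows with missing values and duplicate rows, if there are any
data.dropna(inplace=True)
data.drop_duplicates(inplace=True)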

False
       date  meantemp  humidity  wind_speed  meanpressure
0     False     False     False       False         False
1     False     False     False       False         False
2     False     False     False       False         False
3     False     False     False       False         False
4     False     False     False       False         False
...     ...       ...       ...         ...           ...
1457  False     False     False       False         False
1458  False     False     False       False         False
1459  False     False     False       False         False
1460  False     False     False       False         False
1461  False     False     False       False         False

[1462 rows x 5 columns]
False
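Printing the full boolean mask is hard to read for 1462 rows. A more compact alternative is to count missing values per column and count duplicate rows. The sketch below assumes a hypothetical helper named `data_quality_report` and uses a small stand-in frame so it runs on its own.

```python
import pandas as pd


def data_quality_report(df):
    """Summarize missing values per column and the number of duplicate rows."""
    return {
        "missing_per_column": df.isnull().sum().to_dict(),
        "duplicate_rows": int(df.duplicated().sum()),
    }


# Stand-in frame with one missing value and one duplicated row.
sample = pd.DataFrame(
    {
        "date": ["2013-01-01", "2013-01-02", "2013-01-02", "2013-01-03"],
        "meantemp": [10.0, 7.4, 7.4, None],
    }
)
report = data_quality_report(sample)
print(report)
```

In the notebook, calling `data_quality_report(data)` should report zero missing values and zero duplicates, consistent with the two `False` results above.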


In [None]:
figure = px.line(data, x="date", 
                 y="meantemp", 
                 title='Mean Temperature in Delhi Over the Years')
figure.show()

In [None]:
#Now let’s have a look at the humidity in Delhi over the years:
figure = px.line(data, x="date", 
                 y="humidity", 
                 title='Humidity in Delhi Over the Years')
figure.show()

In [None]:
#Now let’s have a look at the wind speed in Delhi over the years:
figure = px.line(data, x="date", 
                 y="wind_speed", 
                 title='Wind Speed in Delhi Over the Years')
figure.show()

Until 2015, the wind speed was higher during the monsoon (August and September) and the retreating monsoon (December and January). After 2015, the plot shows no such anomalies in wind speed during the monsoon. Now let’s have a look at the relationship between temperature and humidity:

In [None]:
figure = px.scatter(data_frame = data, x="humidity",
                    y="meantemp", size="meantemp", 
                    trendline="ols", 
                    title = "Relationship Between Temperature and Humidity")
figure.show()

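There’s a negative correlation between temperature and humidity in Delhi: days with higher temperatures tend to have lower humidity, and days with lower temperatures tend to have higher humidity. The correlation describes how the two variables move together; it does not by itself show that one causes the other.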
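The strength of this relationship can also be expressed as a single number, the Pearson correlation coefficient, which ranges from -1 to 1. The following is a minimal sketch using pandas `Series.corr`; the helper name `pearson_r` and the sample values are made up for illustration and are not taken from the dataset.

```python
import pandas as pd


def pearson_r(df, x="humidity", y="meantemp"):
    """Pearson correlation coefficient between two numeric columns."""
    return df[x].corr(df[y])


# Made-up values in which temperature falls linearly as humidity rises.
sample = pd.DataFrame(
    {"humidity": [60.0, 70.0, 80.0, 90.0], "meantemp": [20.0, 16.0, 12.0, 8.0]}
)
r = pearson_r(sample)
print(round(r, 3))
```

In the notebook, `pearson_r(data)` would give the coefficient for the real data; a negative value agrees with the downward OLS trendline in the scatter plot.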

**Analyzing Temperature Change**

Now let’s analyze the temperature change in Delhi over the years. For this task, first convert the data type of the date column into datetime. Then add two new columns in the dataset for year and month values.

In [None]:
#Here’s how we can change the data type and extract year and month data from the date column:
data["date"] = pd.to_datetime(data["date"], format = '%Y-%m-%d')
data['year'] = data['date'].dt.year
data["month"] = data["date"].dt.month
print(data.head())
print(data.info())

        date   meantemp   humidity  wind_speed  meanpressure  year  month
0 2013-01-01  10.000000  84.500000    0.000000   1015.666667  2013      1
1 2013-01-02   7.400000  92.000000    2.980000   1017.800000  2013      1
2 2013-01-03   7.166667  87.000000    4.633333   1018.666667  2013      1
3 2013-01-04   8.666667  71.333333    1.233333   1017.166667  2013      1
4 2013-01-05   6.000000  86.833333    3.700000   1016.500000  2013      1
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1462 entries, 0 to 1461
Data columns (total 7 columns):
 #   Column        Non-Null Count  Dtype         
---  ------        --------------  -----         
 0   date          1462 non-null   datetime64[ns]
 1   meantemp      1462 non-null   float64       
 2   humidity      1462 non-null   float64       
 3   wind_speed    1462 non-null   float64       
 4   meanpressure  1462 non-null   float64       
 5   year          1462 non-null   int64         
 6   month         1462 non-null   int64         


Now let’s have a look at the temperature change in Delhi over the years:

Matplotlib provides the module and functions to create a figure. By using the function matplotlib.pyplot.figure() we can create a new figure. Also we can change the visual appearance of the figure by changing its size, color, dpi, etc.

In [None]:
# Matplotlib is used for data visualization; it generates plots and figures
plt.style.use('dark_background')  # name of a built-in style, e.g. 'Solarize_Light2' or 'dark_background'
plt.figure(figsize=(5, 10))  # figsize=(width, height) in inches
plt.title("Temperature Change in Delhi Over the Years")
# Seaborn builds on Matplotlib; lineplot draws the monthly mean temperature with one line per year
sns.lineplot(data = data, x='month', y='meantemp', hue='year')
plt.show()

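The plot suggests a gradual rise in the average temperature of Delhi from year to year. Note that this training file contains only a single observation for 2017 (2017-01-01), so it says nothing about the summer of that year.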
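To check the year-to-year change numerically rather than by eye, one option is to group by year and compute the mean temperature together with the number of observations. This is a sketch only: the helper name `yearly_temperature_summary` and the sample values are assumptions for illustration.

```python
import pandas as pd


def yearly_temperature_summary(df):
    """Mean temperature and number of observations for each calendar year."""
    years = pd.to_datetime(df["date"]).dt.year
    summary = df.groupby(years)["meantemp"].agg(["mean", "count"])
    summary.index.name = "year"
    return summary


# Made-up stand-in values; in the notebook you would pass `data` instead.
sample = pd.DataFrame(
    {
        "date": ["2013-01-01", "2013-07-01", "2014-01-01", "2014-07-01", "2017-01-01"],
        "meantemp": [10.0, 30.0, 12.0, 32.0, 10.0],
    }
)
summary = yearly_temperature_summary(sample)
print(summary)
```

Including the count column makes it obvious when a year is represented by only a few days, in which case its mean should not be compared with full years.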

**Forecasting Weather**

Now let’s move to the task of weather forecasting. I will be using the Facebook Prophet model for this task. Prophet is a widely used technique for time series forecasting. If you have never used this model before, you can install it by running `pip install prophet` in your command prompt or terminal, or `!pip install prophet` in a notebook cell.

Prophet is a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well.
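The additive model described above is usually written as a sum of components. As a sketch in the notation commonly used for Prophet:

$$
y(t) = g(t) + s(t) + h(t) + \varepsilon_t
$$

Here $g(t)$ is the trend, $s(t)$ is the seasonal component (yearly, weekly, and daily), $h(t)$ captures holiday effects, and $\varepsilon_t$ is the error term. No holidays are supplied in this notebook, so only the trend and seasonal terms contribute to the fit.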

In [None]:
#The prophet model accepts time data named as “ds”, and labels as “y”. So let’s convert the data into this format:

forecast_data = data.rename(columns = {"date": "ds", 
                                       "meantemp": "y"})
print(forecast_data)

             ds          y    humidity  wind_speed  meanpressure  year  month
0    2013-01-01  10.000000   84.500000    0.000000   1015.666667  2013      1
1    2013-01-02   7.400000   92.000000    2.980000   1017.800000  2013      1
2    2013-01-03   7.166667   87.000000    4.633333   1018.666667  2013      1
3    2013-01-04   8.666667   71.333333    1.233333   1017.166667  2013      1
4    2013-01-05   6.000000   86.833333    3.700000   1016.500000  2013      1
...         ...        ...         ...         ...           ...   ...    ...
1457 2016-12-28  17.217391   68.043478    3.547826   1015.565217  2016     12
1458 2016-12-29  15.238095   87.857143    6.000000   1016.904762  2016     12
1459 2016-12-30  14.095238   89.666667    6.266667   1017.904762  2016     12
1460 2016-12-31  15.052632   87.000000    7.325000   1016.100000  2016     12
1461 2017-01-01  10.000000  100.000000    0.000000   1016.000000  2017      1

[1462 rows x 7 columns]
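Prophet reads only the `ds` and `y` columns unless extra regressors are explicitly added, so the remaining columns in the frame above are not used for the fit. If you prefer a frame that contains just those two columns, a minimal sketch is shown below. The helper name `to_prophet_frame` is hypothetical, and the sample rows are copied from the first lines of the dataset.

```python
import pandas as pd


def to_prophet_frame(df, date_col="date", value_col="meantemp"):
    """Return a two-column frame with the names Prophet expects: ds and y."""
    out = df[[date_col, value_col]].rename(columns={date_col: "ds", value_col: "y"})
    out["ds"] = pd.to_datetime(out["ds"])  # make sure ds is a datetime column
    return out


sample = pd.DataFrame(
    {
        "date": ["2013-01-01", "2013-01-02"],
        "meantemp": [10.0, 7.4],
        "humidity": [84.5, 92.0],
    }
)
frame = to_prophet_frame(sample)
print(frame)
```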


In [None]:
# Here is how we can use the Facebook Prophet model for weather forecasting using Python:

from prophet import Prophet
from prophet.plot import plot_plotly, plot_components_plotly
model = Prophet()
model.fit(forecast_data)
# Build a frame of dates covering the history plus the next 365 days
forecasts = model.make_future_dataframe(periods=365)
predictions = model.predict(forecasts)
# Interactive plot of the observed values and the forecast
plot_plotly(model, predictions)

INFO:prophet:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.
06:03:28 - cmdstanpy - INFO - Chain [1] start processing
INFO:cmdstanpy:Chain [1] start processing
06:03:28 - cmdstanpy - INFO - Chain [1] done processing
INFO:cmdstanpy:Chain [1] done processing
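The frame returned by `model.predict` contains one row per date, with the point forecast in `yhat` and the uncertainty interval in `yhat_lower` and `yhat_upper`, alongside component columns such as `trend`. To look only at the forecasted days, you can filter on the last observed date. The sketch below uses a hypothetical helper `future_forecast` and a stand-in frame with made-up numbers so that it runs without Prophet installed.

```python
import pandas as pd


def future_forecast(predictions, last_observed):
    """Keep only the rows after the last observed date, with the interval columns."""
    cols = ["ds", "yhat", "yhat_lower", "yhat_upper"]
    is_future = predictions["ds"] > pd.Timestamp(last_observed)
    return predictions.loc[is_future, cols].reset_index(drop=True)


# Stand-in for the Prophet output; the numbers are made up for illustration.
predictions_sample = pd.DataFrame(
    {
        "ds": pd.to_datetime(["2016-12-31", "2017-01-01", "2017-01-02", "2017-01-03"]),
        "trend": [25.0, 25.0, 25.0, 25.0],
        "yhat": [15.0, 14.8, 14.9, 15.1],
        "yhat_lower": [12.0, 11.8, 11.9, 12.1],
        "yhat_upper": [18.0, 17.8, 17.9, 18.1],
    }
)
future = future_forecast(predictions_sample, "2017-01-01")
print(future)
```

In the notebook, `future_forecast(predictions, forecast_data["ds"].max())` would return the 365 forecasted days.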


**Summary**

In this notebook, we analyzed daily weather data for Delhi: the mean temperature, humidity, and wind speed over the years, and the relationship between temperature and humidity. We then used the Prophet model to forecast the mean temperature for the next 365 days. I hope you liked this walkthrough of weather analysis and forecasting using Python.