In [2]:
import pandas as pd
from neuralprophet import NeuralProphet
from matplotlib import pyplot as plt

pandas is a powerful data manipulation library.

NeuralProphet is a forecasting tool based on neural networks.

matplotlib.pyplot is used for plotting graphs.

1. Read in Data and Process Dates

In [None]:
df = pd.read_csv('weatherAUS.csv')
df.head()

The first line uses pandas to read the CSV file weatherAUS.csv and load its contents into a DataFrame called df.

The read_csv function parses the file into a DataFrame, a table-like data structure in pandas, and df.head() displays its first five rows.
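
As a minimal sketch of what read_csv produces, the snippet below parses a small inline CSV instead of weatherAUS.csv. The three rows and their values are invented for illustration; only the column names Date, Location, and Temp3pm come from this notebook.

```python
import io

import pandas as pd

# Hypothetical sample standing in for weatherAUS.csv; the values are invented.
csv_text = """Date,Location,Temp3pm
2008-07-01,Melbourne,13.5
2008-07-02,Melbourne,
2008-07-01,Sydney,17.1
"""

# read_csv accepts a file path or any file-like object.
sample = pd.read_csv(io.StringIO(csv_text))

print(sample.shape)   # (rows, columns)
print(sample.dtypes)  # Date is read as text, not datetime, unless it is parsed
print(sample.head())  # empty fields become NaN
```

The empty Temp3pm field in the second row is read as a missing value, which is why a later step drops NaN rows before modelling.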

In [None]:
df.Location.unique()

The code df.Location.unique() is used to find all the unique values in the Location column of your DataFrame df. This is helpful to see all the different locations for which you have weather data in your dataset.
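
If the number of rows per location is also of interest, value_counts() complements unique(). The sketch below uses an invented four-row Location column rather than the real dataset.

```python
import pandas as pd

# Hypothetical miniature of the Location column; the rows are invented.
sample = pd.DataFrame({"Location": ["Melbourne", "Sydney", "Melbourne", "Perth"]})

locations = sample["Location"].unique()     # distinct values, in order of first appearance
counts = sample["Location"].value_counts()  # number of rows per location

print(list(locations))
print(counts)
```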

In [None]:
df.columns

The code df.columns returns the column labels of the DataFrame df. This is useful for quickly seeing all the column names in your dataset.

In [None]:
df.dtypes

The code df.dtypes returns the data types of each column in the DataFrame df. This is useful for understanding the kind of data stored in each column, which can help in data cleaning and preprocessing.

In [None]:
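# .copy() gives melb its own data, avoiding a SettingWithCopyWarning on the assignment below
melb = df[df['Location']=='Melbourne'].copy()
melb['Date'] = pd.to_datetime(melb['Date'])
melb.head()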

1. Filtering the DataFrame

The first line filters the DataFrame df to include only the rows where the Location column is ‘Melbourne’.

The result is stored in a new DataFrame called melb.

2. Converting the ‘Date’ Column to Datetime

The second line converts the ‘Date’ column in the melb DataFrame from strings to datetime values using pd.to_datetime().

This conversion is needed for time series analysis and for date-based operations such as extracting the year.
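
The two steps can be sketched on a small invented DataFrame. Calling .copy() on the filtered result is a common precaution: it makes the new DataFrame independent of the original, so assigning a column to it does not raise pandas' SettingWithCopyWarning. The name melb_sample and all row values are hypothetical.

```python
import pandas as pd

# Hypothetical rows; only the column names come from the notebook.
sample = pd.DataFrame({
    "Date": ["2008-07-01", "2008-07-02", "2008-07-01"],
    "Location": ["Melbourne", "Melbourne", "Sydney"],
    "Temp3pm": [13.5, 14.2, 17.1],
})

# Boolean mask keeps the Melbourne rows; .copy() detaches them from `sample`.
melb_sample = sample[sample["Location"] == "Melbourne"].copy()

# Convert the text dates to datetime values on the independent copy.
melb_sample["Date"] = pd.to_datetime(melb_sample["Date"])

print(melb_sample)
```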

In [None]:
melb.dtypes

Running the code melb.dtypes will return the data types of each column in the melb DataFrame. 

This is useful for verifying that the ‘Date’ column has been successfully converted to a datetime format and for understanding the types of data in each column.
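
Instead of reading the dtypes listing by eye, the check can also be done programmatically. This is a minimal sketch on two invented date strings, using the is_datetime64_any_dtype helper from pandas.api.types.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

dates = pd.Series(["2008-07-01", "2008-07-02"])  # hypothetical values, still text
converted = pd.to_datetime(dates)                # datetime values

print(is_datetime64_any_dtype(dates))      # check before conversion
print(is_datetime64_any_dtype(converted))  # check after conversion
```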

In [None]:
plt.plot(melb['Date'], melb['Temp3pm'])
plt.show()

1. Plotting the Data

This line uses matplotlib.pyplot to create a line plot.

melb['Date'] is used for the x-axis, representing the dates.

melb['Temp3pm'] is used for the y-axis, representing the temperature recorded at 3 PM.

The plot will show how the temperature at 3 PM varies over time in Melbourne. It helps visualize trends, patterns, and anomalies in the temperature data.


In [None]:
melb['Year'] = melb['Date'].apply(lambda x: x.year)
melb = melb[melb['Year']<=2015]
plt.plot(melb['Date'], melb['Temp3pm'])
plt.show()

1. Extracting the Year from the Date

The first line creates a new column ‘Year’ in the melb DataFrame.

It uses the apply method with a lambda function to extract the year from each date in the ‘Date’ column.

2. Filtering the DataFrame

The second line filters the melb DataFrame to include only the rows where the ‘Year’ is less than or equal to 2015.

This restricts the analysis to data up to and including 2015.

3. Plotting the Data

The last two lines create a line plot with ‘Date’ on the x-axis and ‘Temp3pm’ (temperature at 3 PM) on the y-axis.

The plot shows how the temperature at 3 PM in Melbourne varies over time, up to the end of 2015. This can help visualize trends and patterns in the temperature data.
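
For datetime columns, pandas also offers the vectorized .dt accessor, which gives the same result as apply with a lambda and is usually faster. A minimal sketch on invented dates around the 2015 cutoff (the name up_to_2015 is hypothetical):

```python
import pandas as pd

# Hypothetical dates spanning the 2015 cutoff; the temperatures are invented.
sample = pd.DataFrame({
    "Date": pd.to_datetime(["2014-12-31", "2015-06-15", "2016-01-01"]),
    "Temp3pm": [25.0, 12.3, 28.4],
})

# .dt.year is the vectorized equivalent of apply(lambda x: x.year).
sample["Year"] = sample["Date"].dt.year

# Keep rows up to and including 2015; .copy() detaches the result.
up_to_2015 = sample[sample["Year"] <= 2015].copy()

print(up_to_2015)
```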


In [None]:
melb.tail()

The code melb.tail() returns the last five rows of the melb DataFrame by default.

In [None]:
# .copy() makes data independent of melb, so the in-place dropna below is safe
data = melb[['Date', 'Temp3pm']].copy()
data.dropna(inplace=True)
data.columns = ['ds', 'y']
data.head()

1. Selecting Specific Columns

The first line creates a new DataFrame data containing only the ‘Date’ and ‘Temp3pm’ columns from the melb DataFrame.

2. Dropping Missing Values

The second line removes any rows in the data DataFrame that contain missing values (NaN).

The inplace=True parameter modifies the data DataFrame directly instead of returning a new DataFrame.

3. Renaming Columns

The third line renames the columns of the data DataFrame to ‘ds’ and ‘y’.

‘ds’ stands for ‘datestamp’ and ‘y’ for the target variable. NeuralProphet expects its input DataFrame to use exactly these two column names.
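
The same three steps can be written as one method chain that never modifies an intermediate DataFrame in place. This is a sketch on invented rows; the Rainfall column and the name prepared are hypothetical and serve only to show that unselected columns are left out.

```python
import pandas as pd

# Hypothetical rows; one Temp3pm value is missing on purpose.
sample = pd.DataFrame({
    "Date": pd.to_datetime(["2015-01-01", "2015-01-02", "2015-01-03"]),
    "Temp3pm": [24.1, None, 19.8],
    "Rainfall": [0.0, 1.2, 0.0],
})

prepared = (
    sample[["Date", "Temp3pm"]]                      # 1. select the two columns
    .dropna()                                        # 2. drop rows with missing values
    .rename(columns={"Date": "ds", "Temp3pm": "y"})  # 3. rename by explicit mapping
)

print(prepared)
```

Renaming with an explicit mapping, rather than assigning a list to columns, does not depend on the column order.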



In [None]:
data


2. Train Model

In [61]:
m = NeuralProphet()

This line initializes a NeuralProphet model with its default settings.

The resulting instance, m, is what we fit to our data and use to make predictions.

In [None]:
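metrics = m.fit(data, freq='D', epochs=1000)

Fitting the Model with Specified Epochs

This fits the NeuralProphet model to our data DataFrame. The fit method returns a DataFrame of training metrics, not a new model, so the result is stored in metrics and the trained model is still m.

freq='D' specifies that the data frequency is daily.

epochs=1000 means the model will make 1000 full passes over the dataset. More epochs give the model more opportunity to learn from the data, at the cost of longer training time and, eventually, a risk of overfitting.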
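
To make the meaning of an epoch concrete, the arithmetic below relates epochs to the number of weight updates. The row count and batch size are hypothetical numbers chosen for illustration; NeuralProphet selects its own batch size when none is given, so the real figures for this notebook will differ.

```python
import math

n_samples = 3000   # hypothetical number of training rows
batch_size = 64    # hypothetical; not taken from this notebook
epochs = 1000      # as in the fit call above

# One epoch is one full pass over the data, processed batch by batch.
batches_per_epoch = math.ceil(n_samples / batch_size)
total_updates = batches_per_epoch * epochs

print(batches_per_epoch, total_updates)
```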


3. Forecast Away

In [None]:
future = m.make_future_dataframe(data, periods=900)
forecast = m.predict(future)
forecast.head()

1. Creating a Future DataFrame

The first line creates a DataFrame future covering the 900 periods (days, in this case) that follow the last date in data.

By default it contains only these future dates, for which we want predictions. Passing n_historic_predictions=True would include the historical rows as well.

2. Making Predictions

The second line uses the NeuralProphet model m to predict values for the dates in the future DataFrame.

The forecast DataFrame will contain the predicted values along with the corresponding dates.
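
What "900 future periods" means at daily frequency can be illustrated with plain pandas. This is not NeuralProphet's implementation, only a sketch of the date arithmetic, and the last training date used here is hypothetical.

```python
import pandas as pd

# Hypothetical last date of the training data.
last_date = pd.Timestamp("2015-12-31")
periods = 900

# Daily timestamps starting the day after the last observation.
future_dates = pd.date_range(
    start=last_date + pd.Timedelta(days=1),
    periods=periods,
    freq="D",
)

print(future_dates[0], future_dates[-1], len(future_dates))
```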


In [None]:
forecast.tail()

In [None]:
plot1 = m.plot(forecast)

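Plotting the Forecast

This line creates a plot of the forecasted values stored in the forecast DataFrame.

Historical data points appear in the plot only if the forecast DataFrame includes them, and uncertainty intervals appear only if the model was configured to produce them. Neither is the case with the default settings used here.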


In [None]:
plot2 = m.plot_components(forecast)

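This line creates a series of plots that show the different components of the forecast.

These components typically include the overall trend and the seasonal patterns the model has fitted. For daily data spanning several years, these are usually yearly and weekly seasonality. Daily seasonality applies only to sub-daily data, and holiday effects appear only if they were added to the model.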