# Predicting the number of international airline passengers for the future
In the previous file, we tested the model on results we already knew. Let's now test it on results we don't know.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from keras import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
import numpy as np

Import the necessary modules


In [None]:
dataframe:pd.DataFrame = pd.read_csv('airline-passengers.csv', usecols=[1], engine='python')
plt.plot(dataframe)
plt.show()

Here we read the CSV file into a DataFrame, keeping only the second column (`usecols=[1]`), which holds the passenger counts, and plot it.


We should first set a random seed so that results are more reproducible between runs.

In [None]:
tf.random.set_seed(7)

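Let's now convert the *dataframe* from earlier to a NumPy array of floating-point values, which is the form neural networks work with best. We also append NaN placeholders, half as many as there are observations, to stand for the future months we want to predict.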

In [None]:
# Extract the values as a NumPy array
dataset: np.ndarray = dataframe.values

# Reserve room for future predictions: half as many rows as there are observations
lengthOfPredictions = int(len(dataset) * 0.5)
nans = np.full((lengthOfPredictions, 1), np.nan)
dataset = np.concatenate((dataset, nans), axis=0)

dataset = dataset.astype('float32')
print(dataset)

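We now want to scale these values to the range 0 to 1. `MinMaxScaler` does this with a linear min-max transformation, not a sigmoid or logistic function. NaN entries are ignored when fitting and remain NaN after the transform.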


In [None]:
scaler = MinMaxScaler(feature_range=(0,1))
dataset = scaler.fit_transform(dataset)
print(dataset)
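
Min-max scaling is a linear map, x' = (x - min) / (max - min). The following is a minimal NumPy sketch of that formula on made-up values, not the scikit-learn implementation. The helper names `minmax_scale_nan` and `minmax_inverse` are hypothetical. It uses `np.nanmin` and `np.nanmax` so that NaN placeholders are ignored when finding the range and stay NaN in the output.

```python
import numpy as np

def minmax_scale_nan(values):
    """Linearly rescale values to [0, 1], ignoring NaN when finding the range."""
    lo = np.nanmin(values)
    hi = np.nanmax(values)
    return (values - lo) / (hi - lo), lo, hi

def minmax_inverse(scaled, lo, hi):
    """Undo the scaling to recover the original units."""
    return scaled * (hi - lo) + lo

# Toy values (made up), with one NaN placeholder at the end.
toy = np.array([[100.0], [150.0], [200.0], [np.nan]], dtype="float32")
scaled, lo, hi = minmax_scale_nan(toy)
restored = minmax_inverse(scaled, lo, hi)
print(scaled.ravel())
print(restored.ravel())
```

The inverse map is what lets predictions be reported in the original units later on.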

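The airline problem is time based, meaning that the sequence of values is significant, so we split by position rather than at random. The first two thirds of the extended array, which are all of the known observations, are used to train the model. The last third holds the NaN placeholders for the future months, so there are no actual values to compare those predictions with.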


In [None]:
trainingSize = int(len(dataset)) - lengthOfPredictions
testingSize = lengthOfPredictions
trainingValues = dataset[:trainingSize]
testingValues = dataset[trainingSize:]
print(f"Training size: {len(trainingValues)}\nTesting size: {len(testingValues)}")

Now we can define a function that turns the series into input/output pairs for supervised learning.

The function takes two arguments: the dataset, which is a NumPy array that we want to convert into a dataset, and the look_back, which is the number of previous time steps to use as input variables to predict the next time period (defaulting to 1).

With this default, the function creates a dataset where X is the number of passengers at a given time (t) and Y is the number of passengers at the next time (t + 1).

In [None]:
# convert an array of values into a dataset matrix
def create_dataset(dataset, look_back=1):
	dataX, dataY = [], []
	for i in range(len(dataset)-look_back-1):
		a = dataset[i:(i+look_back), 0]
		dataX.append(a)
		dataY.append(dataset[i + look_back, 0])
	return np.array(dataX), np.array(dataY)
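
To make the windowing concrete, here is a small sketch that runs the same logic on a made-up six-value series. The function body is repeated only so the example runs on its own, and the toy values are not from the airline data.

```python
import numpy as np

def create_dataset(dataset, look_back=1):
    # Same windowing logic as the function above, repeated so this runs on its own.
    dataX, dataY = [], []
    for i in range(len(dataset) - look_back - 1):
        dataX.append(dataset[i:(i + look_back), 0])
        dataY.append(dataset[i + look_back, 0])
    return np.array(dataX), np.array(dataY)

# Toy series (made-up values), shaped (6, 1) like the notebook's dataset.
toy = np.array([[10.0], [20.0], [30.0], [40.0], [50.0], [60.0]])
X, y = create_dataset(toy, look_back=2)
print(X)  # each row holds 2 consecutive values
print(y)  # the value that follows each row
```

With six values and `look_back=2` this yields three samples. The `- 1` in the loop bound means the last possible window, `[40, 50]` with target `60`, is not produced.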

Let's use this function with a look-back window of 5 time steps.


In [None]:
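# build windows: X = the previous look_back values, Y = the value that follows
look_back = 5
trainX, trainY = create_dataset(trainingValues, look_back)
# testingValues holds only NaN placeholders, so testX and testY are NaN too
testX, testY = create_dataset(testingValues, look_back)
# print(trainX)
print(trainY)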

The LSTM network expects the input data (X) to be provided with a specific array structure in the form of: [samples, time steps, features].

Currently, our data is in the form [samples, features], where each sample holds `look_back` features, and we are framing the problem as one time step for each sample. We can transform the prepared train and test input data into the expected structure using numpy.reshape() as follows:

In [None]:

# reshape input to be [samples, time steps, features]
# print(testX)
trainX = np.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
testX = np.reshape(testX, (testX.shape[0], 1, testX.shape[1]))
print(trainX)
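
The effect of the reshape is easiest to see on shapes alone. This is a minimal sketch using a made-up stand-in array (`windows`, a hypothetical name) in place of `trainX`.

```python
import numpy as np

look_back = 5
# Stand-in for trainX: 4 samples, each a window of 5 values (made up).
windows = np.arange(20, dtype="float32").reshape(4, look_back)
# Insert a time-step axis of length 1: [samples, time steps, features].
reshaped = np.reshape(windows, (windows.shape[0], 1, windows.shape[1]))
print(windows.shape, "->", reshaped.shape)
```

The values are unchanged; only a middle axis of length 1 is added. A different framing, not used here, would treat each window as `look_back` time steps of one feature, giving the shape (samples, look_back, 1).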

We are now ready to design and fit our LSTM network for this problem.

The network has a visible layer with `look_back` (here 5) inputs, a hidden layer with 4 LSTM blocks or neurons, and an output layer that makes a single value prediction. The LSTM blocks use the Keras defaults: tanh for the activation and sigmoid for the recurrent (gate) activation. The network is trained for 100 epochs with a batch size of 1.

In [None]:

# create and fit the LSTM network
model = Sequential()
model.add(LSTM(4, input_shape=(1, look_back)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=100, batch_size=1, verbose=2)

Once the model is fit, we can generate predictions for the train and test inputs. Because the test targets are NaN placeholders, the error-score calculation is left commented out below; it only applies when actual values are available.

Note that we invert the predictions so that they are reported in the same units as the original data (thousands of passengers per month).

In [None]:
# make predictions
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)
print(trainPredict)

In [None]:
# invert predictions
trainPredict = scaler.inverse_transform(trainPredict)
trainY = scaler.inverse_transform([trainY])
testPredict = scaler.inverse_transform(testPredict)
testY = scaler.inverse_transform([testY])
# calculate root mean squared error
# trainScore = np.sqrt(mean_squared_error(trainY[0], trainPredict[:,0]))
# print('Train Score: %.2f RMSE' % (trainScore))
# testScore = np.sqrt(mean_squared_error(testY[0], testPredict[:,0]))
# print('Test Score: %.2f RMSE' % (testScore))


allVals = np.concatenate((trainPredict, testPredict), axis=0)
print(len(allVals))
print(len(testPredict))
print(len(trainPredict))
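
For reference, the commented-out score lines compute a root mean squared error (RMSE). Below is a minimal NumPy sketch of that metric on made-up numbers; the helper name `rmse` is hypothetical and the values are not results from this model. It only makes sense where actual values exist, which is not the case for the NaN placeholder targets here.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two equal-length 1-D arrays."""
    actual = np.asarray(actual, dtype="float64")
    predicted = np.asarray(predicted, dtype="float64")
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Made-up passenger counts (in thousands) and predictions.
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 310.0]
score = rmse(actual, predicted)
print(f"RMSE: {score:.2f}")
```

Every prediction in this toy case is off by 10, so the RMSE is 10 in the same units as the data.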

Finally, we can generate predictions using the model for both the train and test dataset to get a visual indication of the skill of the model.

Because of how the dataset was prepared, we must shift the predictions so that they align on the x-axis with the original dataset. Once prepared, the data is plotted, showing the original dataset in blue, the predictions for the training dataset in orange, and the predictions on the unseen test dataset in green.

In [None]:

# shift train predictions for plotting
print(dataset.shape)
trainPredictPlot = np.empty_like(dataset)
trainPredictPlot[:, :] = np.nan
trainPredictPlot[look_back:len(trainPredict)+look_back, :] = trainPredict
# shift test predictions for plotting
testPredictPlot = np.empty_like(dataset)
testPredictPlot[:, :] = np.nan
testPredictPlot[len(trainPredict)+(look_back*2)+1:len(dataset)-1, :] = testPredict
print(testPredict.shape)
print(testPredict)
# plot baseline and predictions
plt.plot(scaler.inverse_transform(dataset), label="base")
plt.plot(trainPredictPlot, label="trainPredict")
plt.plot(testPredictPlot, label="testPredict")
# plt.plot(allVals, label="allVals")
plt.legend()
plt.show()
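
The index arithmetic used to shift the predictions can be checked without the model. This sketch assumes 144 known observations plus 72 NaN placeholders (216 rows in total) and `look_back = 5`; the variable names are illustrative only.

```python
import numpy as np

# Assumed sizes: 144 known observations plus 72 NaN placeholders.
n_total, n_train, look_back = 216, 144, 5
n_test = n_total - n_train

# create_dataset yields len(values) - look_back - 1 samples per split.
n_train_pred = n_train - look_back - 1
n_test_pred = n_test - look_back - 1

# Slices used when placing predictions into the NaN-filled plot arrays.
train_slice = slice(look_back, n_train_pred + look_back)
test_slice = slice(n_train_pred + (look_back * 2) + 1, n_total - 1)

plot_array = np.full((n_total, 1), np.nan)
print(len(plot_array[train_slice]), n_train_pred)
print(len(plot_array[test_slice]), n_test_pred)
```

Under these assumptions each slice has exactly as many rows as there are predictions to place in it, which is what the assignment into the plot arrays requires.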

In [None]:

print(testPredictPlot)
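
Because the placeholder rows are NaN, the windows built from them are NaN as well, so the test predictions above should also come out as NaN. One common alternative, which this notebook does not use, is recursive forecasting: predict one step, append the prediction to the input window, and repeat. The sketch below shows the idea with a dummy predictor. `recursive_forecast`, `predict_fn`, and `dummy_predict` are hypothetical names, and the dummy rule stands in for a fitted model.

```python
import numpy as np

def recursive_forecast(history, predict_fn, look_back, steps):
    """Forecast `steps` values by feeding each prediction back in as input.

    `predict_fn` is a hypothetical stand-in that maps a 1-D window of length
    `look_back` to a single float; a fitted model's predict call could be
    wrapped to play this role.
    """
    window = list(history[-look_back:])
    forecasts = []
    for _ in range(steps):
        next_value = float(predict_fn(np.array(window)))
        forecasts.append(next_value)
        # Slide the window forward: drop the oldest value, append the newest.
        window = window[1:] + [next_value]
    return np.array(forecasts)

# Dummy predictor for illustration only: last value plus the mean step size.
def dummy_predict(window):
    return window[-1] + np.mean(np.diff(window))

history = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
future = recursive_forecast(history, dummy_predict, look_back=3, steps=3)
print(future)
```

With the LSTM from this notebook, `predict_fn` could be something like a wrapper that reshapes the window to (1, 1, look_back) before calling `model.predict`. The inputs and outputs would be in scaled units, so the forecasts would still need `scaler.inverse_transform`. Errors tend to accumulate with each step, since later predictions are built on earlier ones.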