
Help: 'Wrong number of dimensions: expected 3, got 2 with shape (32L, 60L).' in LSTM model #1641

Closed
DanHenry4 opened this issue Feb 4, 2016 · 10 comments


@DanHenry4

Hey everyone,

I'm trying to use custom data with the LSTM model, but it keeps giving shape errors. After reading some other issues along the same lines, I even tried reshaping the input data to size (nb_inputs, timesteps, 1), which looks approximately like (4200, 60, 1), but that returns an error saying a shape of (None, 4200, 60, 1) is no good. Any thoughts?

# (imports omitted from the original post; `t` here is the poster's own
#  data-loading module)
from keras.models import Sequential
from keras.layers import LSTM

maxlen = 60
batch_size = 32

print('Loading data...')
(X_train, y_train), (X_test, y_test) = t.LoadData()
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')

print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)

print('Build model...')
model = Sequential()
model.add(LSTM(128, input_shape=X_train.shape))

model.compile(loss='binary_crossentropy',
              optimizer='sgd',
              class_mode="categorical")

print("Train...")
model.fit(X_train, y_train, batch_size=batch_size, nb_epoch=3,
          validation_data=(X_test, y_test), show_accuracy=True)
score, acc = model.evaluate(X_test, y_test,
                            batch_size=batch_size,
                            show_accuracy=True)
print('Test score:', score)
print('Test accuracy:', acc)

Output:

Using Theano backend.
Loading data...
4130 train sequences
1016 test sequences
X_train shape: (4130L, 60L)
X_test shape: (1016L, 60L)
Build model...
Train...
Train on 4130 samples, validate on 1016 samples
Epoch 1/3
Traceback (most recent call last):
  File "main.py", line 52, in <module>
    validation_data=(X_test, y_test), show_accuracy=True)
  File "C:\Miniconda2\lib\site-packages\keras\models.py", line 507, in fit
    shuffle=shuffle, metrics=metrics)
  File "C:\Miniconda2\lib\site-packages\keras\models.py", line 226, in _fit
    outs = f(ins_batch)
  File "C:\Miniconda2\lib\site-packages\keras\backend\theano_backend.py", line 357, in __call__
    return self.function(*inputs)
  File "C:\Miniconda2\lib\site-packages\theano\compile\function_module.py", line 513, in __call__
    allow_downcast=s.allow_downcast)
  File "C:\Miniconda2\lib\site-packages\theano\tensor\type.py", line 169, in filter
    data.shape))
TypeError: ('Bad input argument to theano function with name "C:\Miniconda2\lib\site-packages\keras\backend\theano_backend.py:354" at index 0 (0-based)', 'Wrong number of dimensions: expected 3, got 2 with shape (32L, 60L).')

@wxs
Contributor

wxs commented Feb 4, 2016

> I even tried reshaping the input data to size (nb_inputs, timesteps, 1)

You will need to do that, but then you shouldn't pass X_train.shape as the input_shape parameter of your LSTM. The model architecture doesn't care about the total number of training points. You should instead be able to pass input_shape=X_train.shape[1:].
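
A minimal sketch of what that looks like end to end (shapes taken from the original post; the zeros array is just a stand-in for the real data):

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM

X_train = np.zeros((4130, 60))                        # stand-in for the real 2-D data
X_train = np.reshape(X_train, X_train.shape + (1,))   # -> (4130, 60, 1)

model = Sequential()
# input_shape describes a single sample, so the sample count (4130) is dropped:
model.add(LSTM(128, input_shape=X_train.shape[1:]))   # i.e. input_shape=(60, 1)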

@DanHenry4
Author

Thank you! Reshaping the arrays and adding .shape[1:] lets it run. May I ask why the input shape needs to be .shape[1:]?

Also (off topic), the output looks like:

Using Theano backend.
Using gpu device 0: GeForce GTX 970
Loading data...
4262 train sequences
1083 test sequences
X_train shape: (4262L, 60L, 1L)
X_test shape: (1083L, 60L, 1L)
Build model...
Train...
Train on 4262 samples, validate on 1083 samples
Epoch 1/3
4262/4262 [==============================] - 48s - loss: -28.0929 - acc: 1.0000 - val_loss: -64.3520 - val_acc: 1.0000
Epoch 2/3
4262/4262 [==============================] - 48s - loss: -64.6251 - acc: 1.0000 - val_loss: -64.3520 - val_acc: 1.0000
Epoch 3/3
4262/4262 [==============================] - 48s - loss: -64.6251 - acc: 1.0000 - val_loss: -64.3520 - val_acc: 1.0000
1083/1083 [==============================] - 1s
Test score: -64.3520374245
Test accuracy: 1.0

Is there a reason the loss is negative?

@wxs
Contributor

wxs commented Feb 5, 2016

So X.shape is (samples, timesteps, dimension), but the model architecture doesn't care how many training examples (samples) you have. Once you've built the model you can feed it a hundred million examples; it doesn't matter. So you don't pass that as a parameter when you build the model, and X.shape[1:] is just (timesteps, samples), the two dimensions that matter.

Incidentally, if you're on the Theano backend you also don't need to specify the number of timesteps, but then you need to pass None for that dimension. So instead you would pass in (None, X.shape[2]).
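
A minimal sketch of that variable-length setup, assuming the Keras 1.x Sequential API used elsewhere in this thread and one feature per timestep:

from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
# None in the timesteps slot leaves the sequence length unspecified on the
# Theano backend; only the per-timestep feature dimension is fixed.
model.add(LSTM(128, input_shape=(None, 1)))
model.add(Dense(1))
model.compile(loss='mse', optimizer='sgd')

Sequences inside a single batch still have to share one length, so in practice this means padding or grouping sequences by length.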

As to why your score is negative: there's still something a bit fishy with your model. Your LSTM has 128 output dimensions, and you're evaluating binary cross-entropy on that? Is your y target also 128-dimensional? If it's not, you probably meant to put a Dense(1) layer on top, bringing your output down to a single value that is compatible with y. Also, if you're using a cross-entropy objective you want your output to be a probability distribution, so you probably meant to put some sort of activation on top to normalize the output.

Or else you probably meant to use a different objective function.

Without knowing more about your data (for instance the size of your y matrix) it's hard for me to help further.

@DanHenry4
Author

> So X.shape[1:] is just (timesteps, samples), the two dimensions that matter

I'm guessing you meant (timesteps, dimension)?

That makes sense, though. Thank you for the information. As for the output data: yes, a binary_crossentropy loss function doesn't make much sense, considering the data look like:

X = [
    [5.45, 5.42, ..., 5.26],
    [5.25, 5.28, ..., 5.30],
    ...
    [5.12, 5.15, ..., 5.65]
]
y = [5.13, 5.17, ..., 5.05]

Where the first list contains sequences of input (which are themselves lists), and the output is a single float value.

I've changed the model:

# (imports again omitted in the post)
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM

batch_size = 32

print('Loading data...')
(X_train, y_train), (X_test, y_test) = t.LoadData()
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')

X_train = np.reshape(X_train, X_train.shape + (1,))
X_test = np.reshape(X_test, X_test.shape + (1,))

print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)

print('Build model...')
model = Sequential()
model.add(LSTM(1, input_shape=X_train.shape[1:]))

model.compile(loss='mse',
              optimizer='sgd',
              class_mode="categorical")

print("Train...")
model.fit(X_train, y_train, batch_size=batch_size, nb_epoch=3,
          validation_data=(X_test, y_test), show_accuracy=True)
score, acc = model.evaluate(X_test, y_test,
                            batch_size=batch_size,
                            show_accuracy=True)
print('Test score:', score)
print('Test accuracy:', acc)

And it now produces output closer to the desired result:

Using Theano backend.
Loading data...
4109 train sequences
998 test sequences
X_train shape: (4109L, 60L, 1L)
X_test shape: (998L, 60L, 1L)
Build model...
Train...
Train on 4109 samples, validate on 998 samples
Epoch 1/3
4109/4109 [==============================] - 3s - loss: 26.1860 - acc: 1.0000 - val_loss: 26.4226 - val_acc: 1.0000
Epoch 2/3
4109/4109 [==============================] - 3s - loss: 26.1860 - acc: 1.0000 - val_loss: 26.4226 - val_acc: 1.0000
Epoch 3/3
4109/4109 [==============================] - 3s - loss: 26.1860 - acc: 1.0000 - val_loss: 26.4226 - val_acc: 1.0000
998/998 [==============================] - 0s
Test score: 26.4226496511
Test accuracy: 1.0

I'll keep plugging away. :)

@wxs
Contributor

wxs commented Feb 7, 2016

For most applications you would probably want more than one hidden state dimension in your LSTM! You can add a Dense layer (or TimeDistributedDense) with an output dimension of 1 to project the hidden state down to one dimension at the output, while still retaining more than one dimension of state. So, something like:

model.add(LSTM(128, input_shape=X_train.shape[1:]))
model.add(Dense(1))
model.add(Activation('sigmoid'))
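
For the regression-style targets shown earlier in this thread (raw floats around 5.x rather than probabilities), a plausible variant, not suggested in the thread itself, keeps the mse loss and leaves the output linear instead of adding the sigmoid:

model.add(LSTM(128, input_shape=X_train.shape[1:]))
model.add(Dense(1))   # linear output: suits an unbounded float target
model.compile(loss='mse', optimizer='sgd')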

@stale stale bot added the stale label May 23, 2017
@shamsulmasum

from pandas import DataFrame
from pandas import Series
from pandas import concat
from pandas import read_csv
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from math import sqrt
from matplotlib import pyplot
import numpy as np


 
# frame a sequence as a supervised learning problem
def timeseries_to_supervised(data, lag=1):
	df = DataFrame(data)
	columns = [df.shift(i) for i in range(1, lag+1)]
	columns.append(df)
	df = concat(columns, axis=1)
	df.fillna(0, inplace=True)
	return df
 
# create a differenced series
def difference(dataset, interval=1):
	diff = list()
	for i in range(interval, len(dataset)):
		value = dataset[i] - dataset[i - interval]
		diff.append(value)
	return Series(diff)
 
# invert differenced value
def inverse_difference(history, yhat, interval=1):
	return yhat + history[-interval]
 
# scale train and test data to [-1, 1]
def scale(train):
	# fit scaler
	scaler = MinMaxScaler(feature_range=(-1, 1))
	scaler = scaler.fit(train)
	# transform train
	train = train.reshape(train.shape[0], train.shape[1])
	train_scaled = scaler.transform(train)
 
	return scaler, train_scaled
 
# inverse scaling for a forecasted value
def invert_scale(scaler, X, value):
	new_row = [x for x in X] + [value]
	array = np.array(new_row)
	array = array.reshape(1, len(array))
	inverted = scaler.inverse_transform(array)
	return inverted[0, -1]

def generate_features(x, forecast, window):
    """ Concatenates a time series vector x with forecasts from
        the iterated forecasting strategy.

    Arguments:
    ----------
        x:        Numpy array of length T containing the time series.
        forecast: Scalar containing forecast for time T + 1.
        window:   Autoregressive order of the time series model.
    """
    augmented_time_series = np.hstack((x, forecast))

    return augmented_time_series[-window:].reshape(1, -1)

# fit an LSTM network to training data
def fit_lstm(train, batch_size, nb_epoch, neurons):
	X, y = train[:, 0:-1], train[:, -1]
	X = X.reshape(X.shape[0], 1, X.shape[1])
	model = Sequential()
	model.add(LSTM(neurons, batch_input_shape=(batch_size, X.shape[1], X.shape[2]), stateful=True))
	model.add(Dense(1))
	model.compile(loss='mean_squared_error', optimizer='adam')
	for i in range(nb_epoch):
		model.fit(X, y, epochs=1, batch_size=batch_size, verbose=0, shuffle=False)
		model.reset_states()
	return model
 
def iterative_forecast(model, x, window, H):
    """ Implements the iterative (recursive) forecasting strategy.

    Arguments:
    ----------
        model:  fitted model exposing a predict() method (here the
                Keras LSTM trained above, not a scikit-learn model).
        x:      Numpy array containing the time series.
        window: autoregressive order of the time series model.
        H:      number of periods in the H-step-ahead forecast.
    """
    forecast = np.zeros(H)    
    forecast[0] = model.predict(x.reshape(1, -1))

    for h in range(1, H):
        features = generate_features(x, forecast[:h], window)

        forecast[h] = model.predict(features)

    return forecast
    
    
# load dataset
series = read_csv('shampoosales.csv', header=0, index_col=0, squeeze=True)



# transform data to be stationary
raw_values = series.values
diff_values = difference(raw_values, 1)
 
# transform data to be supervised learning
supervised = timeseries_to_supervised(diff_values, 1)
supervised_values = supervised.values


train = supervised_values[0:-12]
test = supervised_values[-12:]

# transform the scale of the data
scaler, train_scaled = scale(train)
# fit the model
lstm_model = fit_lstm(train_scaled, 1, 3000, 4)

yhat = iterative_forecast(lstm_model, train, 1, 10)
predictions = list()
predictions.append(yhat)

I am trying to implement an iterative forecasting algorithm using an LSTM, but there seems to be something wrong with the code. Would you be kind enough to help?

The error I am getting:

'Error when checking : expected lstm_2_input to have 3 dimensions, but got array with shape (1, 46)'
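
This is the same shape mismatch discussed at the top of the thread: fit_lstm builds the LSTM for 3-D input of shape (samples, timesteps, features), while predict here receives a 2-D array. A minimal sketch of the fix, under the assumptions that the one-timestep layout from fit_lstm is kept and that window matches the feature count the model was trained on:

# inside iterative_forecast: every predict() call must see the same
# (samples, timesteps, features) layout that fit_lstm trained on;
# batch_size=1 matches the stateful model's batch_input_shape
x_3d = x.reshape(1, 1, -1)                # (1, 1, features)
forecast[0] = model.predict(x_3d, batch_size=1)

# and likewise in the loop:
features = generate_features(x, forecast[:h], window)   # shape (1, window)
forecast[h] = model.predict(features.reshape(1, 1, -1), batch_size=1)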

@stale stale bot removed the stale label Jun 12, 2017
@stale stale bot added the stale label Sep 10, 2017
@stale

stale bot commented Sep 10, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

@stale stale bot closed this as completed Oct 10, 2017
@fshah7

fshah7 commented Dec 3, 2017

Hi,

I have input data with three variables/dimensions and 4080 total samples. I am trying the RNN script below but getting an error. Any help?

model = Sequential()
model.add(GRU(3, return_sequences=True, input_shape=(4080, 3)))
model.add(Dense(1))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, dummy_y_train, nb_epoch=20, batch_size=20, verbose=1)

ERROR: Error when checking input: expected gru_1_input to have 3 dimensions, but got array with shape (4080, 3)
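
This is the sample-count mistake from the top of this thread again: input_shape must describe one sample as (timesteps, features), not (samples, features), and x_train itself has to be 3-D. A sketch under the assumptions that each of the 4080 rows is a single-timestep sequence with 3 features and that dummy_y_train is a single column of targets (x_train_3d is introduced here for illustration):

import numpy as np
from keras.models import Sequential
from keras.layers import GRU, Dense

x_train_3d = x_train.reshape(-1, 1, 3)   # (samples=4080, timesteps=1, features=3)

model = Sequential()
model.add(GRU(3, input_shape=(1, 3)))    # shape of one sample, no sample count;
                                         # return_sequences dropped so the GRU
                                         # emits one vector per sequence
model.add(Dense(1))
# note: categorical_crossentropy with a single output unit is also suspect;
# mse is used here purely as a placeholder objective
model.compile(loss='mse', optimizer='adam')
model.fit(x_train_3d, dummy_y_train, epochs=20, batch_size=20, verbose=1)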

@Khalid-Usman

@wxs Don't you think there is something fishy in @DanHenry4's results? He gets the same loss after every epoch and the accuracy is always 1 (100%), which is next to impossible in most machine-learning predictions, and especially in stock-price prediction. I am also getting a loss of 0.0, so I am confused; maybe I did something wrong.

Please reply; I am using an LSTM for the first time, and seeing that accuracy makes me suspect I am doing something wrong.

Your guidance will be appreciated. Thanks.

@nantomar

If the matrix size differs between the test data and the data the model was trained on, what can I do? (Keras in R.) My results are really poor just because I have to add dummy columns to match the matrix size.
