Deep neural network with stacked autoencoder on MNIST #358
Comments
Not sure if this is what you are looking for, but the following works.

from __future__ import absolute_import
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import containers
from keras.layers.core import Dense, AutoEncoder
from keras.optimizers import RMSprop
from keras.utils import np_utils
batch_size = 64
nb_classes = 10
nb_epoch = 1
# the data, shuffled and split between train and test sets
(X_train, _), (X_test, _) = mnist.load_data()
X_train = X_train.reshape(-1, 784)
X_test = X_test.reshape(-1, 784)
X_train = X_train.astype("float32") / 255.0
X_test = X_test.astype("float32") / 255.0
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
#creating the autoencoder
ae = Sequential()
encoder = containers.Sequential([Dense(784, 700), Dense(700, 600)])
decoder = containers.Sequential([Dense(600, 700), Dense(700, 784)])
ae.add(AutoEncoder(encoder=encoder, decoder=decoder,
output_reconstruction=True, tie_weights=True))
ae.compile(loss='mean_squared_error', optimizer=RMSprop())
ae.fit(X_train, X_train, batch_size=batch_size, nb_epoch=nb_epoch,
       show_accuracy=False, verbose=1, validation_data=[X_test, X_test])

The output:

60000 train samples
10000 test samples
Train on 60000 samples, validate on 10000 samples
Epoch 0
60000/60000 [==============================] - 36s - loss: 0.0371 - val_loss: 0.0229 |
Updated the code to show how to use it. |
Thank you very much for your fast reply, it's much appreciated. I can't test this code right now because I don't have my laptop with me, but I'll try it tonight. If I haven't misunderstood the method for training a deep neural network with autoencoders, the first step is to train each autoencoder one by one to encode and decode its input. After the pre-training is done, I can set the weights of my DNN with the weights of all the encoders, then apply a simple SGD. Is my understanding right? So if I'm right, my goal is to train a second autoencoder on the outputs of the first autoencoder, and that's what I can't find the way to do. But perhaps with your code I'm going to succeed. |
What you say sounds correct. If you need to do layer-by-layer pre-training, then I think you need to write similar scripts for each stage and save the trained weights so they can be loaded in the next stage.

If your goal is to experiment with pre-training, you are doing it right. But if your goal is simply to train a network, keep in mind that with Glorot initialization (the default initialization scheme in Keras) you don't need pre-training; you can directly start training the network. |
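A minimal sketch of one such pre-training stage, assuming the same 0.x AutoEncoder API used above; with output_reconstruction=False, predict() returns the encoded representation (as in the layer-by-layer example later in this thread), and stage1_weights / X_stage2 are hypothetical names:

ae = Sequential()
ae.add(AutoEncoder(encoder=containers.Sequential([Dense(784, 700)]),
                   decoder=containers.Sequential([Dense(700, 784)]),
                   output_reconstruction=False, tie_weights=True))
ae.compile(loss='mean_squared_error', optimizer=RMSprop())
ae.fit(X_train, X_train, batch_size=batch_size, nb_epoch=nb_epoch)
# Keep the trained encoder weights for the final model (get_weights is
# assumed available on 0.x containers) and encode the data for stage 2.
stage1_weights = ae.layers[0].encoder.get_weights()
X_stage2 = ae.predict(X_train)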
To do layer-by-layer pretraining you will currently need to run fit() (or train) on each model and then couple them later after training is done. |
It looks like I didn't put an activation function. (Sorry, I have not used Keras' AE before.)

encoder = containers.Sequential([Dense(784, 700, activation='sigmoid'),
Dense(700, 600, activation='sigmoid')])
decoder = containers.Sequential([Dense(600, 700, activation='sigmoid'),
Dense(700, 784, activation='sigmoid')]) |
As a matter of fact, it certainly changes the output with the activation added. |
@mthrok : yes you can stack the layers like that, but it is not doing greedy layerwise training. Just so you are aware. |
I tried to do something like that to do greedy layerwise training, but it's not working...

from __future__ import absolute_import
from keras.datasets import mnist

batch_size = 10000
adg = Adagrad()

# the data, shuffled and split between train and test sets
X_train = X_train.reshape(60000, 784)
# convert class vectors to binary class matrices
# creating the autoencoder
# first autoencoder
# training the first autoencoder
# getting output of the first autoencoder to connect to the input of the
# second autoencoder
# training the second autoencoder
# getting output of the second autoencoder to connect to the input of the
# third autoencoder
# training the third autoencoder
# creating the deep neural network with all encoders of each autoencoder trained before
model = Sequential()
model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch,
          show_accuracy=True, verbose=2, validation_data=(X_test, Y_test)) |
Here is a layer-by-layer example.

from __future__ import absolute_import
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import containers
from keras.layers.core import Dense, AutoEncoder
from keras.activations import sigmoid
from keras.utils import np_utils
batch_size = 64
nb_classes = 10
nb_epoch = 1
nb_hidden_layers = [784, 600, 500, 400]
# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(-1, 784)
X_test = X_test.reshape(-1, 784)
X_train = X_train.astype("float32") / 255.0
X_test = X_test.astype("float32") / 255.0
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
# Layer-wise pretraining
encoders = []
X_train_tmp = np.copy(X_train)
for i, (n_in, n_out) in enumerate(zip(nb_hidden_layers[:-1], nb_hidden_layers[1:]), start=1):
print('Training the layer {}: Input {} -> Output {}'.format(i, n_in, n_out))
# Create AE and training
ae = Sequential()
encoder = containers.Sequential([Dense(n_in, n_out, activation='sigmoid')])
decoder = containers.Sequential([Dense(n_out, n_in, activation='sigmoid')])
ae.add(AutoEncoder(encoder=encoder, decoder=decoder,
output_reconstruction=False, tie_weights=True))
ae.compile(loss='mean_squared_error', optimizer='rmsprop')
ae.fit(X_train_tmp, X_train_tmp, batch_size=batch_size, nb_epoch=nb_epoch)
    # Store trained weights and update training data
encoders.append(ae.layers[0].encoder)
X_train_tmp = ae.predict(X_train_tmp)
# Fine-tuning
model = Sequential()
for encoder in encoders:
model.add(encoder)
model.add(Dense(nb_hidden_layers[-1], nb_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
score = model.evaluate(X_test, Y_test, show_accuracy=True, verbose=0)
print('Test score before fine-tuning:', score[0])
print('Test accuracy before fine-tuning:', score[1])
model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch,
show_accuracy=True, validation_data=(X_test, Y_test))
score = model.evaluate(X_test, Y_test, show_accuracy=True, verbose=0)
print('Test score after fine-tuning:', score[0])
print('Test accuracy after fine-tuning:', score[1])

Outputs:
|
I'm reading an article (a thesis from the LISA lab) about different methods to train deep neural networks. |
Cross entropy is for classification (i.e. you need classes). Autoencoders are purely MSE based. |
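For instance, with the compile calls already shown in this thread, the two objectives would differ only in the loss; a trivial sketch:

ae.compile(loss='mean_squared_error', optimizer='rmsprop')            # reconstruction
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')   # classification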
But when I use the parameter tie_weights it gives an error. How can I fix it? Does it mean my Keras is an older version? The documentation also has no tie_weights parameter for AutoEncoder: http://keras.io/layers/core/#autoencoder |
This is because weight tying has been removed. |
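Concretely, on newer versions the earlier snippets would simply drop the argument; a minimal sketch:

ae.add(AutoEncoder(encoder=encoder, decoder=decoder,
                   output_reconstruction=False))  # tie_weights no longer accepted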
Thanks, so what can I do if I want to use tie_weights? |
Traceback (most recent call last): |
@Nidhi1211 : This is unrelated. Your error is clearly in your data load.... |
Traceback (most recent call last): |
@Nidhi1211 : I suggest you learn how to read stack traces. |
Hi,
Running this code with the output_reconstruction=True flag in a model, I'm able to fit the data X and I can predict a new set of values, but their dimension is the same as my input. What I wanted is to extract the hidden layer values. The Keras documentation says:
So I thought I'd use output_reconstruction=False and then I'd be able to extract the hidden layer values.
Please note that my data X is a dataset without labels; I used 10000 as the batch size and my dataset has 301 features. I used a hidden layer with 100 neurons and ran Keras version 0.3.0 on GPU. I have several questions:
I would appreciate any suggestions and explanations, even with some dummy example. |
model.fit(X, X, nb_epoch=epochs, batch_size=batch_size, ...)
No difference between MNIST and any other dataset.
Cannot understand why. Any more detailed explanation? |
Hi all,

# from https://github.com/fchollet/keras/issues/358
from __future__ import absolute_import
from __future__ import print_function
import numpy as np
from keras import models
from keras import optimizers
from keras.datasets import mnist
from keras.layers import containers
from keras.layers.core import Dense, AutoEncoder
np.random.seed(1337)
batch_size = 100
nb_classes = 10
nb_epoch = 1
(X_train, _), (X_test, _) = mnist.load_data()
X_train = X_train.reshape(-1, 784)
X_test = X_test.reshape(-1, 784)
X_train = X_train.astype("float32") / 255.0
X_test = X_test.astype("float32") / 255.0
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
# Autoencoder 1
ae1 = models.Sequential()
ae1.add(AutoEncoder(encoder=containers.Sequential([Dense(500, input_dim=784)]),
decoder=containers.Sequential([Dense(784, input_dim=500)]),
output_reconstruction=True))
ae1.compile(loss='mse', optimizer=optimizers.RMSprop())
ae1.fit(X_train, X_train, batch_size=batch_size, nb_epoch=nb_epoch,
show_accuracy=True, verbose=1, validation_data=[X_test, X_test])
X_train_tmp = ae1.predict(X_train)
print("Autoencoder data format: {0} - should be (60000, 500)".format(X_train_tmp.shape))
# Autoencoder 2
ae2 = models.Sequential()
ae2.add(AutoEncoder(encoder=containers.Sequential([Dense(400, input_dim=500)]),
decoder=containers.Sequential([Dense(500, input_dim=400)]),
output_reconstruction=True))
ae2.compile(loss='mse', optimizer=optimizers.RMSprop())
ae2.fit(X_train_tmp, X_train_tmp, batch_size=batch_size, nb_epoch=nb_epoch)
X_train_tmp = ae1.predict(X_train_tmp)

The output is:
Can someone help me? Thanks, |
@dchevitarese you are trying to fit your second autoencoder with an input of size 784, while it expects one of 500. If I get it right, you want to peek at the innermost layer, so take care of what data you are dealing with. Here is some hint:
I hope this helps. |
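The hint itself was not captured above; a hedged reconstruction based on the follow-up below ("I didn't know that I would have to recompile, but it did the trick") and the 0.x API used in this thread:

# Switch the trained AE to emit the hidden representation instead of the
# reconstruction (attribute name assumed from the AutoEncoder layer above),
# then recompile before predicting.
ae1.layers[0].output_reconstruction = False
ae1.compile(loss='mse', optimizer=optimizers.RMSprop())
X_train_tmp = ae1.predict(X_train)  # now (60000, 500): the encoder output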
Hi @dibenedetto, I didn't know that I would have to recompile, but it did the trick. In the end, I got ~91% accuracy. I could use a CNN to do the same job, but I am investigating these AEs to pre-train layers - and this also explains my next question: what do you mean by "take care of what data you are dealing with"? P.S.: I am trying to recreate this: http://www.sciencedirect.com/science/article/pii/S0031320315001181 |
@mthrok Thanks for your help and your code! |
Hi @isalirezag, you can get the full configuration by using get_config().
Unfortunately, I don't think Keras has good visualization functionality. The only thing you get is a very simple graphviz plot, which is not helpful. |
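A small sketch of both options, assuming the 0.x utilities (the Grapher import also appears in the original post below):

# Inspect the layer-by-layer configuration.
print(model.get_config())

# The simple graphviz plot mentioned above (requires pydot/graphviz installed).
from keras.utils.dot_utils import Grapher
grapher = Grapher()
grapher.plot(model, 'model.png')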
It seems the code only works on keras==0.3.0. |
I just wanna know if the |
It has been removed. You can easily accomplish it using the functional API.
|
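A minimal sketch of that pattern in the Keras 2-style functional API (unlike the 0.x code elsewhere in this thread); note that true weight tying, where the decoder reuses the transposed encoder kernel, would still require a custom layer:

from keras.layers import Input, Dense
from keras.models import Model

inp = Input(shape=(784,))
encoded = Dense(500, activation='sigmoid')(inp)
decoded = Dense(784, activation='sigmoid')(encoded)

autoencoder = Model(inputs=inp, outputs=decoded)  # trained end-to-end on (X, X)
encoder = Model(inputs=inp, outputs=encoded)      # shares the trained encoder layer
autoencoder.compile(loss='mse', optimizer='rmsprop')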
Hey guys, I am also working on how to train an AE layer by layer, and I'm new to Keras. Will such code work?
|
Hello,
First of all, sorry for my English; it's not my native language (I'm French).
As the title says, I'm trying to train a deep neural network with stacked autoencoders, but I'm stuck...
Thanks to fchollet's example I managed to implement a simple deep neural network that works, thanks to the ReLU activation function (Xavier Glorot's thesis).
But now I want to compare the results I get with this simple deep neural network to a deep network with stacked autoencoder pre-training.
I started with this code but I don't know how to continue... and every time I try to add code I get an error... so this is my valid code:
from __future__ import absolute_import
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, AutoEncoder, Layer
from keras.optimizers import SGD, Adam, RMSprop, Adagrad, Adadelta
from keras.utils import np_utils
from keras.utils.dot_utils import Grapher
from keras.callbacks import ModelCheckpoint
batch_size = 10000
nb_classes = 10
nb_epoch = 1
adg = Adagrad()
sgd = SGD()
rms = RMSprop()
# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype("float64")
X_test = X_test.astype("float64")
X_train /= 255
X_test /= 255
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
model = Sequential()
# creating the autoencoder
# first hidden layer
model.add(AutoEncoder(encoder=Dense(784, 700),
decoder=Dense(700, 784),
output_reconstruction=False, tie_weights=True))
model.add(Activation('tanh'))
model.compile(loss='mean_squared_error', optimizer=rms)
model.fit(X_train, X_train, batch_size=batch_size, nb_epoch=nb_epoch, show_accuracy=False, verbose=1, validation_data=None)
for layer in model.layers:
    config = layer.get_config()
    print(config)
    weights = layer.get_weights()
    print(weights)
# second hidden layer
model.add(AutoEncoder(encoder=Dense(700, 600),
decoder=Dense(600, 700),
output_reconstruction=False, tie_weights=True))
model.add(Activation('tanh'))
model.compile(loss='mean_squared_error', optimizer=rms)
model.fit(X_train, X_train, batch_size=batch_size, nb_epoch=nb_epoch, show_accuracy=False, verbose=1, validation_data=None)
# Pre-training
# fine-tuning
# autoencoder evaluation
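A hedged sketch of the missing fine-tuning step, following the layer-by-layer answer earlier in this thread; trained_encoders is a hypothetical list holding the encoder of each pre-trained autoencoder:

# Stack the pre-trained encoders, add a softmax classifier on top,
# and fine-tune the whole network on the labels.
model = Sequential()
for encoder in trained_encoders:
    model.add(encoder)
model.add(Dense(600, nb_classes, activation='softmax'))  # 600 = last hidden size above
model.compile(loss='categorical_crossentropy', optimizer=rms)
model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch,
          show_accuracy=True, validation_data=(X_test, Y_test))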