
Theano error: An update must have the same type as the original shared variable #7

Open
TeslaH2O opened this issue Oct 1, 2015 · 18 comments

Comments

@TeslaH2O

TeslaH2O commented Oct 1, 2015

Running the command for the mnist dataset

./run.py train --encoder-layers 1000-500-250-250-250-10 --decoder-spec sig --denoising-cost-x 1000,10,0.1,0.1,0.1,0.1,0.1 --labeled-samples 100 --unlabeled-samples 60000 --seed 1 -- mnist_100_full

I get this error:

ERROR:blocks.main_loop:Error occured during training.

Blocks will attempt to run on_error extensions, potentially saving data, before exiting and reraising the error. Note that the usual after_training extensions will not be run. The original error will be re-raised and also stored in the training log. Press CTRL + C to halt Blocks immediately.
Traceback (most recent call last):
File "./run.py", line 649, in <module>
if train(d) is None:
File "./run.py", line 500, in train
main_loop.run()
File "/home/teslah2o/ladder/venv/local/lib/python2.7/site-packages/blocks/main_loop.py", line 188, in run
reraise_as(e)
File "/home/teslah2o/ladder/venv/local/lib/python2.7/site-packages/blocks/utils/__init__.py", line 225, in reraise_as
six.reraise(type(new_exc), new_exc, orig_exc_traceback)
File "/home/teslah2o/ladder/venv/local/lib/python2.7/site-packages/blocks/main_loop.py", line 164, in run
self.algorithm.initialize()
File "/home/teslah2o/ladder/venv/local/lib/python2.7/site-packages/blocks/algorithms/__init__.py", line 224, in initialize
self._function = theano.function(self.inputs, [], updates=all_updates)
File "/home/teslah2o/ladder/venv/local/lib/python2.7/site-packages/theano/compile/function.py", line 300, in function
output_keys=output_keys)
File "/home/teslah2o/ladder/venv/local/lib/python2.7/site-packages/theano/compile/pfunc.py", line 488, in pfunc
no_default_updates=no_default_updates)
File "/home/teslah2o/ladder/venv/local/lib/python2.7/site-packages/theano/compile/pfunc.py", line 216, in rebuild_collect_shared
raise TypeError(err_msg, err_sug)
TypeError: ('An update must have the same type as the original shared variable (shared_var=f_5_b, shared_var.type=TensorType(float32, vector), update_val=Elemwise{sub,no_inplace}.0, update_val.type=TensorType(float64, vector))., If the difference is related to the broadcast pattern, you can call the tensor.unbroadcast(var, axis_to_unbroadcast[, ...]) function to remove broadcastable dimensions.\n\nOriginal exception:\n\tTypeError: An update must have the same type as the original shared variable (shared_var=f_5_b, shared_var.type=TensorType(float32, vector), update_val=Elemwise{sub,no_inplace}.0, update_val.type=TensorType(float64, vector))., If the difference is related to the broadcast pattern, you can call the tensor.unbroadcast(var, axis_to_unbroadcast[, ...]) function to remove broadcastable dimensions.', 'If the difference is related to the broadcast pattern, you can call the tensor.unbroadcast(var, axis_to_unbroadcast[, ...]) function to remove broadcastable dimensions.')

Do you know how to fix it?
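For context (an illustrative sketch, not code from this repo): the error means the computed update expression ended up as float64 while the shared variable `f_5_b` is stored as float32. NumPy follows the same promotion rule, which shows how a single stray float64 operand upgrades the whole expression:

```python
import numpy as np

# The shared variable (e.g. f_5_b) is stored as float32.
w = np.ones(3, dtype=np.float32)

# If any operand in the update expression is float64 (e.g. data loaded
# as float64), the result of the subtraction is promoted to float64:
update = w - np.zeros(3, dtype=np.float64)
print(update.dtype)  # float64 -- no longer matches the float32 variable

# Casting the update back to the variable's dtype removes the mismatch:
print(update.astype(np.float32).dtype)  # float32
```

In Theano the analogous fix is making sure everything in the graph is computed at float32 (e.g. via the `floatX` flag mentioned below in this thread).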

@parry2403

I am also facing the same issue.

@arasmus
Contributor

arasmus commented Oct 20, 2015

I don't know how to fix it offhand. Several people have run the code successfully, so it is probably something to do with library configuration. Perhaps different Theano versions behave differently, e.g. 0.7.0 vs. bleeding edge. Does this help?

@fedor-chervinskii

I've got the same error, and this solved the problem:

THEANO_FLAGS='floatX=float32' python run.py train ...
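If that fixes it, the flag can also be made permanent (assuming a standard Theano installation) by putting it in `~/.theanorc`:

```ini
[global]
floatX = float32
```

With `floatX=float32`, Theano defaults floating-point computations to float32, so update expressions should keep the same dtype as the shared variables they modify.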

@hotloo
Contributor

hotloo commented Jul 22, 2016

Can you update your local master and try the new version? Could you also update your Python environment according to environment.yml and then rerun the experiments where you encountered issues?

@kleiba

kleiba commented Sep 6, 2016

Is it necessary to have the exact versions of the dependencies installed, or are newer versions ok as well?

@hotloo
Contributor

hotloo commented Sep 6, 2016

@kleiba Hi! I would assume that newer versions of the dependencies should work, provided no breaking changes have been introduced. However, given the nature of the compiled code from Theano and the other libraries, I would recommend trying the exact versions.

If possible, could you also share your updated environment file once you get it working? Cheers!

@hotloo hotloo closed this as completed Sep 6, 2016
@hotloo hotloo reopened this Sep 6, 2016
@kleiba

kleiba commented Sep 7, 2016

@hotloo Hi! I'm not using conda, but the following are the version numbers of the packages in my local installation. I ran the "MNIST 1000 labels -- Full" example from the README as a test and as far as I can tell, the training went through without any issues (however, interestingly, the test error is even lower than the one reported in one of your papers, 0.75).

dependencies:

  • h5py=2.6.0
  • matplotlib=1.5.2
  • nomkl -> not installed
  • openblas=0.2.8 (!)
  • numpy=1.11.1
  • pandas=0.18.1
  • pytables=3.2.3.1
  • python=2.7.6 (!)
  • scipy=0.18.0
  • pip=8.1.2
  • pip:
    • git+git://github.com/Theano/Theano.git@rel-0.8.2
    • git+git://github.com/mila-udem/fuel.git@0.2.0
    • git+git://github.com/mila-udem/blocks.git@0.2

@hotloo
Contributor

hotloo commented Sep 7, 2016

@kleiba Ha! Glad to hear that you got it working!

Indeed, if I remember correctly, the paper reports the average of 5 runs with different seeds. I am closing this ticket now since you have reproduced the results!

@hotloo hotloo closed this as completed Sep 7, 2016
@kleiba

kleiba commented Sep 7, 2016

But isn't it a bit strange that the test error varies that much?

@hotloo
Contributor

hotloo commented Sep 7, 2016

@kleiba True. Would it be possible to do 10 runs with seeds 1 to 10? That should tell us how reliable the results are.

@hotloo hotloo reopened this Sep 7, 2016
@kleiba

kleiba commented Sep 8, 2016

Sure thing. So, you want me to use 1,2,...,10 as seed values, or 10 random seeds?

@hotloo
Contributor

hotloo commented Sep 8, 2016

Yeah, 1-10 would be nice. :)

@kleiba

kleiba commented Sep 8, 2016

Ah, shoot -- I was a bit too fast then. I've started 10 jobs with the following seeds:

1336129658, 2139292564, 1024194972, 1015193191, 755118383, 1238574728, 1490285678, 902708816, 1963117705, 1043170902

I hope that's okay for repeatability, or else I can also do 1,...,10.

@hotloo
Contributor

hotloo commented Sep 8, 2016

I think it should do its job! Thanks for all the time and effort that you put into this! Cheers!

@kleiba

kleiba commented Sep 8, 2016

No way, thank you for your support!

@kleiba

kleiba commented Sep 9, 2016

Okay, so the results are in. As I wrote above, I ran the "MNIST all labels / Full" example from the README file. For the 10 seed runs, I hence used the following command:

run.py train --encoder-layers 1000-500-250-250-250-10 --decoder-spec gauss --denoising-cost-x 1000,1,0.01,0.01,0.01,0.01,0.01 --labeled-samples 60000 --unlabeled-samples 60000 --seed <seed> -- mnist_all_full

where <seed> was one of the ten values posted above. Running run.py evaluate results/xxx on each of the resulting model directories yielded the following test errors:

<seed>       Test error
1336129658   0.640000
2139292564   0.670000
1024194972   0.650000
1015193191   0.700000
755118383    0.610000
1238574728   0.760000
1490285678   0.750000
902708816    0.560000
1963117705   0.690000
1043170902   0.710000

Averaged over all runs, this gives us a test error of 0.674.
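A quick sanity check on the numbers above (plain Python, nothing repo-specific), which also shows the spread across seeds:

```python
# Recompute the mean and spread of the ten test errors listed above.
from statistics import mean, pstdev

errors = [0.640, 0.670, 0.650, 0.700, 0.610,
          0.760, 0.750, 0.560, 0.690, 0.710]

print(round(mean(errors), 3))  # 0.674
print(pstdev(errors))          # roughly 0.059 standard deviation across seeds
```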

From the "Semi-Supervised Learning with Ladder Networks" paper, I would have expected something closer to the 0.57 reported in Table 1 on page 10 of that paper.

(I had previously thought that the numbers I got were better than reported (see comments above), but that was my mistake since I had compared my results to the wrong column in said paper.)

Any idea why my numbers are worse? Out of the 10 runs I did, it seems that only one comes close to your 0.57, all others being substantially worse.

Thanks!

@hotloo
Contributor

hotloo commented Sep 9, 2016

@kleiba Interesting! Thanks for your effort here. Let me double check if we are experiencing some regressions somewhere. I will report back once I have some results.

@Chromer163

@kleiba @hotloo
Hmm, I just don't know how to get the test error. I have run the MNIST 100 labeled-samples experiment, but I am not sure what the parameters in the results mean, such as:

1
Saving to results/mnist_100_full1/trained_params
e 150, i 75000:V_C_class 0.0954, V_E 1.43, V_C_de 0.00561 0.0635 0.927 0.365 0.164 0.0549 0.0352, T_C_de 0.00544 0.0608 0.926 0.362 0.162 0.0519 0.0316, T_C_class 0.000141, VF_C_class 0.0944, VF_E 1.41, VF_C_de 0.00561 0.0636 0.927 0.365 0.163 0.0531 0.033
valid_final_error_rate_clean 1.41
Took 55.3 minutes

thanks.
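For reference, those progress lines are just comma-separated "NAME value [value ...]" pairs after the "e <epoch>, i <iteration>:" prefix. A small hypothetical parser (not part of the repo) makes them easier to inspect; note that reading the V_/T_/VF_ prefixes as validation, training, and final-validation metrics is an assumption:

```python
def parse_log_line(line):
    """Parse one run.py progress line into {metric name: [float values]}."""
    # Split off the "e <epoch>, i <iteration>:" prefix.
    _prefix, _, body = line.partition(":")
    metrics = {}
    for token in body.split(","):
        parts = token.split()
        if len(parts) >= 2:
            metrics[parts[0]] = [float(v) for v in parts[1:]]
    return metrics

line = ("e 150, i 75000:V_C_class 0.0954, V_E 1.43, "
        "T_C_class 0.000141, VF_C_class 0.0944, VF_E 1.41")
print(parse_log_line(line)["V_E"])  # [1.43]
```

Under that reading, `VF_E 1.41` (and `valid_final_error_rate_clean 1.41`) would be the final validation error rate in percent.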
