
training process hang! #11

Closed
bigwhite opened this issue Jan 23, 2017 · 6 comments

Comments

@bigwhite

I started the training. It ran OK for a short time, but when the training loop reached epoch 4, it hung and never recovered.

A snapshot is below:

[screenshot: training-hang-snapshot]

The environment is the same as in issue #10.

@bigwhite
Author

Does the "hang status" mean training is over?

@showforj

Friend, how did you solve the "'Cycler' object has no attribute 'by_key'" problem?

@martin-gorner
Owner

I think your training simply finished, yes.
Look towards the end of the sample files for an iterations=XXXXX parameter; that is where you change the number of iterations.
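
For reference, this is roughly the kind of call to look for near the bottom of each sample file (a minimal sketch; the datavis.animate helper name and the argument values shown here are illustrative and may differ slightly from the actual files):

    # Hypothetical sketch of the training entry point at the end of a sample file.
    # The iterations=... argument controls how long training runs; raise it to train longer.
    datavis.animate(training_step,           # function that runs one training step
                    iterations=10000 + 1,    # total number of training iterations
                    train_data_update_freq=10,
                    test_data_update_freq=50,
                    more_tests_at_start=True)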

@martin-gorner
Owner

@showforj update the cycler module
pip3 install --upgrade cycler

@martin-gorner
Owner

@showforj
Correction: pip3 install --upgrade matplotlib
The reason is that on Linux you need to install matplotlib with apt-get, because that also pulls in the graphics backend on which matplotlib relies.
Unfortunately, the matplotlib version you get that way is not the freshest.
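
After upgrading, a quick sanity check (a minimal sketch; the exact versions the tutorial needs are not stated here) can confirm which matplotlib and cycler Python actually picks up, and that cycler now provides the by_key() method the error complained about:

    # Minimal check that the upgraded packages are the ones Python imports.
    import matplotlib
    import cycler

    print("matplotlib", matplotlib.__version__)
    print("cycler", cycler.__version__)

    # The original error was matplotlib calling Cycler.by_key(); it exists in cycler >= 0.10.
    print("by_key available:", hasattr(cycler.Cycler, "by_key"))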

@bigwhite
Author

bigwhite commented Feb 6, 2017

I have seen the comments in each sample file:

//mnist_1.0_softmax.py
# final max test accuracy = 0.9268 (10K iterations). Accuracy should peak above 0.92 in the first 2000 iterations.

//mnist_2.0_five_layers_sigmoid.py
# Some results to expect:
# (In all runs, if sigmoids are used, all biases are initialised at 0, if RELUs are used,
# all biases are initialised at 0.1 apart from the last one which is initialised at 0.)

## learning rate = 0.003, 10K iterations
# final test accuracy = 0.9788 (sigmoid - slow start, training cross-entropy not stabilised in the end)
# final test accuracy = 0.9825 (relu - above 0.97 in the first 1500 iterations but noisy curves)

## now with learning rate = 0.0001, 10K iterations
# final test accuracy = 0.9722 (relu - slow but smooth curve, would have gone higher in 20K iterations)

## decaying learning rate from 0.003 to 0.0001 decay_speed 2000, 10K iterations
# final test accuracy = 0.9746 (sigmoid - training cross-entropy not stabilised)
# final test accuracy = 0.9824 (relu - training set fully learned, test accuracy stable)
... ...
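
For context, the "decaying learning rate from 0.003 to 0.0001, decay_speed 2000" mentioned in those comments corresponds to an exponential schedule along these lines (a sketch using the numbers from the comments; the exact expression in the sample files may be written differently):

    import math

    lrmax, lrmin, decay_speed = 0.003, 0.0001, 2000

    def learning_rate(i):
        # Exponentially decay from lrmax towards lrmin as the iteration count i grows.
        return lrmin + (lrmax - lrmin) * math.exp(-i / decay_speed)

    # e.g. learning_rate(0) == 0.003, learning_rate(2000) ~= 0.00117, learning_rate(10000) ~= 0.00012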

@martin-gorner thanks

@bigwhite bigwhite closed this as completed Feb 6, 2017