Addition of 2 methods and of 2 examples #10
Conversation
Jeremie, fantastic, thank you, especially for the added examples. Will review this week!
Dear Milan, I made some modifications to the method fit: the mini-batches are now created at random, as you proposed initially (my implementation was too simple). Question: when there are multiple CAF images, each image will go through (approximately) the complete training set, so each epoch will effectively process #images whole training datasets. Should the mini-batch loop divide by the number of images, so that one epoch is based on only one pass through the training set? Or am I missing something? Thank you.
Quasi-random selection of mini-batches is meant to mimic the "stochastic" part of Stochastic Gradient Descent. Note that random selection is currently not implemented anywhere in the library, but it can be implemented in the client code (like example_mnist.f90):
```fortran
im = size(x, dim=2)    ! mini-batch size
nm = size(self % dims) ! number of layers

! get start and end index for mini-batch
indices = tile_indices(im)
is = indices(1)
ie = indices(2)

call db_init(db_batch, self % dims)
call dw_init(dw_batch, self % dims)

do concurrent(i = is:ie)
  call self % fwdprop(x(:,i))
  call self % backprop(y(:,i), dw, db)
  do concurrent(n = 1:nm)
    dw_batch(n) % array = dw_batch(n) % array + dw(n) % array
    db_batch(n) % array = db_batch(n) % array + db(n) % array
  end do
end do

if (num_images() > 1) then
  call dw_co_sum(dw_batch)
  call db_co_sum(db_batch)
end if
```
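Such client-side random mini-batch selection can be sketched as follows. This is a minimal, self-contained illustration, assuming a Fisher-Yates shuffle of sample indices; the array sizes and the batch loop are illustrative and not part of the library:

```fortran
program random_batches
  ! Sketch: draw mini-batches in random order on the client side.
  ! num_samples and batch_size here are illustrative assumptions.
  implicit none
  integer, parameter :: num_samples = 10, batch_size = 2
  integer :: indices(num_samples)
  integer :: i, j, k, tmp
  real :: r

  indices = [(i, i = 1, num_samples)]

  ! Fisher-Yates shuffle of the sample indices
  do i = num_samples, 2, -1
    call random_number(r)       ! r in [0, 1)
    j = 1 + int(r * i)          ! j in 1..i
    tmp = indices(i); indices(i) = indices(j); indices(j) = tmp
  end do

  ! Consume the shuffled indices in consecutive mini-batches
  do k = 1, num_samples / batch_size
    print *, 'batch', k, ':', indices((k-1)*batch_size+1 : k*batch_size)
    ! a real client would call fwdprop/backprop on
    ! x(:, indices(...)) and y(:, indices(...)) here
  end do
end program random_batches
```

Shuffling the index array once per epoch and then slicing it consecutively gives each sample exactly one visit per epoch, in random order.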
Thank you @milancurcic for your answers. I indeed read the code too quickly. Well done!
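For readers following the snippet above, the `tile_indices` call partitions the sample range across parallel images. A self-contained sketch of how such a helper might divide the work (this is an assumed reimplementation for illustration, not the library's actual code):

```fortran
program tile_demo
  ! Sketch of a tile_indices-style helper that splits 1..n into
  ! contiguous tiles across parallel images; remainder samples go
  ! to the leading images. Assumed logic, for illustration only.
  implicit none
  integer :: bounds(2)
  integer :: img

  do img = 1, 4
    bounds = tile_bounds(10, img, 4)
    print *, 'image', img, 'handles samples', bounds(1), 'to', bounds(2)
  end do

contains

  pure function tile_bounds(n, image, nimages) result(indices)
    integer, intent(in) :: n, image, nimages
    integer :: indices(2)
    integer :: tile, remainder
    tile = n / nimages
    remainder = mod(n, nimages)
    indices(1) = (image - 1) * tile + min(image - 1, remainder) + 1
    indices(2) = indices(1) + tile - 1
    if (image <= remainder) indices(2) = indices(2) + 1
  end function tile_bounds

end program tile_demo
```

With 10 samples over 4 images this yields the contiguous tiles 1-3, 4-6, 7-8, 9-10, so each image processes only its own slice of the mini-batch before the gradients are summed with `co_sum`.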
Hi Jeremie, I mostly like your additions. A few suggestions:
Dear Milan, I changed the names as you suggested.
* Addition of the methods "fit" and "predict": the method fit trains the model for n epochs with m batches (the batches are now selected consecutively instead of randomly; I am not sure why random selection was used); the method predict returns the predicted output (y).
* Modification of example_mnist.f90 to use "fit" and "predict".
* Addition of 2 examples using real data that is publicly available and was used in 2 scientific studies (see Montesinos-Lopez et al. (2018), G3).
* ISSUE: Activations other than 'sigmoid' do not seem to work properly.
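The epoch and consecutive mini-batch structure described in the first bullet can be sketched in a self-contained way. The sizes and the print placeholder are illustrative assumptions; a real fit would run fwdprop/backprop over each batch:

```fortran
program epoch_loop_sketch
  ! Sketch of training for num_epochs epochs, with mini-batches
  ! selected consecutively from the training set. Sizes are
  ! illustrative; the gradient step is replaced by a print.
  implicit none
  integer, parameter :: num_samples = 8, batch_size = 4, num_epochs = 2
  integer :: epoch, batch, is, ie

  do epoch = 1, num_epochs
    do batch = 1, num_samples / batch_size
      is = (batch - 1) * batch_size + 1   ! first sample of this batch
      ie = batch * batch_size             ! last sample of this batch
      print *, 'epoch', epoch, 'batch', batch, 'samples', is, '-', ie
      ! a real fit would run fwdprop/backprop over samples is..ie
      ! and apply the accumulated gradients after each batch
    end do
  end do
end program epoch_loop_sketch
```

Consecutive selection visits every sample exactly once per epoch; combining it with a per-epoch shuffle of the indices would recover the stochastic behavior discussed earlier in the thread.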