
Conversation

damienlancry (Contributor) commented Jun 17, 2019

I created an example script trying to reproduce the results of Deep Bayesian Active Learning with Image Data using modAL.
I used this Keras code from one of the authors.
I cannot think of anything I am doing differently, and yet their code works and mine does not.
For the acquisition function, instead of using their modified Keras, I used Yarin Gal's (the first author's) implementation.
Can you spot any mistake in my code?
EDIT: I actually found a mistake in my code: I was not really computing the entropy but rather the other half of the BALD function. I fixed this mistake and am currently rerunning the code.
EDIT 2: Still not working.
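
For concreteness, here is a minimal sketch of what the entropy acquisition is meant to compute (this is not the PR code: it assumes a tf.keras model where calling with training=True keeps dropout active, and the `learner.estimator.model` attribute and `n_query` parameter are assumptions):

```python
import numpy as np

def max_entropy(learner, X_pool, T=100, n_query=10):
    """Score pool points by the entropy of the mean MC-dropout prediction."""
    # T stochastic forward passes; training=True keeps dropout active (tf.keras).
    probas = np.stack([learner.estimator.model(X_pool, training=True).numpy()
                       for _ in range(T)])        # shape (T, n_pool, n_classes)
    mean_proba = probas.mean(axis=0)              # predictive distribution per point
    # Entropy of the mean prediction, not the mean of the per-pass entropies
    # (the latter is the other half of the BALD objective).
    entropy = -(mean_proba * np.log(mean_proba + 1e-10)).sum(axis=-1)
    query_idx = np.argsort(entropy)[-n_query:]    # highest-entropy points
    return query_idx, X_pool[query_idx]
```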

codecov-io commented Jun 17, 2019

Codecov Report

Merging #48 into dev will not change coverage.
The diff coverage is n/a.


@@           Coverage Diff           @@
##              dev      #48   +/-   ##
=======================================
  Coverage   97.17%   97.17%           
=======================================
  Files          31       31           
  Lines        1629     1629           
=======================================
  Hits         1583     1583           
  Misses         46       46

Continue to review full report at Codecov.

Powered by Codecov. Last update 3c01821...300a518. Read the comment docs.

damienlancry (Contributor, Author) commented Jun 17, 2019

It seems to be working much better now with a pool2d of size 5 (I was doing pool2d((2,2)) in my code).
Edit: it's better with max_entropy, but it's equally better with random acquisition, so no improvement...

cosmic-cortex (Member)

Thanks! At first glance I don't know what might be wrong, so I'll take a detailed look ASAP, hopefully today! I'll also merge the PR then.

print('Accuracy after query {n}: {acc:0.4f}'.format(n=index + 1, acc=model_accuracy))
perf_hist = [model_accuracy]

np.save('/home/damien/Results/keras_modal_entropy.npy', perf_hist)
Member

Hardcoded path, should be removed eventually!

Contributor (Author)

Oh yes, sure, my bad.

Member

np :)

cosmic-cortex (Member) commented Jun 19, 2019

I have checked the code, along with Yarin Gal's implementation. What is missing from his implementation is the actual training part, which might be crucial. Here, when you call
learner.teach(X_pool[query_idx], y_pool[query_idx], epochs=50, batch_size=128, verbose=0),
you actually append the new training instances to the old ones and run the training on all of the data. This is not a problem for classical methods such as those in scikit-learn, because every call to .fit() retrains the model from scratch. However, this is not the case for neural networks in TensorFlow or PyTorch: there, training continues from the current state, so in effect, after say the 100th query, the initial data has been shown 100*n_epoch times, while the last query has been shown only once. This can create an imbalance. To solve this, pass the only_new=True argument to the .teach() method of the ActiveLearner. With this, the model is trained on the new data only.

So, I started to experiment with this; I'll let you know the results!

(Also, I have pointed out a hardcoded path in the code that should be removed eventually.)
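
For reference, the suggested call would look like this (same arguments as in the example, with only_new added):

```python
# Fit only on the newly queried instances; without only_new=True, the Keras
# model continues training on the full accumulated set from its current state.
learner.teach(
    X_pool[query_idx], y_pool[query_idx],
    only_new=True,
    epochs=50, batch_size=128, verbose=0,
)
```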

damienlancry (Contributor, Author) commented Jun 19, 2019

> (quoting cosmic-cortex's comment above)

Yes, there is no training in Yarin Gal's code, just an acquisition example. On the other hand, there is training in Riashat Islam's code; I find that implementation very messy, but it works.
OK, let's try with only_new=True, but I do not think that is what is recommended in the paper. I think the paper suggests training from scratch after every acquisition; I thought that was what I was doing, but it wasn't. So I'm going to try this next.

Btw, to this end, maybe a _fit_from_scratch method could be useful; what do you think?
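
For illustration, a hypothetical sketch of what such a method might wrap (build_fn is an assumed function returning a fresh, compiled Keras model, and the bookkeeping is done outside modAL here):

```python
import numpy as np

def fit_from_scratch(build_fn, X_train, y_train, X_new, y_new, **fit_kwargs):
    """Append the new labels, rebuild the net with fresh weights, refit on everything."""
    X_train = np.concatenate([X_train, X_new])
    y_train = np.concatenate([y_train, y_new])
    model = build_fn()                    # fresh weights on every acquisition
    model.fit(X_train, y_train, **fit_kwargs)
    return model, X_train, y_train
```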

Also, I think there might be a mistake in my max_entropy query strategy: I first take a random subset of the pool, then evaluate the acquisition function on that subset, and then take the argmax indices within the subset, so they are not the right pool indices. I'm working on fixing that too.
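
The fix boils down to mapping the subset-space argmax back to pool coordinates; a rough sketch (the helper name, subset size, and acquisition callable are placeholders):

```python
import numpy as np

def query_on_subset(acquisition, X_pool, subset_size=2000, n_instances=10):
    # Score only a random subset of the pool to keep the acquisition cheap.
    subset_idx = np.random.choice(len(X_pool), size=subset_size, replace=False)
    scores = acquisition(X_pool[subset_idx])            # scores live in subset space
    best_in_subset = np.argsort(scores)[-n_instances:]  # argmax within the subset
    return subset_idx[best_in_subset]                   # translate back to pool indices
```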

damienlancry (Contributor, Author) commented Jun 19, 2019

OK, I fixed the max_entropy acquisition function and it worked! I think this is ready to be merged now!
[image: dbal-modal results plot]

cosmic-cortex merged commit 300a518 into modAL-python:dev on Jun 19, 2019
cosmic-cortex (Member)

Cool! I have merged the PR, thank you! Also, I propose implementing the acquisition functions directly in modAL as a feature, not just as a custom query strategy in the example. I have just created the feature/bayesianDL branch for this purpose.

One challenge would be to write these functions in a backend-agnostic way, which may be difficult. I'll take a shot tomorrow; feel free to contribute if you are interested!
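
One possible shape for this (a sketch, not the eventual modAL API): keep the acquisition math in pure NumPy over a (T, n_samples, n_classes) array of MC-dropout probabilities, and leave producing that array to a backend-specific callable. For example, BALD in that style (the function name and shape convention are assumptions):

```python
import numpy as np

def bald_scores(mc_probas, eps=1e-10):
    """mc_probas: (T, n_samples, n_classes) MC-dropout class probabilities."""
    mean = mc_probas.mean(axis=0)
    # Entropy of the mean prediction ...
    H = -(mean * np.log(mean + eps)).sum(axis=-1)
    # ... minus the mean entropy of the individual passes: the mutual information.
    E_H = -(mc_probas * np.log(mc_probas + eps)).sum(axis=-1).mean(axis=0)
    return H - E_H
```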

Thanks again for the PR!

damienlancry (Contributor, Author)

I am totally interested in contributing!

damienlancry mentioned this pull request on Jun 21, 2019