
Regarding the "reinitialization of networks" #5

Closed
hockman1 opened this issue Dec 19, 2020 · 2 comments

@hockman1

Hi, I refer to Fig 4 of the main paper, which states: "As a control, also shown is a variant of generative replay whereby the networks are reinitialized before each new task/episode". By reinitialization, does this mean that after the model has been trained on task 1, we create a "brand new" model with the exact same architecture and train it on the next task? If so, how different is that from continuing to train the current model? Also, is it possible that in other cases there is no performance difference with and without reinitialization? I tried this on other types of datasets and the two curves largely overlap rather than showing that one is better than the other.

@GMvandeVen
Owner

Hi, the way you describe the “reinitialization” sounds correct to me. For the brown curves in Fig 4, after each task/episode, the trained current model was indeed replaced with a “brand new” model with the same architecture, which was then trained on the next task. (But the replayed samples were still generated by the model trained on the previous task.) In the code this is controlled with the option --reinit.
The only difference compared to continuing training from the current model is a difference in initialization: when continuing training from the current model you start the new task with an initialization optimized for the previous tasks, while with the --reinit option you start the new task with a random initialization. So, what Fig 4 shows is that if you start with an initialization optimized for the previous tasks, less (good) replay is needed than if you start with a random initialization. (“Not forgetting is easier than learning from scratch.”)
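To make the distinction concrete, here is a minimal control-flow sketch of generative replay with and without reinitialization. All names here (`new_model`, `train_on`, the dictionary-based "model") are hypothetical placeholders for illustration only, not code from this repository; the dummy weight update stands in for actual training.

```python
import copy
import random

def new_model():
    """Stand-in for constructing a network with freshly randomized weights."""
    return {"weights": [random.gauss(0.0, 0.1) for _ in range(4)]}

def train_on(model, task_data, replay_data):
    """Placeholder for training on one task plus replayed samples (dummy update)."""
    for _ in list(task_data) + list(replay_data):
        model["weights"] = [w + 0.01 for w in model["weights"]]
    return model

def generative_replay(tasks, reinit=False):
    model = new_model()
    previous = None  # frozen copy of the last trained model, used to generate replay
    for task in tasks:
        # Replayed samples always come from the model trained on the previous task.
        replay = [] if previous is None else ["replayed"] * len(task)
        if reinit and previous is not None:
            # --reinit: discard the learned weights and restart from a random init.
            model = new_model()
        train_on(model, task, replay)
        previous = copy.deepcopy(model)
    return model
```

The design point the sketch highlights: with `reinit=True`, knowledge of earlier tasks can only reach the new model through the replayed samples, whereas with `reinit=False` it is also carried forward in the weights themselves.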
It would surprise me if it is not possible to show similar results on other types of dataset. Do you want to share the results you got? If you prefer you could also email them. Thanks!

@hockman1
Author

Thanks! I am working on a Keras implementation of generative replay, so there might be reproducibility issues with Keras; it could also be that I didn't average over enough runs to get a statistically significant result, or it could be a problem with my code :P Anyway, thanks!
