Hi, I refer to Fig 4 of the main paper, where it states "As a control, also shown is a variant of generative replay whereby the networks are reinitialized before each new task/episode". By reinitialization, does this mean that after the model has been trained on task 1, we create a "brand new" model with the exact same architecture before training on the next task? If so, how different is that from continuing training from the current model? Is it likely that in other cases there will be no difference in performance with and without reinitialization? I tried this on other types of datasets, and the two curves roughly overlap rather than showing that one is clearly better than the other.
Hi, the way you describe the “reinitialization” sounds correct to me. For the brown curves in Fig 4, after each task/episode, the trained current model was indeed replaced with a “brand new” model with the same architecture, which was then trained on the next task. (But the replayed samples were still generated by the model trained on the previous task.) In the code this is controlled with the option --reinit.
The only difference compared to continuing training from the current model is a difference in initialization: when continuing training from the current model you start the new task with an initialization optimized for the previous tasks, while with the --reinit option you start the new task with a random initialization. So, what Fig 4 shows is that if you start with an initialization optimized for the previous tasks, less (good) replay is needed than if you start with a random initialization. (“Not forgetting is easier than learning from scratch.”)
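The two conditions can be sketched in a minimal, framework-free training loop. All names here (`make_model`, `train_on`, `generate_replay`) are illustrative stand-ins, not the actual functions or the real `--reinit` machinery in this repository; the point is only to show where reinitialization happens relative to replay generation:

```python
import random

def make_model(seed=None):
    """Return a freshly (randomly) initialized toy 'model': just a weight list."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 0.1) for _ in range(4)]

def train_on(model, data, replay=None):
    """Placeholder for training on one task; mutates the model in place."""
    for x in data + (replay or []):
        for i in range(len(model)):
            model[i] += 0.01 * x  # toy 'gradient update'
    return model

def generate_replay(prev_model):
    """Placeholder for samples produced by the previous task's generator."""
    return [sum(prev_model)] * 2

def continual_train(tasks, reinit=False):
    model = make_model(seed=0)
    prev_model = None
    for data in tasks:
        # In BOTH conditions, replay is produced by the model trained on
        # the previous task.
        replay = generate_replay(prev_model) if prev_model is not None else None
        if reinit:
            # Control condition: discard the trained weights and start the
            # new task from a fresh random initialization.
            model = make_model()
        train_on(model, data, replay)
        prev_model = list(model)  # frozen copy used for replay on the next task
    return model
```

With `reinit=False`, training on each new task starts from weights optimized for the previous tasks; with `reinit=True`, it starts from scratch, and only the replayed samples carry information forward.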
It would surprise me if it is not possible to show similar results on other types of datasets. Do you want to share the results you got? If you prefer, you could also email them. Thanks!
Thanks! This is because I am working on a Keras implementation of generative replay, and there might be reproducibility issues with Keras. It could also be that I didn't do sufficient averaging over the few runs to get a statistically significant result, or it could be a problem with my code :P Anyway, thanks!