Hi, I refer to Fig 4 of the main paper, where it states "As a control, also shown is a variant of generative replay whereby the networks are reinitialized before each new task/episode". By reinitialization, does this mean that after the model has been trained on task 1, we create a "brand new" model with the exact same architecture before training on the next task? If so, how different is that from continuing training from the current model? Is it likely that in other cases there will be no difference in performance with and without reinitialization? I tried this on other types of datasets, and the two curves roughly overlap rather than showing that one is clearly better than the other.
Hi, the way you describe the “reinitialization” sounds correct to me. For the brown curves in Fig 4, after each task/episode, the trained current model was indeed replaced with a “brand new” model with the same architecture, which was then trained on the next task. (But the replayed samples were still generated by the model trained on the previous task.) In the code this is controlled with the option --reinit.
The only difference compared to continuing training from the current model is a difference in initialization: when continuing training from the current model you start the new task with an initialization optimized for the previous tasks, while with the --reinit option you start the new task with a random initialization. So, what Fig 4 shows is that if you start with an initialization optimized for the previous tasks, less (good) replay is needed than if you start with a random initialization. (“Not forgetting is easier than learning from scratch.”)
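The two conditions can be sketched in a minimal, framework-free training loop. All names here (`make_model`, `train_on`, `generate_replay`) are illustrative stand-ins, not the actual functions or the real `--reinit` machinery in this repository; the point is only to show where reinitialization happens relative to replay generation:

```python
import random

def make_model(seed=None):
    """Return a freshly (randomly) initialized toy 'model': just a weight list."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 0.1) for _ in range(4)]

def train_on(model, data, replay=None):
    """Placeholder for training on one task; mutates the model in place."""
    for x in data + (replay or []):
        for i in range(len(model)):
            model[i] += 0.01 * x  # toy 'gradient update'
    return model

def generate_replay(prev_model):
    """Placeholder for samples produced by the previous task's generator."""
    return [sum(prev_model)] * 2

def continual_train(tasks, reinit=False):
    model = make_model(seed=0)
    prev_model = None
    for data in tasks:
        # In BOTH conditions, replay is produced by the model trained on
        # the previous task.
        replay = generate_replay(prev_model) if prev_model is not None else None
        if reinit:
            # Control condition: discard the trained weights and start the
            # new task from a fresh random initialization.
            model = make_model()
        train_on(model, data, replay)
        prev_model = list(model)  # frozen copy used for replay on the next task
    return model
```

With `reinit=False`, training on each new task starts from weights optimized for the previous tasks; with `reinit=True`, it starts from scratch, and only the replayed samples carry information forward.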
It would surprise me if it is not possible to show similar results on other types of datasets. Do you want to share the results you got? If you prefer, you could also email them. Thanks!
Thanks! This is because I am working on a Keras implementation of generative replay, and there might be reproducibility issues with Keras. It could also be that I didn't do sufficient averaging over the few runs to get a statistically significant result, or it could be a problem with my code :P Anyway, thanks!