
Inquiry about "CacheableDataset from wrapper" #165

Closed · AhmedHussKhalifa opened this issue Oct 10, 2021 · 3 comments

@AhmedHussKhalifa commented Oct 10, 2021

Hey,

I am trying to tailor your framework to accommodate my experiments on ImageNet. One of my experiments is to implement the paper Knowledge distillation: A good teacher is patient and consistent, where data augmentation is applied on both the teacher and student sides. Each model receives different input data, so I expect I would need to create a separate dataloader for each of them. I will also keep the augmentation for each image fixed so that I can store the teacher's output vectors on my hard drive (SSD) using the default_idx2subpath function. However, I found that saving the data to the SSD, compared to just loading the images, made the training runs slower.
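Roughly, the kind of dataset wrapper I have in mind looks like the sketch below (just my own rough draft; the class name TeacherStudentViews and its layout are placeholders, not anything from torchdistill):

    from torch.utils.data import Dataset
    from torchvision import datasets

    class TeacherStudentViews(Dataset):
        """Wrap an ImageFolder and return two differently augmented views of the
        same image: one for the teacher and one for the student."""

        def __init__(self, root, teacher_transform, student_transform):
            self.base = datasets.ImageFolder(root)  # no transform here: keep the raw PIL image
            self.teacher_transform = teacher_transform
            self.student_transform = student_transform

        def __len__(self):
            return len(self.base)

        def __getitem__(self, index):
            image, target = self.base[index]
            # The index is returned as well so teacher outputs could be cached per sample.
            return self.teacher_transform(image), self.student_transform(image), target, index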

I am trying to train a student model (ResNet-18) on ImageNet using 2 GPUs, with a teacher model (ResNet-34). Could you recommend the best setup for this pipeline to make the code run faster given the current resources?

Another question: if I want to train with a cosine scheduler (CosineAnnealingLR) and the Adam optimizer, which method should I change?

@yoshitomo-matsubara (Owner) commented Oct 11, 2021

Hi @AhmedHussKhalifa

Thank you for the inquiry.

Another question: if I want to train with a cosine scheduler (CosineAnnealingLR) and the Adam optimizer, which method should I change?

For instance, you can replace the following optimizer and scheduler entries in a YAML file

  optimizer:
    type: 'SGD'
    params:
      lr: 0.1
      momentum: 0.9
      weight_decay: 0.0005
  scheduler:
    type: 'MultiStepLR'
    params:
      milestones: [5, 15]
      gamma: 0.1

with

  optimizer:
    type: 'Adam'
    params:
      lr: 0.1
  scheduler:
    type: 'CosineAnnealingLR'
    params:
      T_max: 100
      eta_min: 0
      last_epoch: -1
      verbose: False

following the official PyTorch documentation (assuming you set num_epochs: 100 in the YAML file, hence T_max: 100).
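In plain PyTorch, that replacement corresponds roughly to the following (an illustrative sketch with a dummy model, not torchdistill's actual training loop):

    from torch import nn, optim

    model = nn.Linear(10, 2)  # dummy model just for illustration
    num_epochs = 100

    # type: 'Adam', params: {lr: 0.1}
    optimizer = optim.Adam(model.parameters(), lr=0.1)

    # type: 'CosineAnnealingLR', with T_max matching num_epochs
    scheduler = optim.lr_scheduler.CosineAnnealingLR(
        optimizer, T_max=num_epochs, eta_min=0, last_epoch=-1)

    for epoch in range(num_epochs):
        # ... run the training steps for this epoch (optimizer.step() per batch) ...
        scheduler.step()  # anneal the learning rate once per epoch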

As for reimplementing the method you mentioned, I'll need some time to get a big-picture view of the procedure.

In the meantime, could you close this issue (and #154) and start a new discussion in Discussions? As explained in the README, I would like to gather questions and feature requests in Discussions and bug reports in Issues for now.

Thank you

@yoshitomo-matsubara (Owner)

@AhmedHussKhalifa Closing this issue as I haven't seen any follow-up for a while.
Please open a new Discussion (not an Issue) if you still have questions.

@AhmedHussKhalifa (Author)

Thank you @yoshitomo-matsubara.
