How to save ONLY student ckpt? #594

Open
MR-hyj opened this issue Oct 11, 2023 · 4 comments

Comments

MR-hyj commented Oct 11, 2023

Checklist

  • I have searched related issues but cannot get the expected help.
  • I have read related documents and don't know what to do.

Describe the question you meet

I'm working on a yolov8 distillation project, which involves:

  • training a yolov8 teacher model, i.e. yolov8_teacher_cfg.py (done)
  • training a yolov8 student model, i.e. yolov8_student_cfg.py (done)
  • test-converting the student model from .pth to .onnx using mmdeploy (done)
  • distilling the student (done)
  • converting the distilled student from .pth to .onnx using mmdeploy

I figured out how to write a custom config, i.e. distill_cfg.py, to distill the student, and it worked.

In order to convert the distilled student model to .onnx, I assume I need one of the following:

  • mmrazor saves ONLY the distilled student model, or it saves the distilled student and the teacher in SEPARATE .pth files.

However, it seems that mmrazor saves both teacher and student meta info in its ckpt, i.e. mmrazor_distill.pth.
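One way to confirm this is to group the checkpoint's state-dict keys by their top-level prefix. The sketch below assumes the checkpoint is a dict whose `state_dict` entry uses keys starting with `architecture.` (student) or `teacher.`. The helper name `count_key_prefixes` and the sample keys are made up for illustration; a real file would first be loaded with `torch.load(path, map_location="cpu")`.

```python
from collections import Counter


def count_key_prefixes(checkpoint):
    """Count state-dict entries per top-level module name."""
    # Some checkpoints are a bare state dict; others nest it under "state_dict".
    state_dict = checkpoint.get("state_dict", checkpoint)
    return Counter(key.split(".", 1)[0] for key in state_dict)


# Toy stand-in for the contents of mmrazor_distill.pth (keys are illustrative).
toy_ckpt = {
    "meta": {},
    "state_dict": {
        "architecture.backbone.conv.weight": None,
        "architecture.bbox_head.conv.weight": None,
        "teacher.backbone.conv.weight": None,
    },
}
print(count_key_prefixes(toy_ckpt))
```

If both `architecture` and `teacher` show up in the counts, the file holds the weights of both models.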

Another problem (maybe easy to solve): in mmrazor, the student and teacher models are named architecture and teacher, respectively. In my case, it would be very helpful if the distilled student were saved under model instead of architecture.

Thank you for any suggestion!

Veccoy commented May 7, 2024

Hi, I have the same problem. How can I get the .pth file for the student model ONLY?

I made the same observation as @MR-hyj: the final checkpoint at the end of a distillation training run is about as large as the separate student and teacher checkpoint files combined. I guess that is more convenient for resuming training, but it would be helpful to have access to a checkpoint of the student only.

MR-hyj commented May 7, 2024

Suppose the model used for training consists of a teacher named teacher and a student named architecture; these can be accessed as model.teacher and model.architecture. I think the easiest way to save the student model separately is to write a custom hook, say, SaveStudentOnlyHook.
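A minimal sketch of such a hook follows. It is written as a plain class so that it runs without MMEngine installed; in a real setup it would presumably subclass `mmengine.hooks.Hook`, be registered with `@HOOKS.register_module()`, and call `torch.save` directly. The `save_fn`, `interval` and `student_attr` parameters and the output file name are hypothetical choices for illustration, and the `runner.epoch + 1` offset assumes the epoch counter is still 0-based when the hook fires.

```python
import os


class SaveStudentOnlyHook:
    """Sketch of a hook that saves only the student's weights.

    Plain class for illustration; a real hook would subclass
    mmengine.hooks.Hook and be registered in the HOOKS registry.
    """

    def __init__(self, save_fn, interval=1, student_attr="architecture"):
        self.save_fn = save_fn            # e.g. torch.save in real training
        self.interval = interval          # save every `interval` epochs
        self.student_attr = student_attr  # attribute that holds the student

    def after_train_epoch(self, runner):
        epoch = runner.epoch + 1  # assumes runner.epoch is 0-based here
        if epoch % self.interval != 0:
            return
        # Unwrap a DistributedDataParallel-style wrapper if one is present.
        model = getattr(runner.model, "module", runner.model)
        student = getattr(model, self.student_attr)
        path = os.path.join(runner.work_dir, f"student_epoch_{epoch}.pth")
        # Only the student's state dict is written; the teacher is skipped.
        self.save_fn({"state_dict": student.state_dict()}, path)
```

Because the state dict is taken from model.architecture directly, its keys carry no architecture. prefix, which may also help with the naming problem raised in the original question.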

Veccoy commented May 13, 2024

OK, thank you, I will try this.

Veccoy commented May 14, 2024

MMRazor also ships tools in tools/model_converters/ that avoid the need for a custom checkpoint hook, especially convert_kd_ckpt_to_student.py (#381).
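For readers who want to see the general idea behind this kind of conversion, here is a rough sketch. It is not the script's actual code: it assumes the distillation checkpoint is a dict with a `state_dict` entry in which student weights are prefixed with `architecture.`, and the function name `extract_student_state_dict` is hypothetical.

```python
def extract_student_state_dict(checkpoint, prefix="architecture."):
    """Return a new checkpoint dict that holds only the student's weights."""
    student = {
        key[len(prefix):]: value          # drop the "architecture." prefix
        for key, value in checkpoint["state_dict"].items()
        if key.startswith(prefix)         # teacher.* entries are skipped
    }
    return {"meta": checkpoint.get("meta", {}), "state_dict": student}


# With PyTorch installed, usage might look like this (paths are illustrative):
#   import torch
#   ckpt = torch.load("mmrazor_distill.pth", map_location="cpu")
#   torch.save(extract_student_state_dict(ckpt), "student_only.pth")
```

The resulting keys no longer carry the architecture. prefix, so they should line up with the plain student config (yolov8_student_cfg.py in the question), which is what a later mmdeploy conversion would load.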
