Temporary patch for quant transfer learning issues #76

KSGulin · 2022-06-29T12:22:06Z

Currently, running quant transfer learning fails with errors because the checkpoint recipe includes a QAT modifier.

The primary fix needs to be implemented in SparseML and ZooModels. This PR provides a temporary patch that works around the issue by removing the QAT modifier from the checkpoint recipe when a model is loaded for training. Additional logic is added to handle model saving and one-shot sparsification in this regime.
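As a rough illustration of the workaround described above, the sketch below drops QAT modifiers from a checkpoint recipe before training. It is not the code in this PR: the recipe is assumed to be already parsed into a plain dict of modifier lists, and `strip_qat_modifiers`, `QAT_MODIFIER_TYPES`, and the `"type"` key are hypothetical names chosen for the example.

```python
from copy import deepcopy

# Assumption: modifiers carrying this type name are the QAT modifiers to drop.
QAT_MODIFIER_TYPES = {"QuantizationModifier"}


def strip_qat_modifiers(recipe):
    """Return a copy of a parsed recipe with QAT modifiers removed.

    `recipe` is assumed to map group names to lists of modifier dicts,
    each with a hypothetical "type" key naming the modifier class.
    """
    cleaned = deepcopy(recipe)  # leave the checkpoint's recipe untouched
    for group, modifiers in cleaned.items():
        if isinstance(modifiers, list):
            cleaned[group] = [
                m
                for m in modifiers
                if not (isinstance(m, dict) and m.get("type") in QAT_MODIFIER_TYPES)
            ]
    return cleaned


# Hypothetical checkpoint recipe with one pruning and one QAT modifier.
checkpoint_recipe = {
    "pruning_modifiers": [{"type": "ConstantPruningModifier", "start_epoch": 0}],
    "quantization_modifiers": [{"type": "QuantizationModifier", "start_epoch": 2}],
}
train_recipe = strip_qat_modifiers(checkpoint_recipe)
```

Working on a copy keeps the original checkpoint recipe available, which matters if the saved model still needs to record that it was quantized.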

In addition, this PR fixes a bug where using the "--resume" keyword would increase the total number of epochs.
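The intended resume behavior can be sketched as follows: the configured total stays fixed, and only the epochs not yet completed are run. This is an assumption about the desired semantics, not the actual patch; `epoch_range_on_resume` and its parameters are hypothetical.

```python
def epoch_range_on_resume(total_epochs, last_completed_epoch):
    """Return the epochs still to run after resuming.

    `last_completed_epoch` is zero-based (-1 if no epoch has finished).
    The total is never extended by the number of epochs already done.
    """
    start_epoch = last_completed_epoch + 1
    if start_epoch > total_epochs:
        raise ValueError("checkpoint is past the configured total epochs")
    return range(start_epoch, total_epochs)


# Resuming a 10-epoch run after epoch index 3 has completed.
remaining = epoch_range_on_resume(total_epochs=10, last_completed_epoch=3)
```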

Fixes for quant transfer learn

37326ae

This was referenced Jun 29, 2022

unable resume training neuralmagic/sparseml#921

Closed

Sparse transfer learning training error: AssertionError: min nan should be less than max nan neuralmagic/sparseml#915

Closed

KSGulin merged commit cacdaee into master Jun 29, 2022

KSGulin added a commit that referenced this pull request Jun 29, 2022

Fixes for quant transfer learn (#76)

bafd7b5

KSGulin added a commit that referenced this pull request Jun 29, 2022

Fixes for quant transfer learn (#76) (#77)

d5807bf
