
Sparseml Integration #196

Closed
mathemusician opened this issue Sep 16, 2021 · 5 comments · Fixed by #197

Comments

@mathemusician
Contributor

🚀 Feature

Add option to use sparseml. Example implementation found here: Google Colab link

Motivation

There is currently no out-of-the-box option to use pruning techniques from sparseml.

Pitch

I will make a pull request to add this option to the hydra config. I've already forked a version of the lightning-transformers library. link

Here's how it will be added on the hydra CLI:

trainer=sparseml

Passing this to the trainer means it will automatically use DDP. How convenient! Sparseml also uses a special conversion to log weights and such, which I've also implemented:

+trainer/logger=sparsewandb

It is also available as a callback for those who want to train on CPU (see the sketch after the override below).

+trainer/callback=sparseml
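For clarity, here's a rough sketch of what the callback route boils down to in plain PyTorch Lightning. The import path is hypothetical (the callback currently lives in my fork), and the callback reads RECIPE_PATH from the environment as described below:

```python
import pytorch_lightning as pl

# Hypothetical import path -- the callback lives in my fork, not upstream.
# It picks up RECIPE_PATH from the environment (see below).
from lightning_transformers.callbacks import SparseMLCallback

# Roughly what `+trainer/callback=sparseml` wires up. With no GPUs
# requested, the Trainer runs on CPU.
trainer = pl.Trainer(callbacks=[SparseMLCallback()])
```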

Sparseml barely supports transformers at the moment, so I've had to make a workaround for their exporter. BERT and other BERT-like models return a ModelOutput, which tells the exporter there will be two outputs. But sometimes there's only one. For now, I've just forced the exporter to treat it all as one output. I may open a pull request at sparseml to handle transformer outputs.
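To illustrate the workaround (a simplified sketch, not the exact code in the PR -- the wrapper class and names here are mine): wrap the model so the ONNX exporter only ever sees a single tensor output:

```python
import torch
from transformers import AutoModelForSequenceClassification

class SingleOutputWrapper(torch.nn.Module):
    """Collapse a ModelOutput into a single tensor so the exporter
    doesn't assume multiple outputs."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, input_ids, attention_mask):
        outputs = self.model(input_ids=input_ids, attention_mask=attention_mask)
        return outputs[0]  # keep only the first field (e.g. the logits)

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
wrapped = SingleOutputWrapper(model).eval()

dummy = torch.ones(1, 128, dtype=torch.long)
torch.onnx.export(
    wrapped,
    (dummy, dummy),
    "model.onnx",
    input_names=["input_ids", "attention_mask"],
    output_names=["logits"],
)
```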

The RECIPE_PATH and MODELS_PATH, paths to the recipe yaml and the models folder, are passed in as environment variables. I wasn't able to find a way around this, since hydra overwrites added configs after starting the training loop. Maybe there's a better way of doing this.
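Concretely, launching looks something like this (the entry-point script and the paths are placeholders):

```python
import os
import subprocess

# Placeholder paths; train.py stands in for whatever entry point you use.
env = {
    **os.environ,
    "RECIPE_PATH": "/path/to/recipe.yaml",
    "MODELS_PATH": "/path/to/models",
}
subprocess.run(["python", "train.py", "trainer=sparseml"], env=env, check=True)
```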

Alternatives

I haven't thought much about this, but I'll add in a few good alternatives once I find some.

Additional context

This is my first time making a pull request to a rather large library, so don't be afraid to critique; I need the feedback. I may also need help understanding how the "fit" stage works differently from the "train" stage. I'm running training.run_test_after_fit=False because the fit stage doesn't work. Training works just fine, however.

@mathemusician added the enhancement and help wanted labels on Sep 16, 2021
@SeanNaren
Contributor

This is really cool! Looking forward to the PR :)

Regarding the env variables, it should be possible to pass these as arguments to the callback upon instantiation. I can have a look at making this possible once your PR is up!

We've also recently contributed a sparseml callback to the lightning-bolts package, which might also be useful for this: https://lightning-bolts.readthedocs.io/en/latest/callbacks/sparseml.html
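For reference, the bolts callback takes the recipe path at instantiation, which is the kind of argument-passing I mean. A minimal sketch (the path is a placeholder):

```python
import pytorch_lightning as pl
from pl_bolts.callbacks import SparseMLCallback

# The recipe path is a constructor argument rather than an env variable.
trainer = pl.Trainer(callbacks=[SparseMLCallback(recipe_path="recipe.yaml")])
```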

@mathemusician
Contributor Author

mathemusician commented Sep 16, 2021

@SeanNaren I'm actually using a variant of your sparseml callback! It's what inspired this. I'll submit a PR after I get more feedback from the neuralmagic community. They're usually pretty quick at responding.

@SeanNaren
Contributor

epic!! keep me updated :) more than happy to collab on this

@mathemusician
Contributor Author

Made the PR. The standard operating procedure is to close the issue after the PR is made, right?

@SeanNaren
Contributor

Just linked it to the PR, so when the PR is merged, this will close :)
