

Export transform and model for torchscript, quantization and deployment #36

Closed
wants to merge 9 commits

Conversation


@hudeven hudeven commented Oct 12, 2022

It's troublesome to handle the transform and the model separately during torchscripting, quantization, and deployment with torchserve. So I added a CombinedModule wrapper to hold both, so that it can consume raw input (text) and produce the prediction (generated text) end to end.
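A minimal sketch of the idea (names and method signatures here are illustrative; the actual CombinedModule in this PR may differ): wrap the transform and the model in a single `nn.Module` so one scripted artifact covers the whole text-in/text-out pipeline.

```python
# Illustrative sketch, not the PR's exact implementation: bundle a text
# transform and a model so the exported module accepts raw text and
# returns generated text end to end.
import torch
from torch import nn


class CombinedModule(nn.Module):
    def __init__(self, transform: nn.Module, model: nn.Module) -> None:
        super().__init__()
        self.transform = transform
        self.model = model

    def forward(self, text: str) -> str:
        tokens = self.transform(text)          # raw text -> token id tensor
        output = self.model(tokens)            # model inference
        return self.transform.decode(output)   # token ids -> text
```

With this shape, a single `torch.jit.script(combined)` (or quantization pass) operates on the whole pipeline instead of the transform and model separately.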

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Oct 12, 2022
@hudeven hudeven requested a review from dracifer October 12, 2022 21:58
from torch import Tensor


class CharTransform(nn.Module):


Does this build the vocabulary of the input data by loading it all into memory?

hudeven (Contributor Author):

Yes, the dataset is only 1M for this example. We will use iterable datasets/datapipes for the vision tutorial.
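For context, a char-level transform that builds its vocabulary in memory might look like the sketch below (illustrative only; the actual CharTransform in this PR may differ). Holding the full corpus in memory is fine at this dataset's scale:

```python
# Illustrative sketch of a char-level transform: the vocabulary is built
# from the entire dataset held in memory, which is acceptable for a small
# corpus but would need an iterable/streaming approach for large data.
import torch
from torch import nn, Tensor


class CharTransform(nn.Module):
    def __init__(self, data: str) -> None:
        super().__init__()
        chars = sorted(set(data))  # full dataset scanned in memory
        self.stoi = {ch: i for i, ch in enumerate(chars)}
        self.itos = {i: ch for i, ch in enumerate(chars)}

    def forward(self, text: str) -> Tensor:
        return torch.tensor([self.stoi[c] for c in text], dtype=torch.long)

    def decode(self, ids: Tensor) -> str:
        return "".join(self.itos[int(i)] for i in ids)
```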

torchrecipes/paved_path/README.md (resolved)

fs, input_path = fsspec.core.url_to_fs(args.input_path)
with fs.open(input_path, "rb") as f:
    model = torch.load(f, map_location="cpu")


Shall we use snapshot to save/load the trained model as well?

hudeven (Contributor Author):

I tried, but I feel it causes more trouble than benefit. Snapshot saves/loads state_dicts for the whole training state, including the model, optimizer, progress, etc. 1) We only need the model here; it's wasteful to pass the larger snapshot file through S3. 2) To load the state_dict, we would have to initialize the GPT model object here with the same config as training, which makes this script less generic.


Regarding "Snapshot save/load state_dicts from training states including model, optimizer, progress, etc.": we should be able to create a separate snapshot that only saves the model params. But anyway, it sounds reasonable to avoid the overhead. Maybe let's check with @yifuwang about the appropriate use cases for snapshot.

Comment on lines -189 to -191
loss = None
if targets is not None:
    loss = F.cross_entropy(logits.view(-1, logits.size(-1)), targets.view(-1))


Makes sense to move the loss out of the main module!

hudeven (Contributor Author):

This is also required to make the model torchscriptable.
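To illustrate the benefit (a sketch with assumed shapes and names, not the PR's actual model): with the loss removed, `forward` returns a plain `Tensor` rather than a `Tuple[Tensor, Optional[Tensor]]`, which scripts more cleanly, and the loss is computed outside exactly as in the removed lines above.

```python
# Sketch: a tiny LM whose forward returns only logits, so the module
# scripts with a single Tensor return type; the cross-entropy loss is
# computed outside the module during training.
import torch
import torch.nn.functional as F
from torch import nn


class TinyLM(nn.Module):
    def __init__(self, vocab_size: int = 16, dim: int = 8) -> None:
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, idx: torch.Tensor) -> torch.Tensor:
        return self.head(self.embed(idx))  # logits only; no loss in forward


model = torch.jit.script(TinyLM())  # scripts without Optional return handling
logits = model(torch.zeros(2, 4, dtype=torch.long))
targets = torch.zeros(2, 4, dtype=torch.long)
loss = F.cross_entropy(logits.view(-1, logits.size(-1)), targets.view(-1))
```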

@facebook-github-bot (Contributor):
@hudeven has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
