Export transform and model for torchscript, quantization and deployment #36
Conversation
from torch import Tensor

class CharTransform(nn.Module):
Does this build the vocabulary of the input data by loading it all into memory?
Yes, the dataset is only 1M for this example. We will use iterable datasets/datapipes for the vision tutorial.
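For context, a minimal sketch of what such a character-level transform might look like, assuming the vocabulary is built eagerly from the whole corpus held in memory (the constructor argument and helper names are assumptions, not the actual implementation):

import torch
from torch import Tensor, nn

class CharTransform(nn.Module):
    # Sketch only: builds the vocabulary eagerly from the full text, which is
    # fine for a ~1M-character dataset but would not scale to corpora that
    # don't fit in memory.
    def __init__(self, text: str) -> None:
        super().__init__()
        chars = sorted(set(text))                 # whole corpus in memory
        self.stoi = {ch: i for i, ch in enumerate(chars)}
        self.itos = {i: ch for i, ch in enumerate(chars)}
        self.vocab_size = len(chars)

    def forward(self, text: str) -> Tensor:
        # encode a string as a 1-D LongTensor of character ids
        return torch.tensor([self.stoi[ch] for ch in text], dtype=torch.long)

    def decode(self, ids: Tensor) -> str:
        # map character ids back to a string
        return "".join(self.itos[int(i)] for i in ids)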
# load the pickled model onto CPU from a local or remote (e.g. S3) path
fs, input_path = fsspec.core.url_to_fs(args.input_path)
with fs.open(input_path, "rb") as f:
    model = torch.load(f, map_location="cpu")
Shall we use snapshot to save/load the trained model as well?
I tried, but I feel it causes more trouble than benefit. Snapshot saves/loads state_dicts of the full training state, including model, optimizer, progress, etc. 1) We only need the model here, so it's a waste to pass the larger snapshot file through S3. 2) To load the state_dict, we would have to initialize the GPT model object here with the same config as in training, which makes this script less generic.
For "Snapshot save/load state_dicts from training states including model, optimizer, progress, etc.", we should be able to create a separate snapshot that only save model params? but anyway, it sounds reasonable to avoid the overhead. maybe lets check with @yifuwang about the appropriate use case of snapshot
loss = None
if targets is not None:
    loss = F.cross_entropy(logits.view(-1, logits.size(-1)), targets.view(-1))
Makes sense to move the loss out of the main module!
This is also required to make the model torchscriptable.
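As a sketch of that refactor: the module's forward returns only logits (a single, TorchScript-friendly return type), and cross-entropy is computed outside in the training step (the shapes and the compute_loss helper below are assumptions for illustration):

import torch.nn.functional as F
from torch import Tensor, nn

def compute_loss(model: nn.Module, inputs: Tensor, targets: Tensor) -> Tensor:
    # forward now returns only logits of shape (batch, seq_len, vocab_size);
    # the loss lives in the training loop, so the exported module keeps a
    # single, simple return type.
    logits = model(inputs)
    return F.cross_entropy(
        logits.view(-1, logits.size(-1)),  # (batch * seq_len, vocab_size)
        targets.view(-1),                  # (batch * seq_len,)
    )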
@hudeven has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
It's troublesome to deal with the transform and the model separately during torchscript, quantization, and deployment with torchserve. So I added a CombinedModule wrapper to hold them both, such that it can consume raw input (text) and produce the prediction (generated text) end to end.
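A minimal sketch of what such a wrapper might look like (the generate-method name, the decode helper, and the signatures below are assumptions, not the actual CombinedModule in this PR):

from torch import nn

class CombinedModule(nn.Module):
    # Wraps the text transform and the model into one module so that
    # torchscript, quantization, and torchserve see a single end-to-end unit:
    # raw text in, generated text out.
    def __init__(self, transform: nn.Module, model: nn.Module) -> None:
        super().__init__()
        self.transform = transform
        self.model = model

    def forward(self, text: str, max_new_chars: int = 100) -> str:
        ids = self.transform(text).unsqueeze(0)        # (1, seq_len)
        out = self.model.generate(ids, max_new_chars)  # assumed model API
        return self.transform.decode(out.squeeze(0))   # back to a string

With a wrapper along these lines, torch.jit.script(CombinedModule(transform, model)) yields one artifact that a torchserve handler can call directly on raw strings.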