Add an example with torchdata and torchserve #1940

Merged
merged 8 commits into from
Nov 3, 2022

@PratsBhatt PratsBhatt commented Nov 2, 2022

Description

This pull request adds a simple example of using torchdata with torchserve.
It uses MNIST as both the dataset and the task to be solved.
The example builds on top of the existing MNIST example.

Fixes #(issue)

Type of change

This pull request adds a torchdata and torchserve example for the MNIST model.
It adds an inference.py script that loads the MNIST dataset and makes REST calls to torchserve. It also adds a new mnist_handler.py script, which adds a preprocessing step that converts the payload of the REST request to a tensor and returns a class number once the inference request has finished.
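
As a rough illustration of the round trip described above, here is a minimal standard-library sketch. It is not the code from inference.py or mnist_handler.py: the function names and the "data" field are hypothetical, and a real handler would turn the decoded list into a tensor (for example with torch.as_tensor) before calling the model.

```python
import json


def encode_image(pixels):
    """Client side (hypothetical): serialize a 2-D list of pixel values as a request body."""
    return json.dumps({"data": pixels}).encode("utf-8")


def preprocess(body):
    """Handler side (hypothetical): decode the request body back into a nested list.

    A real handler would wrap the result in a tensor before inference.
    """
    return json.loads(body.decode("utf-8"))["data"]


def postprocess(scores):
    """Return the index of the highest score, i.e. the predicted class number."""
    return max(range(len(scores)), key=scores.__getitem__)
```

The point of the sketch is only that the client and the handler must agree on the payload format, and that the handler reduces the model's per-class scores to a single class number.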

The output of inference.py looks like the following:

2022-11-03 02:10:38.996234 - Model prediction Class 1 True Class: tensor([1], dtype=torch.uint8)
2022-11-03 02:10:38.996327 - Model prediction Class 8 True Class: tensor([3], dtype=torch.uint8)
2022-11-03 02:10:38.996433 - Model prediction Class 4 True Class: tensor([4], dtype=torch.uint8)
2022-11-03 02:10:38.996578 - Model prediction Class 7 True Class: tensor([7], dtype=torch.uint8)
2022-11-03 02:10:38.996699 - Model prediction Class 2 True Class: tensor([2], dtype=torch.uint8)
2022-11-03 02:10:38.996821 - Model prediction Class 3 True Class: tensor([3], dtype=torch.uint8)
2022-11-03 02:10:38.996943 - Model prediction Class 7 True Class: tensor([7], dtype=torch.uint8)
2022-11-03 02:10:38.997034 - Model prediction Class 4 True Class: tensor([4], dtype=torch.uint8)
2022-11-03 02:10:38.997124 - Model prediction Class 5 True Class: tensor([5], dtype=torch.uint8)
2022-11-03 02:10:38.997222 - Model prediction Class 9 True Class: tensor([9], dtype=torch.uint8)
2022-11-03 02:10:38.997305 - Model prediction Class 0 True Class: tensor([0], dtype=torch.uint8)
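
To summarise output like this, a small helper can count how many predictions match the true class. This is a sketch that assumes the exact line format shown above; the helper name is hypothetical and it is not part of the pull request.

```python
import re

# Matches "Model prediction Class <p> True Class: tensor([<t>]" in each log line.
LINE_RE = re.compile(r"Model prediction Class (\d+) True Class: tensor\(\[(\d+)\]")


def count_correct(log_text):
    """Return (number of correct predictions, number of parsed lines)."""
    pairs = [(int(pred), int(true)) for pred, true in LINE_RE.findall(log_text)]
    if not pairs:
        raise ValueError("no prediction lines found")
    correct = sum(pred == true for pred, true in pairs)
    return correct, len(pairs)
```

Applied to the 11 lines above, 10 predictions match; the one mismatch is the second line (predicted class 8, true class 3).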

Checklist:

  • Did you have fun?
  • Have you added tests that prove your fix is effective or that this feature works?
  • Has code been commented, particularly in hard-to-understand areas?
  • Have you made corresponding changes to the documentation?

testset = datasets.MNIST('./MNIST_dataset', download=True, train=False, transform=image_transform)

# Creating the dataloader.
inference_dataset = torch.utils.data.DataLoader(testset, batch_size=BATCH_SIZE, shuffle=True)

Member

Overall I think the example looks good. It integrates a torchvision dataset, but it is not clearly a torchdata integration.

Specifically, I was hoping we could create a toy torchdata dataset directly, without leveraging torchvision. I believe this would be a minor change to your code, but if it isn't, I'm happy to merge this provided you have bandwidth to work on the more vanilla torchdata integration.

I agree. @NivekT, do you have an existing pipeline for vision benchmarking that they can use as a reference?

@NivekT NivekT Nov 2, 2022

For DataPipe reference:

  1. Here is the torchvision implementation of loading MNIST - it might be too complicated. One option is to import and directly use that here (similar to how datasets.MNIST is used)
  2. A standalone, common example is something like this:
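from torchdata.datapipes.iter import FileLister, FileOpener, TarArchiveLoader

# root, args, archive_size, pil_loader and pil_transformation are defined elsewhere.
dp = FileLister(str(root), masks=[f"archive_{args.archive_size}*.tar"])  # list the matching tar archives
dp = dp.shuffle(buffer_size=10000)  # shuffle the archive paths
dp = FileOpener(dp, mode="b")  # open each archive in binary mode
dp = TarArchiveLoader(dp, mode="r:")  # yield (path, stream) for each archive member
dp = dp.shuffle(buffer_size=archive_size)  # shuffle the individual samples
dp = dp.sharding_filter()  # split samples across DataLoader workers
dp = dp.map(pil_loader).map(pil_transformation)
# dp = dp.map(tensor_loader).map(tensor_transformation)  # Alternate - convert image to tensor then transform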
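
For readers unfamiliar with the stages chained in option 2, the following plain-Python sketch mimics the data flow (list, buffered shuffle, shard, map) using generators. It is a stand-in for illustration only, not torchdata's implementation; every function name here is hypothetical.

```python
import itertools
import random


def list_items(n):
    """Stand-in for a file lister: yield n sample names."""
    for i in range(n):
        yield f"sample_{i}"


def buffered_shuffle(source, buffer_size, seed=None):
    """Shuffle a stream using a bounded buffer instead of loading everything."""
    rng = random.Random(seed)
    buffer = []
    for item in source:
        if len(buffer) < buffer_size:
            buffer.append(item)
        else:
            # Emit a random buffered item and put the new one in its place.
            idx = rng.randrange(buffer_size)
            yield buffer[idx]
            buffer[idx] = item
    # Drain whatever is left in random order.
    rng.shuffle(buffer)
    yield from buffer


def shard(source, num_shards, index):
    """Keep every num_shards-th item, starting at index (one shard per worker)."""
    return itertools.islice(source, index, None, num_shards)


def mapped(source, fn):
    """Apply fn lazily to every item."""
    for item in source:
        yield fn(item)


pipeline = mapped(buffered_shuffle(list_items(10), buffer_size=4, seed=0), str.upper)
result = list(pipeline)
```

Each stage consumes the previous one lazily, which is the same composition pattern the DataPipe chain expresses. A smaller buffer uses less memory but mixes the samples less thoroughly.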

Separately, I think we should use DataLoader2 instead of the old version in the example. @ejguan WDYT?

Contributor Author

@PratsBhatt PratsBhatt Nov 3, 2022

Thank you @msaroufim , @ejguan, @agunapal and @NivekT for your comments and guidance. I have incorporated the required changes. Looking forward to your feedback. Thank you once again.

Collaborator

Thanks for the quick turnaround @PratsBhatt. Looks good. I am approving it. Minor feedback: Please link the example here, since it is a few levels deep and might be missed by others. https://github.com/pytorch/serve/blob/master/examples/README.md

@agunapal
Collaborator

agunapal commented Nov 2, 2022

@PratsBhatt Thanks for taking this up. Overall it looks good, but in this example we would want to explicitly make use of TorchData features (e.g. DataPipes). You could take a look at this example in TorchData and see if you can modify your current example along the same lines. https://github.com/pytorch/data/blob/main/examples/vision/imagefolder.py

Member

@msaroufim msaroufim left a comment

Amazing, thank you for the quick turnaround.

@codecov

codecov bot commented Nov 3, 2022

Codecov Report

Merging #1940 (4398bc4) into master (f5d4022) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #1940   +/-   ##
=======================================
  Coverage   44.95%   44.95%           
=======================================
  Files          63       63           
  Lines        2609     2609           
  Branches       56       56           
=======================================
  Hits         1173     1173           
  Misses       1436     1436           

Collaborator

@agunapal agunapal left a comment

Looks good.
Minor feedback: Please link the example here, since it is a few levels deep and might be missed by others.
https://github.com/pytorch/serve/blob/master/examples/README.md

@PratsBhatt
Contributor Author

Thank you @agunapal and @msaroufim , I have implemented the code changes. Looking forward to merging the PR.

@msaroufim msaroufim merged commit 33e1e97 into pytorch:master Nov 3, 2022
msaroufim pushed a commit to altre/serve that referenced this pull request Nov 3, 2022
* Add an example with torchdata

* Update comment.

* Incorporate code review comments.

* Remove unsed imports.

* Apply code review comments.

5 participants