Fix INF2 example handler #2378

Merged: 5 commits, Jun 6, 2023

namannandan (Collaborator) commented Jun 1, 2023

Description

Fix the handler for the Inferentia2 example so that it handles partial batches and makes correct calls to the transformers-neuronx library API.
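
For readers unfamiliar with the partial-batch problem, here is a minimal sketch of one common way to handle it: pad an incoming batch up to a fixed size before inference, then drop the outputs that belong to padding. This is not the code from this PR. It assumes the model expects a fixed batch size, and the names `pad_batch`, `trim_outputs`, and `fake_generate` are hypothetical and used only for illustration.

```python
import logging

logger = logging.getLogger(__name__)


def pad_batch(texts, batch_size, pad_text=""):
    """Pad a partial batch up to batch_size.

    Returns the padded list and the number of real (non-padding) inputs.
    Assumption: the model only accepts batches of exactly batch_size.
    """
    real_count = len(texts)
    if real_count > batch_size:
        raise ValueError(f"got {real_count} inputs for batch size {batch_size}")
    missing = batch_size - real_count
    if missing:
        logger.info("Padding batch with %d placeholder input(s)", missing)
    return list(texts) + [pad_text] * missing, real_count


def trim_outputs(outputs, real_count):
    """Keep only the outputs that correspond to real inputs."""
    return list(outputs[:real_count])


def fake_generate(batch):
    """Hypothetical stand-in for the model call: one output per input."""
    return [text + " ..." for text in batch]


# One request arrives, but the (assumed) fixed batch size is 2.
padded, real_count = pad_batch(["Today the weather is really nice"], batch_size=2)
responses = trim_outputs(fake_generate(padded), real_count)
```

Padding and trimming in separate helpers keeps the inference call itself unaware of whether the batch was full, which is the main reason for this layout.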

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Feature/Issue validation/testing

  • Successful manual test for the 125m-parameter variant of the OPT model
$ cat sample_text.txt 
Today the weather is really nice and I am planning on
$ curl http://127.0.0.1:8080/predictions/opt-125m -T sample_text.txt 
Today the weather is really nice and I am planning on
going through some camping next weekend. I thought if I didn’t
wear rain gear on Saturday maybe I would have to wear a coat but
I am using my wet shoes
  • Test for the 6.7b-parameter variant of the OPT model [in progress]
$ cat sample_text.txt 
Today the weather is really nice and I am planning on
$ curl http://127.0.0.1:8080/predictions/opt-6.7b -T sample_text.txt 
Today the weather is really nice and I am planning on
spending the day in the park and riding my bike to the
store for ice cream. Then I will come home and study
more Spanish. It is raining and sunny here,
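
The manual tests above can also be scripted. The following is a rough Python sketch of the same request that `curl -T` sends (an HTTP PUT with the prompt as the body), using only the standard library. The helper names `build_prediction_request` and `predict` are hypothetical, and the 120-second timeout is an arbitrary assumption rather than a value taken from this PR.

```python
import urllib.request


def build_prediction_request(prompt, model_name="opt-125m",
                             host="http://127.0.0.1:8080"):
    """Build a PUT request against the predictions endpoint used above."""
    url = f"{host}/predictions/{model_name}"
    return urllib.request.Request(url, data=prompt.encode("utf-8"), method="PUT")


def predict(prompt, model_name="opt-125m", timeout=120):
    """Send the prompt and return the generated text.

    Requires the model server from the manual test to be running locally.
    """
    request = build_prediction_request(prompt, model_name)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the server running, `print(predict("Today the weather is really nice and I am planning on"))` should print a continuation similar to the ones shown in the manual tests, although generated text will vary.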

namannandan marked this pull request as ready for review June 1, 2023 01:24

HamidShojanazeri (Collaborator) left a comment

LGTM thanks @namannandan

namannandan merged commit 113d22a into inf2-example Jun 6, 2023
lxning pushed a commit that referenced this pull request Jun 16, 2023
* adding inf2 example

* fix the inference func

* Add batch size note

* Fix INF2 example handler (#2378)

* fix INF2 example handler

* Add logging for padding in inf2 handler

* update response timeout and model

* Update documentation to show opt-6.7b as the example model

* Update model batch log

---------

Co-authored-by: Naman Nandan <namannan@amazon.com>

* Update requirements and sample text file

* fix neuron core allocation to worker process

* Fix linter errors and update documentation

* enable core allocation verification in handler

* fix lint error

---------

Co-authored-by: Ubuntu <ubuntu@ip-172-31-45-0.ec2.internal>
Co-authored-by: Hamid Shojanazeri <hamid.nazeri2010@gmail.com>
Co-authored-by: Naman Nandan <namannan@amazon.com>
namannandan deleted the inf2-example-fix branch November 9, 2023 18:09
2 participants