Inf2 example #2399

namannandan · 2023-06-06T23:59:54Z

Description

Inferentia2 example based on the opt-6.7b model

Type of change

New feature (non-breaking change which adds functionality)
This change requires a documentation update

Feature/Issue validation/testing

Successful test for the 125m parameter variant of the opt model

$ cat sample_text.txt 
Today the weather is really nice and I am planning on
$ curl http://127.0.0.1:8080/predictions/opt-125m -T sample_text.txt 
Today the weather is really nice and I am planning on
going through some camping next weekend. I thought if I didn’t
wear rain gear on Saturday maybe I would have to wear a coat but
I am using my wet shoes

Successful test for the 6.7b parameter variant of the opt model

$ cat sample_text.txt 
Today the weather is really nice and I am planning on
$ curl http://127.0.0.1:8080/predictions/opt-6.7b -T sample_text.txt 
Today the weather is really nice and I am planning on
spending the day in the park and riding my bike to the
store for ice cream. Then I will come home and study
more Spanish. It is raining and sunny here,
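
The curl calls above upload the prompt file to the TorchServe predictions endpoint with an HTTP PUT (`-T`). A minimal sketch of the same request from Python, using only the standard library; the helper names `build_prediction_request` and `generate` and the 120-second timeout are illustrative assumptions, not part of this PR, while the host and model names are taken from the commands above:

```python
import urllib.request


def build_prediction_request(
    text: str, model_name: str, host: str = "http://127.0.0.1:8080"
) -> urllib.request.Request:
    """Build a PUT request mirroring `curl <host>/predictions/<model> -T <file>`."""
    url = f"{host}/predictions/{model_name}"
    # The prompt is sent as the raw request body, as curl -T does with the file.
    return urllib.request.Request(url, data=text.encode("utf-8"), method="PUT")


def generate(text: str, model_name: str, timeout: float = 120.0) -> str:
    """Send the prompt to a running TorchServe instance and return the reply."""
    request = build_prediction_request(text, model_name)
    # The timeout value is an assumption; large models can take a while to respond.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Usage, assuming TorchServe is running locally with the model registered:
#   print(generate("Today the weather is really nice and I am planning on", "opt-6.7b"))
```

Separating request construction from sending keeps the first helper usable without a server, which makes it easy to check the URL and method in isolation.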

* fix INF2 example handler
* Add logging for padding in inf2 handler
* update response timeout and model
* Update documentation to show opt-6.7b as the example model
* Update model batch log

Co-authored-by: Naman Nandan <namannan@amazon.com>

codecov · 2023-06-07T00:23:57Z

Codecov Report

Merging #2399 (897c05c) into master (679b33d) will not change coverage.
The diff coverage is n/a.

❗ Current head 897c05c differs from pull request most recent head e7559e7. Consider uploading reports for the commit e7559e7 to get more accurate results

@@           Coverage Diff           @@
##           master    #2399   +/-   ##
=======================================
  Coverage   72.01%   72.01%           
=======================================
  Files          78       78           
  Lines        3648     3648           
  Branches       58       58           
=======================================
  Hits         2627     2627           
  Misses       1017     1017           
  Partials        4        4

HamidShojanazeri

Thanks very much @namannandan LGTM

lxning · 2023-06-12T17:53:34Z

examples/large_models/inferentia2/inf2_handler.py

+        model_name = ctx.model_yaml_config["handler"]["model_name"]
+
+        # allocate "tp_degree" number of neuron cores to the worker process
+        os.environ["NEURON_RT_NUM_CORES"] = str(tp_degree)


How do you make sure Neuron has enough cores available to support tp_degree?

namannandan · 2023-06-12 (edited)

~~I believe torch-neuronx currently does not have an API that provides the number of available (unallocated) neuron cores.~~ Here, if the required number of neuron cores, i.e. tp_degree, is not available, model loading fails with an error of the form:

ERROR TDRV:db_vtpb_get_mla_and_tpb Could not find VNC id 1

namannandan · 2023-06-13

It turns out that torch-neuronx does have a method to query the number of available (unallocated) cores: torch_neuronx.xla_impl.data_parallel.device_count(). I updated the handler to verify that the required number of cores is available before proceeding with model loading.
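
The check described in this thread can be sketched roughly as follows. This is not the handler code from the PR: the function name `allocate_neuron_cores` and the error messages are hypothetical, and the available core count is passed in as a plain integer so the sketch runs without Neuron hardware. In the handler, that value would presumably come from the `device_count()` call mentioned above.

```python
import os


def allocate_neuron_cores(tp_degree: int, available_cores: int) -> None:
    """Reserve `tp_degree` cores for this worker, failing early if too few exist.

    `available_cores` is supplied by the caller (hypothetical interface); on an
    Inf2 host it would be the value reported by torch-neuronx.
    """
    if tp_degree < 1:
        raise ValueError(f"tp_degree must be >= 1, got {tp_degree}")
    if tp_degree > available_cores:
        # Fail with a readable message instead of a low-level runtime error
        # during model loading.
        raise RuntimeError(
            f"Model requires {tp_degree} neuron cores, "
            f"but only {available_cores} are available"
        )
    # Same environment variable the handler sets in the diff above.
    os.environ["NEURON_RT_NUM_CORES"] = str(tp_degree)
```

Raising before model loading starts surfaces a clear message in the worker log rather than the `Could not find VNC id` runtime error quoted earlier.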

namannandan · 2023-06-15T23:44:43Z

Successfully tested the example:

* on EC2 with Deep Learning AMI Neuron PyTorch 1.13.0
* in Docker using DLC 763104351884.dkr.ecr.us-west-2.amazonaws.com/pytorch-inference-neuronx:1.13.1-neuronx-py38-sdk2.10.0-ubuntu20.04

Ubuntu and others added 5 commits April 29, 2023 05:19

adding inf2 example

8189fc9

fix the inference func

7460874

Add batch size note

203e37b

Fix INF2 example handler (#2378)

113d22a

Merge branch 'master' into inf2-example

b1c1418

Naman Nandan added 3 commits June 7, 2023 12:49

Update requirements and sample text file

4f6d2f4

fix neuron core allocation to worker process

b849542

Fix linter errors and update documentation

a64938c

namannandan marked this pull request as ready for review June 7, 2023 21:27

namannandan requested a review from HamidShojanazeri June 7, 2023 21:31

namannandan mentioned this pull request Jun 8, 2023

Enable opt-6.7b benchmark on inf2 #2400

Merged

3 tasks

HamidShojanazeri approved these changes Jun 11, 2023

lxning reviewed Jun 12, 2023

enable core allocation verification in handler

50668c5

namannandan force-pushed the inf2-example branch from ecc5e02 to 50668c5 Compare June 13, 2023 00:54

namannandan requested a review from lxning June 13, 2023 01:22

namannandan and others added 2 commits June 15, 2023 16:44

Merge branch 'master' into inf2-example

64685cb

fix lint error

e7559e7

lxning approved these changes Jun 16, 2023

lxning merged commit 4e21262 into master Jun 16, 2023

namannandan deleted the inf2-example branch November 9, 2023 18:09
