Add DTensor LLaMA inference model: simple_gpt #1867
Conversation
Looks good
fabric = L.Fabric(devices=[self._rank], precision="bf16-true")
with fabric.init_module(empty_init=True):
Just curious, is the only use of Lightning this init_module? I'm not sure what this does, but can we remove it by initializing with a meta device? Then the implementation doesn't rely on third-party libraries and can stay as close to native PyTorch as possible.
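For context, a minimal sketch of the meta-device alternative being suggested (not the PR's actual code; `Transformer`, `config`, `rank`, and `checkpoint_path` are hypothetical stand-ins):

```python
import torch

# Hypothetical names: Transformer/config/rank/checkpoint_path are stand-ins,
# not the actual identifiers used in simple_gpt.

# Build the module on the meta device so no parameter storage is allocated yet.
with torch.device("meta"):
    model = Transformer(config)

# Materialize empty parameters directly on the target GPU, then copy in the real weights.
model = model.to_empty(device=f"cuda:{rank}")
model.load_state_dict(torch.load(checkpoint_path, map_location=f"cuda:{rank}"))
model.eval()
```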
Yeah good point, let me update this with the latest init code that removed the lightning dep
I've updated the init code to avoid the Lightning dependency (and it's actually much faster too!)
https://github.com/pytorch-labs/simple_gpt/blob/main/generate.py#L162
For nightly runs, do we usually load the actual weights? The load time should be quick, but the weights file is 10+ GB just for LLaMA-7B. Otherwise, I could also update the default weight initialization to use random values instead of just torch.zeros.
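As a rough illustration of the random-default option (the std value here is an assumption, not the repo's actual init):

```python
import torch

# Benchmark-only fallback: fill parameters with small random values instead of zeros
# so the numerics are more realistic when real weights aren't loaded.
with torch.no_grad():
    for param in model.parameters():
        torch.nn.init.normal_(param, mean=0.0, std=0.02)
```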
Use the real weights; otherwise you might fail accuracy checks in unpredictable ways.
Is this still needed if we only run this model through the dynamo runner (the model requires distributed, multiple GPUs)? The unit tests from this repo (i.e. `python test.py -k 'test_simple_gpt_'`) are skipped, and those include the accuracy checks.
This is because some models in torchbench (e.g. maml) do need gradient calculations in inference mode. So we leave the choice of whether to enable the gradient context to the eval test code. There is an open issue about testing that the
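A rough sketch of what leaving that choice to the eval code can look like (function and flag names here are illustrative, not the harness's actual API):

```python
import contextlib
import torch

def eval_grad_context(needs_grad: bool):
    # Some models (e.g. maml) need gradients even during inference,
    # so the harness only applies no_grad() when it is safe to do so.
    return contextlib.nullcontext() if needs_grad else torch.no_grad()

with eval_grad_context(needs_grad=False):
    output = model(example_inputs)
```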
@pytorchbot merge
@xmfan has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
…mo runner (#108438)

Adding support to pass rank and world_size to a torchbench model via its extra_args parameter: https://github.com/pytorch/benchmark/blob/main/torchbenchmark/util/model.py#L83C80-L83C90

This is used for models which distribute over multiple GPUs, e.g. simple_gpt pytorch/benchmark#1867. Also adds an option to skip multiprocess-only GPU models.

Testing via `python benchmarks/dynamo/torchbench.py -d cuda --output=benchmark_logs/performance.csv --inference --performance --timing --print-memory --multiprocess --only simple_gpt`

Pull Request resolved: #108438
Approved by: https://github.com/Chillee
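For illustration, a sketch of how a model could read rank/world_size out of `extra_args` (the flag names and helper are assumptions, not necessarily what the runner forwards):

```python
import argparse

def parse_distributed_args(extra_args):
    # Illustrative flag names; the actual runner may forward these differently.
    parser = argparse.ArgumentParser()
    parser.add_argument("--rank", type=int, default=0)
    parser.add_argument("--world_size", type=int, default=1)
    args, _ = parser.parse_known_args(extra_args)
    return args.rank, args.world_size

# Example: rank, world_size = parse_distributed_args(["--rank", "1", "--world_size", "8"])
```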
Adds simple_gpt + DTensor, implemented in https://github.com/pytorch-labs/simple_gpt/pull/7, to torchbench.

Tested via `python benchmarks/dynamo/torchbench.py -d cuda --output-directory=benchmark_logs --output=performance.csv --inference --performance --timing --print-memory --multiprocess --nothing --only simple_gpt`. Note: --nothing is used here to disable compile, since DTensor + compile isn't yet supported in main.

Two changes were required to the model: