
Conversation


@msaroufim msaroufim commented Aug 2, 2023

All of these OOM on a single device, but we want to make them all available for the distributed tests @H-Huang is working on.

@msaroufim msaroufim requested review from H-Huang and xuzhao9 August 2, 2023 00:22
@msaroufim msaroufim changed the title all the llamas all the llamas in canary Aug 2, 2023
@xuzhao9 (Contributor) left a comment:


More llamas!

@facebook-github-bot

@msaroufim has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@H-Huang (Member) left a comment:


Awesome! Thanks for adding :)

@@ -0,0 +1,15 @@
from torchbenchmark.tasks import NLP
A reviewer (Member) asked:


Just curious: these models are in `canary_models` as opposed to the `models` dir. Does this affect how these models are pulled in, either from torchbench or the dynamo runner, and if so, in what way?

@msaroufim (Member Author) replied:


This error message explains it well; there should be no difference:

  File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 302, in load_model
    raise ImportError(f"could not import any of {candidates}")
ImportError: could not import any of ['torchbenchmark.models.stable_diffusion', 'torchbenchmark.canary_models.stable_diffusion', 'torchbenchmark.models.fb.stable_diffusion']
ERROR
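The fallback lookup implied by this traceback can be sketched as follows. This is an illustrative reconstruction, not torchbench's actual code; the function name `load_first_available` is hypothetical, and the real logic lives in `benchmarks/dynamo/torchbench.py`'s `load_model`:

```python
import importlib

# Illustrative sketch of the behavior implied by the traceback above:
# torchbench tries several module paths in order (models, canary_models,
# models.fb) and only raises once every candidate fails to import.
def load_first_available(model_name):
    candidates = [
        f"torchbenchmark.models.{model_name}",
        f"torchbenchmark.canary_models.{model_name}",
        f"torchbenchmark.models.fb.{model_name}",
    ]
    for path in candidates:
        try:
            return importlib.import_module(path)
        except ImportError:
            continue
    raise ImportError(f"could not import any of {candidates}")
```

So a model placed in `canary_models` resolves through the same lookup as one in `models`; only its position in the candidate list differs.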

@facebook-github-bot

@msaroufim merged this pull request in e7ca300.

pytorchmergebot pushed a commit to pytorch/pytorch that referenced this pull request Aug 3, 2023
Includes stable diffusion, whisper, llama7b and clip

To get this to work I had to pass the HF auth token to all CI jobs; GitHub does not pass secrets from parent to child workflows automatically. There's a chance HF will rate limit us, in which case please revert this PR and I'll work on adding a cache next - cc @voznesenskym @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @aakhundov @malfet

Something upstream changed in torchbench too: `hf_Bert` and `hf_Bert_large` are both failing with what looks like a dynamic-shape error that I'm not sure how to debug yet, so for now I added a skip (it felt a bit gross) since others are building on top of this work @ezyang

`llamav2_7b_16h` cannot pass accuracy checks because it OOMs when deep-cloning extra inputs; this seems to mean it does not need to show up in the expected-numbers CSV. Will figure this out when we update the pin with pytorch/benchmark#1803 cc @H-Huang @xuzhao9 @cpuhrsch

Pull Request resolved: #106009
Approved by: https://github.com/malfet
cclauss pushed a commit to cclauss/benchmark that referenced this pull request Jan 22, 2025
When using `includes`, consumers will apply the headers
using `-isystem` instead of `-I`. This keeps consumers'
own compiler diagnostics from applying to `benchmark`'s headers.

More info:

https://bazel.build/reference/be/c-cpp#cc_library.includes

https://bazel.build/reference/be/c-cpp#cc_library.strip_include_prefix

gtest uses `includes` as well:
https://github.com/google/googletest/blob/1d17ea141d2c11b8917d2c7d029f1c4e2b9769b2/BUILD.bazel#L120
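The change described above can be sketched as a minimal `cc_library` target. This is a hypothetical fragment under assumed paths, not the actual `benchmark` BUILD file:

```starlark
# Hypothetical sketch: with `includes`, Bazel propagates the directory to
# consumers via -isystem, so warnings enabled in consumers' builds do not
# fire inside these headers. (strip_include_prefix would use -I instead.)
cc_library(
    name = "benchmark",
    srcs = glob(["src/*.cc"]),
    hdrs = glob(["include/benchmark/*.h"]),
    includes = ["include"],  # consumers get: -isystem <pkg>/include
    visibility = ["//visibility:public"],
)
```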