Add HuggingFace Llama3.2 1B to benchmark #5368
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/5368

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 1 Cancelled Job

As of commit e2779ee with merge base 8460d42:
NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed 449b4d1 to b48035a (Compare)

Force-pushed 53e7756 to a13a44b (Compare)
@guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Force-pushed a13a44b to 97050c2 (Compare)

@guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Uploading the model artifacts to GitHub was skipped: https://github.com/pytorch/executorch/actions/runs/10858058150/job/30136354800. I don't see the reason in the log. The model artifacts are placed under

Oops, the size of the exported model is 11+ GB, I think. Uploading such a large file to GitHub takes too long and the job timed out. I need to rework the upload part here, since GitHub doesn't scale to files this large, so we need to go straight to S3.
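The routing decision described above can be sketched as a small helper. The 2 GB cutoff below is an illustrative assumption, not a documented GitHub limit; the point is that an 11+ GB export should never go through the GitHub artifact upload path.

```python
# Hypothetical size-based routing for benchmark artifacts: small files go to
# GitHub artifacts, anything large goes straight to S3. The cutoff value is
# an assumption for illustration, not a documented GitHub limit.
GH_UPLOAD_CUTOFF_BYTES = 2 * 1024**3  # 2 GiB, illustrative

def pick_upload_target(artifact_size_bytes: int) -> str:
    """Return 'github' for small artifacts and 's3' for large ones."""
    return "github" if artifact_size_bytes <= GH_UPLOAD_CUTOFF_BYTES else "s3"
```

With this, the 11+ GB exported model would be routed to S3 while small `.pte` files could still use the regular GitHub artifact upload.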
Force-pushed cd4c507 to 60b62d3 (Compare)

@guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Force-pushed 60b62d3 to 009f932 (Compare)

@guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Tried running gemma-2b on a Google Pixel 8 Pro (with 12 GB RAM). The failure is the same: some I/O failures when connecting to the device in the pool: https://github.com/pytorch/executorch/actions/runs/10908663134/job/30277474048. In the stack trace I see there is a call

I'm checking the AWS docs on this (https://docs.aws.amazon.com/devicefarm/latest/developerguide/limits.html), and they mention a 4 GB limit, but that's for the size of the app, not the extra data archive. Let me run this manually through the AWS UI and see if it accepts the model. The archive size is 5.4 GB: https://github.com/pytorch/executorch/actions/runs/10908663134/job/30278173066#step:11:38. IIRC, llama2 7b works, but its archive is only ~3 GB.
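The size comparison being debated above can be made explicit. Note the assumption: the documented 4 GB Device Farm limit applies to the app package, and whether the same cap applies to the extra-data archive is exactly what the manual AWS UI run is meant to verify.

```python
# Sanity check against the AWS Device Farm limit discussed above. Whether
# the documented 4 GB app limit also applies to the extra-data archive is
# an open question here, so treating it as a hard cap is an assumption.
DEVICE_FARM_LIMIT_GB = 4.0

def fits_device_farm(archive_size_gb: float,
                     limit_gb: float = DEVICE_FARM_LIMIT_GB) -> bool:
    """Return True if the archive is within the assumed size limit."""
    return archive_size_gb <= limit_gb
```

Under this assumption, the 5.4 GB gemma-2b archive would be rejected while the ~3 GB llama2 7b archive would pass, which matches the observed behavior.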
Force-pushed 9e89593 to b2d837e (Compare)

Force-pushed f936584 to 7b55bb9 (Compare)

Force-pushed 7b55bb9 to 6cb6af9 (Compare)

Force-pushed cb3efe3 to bedecd8 (Compare)
SpinQuant and QLoRA are passing.

Original BF16 is passing:
Force-pushed bedecd8 to e2779ee (Compare)

@guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Decided to leave the logic for running the 1B model in scheduled jobs to a separate PR to simplify the review, as it will require significant refactoring of the workflow.

Just a data point: I see your test run shows up on the dashboard at https://hud.pytorch.org/benchmark/llms?startTime=Wed%2C%2011%20Dec%202024%2004%3A07%3A58%20GMT&stopTime=Wed%2C%2018%20Dec%202024%2004%3A07%3A58%20GMT&granularity=hour&lBranch=add_hf_model_to_benchinfra&lCommit=e2779ee5cbe666072a2d0f7a6821d640a11d1ad9&rBranch=add_hf_model_to_benchinfra&rCommit=e2779ee5cbe666072a2d0f7a6821d640a11d1ad9&repoName=pytorch%2Fexecutorch&modelName=All%20Models&backendName=All%20Backends&dtypeName=All%20DType&deviceName=All%20Devices, and the extraction logic looks wrong: for the llama model, the backend and benchmark configs are swapped. I guess this is what you mean by introducing the new
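The swapped-field symptom described above is typical of positional unpacking. As a purely hypothetical illustration (the field names and record layout here are assumptions, not the dashboard's actual code), keyed lookups avoid the class of bug where backend and config trade places:

```python
# Hypothetical sketch of the swapped-field bug: if extraction code unpacks a
# record positionally and the source order changes, backend and config swap.
# Keyed access is order-independent. All names here are illustrative.
from typing import NamedTuple

class BenchmarkEntry(NamedTuple):
    model: str
    backend: str
    config: str

def extract(record: dict) -> BenchmarkEntry:
    # Look fields up by key rather than position, so reordering the source
    # record cannot swap backend and config.
    return BenchmarkEntry(model=record["model"],
                          backend=record["backend"],
                          config=record["config"])
```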
Add llama3.2 1b from Hugging Face to the benchmark with the following configs:
Switched to the memory-intensive runners in the benchmark workflow to reduce operating cost.
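The configs exercised in this PR can be summarized from the conversation above (SpinQuant, QLoRA, and the original BF16 checkpoint all pass). The key names below are illustrative assumptions, not the actual workflow inputs:

```python
# Benchmark config matrix for the Hugging Face llama3.2 1b entry, inferred
# from the conversation (SpinQuant, QLoRA, original BF16). Keys and values
# are illustrative; the real workflow inputs may be named differently.
LLAMA32_1B_CONFIGS = [
    {"model": "llama3.2-1b", "quantization": "spinquant"},
    {"model": "llama3.2-1b", "quantization": "qlora"},
    {"model": "llama3.2-1b", "quantization": None, "dtype": "bf16"},
]

def config_names(configs):
    """Human-readable label for each benchmark configuration."""
    return [c.get("quantization") or c.get("dtype", "default") for c in configs]
```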