
Conversation

ahadnagy (Contributor)

What does this PR do?

This PR adds the new collated reports job to the Nvidia CI as well, producing reports like this one: https://huggingface.co/datasets/optimum-amd/transformers_daily_ci/blob/main/2025-08-25/runs/39-17221003312/ci_results_run_models_gpu/collated_reports_e68146f.json

This is needed to easily compare test results between platforms.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

job: run_models_gpu
report_repo_id: ${{ inputs.report_repo_id }}
gpu_name: ${{ inputs.runner_type }}
machine_type: ${{ matrix.machine_type }}
ahadnagy (Contributor Author)

Not really sure about this line.

Member

It's single-gpu or multi-gpu, no?

ahadnagy (Contributor Author)

That should be it, yes. But I'm not sure "matrix" works in this context.

Member

Oh yeah, I see it now. Move the collated reports section inside .github/workflows/model_jobs.yml instead 👍
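
For context, a minimal sketch of what the collated reports step could look like once moved into .github/workflows/model_jobs.yml, where the matrix context is actually available. The action path, runner labels, and input declarations here are assumptions for illustration, not taken from this PR:

on:
  workflow_call:
    inputs:
      report_repo_id:
        type: string
        required: true
      runner_type:
        type: string
        required: true

jobs:
  run_models_gpu:
    strategy:
      matrix:
        machine_type: [single-gpu, multi-gpu]
    runs-on: [self-hosted, nvidia-gpu]   # illustrative runner labels
    steps:
      # ... existing model test steps would go here ...
      - name: Collated reports
        if: always()   # still upload a report when tests fail
        uses: ./.github/actions/collated-reports   # assumed local action path
        with:
          job: run_models_gpu
          report_repo_id: ${{ inputs.report_repo_id }}
          gpu_name: ${{ inputs.runner_type }}
          machine_type: ${{ matrix.machine_type }}   # matrix context resolves inside this job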

@Rocketknight1 (Member)

cc @ydshieh

@ivarflakstad (Member) left a comment

LGTM, but I'll let the CI master make the final call.

slack_report_channel: "#transformers-ci-past-future"
docker: huggingface/transformers-all-latest-torch-nightly-gpu
ci_event: Nightly CI
runner_type: "a10"
Member

@ydshieh you had an idea for how to get the gpu name dynamically, right?

ydshieh (Collaborator)

I'm too lazy to do anything here, so let's just keep it a10.
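
For reference, a minimal sketch of picking up the GPU name at runtime instead of hard-coding the runner type. The step id and action path are illustrative, not from this PR:

steps:
  - name: Detect GPU name
    id: gpu
    run: |
      # nvidia-smi prints one name per GPU; take the first one
      name=$(nvidia-smi --query-gpu=name --format=csv,noheader | head -n 1)
      echo "name=${name}" >> "$GITHUB_OUTPUT"

  - name: Collated reports
    uses: ./.github/actions/collated-reports   # assumed local action path, as above
    with:
      gpu_name: ${{ steps.gpu.outputs.name }}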

ydshieh (Collaborator)

BTW, this is not the workflow you want to compare against; it runs against the torch nightly build.

What you want to compare against is .github/workflows/self-scheduled-caller.yml, I guess.


ydshieh (Collaborator) commented Aug 28, 2025

I am not a big fan of having this for Nvidia runs (I might change my mind in the future, but not now).

I thought this report was only for AMD when @ivarflakstad worked on it.

However, I am fine with having it run and seeing how it goes. Please keep that job working stably (although I guess the whole workflow still works even if that job fails).

BTW, what are those runs in

https://huggingface.co/datasets/optimum-amd/transformers_daily_ci/tree/main/2025-08-25/runs

with the small workflow run numbers (14-17197751843, 15-17208128961, etc.)? Which workflow uploaded them?

ydshieh (Collaborator) commented Aug 28, 2025

Before merging, please try to trigger a run (using a push event) to check that it works well, but with a small list of models.

Don't hesitate to reach out to me if you need info about how to do that.
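
A sketch of what such a push-triggered test run could look like in a caller workflow, assuming a temporary branch filter and a hypothetical input to shrink the model list; the actual mechanism in the transformers CI may differ:

on:
  push:
    branches:
      - test-collated-reports   # temporary test branch, removed before merge

jobs:
  model-ci:
    uses: ./.github/workflows/self-scheduled.yml
    with:
      job: run_models_gpu
      slack_report_channel: "#transformers-ci-dummy"            # hypothetical scratch channel
      ci_event: "Push CI (test)"
      report_repo_id: hf-internal-testing/transformers_daily_ci
      models_to_test: "bert,gpt2"                                # hypothetical input limiting the model list
    secrets: inherit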

ahadnagy (Contributor Author) commented Sep 2, 2025

Just as a paper trail, here's a successful run with a push trigger:
https://huggingface.co/datasets/hf-internal-testing/transformers_daily_ci/tree/main/2025-09-02/runs/1277-17398796375/ci_results_run_models_gpu

7510f4a

@ydshieh (Collaborator) left a comment

Thanks!

@ahadnagy merged commit 8c60a7c into huggingface:main on Sep 2, 2025 (14 checks passed).