Update the benchmark to launch one instance per (dataset, synthesizer) pair by R-Palazzo · Pull Request #611 · sdv-dev/SDGym

R-Palazzo · 2026-05-26T13:32:08Z

Resolve #605
86b9zz525

Tested for single-table here: it successfully launched 81 instances (9 synthesizers × 9 datasets).
Tested for multi-table here: it created 162 instances (54 datasets × 3 synthesizers).
We need to store the list of datasets and synthesizers to run the benchmarks on. I stored them in a Python file, but I can move them to YAML if we prefer. To include a new synthesizer/dataset, we would just need to update these lists.
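
The one-instance-per-pair launch described above can be sketched as a Cartesian product over the two lists. This is a minimal illustration, not the code from this PR: the names `DATASETS`, `SYNTHESIZERS`, and `build_jobs`, and the placeholder dataset and synthesizer names, are hypothetical.

```python
from itertools import product

# Hypothetical placeholder lists; the real lists live in the PR's Python file.
DATASETS = [f'dataset_{i}' for i in range(9)]
SYNTHESIZERS = [f'Synthesizer_{i}' for i in range(9)]


def build_jobs(datasets, synthesizers):
    """Return one job description per (dataset, synthesizer) pair."""
    return [
        {'dataset': dataset, 'synthesizer': synthesizer}
        for dataset, synthesizer in product(datasets, synthesizers)
    ]


jobs = build_jobs(DATASETS, SYNTHESIZERS)
print(len(jobs))  # 9 datasets x 9 synthesizers = 81 jobs
```

Under this scheme, adding a synthesizer or dataset to either list grows the number of launched instances multiplicatively, which matches the 81 and 162 counts reported above.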

codecov · 2026-05-26T13:39:09Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.81%. Comparing base (6228ecb) to head (4e89898).

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #611      +/-   ##
==========================================
- Coverage   85.91%   85.81%   -0.11%     
==========================================
  Files          40       40              
  Lines        3749     3722      -27     
==========================================
- Hits         3221     3194      -27     
  Misses        528      528

| Flag | Coverage Δ | |
|---|---|---|
| integration | `44.16% <0.00%> (+0.31%)` | ⬆️ |
| unit | `81.59% <100.00%> (-0.14%)` | ⬇️ |

fealho

This runs the UniformSynthesizer in every instance. So for example, for the 81 single table runs, it ran 81 times. Is this intended?

Also, do we want to change this? I don't think it affects the monthly run, because the config returns before reaching that line, but if you pass datasets/synthesizers, it still creates only 1 instance.

And could you double check the CLI path still works? I think maybe it depended on the yaml files you deleted. Something like this:

python sdgym/_benchmark_launcher/script.py --modality single_table --output-destination s3://bucket/test --synthesizers CTGANSynthesizer

amontanez24

This looks good. I just want to confirm something. Does the instance that runs the workflow to launch the jobs download all the datasets for all jobs, or only the one for the job it launches?

I understand we are doing one job = 1 synthesizer and dataset, but I want to make sure the instance that launches the job isn't downloading too much

R-Palazzo · 2026-05-27T16:02:47Z

> This looks good. I just want to confirm something. Does the instance that runs the workflow to launch the jobs download all the datasets for all jobs, or only the one for the job it launches?
>
> I understand we are doing one job = 1 synthesizer and dataset, but I want to make sure the instance that launches the job isn't downloading too much

@amontanez24, yes, that was the case. The instance that runs the workflow was downloading all the datasets. I tried adding the rel-bench dataset and the workflow crashed when trying to download the data:

https://github.com/sdv-dev/SDGym/actions/runs/26508750022/job/78067938428

To avoid this, in this PR I moved the dataset loading to execution time, so only the GCP instances load the data. I'm running some tests to check that it works end-to-end with the rel-bench datasets:
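
The change described above, deferring dataset loading from launch time to execution time, can be sketched as follows. This is an illustration under stated assumptions rather than SDGym's actual code: `Job`, `launch`, `run_job`, and the `loader` callable are hypothetical names, and the dataset names are placeholders.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Job:
    """Hypothetical job description: stores names only, never the data itself."""

    dataset_name: str
    synthesizer_name: str


def launch(dataset_names, synthesizer_names):
    """Runs on the launcher: builds jobs without downloading any dataset."""
    return [Job(d, s) for d in dataset_names for s in synthesizer_names]


def run_job(job: Job, loader: Callable[[str], object]):
    """Runs on the worker instance: loads only the dataset this job needs."""
    data = loader(job.dataset_name)
    return job.synthesizer_name, data


# Record which datasets actually get loaded.
loaded = []


def fake_loader(name):
    """Stand-in for a real download; only records the requested name."""
    loaded.append(name)
    return {'name': name}


jobs = launch(['dataset_a', 'dataset_b'], ['GaussianCopulaSynthesizer'])
loaded_at_launch = list(loaded)  # still empty: launching loads nothing
run_job(jobs[0], fake_loader)
print(loaded_at_launch, loaded)
```

Because the launcher only handles names, a large dataset affects the worker that runs it and not the workflow instance that creates the jobs.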

R-Palazzo · 2026-05-28T10:55:01Z

> This runs the UniformSynthesizer in every instance. So for example, for the 81 single table runs, it ran 81 times. Is this intended?
>
> Also, do we want to change this? I don't think it affects the monthly run, because the config returns before reaching that line, but if you pass datasets/synthesizers, it still creates only 1 instance.
>
> And could you double check the CLI path still works? I think maybe it depended on the yaml files you deleted. Something like this:
>
> python sdgym/_benchmark_launcher/script.py --modality single_table --output-destination s3://bucket/test --synthesizers CTGANSynthesizer

Hi @fealho thanks for the review!

Yes, running the UniformSynthesizer in every instance is intended. However, when we aggregate the results to generate the monthly leaderboard, only one UniformSynthesizer per dataset is retained and used to compute Adjusted_Total_Time and Adjusted_Quality_Score, so it's consistent across all synthesizers and datasets.
Good catch on script.py. I updated it so it always launches one instance per job.
Tested the CLI here, running:

python sdgym/_benchmark_launcher/script.py --modality single_table --output-destination s3://sdgym-benchmark/Debug/Issue_605_CLI --synthesizers GaussianCopulaSynthesizer
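
The aggregation step mentioned above, keeping a single UniformSynthesizer result per dataset, might look like the following. This is only a sketch: the row layout and the `keep_one_uniform_per_dataset` helper are hypothetical, the dataset names are placeholders, and the actual computation of Adjusted_Total_Time and Adjusted_Quality_Score is not shown.

```python
def keep_one_uniform_per_dataset(rows):
    """Drop repeated UniformSynthesizer rows, keeping the first one per dataset.

    ``rows`` is a hypothetical list of dicts with 'dataset' and 'synthesizer' keys.
    """
    seen = set()
    result = []
    for row in rows:
        if row['synthesizer'] == 'UniformSynthesizer':
            if row['dataset'] in seen:
                continue  # a baseline row for this dataset was already kept
            seen.add(row['dataset'])
        result.append(row)
    return result


rows = [
    {'dataset': 'dataset_a', 'synthesizer': 'UniformSynthesizer'},
    {'dataset': 'dataset_a', 'synthesizer': 'GaussianCopulaSynthesizer'},
    {'dataset': 'dataset_a', 'synthesizer': 'UniformSynthesizer'},
    {'dataset': 'dataset_b', 'synthesizer': 'UniformSynthesizer'},
]
deduplicated = keep_one_uniform_per_dataset(rows)
print(len(deduplicated))  # 3
```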

R-Palazzo · 2026-06-01T17:18:52Z

+            'instacart_marketbasket_ml',
+            'MovieLens',
+            'rossmann',
+            'Telstra',
+            'walmart',


I checked it with Kalyan this morning and we want to have all the private demo datasets included in the benchmark

sarahmish

I'd prefer that the lists of synthesizers and datasets be moved to a separate file, as you suggested, to make adding and removing datasets and synthesizers easy. That could be a separate issue.

R-Palazzo requested review from amontanez24 and fealho May 26, 2026 13:32

R-Palazzo requested a review from a team as a code owner May 26, 2026 13:32

R-Palazzo removed the request for review from a team May 26, 2026 13:32

fealho requested changes May 26, 2026

amontanez24 reviewed May 26, 2026

R-Palazzo force-pushed the issue-605-single-job-per-instance branch from ea2f9bc to c378d46 Compare May 27, 2026 14:49

R-Palazzo force-pushed the issue-604-2-private-bucket branch from 95be93a to 9a91033 Compare May 27, 2026 14:53

R-Palazzo force-pushed the issue-605-single-job-per-instance branch 2 times, most recently from ba22fa7 to fdb0a93 Compare May 27, 2026 15:01

R-Palazzo force-pushed the issue-604-2-private-bucket branch from 725c22c to 5555441 Compare May 28, 2026 09:27

R-Palazzo force-pushed the issue-605-single-job-per-instance branch from 19c7c43 to 5cb84b0 Compare May 28, 2026 10:30

R-Palazzo requested review from amontanez24 and fealho May 28, 2026 11:00

fealho approved these changes May 28, 2026

amontanez24 approved these changes May 28, 2026

R-Palazzo requested a review from sarahmish May 29, 2026 09:11

R-Palazzo changed the base branch from issue-604-2-private-bucket to main May 29, 2026 09:14

R-Palazzo changed the base branch from main to issue-604-2-private-bucket May 29, 2026 09:14

R-Palazzo force-pushed the issue-604-2-private-bucket branch from b97fa0d to 0786528 Compare June 1, 2026 17:08

R-Palazzo commented Jun 1, 2026

R-Palazzo force-pushed the issue-605-single-job-per-instance branch from cdf74e5 to 142d9b9 Compare June 1, 2026 17:34

R-Palazzo force-pushed the issue-604-2-private-bucket branch 2 times, most recently from 4907bb3 to e4bfb1d Compare June 1, 2026 18:35

R-Palazzo force-pushed the issue-605-single-job-per-instance branch from 142d9b9 to bd95074 Compare June 1, 2026 18:37

sarahmish approved these changes Jun 1, 2026

Base automatically changed from issue-604-2-private-bucket to main June 1, 2026 19:55

R-Palazzo added 8 commits June 1, 2026 20:57

def 605

a3c3163

unit tests

4189514

test on rel-bench dataset

3acc538

pip install from the current branch

d3960bf

fix script.py: always launch 1 instance per job in CLI

6a07628

undo changes for testing

a88e0b1

undo workflow change

9c9e513

add all private demo datasets

4e89898

R-Palazzo force-pushed the issue-605-single-job-per-instance branch from bd95074 to 4e89898 Compare June 1, 2026 19:58

R-Palazzo merged commit 00c14d2 into main Jun 1, 2026
51 checks passed

R-Palazzo deleted the issue-605-single-job-per-instance branch June 1, 2026 22:25
