Ensure node ids are always selected into the same split by kmontemayor2-sc · Pull Request #104 · Snapchat/GiGL

kmontemayor2-sc · 2025-06-19T02:38:38Z

Changes:

Update _fast_hash to run a second time to improve mixing; without this, our N=20 node ids had splits like [18], [0], [2].
Enforce val_num and test_num as floats, since we can no longer easily guarantee exact counts.
Update the splitting logic to ensure nodes are always selected into the same split (discussed more below).
Add a (basic) test for the "distributed" case.
Remove some tests that relied on integer counts.

The issue with our previous splitting logic is that, even though a given node id would always have the same hash on different machines, we sorted the hashes, so its "position" could differ between machines and it could be selected into different splits.

For instance, let's assume the hash function is the identity, with rank_0_nodes: [0, 1, 2, 3] and rank_1_nodes: [3, 4, 5, 6].

On rank 0, 3 would be selected into test, as its hash value is the greatest after sorting, while on rank 1 it would be in train, as its hash value is the smallest there.
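The example above can be reproduced with a small plain-Python sketch. This is not the GiGL implementation: `split_by_sorted_position` and its 50/25/25 fractions are hypothetical, chosen only to show how slicing a locally sorted list places the same node id in different splits on different ranks.

```python
def split_by_sorted_position(node_ids, val_frac=0.25, test_frac=0.25):
    """Hypothetical position-based splitter: sort by hash, then slice by count."""
    # Identity hash, as assumed in the example: the hash of a node id is the id itself.
    ranked = sorted(node_ids)
    n = len(ranked)
    num_val = int(n * val_frac)
    num_test = int(n * test_frac)
    num_train = n - num_val - num_test
    return {
        "train": ranked[:num_train],
        "val": ranked[num_train:num_train + num_val],
        "test": ranked[num_train + num_val:],
    }


rank_0 = split_by_sorted_position([0, 1, 2, 3])
rank_1 = split_by_sorted_position([3, 4, 5, 6])

print(rank_0)  # {'train': [0, 1], 'val': [2], 'test': [3]}
print(rank_1)  # {'train': [3, 4], 'val': [5], 'test': [6]}
```

Node 3 lands in test on rank 0 but in train on rank 1, because its position depends on which other ids happen to live on the same machine.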

Now what we do is find the globally largest and smallest hash values and then normalize the hash values on each machine against them.

We then select the nodes based on the normalized values. Since the hashes are consistent and are normalized the same way on every machine, the same node id will now always be selected into the same split.
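A minimal sketch of the normalized approach, again in plain Python with the identity hash. The function name, the thresholds, and the way the global extremes are obtained are assumptions for illustration. Here they are computed directly from both lists, whereas a real distributed run would presumably gather them with a collective such as `torch.distributed.all_reduce` using `ReduceOp.MIN` and `ReduceOp.MAX`.

```python
def split_by_normalized_hash(node_ids, global_min, global_max,
                             val_frac=0.25, test_frac=0.25):
    """Hypothetical splitter: assign by normalized hash value, not by local position."""
    train_cutoff = 1.0 - val_frac - test_frac
    val_cutoff = 1.0 - test_frac
    span = max(global_max - global_min, 1)  # avoid dividing by zero
    splits = {"train": [], "val": [], "test": []}
    for node_id in node_ids:
        # Identity hash; normalize into [0, 1] using the *global* extremes.
        normalized = (node_id - global_min) / span
        if normalized < train_cutoff:
            splits["train"].append(node_id)
        elif normalized < val_cutoff:
            splits["val"].append(node_id)
        else:
            splits["test"].append(node_id)
    return splits


rank_0_nodes = [0, 1, 2, 3]
rank_1_nodes = [3, 4, 5, 6]

# Stand-in for a cross-rank min/max reduction.
global_min = min(rank_0_nodes + rank_1_nodes)
global_max = max(rank_0_nodes + rank_1_nodes)

rank_0 = split_by_normalized_hash(rank_0_nodes, global_min, global_max)
rank_1 = split_by_normalized_hash(rank_1_nodes, global_min, global_max)

print(rank_0)  # {'train': [0, 1, 2], 'val': [3], 'test': []}
print(rank_1)  # {'train': [], 'val': [3, 4], 'test': [5, 6]}
```

Node 3 is now in val on both ranks. The per-rank counts are no longer fixed (rank 0 ends up with no test nodes here), which is consistent with val_num and test_num being treated as fractions rather than exact counts.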

kmontemayor2-sc · 2025-06-19T02:38:46Z

/unit_test

github-actions · 2025-06-19T02:38:56Z

GiGL Automation

@ 02:38:55UTC : 🔄 Unit Test started.

@ 03:06:03UTC : ❌ Workflow failed.
Please check the logs for more details.

kmontemayor2-sc · 2025-06-19T03:53:29Z

/unit_test

kmontemayor2-sc · 2025-06-19T03:53:34Z

/integration_test

kmontemayor2-sc · 2025-06-19T03:53:37Z

/e2e_test

github-actions · 2025-06-19T03:53:42Z

GiGL Automation

@ 03:53:41UTC : 🔄 Unit Test started.

@ 04:27:15UTC : ✅ Workflow completed successfully.

github-actions · 2025-06-19T03:53:45Z

GiGL Automation

@ 03:53:44UTC : 🔄 Integration Test started.

@ 04:34:46UTC : ✅ Workflow completed successfully.

github-actions · 2025-06-19T03:53:49Z

GiGL Automation

@ 03:53:48UTC : 🔄 E2E Test started.

@ 05:23:37UTC : ✅ Workflow completed successfully.

nshah-sc

thx!

kmontemayor2-sc · 2025-06-23T23:20:13Z

/unit_test

github-actions · 2025-06-23T23:20:26Z

GiGL Automation

@ 23:20:25UTC : 🔄 Unit Test started.

@ 23:43:39UTC : ❌ Workflow failed.
Please check the logs for more details.

kmontemayor2-sc · 2025-06-24T17:03:34Z

/unit_test

github-actions · 2025-06-24T17:03:47Z

GiGL Automation

@ 17:03:47UTC : 🔄 Unit Test started.

@ 17:25:55UTC : ❌ Workflow failed.
Please check the logs for more details.

kmontemayor2-sc · 2025-06-24T20:41:06Z

/unit_test

github-actions · 2025-06-24T20:41:19Z

GiGL Automation

@ 20:41:18UTC : 🔄 Unit Test started.

@ 21:06:43UTC : ❌ Workflow failed.
Please check the logs for more details.

svij-sc

Thanks for the iterations

kmontemayor2-sc · 2025-06-24T22:10:37Z

/unit_test

github-actions · 2025-06-24T22:10:49Z

GiGL Automation

@ 22:10:49UTC : 🔄 Unit Test started.

Ensure node ids are always selected into the same split

245f922

maybe fix test

62cfdf2

kmontemayor2-sc and others added 2 commits June 19, 2025 10:40

Update data_splitters.py

f2e18d3

format

8dde309

nshah-sc reviewed Jun 20, 2025

address comments

dc47517

nshah-sc approved these changes Jun 23, 2025

Comment thread python/gigl/utils/data_splitters.py

Comment thread python/tests/unit/utils/data_splitters_test.py Outdated

Comment thread python/gigl/utils/data_splitters.py

Comment thread python/gigl/utils/data_splitters.py

svij-sc reviewed Jun 23, 2025

address comments - require process group

e726294

comment

4f0c863

svij-sc reviewed Jun 23, 2025

Comment thread python/tests/unit/utils/data_splitters_test.py Outdated

Comment thread python/tests/unit/utils/data_splitters_test.py Outdated

Comment thread python/tests/unit/utils/data_splitters_test.py Outdated

cleaner dist setup

ce74c8e

swap to dist test using identity hash

8e63b39

svij-sc reviewed Jun 24, 2025

Comment thread python/tests/unit/utils/data_splitters_test.py Outdated

kmonte added 2 commits June 24, 2025 19:00

cleanup in tearDown

5634988

setup process group for neighborloader tests

1f32cb7

svij-sc approved these changes Jun 24, 2025

Comment thread python/tests/unit/utils/data_splitters_test.py

kmonte added 3 commits June 24, 2025 21:57

wip

7e63cc1

Merge branch 'main' into kmonte/fix-split

0aae905

launch pg in build_dataset if required

6a2e075

kmontemayor2-sc commented Jun 24, 2025

Comment thread python/gigl/distributed/dataset_factory.py

kmontemayor2-sc marked this pull request as ready for review June 24, 2025 22:43

kmontemayor2-sc requested review from mkolodner-sc and yliu2-sc as code owners June 24, 2025 22:43

kmontemayor2-sc added this pull request to the merge queue Jun 24, 2025

Merged via the queue into main with commit dd619e2 Jun 25, 2025
4 checks passed

kmontemayor2-sc deleted the kmonte/fix-split branch June 25, 2025 00:14

kmontemayor2-sc mentioned this pull request Aug 18, 2025

Update comment to reflect code #278

Merged
