[CCLs] Port CCLs to 1D fabric #20544
tests/ttnn/unit_tests/operations/ccl/test_llama_reduce_scatter_async_TG.py
@@ -0,0 +1,55 @@
// SPDX-FileCopyrightText: © 2025 Tenstorrent Inc.
For clarity, the functions below are exposed to TTNN CCL ops. They are hence in the API folder.
cc: @pgkeller
Further note: these will be deleted soon as part of merging the fabric code bases and migrating the "1d" portion of that implementation to the proper control-plane-based APIs.
02d4d1b to 8877b86
### Ticket
[Link to Github Issue](#19961)

### Problem description
Currently the ops and tests utilize the fabric setup done via sub-devices.

### What's changed
1. Migrated ops and tests to rely instead on the fabric setup done during device init, and hence to use the appropriate APIs to set up connections with the fabric kernels
2. Cleaned up tests to get rid of options/vars to set up/tear down fabric
3. Cleaned up the `enable_persistent_fabric_mode` arg, indicating that the ops will run in persistent fabric mode by default.

### Checklist
- [x] [All post commit](https://github.com/tenstorrent/tt-metal/actions/workflows/all-post-commit-workflows.yaml) CI passes (https://github.com/tenstorrent/tt-metal/actions/runs/14481011716)
- [ ] [Blackhole Post commit](https://github.com/tenstorrent/tt-metal/actions/workflows/blackhole-post-commit.yaml) CI passes (if applicable)
- [ ] [Model regression](https://github.com/tenstorrent/tt-metal/actions/workflows/perf-models.yaml) CI passes (if applicable)
- [ ] [Device performance regression](https://github.com/tenstorrent/tt-metal/actions/workflows/perf-device-models.yaml) CI passes (if applicable)
- [ ] **(For models and ops writers)** Full [new models tests](https://github.com/tenstorrent/tt-metal/actions/workflows/full-new-models-suite.yaml) CI passes (if applicable)
- [ ] New/Existing tests provide coverage for changes
- [ ] [TG Unit Tests](https://github.com/tenstorrent/tt-metal/actions/workflows/tg-unit-tests.yaml) (https://github.com/tenstorrent/tt-metal/actions/runs/14447931719)
- [x] [TG Quick](https://github.com/tenstorrent/tt-metal/actions/workflows/tg-quick-trigger.yaml) (https://github.com/tenstorrent/tt-metal/actions/runs/14447914998)
- [ ] [TG Demo Test](https://github.com/tenstorrent/tt-metal/actions/workflows/tg-demo-tests.yaml) (https://github.com/tenstorrent/tt-metal/actions/runs/14447942813)
- [ ] [TG Nightly](https://github.com/tenstorrent/tt-metal/actions/workflows/tg-nightly-tests.yaml) (https://github.com/tenstorrent/tt-metal/actions/runs/14447925563)
- [ ] [TG Frequent Tests](https://github.com/tenstorrent/tt-metal/actions/workflows/tg-frequent-tests.yaml) (https://github.com/tenstorrent/tt-metal/actions/runs/14447936614)
- [ ] [TG Model Perf Tests](https://github.com/tenstorrent/tt-metal/actions/workflows/tg-model-perf-tests.yaml) (https://github.com/tenstorrent/tt-metal/actions/runs/14447904540)
- [ ] [TG Stress](https://github.com/tenstorrent/tt-metal/actions/workflows/tg-stress-trigger.yaml) (https://github.com/tenstorrent/tt-metal/actions/runs/14447919423)
- [ ] [T3K Unit](https://github.com/tenstorrent/tt-metal/actions/runs/14409722583)
- [x] [T3K Nightly](https://github.com/tenstorrent/tt-metal/actions/runs/14409735747)
- [ ] [T3K Frequent](https://github.com/tenstorrent/tt-metal/actions/runs/14409732454)
- [x] [T3K Multiple Pipelines](https://github.com/tenstorrent/tt-metal/actions/runs/14481016916)

---------

Co-authored-by: asaigal <asaigal@tenstorrent.com>
2e7f5c5 to d53c1a7
This reverts commit 959239f.
This reverts commit 959239f.

Changes getting reverted [here](2665505) (due to broken APC) make this commit non-functional on a 6U. The 6U changes need to go in first.

### Ticket
Link to Github Issue

### Problem description
Provide context for the problem.

### What's changed
Describe the approach used to solve the problem. Summarize the changes made and its impact.

### Checklist
- [ ] [All post commit](https://github.com/tenstorrent/tt-metal/actions/workflows/all-post-commit-workflows.yaml) CI passes
- [ ] [Blackhole Post commit](https://github.com/tenstorrent/tt-metal/actions/workflows/blackhole-post-commit.yaml) CI passes (if applicable)
- [ ] [Model regression](https://github.com/tenstorrent/tt-metal/actions/workflows/perf-models.yaml) CI passes (if applicable)
- [ ] [Device performance regression](https://github.com/tenstorrent/tt-metal/actions/workflows/perf-device-models.yaml) CI passes (if applicable)
- [ ] **(For models and ops writers)** Full [new models tests](https://github.com/tenstorrent/tt-metal/actions/workflows/full-new-models-suite.yaml) CI passes (if applicable)
- [ ] New/Existing tests provide coverage for changes
Ticket
Link to Github Issue
Problem description
Currently the ops and tests utilize the fabric setup done via sub-devices.
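What's changed
1. Migrated ops and tests to rely instead on the fabric setup done during device init, and hence to use the appropriate APIs to set up connections with the fabric kernels
2. Cleaned up tests to get rid of options/vars to set up/tear down fabric
3. Cleaned up the `enable_persistent_fabric_mode` arg, indicating that the ops will run in persistent fabric mode by default.
Checklist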