[A5] Add buffer-id sync ops (get_buf/rls_buf)#169

Merged: zhangstevenunity merged 1 commit into hw-native-sys:main from TaoTao-real:codex/a5-buf-sync-ir on Mar 3, 2026
@TaoTao-real
Contributor

What

  • Add A5 buffer-id synchronization ops: pto.get_buf / pto.rls_buf.
  • Lower them in PTOToEmitC to the CCEC builtin intrinsics get_buf(...) / rls_buf(...).

Notes

  • These ops are intended for A5-only usage; existing code paths are unchanged unless the new ops are emitted.
  • Add an InjectSync .pto smoke sample + runner guard to ensure the lowering keeps producing get_buf/rls_buf calls.
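
The runner guard mentioned above could look roughly like the sketch below. This is an illustration only, not the check that runop.sh actually performs: the helper name `missing_sync_calls`, the regex, and the sample strings (including the placeholder arguments) are all assumptions. It scans the emitted source text and reports which of the two intrinsics never appears as a call.

```python
import re

# Hypothetical guard: the names checked come from the PR description, but the
# helper itself and its matching rule are illustrative, not taken from runop.sh.
REQUIRED_CALLS = ("get_buf", "rls_buf")


def missing_sync_calls(emitted_source: str) -> list[str]:
    """Return the required intrinsic names that never appear as a call."""
    missing = []
    for name in REQUIRED_CALLS:
        # \b stops "get_buf" from matching inside a longer identifier such as
        # "my_get_buf"; the "\s*\(" part requires an actual call, not a mention.
        if not re.search(rf"\b{name}\s*\(", emitted_source):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Placeholder arguments: the real intrinsic operands are not shown in this PR text.
    sample = "get_buf(a, b);\nrls_buf(a, b);\n"
    problems = missing_sync_calls(sample)
    if problems:
        raise SystemExit(f"lowering lost calls: {', '.join(problems)}")
    print("ok")
```

A guard of this shape fails loudly when a later change stops emitting either call, which is the regression the smoke sample is meant to catch.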

Tests

  • ninja -C build ptoas _pto
  • PYTHONPATH=<mlir_core>:<PTOAS build/python> bash test/samples/runop.sh all

Introduce pto.get_buf/pto.rls_buf to model A5 BufID synchronization and lower them to CCEC builtins get_buf/rls_buf in EmitC.

Also add an InjectSync .pto smoke sample + runop guard to ensure lowering stays intact.

Tests:
- ninja -C build ptoas _pto
- PYTHONPATH=... bash test/samples/runop.sh all
zhangstevenunity merged commit b869ce4 into hw-native-sys:main on Mar 3, 2026
12 of 15 checks passed
liggest pushed a commit to liggest/PTOAS that referenced this pull request Apr 27, 2026
* feat: add TileOps templates and basic test cases for tcolexpand operations

* test: add tcolexpand operators test cases

* fix: add TODO noting that tcolexpanddiv needs a high-precision version

* feat: implement tcolexpandexpdif with vexpdif for fp32, and with vsub+vexp for fp16

* fix: add PR386 license headers to template and test files

- Add license headers to 7 tcolexpand*_template.py files
- Add license headers to test case files (CMakeLists.txt, compare.py, launch.cpp, main.cpp, gen_data.py, cases.py)

* feat: register tcolexpand operators in CMakeLists.txt

* fix: replace aclFloat16 with uint16_t in tcolexpand test cases

- Replace aclFloat16 with uint16_t in main.cpp and launch.cpp (16 files)
- Remove duplicate license headers in 5 launch.cpp files
- Fix .pto comments: aclFloat16 -> fp16
- Remove unnecessary #include "acl/acl.h" from launch.cpp files
- Align with tpartmax implementation pattern

---------

Co-authored-by: User <user@example.com>
KurrinQu pushed a commit to KurrinQu/PTOAS that referenced this pull request Apr 28, 2026
Zhendong404 pushed a commit to Zhendong404/PTOAS that referenced this pull request May 1, 2026
FangRui0 pushed a commit to FangRui0/PTOAS that referenced this pull request May 14, 2026