forked from NVIDIA/cutlass
Support of FP8 Chunk Prefill kernel #542
Draft: adityachatter wants to merge 16 commits into intel:main from adityachatter:achatter/fp8_chunk_prefill
The goal is to make warnings raise errors during the build. To get there, all compilation warnings must be handled, and any non-serious, unavoidable warnings must be suppressed.

---------
Co-authored-by: Joy, Albin <albin.joy@intel.com>
…ded" Error in Loops (intel#511)

**Problem Description:** In SYCL profiling mode (CUTLASS_SYCL_PROFILING_ENABLED), calling timer.start() repeatedly in a loop (lines 307-313 of examples/03_bmg_gemm_streamk/03_bmg_gemm_streamk.cpp) throws:

> terminate called after throwing an instance of 'std::runtime_error'
> what(): Event is already being recorded.
> Aborted (core dumped)

**Root Cause:** The SYCL event manager checks whether an event is already in use (via event.getIndex() != -1 in tools/util/include/cutlass/util/sycl_event_manager.hpp). The timer's start and stop events are not reset after each measurement in milliseconds(), which prevents event reuse in subsequent start() calls and leads to the runtime error.

**Proposed Fix:** Update sycl_timer.hpp to reset start_ and stop_ to a default SyclEvent{} after each measurement in milliseconds(), ensuring getIndex() returns -1 before the next start().

---------
Signed-off-by: Chen, Xi2 <xi2.chen@intel.com>
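The reset described in the proposed fix can be illustrated with a small host-only sketch. It does not use SYCL or the real classes: `MockEvent`, `MockTimer`, and `run_timing_loop` are hypothetical stand-ins invented for this illustration, and the only behavior borrowed from the commit message is that an event with an index other than -1 refuses to be recorded again, and that `milliseconds()` returns both events to their default state.

```cpp
#include <cassert>
#include <chrono>
#include <stdexcept>

// Hypothetical stand-in for a profiling event (NOT the real SyclEvent).
// An index of -1 means "not in use", mirroring the check quoted above.
class MockEvent {
public:
  int getIndex() const { return index_; }

  // Refuses to record if the event is still marked as in use.
  void record(int index) {
    if (index_ != -1) {
      throw std::runtime_error("Event is already being recorded.");
    }
    index_ = index;
    stamp_ = std::chrono::steady_clock::now();
  }

  std::chrono::steady_clock::time_point stamp() const { return stamp_; }

private:
  int index_ = -1;
  std::chrono::steady_clock::time_point stamp_{};
};

// Hypothetical timer (NOT the real sycl_timer.hpp class).
class MockTimer {
public:
  void start() { start_.record(0); }
  void stop() { stop_.record(1); }

  // Returns the elapsed time, then resets both events to their default
  // state so that getIndex() is -1 again before the next start().
  float milliseconds() {
    std::chrono::duration<float, std::milli> elapsed =
        stop_.stamp() - start_.stamp();
    start_ = MockEvent{};
    stop_ = MockEvent{};
    return elapsed.count();
  }

private:
  MockEvent start_;
  MockEvent stop_;
};

// Repeats start/stop/milliseconds in a loop, the pattern that failed
// before the reset was added. Returns the number of completed iterations.
int run_timing_loop(int iterations) {
  MockTimer timer;
  int completed = 0;
  for (int i = 0; i < iterations; ++i) {
    timer.start();
    timer.stop();
    float ms = timer.milliseconds();
    assert(ms >= 0.0f);
    ++completed;
  }
  return completed;
}
```

Without the two reset assignments in `milliseconds()`, the second call to `start()` in the loop would hit the `index_ != -1` branch and throw, which is the failure mode the commit message reports.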
…ation in SYCL (intel#515)

1. Generates random data on the host and copies it to the device via syclcompat::memcpy, improving code reuse and supporting a broader range of element types.
2. Adds an optional bits parameter to control fractional precision, which was not present in the original.

---------
Signed-off-by: Chen, Xi2 <xi2.chen@intel.com>
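A rough host-side sketch of the two points above follows. The commit message does not show the actual implementation, so everything here is an assumption: `make_random_host_data` is a hypothetical helper, and `bits` is interpreted as the number of fractional binary digits to keep (a negative value leaves values untouched). The device copy step is omitted so the sketch runs without a SYCL toolchain.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper (not from the PR): fills a host buffer with uniform
// random values in [lo, hi). If bits >= 0, each value keeps only `bits`
// fractional binary digits; the rest are truncated toward zero.
// Assumption: this is what "control fractional precision" means.
template <typename Element>
std::vector<Element> make_random_host_data(std::size_t count,
                                           std::uint64_t seed,
                                           double lo, double hi,
                                           int bits = -1) {
  std::mt19937_64 rng(seed);
  std::uniform_real_distribution<double> dist(lo, hi);

  std::vector<Element> host(count);
  for (auto& value : host) {
    double x = dist(rng);
    if (bits >= 0) {
      double scale = std::ldexp(1.0, bits);  // 2^bits
      x = std::trunc(x * scale) / scale;     // drop finer fractional digits
    }
    value = static_cast<Element>(x);
  }
  // In the PR, the host buffer would next be copied to the device
  // (the commit message names syclcompat::memcpy); omitted in this sketch.
  return host;
}
```

Generating on the host in `double` and converting once at the end is one way a single routine could serve many element types, since only the final `static_cast` depends on `Element`. With `bits = 0` the values become whole numbers, which can make reference comparisons for low-precision types less sensitive to rounding.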
This change imports `SYCLCompat` into the cutlass-sycl repo as `compat`. Previous dependencies on `syclcompat` are changed to `compat`. This PR also fixes some failures of `SYCLCompat` in oneAPI 2025.2.

---------
Co-authored-by: Roland Schulz <roland.schulz@intel.com>
Signed-off-by: Aditya Chatterjee <Aditya.Chatterjee@intel.com>
Signed-off-by: Aditya Chatterjee <Aditya.Chatterjee@intel.com>
Signed-off-by: Aditya Chatterjee <Aditya.Chatterjee@intel.com>
[In progress]
Changes are stacked on top of the Chunk Prefill pull request: #498
Run test code as:
TODO: