Add block_info to csrc #4
Merged
This pull request introduces a new `BlockInfo` struct in the `csrc/src/block_info.h` file to manage sequence length and padding information for query and key sequences. The struct is designed for use in GPU kernels, with support for variable-length sequences and efficient offset calculations.

Key Changes:

New `BlockInfo` struct: Adds a `BlockInfo` struct with a boolean template parameter `Varlen` to handle variable-length sequences. It includes logic for initializing sequence-related parameters such as cumulative sequence lengths, actual sequence lengths, and padding. ([csrc/src/block_info.hR1-R49](https://github.com/flash-algo/flash-sparse-attention/pull/4/files#diff-394b0098afb828601be7aad41de4a4125220493eb2b3bf4e768797a39757beaeR1-R49))

Offset calculation methods: Adds `q_offset` and `k_offset` methods to compute memory offsets for query and key sequences based on batch and row strides. These methods carry the `__forceinline__` and `__device__` qualifiers so they can be inlined into GPU kernels. ([csrc/src/block_info.hR1-R49](https://github.com/flash-algo/flash-sparse-attention/pull/4/files#diff-394b0098afb828601be7aad41de4a4125220493eb2b3bf4e768797a39757beaeR1-R49))

Namespace and header updates: Places the struct in the `FLASH_NAMESPACE` namespace for modularity and includes the necessary headers, such as `namespace_config.h`. ([csrc/src/block_info.hR1-R49](https://github.com/flash-algo/flash-sparse-attention/pull/4/files#diff-394b0098afb828601be7aad41de4a4125220493eb2b3bf4e768797a39757beaeR1-R49))
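
For readers who want a feel for the shape of such a struct, here is a minimal sketch of the idea described above. It is not the code from this pull request: `SketchParams`, `BlockInfoSketch`, the `SKETCH_INLINE` macro, and every field name are hypothetical, and the offset rule (batch stride for fixed-length input, cumulative length times row stride for variable-length input) is an assumption about how such offsets are typically computed. The namespace wrapper and `namespace_config.h` include are left out so the sketch stays self-contained.

```cpp
#include <cstdint>

// Under nvcc the helpers become inlined host/device functions; with a plain
// C++ compiler they fall back to ordinary inline functions for host testing.
#if defined(__CUDACC__)
#define SKETCH_INLINE __forceinline__ __host__ __device__
#else
#define SKETCH_INLINE inline
#endif

// Hypothetical parameter bundle; all field names are assumptions.
struct SketchParams {
    const int *cu_seqlens_q;  // cumulative query lengths (batch + 1 entries) or nullptr
    const int *cu_seqlens_k;  // cumulative key lengths (batch + 1 entries) or nullptr
    const int *leftpad_k;     // per-batch left padding of the keys, or nullptr
    int seqlen_q;             // fixed query length when lengths are not variable
    int seqlen_k;             // fixed key length when lengths are not variable
};

template <bool Varlen = true>
struct BlockInfoSketch {
    // bidb is the batch index handled by the current thread block.
    SKETCH_INLINE BlockInfoSketch(const SketchParams &params, const int bidb)
        : sum_s_q(!Varlen || params.cu_seqlens_q == nullptr ? -1 : params.cu_seqlens_q[bidb]),
          sum_s_k(!Varlen || params.cu_seqlens_k == nullptr ? -1 : params.cu_seqlens_k[bidb]),
          actual_seqlen_q(sum_s_q == -1 ? params.seqlen_q
                                        : params.cu_seqlens_q[bidb + 1] - sum_s_q),
          leftpad_k(params.leftpad_k == nullptr ? 0 : params.leftpad_k[bidb]),
          actual_seqlen_k((sum_s_k == -1 ? params.seqlen_k
                                         : params.cu_seqlens_k[bidb + 1] - sum_s_k) - leftpad_k) {}

    // Offset of the first query row of this batch entry.
    template <typename index_t>
    SKETCH_INLINE index_t q_offset(const index_t batch_stride, const index_t row_stride,
                                   const int bidb) const {
        return sum_s_q == -1 ? bidb * batch_stride
                             : static_cast<index_t>(sum_s_q) * row_stride;
    }

    // Offset of the first non-padding key row of this batch entry.
    template <typename index_t>
    SKETCH_INLINE index_t k_offset(const index_t batch_stride, const index_t row_stride,
                                   const int bidb) const {
        return sum_s_k == -1 ? bidb * batch_stride + leftpad_k * row_stride
                             : static_cast<index_t>(sum_s_k + leftpad_k) * row_stride;
    }

    // Declaration order matches the initializer list above.
    const int sum_s_q;          // -1 marks the fixed-length layout
    const int sum_s_k;          // -1 marks the fixed-length layout
    const int actual_seqlen_q;
    const int leftpad_k;
    const int actual_seqlen_k;
};
```

In this sketch a kernel would construct `BlockInfoSketch<true>` once per thread block from the batch index and then call `q_offset` and `k_offset` with the tensor strides. Making `Varlen` a template parameter lets the compiler drop the cumulative-length branch entirely in the fixed-length instantiation.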