@pwhMass pwhMass commented Jan 20, 2025

No description provided.

pwhMass commented Jan 20, 2025

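The code is not usable yet, and the transpose optimization only covers the 2-dimensional case. I am currently testing which conditions give better performance.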

@YdrMaster YdrMaster self-assigned this Jan 20, 2025
@YdrMaster YdrMaster marked this pull request as draft January 20, 2025 09:37
@pwhMass pwhMass force-pushed the rearrange branch 2 times, most recently from cfd19d9 to b6d2b42 Compare January 24, 2025 13:43
@pwhMass pwhMass force-pushed the rearrange branch 2 times, most recently from de841a0 to 8d4cd02 Compare February 20, 2025 16:23
@YdrMaster YdrMaster marked this pull request as ready for review February 24, 2025 03:47
// indices of src strides, sorted in descending order
let src_strides_desc_idx = (0..scheme_update.ndim())
.zip(src_strides)
.sorted_by(|a, b| b.1.cmp(&a.1))

Check warning

Code scanning / clippy

this expression creates a reference which is immediately dereferenced by the compiler
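For readers who want to see the idea behind the snippet above in isolation, here is a minimal sketch of ordering dimension indices by stride, largest first. It uses only the standard library instead of the `sorted_by` adapter seen in the PR, and the helper name `strides_desc_idx` is hypothetical. Because it compares `isize` values directly, only the single borrow that `cmp` requires is needed.

```rust
/// Returns the dimension indices ordered by stride, largest stride first.
/// `strides_desc_idx` is a hypothetical helper name, not taken from the PR.
fn strides_desc_idx(strides: &[isize]) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..strides.len()).collect();
    // Comparing b against a yields descending order. `sort_by` is stable,
    // so dimensions with equal strides keep their original relative order.
    idx.sort_by(|&a, &b| strides[b].cmp(&strides[a]));
    idx
}
```

For example, `strides_desc_idx(&[4, 32, 1])` yields `[1, 0, 2]`, since dimension 1 has the largest stride and dimension 2 the smallest.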
let dst_cs = dst_cs / unit;
let src_rs = src_rs / unit;
let src_cs = src_cs / unit;
let unit = unit as usize;

Check warning

Code scanning / clippy

casting to the same type is unnecessary (`usize` -> `usize`)
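The stride arithmetic in the snippet above can be illustrated with a CPU reference for a strided 2D rearrange. This is only a sketch under stated assumptions, not the kernel from the PR: the function name `rearrange_2d` and its signature are hypothetical, strides are assumed to be non-negative byte offsets that are multiples of `unit`, and `unit` is the number of bytes copied per element.

```rust
/// CPU reference for a strided 2D rearrange (hypothetical helper, not the PR's kernel).
/// `shape` is [rows, cols]; strides are [row stride, column stride] in bytes.
/// Assumes every stride is a non-negative multiple of `unit`.
fn rearrange_2d(
    dst: &mut [u8],
    src: &[u8],
    shape: [usize; 2],
    dst_strides: [usize; 2],
    src_strides: [usize; 2],
    unit: usize,
) {
    let [rows, cols] = shape;
    // Convert byte strides into strides counted in `unit`-sized chunks,
    // mirroring the `stride / unit` divisions in the reviewed snippet.
    let [dst_rs, dst_cs] = dst_strides.map(|s| s / unit);
    let [src_rs, src_cs] = src_strides.map(|s| s / unit);
    for r in 0..rows {
        for c in 0..cols {
            // Byte offsets of the chunk at logical position (r, c).
            let d = (r * dst_rs + c * dst_cs) * unit;
            let s = (r * src_rs + c * src_cs) * unit;
            dst[d..d + unit].copy_from_slice(&src[s..s + unit]);
        }
    }
}
```

Transposing a row-major 2x3 byte matrix, for instance, corresponds to `shape = [2, 3]`, `src_strides = [3, 1]`, `dst_strides = [1, 2]` and `unit = 1`.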
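Note that ARRAY_SIZE is currently 5. This constant is tied to the number of tensor dimensions that can be accepted, but making it too large increases the amount of computation in the kernel.
The Operator should use max_warps_block and warp_size to help with the computation; at present they are not used.
block_size is currently fixed at 256 and can be optimized further.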
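To make the trade-off in these notes concrete, here is a rough host-side sketch. It assumes, without confirmation from the PR, that shapes and strides are passed to the kernel in fixed-length arrays of `ARRAY_SIZE` entries, which would explain both the dimension limit and the extra work when the constant grows. The helper names `pad_to_array` and `grid_size` are hypothetical; only the values 5 and 256 come from the notes above.

```rust
const ARRAY_SIZE: usize = 5; // value stated in the PR notes
const BLOCK_SIZE: usize = 256; // fixed block size stated in the PR notes

/// Pads `values` to ARRAY_SIZE entries with `fill`.
/// Returns None when the tensor has more dimensions than ARRAY_SIZE allows.
/// Hypothetical helper: the PR's actual parameter layout may differ.
fn pad_to_array(values: &[usize], fill: usize) -> Option<[usize; ARRAY_SIZE]> {
    if values.len() > ARRAY_SIZE {
        return None;
    }
    let mut out = [fill; ARRAY_SIZE];
    out[..values.len()].copy_from_slice(values);
    Some(out)
}

/// Number of BLOCK_SIZE-thread blocks needed to cover `n` elements, rounded up.
fn grid_size(n: usize) -> usize {
    (n + BLOCK_SIZE - 1) / BLOCK_SIZE
}
```

Under this assumption, a kernel that loops over all `ARRAY_SIZE` entries does the same amount of index arithmetic for a 2-dimensional tensor as for a 5-dimensional one, which is one way a larger constant could raise the per-thread cost.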
@pwhMass pwhMass changed the base branch from main to dev May 7, 2025 13:07
@YdrMaster YdrMaster merged commit 972e357 into YdrMaster:dev May 7, 2025
Ceng23333 pushed a commit to Ceng23333/operators-rs that referenced this pull request Sep 23, 2025

2 participants