[Ascend] fuj/replace-rotary-emb #1230

jingguo-st (Contributor) commented May 24, 2024

Motivation and Context

  1. Replace the rotary_emb implementation with aclnn.
  2. Support dtype conversion for a vector of AscendTensor objects.
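
For readers unfamiliar with the operator, here is a minimal pure-Python sketch of what a rotary embedding computes, assuming the non-interleaved ("rotate half") convention and a frequency base of 10000. It is only a reference for the math, not the aclnn call this PR switches to; the names `rotate_half` and `rotary_embed` are hypothetical and do not come from the DIOPI code.

```python
import math


def rotate_half(x):
    """Return [-x2, x1], where x1 and x2 are the first and second halves of x."""
    half = len(x) // 2
    return [-v for v in x[half:]] + list(x[:half])


def rotary_embed(x, position, base=10000.0):
    """Apply a non-interleaved rotary embedding to one vector of even length.

    Assumption: element i is paired with element i + d/2, and both share the
    angle position * base ** (-2 * i / d).
    """
    d = len(x)
    if d % 2 != 0:
        raise ValueError("head dimension must be even")
    half = d // 2
    # One angle per pair, duplicated so the list lines up with x element-wise.
    angles = [position * base ** (-2.0 * i / d) for i in range(half)] * 2
    rot = rotate_half(x)
    # out = x * cos + rotate_half(x) * sin, element by element.
    return [
        x[i] * math.cos(angles[i]) + rot[i] * math.sin(angles[i])
        for i in range(d)
    ]
```

At position 0 the input comes back unchanged, and at any position the vector norm is preserved, which makes a convenient sanity check when comparing a device implementation against a reference.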

Description

Use cases (Optional)

BC-breaking (Optional)

Checklist

Before PR:

  • I have read and followed the workflow indicated in the Contributors.md to create this PR.
  • Pre-commit or linting tools indicated in Contributors.md are used to fix the potential lint issues.
  • Bug fixes are covered by unit tests; the case that causes the bug should be added to the unit tests.
  • New functionalities are covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  • The documentation has been modified accordingly, including docstring or example tutorials.

After PR:

  • CLA has been signed and all committers have signed the CLA in this PR.

@jingguo-st jingguo-st reopened this Jun 25, 2024
@yangbofun yangbofun merged commit 68b4f6e into DeepLink-org:feat/replace_opplugin_by_aclnn Jun 26, 2024
14 of 15 checks passed
@yangbofun yangbofun deleted the fuj/replace-rotary-emb branch June 26, 2024 01:38
yangbofun added a commit that referenced this pull request Jul 10, 2024
* [ascend]zq/fix_ops for Ascend 8.0 (#1159)

* Update diopi_test/python/conformance/collect_case.py

* update README

* fix device_config for group_norm

---------

Co-authored-by: jfxu-st <143591296+jfxu-st@users.noreply.github.com>

* [ascend]Zq/update abs with aclnn (#1165)

* add abs by aclnn

* [Ascend] fuj / optim acl workspace for ascend, support hugemem and Size convert (#1170)

optim acl workspace for ascend, support hugemem and Size convert

* [Ascend] fuj/aclnn-hugemem-size-convert fix (#1172)

* optim acl workspace for ascend, support hugemem and Size convert

* fix format

* [ascend]zzf/reimpl uniform (#1179)

reimpl uniform

* [ascend]zzf/topk (#1187)

* reimpl topk with aclnn

* [ascend]Zzf/zeros ones (#1186)

* reimpl zeros, ones with aclnn

* [ascend]zzf/reimpl permute and transpose (#1182)

* reimpl permute and transpose

* [ascend]zq/update bitwise by aclnn (#1183)

* update bitwise by aclnn

* [ascend]zzf/reimpl where with aclnn (#1169)

reimpl where with aclnn

* [ascend]zq/update_arange_by_aclnn (#1174)

* update_arange_by_aclnn

* [ascend]zq/update_col2im_by_aclnn (#1185)

* [ascend]ywt/impl eq op by aclnn (#1173)

* replace eq op

* fix: dropinp call op plugin

* [camb]tyf/fix all bugs in sdk1.18.0 (#1163)

* change contiguous

* [ascend]zq/update ceil by aclnn (#1184)

* update ceil by aclnn

* [ascend]Zq/update flip by aclnn (#1188)

* [Ascend] fuj/aclnn-replace (#1162)

* replace binary op, pow with aclnn

* replace normal with aclnn

* support add, sub, div with aclnn

* using diopiGeneratorGetSeedAndOffset for diopiNormal

* fix ascend config

* Zq/update argmax by aclnn (#1193)

* [ascend]ywt/feature:add_logic op (#1189)

* ywt/feature:add_logic op

* update config.yaml

* [Ascend] Wx/fix adaptor (#1196)

fix adaptor

* [ascend]Zq/update addcdiv and remainder by aclnn (#1167)

* [ascend]zzf/reduce (#1190)

* reimpl reduce op with aclnn

* fix diopiSum

* [ascend]Zq/update cumsum by aclnn (#1197)

* tyf/change gen (#1191)

* change gen

* [camb]tyf/fixMSE (#1198)

* zq/update equal by aclnn (#1201)

* update equal by aclnn

* fix

* [ascend]zq/update repeat by aclnn (#1199)

* update repeat by aclnn

* [ascend]Zq/update mseloss by aclnn (#1200)

* update mse_loss with aclnn

* [Ascend] Wx/reimpl some ops (#1180)

* Skip float64 test cases for some ops[batch_norm, adaptive_avg_pool2d, interpolate], as other ops are implemented using DIOPI_ASCEND_CALL_ACLNN.
* Reimpl activation, cast, atan, sin, cos, fill, floor, isnan, lerp, linalg_vec_norm, linspace, remainder, sgn, sort, threshold, neg, sqrt, rsqrt, erf, log, log2, log10, exp, reciprocal, rms_norm using DIOPI_ASCEND_CALL_ACLNN.
* Fix permute.
* Remove redundant dtype cast.

* [Ascend] fuj/aclnn-replace (#1195)

* fix maximum and minimum

* [ascend]Zq/update clamp by aclnn (#1194)

* [ascend] tyf/fixMaskedSelect (#1206)

* [Ascend] fuj/replace-baddbmm-and-fix-max-and-min-config (#1208)

* replace baddbmm and fix max and min config

* replace bmm with aclnn

* support bfloat16

* add ci

* zq/updata_addmm_by_aclnn (#1171)

* zq/update_matmul_by_aclnn (#1202)

* [Ascend] Wx/reimpl norm op (#1216)

reimpl norm op by aclnn

* [Ascend] fuj/aclnn-replace-dropout (#1214)

replace dropout with aclnn

* [ascend]Zq/update split by aclnn (#1213)

* zzf/reimpl one_hot with aclnn (#1217)

* reimpl one_hot with aclnn

* replace mm with aclnn

* [Ascend]zq/update_addcmul_by_aclnn (#1168)

* [ascend]zq/update LayerNorm by aclnn (#1204)

* [diopi]add attention define and impl on ascend (#1228)

* [Ascend]Zq/update gather by aclnn (#1229)

* [ascend]Zq/update conv2d by aclnn (#1221)

* [ascend]zq/update masked_fill by aclnn (#1205)

* [ascend] fix_Ascend_ci (#1248)

* fix_Ascend_ci

* [ascend]Zq/reimpl max pool2d (#1253)



---------

Co-authored-by: yangbofun <yangbofun@163.com>

* [ci]Align the test logic between V100 and A100 (#1232)

* [ascend]Zq/update group norm by aclnn (#1209)

* [Ascend]Zq/update adaptive avg pool with aclnn (#1192)

* [Ascend]zq/update batch_norm by aclnn (#1181)

* [Ascend] fuj/fix-generator-with-getSeedAndOffset (#1233)

fix generator with getSeedAndOffset

* [Ascend] Wx/fix dtype cast bug of adamw op (#1260)

* fix dtype cast bug of adamw op

* [ascend]Zcx/llama2 infer 910b (#1254)

* optimize lightllm

* fix promptFlashAttention on a+x

* add check for incre flash attention

* add description of the added function

* [Ascend] Wx/reimpl index select, cat and stack op (#1211)

* [Ascend] Wx/reimpl adamw op (#1227)

* reimpl adamw op

* [Ascend] Wx/reimpl scatter op (#1220)

* reimpl scatter op via aclnn

* [Ascend] Wx/reimpl multinomial op (#1218)

* reimpl multinomial op

* [ascend]Zzf/linear (#1231)

* reimpl linear with aclnn

---------

Co-authored-by: NeosZhang <zhangqiu1994@outlook.com>

* [Ascend] fuj/fix-AscendTensor-storage-offset (#1259)

* [ascend]Zzf/interpolate (#1235)

* reimpl upsample with aclnn


---------

Co-authored-by: NeosZhang <zhangqiu1994@outlook.com>

* [ascend]zzf/reimpl non_zero with aclnn (#1223)



---------

Co-authored-by: NeosZhang <zhangqiu1994@outlook.com>

* Yb/modify call aclnn impl (#1265)

* add computeWorkspaceSize

* modify macro

* zq/Update masked select by aclnn with expand (#1263)

* zq/fix masked_select (#1272)

fix masked_select

* [Ascend] fuj/replace-rotary-emb (#1230)

impl rotary embedding with aclnn

* replace inplace copy with aclnn

* reimpl index_put with aclnn

---------

Co-authored-by: ZhangQiu <100055343+NeosZhang@users.noreply.github.com>
Co-authored-by: jfxu-st <143591296+jfxu-st@users.noreply.github.com>
Co-authored-by: Zhangzefeng <zhang_zefeng@foxmail.com>
Co-authored-by: Peter Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: Bonbon-Tang <77844719+Bonbon-Tang@users.noreply.github.com>
Co-authored-by: wangxing <131418410+POI-WX@users.noreply.github.com>
Co-authored-by: yangbofun <yangbofun@163.com>
Co-authored-by: zhaoguochun1995 <109069909+zhaoguochun1995@users.noreply.github.com>
Co-authored-by: liwenjian-sensetime <109193776+liwenjian-sensetime@users.noreply.github.com>
Co-authored-by: zhaochaoxing <109726331+zhaochaoxing@users.noreply.github.com>
Co-authored-by: NeosZhang <zhangqiu1994@outlook.com>