Add flash attn for af2 #8

Open
wants to merge 428 commits into
base: develop

Conversation

Xreki
Owner

@Xreki Xreki commented Apr 24, 2023

PR types

Performance optimization

PR changes

OPs

Description

RT (as the title says)

@Xreki Xreki force-pushed the add_flash_attn_for_af2 branch 2 times, most recently from de37e2f to cf4a1c8 Compare April 24, 2023 15:16
tianshuo78520a and others added 28 commits May 5, 2023 14:17
* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop

* test,test=develop
* move UniformRawKernel to legacy

* Update uniform_kernel.cc

* Update uniform_kernel.cu

* Update uniform_kernel.cc

* Update uniform_kernel.cu

* Update uniform_kernel.h

* Update uniform_kernel.cc

* Empty Commit to setup deployments
* rem npu in test

* restore some code
* Add trt pow converter.

* update to use AddConstantLayer

* add dims=0 ut
* Rename randint_raw and move it to legacy

* Update fetch_v2_op.cc

* Update randint_kernel.cc

* Update randint_kernel.cu

* Empty Commit to setup deployments
* polish

* polish

* polish

* polish

* polish

* polish

* polish

* polish

* polish

* polish

* polish
* use int64 to calc dim for c softmax

* fix compile bug
* Add fused_gate_attention API.

* Implement FusedDropout API.

* Fix doc and add unittest.

* Skip for non-gpu device.

* Add unittest.
* add OpTrait OpInterface ValueIterator TypeList

* refine code

* refine code

* refine code

* add opinfo

* add typeid copy constructor

* add trait interface construct method for opinfo_impl

* add trait interface construct method for opinfo_impl

* add trait interface construct method for opinfo_impl

* add trait interface construct method for opinfo_impl

* add trait interface construct method for opinfo_impl

* add create

* add member func for opinfo

* fix compile bug

* add op interface in ircontext

* fix compile bug

* fix compile bug

* refine code

* fix compile bug

* add ut

* refine ut

* refine code of opinfo_impl

* delete unused code

* add dyncast for operation

* refine comment

* refine opinfo_impl

* delete unused code

* refine code by comment

* refine code

* refine code

* refine code for registerOp

* refine opinfo create

* refine code of search method of ircontext

* refine op attribute

* change opinfo_map key from type_id to string
* add mul double grad

* add sub_double_grad

* add add sub high test

* add multiply test

* modify other unsqueeze

* delete api.yaml

* only for make ci run

* modify unsqueeze

* modify unsqueeze

* tmp

* modify operants gen

* review modify

* modify review

* debug

* debug

* modify ci cross boundary

* delete log
* fix strided_slice ut

* remove check_dygraph
Xreki and others added 30 commits May 17, 2023 18:06
…addle#53744)

* optimize logsumexp in small data scale

* fix

* fix

* add #pragma once

* compile protobuf offline

* add submodule gflags

* check_submodules

* check_submodules

* add_submodule protobuf

* add_submodule_protobuf

* add_submodule

* add .gitmodules

* add_submodules

* fix_compiler error

* support offline compile

* support offline compile

* support offline_compile

* remove cub

* remove brpc

* support offline compile

* support offline compile

* canning patching on cryptopp

* modify .gitignore of cryptopp

* test

* offline compile

* add_submodule zlib

* modify .gitmodules

* modify .gitmodules

* fix setup.py bug

* delete submodule cryptopp

* fix windows compile bug

* fix xxhash compile problem

---------

Co-authored-by: Asthestarsfalll <1186454801@qq.com>
Co-authored-by: Asthestarsfalll <72954905+Asthestarsfalll@users.noreply.github.com>
)

* support device_guard for npu

* fix comment

* fix typo
* add master gradients on static graph

* add unit test for bf16 master grad static graph

* use float16 as v100 test dtype

* only skip GPU which do not support bf16

* use linear layer to test master grad

* 1.push master grad creation before all optimizer ops; 2.remove useless unittest; 3.use a function to create master grad states
* rm cmake npu

* Update generic.cmake

* Update generic.cmake
* rm tools npu

* Update get_pr_ut.py

* Update get_pr_ut.py
…53862)

* [XPU] do not call check_nccl_version_for_p2p under xpu

* refine code.
* simplify layer_norm_op.cc

* support auto generate for op layer_norm

* update unittest for composite_layer_norm

* remove layer_norm_op.cc from scripts

* replace layer_norm_op with generated_op

* add get_expected_kernel for layer_norm

* update cmake kernel register function for layer_norm_mkldnn_op
…ddle#52006)

* [Dy2static-Fallback] add set_eval_frame function in pybind.
1. add set_eval_frame function in pybind.

* add unittest for eval frame hooker.

* [support py38]

* fix-GeneratorExit error in eval frame hooker

* support python == 3.9

* support 3.10

* fix some comments
* move sequence_mask op InferShape func

* add dtype infer
* Fused elementwises kernels and ops

* change fuse pass name

* adjust .pbtxt files

* adjust quantization attributes

* add missing arguments and fix others, review fixed

* simplify fused kernel registration

* fix elementwise unit tests

* reuse one fused elementwise op

* adjust proto

* Add supported datatypes

* Change 'Scale' to 'scale' in tests, change some tests to onednn

* Revert breaking changes

* Fix unit tests

* Delete obsolete test cases

* Delete commented out code

* Fix codestyle

* delete temporary condition

* fix conflicts and delete duplicate fusing

* Fix code after merge

* Move tests to new directory

* fix tests volatility

* Rename test_elementwise_add_onednn_op.py to test_elementwise_add_mkldnn_op.py

* Update CMakeLists.txt add mkldnn op test

---------

Co-authored-by: Silv3S <slawomir.siwek@intel.com>