Add flash attn for af2 #8

Open · wants to merge 428 commits into base: develop
428 commits
e85fbac
Mv cpp_extension test dir (#53330)
tianshuo78520a May 5, 2023
b02de1b
[Hackathon No.61] Improve FP16/BF16 unit tests for the uniform_random op (#52949)
co63oc May 5, 2023
d27f15e
[XPU] Fix the out_max of the branch in xpu_conv2d op(#53343)
sprouteer May 5, 2023
2039115
[XPU] Fusion of gather and assign operators to fused_mt op for reduci…
shentanyue May 5, 2023
58435ae
remove some [-Wunused-parameter]warning (#53397)
Galaxy1458 May 5, 2023
0d9a23b
[Dygraph] Fix bugs in dp_pp_comm_overlap for HybridParallel (#53384)
haohongxiang May 5, 2023
d463f8e
Revert "[Hackathon No.52] Implement float16 data type support for the Paddle dist op (#50915)" …
jinyouzhi May 5, 2023
13e2e10
move UniformRawKernel to legacy (#53158)
zhangyuqin1998 May 6, 2023
a499731
rem npu in test (#53469)
KimBioInfoStudio May 6, 2023
5a44bf7
Add trt pow converter. (#53462)
jiweibo May 6, 2023
3e7be9c
Rename randint_raw and move it to legacy (#53157)
zhangyuqin1998 May 6, 2023
12406ca
[inference][trt] add reduce_all and reduce_any (#53088)
zhangjun May 6, 2023
03fe3ce
fix brpc double link (#53512)
liuzhenhai93 May 6, 2023
da963ea
use int64 to calc dim for c softmax (#53541)
FeixLiu May 6, 2023
08a8b75
Rewirte the reshape of temp_mask and temp_bias.
Xreki May 5, 2023
6a65ee0
Merge branch 'develop' into add_flash_attn_for_af2
Xreki May 6, 2023
eda8df7
[XPU] substitute new api kernel for combinatorial adaptive avg_pool2d…
RuohengMa May 6, 2023
4682c0d
Update commit and fix reduce_dim.
Xreki May 6, 2023
99399f3
XPU Support external stream (#53334)
csy0225 May 6, 2023
b729512
Add fused_gate_attention API. (#53432)
Xreki May 6, 2023
165afab
Merge branch 'develop' into add_flash_attn_for_af2
Xreki May 6, 2023
dd2860e
API support use_flash_attn.
Xreki May 6, 2023
d91d758
[IR] OpTrait & OpInterface & OpInfo (#52846)
zhangbo9674 May 6, 2023
f5476da
Use copy_if_different to avoid recompilation of generated cutlass (#5…
umiswing May 6, 2023
a5a0e8f
【prim】Elementwise double grad (#53014)
xiaoguoguo626807 May 6, 2023
1d8c82b
fix strided_slice ut (#53553)
USTCKAY May 6, 2023
ca174ea
fix conv1d_transpose insert quant node bug (#53320)
xiaoluomi May 6, 2023
08b44e6
[inference][trt] add lookup_table op trt converter, use trt gather la…
yuanlehome May 6, 2023
b65e932
Add PADDLE_THROW in take_along_axis kernel when the datatype of index…
Xreki May 6, 2023
fe80730
Fix compiling error on CI.
Xreki May 8, 2023
184cf9a
Merge branch 'develop' into add_flash_attn_for_af2
Xreki May 8, 2023
a299153
[cutlass] Avoid rewrite generated kernels in recompilation. (#53575)
umiswing May 8, 2023
65c6ed1
fix tag commit id with newest
JamesLim-sy May 8, 2023
acefdeb
Fix typos, test=document_fix (#53540)
co63oc May 8, 2023
3fd2e76
Fix sample code of poisson_nll_loss (#53551)
LyndonKong May 8, 2023
462e36e
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
JamesLim-sy May 8, 2023
2bf6128
[AMP] fix static promote (#53439)
zhangting2020 May 8, 2023
3458f8c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
JamesLim-sy May 8, 2023
fca8595
[Paddle-TRT] add generic plugin for lookup_table_v2(embedding) op (#5…
yuanlehome May 8, 2023
be44a91
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
JamesLim-sy May 8, 2023
a01b20d
[test]mv fluid op mkldnn to test/cpp/fluid/mkldnn (#53458)
gouzil May 8, 2023
2f50338
add ut for lookup_table op trt converter (#53563)
yuanlehome May 8, 2023
a9ba1ba
Merge branch 'add_flash_attn_for_af2' of https://github.com/JamesLim-…
Xreki May 8, 2023
c0f497a
Merge branch 'add_flash_attn_for_af2' of https://github.com/JamesLim-…
JamesLim-sy May 8, 2023
f3f3d57
[Dy2St]Following update of register_hook for static mode (#53572)
yangguohao May 8, 2023
bfe5a8c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
JamesLim-sy May 8, 2023
0b7fda0
Merge branch 'add_flash_attn_for_af2' of https://github.com/JamesLim-…
JamesLim-sy May 8, 2023
ac3ff47
change flash_attn commit id
JamesLim-sy May 8, 2023
fe91940
throw warning at __getitem__, not slice_utils (#53579)
zoooo0820 May 8, 2023
10f9249
[inference][trt]Unary operation support 0d (#53506)
zhangjun May 8, 2023
e988251
docs: Fix code example vsplit in docstring (#53558)
Asthestarsfalll May 8, 2023
186f5e0
hack 1-D tensor to Scalar (#53552)
zoooo0820 May 8, 2023
0a59825
[XPU] Optimize fp16 xpu models (#53523)
wz1qqx May 8, 2023
2aedd9d
[BUG] fix paddle.to_tensor/Tensor.item/Tensor.numpy BF16 bug (#53567)
zhwesky2010 May 8, 2023
26c3077
Fix timeout v2 (#53514)
sljlp May 8, 2023
7dcf5e5
Fix core dumped in training when check_nan_inf=1
AnnaTrainingG May 8, 2023
6d396ac
rm npu (#53566)
Liyulingyue May 8, 2023
b6c0407
Fix the calculation of y_grad in divide_backward (#53582)
ZzSean May 8, 2023
f74237c
fix: newExe didnot gc transferred variable (#53545)
kangguangli May 8, 2023
116fcad
【BugFix】fix err of api `to_tensor`, which caused by numpy version upd…
feifei-111 May 8, 2023
70180df
A copy of https://github.com/PaddlePaddle/Paddle/pull/51438 (#53587)
jiweibo May 8, 2023
ce937f6
fix op unitest for fused_gate_attn
JamesLim-sy May 8, 2023
e522ceb
add complex support for optest (#53356)
GGBond8488 May 8, 2023
e4bf1a8
[Zero-Dim] Support p_norm/reduce_sum_p output 0D (#53421)
zhwesky2010 May 8, 2023
727fa27
remove some [-Wunused-parameter]warning and WITH_DISTRIBUTE flag (#53…
Galaxy1458 May 9, 2023
8d340ee
Fix xpu2 kp compile error (#53548)
zhangbopd May 9, 2023
af2ad8d
fix error sample code in static.nn.loss.nce (#53588)
RedContritio May 9, 2023
aec4e38
[Paddle-TRT] Del 2 useless pass (#53414)
zhoutianzi666 May 9, 2023
eb12e62
fix eval branch of prim vjp of batch_norm in amp mode (#53598)
cyber-pioneer May 9, 2023
6029e02
[Zero-Dim] add 0D test for linalg.norm/linalg.cond (#53592)
GGBond8488 May 9, 2023
9682b04
[cutlass] Add generated files to .gitignore (#53607)
umiswing May 9, 2023
72cb09e
add logaddexp api (#52284)
zhiboniu May 9, 2023
ea0abf9
Support trt cuda graph. (#53406)
jiweibo May 9, 2023
dd90f10
Support static graph code-gen for unpool3d (#53479)
sanbuphy May 9, 2023
0f1b077
fix ut timeout, test=document_fix (#53629)
XieYunshen May 9, 2023
14c642c
[Hackathon 4th No.30] Add the paddle.sparse.sum sparse API to Paddle (#51406)
zrr1999 May 9, 2023
7e9c87c
[PHI kernels] Bind XPU kernels (#53336)
RuohengMa May 9, 2023
a37ef76
rem tools/infer_prune_patches (#53596)
KimBioInfoStudio May 9, 2023
e588f2d
[Zero-Dim] add 0D Tensor UT case for XPU and expand kernel support 0D…
zhwesky2010 May 9, 2023
bafc346
remove some [-Wunused-parameter]warning (#53617)
Galaxy1458 May 9, 2023
9cd0a5b
[Hackthon 4th No.7] add paddle.unflatten API (#53055)
Ainavo May 9, 2023
9244ceb
update openblas version (#52983)
jiweibo May 9, 2023
eaed168
[static op generation] coalesce_tensor (#53570)
gouzil May 9, 2023
45ce0ad
[CINN]Adjust Bert unittest loss ground truth (#53628)
Aurelius84 May 9, 2023
4907485
Add compare accuracy api (#53430)
May 9, 2023
3be7a6c
Support static graph code-gen for yolo_loss (#52946)
sanbuphy May 10, 2023
26fe2dc
Fix the index calculation in cross_entroy_kernel. (#53659)
Xreki May 10, 2023
ee1aa69
fix two minor issues (#53658)
WintersMontagne10335 May 10, 2023
7a8635d
Add elementwise_heaviside tests (#53549)
co63oc May 10, 2023
2eea311
Add einsum to the default white_list. (#53586)
Xreki May 10, 2023
aafaad9
Revert "Optimize the implementation of the argsort operator. (#47738)…
Vvsmile May 10, 2023
e077678
[CustomDevice] fix reducer when input on cpu (#53662)
ronny1996 May 10, 2023
65e57a7
remove some [-Wunused-parameter] warning and WITH_DISTRIBUT flags (#5…
Galaxy1458 May 10, 2023
6a279df
scale, square, sum, swish trt op converter support zero dim (#53660)
yuanlehome May 10, 2023
f023d42
[NPU] PP for npu (#53501)
sljlp May 10, 2023
c828934
[CINN] fix check_cinn in Optest check condition error bug (#53676)
thisjiang May 10, 2023
0f319f8
Fix bug in log_softmax kernel when lastdim is larger than 100000 (#53…
ZzSean May 10, 2023
7f39bcd
[LAUNCH] add log overwrite flag (#53608)
kuizhiqing May 10, 2023
4f33f44
[static op generation] lstsq (#53290)
Liyulingyue May 10, 2023
38d664b
[XPU]Conv transpose fp16 && fix unittest (#53626)
wz1qqx May 10, 2023
65a3a58
no value clip for parallel cross entropy (#53547)
FeixLiu May 10, 2023
f3393f4
add index_put api (#52886)
Courtesy-Xs May 10, 2023
23c108f
Merge branch 'develop' into add_flash_attn_for_af2
Xreki May 10, 2023
fb8ea98
【prim】add dygraph error code when close prim flag for op who has comp…
xiaoguoguo626807 May 11, 2023
ad3f70a
Merge branch 'add_flash_attn_for_af2' of https://github.com/JamesLim-…
Xreki May 11, 2023
00ded2e
Fix div error when dtype is int64 in static mode (#53705)
0x45f May 11, 2023
0d45ac7
Retire Ascend- and Cambricon-related code; retire NPU-related code, part 2 (#53568)
Liyulingyue May 11, 2023
7ff9f5e
Try to crop the flash-attention lib.
Xreki May 11, 2023
d67d74c
[XPU] update log for bkcl function calls. (#53609)
houj04 May 11, 2023
44aebd4
[XPU] update dependency for xccl. (#53697)
houj04 May 11, 2023
49de9de
[Doc] remove execution_strategy doc (#53668)
kangguangli May 11, 2023
6ec8d85
up index warning level (#53691)
zoooo0820 May 11, 2023
08b6f5d
[XPU] add depthwise_conv2d_transpose (#53680)
SaltFish11 May 11, 2023
aebff6d
[Paddle-Inference] Support trt 0dims of expand_as_v2 and mish. (#53627)
xiaoxiaohehe001 May 11, 2023
8075752
[test]mv fluid [controlflow,detection,dlnne,tensorrt] tests to tests …
gouzil May 11, 2023
4a97ba5
[KUNLUN]Revert "revert p2p communication for xpu (#53496)" (#53633)
sljlp May 11, 2023
32dae48
add unitest for reshpe 0 dims (#53685)
zhoutianzi666 May 11, 2023
314d041
remove a part of npu (#53677)
jjyaoao May 11, 2023
b4024aa
Revert elementwise (#53663)
xiaoguoguo626807 May 11, 2023
9555ae8
Update openblas version. (#53708)
jiweibo May 11, 2023
6f28eb7
[XPU][PHI Kernels] add pad op for xpu (#53684)
lj970926 May 11, 2023
793f3b9
move DataLoader code to paddle.io (#48699)
heavengate May 11, 2023
2f56b6d
[CustomDevice] fix recompute (#53718)
ronny1996 May 11, 2023
dbb6269
remove some [-Wunused-parameter] warning (#53683)
Galaxy1458 May 11, 2023
04e5e7b
[inference Zero-Dim]add equal, elementwise_op trt 0d (#53704)
zhangjun May 11, 2023
4a69a53
[inference][trt]add trt sparse weights switch (#53562)
zhangjun May 11, 2023
e92a9bb
Correct the condition of whether can use flash-attn.
Xreki May 11, 2023
5417382
fix doc of compare_accuracy (#53661)
May 11, 2023
82c7388
[inference Zero-Dim]prelu trt converter support zero dim tensor (#53634)
yuanlehome May 11, 2023
3888682
add cinn bf16 support (#53637)
lanxianghit May 11, 2023
dc003fa
revise 'Examples' of LBFGS to create right docs(cn), test=docs_previe…
lijialin03 May 11, 2023
b150b16
[Inference Zero-Dim] Support trt 0dim of gelu, hard_swish, hard_sigmo…
xiaoxiaohehe001 May 11, 2023
13cdaab
[XPU] bkcl_broadcast support int64_t (#53720)
houj04 May 12, 2023
6cd7609
fix gpu mem alloc: use phi::memory_utils::Alloc (#53721)
Wangzheee May 12, 2023
92db839
fix er error msg of index_put Op (#53717)
Courtesy-Xs May 12, 2023
3e3297c
fix jacobian and hessian's docstring (#53732)
HydrogenSulfate May 12, 2023
b73594b
【Prim】support higher order autodiff for dy2static+composite (#53171)
cxxly May 12, 2023
58916e3
Skip fake alloc in static build for some communication OPs (#53593)
From00 May 12, 2023
95ae5d5
Revert "[CINN]Adjust Bert unittest loss ground truth (#53628)" (#53731)
Aurelius84 May 12, 2023
4e416c9
fix doc eror of index_put in develop (#53727)
Courtesy-Xs May 12, 2023
348565b
move pow2_decay_with_linear_warmup kernel to phi (#53741)
huangjiyi May 12, 2023
0603777
[PHI] update xpu api version; bind reduce_any_bool xpu kernel; remove…
RuohengMa May 12, 2023
d2b1e3c
sequence_mask functionalization (#53478)
GreatV May 12, 2023
05d3fc8
[inference zero dim] softmax, stack op trt converter support zero dim…
yuanlehome May 12, 2023
eb97f4f
[Inference] Update switch stream logical. (#53589)
jiweibo May 12, 2023
df8c302
Remove the softmax_out argument.
Xreki May 12, 2023
fc3c281
Merge branch 'develop' into add_flash_attn_for_af2
Xreki May 12, 2023
d03bbef
[CustomDevice] add inference MP support, PART0 (#53719)
ronny1996 May 12, 2023
3846111
【prim】add forward output for Silu grad signature (#53632)
xiaoguoguo626807 May 12, 2023
d01c89c
Remove is_causal.
Xreki May 12, 2023
1019b26
[XPU] remove clip of c_softmax_with_cross_entropy_op (#53734)
houj04 May 12, 2023
772b490
fix dtype missmatch error (#53712)
zhangting2020 May 12, 2023
60cf9b5
test(prim-cinn): split test_resnet and test_bert into three tests (#5…
6clc May 12, 2023
c497b43
[CustomDevice] add inference MP support, PART1 (#53702)
ronny1996 May 12, 2023
4d39cc7
fix add_n kernel of large shape (#53749)
zhiqiu May 12, 2023
ce256f7
【Hackathon 4 No.20】Add i0 / i0e to paddle (#52058)
PommesPeter May 12, 2023
b75d8c7
Revert elementwise add (#53745)
xiaoguoguo626807 May 13, 2023
3e90a46
fix build error (#53790)
tianshuo78520a May 14, 2023
34122e3
move OneHotRawKernel to legacy (#53200)
zhangyuqin1998 May 15, 2023
3dce9f0
Tranpose layout (#53351)
AnnaTrainingG May 15, 2023
8105607
[XPU][PHI] bind index_sample_grad xpu kernel (#53753)
RuohengMa May 15, 2023
00e415d
relocate python/paddle/fluid/regularizer.py (#53106)
longranger2 May 15, 2023
359f43a
[CI]Fix test_bert_primm_cinn gt loss value (#53796)
Aurelius84 May 15, 2023
a822a08
bug fixes (#53798)
risemeup1 May 15, 2023
5152971
Fix bug of hybrid_parallel_optimizer, amp use scaler.minimize(), (#53…
GhostScreaming May 15, 2023
b428e8f
[PHI]Add Filter for get_kernel_signatures.py (#53760)
YuanRisheng May 15, 2023
3d4d7c1
[UnitTest]Fix deprecate fluid.regularizer test=document_fix (#53805)
Aurelius84 May 15, 2023
3d6bd6a
add check ops for prim (#52302)
Charles-hit May 15, 2023
a9c3e32
fix bug of test_pad_op for cinn (#53772)
zyfncg May 15, 2023
96188fc
remove some [-Wunused-paramter]warning (#53681)
Galaxy1458 May 15, 2023
ca2ea16
remove some [-Wunsed-parameter]warning (#53679)
Galaxy1458 May 15, 2023
8ed01e8
remove some [-Wunsed-parameter] warning (#53687)
Galaxy1458 May 15, 2023
3e1fffe
remove some [-Wunsed-parameter] warning (#53689)
Galaxy1458 May 15, 2023
972daa4
[BUG] fix windows kernel dispatch of _lzcnt bug (#53728)
zhwesky2010 May 15, 2023
0ef5180
Reduce inference library size and compile time (#53369)
chalsliu May 15, 2023
94c3880
Silu double grad (#53605)
xiaoguoguo626807 May 15, 2023
cc9aeda
[inference Zero-Dim][trt] Add Zero-Dim tensor support for clip, cast,…
bukejiyu May 15, 2023
e04f8d4
[CustomDevice] add inference MP support, PART2 (#53701)
ronny1996 May 15, 2023
56fded1
[CustomDevice] add inference MP support, PART3 (#53703)
ronny1996 May 15, 2023
efd410c
move dequantize kernel to phi (#53739)
huangjiyi May 15, 2023
848deec
[AMP]fix embedding model weight type mismatch error (#53770)
shaojiewang May 15, 2023
2174e91
[PHI] fix duplicate instantiation of reduce_any when compiling with X…
RuohengMa May 16, 2023
434343c
fix pinv api for divide zero (#53815)
andyjiang1116 May 16, 2023
926b886
[CustomDevice] fix BatchNorm (#53820)
ronny1996 May 16, 2023
847c48a
fix simple typos (#53783)
MahmoudAshraf97 May 16, 2023
4f7dfd0
Fix typos, test=document_fix (#53803)
co63oc May 16, 2023
79c84ba
Fix typos in paddle/phi/core/generator.cc (#53802)
co63oc May 16, 2023
00c21ab
[phi] move stft to phi - Step 1 (#53517)
gouzil May 16, 2023
2a94b81
[inference][trt]Remove unused code from teller.cc (#53758)
zhangjun May 16, 2023
db407bf
[AMP] Allow to switch whether to use promote strategy to choose kerne…
Xreki May 16, 2023
51ecd93
[Inference] clean unused code/target for reduce inference so volume (…
yuanlehome May 16, 2023
98100fd
Add fill_constant_batch_size_like tests (#53736)
co63oc May 16, 2023
b133317
[dygraph]remove legacy code : _in_eager_mode_ and _in_eager_without_d…
liudongxue01 May 16, 2023
481511a
[Hackathon4 No.61] Improve FP16/BF16 unit tests for the remainder op (#52920)
enkilee May 16, 2023
ad45b36
Add Japanese README (#53726)
eltociear May 16, 2023
7b81092
[static op generation] InstanceNorm (#53340)
Liyulingyue May 16, 2023
b86bbe8
support auto generation V2 abs (#53341)
enkilee May 16, 2023
312f018
static graph autogen code support for softmax op (#53581)
GreatV May 16, 2023
5e5481d
Move fused batchnorm to Phi (#53476)
AndSonder May 16, 2023
0689e2a
fix _strip_grad_suffix_ bugs when input patten is 'x@GRAD@RENAME'
cxxly May 12, 2023
5b054d2
Retire Ascend- and Cambricon-related code; retire NPU-related code, part 3 (#53699)
Liyulingyue May 16, 2023
52889e3
move cudnn_lstm kernel to phi (#53730)
huangjiyi May 16, 2023
c2c3bd4
[AMP] support OD level for static (#53768)
AnnaTrainingG May 16, 2023
32e36b1
[XPU] Add sigmoid_elementmul_xpu_fuse_pass (#53580)
sprouteer May 16, 2023
69161a9
【static】modify backward prune logic for EmptygradOpMaker (#53746)
xiaoguoguo626807 May 16, 2023
c33ba9d
Fix some tests for issuse 52842 (#53795)
liuzhenhai93 May 16, 2023
0ab7f94
add timer to pp (#53831)
FeixLiu May 16, 2023
e592534
[PaddlePaddle Hackathon 4 No.34] Optimize the performance of the Lerp OP on GPU for Paddle (#53154)
WintersMontagne10335 May 16, 2023
10a38b4
remove some [-Wunused-parameter] warning and fix a file to pass cppl…
Galaxy1458 May 16, 2023
640cff0
【Hackathon No57】add bf16 for mode (#53195)
Difers May 16, 2023
50f0acc
[Zero-Dim] update 0d tensor api en doc, test=document_fix (#53823)
zhwesky2010 May 16, 2023
74b91bc
Add huber_loss tests (#53535)
co63oc May 16, 2023
d6d3de7
fix bug of unpool in the module 'enet' of paddleseg (#53840)
heavyrain-lzy May 17, 2023
8965366
update openblas version (#53748)
jiweibo May 17, 2023
a63fb4c
【Hackathon 4 No.21】Add i1 / i1e to paddle (#53210)
LyndonKong May 17, 2023
38e5cd0
[fluid] decoupling abn op (#53826)
gouzil May 17, 2023
56e8aff
test(cinn): fix resnet50 precision (#53649)
6clc May 17, 2023
91a0ea5
Merge branch 'develop' into add_flash_attn_for_af2
Xreki May 17, 2023
78967ad
[IR] Program & Parameter & PaddleDialect (#53557)
zhangbo9674 May 17, 2023
2cb2801
remove fluid memory_usage_calc&model_stat&op_frequence (#53838)
zoooo0820 May 17, 2023
f6be954
Polish codes.
Xreki May 17, 2023
734dc44
Supports offline compilation of Paddle third-party libraries (#53744)
risemeup1 May 17, 2023
9e045ee
[CustomDevice] suport device_guard for custom device (#53808)
YanhuiDua May 17, 2023
4f1bf19
[CINN] extend cinn single test timeout from 150 to 200, test=document…
thisjiang May 17, 2023
972581d
[AMP]Master grad in static graph (#53362)
shaojiewang May 18, 2023
65ce688
Fix typos in send_v2_op.cu.cc (#53904)
co63oc May 18, 2023
92121d1
Fix typos, test=document_fix (#53916)
co63oc May 18, 2023
6d7076c
fix -Werror=format-security (#53886)
engineer1109 May 18, 2023
79ce3fa
rm cmake npu (#53869)
Liyulingyue May 18, 2023
d294eef
rm tools npu (#53870)
Liyulingyue May 18, 2023
5d638fe
[XPU] do not call check_nccl_version_for_p2p under xpu (#53862)
houj04 May 18, 2023
236e742
[Fix Typo] Fix gpu_info.h, Wheter->Whether (#53564)
jiahy0825 May 18, 2023
4f07b65
support auto generate for op layer_norm (#53178)
RedContritio May 18, 2023
1ac28b6
Fix typos in executor_statistics.cc (#53917)
co63oc May 18, 2023
2d0c694
[CustomOp Unittest] Fix XPU unittest, discard static backward (#53899)
jiahy0825 May 18, 2023
7b1695a
[Dy2static-Fallback] add set_eval_frame function in pybind. (#52006)
2742195759 May 18, 2023
acb5039
Del test_async_read_write in CPU (#53882)
tianshuo78520a May 18, 2023
2782b29
Fix typos in elementwise dir (#53907)
co63oc May 18, 2023
a862deb
move sequence_mask op InferShape func (#53782)
GreatV May 18, 2023
d8407c5
add fp16 and bf16 for trunc (#53876)
longranger2 May 18, 2023
e916e80
Fix typos, test=document_fix (#53927)
co63oc May 18, 2023
117e951
Fix typos (#53912)
co63oc May 18, 2023
0bed220
Add segment_pool tests (#53785)
co63oc May 18, 2023
26da689
move fusion_group kernel to phi (#53781)
huangjiyi May 18, 2023
fb4a6ec
Fused elementwises kernels and ops (#51427)
HulekJakub May 18, 2023
d53d8fd
remove CopyWithContext limitation (#53771)
engineer1109 May 18, 2023
ba84941
Fix qkv_transpose_out's shape and scaling of Q * K.
Xreki May 18, 2023
c3c8579
Add einsum tests (#53722)
co63oc May 18, 2023
bee8537
Merge branch 'develop' into add_flash_attn_for_af2
Xreki May 18, 2023
3747978
Update commit of flash-attention.
Xreki May 18, 2023
7 changes: 4 additions & 3 deletions .gitignore
@@ -52,11 +52,11 @@ CMakeSettings.json
Makefile
.test_env/
.cache/
third_party/
build/third_party/

*~
bazel-*
third_party/
build/third_party/

build_*
# clion workspace.
@@ -75,7 +75,8 @@ tools/nvcc_lazy
# TODO(zhiqiang) Move this file to build directory.
paddle/fluid/pybind/eager_op_function.cc
tools/nvcc_lazy

paddle/phi/kernels/sparse/gpu/cutlass_generator/all_gemm_operations.h
paddle/phi/kernels/sparse/gpu/cutlass_generator/configurations.h

# these files (directories) are generated before build system generation
paddle/fluid/operators/generated_op*.cc
39 changes: 39 additions & 0 deletions .gitmodules
@@ -0,0 +1,39 @@
[submodule "third_party/protobuf"]
path = third_party/protobuf
url = https://github.com/protocolbuffers/protobuf.git
[submodule "third_party/gflags"]
path = third_party/gflags
url = https://github.com/gflags/gflags.git
[submodule "third_party/gloo"]
path = third_party/gloo
url = https://github.com/ziyoujiyi/gloo.git
[submodule "third_party/dlpack"]
path = third_party/dlpack
url = https://github.com/dmlc/dlpack.git
[submodule "third_party/utf8proc"]
path = third_party/utf8proc
url = https://github.com/JuliaStrings/utf8proc.git
[submodule "third_party/warpctc"]
path = third_party/warpctc
url = https://github.com/baidu-research/warp-ctc.git
[submodule "third_party/warprnnt"]
path = third_party/warprnnt
url = https://github.com/PaddlePaddle/warp-transducer.git
[submodule "third_party/xxhash"]
path = third_party/xxhash
url = https://github.com/Cyan4973/xxHash.git
[submodule "third_party/eigen3"]
path = third_party/eigen3
url = https://gitlab.com/libeigen/eigen.git
[submodule "third_party/leveldb"]
path = third_party/leveldb
url = https://github.com/google/leveldb
[submodule "third_party/threadpool"]
path = third_party/threadpool
url = https://github.com/progschj/ThreadPool.git
[submodule "third_party/zlib"]
path = third_party/zlib
url = https://github.com/madler/zlib.git
[submodule "third_party/glog"]
path = third_party/glog
url = https://github.com/google/glog.git
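The `.gitmodules` entries above register the vendored dependencies as git submodules, replacing the in-tree `third_party/` sources (which the `.gitignore` change now ignores at the top level). As a rough sketch — the file name `.gitmodules.sample` and the two-entry fragment are illustrative, not part of the PR — the path-to-URL mapping can be inspected with git's own config parser:

```shell
# Sketch (illustrative fragment): .gitmodules uses git's config format,
# so `git config -f` can query it the same way git does at clone time.
cat > .gitmodules.sample <<'EOF'
[submodule "third_party/gflags"]
    path = third_party/gflags
    url = https://github.com/gflags/gflags.git
[submodule "third_party/glog"]
    path = third_party/glog
    url = https://github.com/google/glog.git
EOF

# List every registered submodule path.
git config -f .gitmodules.sample --get-regexp '^submodule\..*\.path$' | awk '{print $2}'

# Resolve a single submodule's upstream URL.
git config -f .gitmodules.sample --get submodule.third_party/glog.url
```

On the real repository, `git clone --recurse-submodules` (or `git submodule update --init --recursive` after a plain clone) fetches these sources into `third_party/`.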
21 changes: 21 additions & 0 deletions CMakeLists.txt
@@ -92,6 +92,11 @@ message(STATUS "C compiler: ${CMAKE_C_COMPILER}, version: "
"${CMAKE_C_COMPILER_ID} ${CMAKE_C_COMPILER_VERSION}")
message(STATUS "AR tools: ${CMAKE_AR}")

if((CMAKE_CXX_COMPILER_ID STREQUAL "GNU") AND CMAKE_CXX_COMPILER_VERSION
VERSION_GREATER 10.4)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-error=uninitialized")
endif()

# MUSL build turn off warnings
if(WITH_MUSL)
set(CMAKE_CXX_FLAGS
@@ -246,6 +251,7 @@ option(WITH_DISTRIBUTE "Compile with distributed support" OFF)
option(WITH_BRPC_RDMA "Use brpc rdma as the rpc protocal" OFF)
option(ON_INFER "Turn on inference optimization and inference-lib generation"
ON)
option(WITH_CPP_DIST "Install PaddlePaddle C++ distribution" OFF)
################################ Internal Configurations #######################################
option(WITH_NV_JETSON "Compile PaddlePaddle with NV JETSON" OFF)
option(WITH_PROFILER "Compile PaddlePaddle with GPU profiler and gperftools"
@@ -662,6 +668,21 @@ if(WITH_STRIP)
endif()
endif()

if(WITH_CPP_DIST)
# TODO(huangjiyi): Separate installing C++ distribution from python package
# installation and support for installing C++ distribution on more platforms.
if(NOT LINUX OR NOT WITH_PYTHON)
set(WITH_CPP_DIST
OFF
CACHE
STRING
"Currently C++ Distribution Generation is only available on Linux and compiling WITH_PYTHON=ON."
FORCE)
else()
include(paddle_lib)
endif()
endif()

add_subdirectory(paddle)
if(WITH_PYTHON)
add_subdirectory(python)
2 changes: 1 addition & 1 deletion README.md
@@ -4,7 +4,7 @@

--------------------------------------------------------------------------------

English | [简体中文](./README_cn.md)
English | [简体中文](./README_cn.md) | [日本語](./README_ja.md)

[![Documentation Status](https://img.shields.io/badge/docs-latest-brightgreen.svg?style=flat)](https://paddlepaddle.org.cn/documentation/docs/en/guides/index_en.html)
[![Documentation Status](https://img.shields.io/badge/中文文档-最新-brightgreen.svg)](https://paddlepaddle.org.cn/documentation/docs/zh/guides/index_cn.html)
2 changes: 1 addition & 1 deletion README_cn.md
@@ -5,7 +5,7 @@

--------------------------------------------------------------------------------

[English](./README.md) | 简体中文
[English](./README.md) | 简体中文 | [日本語](./README_ja.md)

[![Documentation Status](https://img.shields.io/badge/docs-latest-brightgreen.svg?style=flat)](https://paddlepaddle.org.cn/documentation/docs/en/guides/index_en.html)
[![Documentation Status](https://img.shields.io/badge/中文文档-最新-brightgreen.svg)](https://paddlepaddle.org.cn/documentation/docs/zh/guides/index_cn.html)
96 changes: 96 additions & 0 deletions README_ja.md
@@ -0,0 +1,96 @@
<p align="center">
<img align="center" src="doc/imgs/logo.png", width=1600>
<p>

--------------------------------------------------------------------------------

[English](./README.md) | [简体中文](./README_cn.md) | 日本語

[![Documentation Status](https://img.shields.io/badge/docs-latest-brightgreen.svg?style=flat)](https://paddlepaddle.org.cn/documentation/docs/en/guides/index_en.html)
[![Documentation Status](https://img.shields.io/badge/中文文档-最新-brightgreen.svg)](https://paddlepaddle.org.cn/documentation/docs/zh/guides/index_cn.html)
[![Release](https://img.shields.io/github/release/PaddlePaddle/Paddle.svg)](https://github.com/PaddlePaddle/Paddle/releases)
[![License](https://img.shields.io/badge/license-Apache%202-blue.svg)](LICENSE)
[![Twitter](https://img.shields.io/badge/Twitter-1ca0f1.svg?logo=twitter&logoColor=white)](https://twitter.com/PaddlePaddle_)

PaddlePaddle GitHub へようこそ。

PaddlePaddle は中国初の独立系 R&D ディープラーニングプラットフォームとして、2016年からプロのコミュニティに正式にオープンソース化されました。コアとなる深層学習フレームワーク、基本モデルライブラリ、エンドツーエンドの開発キット、ツール&コンポーネント、さらにサービスプラットフォームを網羅する、高度な技術と豊富な機能を備えた産業プラットフォームです。
PaddlePaddle は、工業化に対するコミットメントを持つ工業的実践から生まれたものです。製造業、農業、企業サービスなど幅広い分野で採用され、535万人以上の開発者、20万以上の企業、67万以上のモデルを生み出しています。それにより PaddlePaddle は、ますます多くのパートナーの AI 商用化を支援しています。


## インストール

### PaddlePaddle の最新リリース: [v2.4](https://github.com/PaddlePaddle/Paddle/tree/release/2.4)

私たちのビジョンは、PaddlePaddle を通じて、誰もが深層学習を行えるようにすることです。
PaddlePaddle の最新機能を追跡するために、私たちの[リリースのお知らせ](https://github.com/PaddlePaddle/Paddle/releases)を参照してください。
### 最新の安定版リリースのインストール:
```
# CPU
pip install paddlepaddle
# GPU
pip install paddlepaddle-gpu

```
インストール方法については、[クイックインストール](https://www.paddlepaddle.org.cn/install/quick)をご覧ください

この度、開発者の皆様が Tesla V100 のオンライン計算資源を無償で取得できるようになりました。AI Studio でプログラムを作成した場合、1日あたり8時間のオンライン学習が可能です。[スタートはこちら](https://aistudio.baidu.com/aistudio/index)。

## 四大技術

- **ディープニューラルネットワークの産業用開発のためのアジャイルフレームワーク**

PaddlePaddle ディープラーニングフレームワークは、ニューラルネットワークをアーキテクトするプログラマブルスキームを活用することで、技術的負担を軽減しながら開発を容易にする。宣言型プログラミングと命令型プログラミングの両方をサポートし、開発の柔軟性と高い実行性能を両立しています。 ニューラル・アーキテクチャは、アルゴリズムによって自動的に設計され、人間の専門家が設計したものよりも優れた性能を発揮する可能性があります。


- **ディープニューラルネットワークの超大規模学習をサポート**

PaddlePaddle は、超大規模なディープニューラルネットワークのトレーニングでブレークスルーを起こしました。数百のノードに分散したデータソースを用いて、1000億の特徴量と数兆のパラメータを持つディープネットワークのトレーニングをサポートする、世界初の大規模オープンソース・トレーニング・プラットフォームを立ち上げたのです。PaddlePaddle は、超大規模ディープラーニングモデルのオンラインディープラーニングの課題を克服し、さらに1兆以上のパラメータでリアルタイムにモデル更新を実現しました。
[詳しくはこちら](https://github.com/PaddlePaddle/Fleet)


- **総合的な展開環境に対応した高性能推論エンジン**

PaddlePaddle は、サードパーティのオープンソースフレームワークで学習されたモデルとの互換性があるだけでなく、様々な生産シナリオに対応した完全な推論エンジン、システム、スイートを提供しています。当社の推論エンジン、システム、スイートには、[Paddle Inference](https://paddle-inference.readthedocs.io/en/master/guides/introduction/index_intro.html): 高性能なサーバーおよびクラウド推論用のネイティブ推論ライブラリ; [Paddle Serving](https://github.com/PaddlePaddle/Serving): 分散型やパイプライン型プロダクションに適したサービス指向フレームワーク; [Paddle Lite](https://github.com/PaddlePaddle/Paddle-Lite): モバイルや IoT 環境向けの超軽量推論エンジン; [Paddle.js](https://www.paddlepaddle.org.cn/paddle/paddlejs): ブラウザやミニアプリのためのフロントエンド推論エンジンがあります。さらに、各シナリオの主要なハードウェアに最適化することで、Paddle の推論エンジンは他の主流フレームワークのほとんどを凌駕しています。


- **オープンソースリポジトリによる業界指向のモデルやライブラリ**

PaddlePaddle は、業界で長い間実践され、磨かれてきた100以上の主流モデルを含み、維持しています。これらのモデルの中には、主要な国際コンペティションで主要な賞を受賞したものもあります。一方、PaddlePaddle は、産業用アプリケーションの迅速な開発を促進するために、200以上のプレトレーニングモデル(そのうちのいくつかはソースコード付き)をさらに整備しています。
[詳しくはこちら](https://github.com/PaddlePaddle/models)


## ドキュメント

[英語](https://www.paddlepaddle.org.cn/documentation/docs/en/guides/index_en.html)と
[中国語](https://www.paddlepaddle.org.cn/documentation/docs/zh/guide/index_cn.html)のドキュメントを提供しています。

- [ガイド](https://www.paddlepaddle.org.cn/documentation/docs/en/guides/index_en.html)

PaddlePaddle でディープラーニングの基本を実装する方法から始めてみてはいかがでしょうか。

- [プラクティス](https://www.paddlepaddle.org.cn/documentation/docs/zh/tutorial/index_cn.html)

Paddle を使ってモデルを構築し、ディープラーニングタスクをより効率的に実行しましょう。

- [API リファレンス](https://www.paddlepaddle.org.cn/documentation/docs/en/api/index_en.html)

新しい API により、より短時間のプログラムが可能となりました。

- [コントリビュート方法](https://www.paddlepaddle.org.cn/documentation/docs/en/guides/08_contribution/index_en.html)

皆様のご投稿に感謝いたします!

## コミュニケーション

- [Github Issues](https://github.com/PaddlePaddle/Paddle/issues): バグレポート、機能リクエスト、インストールに関する問題、使用方法に関する問題など。
- QQディスカッショングループ: 441226485 (PaddlePaddle)です。
- [フォーラム](https://aistudio.baidu.com/paddle/forum): 実装や研究などについて話し合います。

## コース

- [Server Deployments](https://aistudio.baidu.com/aistudio/course/introduce/19084): ローカルサービスやリモートサービスを利用した高性能なサーバー展開を紹介するコースです。
- [Edge Deployments](https://aistudio.baidu.com/aistudio/course/introduce/22690): モバイル、IoT から Web、アプレットまで、エッジの展開を紹介するコース。

## Copyright とライセンス
PaddlePaddle は [Apache-2.0 license](LICENSE) の下で提供されています。
33 changes: 33 additions & 0 deletions cmake/PaddleConfig.cmake.in
@@ -0,0 +1,33 @@
# Paddle CMake configuration file
# -------
#
# Finds the Paddle library
#
# This will define the following variables:
#
# PADDLE_FOUND -- True if the system has the Paddle library
# PADDLE_INCLUDE_DIRS -- The include directories for Paddle
# PADDLE_LIBRARIES -- Libraries to link against

get_filename_component(PADDLE_INSTALL_PREFIX "${CMAKE_CURRENT_LIST_FILE}/../.." ABSOLUTE)

# include directories
set(PADDLE_INCLUDE_DIRS
${PADDLE_INSTALL_PREFIX}/include
${PADDLE_INSTALL_PREFIX}/include/third_party
)

# Library dependencies.
set(PADDLE_LIBRARIES_DIRS ${PADDLE_INSTALL_PREFIX}/lib)
link_directories(${PADDLE_LIBRARIES_DIRS})

file(GLOB PADDLE_LIBRARIES ${PADDLE_LIBRARIES_DIRS}/lib*)

find_package(PythonLibs @PY_VERSION@ REQUIRED)
list(APPEND PADDLE_INCLUDE_DIRS ${PYTHON_INCLUDE_DIRS})
list(APPEND PADDLE_LIBRARIES ${PYTHON_LIBRARIES})

if(@WITH_GPU@)
find_package(CUDA @CUDA_VERSION@ REQUIRED)
list(APPEND PADDLE_LIBRARIES ${CUDA_LIBRARIES})
endif()
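The comment block in `PaddleConfig.cmake.in` documents the variables the installed package exports (`PADDLE_FOUND`, `PADDLE_INCLUDE_DIRS`, `PADDLE_LIBRARIES`). A minimal hypothetical consumer project — the project name, target name, source file, and install path below are illustrative assumptions, not part of the PR — might use it like this:

```cmake
# Hypothetical consumer CMakeLists.txt for the C++ distribution
# enabled by the new WITH_CPP_DIST option. Paths and targets are
# illustrative; point Paddle_DIR (or CMAKE_PREFIX_PATH) at the
# directory containing the installed PaddleConfig.cmake, e.g.
#   cmake -DPaddle_DIR=/path/to/paddle/cmake ..
cmake_minimum_required(VERSION 3.15)
project(paddle_cpp_demo CXX)

find_package(Paddle REQUIRED)

add_executable(demo main.cc)
target_include_directories(demo PRIVATE ${PADDLE_INCLUDE_DIRS})
target_link_libraries(demo PRIVATE ${PADDLE_LIBRARIES})
```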
2 changes: 1 addition & 1 deletion cmake/external/cinn.cmake
@@ -20,7 +20,7 @@ if(NOT CINN_GIT_TAG)
set(CINN_GIT_TAG develop)
endif()

message(STATUS "CINN version: " ${CINN_GIT_TAG})
message(STATUS "CINN version: " ${CINN_GIT_TAG})

# TODO(zhhsplendid): CINN has lots of warnings during early development.
# They will be treated as errors under paddle. We set no-error now and we will
31 changes: 24 additions & 7 deletions cmake/external/cutlass.cmake
@@ -42,20 +42,37 @@ ExternalProject_Add(
INSTALL_COMMAND ""
TEST_COMMAND "")

set(tmp_gemm_operations_file
${CMAKE_SOURCE_DIR}/paddle/phi/kernels/sparse/gpu/cutlass_generator/generated/gemm/all_gemm_operations.h.tmp
)
set(tmp_configurations_file
${CMAKE_SOURCE_DIR}/paddle/phi/kernels/sparse/gpu/cutlass_generator/generated/gemm/configurations.h.tmp
)
set(gemm_operations_file
${CMAKE_SOURCE_DIR}/paddle/phi/kernels/sparse/gpu/cutlass_generator/all_gemm_operations.h
)
set(configurations_file
${CMAKE_SOURCE_DIR}/paddle/phi/kernels/sparse/gpu/cutlass_generator/configurations.h
)

add_custom_target(
cutlass_codegen
COMMAND
rm -rf
${CMAKE_SOURCE_DIR}/paddle/phi/kernels/sparse/gpu/cutlass_generator/build
COMMAND
mkdir -p
${CMAKE_SOURCE_DIR}/paddle/phi/kernels/sparse/gpu/cutlass_generator/build/generated/gemm
COMMAND
${PYTHON_EXECUTABLE} -B
${CMAKE_SOURCE_DIR}/paddle/phi/kernels/sparse/gpu/cutlass_generator/gather_gemm_scatter_generator.py
"${THIRD_PARTY_PATH}/cutlass/src/extern_cutlass/tools/library/scripts/"
"${CMAKE_SOURCE_DIR}/paddle/phi/kernels/sparse/gpu/cutlass_generator/build"
"${CMAKE_SOURCE_DIR}/paddle/phi/kernels/sparse/gpu/cutlass_generator"
"${CMAKE_CUDA_COMPILER_VERSION}"
COMMAND ${CMAKE_COMMAND} -E copy_if_different ${tmp_gemm_operations_file}
${gemm_operations_file}
COMMAND
${CMAKE_COMMAND} -E echo
"copy_if_different ${tmp_gemm_operations_file} to ${gemm_operations_file}"
COMMAND ${CMAKE_COMMAND} -E copy_if_different ${tmp_configurations_file}
${configurations_file}
COMMAND
${CMAKE_COMMAND} -E echo
"copy_if_different ${tmp_configurations_file} to ${configurations_file}"
VERBATIM)

add_library(cutlass INTERFACE)
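The `cutlass_codegen` target above generates headers into temporary `.h.tmp` files and then copies them over the real headers with `cmake -E copy_if_different`, which overwrites the destination only when the contents actually differ, so everything that includes the generated headers is not rebuilt after every configure. A minimal generic sketch of this pattern (script and file names hypothetical):

```cmake
# Generate into a temp file, then update the real header only when it changed.
set(tmp_header ${CMAKE_BINARY_DIR}/generated/config.h.tmp)
set(real_header ${CMAKE_SOURCE_DIR}/include/config.h)

add_custom_target(
  my_codegen
  COMMAND ${PYTHON_EXECUTABLE} ${CMAKE_SOURCE_DIR}/tools/gen_config.py ${tmp_header}
  # copy_if_different leaves the destination untouched (same timestamp)
  # when the generated content is identical, avoiding needless rebuilds.
  COMMAND ${CMAKE_COMMAND} -E copy_if_different ${tmp_header} ${real_header}
  VERBATIM)
```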
11 changes: 4 additions & 7 deletions cmake/external/dlpack.cmake
@@ -15,18 +15,15 @@
include(ExternalProject)

set(DLPACK_PREFIX_DIR ${THIRD_PARTY_PATH}/dlpack)

set(DLPACK_REPOSITORY ${GIT_URL}/dmlc/dlpack.git)
set(DLPACK_TAG v0.4)

set(DLPACK_INCLUDE_DIR ${THIRD_PARTY_PATH}/dlpack/src/extern_dlpack/include)
include_directories(${DLPACK_INCLUDE_DIR})
set(SOURCE_DIR ${PADDLE_SOURCE_DIR}/third_party/dlpack)
include_directories(${SOURCE_DIR}/include)

ExternalProject_Add(
extern_dlpack
${EXTERNAL_PROJECT_LOG_ARGS} ${SHALLOW_CLONE}
GIT_REPOSITORY ${DLPACK_REPOSITORY}
GIT_TAG ${DLPACK_TAG}
${EXTERNAL_PROJECT_LOG_ARGS}
SOURCE_DIR ${SOURCE_DIR}
PREFIX ${DLPACK_PREFIX_DIR}
UPDATE_COMMAND ""
CONFIGURE_COMMAND ""
24 changes: 11 additions & 13 deletions cmake/external/eigen.cmake
@@ -17,8 +17,8 @@ include(ExternalProject)
# update eigen to the commit id f612df27 on 03/16/2021
set(EIGEN_PREFIX_DIR ${THIRD_PARTY_PATH}/eigen3)
set(EIGEN_SOURCE_DIR ${THIRD_PARTY_PATH}/eigen3/src/extern_eigen3)
set(EIGEN_REPOSITORY https://gitlab.com/libeigen/eigen.git)
set(EIGEN_TAG f612df273689a19d25b45ca4f8269463207c4fee)
set(SOURCE_DIR ${PADDLE_SOURCE_DIR}/third_party/eigen3)

if(WIN32)
add_definitions(-DEIGEN_STRONG_INLINE=inline)
@@ -28,14 +28,12 @@ elseif(LINUX)
# which will cause compiler error of using __host__ function
# in __host__ __device__
file(TO_NATIVE_PATH ${PADDLE_SOURCE_DIR}/patches/eigen/Meta.h native_src)
file(TO_NATIVE_PATH ${EIGEN_SOURCE_DIR}/Eigen/src/Core/util/Meta.h
native_dst)
file(TO_NATIVE_PATH ${SOURCE_DIR}/Eigen/src/Core/util/Meta.h native_dst)
file(TO_NATIVE_PATH ${PADDLE_SOURCE_DIR}/patches/eigen/TensorReductionGpu.h
native_src1)
file(
TO_NATIVE_PATH
${EIGEN_SOURCE_DIR}/unsupported/Eigen/CXX11/src/Tensor/TensorReductionGpu.h
native_dst1)
file(TO_NATIVE_PATH
${SOURCE_DIR}/unsupported/Eigen/CXX11/src/Tensor/TensorReductionGpu.h
native_dst1)
set(EIGEN_PATCH_COMMAND cp ${native_src} ${native_dst} && cp ${native_src1}
${native_dst1})
endif()
@@ -51,20 +49,20 @@ if(CMAKE_COMPILER_IS_GNUCC)
if(GCC_VERSION GREATER_EQUAL "12.0")
file(TO_NATIVE_PATH ${PADDLE_SOURCE_DIR}/patches/eigen/Complex.h.patch
complex_header)
# See: [Why calling some `git` commands before `patch`?]
set(EIGEN_PATCH_COMMAND
patch -d ${EIGEN_SOURCE_DIR}/Eigen/src/Core/arch/SSE/ <
${complex_header})
git checkout -- . && git checkout ${EIGEN_TAG} && patch -Nd
${SOURCE_DIR}/Eigen/src/Core/arch/SSE/ < ${complex_header})
endif()
endif()

set(EIGEN_INCLUDE_DIR ${EIGEN_SOURCE_DIR})
set(EIGEN_INCLUDE_DIR ${SOURCE_DIR})
include_directories(${EIGEN_INCLUDE_DIR})

ExternalProject_Add(
extern_eigen3
${EXTERNAL_PROJECT_LOG_ARGS} ${SHALLOW_CLONE}
GIT_REPOSITORY ${EIGEN_REPOSITORY}
GIT_TAG ${EIGEN_TAG}
${EXTERNAL_PROJECT_LOG_ARGS}
SOURCE_DIR ${SOURCE_DIR}
PREFIX ${EIGEN_PREFIX_DIR}
UPDATE_COMMAND ""
PATCH_COMMAND ${EIGEN_PATCH_COMMAND}
2 changes: 1 addition & 1 deletion cmake/external/flashattn.cmake
@@ -20,7 +20,7 @@ set(FLASHATTN_PREFIX_DIR ${THIRD_PARTY_PATH}/flashattn)
set(FLASHATTN_SOURCE_SUBDIR csrc/flash_attn)
set(FLASHATTN_INSTALL_DIR ${THIRD_PARTY_PATH}/install/flashattn)
set(FLASHATTN_REPOSITORY ${GIT_URL}/PaddlePaddle/flash-attention.git)
set(FLASHATTN_TAG 5ff4bbf56ad066750407c4aef16ac740ebda0717)
set(FLASHATTN_TAG 18106c1ba0ccee81b97ca947397c08a141815a47)

set(FLASHATTN_INCLUDE_DIR
"${FLASHATTN_INSTALL_DIR}/include"