{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":494232964,"defaultBranch":"main","name":"flash-attention","ownerLogin":"Dao-AILab","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2022-05-19T21:22:06.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/139507659?v=4","public":true,"private":false,"isOrgOwned":true},"refInfo":{"name":"","listCacheKey":"v0:1722374534.0","currentOid":""},"activityList":{"items":[{"before":"5d5bfbb61911a90c40f34567e38cb19f7db8b807","after":"3669b25206d5938e3cc74a5f7860e31c38af8204","ref":"refs/heads/main","pushedAt":"2024-08-06T04:27:52.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"tridao","name":"Tri Dao","path":"/tridao","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5616128?s=80&v=4"},"commit":{"message":"bwd benchmark + small fixes (#1129)","shortMessageHtmlLink":"bwd benchmark + small fixes (#1129)"}},{"before":"3f1b4d38e7c9ba56a333f2a6e2afe65b844c8da5","after":"5d5bfbb61911a90c40f34567e38cb19f7db8b807","ref":"refs/heads/main","pushedAt":"2024-08-05T21:47:10.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"tridao","name":"Tri Dao","path":"/tridao","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5616128?s=80&v=4"},"commit":{"message":"Remove contiguous checks","shortMessageHtmlLink":"Remove contiguous checks"}},{"before":"3f6ff1c1c52fa3d148b502e465ffb7bc88f7a50e","after":"3f1b4d38e7c9ba56a333f2a6e2afe65b844c8da5","ref":"refs/heads/main","pushedAt":"2024-08-05T15:59:23.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"tridao","name":"Tri Dao","path":"/tridao","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5616128?s=80&v=4"},"commit":{"message":"Fix: check the type of max_seqlen_k instead of checking max_seqlen twice (#1127)","shortMessageHtmlLink":"Fix: check the type of max_seqlen_k instead of checking max_seqlen tw…"}},{"before":"c33de664a105533853cfe807f2caa50a05dd46e8","after":"3f6ff1c1c52fa3d148b502e465ffb7bc88f7a50e","ref":"refs/heads/main","pushedAt":"2024-08-02T08:00:07.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"tridao","name":"Tri Dao","path":"/tridao","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5616128?s=80&v=4"},"commit":{"message":"Remove struct : cute::aligned_struct to avoid error with gcc 12","shortMessageHtmlLink":"Remove struct : cute::aligned_struct to avoid error with gcc 12"}},{"before":"bafe253042fb251a28f351ad0a2657da26263f31","after":"c33de664a105533853cfe807f2caa50a05dd46e8","ref":"refs/heads/main","pushedAt":"2024-08-01T09:14:37.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"tridao","name":"Tri Dao","path":"/tridao","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5616128?s=80&v=4"},"commit":{"message":"Fix import in test","shortMessageHtmlLink":"Fix import in test"}},{"before":"abffb0f98c7df80380f87d0dacb713ae0533440c","after":"bafe253042fb251a28f351ad0a2657da26263f31","ref":"refs/heads/main","pushedAt":"2024-08-01T08:57:29.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"tridao","name":"Tri Dao","path":"/tridao","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5616128?s=80&v=4"},"commit":{"message":"[FA3] Bwd","shortMessageHtmlLink":"[FA3] Bwd"}},{"before":"5018ac6ac531aabdb05c8af1ba3d98a2235bdbde","after":"abffb0f98c7df80380f87d0dacb713ae0533440c","ref":"refs/heads/main","pushedAt":"2024-08-01T05:42:06.000Z","pushType":"pr_merge","commitsCount":2,"pusher":{"login":"ipiszy","name":"Ying Zhang","path":"/ipiszy","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/10527447?s=80&v=4"},"commit":{"message":"Merge pull request #1115 from ipiszy/bench\n\nAdd cudnn benchmark for var-len","shortMessageHtmlLink":"Merge pull request #1115 from ipiszy/bench"}},{"before":"6b468f8406dff9026c4d834abc5a8eb6d01f6584","after":null,"ref":"refs/heads/fp8-kernel-with-transpose-V","pushedAt":"2024-07-30T21:22:14.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"jayhshah","name":null,"path":"/jayhshah","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17012019?s=80&v=4"}},{"before":"c4b9015d74bd9f638c6fd574482accf4bbbd4197","after":"5018ac6ac531aabdb05c8af1ba3d98a2235bdbde","ref":"refs/heads/main","pushedAt":"2024-07-30T21:14:14.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"jayhshah","name":null,"path":"/jayhshah","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17012019?s=80&v=4"},"commit":{"message":"Fp8 kernel with \"in-kernel\" transpose of V in producer (#1100)\n\n* base version\r\n\r\n* restructure pipelines, add special fp8 epilogue\r\n\r\n* add variants\r\n\r\n* add fp8 causal and modify dynamic tile scheduler\r\n\r\n* better causal schedule\r\n\r\n* maintain two schedules for non causal and causal\r\n\r\n* removing macros\r\n\r\n* fix regression\r\n\r\n* clean up unneeded methods and variants\r\n\r\n* fix mistake with NumProducerThreads\r\n\r\n* base version\r\n\r\n* restructure pipelines, add special fp8 epilogue\r\n\r\n* add variants\r\n\r\n* add fp8 causal and modify dynamic tile scheduler\r\n\r\n* better causal schedule\r\n\r\n* maintain two schedules for non causal and causal\r\n\r\n* removing macros\r\n\r\n* fix regression\r\n\r\n* clean up unneeded methods and variants\r\n\r\n* fix mistake with NumProducerThreads\r\n\r\n* use seqlen traits\r\n\r\n* add fp8 .cu files and benchmark script\r\n\r\n* fix merge issue\r\n\r\n* fix merge issue\r\n\r\n* fix merge issue\r\n\r\n* remove duplicate code\r\n\r\n* fix regression with varseqlen\r\n\r\n* move varseqlen init in constexpr\r\n\r\n* fix test script\r\n\r\n* more constexpr on varseqlen and add max offset\r\n\r\n* add back test cases","shortMessageHtmlLink":"Fp8 kernel with \"in-kernel\" transpose of V in producer (#1100)"}},{"before":"bd6505988a0791c60280c29905b9158b4a7656fc","after":"6b468f8406dff9026c4d834abc5a8eb6d01f6584","ref":"refs/heads/fp8-kernel-with-transpose-V","pushedAt":"2024-07-30T21:13:25.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"jayhshah","name":null,"path":"/jayhshah","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17012019?s=80&v=4"},"commit":{"message":"add back test cases","shortMessageHtmlLink":"add back test cases"}},{"before":"ddc1d71dbe545a713ace2bc467d70eb4bebcd8d6","after":"bd6505988a0791c60280c29905b9158b4a7656fc","ref":"refs/heads/fp8-kernel-with-transpose-V","pushedAt":"2024-07-30T21:07:17.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"jayhshah","name":null,"path":"/jayhshah","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17012019?s=80&v=4"},"commit":{"message":"more constexpr on varseqlen and add max offset","shortMessageHtmlLink":"more constexpr on varseqlen and add max offset"}},{"before":"bd7155c638ab181629322c4ab5e66139d1a2ded7","after":"ddc1d71dbe545a713ace2bc467d70eb4bebcd8d6","ref":"refs/heads/fp8-kernel-with-transpose-V","pushedAt":"2024-07-29T22:39:45.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"jayhshah","name":null,"path":"/jayhshah","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17012019?s=80&v=4"},"commit":{"message":"fix test script","shortMessageHtmlLink":"fix test script"}},{"before":"418d677192b483dfc1decfdf9aadca40b402485d","after":"c4b9015d74bd9f638c6fd574482accf4bbbd4197","ref":"refs/heads/main","pushedAt":"2024-07-27T18:13:30.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"tridao","name":"Tri Dao","path":"/tridao","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5616128?s=80&v=4"},"commit":{"message":"Add benchmark_gemm.py","shortMessageHtmlLink":"Add benchmark_gemm.py"}},{"before":"d3d0aa49cbc21eb003cec4fd37221cba104bdde8","after":"bd7155c638ab181629322c4ab5e66139d1a2ded7","ref":"refs/heads/fp8-kernel-with-transpose-V","pushedAt":"2024-07-26T08:13:03.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"jayhshah","name":null,"path":"/jayhshah","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17012019?s=80&v=4"},"commit":{"message":"move varseqlen init in constexpr","shortMessageHtmlLink":"move varseqlen init in constexpr"}},{"before":"54e4fc53d8fca057f9bcb79ea22eea680874e2b6","after":"d3d0aa49cbc21eb003cec4fd37221cba104bdde8","ref":"refs/heads/fp8-kernel-with-transpose-V","pushedAt":"2024-07-26T07:13:41.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"jayhshah","name":null,"path":"/jayhshah","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17012019?s=80&v=4"},"commit":{"message":"fix regression with varseqlen","shortMessageHtmlLink":"fix regression with varseqlen"}},{"before":"e458b1386eb4d5134bbf1cc882ae6ff05ab38fda","after":"54e4fc53d8fca057f9bcb79ea22eea680874e2b6","ref":"refs/heads/fp8-kernel-with-transpose-V","pushedAt":"2024-07-26T06:46:01.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"jayhshah","name":null,"path":"/jayhshah","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17012019?s=80&v=4"},"commit":{"message":"remove duplicate code","shortMessageHtmlLink":"remove duplicate code"}},{"before":"59a8b8907522fc3f582acf3edbaf11d6de06db68","after":"e458b1386eb4d5134bbf1cc882ae6ff05ab38fda","ref":"refs/heads/fp8-kernel-with-transpose-V","pushedAt":"2024-07-26T06:44:52.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"jayhshah","name":null,"path":"/jayhshah","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17012019?s=80&v=4"},"commit":{"message":"fix merge issue","shortMessageHtmlLink":"fix merge issue"}},{"before":"c94a19948d6509b906fac6947e0e580dd2677483","after":"59a8b8907522fc3f582acf3edbaf11d6de06db68","ref":"refs/heads/fp8-kernel-with-transpose-V","pushedAt":"2024-07-26T06:42:21.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"jayhshah","name":null,"path":"/jayhshah","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17012019?s=80&v=4"},"commit":{"message":"fix merge issue","shortMessageHtmlLink":"fix merge issue"}},{"before":"a6acfb7812d9c0bb8642e984b1519da43f1796b9","after":"c94a19948d6509b906fac6947e0e580dd2677483","ref":"refs/heads/fp8-kernel-with-transpose-V","pushedAt":"2024-07-26T06:41:21.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"jayhshah","name":null,"path":"/jayhshah","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17012019?s=80&v=4"},"commit":{"message":"fix merge issue","shortMessageHtmlLink":"fix merge issue"}},{"before":"a8d0b7be05123468d58e124fff861914de0bd279","after":"a6acfb7812d9c0bb8642e984b1519da43f1796b9","ref":"refs/heads/fp8-kernel-with-transpose-V","pushedAt":"2024-07-26T06:28:23.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"jayhshah","name":null,"path":"/jayhshah","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17012019?s=80&v=4"},"commit":{"message":"add fp8 .cu files and benchmark script","shortMessageHtmlLink":"add fp8 .cu files and benchmark script"}},{"before":"17dd4b7e5ebe53f433298c923fd1d8200bfc94a5","after":"a8d0b7be05123468d58e124fff861914de0bd279","ref":"refs/heads/fp8-kernel-with-transpose-V","pushedAt":"2024-07-26T06:25:14.000Z","pushType":"push","commitsCount":30,"pusher":{"login":"jayhshah","name":null,"path":"/jayhshah","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17012019?s=80&v=4"},"commit":{"message":"Merge branch 'fp8-kernel-with-transpose-V' of github.com:Dao-AILab/flash-attention into fp8-kernel-with-transpose-V","shortMessageHtmlLink":"Merge branch 'fp8-kernel-with-transpose-V' of github.com:Dao-AILab/fl…"}},{"before":"a00492e3fa4ed4386c6cd409b40b83731c1c090c","after":"17dd4b7e5ebe53f433298c923fd1d8200bfc94a5","ref":"refs/heads/fp8-kernel-with-transpose-V","pushedAt":"2024-07-26T00:50:36.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"jayhshah","name":null,"path":"/jayhshah","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17012019?s=80&v=4"},"commit":{"message":"fix mistake with NumProducerThreads","shortMessageHtmlLink":"fix mistake with NumProducerThreads"}},{"before":null,"after":"a00492e3fa4ed4386c6cd409b40b83731c1c090c","ref":"refs/heads/fp8-kernel-with-transpose-V","pushedAt":"2024-07-26T00:37:24.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"jayhshah","name":null,"path":"/jayhshah","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/17012019?s=80&v=4"},"commit":{"message":"clean up unneeded methods and variants","shortMessageHtmlLink":"clean up unneeded methods and variants"}},{"before":"65205d350ea1b3074d94bd615b4111a1415e274b","after":"418d677192b483dfc1decfdf9aadca40b402485d","ref":"refs/heads/main","pushedAt":"2024-07-25T08:33:26.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"tridao","name":"Tri Dao","path":"/tridao","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5616128?s=80&v=4"},"commit":{"message":"Bump to v2.6.3","shortMessageHtmlLink":"Bump to v2.6.3"}},{"before":"3aae9c18c11ce58865374e64640a474877ddee3d","after":"65205d350ea1b3074d94bd615b4111a1415e274b","ref":"refs/heads/main","pushedAt":"2024-07-25T08:30:46.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"tridao","name":"Tri Dao","path":"/tridao","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5616128?s=80&v=4"},"commit":{"message":"[CI] Compile for pytorch 2.4.0","shortMessageHtmlLink":"[CI] Compile for pytorch 2.4.0"}},{"before":"1899c970c8639e82e6b8a78408f4041425e9f900","after":"3aae9c18c11ce58865374e64640a474877ddee3d","ref":"refs/heads/main","pushedAt":"2024-07-25T08:29:33.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"tridao","name":"Tri Dao","path":"/tridao","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5616128?s=80&v=4"},"commit":{"message":"Revert \"Changes For FP8 (#1075)\"\n\nThis reverts commit 1899c970c8639e82e6b8a78408f4041425e9f900.","shortMessageHtmlLink":"Revert \"Changes For FP8 (#1075)\""}},{"before":"59594f2a67eb6a9de730b4f8fe90baa89c5439d3","after":"1899c970c8639e82e6b8a78408f4041425e9f900","ref":"refs/heads/main","pushedAt":"2024-07-23T20:51:14.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"tridao","name":"Tri Dao","path":"/tridao","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5616128?s=80&v=4"},"commit":{"message":"Changes For FP8 (#1075)\n\n* adding files for fp8 changes.\r\n\r\n* removed contiguous check.\r\n\r\n* enable all tests except odd-seq-lengths, where it crashes now.\r\n\r\n* undid clang formatting.\r\n\r\n* change to correct tile size for headdim=128.\r\n\r\n* fixed odd-seq-len-k.\r\n\r\n* minor formatting.\r\n\r\n* minor reformatting.\r\n\r\n---------\r\n\r\nCo-authored-by: Tri Dao ","shortMessageHtmlLink":"Changes For FP8 (#1075)"}},{"before":"81b379c54db74ffde24161890973cec0de2ed64a","after":"d5893f3c74ae53d0b42632fe5cfeb755d6bb0c7a","ref":"refs/heads/changes_for_fp8","pushedAt":"2024-07-23T20:51:06.000Z","pushType":"push","commitsCount":15,"pusher":{"login":"tridao","name":"Tri Dao","path":"/tridao","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5616128?s=80&v=4"},"commit":{"message":"Merge branch 'main' into changes_for_fp8","shortMessageHtmlLink":"Merge branch 'main' into changes_for_fp8"}},{"before":"299563626fbfcd8345e7da2f4e1bb93886b58341","after":"59594f2a67eb6a9de730b4f8fe90baa89c5439d3","ref":"refs/heads/main","pushedAt":"2024-07-23T09:30:19.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"tridao","name":"Tri Dao","path":"/tridao","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5616128?s=80&v=4"},"commit":{"message":"Bump to v2.6.2","shortMessageHtmlLink":"Bump to v2.6.2"}},{"before":"4488acee8deceed0cdd3e5c283daf85b23df2787","after":"299563626fbfcd8345e7da2f4e1bb93886b58341","ref":"refs/heads/main","pushedAt":"2024-07-23T09:04:28.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"tridao","name":"Tri Dao","path":"/tridao","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5616128?s=80&v=4"},"commit":{"message":"Fix test with alibi and cache_leftpad","shortMessageHtmlLink":"Fix test with alibi and cache_leftpad"}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"djE6ks8AAAAEkqqVtwA","startCursor":null,"endCursor":null}},"title":"Activity · Dao-AILab/flash-attention"}