{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":354138793,"defaultBranch":"main","name":"FasterTransformer","ownerLogin":"NVIDIA","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2021-04-02T21:36:33.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/1728152?v=4","public":true,"private":false,"isOrgOwned":true},"refInfo":{"name":"","listCacheKey":"v0:1682930850.0","currentOid":""},"activityList":{"items":[{"before":"afdf9a9eb86f15363c0249117d166d6b45dbb371","after":"df4a7534860137e060e18d2ebf019906120ea204","ref":"refs/heads/main","pushedAt":"2023-10-19T14:44:07.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"Update README.md","shortMessageHtmlLink":"Update README.md"}},{"before":"f8e42aac45815c5be92c0915b12b9a6652386e8c","after":"afdf9a9eb86f15363c0249117d166d6b45dbb371","ref":"refs/heads/main","pushedAt":"2023-09-08T06:30:48.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"fix memory leak","shortMessageHtmlLink":"fix memory leak"}},{"before":"7777ff1d9f13bedd7705bd4900e851b5f8177254","after":"f8e42aac45815c5be92c0915b12b9a6652386e8c","ref":"refs/heads/main","pushedAt":"2023-07-06T01:33:04.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"remove mpi_cxx from multi-gpu build for now (#705)","shortMessageHtmlLink":"remove mpi_cxx from multi-gpu build for now (#705)"}},{"before":"eb9b81b65909cb14f582581c1ed4ee8e1e299be9","after":"7777ff1d9f13bedd7705bd4900e851b5f8177254","ref":"refs/heads/main","pushedAt":"2023-06-29T10:20:48.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"Support size_per_head=112 (#660)\n\n* fix multi-gpu build\r\n\r\n* add support for size_per_head=112 for gpt decoder","shortMessageHtmlLink":"Support size_per_head=112 (#660)"}},{"before":"1cf9b515f3aa35d5483b1fd00aecc03fc9347dad","after":"eb9b81b65909cb14f582581c1ed4ee8e1e299be9","ref":"refs/heads/main","pushedAt":"2023-06-26T03:02:50.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"fix: swap tensor bug (#683)","shortMessageHtmlLink":"fix: swap tensor bug (#683)"}},{"before":"c6e8f60ec40da218804a60e6aa986903e7fa8594","after":"1cf9b515f3aa35d5483b1fd00aecc03fc9347dad","ref":"refs/heads/main","pushedAt":"2023-06-21T03:24:03.138Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"[bugfix] Fix 2-shot All Reduce correctness issue (indexing bug). (#672)\n\nFasterTransformer 2-shot all reduce is implemented as a reduce-scatter + all-gather. There is an indexing bug in the all-gather step. Prior to this change, 2-shot all reduce was only producing correct results on device 0. Now, all devices have the correct results.","shortMessageHtmlLink":"[bugfix] Fix 2-shot All Reduce correctness issue (indexing bug). (#672)"}},{"before":"539330280fbfde4e0e72ce8b460d4c92ba13a17a","after":null,"ref":"refs/heads/fix/gpt_early_stop","pushedAt":"2023-05-01T08:47:30.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"}},{"before":"19b2956db648f5fa996c27840f6d00e32d2e21f0","after":"c6e8f60ec40da218804a60e6aa986903e7fa8594","ref":"refs/heads/main","pushedAt":"2023-05-01T08:45:58.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"Fix/gpt early stop (#584)\n\n* fix: fix bug of early stopping of gpt","shortMessageHtmlLink":"Fix/gpt early stop (#584)"}},{"before":"b5d67652271b2d86dad338f64827646f100a9321","after":"539330280fbfde4e0e72ce8b460d4c92ba13a17a","ref":"refs/heads/fix/gpt_early_stop","pushedAt":"2023-05-01T08:45:39.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"fix: remove useless codes","shortMessageHtmlLink":"fix: remove useless codes"}},{"before":null,"after":"b5d67652271b2d86dad338f64827646f100a9321","ref":"refs/heads/fix/gpt_early_stop","pushedAt":"2023-05-01T08:43:18.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"fix codes","shortMessageHtmlLink":"fix codes"}},{"before":"a3ca909df949d4d8ce0b8e18c2a0eb90cf0cb411","after":"08b45b47d1cd3f157e1af0d804bc86fa13d04764","ref":"refs/heads/fix/softprompt_mask","pushedAt":"2023-04-25T01:50:43.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"fix: fix bug of mask of gptj/gptneox","shortMessageHtmlLink":"fix: fix bug of mask of gptj/gptneox"}},{"before":null,"after":"a3ca909df949d4d8ce0b8e18c2a0eb90cf0cb411","ref":"refs/heads/fix/softprompt_mask","pushedAt":"2023-04-24T09:39:49.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"fix: fix bug of preparing mask for soft prompt","shortMessageHtmlLink":"fix: fix bug of preparing mask for soft prompt"}},{"before":"3460e20ac0c7b8492c0beb98a9648dee713a726f","after":"19b2956db648f5fa996c27840f6d00e32d2e21f0","ref":"refs/heads/main","pushedAt":"2023-04-24T08:06:02.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"perf(bloom): improve performance of huggingface_bloom_convert.py, decrease the time cost and the mem using (#568)\n\nCo-authored-by: r.yang ","shortMessageHtmlLink":"perf(bloom): improve performance of huggingface_bloom_convert.py, dec…"}},{"before":"d7ccf83a15c3fd30020d5007415a9c16e99f5f42","after":"3460e20ac0c7b8492c0beb98a9648dee713a726f","ref":"refs/heads/main","pushedAt":"2023-04-24T02:52:18.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"[Enhancement]create huggingface_gptneox_convert.py (#569)\n\n* create huggingface_gptneox_convert.py\r\n\r\nSigned-off-by: AkiyamaYummy <842720660@qq.com>\r\n\r\n* adjust HF's multi bin files\r\n\r\nSigned-off-by: AkiyamaYummy <842720660@qq.com>\r\n\r\n* update gptneox_guide.md\r\n\r\nSigned-off-by: AkiyamaYummy <842720660@qq.com>\r\n\r\n---------\r\n\r\nSigned-off-by: AkiyamaYummy <842720660@qq.com>","shortMessageHtmlLink":"[Enhancement]create huggingface_gptneox_convert.py (#569)"}},{"before":"adb21c30442e5964531d982c040da29b5aedb737","after":"d7ccf83a15c3fd30020d5007415a9c16e99f5f42","ref":"refs/heads/main","pushedAt":"2023-04-20T00:28:43.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"Update unfused_attention_kernels.cu\n\nfix bug of softmax kernel","shortMessageHtmlLink":"Update unfused_attention_kernels.cu"}},{"before":"c6ba315e06e97c0933a9fcff42ef720cf93c7168","after":"adb21c30442e5964531d982c040da29b5aedb737","ref":"refs/heads/main","pushedAt":"2023-04-19T07:41:37.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"fix overflow in softmax_kernel when process long seqlen and big batch_size (#524)","shortMessageHtmlLink":"fix overflow in softmax_kernel when process long seqlen and big batch…"}},{"before":"a6ef7af16c094ac2bb55a3b821dc85d0f0240e08","after":"c6ba315e06e97c0933a9fcff42ef720cf93c7168","ref":"refs/heads/main","pushedAt":"2023-04-18T07:54:12.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"Update cublasMMWrapper.cc","shortMessageHtmlLink":"Update cublasMMWrapper.cc"}},{"before":"169b8df80d568bf2337a35088c3979d207ab4495","after":"a6ef7af16c094ac2bb55a3b821dc85d0f0240e08","ref":"refs/heads/main","pushedAt":"2023-04-18T06:31:12.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"Update cublasMMWrapper.cc\n\nFix the CUBLAS_VERSION checking of cublasMMWrapper","shortMessageHtmlLink":"Update cublasMMWrapper.cc"}},{"before":"0c128050e14b65d72b3c28c0324cc9db6f677be8","after":"169b8df80d568bf2337a35088c3979d207ab4495","ref":"refs/heads/main","pushedAt":"2023-04-18T05:21:50.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"[Enhancement]add pytorch backend support for gptneox (#550)\n\n* add pytorch backend support for gptneox\r\n\r\nSigned-off-by: AkiyamaYummy <842720660@qq.com>\r\n\r\n* fix early stopping invalid\r\n\r\n* 1) Some unused parameters and logic have been removed. 2) Revisions that would affect pipeline parallelism have been reverted. 3) The code has been made capable of direct validation on TabbyML/NeoX-1.3B.\r\n\r\nSigned-off-by: AkiyamaYummy <842720660@qq.com>\r\n\r\n* Change the names of classes, removing 'parallel' from their names\r\n\r\nSigned-off-by: AkiyamaYummy <842720660@qq.com>\r\n\r\n* Format the code.\r\n\r\nSigned-off-by: AkiyamaYummy <842720660@qq.com>\r\n\r\n* Only print results when rank is 0.\r\n\r\nSigned-off-by: AkiyamaYummy <842720660@qq.com>\r\n\r\n* Add dist.init_process_group().\r\n\r\nSigned-off-by: AkiyamaYummy <842720660@qq.com>\r\n\r\n* update docs\r\n\r\nSigned-off-by: AkiyamaYummy <842720660@qq.com>\r\n\r\n---------\r\n\r\nSigned-off-by: AkiyamaYummy <842720660@qq.com>","shortMessageHtmlLink":"[Enhancement]add pytorch backend support for gptneox (#550)"}},{"before":"4402759e48f2340220638675f464b6ba1f79ac3c","after":"0c128050e14b65d72b3c28c0324cc9db6f677be8","ref":"refs/heads/main","pushedAt":"2023-04-17T03:04:38.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"Update T5DecodingWeight.cc\n\nfix: fix loading bug of t5","shortMessageHtmlLink":"Update T5DecodingWeight.cc"}},{"before":"bc4139e636c410195fd81b849d44d1325f5fef7d","after":"4402759e48f2340220638675f464b6ba1f79ac3c","ref":"refs/heads/main","pushedAt":"2023-04-06T07:15:14.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"fix: fix bug of gpt buffer and gpt gemm overflow","shortMessageHtmlLink":"fix: fix bug of gpt buffer and gpt gemm overflow"}},{"before":"e2dd1641880840db76b8902b34106c85b026a0af","after":"e045811c39572dca9016c830f7d0700858e82c32","ref":"refs/heads/tmp/fix_gpt_earlystop","pushedAt":"2023-04-06T05:30:57.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"Update ParallelGpt.cc\n\nfix bug of calling invokeFindContextDups in parallelGpt","shortMessageHtmlLink":"Update ParallelGpt.cc"}},{"before":"e8384260b709b10459b138b22af42c3ff82d5c9e","after":"bc4139e636c410195fd81b849d44d1325f5fef7d","ref":"refs/heads/main","pushedAt":"2023-03-29T06:23:24.601Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"Update gpt_guide.md (#529)","shortMessageHtmlLink":"Update gpt_guide.md (#529)"}},{"before":null,"after":"e2dd1641880840db76b8902b34106c85b026a0af","ref":"refs/heads/tmp/fix_gpt_earlystop","pushedAt":"2023-03-24T09:59:53.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"fix: fix bug of gpt early stop. But this fixing would lead to hang on for pipeline parallelism","shortMessageHtmlLink":"fix: fix bug of gpt early stop. But this fixing would lead to hang on…"}},{"before":null,"after":"7c0ebe8691da23d3da68bfd2aa0a1b09a3f7f532","ref":"refs/heads/dev/gptj_shared_context","pushedAt":"2023-03-22T04:22:51.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"feat: support shared context in gptj","shortMessageHtmlLink":"feat: support shared context in gptj"}},{"before":"bb94e2d9bbed65c7ea09b0fd340c290dae50f706","after":"e8384260b709b10459b138b22af42c3ff82d5c9e","ref":"refs/heads/main","pushedAt":"2023-03-17T03:37:12.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"fix: gpt tensor shapes inconsistency (#505)\n\nSigned-off-by: AkiyamaYummy <842720660@qq.com>","shortMessageHtmlLink":"fix: gpt tensor shapes inconsistency (#505)"}},{"before":"72d3dceb97d26c1dd20d9ad9b6ab44cda2f2919a","after":"bb94e2d9bbed65c7ea09b0fd340c290dae50f706","ref":"refs/heads/main","pushedAt":"2023-03-14T07:47:02.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"fix: change int of some kernels to int64_t to prevent overflow","shortMessageHtmlLink":"fix: change int of some kernels to int64_t to prevent overflow"}},{"before":"303e05273e283388778b53eb52abe43ffc62f546","after":"72d3dceb97d26c1dd20d9ad9b6ab44cda2f2919a","ref":"refs/heads/main","pushedAt":"2023-03-10T03:14:16.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"byshiue","name":"byshiue_NV","path":"/byshiue","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/11360707?s=80&v=4"},"commit":{"message":"Update beam_search_topk_kernels.cu\n\nfix: fix bug of beam search","shortMessageHtmlLink":"Update beam_search_topk_kernels.cu"}}],"hasNextPage":false,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"djE6ks8AAAADmxJOFQA","startCursor":null,"endCursor":null}},"title":"Activity · NVIDIA/FasterTransformer"}