Pinned

  1. RWKV-LM-LISA RWKV-LM-LISA Public

    Layerwise Importance Sampled AdamW for RWKV, targeting RWKV-5 and 6. Supports SFT and alignment (DPO, ORPO). CUDA and ROCm 6.0. Can train a 7B model on a 24GB GPU!

    Python 11

  2. RWKV5-LM-LoRA RWKV5-LM-LoRA Public

    RWKV v5/v6 LoRA trainer for the CUDA and ROCm platforms. RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), so it combines the best of RNN and transformer.

    Python 9 1
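The core idea behind the LoRA trainers above can be sketched in a few lines. This is an illustrative, pure-Python sketch of the general LoRA technique, not code from the repository: the frozen weight matrix W is left untouched, and only a low-rank update B·A, scaled by alpha/r, is trained.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def lora_forward(W, A, B, x, alpha=16, r=2):
    """y = W x + (alpha / r) * B (A x); only A and B are trainable."""
    base = matvec(W, x)
    low_rank = matvec(B, matvec(A, x))  # passes through an r-dimensional bottleneck
    scale = alpha / r
    return [b + scale * l for b, l in zip(base, low_rank)]

# Tiny example: 2x2 frozen identity W, rank-1 adapter (r=1).
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]           # r x d_in  = 1 x 2
B = [[0.5], [0.5]]         # d_out x r = 2 x 1
y = lora_forward(W, A, B, [1.0, 2.0], alpha=1, r=1)
```

Because A and B together hold far fewer parameters than W, this is what makes fine-tuning multi-billion-parameter RWKV models feasible on a single consumer GPU.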

  3. RWKV-infctx-trainer-LoRA RWKV-infctx-trainer-LoRA Public

    RWKV v5/v6 infctx LoRA trainer with 4-bit quantization. CUDA and ROCm supported, for training arbitrary context sizes, to 10k and beyond!

    Python 8 2

  4. RWKV-LM-RLHF-DPO-LoRA RWKV-LM-RLHF-DPO-LoRA Public

    Forked from Triang-jyed-driung/RWKV-LM-RLHF-DPO

    Direct Preference Optimization (DPO) with LoRA for RWKV, targeting RWKV-5 and 6.

    Python 1
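For context, the DPO objective this trainer implements can be written down compactly. The sketch below is a generic, pure-Python rendering of the standard per-pair DPO loss (not code from the repository): it rewards the policy for raising the log-probability of the chosen response relative to the reference model, and lowering it for the rejected one.

```python
import math

def dpo_loss(logp_w_policy, logp_l_policy, logp_w_ref, logp_l_ref, beta=0.1):
    """Per-pair DPO loss:
    -log sigmoid(beta * ((logp_w_pi - logp_w_ref) - (logp_l_pi - logp_l_ref)))
    where *_w is the chosen (winning) response and *_l the rejected one."""
    margin = (logp_w_policy - logp_w_ref) - (logp_l_policy - logp_l_ref)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

When the policy matches the reference, the margin is zero and the loss sits at log 2; it falls as the policy learns to prefer the chosen response.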

  5. RWKV-LM-State-4bit-Orpo RWKV-LM-State-4bit-Orpo Public

    State tuning of RWKV v6 with ORPO can be performed with 4-bit quantization. Every model size can be trained with ORPO on a single 24GB GPU!

    Python 4 1
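ORPO differs from DPO in that it needs no reference model: it adds an odds-ratio penalty on top of the ordinary SFT loss. The snippet below is a generic sketch of that odds-ratio term (not the repository's implementation), taking mean per-token log-probabilities of the chosen and rejected responses.

```python
import math

def orpo_or_loss(logp_w, logp_l):
    """Odds-ratio term of ORPO: -log sigmoid(log(odds_w / odds_l)),
    where odds(p) = p / (1 - p) and p = exp(mean token log-prob).
    The full ORPO objective adds this, weighted, to the SFT (NLL) loss."""
    odds = lambda logp: math.exp(logp) / (1.0 - math.exp(logp))
    log_odds_ratio = math.log(odds(logp_w)) - math.log(odds(logp_l))
    return -math.log(1.0 / (1.0 + math.exp(-log_odds_ratio)))
```

With no reference model to hold in memory, combining this with 4-bit quantization is what lets the training fit on a single 24GB GPU.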

  6. RWKV-Infer RWKV-Infer Public

    A large-scale RWKV v6 inference wrapper using the CUDA backend. Easy to deploy with Docker. Supports multi-batch generation and dynamic state switching. Let's spread RWKV, which combines RNN technology…

    Python 4 1