rl-training

Here are 7 public repositories matching this topic...

rohithreddy024 / Text-Summarizer-Pytorch

PyTorch implementation of the paper "A Deep Reinforced Model for Abstractive Summarization" and of a pointer-generator network

pytorch text-summarization beam-search rl-training mle-training

Updated Oct 1, 2019
Python

sb-ai-lab / Sim4Rec

Simulator for training and evaluation of Recommender Systems

recommender-system recommendation user-modeling evaluation-framework synthetic-data rl-training

Updated Mar 24, 2025
Jupyter Notebook

zli12321 / qa_metrics

An easy-to-use Python package for running quick, basic QA evaluations. It includes standardized QA evaluation metrics and semantic evaluation metrics: black-box and open-source large language model prompting and evaluation, exact match, F1 score, PEDANT semantic match, and transformer match. The package also supports prompting the OpenAI and Anthropic APIs.

qa-automation-test rl-training llm exact-matching llm-evaluation llm-evaluation-toolkit llm-evaluation-framework reward-modeling

Updated Jun 21, 2025
Python

zli12321 / long_form_rl

GRPO training for long-form QA and instruction following with a long-form reward model

reinforcement-learning-algorithms evaluation-framework reward-design rl-training long-form-text-generation qwen2-5 grpo rlvr

Updated Jun 23, 2025
Python

Amirhosein-gh98 / Guided-by-Gut

The official PyTorch implementation of "Guided by Gut: Efficient Test-Time Scaling with Reinforced Intrinsic Confidence"

efficient tree-search gg prm self-consistency confidence dvts rl-training llm inference-time-compute grpo test-time-scaling guided-by-gut

Updated Jun 9, 2025
Python

sotheara-leang / txt-summarization

Deep Reinforced Model for Abstractive Summarization

pytorch text-summarization abstractive-summarization rl-training mle-training temporal-attention share-decoder-weight

Updated Nov 22, 2022
Python

tryxmta / Guided-by-Gut

Guided by Gut offers a streamlined approach to Test-Time Scaling using advanced reinforcement learning techniques. Explore the repository for practical implementations and insights into confidence-based fine-tuning and self-guided tree search. 🐙🌟

first the are jesus jesus-christ prm self-consistency wednesday confidence dvts gut rl-training religions consequences llm grpo test-time-scaling guided-by-gut

Updated Jun 25, 2025

