Awesome Reasoning LLM Tutorial/Survey/Guide
Updated Jun 5, 2025 - Python
Mental-health large language models (LLM x Mental Health): pre- & post-training, datasets, evaluation, deployment & RAG, with the InternLM / Qwen / Baichuan / DeepSeek / Mixtral / LLaMA / GLM series of models
Explore the Multimodal "Aha Moment" on a 2B Model
Train a Language Model with GRPO to create a schedule from a list of events and priorities
A brief and partial summary of RLHF algorithms.
[CVPR 2025] Science-T2I: Addressing Scientific Illusions in Image Synthesis
Revisiting Mid-training in the Era of RL Scaling
Lightweight replication study of DeepSeek-R1-Zero. Interesting findings include "No Aha Moment", "Longer CoT ≠ Accuracy", and "Language Mixing in Instruct Models".
A high-efficiency system for large language model based search agents
[EMNLP 2022] Continual Training of Language Models for Few-Shot Learning
A novel alignment framework that leverages image retrieval to mitigate hallucinations in Vision Language Models.
Pure RL to post-train base models for social reasoning capabilities. Lightweight replication of DeepSeek-R1-Zero with Social IQa dataset.
This repository collects research papers on learning from rewards in the context of post-training and test-time scaling of large language models (LLMs).
The official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508)
RFTT: Reasoning with Reinforced Functional Token Tuning
Official implementation for "Diffusion Instruction Tuning"
Official repository for the paper "Monocular Event-Based Vision for Obstacle Avoidance with a Quadrotor" by Bhattacharya, et al. (2024) from GRASP, Penn & RPG, UZH.
Code repository for "Post-pre-training for Modality Alignment in Vision-Language Foundation Models" (CVPR2025)
Official repo for "ProSec: Fortifying Code LLMs with Proactive Security Alignment"