Wuhan University
- Wuhan, China
- https://gknl.github.io
Stars
verl: Volcano Engine Reinforcement Learning for LLMs
[AAAI 2025 oral] Evaluating Mathematical Reasoning Beyond Accuracy
Chain of Thought (CoT) is so hot! And so long! We need shorter reasoning processes!
Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains papers, codes, datasets, evaluations, and analyses.
📜 Paper list on decoding methods for LLMs and LVLMs
Training Large Language Model to Reason in a Continuous Latent Space
✨ A synthetic dataset generation framework that produces diverse coding questions and verifiable solutions - all in one framework
Official Repo for Open-Reasoner-Zero
✨✨Latest Papers and Benchmarks in Reasoning with Foundation Models
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models
Implementation of the Quiet-STAR paper (https://arxiv.org/pdf/2403.09629.pdf)
An Open Large Reasoning Model for Real-World Solutions
Paper list of misinformation research using (multi-modal) large language models, i.e., (M)LLMs.
The related works and background techniques about Openai o1
Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"
The official repository of our survey paper: "Towards a Unified View of Preference Learning for Large Language Models: A Survey"
Paper reproduction of Google's SCoRe (Training Language Models to Self-Correct via Reinforcement Learning)
This is my attempt to create a self-correcting LLM, based on the paper "Training Language Models to Self-Correct via Reinforcement Learning" by Google
[NeurIPS'24] HippoRAG is a novel RAG framework inspired by human long-term memory that enables LLMs to continuously integrate knowledge across external documents. RAG + Knowledge Graphs + Personali…
Code for paper "AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge"
Stanford NLP Python library for Representation Finetuning (ReFT)
Enhancing contextual understanding in large language models through contrastive decoding
[ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts"
[EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"
Paper list for the survey "Combating Misinformation in the Age of LLMs: Opportunities and Challenges" and the initiative "LLMs Meet Misinformation", accepted by AI Magazine 2024