# llm-reasoning

Here are 20 public repositories matching this topic...

[AAAI 2025] ORQA is a new QA benchmark designed to assess the reasoning capabilities of LLMs in the specialized technical domain of Operations Research (OR). The benchmark evaluates whether LLMs can emulate the knowledge and reasoning skills of OR experts when presented with complex optimization modeling tasks.
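To make the evaluation idea concrete, here is a minimal sketch of how answers to a multiple-choice QA benchmark like this might be scored. ORQA's actual data format and metric are not specified here; the function name and the exact-match metric are illustrative assumptions.

```python
# Hypothetical sketch: scoring a model's multiple-choice answers against a
# gold answer key. This is not ORQA's actual evaluation code.

def exact_match_accuracy(predictions, gold):
    """Fraction of questions where the model's chosen option matches the key."""
    assert len(predictions) == len(gold), "one prediction per question"
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)

# Example: a model answering 3 of 4 optimization-modeling questions correctly.
print(exact_match_accuracy(["B", "D", "A", "C"], ["B", "D", "A", "A"]))  # 0.75
```

Accuracy over a fixed answer key is the simplest headline metric for such a benchmark; reasoning-focused benchmarks often also inspect the intermediate reasoning steps, which this sketch does not attempt.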

  • Updated Mar 20, 2025

We introduce a benchmark for testing how well LLMs can find vulnerabilities in cryptographic protocols. By combining LLMs with symbolic reasoning tools like Tamarin, we aim to improve the efficiency and thoroughness of protocol analysis, paving the way for future AI-powered cybersecurity defenses.

  • Updated Nov 4, 2024
  • Haskell
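The combination described above, an LLM proposing candidate vulnerabilities and a symbolic tool confirming them, can be sketched as a simple propose-and-verify loop. The `propose` and `check` callables below are stand-ins for an LLM call and a verifier invocation (e.g. running Tamarin on a candidate trace); they are assumptions for illustration, not the repository's actual API.

```python
# Hypothetical sketch of an LLM + symbolic-checker loop. `propose` and
# `check` are illustrative stand-ins, not this project's real interface.

def analyze(protocol, propose, check, max_rounds=3):
    """Ask the LLM for candidate vulnerabilities and keep only those the
    symbolic checker confirms, feeding verifier feedback back to the LLM."""
    confirmed = []
    feedback = None
    for _ in range(max_rounds):
        candidate = propose(protocol, feedback)    # LLM suggests an attack
        ok, feedback = check(protocol, candidate)  # symbolic tool verifies it
        if ok:
            confirmed.append(candidate)
    return confirmed
```

The point of the loop is that the symbolic checker supplies ground truth: the LLM's suggestions gain soundness only once a tool like Tamarin confirms them, which is what makes the combination more thorough than either component alone.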
