Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.
Updated Mar 12, 2025 - Python
A simple vLLM extension that speeds up reasoning models without training.
Agentic Deep Graph Reasoning Implementation
State Sandbox is an experimental game for socioeconomic simulation. It uses Large Language Models (o3-mini) to simulate the world and complex policy impacts.
Lightweight replication study of DeepSeek-R1-Zero. Interesting findings include "No Aha Moment", "Longer CoT ≠ Accuracy", and "Language Mixing in Instruct Models".
Experiments on test-time scaling approaches for reasoning LMs to enforce better <think> or <wait> capabilities.
Pure RL without SFT to post-train base models for social reasoning capabilities. Lightweight replication of DeepSeek-R1-Zero with the Social IQa dataset.
Model Mondays is a weekly livestreamed series on Microsoft Reactor that helps you make informed model choices with timely updates and model deep-dives. Watch live for the content. Join Discord for the discussions.