Stars
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym
This repo contains the dataset and code for the paper "SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?"
Collection of leaked system prompts
A natural language interface for computers
[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
jax-triton contains integrations between JAX and OpenAI Triton
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection and Instruction-Aware Models for Conversational AI
🐙 OctoPack: Instruction Tuning Code Large Language Models
👨‍💻 An awesome and curated list of best code-LLM for research.
[ICML 2023] Data and code release for the paper "DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation".
tree is a library for working with nested data structures
Type annotations and runtime checking for shape and dtype of JAX/NumPy/PyTorch/etc. arrays. https://docs.kidger.site/jaxtyping/
Rich is a Python library for rich text and beautiful formatting in the terminal.
A high-throughput and memory-efficient inference and serving engine for LLMs
A concrete syntax tree parser and serializer library for Python that preserves many aspects of Python's abstract syntax tree
Using Tree-of-Thought Prompting to boost ChatGPT's reasoning
Aligning pretrained language models with instruction data generated by themselves.
ConvLab-2: An Open-Source Toolkit for Building, Evaluating, and Diagnosing Dialogue Systems
In-Context Learning User Simulators for Task-Oriented Dialog Systems