Organizations

@CLIAgroup

Sampling profiler for Python programs

Rust 13,356 449 Updated Feb 6, 2025

Reproduce R1 Zero on Logic Puzzle

Python 2,072 134 Updated Mar 3, 2025

Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.

Python 2,539 182 Updated Mar 10, 2025

Minimal-cost training of a 0.5B R1-Zero model

Python 624 81 Updated Mar 10, 2025

Lightweight version of MAPPO to help you quickly migrate to your local environment.

Python 601 90 Updated Feb 26, 2025

This is the official implementation of Multi-Agent PPO (MAPPO).

Python 1,467 315 Updated Jul 18, 2024

Mobile-Agent: The Powerful Mobile Device Operation Assistant Family

Python 3,612 347 Updated Mar 10, 2025

Align Anything: Training All-modality Model with Feedback

Python 2,722 357 Updated Mar 9, 2025

DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

Python 4,466 1,658 Updated Feb 26, 2025

Official codebase for "STAIR: Improving Safety Alignment with Introspective Reasoning"

Python 24 1 Updated Feb 26, 2025

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 9,655 1,139 Updated Mar 7, 2025

Scalable toolkit for efficient model alignment

Python 737 90 Updated Mar 10, 2025

Accelerating new GitHub Actions workflows

TypeScript 9,865 5,767 Updated Mar 5, 2025

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 2,997 233 Updated Mar 10, 2025

Official Repo for Open-Reasoner-Zero

Python 1,561 73 Updated Mar 5, 2025

Code for the paper Fine-Tuning Language Models from Human Preferences

Python 1,296 168 Updated Jul 25, 2023

Python 150 5 Updated Feb 20, 2025

Fully open reproduction of DeepSeek-R1

Python 22,499 2,020 Updated Mar 10, 2025

Deep Reinforcement Learning

3,652 615 Updated Dec 10, 2022

s1: Simple test-time scaling

Python 5,918 684 Updated Mar 6, 2025

Clean, minimal, accessible reproduction of DeepSeek R1-Zero

Python 11,073 1,411 Updated Mar 10, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 4,557 430 Updated Mar 10, 2025

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 16,649 2,183 Updated Feb 1, 2025

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

Python 1,605 71 Updated Aug 15, 2024

This framework provides out-of-the-box implementations of Referential Games variants in order to study the emergence of artificial languages using deep learning, relying on PyTorch (https://www.pyt…

Python 22 3 Updated Feb 24, 2025

EGG: Emergence of lanGuage in Games

Jupyter Notebook 297 106 Updated Apr 4, 2024

Stable Diffusion web UI

Python 149,171 27,859 Updated Mar 4, 2025

The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.

Python 70,430 7,601 Updated Mar 10, 2025

StarCraft II Learning Environment

Python 8,089 1,158 Updated Jul 23, 2024