Ring-V2.5

Introduction

Introducing Ring-2.5-1T: the world's first open-source trillion-parameter thinking model based on a hybrid linear attention architecture.

In a major step toward general-purpose AI agents, we're scaling hybrid linear attention across pre-training and RL. Our efficient 1:7 MLA + Lightning Linear Attention boosts reasoning speed and exploration, while expanded RL training enhances deep thinking and long-horizon task execution.

Performance

The Ring-2.5-1T model achieves gold-medal 🏅 level performance at both IMO 2025 and CMO 2025. For the model's detailed solutions, please see examples.

Model Downloads

Model	Context Length	Download
Ring-2.5-1T	128K -> 256K (YaRN)	🤗 HuggingFace 🤖 ModelScope

Note: If you are interested in previous versions, please visit the past model collections on Hugging Face or ModelScope.

Deployment

SGLang

Environment Preparation

We will submit our model to the official SGLang release later. For now, prepare the environment with the following steps:

git clone -b ling_2_5 git@github.com:antgroup/sglang.git
cd sglang

# Install the python packages
pip install --upgrade pip
pip install -e "python"

Run Inference

SGLang now supports both BF16 and FP8 models. Which one runs depends on the dtype of the model in ${MODEL_PATH}.
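Because the precision is taken from the checkpoint rather than from a flag, it can help to check what ${MODEL_PATH} contains before launching. The following is a minimal sketch, assuming the checkpoint ships a Hugging Face-style `config.json`. The key names `torch_dtype`, `quantization_config` and `quant_method` are conventions of that format and are not specified by this README. The helper names are hypothetical.

```python
import json
from pathlib import Path


def dtype_label(config: dict) -> str:
    """Return a short precision label for a parsed config.json.

    Assumption: quantized (e.g. FP8) checkpoints carry a
    "quantization_config" section, as is common for Hugging Face-style
    checkpoints; otherwise "torch_dtype" names the tensor dtype.
    """
    quant = config.get("quantization_config")
    if quant:
        return str(quant.get("quant_method", "quantized"))
    return str(config.get("torch_dtype", "unknown"))


def load_config(model_path: str) -> dict:
    """Read config.json from a local model directory."""
    return json.loads((Path(model_path) / "config.json").read_text())


# Usage (requires a local checkpoint directory):
#   print(dtype_label(load_config("/path/to/model")))
```

If the label is not what you expect, check that ${MODEL_PATH} points at the intended download.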

Here is an example of running Ring-2.5-1T across multiple GPU nodes, where the master node IP is ${MASTER_IP} and the server port is ${PORT}:

Start server:

# Node 0:
python -m sglang.launch_server --model-path $MODEL_PATH --tp-size 8 --pp-size 4 --dp-size 1 --trust-remote-code --dist-init-addr $MASTER_IP:2345 --port $PORT --nnodes 4 --node-rank 0 
# Node 1:
python -m sglang.launch_server --model-path $MODEL_PATH --tp-size 8 --pp-size 4 --dp-size 1 --trust-remote-code --dist-init-addr $MASTER_IP:2345 --port $PORT --nnodes 4 --node-rank 1 
# Node 2:
python -m sglang.launch_server --model-path $MODEL_PATH --tp-size 8 --pp-size 4 --dp-size 1 --trust-remote-code --dist-init-addr $MASTER_IP:2345 --port $PORT --nnodes 4 --node-rank 2 
# Node 3:
python -m sglang.launch_server --model-path $MODEL_PATH --tp-size 8 --pp-size 4 --dp-size 1 --trust-remote-code --dist-init-addr $MASTER_IP:2345 --port $PORT --nnodes 4 --node-rank 3

# This is only an example. Please adjust arguments according to your actual environment.
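The four commands above differ only in `--node-rank`, so they can be generated from one template. The sketch below rebuilds them from the flags shown above and prints one command per node. It assumes that with `--dp-size 1` the job spans `tp_size * pp_size` GPUs (32 here, or 8 per node if they are spread evenly over 4 nodes). The function names and the placeholder defaults are hypothetical.

```python
import os
import shlex


def total_gpus(tp_size: int, pp_size: int) -> int:
    # Assumption: with --dp-size 1, the job spans tp_size * pp_size GPUs.
    return tp_size * pp_size


def launch_command(model_path, master_ip, port, node_rank,
                   nnodes=4, tp_size=8, pp_size=4, dp_size=1, dist_port=2345):
    """Build the launch_server command line for one node."""
    if not 0 <= node_rank < nnodes:
        raise ValueError(f"node_rank must be in [0, {nnodes}), got {node_rank}")
    args = [
        "python", "-m", "sglang.launch_server",
        "--model-path", model_path,
        "--tp-size", str(tp_size),
        "--pp-size", str(pp_size),
        "--dp-size", str(dp_size),
        "--trust-remote-code",
        "--dist-init-addr", f"{master_ip}:{dist_port}",
        "--port", str(port),
        "--nnodes", str(nnodes),
        "--node-rank", str(node_rank),
    ]
    # shlex.join quotes any argument that contains shell-special characters.
    return shlex.join(args)


# Placeholder defaults; export MODEL_PATH, MASTER_IP and PORT to override them.
model_path = os.environ.get("MODEL_PATH", "/path/to/model")
master_ip = os.environ.get("MASTER_IP", "127.0.0.1")
port = os.environ.get("PORT", "30000")

for rank in range(4):
    print(f"# Node {rank}:")
    print(launch_command(model_path, master_ip, port, rank))
```

Run each printed command on its own node. Every node must use the same `--dist-init-addr`.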

Client:

curl -s http://${MASTER_IP}:${PORT}/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "auto", "messages": [{"role": "user", "content": "What is the capital of France?"}]}'
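The same request can be sent from Python using only the standard library. This is a minimal sketch. It assumes the server returns an OpenAI-style chat completion body, with the reply text under `choices[0].message.content`. The function names and the timeout value are hypothetical.

```python
import json
import urllib.request


def build_chat_request(base_url: str, prompt: str, model: str = "auto") -> urllib.request.Request:
    """Build the same POST request as the curl example above."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(base_url: str, prompt: str, timeout: float = 600.0) -> str:
    """Send one user message and return the reply text.

    Assumption: the response follows the OpenAI chat completion layout.
    """
    request = build_chat_request(base_url, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Usage (requires a running server):
#   print(chat("http://MASTER_IP:PORT", "What is the capital of France?"))
```

A thinking model can take a long time to answer, so the sketch uses a generous timeout. Adjust it to your workload.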
