Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 8 additions & 8 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -230,11 +230,11 @@ curl --location 'http://localhost:3000/v1/chat/completions' --header 'Content-Ty

## Supported Models

| | HuggingFace | Blog |
|:-------------|:----------------------------:|:----------------------------:|
|GPT-OSS |[Link](https://huggingface.co/collections/openai/gpt-oss-68911959590a1634ba11c7a4)|[Link](https://openai.com/index/introducing-gpt-oss/)|
|Qwen3-Next |[Link](https://huggingface.co/collections/Qwen/qwen3-next-68c25fd6838e585db8eeea9d)|[Link](https://qwen.ai/blog?id=4074cca80393150c248e508aa62983f9cb7d27cd&from=research.latest-advancements-list)|
|Qwen3 |[Link](https://huggingface.co/collections/Qwen/qwen3-67dd247413f0e2e4f653967f)|[Link](https://qwenlm.github.io/blog/qwen3/)|
|Qwen2.5 |[Link](https://huggingface.co/collections/Qwen/qwen25-66e81a666513e518adb90d9e)|[Link](https://qwenlm.github.io/blog/qwen2.5/)|
|Llama3 |[Link](https://huggingface.co/meta-llama/collections)|[Link](https://ai.meta.com/blog/meta-llama-3/)|
|Kimi |[Link](https://huggingface.co/collections/moonshotai/kimi-k2-6871243b990f2af5ba60617d)|[Link](https://platform.moonshot.ai/blog)|
| | Provider | HuggingFace Collection | Blog | Description |
|:-------------|:-------------|:----------------------------:|:----------------------------:|:----------------------------|
|gpt-oss | OpenAI | [gpt-oss](https://huggingface.co/collections/openai/gpt-oss-68911959590a1634ba11c7a4) | [Introducing gpt-oss](https://openai.com/index/introducing-gpt-oss/) | "gpt-oss" refers to OpenAI's open-source GPT models, including gpt-oss-20b and gpt-oss-120b. The number (e.g., 20b, 120b) indicates the parameter count (20 billion, 120 billion). |
|Qwen3-Next | Qwen | [Qwen3-Next](https://huggingface.co/collections/Qwen/qwen3-next-68c25fd6838e585db8eeea9d) | [Qwen3-Next: Towards Ultimate Training & Inference Efficiency](https://qwen.ai/blog?id=4074cca80393150c248e508aa62983f9cb7d27cd&from=research.latest-advancements-list) | "Qwen3-Next" is the latest generation of Qwen models by Alibaba/Qwen, with improved efficiency and performance. Includes models like Qwen3-Next-80B-A3B-Instruct (80B parameters, instruction-tuned) and Qwen3-Next-80B-A3B-Thinking (80B, reasoning enhanced). Variants include FP8 quantized and instruction-tuned models. |
|Qwen3 | Qwen | [Qwen3](https://huggingface.co/collections/Qwen/qwen3-67dd247413f0e2e4f653967f) | [Qwen3: Think Deeper, Act Faster](https://qwen.ai/blog?id=1e3fa5c2d4662af2855586055ad037ed9e555125&from=research.research-list) | "Qwen3" is the third generation of Qwen LLMs, available in multiple sizes (e.g., 0.6B, 1.7B, 4B, 8B, 14B, 30B, 32B, 235B). Variants include FP8 quantized and instruction-tuned models. |
|Qwen2.5 | Qwen | [Qwen2.5](https://huggingface.co/collections/Qwen/qwen25-66e81a666513e518adb90d9e) | [Qwen2.5: A Party of Foundation Models!](https://qwen.ai/blog?id=6da44b4d3b48c53f5719bab9cc18b732a7065647&from=research.research-list) | "Qwen2.5" is an earlier generation of Qwen models, with sizes like 0.5B, 1.5B, 3B, 7B, 14B, 32B, 72B. These models are available in base and instruction-tuned versions. |
|Meta Llama 3 | Meta | [Meta Llama 3](https://huggingface.co/collections/meta-llama/meta-llama-3-66214712577ca38149ebb2b6) <br>[Llama 3.1](https://huggingface.co/collections/meta-llama/llama-31-669fc079a0c406a149a5738f) <br>[Llama 3.2](https://huggingface.co/collections/meta-llama/llama-32-66f448ffc8c32f949b04c8cf) <br>[Llama 3.3](https://huggingface.co/collections/meta-llama/llama-33-67531d5c405ec5d08a852000) | [Introducing Meta Llama 3: The most capable openly available LLM to date](https://ai.meta.com/blog/meta-llama-3/) | "Meta Llama 3" is Meta's third-generation Llama model, available in sizes such as 8B and 70B parameters. Includes instruction-tuned and quantized (e.g., FP8) variants. |
|Kimi-K2 | Moonshot AI | [Kimi-K2](https://huggingface.co/collections/moonshotai/kimi-k2-6871243b990f2af5ba60617d) | [Kimi K2: Open Agentic Intelligence](https://moonshotai.github.io/Kimi-K2/) | "Kimi-K2" is Moonshot AI's Kimi-K2 model family, including Kimi-K2-Instruct and Kimi-K2-Instruct-0905. The models are designed for agentic intelligence and available in different versions and parameter sizes. |
2 changes: 1 addition & 1 deletion pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,7 @@ dependencies = [
"protobuf==6.31.1",
"dijkstar==2.6.0",
"huggingface-hub",
"lattica==1.0.1",
"lattica==1.0.2",
]

[project.scripts]
Expand Down
3 changes: 2 additions & 1 deletion src/backend/main.py
Original file line number Diff line number Diff line change
Expand Up @@ -150,8 +150,9 @@ async def serve_index():

model_name = args.model_name
init_nodes_num = args.init_nodes_num
is_local_network = args.is_local_network
if model_name is not None and init_nodes_num is not None:
scheduler_manage.run(model_name, init_nodes_num)
scheduler_manage.run(model_name, init_nodes_num, is_local_network)

port = args.port

Expand Down
2 changes: 1 addition & 1 deletion src/backend/server/scheduler_manage.py
Original file line number Diff line number Diff line change
Expand Up @@ -49,7 +49,7 @@ def __init__(
self.stubs = {}
self.is_local_network = False

def run(self, model_name, init_nodes_num, is_local_network=False):
def run(self, model_name, init_nodes_num, is_local_network=True):
"""
Start the scheduler and the P2P service for RPC handling.
"""
Expand Down
4 changes: 4 additions & 0 deletions src/backend/server/server_args.py
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,10 @@ def parse_args() -> argparse.Namespace:

parser.add_argument("--init-nodes-num", type=int, default=None, help="Number of initial nodes")

parser.add_argument(
"--is-local-network", type=bool, default=True, help="Whether to use local network"
)

args = parser.parse_args()

return args