diff --git a/README.md b/README.md
index 288f9883..4a403378 100644
--- a/README.md
+++ b/README.md
@@ -230,11 +230,11 @@ curl --location 'http://localhost:3000/v1/chat/completions' --header 'Content-Ty
 
 ## Supported Models
 
-| | HuggingFace | Blog |
-|:-------------|:----------------------------:|:----------------------------:|
-|GPT-OSS |[Link](https://huggingface.co/collections/openai/gpt-oss-68911959590a1634ba11c7a4)|[Link](https://openai.com/index/introducing-gpt-oss/)|
-|Qwen3-Next |[Link](https://huggingface.co/collections/Qwen/qwen3-next-68c25fd6838e585db8eeea9d)|[Link](https://qwen.ai/blog?id=4074cca80393150c248e508aa62983f9cb7d27cd&from=research.latest-advancements-list)|
-|Qwen3 |[Link](https://huggingface.co/collections/Qwen/qwen3-67dd247413f0e2e4f653967f)|[Link](https://qwenlm.github.io/blog/qwen3/)|
-|Qwen2.5 |[Link](https://huggingface.co/collections/Qwen/qwen25-66e81a666513e518adb90d9e)|[Link](https://qwenlm.github.io/blog/qwen2.5/)|
-|Llama3 |[Link](https://huggingface.co/meta-llama/collections)|[Link](https://ai.meta.com/blog/meta-llama-3/)|
-|Kimi |[Link](https://huggingface.co/collections/moonshotai/kimi-k2-6871243b990f2af5ba60617d)|[Link](https://platform.moonshot.ai/blog)|
+| | Provider | HuggingFace Collection | Blog | Description |
+|:-------------|:-------------|:----------------------------:|:----------------------------:|:----------------------------|
+|gpt-oss | OpenAI | [gpt-oss](https://huggingface.co/collections/openai/gpt-oss-68911959590a1634ba11c7a4) | [Introducing gpt-oss](https://openai.com/index/introducing-gpt-oss/) | OpenAI's open-weight GPT models, including gpt-oss-20b and gpt-oss-120b; the suffix (20b, 120b) indicates the parameter count (20 billion, 120 billion). |
+|Qwen3-Next | Qwen | [Qwen3-Next](https://huggingface.co/collections/Qwen/qwen3-next-68c25fd6838e585db8eeea9d) | [Qwen3-Next: Towards Ultimate Training & Inference Efficiency](https://qwen.ai/blog?id=4074cca80393150c248e508aa62983f9cb7d27cd&from=research.latest-advancements-list) | The latest generation of Qwen models, with improved efficiency and performance. Includes Qwen3-Next-80B-A3B-Instruct (80B parameters, instruction-tuned) and Qwen3-Next-80B-A3B-Thinking (80B, reasoning-enhanced); variants include FP8-quantized models. |
+|Qwen3 | Qwen | [Qwen3](https://huggingface.co/collections/Qwen/qwen3-67dd247413f0e2e4f653967f) | [Qwen3: Think Deeper, Act Faster](https://qwen.ai/blog?id=1e3fa5c2d4662af2855586055ad037ed9e555125&from=research.research-list) | The third generation of Qwen LLMs, available in multiple sizes (0.6B, 1.7B, 4B, 8B, 14B, 30B, 32B, 235B); variants include FP8-quantized and instruction-tuned models. |
+|Qwen2.5 | Qwen | [Qwen2.5](https://huggingface.co/collections/Qwen/qwen25-66e81a666513e518adb90d9e) | [Qwen2.5: A Party of Foundation Models!](https://qwen.ai/blog?id=6da44b4d3b48c53f5719bab9cc18b732a7065647&from=research.research-list) | An earlier generation of Qwen models in sizes 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B, available in base and instruction-tuned versions. |
+|Meta Llama 3 | Meta | [Meta Llama 3](https://huggingface.co/collections/meta-llama/meta-llama-3-66214712577ca38149ebb2b6)<br>[Llama 3.1](https://huggingface.co/collections/meta-llama/llama-31-669fc079a0c406a149a5738f)<br>[Llama 3.2](https://huggingface.co/collections/meta-llama/llama-32-66f448ffc8c32f949b04c8cf)<br>[Llama 3.3](https://huggingface.co/collections/meta-llama/llama-33-67531d5c405ec5d08a852000) | [Introducing Meta Llama 3: The most capable openly available LLM to date](https://ai.meta.com/blog/meta-llama-3/) | Meta's third-generation Llama family, available in sizes such as 8B and 70B parameters; includes instruction-tuned and quantized (e.g., FP8) variants. |
+|Kimi-K2 | Moonshot AI | [Kimi-K2](https://huggingface.co/collections/moonshotai/kimi-k2-6871243b990f2af5ba60617d) | [Kimi K2: Open Agentic Intelligence](https://moonshotai.github.io/Kimi-K2/) | Moonshot AI's Kimi-K2 model family, including Kimi-K2-Instruct and Kimi-K2-Instruct-0905; designed for agentic intelligence and available in multiple versions and parameter sizes. |
diff --git a/pyproject.toml b/pyproject.toml
index 0fe9a48b..206e8790 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -30,7 +30,7 @@ dependencies = [
     "protobuf==6.31.1",
     "dijkstar==2.6.0",
     "huggingface-hub",
-    "lattica==1.0.1",
+    "lattica==1.0.2",
 ]
 
 [project.scripts]
diff --git a/src/backend/main.py b/src/backend/main.py
index 917f6d52..899df926 100644
--- a/src/backend/main.py
+++ b/src/backend/main.py
@@ -150,8 +150,9 @@ async def serve_index():
 
     model_name = args.model_name
     init_nodes_num = args.init_nodes_num
+    is_local_network = args.is_local_network
     if model_name is not None and init_nodes_num is not None:
-        scheduler_manage.run(model_name, init_nodes_num)
+        scheduler_manage.run(model_name, init_nodes_num, is_local_network)
 
     port = args.port
diff --git a/src/backend/server/scheduler_manage.py b/src/backend/server/scheduler_manage.py
index 29360e67..56907358 100644
--- a/src/backend/server/scheduler_manage.py
+++ b/src/backend/server/scheduler_manage.py
@@ -49,7 +49,7 @@ def __init__(
         self.stubs = {}
         self.is_local_network = False
 
-    def run(self, model_name, init_nodes_num, is_local_network=False):
+    def run(self, model_name, init_nodes_num, is_local_network=True):
         """
         Start the scheduler and the P2P service for RPC handling.
         """
diff --git a/src/backend/server/server_args.py b/src/backend/server/server_args.py
index 5a04b8d7..43faab6c 100644
--- a/src/backend/server/server_args.py
+++ b/src/backend/server/server_args.py
@@ -33,6 +33,10 @@ def parse_args() -> argparse.Namespace:
     parser.add_argument(
         "--init-nodes-num", type=int, default=None, help="Number of initial nodes"
     )
 
+    parser.add_argument(
+        "--is-local-network", action=argparse.BooleanOptionalAction, default=True, help="Whether to use local network"
+    )
+
     args = parser.parse_args()
     return args
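
A note on the new `--is-local-network` option: declaring a boolean flag with argparse's `type=bool` is a well-known trap, because `bool()` applied to any non-empty string (including `"False"`) returns `True`, so the flag could never actually be switched off from the command line. `argparse.BooleanOptionalAction` (Python 3.9+) avoids this by generating paired `--is-local-network/--no-is-local-network` switches that store a real boolean. A minimal sketch of the difference (flag name matches the diff; the parsers here are standalone illustrations):

```python
import argparse

# Pitfall: type=bool calls bool() on the raw argument string, and every
# non-empty string is truthy -- so "--is-local-network False" still yields True.
buggy = argparse.ArgumentParser()
buggy.add_argument("--is-local-network", type=bool, default=True)
print(buggy.parse_args(["--is-local-network", "False"]).is_local_network)  # True

# Safer: BooleanOptionalAction generates --flag/--no-flag switch pairs.
fixed = argparse.ArgumentParser()
fixed.add_argument(
    "--is-local-network", action=argparse.BooleanOptionalAction, default=True
)
print(fixed.parse_args([]).is_local_network)                         # True (default)
print(fixed.parse_args(["--no-is-local-network"]).is_local_network)  # False
```

Since the scheduler now defaults `is_local_network` to `True`, the `--no-` form is the only way callers opt out, which is worth mentioning in the help text.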