Merged
4 changes: 3 additions & 1 deletion Dockerfile
@@ -1,4 +1,4 @@
-FROM nvidia/cuda:12.4.0-runtime-ubuntu20.04 as base
+FROM nvcr.io/nvidia/tritonserver:24.04-py3-min as base
 ARG PYTORCH_VERSION=2.5.1
 ARG PYTHON_VERSION=3.9
 ARG CUDA_VERSION=12.4
@@ -38,5 +38,7 @@ WORKDIR /root
 COPY ./requirements.txt /lightllm/requirements.txt
 RUN pip install -r /lightllm/requirements.txt --no-cache-dir --ignore-installed --extra-index-url https://download.pytorch.org/whl/cu124

+RUN pip install --no-cache-dir nvidia-nccl-cu12==2.25.1 # for allreduce hang issues in multinode H100
+
 COPY . /lightllm
 RUN pip install -e /lightllm --no-cache-dir
1 change: 1 addition & 0 deletions docs/CN/source/getting_started/quickstart.rst
@@ -68,6 +68,7 @@
 To deploy the DeepSeek-R1 model on two H100 machines, launch with the following command:

 .. code-block:: console
+
     $ # Node 0
     $ LOADWORKER=8 python -m lightllm.server.api_server --model_dir ~/models/DeepSeek-R1 --tp 16 --graph_max_batch_size 100 --nccl_host master_addr --nnodes 2 --node_rank 0
     $ # Node 1
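The blank line added in the hunk above is significant in reStructuredText: a directive's indented body is only recognized when a blank line separates it from the directive line, so without it the console commands were swallowed into the directive instead of rendered as a code block. The corrected structure looks like:

```rst
.. code-block:: console

    $ # indented body, preceded by a blank line, rendered as a console session
```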
1 change: 1 addition & 0 deletions docs/EN/source/getting_started/quickstart.rst
@@ -65,6 +65,7 @@ For the DeepSeek-R1 model on single H200, it can be launched with the following
 For the DeepSeek-R1 model on two H100, it can be launched with the following command:

 .. code-block:: console
+
     $ # Node 0
     $ LOADWORKER=8 python -m lightllm.server.api_server --model_dir ~/models/DeepSeek-R1 --tp 16 --graph_max_batch_size 100 --nccl_host master_addr --nnodes 2 --node_rank 0
     $ # Node 1
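The launch command above fixes the parallel layout: `--tp 16` tensor-parallel ranks split across `--nnodes 2` machines, so each node must contribute an equal share of GPUs. A hypothetical sanity check (the helper name is illustrative, not part of lightllm):

```python
# Sketch: tp must divide evenly across nodes, or the multinode launch
# cannot assign the same number of ranks to each machine.
def gpus_per_node(tp: int, nnodes: int) -> int:
    """Return how many tensor-parallel ranks land on each node."""
    assert tp % nnodes == 0, "tp must be divisible by nnodes"
    return tp // nnodes

print(gpus_per_node(16, 2))  # → 8 ranks (one H100 GPU each) per node
```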
2 changes: 1 addition & 1 deletion lightllm/server/api_http.py
@@ -326,7 +326,7 @@ async def tokens(request: Request):
     try:
         request_dict = await request.json()
         prompt = request_dict.pop("text")
-        parameters = request_dict.pop("parameters")
+        parameters = request_dict.pop("parameters", {})
         return JSONResponse({"ntokens": g_objs.httpserver_manager.tokens(prompt, parameters)}, status_code=200)
     except Exception as e:
         return create_error_response(HTTPStatus.EXPECTATION_FAILED, f"error: {str(e)}")
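The one-character-family change above makes the `parameters` field optional in the `/tokens` request body by giving `dict.pop` a default. A minimal sketch of the behavior difference (the sample request body is hypothetical):

```python
# Old code popped "parameters" unconditionally, so a request body without
# that key raised KeyError, which the handler turned into EXPECTATION_FAILED.
request_dict = {"text": "hello"}  # hypothetical body with no "parameters"

old_raised = False
try:
    dict(request_dict).pop("parameters")  # old behavior: KeyError
except KeyError:
    old_raised = True

parameters = request_dict.pop("parameters", {})  # new behavior: falls back to {}
print(old_raised, parameters)  # → True {}
```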
2 changes: 1 addition & 1 deletion setup.py
@@ -3,7 +3,7 @@
 package_data = {"lightllm": ["common/all_kernel_configs/*/*.json"]}
 setup(
     name="lightllm",
-    version="1.0.0",
+    version="1.0.1",
     packages=find_packages(exclude=("build", "include", "test", "dist", "docs", "benchmarks", "lightllm.egg-info")),
     author="model toolchain",
     author_email="",