README.md: 7 additions & 7 deletions
````diff
@@ -120,22 +120,22 @@ Supported models list:
 First, download the image we provide:
 ```bash
 # A2 x86
-docker pull xllm/xllm-ai:xllm-0.7.1-dev-hb-rc2-x86
+docker pull xllm/xllm-ai:xllm-dev-hb-rc2-x86
 # A2 arm
-docker pull xllm/xllm-ai:xllm-0.7.1-dev-hb-rc2-arm
+docker pull xllm/xllm-ai:xllm-dev-hb-rc2-arm
 # A3 arm
-docker pull xllm/xllm-ai:xllm-0.7.1-dev-hc-rc2-arm
+docker pull xllm/xllm-ai:xllm-dev-hc-rc2-arm
 # or
 # A2 x86
-docker pull quay.io/jd_xllm/xllm-ai:xllm-0.7.1-dev-hb-rc2-x86
+docker pull quay.io/jd_xllm/xllm-ai:xllm-dev-hb-rc2-x86
 # A2 arm
-docker pull quay.io/jd_xllm/xllm-ai:xllm-0.7.1-dev-hb-rc2-arm
+docker pull quay.io/jd_xllm/xllm-ai:xllm-dev-hb-rc2-arm
 # A3 arm
-docker pull quay.io/jd_xllm/xllm-ai:xllm-0.7.1-dev-hc-rc2-arm
+docker pull quay.io/jd_xllm/xllm-ai:xllm-dev-hc-rc2-arm
 ```
 Then create the corresponding container:
 ```bash
-sudo docker run -it --ipc=host -u 0 --privileged --name mydocker --network=host --device=/dev/davinci0 --device=/dev/davinci_manager --device=/dev/devmm_svm --device=/dev/hisi_hdc -v /var/queue_schedule:/var/queue_schedule -v /usr/local/Ascend/driver:/usr/local/Ascend/driver -v /usr/local/Ascend/add-ons/:/usr/local/Ascend/add-ons/ -v /usr/local/sbin/npu-smi:/usr/local/sbin/npu-smi -v /usr/local/sbin/:/usr/local/sbin/ -v /var/log/npu/conf/slog/slog.conf:/var/log/npu/conf/slog/slog.conf -v /var/log/npu/slog/:/var/log/npu/slog -v /export/home:/export/home -w /export/home -v ~/.ssh:/root/.ssh -v /var/log/npu/profiling/:/var/log/npu/profiling -v /var/log/npu/dump/:/var/log/npu/dump -v /home/:/home/ -v /runtime/:/runtime/ -v /etc/hccn.conf:/etc/hccn.conf xllm/xllm-ai:xllm-0.7.1-dev-hb-rc2-x86
+sudo docker run -it --ipc=host -u 0 --privileged --name mydocker --network=host --device=/dev/davinci0 --device=/dev/davinci_manager --device=/dev/devmm_svm --device=/dev/hisi_hdc -v /var/queue_schedule:/var/queue_schedule -v /usr/local/Ascend/driver:/usr/local/Ascend/driver -v /usr/local/Ascend/add-ons/:/usr/local/Ascend/add-ons/ -v /usr/local/sbin/npu-smi:/usr/local/sbin/npu-smi -v /usr/local/sbin/:/usr/local/sbin/ -v /var/log/npu/conf/slog/slog.conf:/var/log/npu/conf/slog/slog.conf -v /var/log/npu/slog/:/var/log/npu/slog -v /export/home:/export/home -w /export/home -v ~/.ssh:/root/.ssh -v /var/log/npu/profiling/:/var/log/npu/profiling -v /var/log/npu/dump/:/var/log/npu/dump -v /home/:/home/ -v /runtime/:/runtime/ -v /etc/hccn.conf:/etc/hccn.conf xllm/xllm-ai:xllm-dev-hb-rc2-x86
 ```
 
 Install official repo and submodules:
````
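All six retagged names appear to follow one pattern: a hardware code (A2 tags use `hb`, A3 tags use `hc`) plus the CPU architecture. A minimal sketch of selecting and pulling the right image, assuming the three hardware/arch combinations listed in the README are the only ones published; the `choose_tag` helper and script name are illustrative, not part of the repo:

```bash
#!/usr/bin/env bash
# Illustrative helper (not part of the repo): map the hardware/arch pairs
# listed in the README to the retagged image names and pull the match.
set -euo pipefail

choose_tag() {
  case "$1/$2" in
    A2/x86) echo "xllm-dev-hb-rc2-x86" ;;
    A2/arm) echo "xllm-dev-hb-rc2-arm" ;;
    A3/arm) echo "xllm-dev-hc-rc2-arm" ;;
    *) echo "unsupported hardware/arch: $1/$2" >&2; return 1 ;;
  esac
}

TAG="$(choose_tag "${1:-A2}" "${2:-x86}")"
docker pull "xllm/xllm-ai:${TAG}"              # Docker Hub mirror
# docker pull "quay.io/jd_xllm/xllm-ai:${TAG}" # or the quay.io mirror
```

Saved as, say, `pull_image.sh`, it would be invoked as `bash pull_image.sh A3 arm`.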
README_zh.md: 7 additions & 7 deletions
````diff
@@ -116,22 +116,22 @@ xLLM 提供了强大的智能计算能力,通过硬件系统的算力优化与
 首先下载我们提供的镜像:
 ```bash
 # A2 x86
-docker pull quay.io/jd_xllm/xllm-ai:xllm-0.7.1-dev-hb-rc2-x86
+docker pull quay.io/jd_xllm/xllm-ai:xllm-dev-hb-rc2-x86
 # A2 arm
-docker pull quay.io/jd_xllm/xllm-ai:xllm-0.7.1-dev-hb-rc2-arm
+docker pull quay.io/jd_xllm/xllm-ai:xllm-dev-hb-rc2-arm
 # A3 arm
-docker pull quay.io/jd_xllm/xllm-ai:xllm-0.7.1-dev-hc-rc2-arm
+docker pull quay.io/jd_xllm/xllm-ai:xllm-dev-hc-rc2-arm
 # 或者
 # A2 x86
-docker pull xllm/xllm-ai:xllm-0.7.1-dev-hb-rc2-x86
+docker pull xllm/xllm-ai:xllm-dev-hb-rc2-x86
 # A2 arm
-docker pull xllm/xllm-ai:xllm-0.7.1-dev-hb-rc2-arm
+docker pull xllm/xllm-ai:xllm-dev-hb-rc2-arm
 # A3 arm
-docker pull xllm/xllm-ai:xllm-0.7.1-dev-hc-rc2-arm
+docker pull xllm/xllm-ai:xllm-dev-hc-rc2-arm
 ```
 然后创建对应的容器
 ```bash
-sudo docker run -it --ipc=host -u 0 --privileged --name mydocker --network=host --device=/dev/davinci0 --device=/dev/davinci_manager --device=/dev/devmm_svm --device=/dev/hisi_hdc -v /var/queue_schedule:/var/queue_schedule -v /usr/local/Ascend/driver:/usr/local/Ascend/driver -v /usr/local/Ascend/add-ons/:/usr/local/Ascend/add-ons/ -v /usr/local/sbin/npu-smi:/usr/local/sbin/npu-smi -v /usr/local/sbin/:/usr/local/sbin/ -v /var/log/npu/conf/slog/slog.conf:/var/log/npu/conf/slog/slog.conf -v /var/log/npu/slog/:/var/log/npu/slog -v /export/home:/export/home -w /export/home -v ~/.ssh:/root/.ssh -v /var/log/npu/profiling/:/var/log/npu/profiling -v /var/log/npu/dump/:/var/log/npu/dump -v /home/:/home/ -v /runtime/:/runtime/ -v /etc/hccn.conf:/etc/hccn.conf quay.io/jd_xllm/xllm-ai:xllm-0.7.1-dev-hb-rc2-x86
+sudo docker run -it --ipc=host -u 0 --privileged --name mydocker --network=host --device=/dev/davinci0 --device=/dev/davinci_manager --device=/dev/devmm_svm --device=/dev/hisi_hdc -v /var/queue_schedule:/var/queue_schedule -v /usr/local/Ascend/driver:/usr/local/Ascend/driver -v /usr/local/Ascend/add-ons/:/usr/local/Ascend/add-ons/ -v /usr/local/sbin/npu-smi:/usr/local/sbin/npu-smi -v /usr/local/sbin/:/usr/local/sbin/ -v /var/log/npu/conf/slog/slog.conf:/var/log/npu/conf/slog/slog.conf -v /var/log/npu/slog/:/var/log/npu/slog -v /export/home:/export/home -w /export/home -v ~/.ssh:/root/.ssh -v /var/log/npu/profiling/:/var/log/npu/profiling -v /var/log/npu/dump/:/var/log/npu/dump -v /home/:/home/ -v /runtime/:/runtime/ -v /etc/hccn.conf:/etc/hccn.conf xllm/xllm-ai:xllm-dev-hb-rc2-x86
 ```
 
 下载官方仓库与模块依赖:
````
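Both READMEs shorten the image tag inside the same long single-line `docker run` command. For readability, here is the updated invocation from the English README reflowed with line continuations; the flags, devices, and mounts are exactly those in the diffs above:

```bash
# The updated README command, reflowed for readability; identical behavior.
sudo docker run -it --ipc=host -u 0 --privileged \
  --name mydocker --network=host \
  --device=/dev/davinci0 --device=/dev/davinci_manager \
  --device=/dev/devmm_svm --device=/dev/hisi_hdc \
  -v /var/queue_schedule:/var/queue_schedule \
  -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
  -v /usr/local/Ascend/add-ons/:/usr/local/Ascend/add-ons/ \
  -v /usr/local/sbin/npu-smi:/usr/local/sbin/npu-smi \
  -v /usr/local/sbin/:/usr/local/sbin/ \
  -v /var/log/npu/conf/slog/slog.conf:/var/log/npu/conf/slog/slog.conf \
  -v /var/log/npu/slog/:/var/log/npu/slog \
  -v /export/home:/export/home -w /export/home \
  -v ~/.ssh:/root/.ssh \
  -v /var/log/npu/profiling/:/var/log/npu/profiling \
  -v /var/log/npu/dump/:/var/log/npu/dump \
  -v /home/:/home/ -v /runtime/:/runtime/ \
  -v /etc/hccn.conf:/etc/hccn.conf \
  xllm/xllm-ai:xllm-dev-hb-rc2-x86
```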
RELEASE.md: 0 additions & 27 deletions
```diff
@@ -1,30 +1,3 @@
-# Release xllm 0.7.1
-
-## **Major Features and Improvements**
-
-### Model Support
-
-- Support GLM-4.5-Air.
-- Support Qwen3-VL-Moe.
-
-### Feature
-
-- Support scheduler overlap when enable chunked prefill and MTP.
-- Enable multi-process mode when running VLM model.
-- Support AclGraph for GLM-4.5.
-
-### Bugfix
-
-- Reslove core dump of qwen embedding 0.6B.
-- Resolve duplicate content in multi-turn tool call conversations.
-- Support sampler parameters for MTP.
-- Enable MTP and schedule overlap to work simultaneously.
-- Resolve google.protobuf.Struct parsing failures which broke tool_call and think toggle functionality.
-- Fix the precision issue in the Qwen2 model caused by model_type is not be assigned.
-- Fix core dump of GLM 4.5 when enable MTP.
-- Temporarily use heap allocation for VLM backend.
-- Reslove core dump of stream chat completion request for VLM.
 # Release xllm 0.7.0
 
 ## **Major Features and Improvements**
```
cibuild/build_npu.sh: 1 addition & 1 deletion
```diff
@@ -6,7 +6,7 @@ function error() {
 exit 1
 }
 
-IMAGE="quay.io/jd_xllm/xllm-ai:xllm-0.7.1-dev-hb-rc2-x86"
+IMAGE="quay.io/jd_xllm/xllm-ai:xllm-dev-hb-rc2-x86"
 
 RUN_OPTS=(
 --rm
```
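The CI script pins the image in a single `IMAGE` variable, which is why the retag touches only one line here. A hypothetical variation, not part of this PR, that would let CI override the pin through the environment while keeping the same default:

```bash
# Hypothetical (not in this PR): default to the pinned tag, but allow an
# override, e.g.  XLLM_IMAGE=quay.io/jd_xllm/xllm-ai:some-other-tag ./cibuild/build_npu.sh
IMAGE="${XLLM_IMAGE:-quay.io/jd_xllm/xllm-ai:xllm-dev-hb-rc2-x86}"
```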
version.txt: 1 addition & 1 deletion
```diff
@@ -1 +1 @@
-0.7.1
+0.7.0
```