4 changes: 2 additions & 2 deletions vllm/KNOWN_ISSUES.md
@@ -12,9 +12,9 @@ Workaround: Change the PCIe slot configuration in BIOS from Auto/x16 to x8/x8.
With this change, over 40 GB/s bi-directional P2P bandwidth can be achieved.
Root cause analysis is still in progress.

# 03. Container OOM killed by using `--enable-auto-tool-choice` and starting container not by /bin/bash and not run `source /opt/intel/oneapi/setvars.sh`
# 03. Container OOM killed (and vLLM performance drop) when the container is started without /bin/bash and without running `source /opt/intel/oneapi/setvars.sh`

When using `--enable-auto-tool-choice` and deploy container by docker-compose without `source /opt/intel/oneapi/setvars.sh`, the LD_LIBRARY_PATH will be different and cause the container OOM. It can be reproduced by this two command:
When using `--enable-auto-tool-choice` and deploying the container via docker-compose without running `source /opt/intel/oneapi/setvars.sh`, LD_LIBRARY_PATH ends up different, which causes the container to be OOM-killed (or its performance to drop). This can be reproduced with these two commands:

```bash
docker run --rm --entrypoint "/bin/bash" --name=test intel/llm-scaler-vllm:latest -c env | grep LD_LIBRARY_PATH
```
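
For comparison, the library path is set up correctly when the oneAPI environment script is sourced inside a `/bin/bash` entrypoint. A minimal sketch of that check, reusing the image name above:

```bash
# Source the oneAPI environment before inspecting LD_LIBRARY_PATH;
# the output should now include the oneAPI library directories.
docker run --rm --entrypoint "/bin/bash" --name=test intel/llm-scaler-vllm:latest \
  -c "source /opt/intel/oneapi/setvars.sh && env" | grep LD_LIBRARY_PATH
```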
2 changes: 2 additions & 0 deletions vllm/Miner-U/README.md
@@ -53,3 +53,5 @@ mineru-gradio --server-name 0.0.0.0 --server-port 7860
```

Refer to [here](https://opendatalab.github.io/MinerU/zh/usage/quick_usage/#_2) for more details.

### See [here](https://github.com/intel/llm-scaler/tree/main/vllm#243-mineru-26-support) for the new mineru-vllm 2.6.1, which brings performance improvements.
50 changes: 36 additions & 14 deletions vllm/README.md
@@ -2177,6 +2177,8 @@ curl http://localhost:8000/v1/chat/completions \
"max_tokens": 128
}'
```

If you want to process an image stored locally on the server, you can test with `"url": "file:/llm/models/test/1.jpg"`.
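
A minimal sketch of such a request; the model name `your-served-model-name` is a placeholder, and depending on your vLLM version the server may need to be launched with `--allowed-local-media-path` for `file:` URLs to be accepted:

```bash
# Chat completion whose image is read from the server's local filesystem.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-served-model-name",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "image_url", "image_url": {"url": "file:/llm/models/test/1.jpg"}},
        {"type": "text", "text": "Describe this image."}
      ]
    }],
    "max_tokens": 128
  }'
```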
---

### 2.4.1 Audio Model Support [Deprecated]
@@ -2276,16 +2278,9 @@ TORCH_LLM_ALLREDUCE=1 VLLM_USE_V1=1 CCL_ZE_IPC_EXCHANGE=pidfd VLLM_ALLOW_LONG_M

---

### 2.4.3 MinerU 2.5 Support

This guide shows how to launch the MinerU 2.5 model using the vLLM inference backend.

#### Install MinerU Core
### 2.4.3 MinerU 2.6 Support

First, install the core MinerU package:
```bash
pip install mineru[core]
```
This guide shows how to launch the MinerU 2.6 model using the vLLM inference backend.

#### Start the MinerU Service

@@ -2305,7 +2300,10 @@ python3 -m vllm.entrypoints.openai.api_server \
--trust-remote-code \
--gpu-memory-util 0.85 \
--no-enable-prefix-caching \
--max-num-batched-tokens=32768 \
--max-model-len=32768 \
--block-size 64 \
--max-num-seqs 256 \
--served-model-name MinerU \
--tensor-parallel-size 1 \
--pipeline-parallel-size 1 \
@@ -2318,14 +2316,38 @@ python3 -m vllm.entrypoints.openai.api_server \



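Once the server is up, a quick sanity check (a minimal sketch, assuming the default port 8000 used throughout this section):

```bash
# Lists the served models; the response should include "MinerU".
curl http://localhost:8000/v1/models
```
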
#### Run the demo
To verify your setup, clone the official MinerU repository and run the demo script:
#### How to use MinerU
1. Verify MinerU from the command line:

```bash
# mineru -p <input_path> -o <output_path> -b vlm-http-client -u http://127.0.0.1:8000
mineru -p /llm/MinerU/demo/pdfs/small_ocr.pdf -o ./ -b vlm-http-client -u http://127.0.0.1:8000
```

2. Use MinerU through Gradio:

```bash
git clone https://github.com/opendatalab/MinerU.git
cd MinerU/demo
python3 demo.py
mineru-gradio --server-name 0.0.0.0 --server-port 8002
```

You can also drive the Gradio app programmatically with `gradio_client`:

```python
from gradio_client import Client, handle_file

# Connect to the mineru-gradio app started above
client = Client("http://localhost:8002/")
result = client.predict(
    file_path=handle_file('/llm/MinerU/demo/pdfs/small_ocr.pdf'),
    end_pages=500,                # process at most this many pages
    is_ocr=False,
    formula_enable=True,
    table_enable=True,
    language="ch",
    backend="vlm-http-client",    # delegate inference to the vLLM server
    url="http://localhost:8000",  # the vLLM endpoint started earlier
    api_name="/to_markdown"
)
print(result)
```
For more details, refer to Gradio's [API guide](http://your_ip:8002/?view=api) exposed by the running app.

---

10 changes: 5 additions & 5 deletions vllm/docker/Dockerfile
@@ -57,12 +57,12 @@ RUN git clone -b v0.10.2 https://github.com/vllm-project/vllm.git && \
python3 setup.py install

# Clone + patch miner-U
RUN git clone https://github.com/opendatalab/MinerU.git && \
RUN git clone -b release-2.6.2 https://github.com/opendatalab/MinerU.git && \
cd MinerU && \
git checkout de41fa58590263e43b783fe224b6d07cae290a33 && \
git apply /tmp/miner-u.patch && \
pip install -e .[core] && \
sed -i 's/select_device(self.args.device, verbose=verbose)/torch.device(self.args.device)/' /usr/local/lib/python3.12/dist-packages/ultralytics/engine/predictor.py
pip install -e .[core] --no-deps && \
pip install mineru_vl_utils==0.1.14 gradio gradio-client gradio-pdf && \
sed -i 's/kwargs.get("max_concurrency", 100)/kwargs.get("max_concurrency", 200)/' /llm/MinerU/mineru/backend/vlm/vlm_analyze.py && \
sed -i 's/kwargs.get("http_timeout", 600)/kwargs.get("http_timeout", 1200)/' /llm/MinerU/mineru/backend/vlm/vlm_analyze.py


# ======= Add oneCCL build =======