
LmDeploy InternVL3.5 reason error when input multiple images #1198

@zhongjiaru

Description


Dear InternVL3.5 Team,

Thanks for your great work. I am trying to use LMDeploy to deploy InternVL3.5-38B and feed it sampled video frames so that the VLM can answer questions about the video.
When I increase the number of video frames to 13, I encounter this error:
2025-09-29 20:22:46,084 - lmdeploy - INFO - async_engine.py:732 - session=7, history_tokens=0, input_tokens=43382, max_new_tokens=700, seq_start=True, seq_end=True, step=0, prep=True
2025-09-29 20:23:41,280 - lmdeploy - ERROR - async_engine.py:874 - session 7 finished, reason "error"
2025-09-29 20:23:41,282 - lmdeploy - INFO - request.py:297 - Receive END_SESSION Request: 1

And the command I use to deploy the VLM is as follows:
lmdeploy serve api_server OpenGVLab/InternVL3_5-38B-HF --server-port 23333 --tp 2 --backend pytorch --log-level INFO
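Given that the failing request has input_tokens=43382, I wonder whether the default session length is simply too small for 13 frames. If so (this is only a guess on my part, not confirmed), starting the server with an explicit --session-len might be worth testing:

```shell
# Same deployment command, but with the context window raised explicitly.
# 65536 is an arbitrary value chosen to comfortably fit the ~43k-token request.
lmdeploy serve api_server OpenGVLab/InternVL3_5-38B-HF \
    --server-port 23333 --tp 2 --backend pytorch --log-level INFO \
    --session-len 65536
```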

The test code is as follows:

from openai import OpenAI
from PIL import Image
from io import BytesIO
import base64
import os
import json
from lmdeploy.vl.constants import IMAGE_TOKEN

LMDEPLOY_API_URL = "http://localhost:23333/v1"
ROOT_PATH = './'
MODEL_NAME = "OpenGVLab/InternVL3_5-38B-HF"

def local_image_to_base64(root_path, image_paths: list, target_size=(480, 360)):
    """Encode local images as JPEG data URLs; returns (urls, size of last image)."""
    if len(image_paths) == 0:
        return [], None
    base64_list = []
    size = None
    for path in image_paths:
        path = os.path.join(root_path, path)
        # convert() returns a new image, so close the file handle inside the with-block
        with Image.open(path) as im:
            img = im.convert("RGB")
        img.thumbnail(target_size)  # downscale in place, preserving aspect ratio
        buffer = BytesIO()
        img.save(buffer, format="JPEG", quality=95)
        base64_str = base64.b64encode(buffer.getvalue()).decode("utf-8")
        size = (img.width, img.height)
        base64_list.append(f"data:image/jpeg;base64,{base64_str}")

    return base64_list, size


client = OpenAI(
    base_url=LMDEPLOY_API_URL,
    api_key="dummy_key",
)
input_images = [
    '001.jpg', '005.jpg', '010.jpg', '014.jpg', '020.jpg', '028.jpg', '034.jpg', '040.jpg', '043.jpg', '046.jpg', '050.jpg', '054.jpg', '060.jpg'
]

imgs_base64, _ = local_image_to_base64(ROOT_PATH, input_images)

question = ''
for i in range(len(imgs_base64)):
    question = question + f'Frame{i+1}: {IMAGE_TOKEN}\n'

question += 'Describe the camera motion in detail. And find which frames contain a piano.'
content = [{'type': 'text', 'text': question}]
for img in imgs_base64:
    content.append(
        {
            "type": "image_url",
            "image_url": {'max_dynamic_patch': 1, "url": img}
        }
    )
message = [
    {
        'role': 'user',
        'content': content
    }
]

response = client.chat.completions.create(
    model=MODEL_NAME,
    messages=message,
    max_tokens=700,
    temperature=0.6,
    stream=False
)
print(response.choices[0].message.content.strip())
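For completeness, the data-URL format the script relies on can be checked standalone with only the standard library (the JPEG bytes below are a stand-in, not a real image):

```python
import base64

def to_data_url(jpeg_bytes: bytes) -> str:
    # Wrap raw JPEG bytes in the data-URL form that OpenAI-style APIs accept.
    return "data:image/jpeg;base64," + base64.b64encode(jpeg_bytes).decode("utf-8")

fake_jpeg = b"\xff\xd8\xff\xe0" + b"\x00" * 16  # JPEG SOI/APP0 marker stub only
url = to_data_url(fake_jpeg)
print(url[:23])  # data:image/jpeg;base64,
```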

By the way, I find that the answer about which frames contain a piano is always wrong. It seems the model cannot ground a target in a specific frame:
The camera starts by focusing on a wooden door with glass panels, slowly moving forward to reveal an ornate room. As the camera progresses, it pans slightly to the right, showcasing blue upholstered furniture and paintings on the walls. The camera continues to move forward into another room, revealing more of the interior, including a long dining table set with chairs. The ceiling has visible damage, indicating possible neglect or disrepair. In the first frame, there is a piano on the right side of the doorway. The camera angle shifts to provide a broader view of the room as it moves deeper into the space.

I would like to know if this is normal. Have I overlooked anything or made any mistakes? Could you please help me with this? Thank you very much.

images.zip
