Add LMM Probe Results reproduction guide #79
Conversation
…folder Co-authored-by: anxiangsir <31175974+anxiangsir@users.noreply.github.com>
Pull request overview
This PR adds reproduction instructions for the LMM Probe Results benchmarks that were previously displayed without guidance on how to reproduce them. The instructions guide users to use the llava_next folder for Stage-2 fine-tuning with a mixed dataset of 740K LLaVA-OneVision and 800K LLaVA-Video samples.
Changes:
- Added a new "Reproducing LMM Probe Results" subsection with step-by-step instructions
- Included Docker setup, data preparation, training, and evaluation commands
- Cross-referenced the llava_next README for detailed documentation
4. **Run Stage-2 fine-tuning:**

   ```bash
   # Configure the training script with your data paths
   bash scripts/sft_ov_encoder.sh
   ```
Step 4 runs Stage-2 fine-tuning, but the training script (scripts/sft_ov_encoder.sh line 30) requires a pretrained projector checkpoint that is not mentioned in these reproduction instructions. Users will need to either:
- run Stage-1 pretraining first to generate the mm_projector.bin file, or
- download a pretrained projector checkpoint.
Consider adding a note about this prerequisite, or a Stage-1 pretraining step before Stage-2 fine-tuning.
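One way to make the prerequisite explicit is a small guard before step 4. The sketch below is illustrative only: the helper name, default checkpoint path, and messages are assumptions, not taken from the repo, and should be adjusted to match what sft_ov_encoder.sh actually expects.

```shell
# Hedged sketch: fail fast if the Stage-1 projector checkpoint is missing.
# The default path below is an assumption; adjust it to the path used in
# scripts/sft_ov_encoder.sh.
check_projector() {
  ckpt="${1:-./checkpoints/projectors/mm_projector.bin}"
  if [ -f "$ckpt" ]; then
    echo "found projector: $ckpt"
  else
    echo "missing projector: $ckpt" >&2
    echo "run Stage-1 pretraining or download a pretrained checkpoint first" >&2
    return 1
  fi
}
```

Usage would then be something like `check_projector "$PROJECTOR_PATH" && bash scripts/sft_ov_encoder.sh`, so the run aborts with a clear message instead of failing mid-training.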
    -v "$(pwd)":/workspace/OV-Encoder-Llava \
    -w /workspace/OV-Encoder-Llava \
    ov_encoder_llava:26.01 bash
The Docker command here is simplified relative to the official llava_next/README.md and may cause issues with the DeepSpeed-based Stage-2 fine-tuning in step 4. The training script uses DeepSpeed for distributed training, which typically needs additional Docker flags for proper IPC and memory management.
Consider using the complete Docker command from llava_next/README.md (lines 34-40), which includes:
- `--cap-add IPC_LOCK` for memory locking
- `--ulimit memlock=-1 --ulimit stack=67108864` for memory limits
- `bash -c "service ssh restart; bash"` for multi-node training support
Alternatively, add a note that this simplified command is suitable for evaluation only, and users should refer to llava_next/README.md for the complete setup when running training.
Suggested change: replace

    -v "$(pwd)":/workspace/OV-Encoder-Llava \
    -w /workspace/OV-Encoder-Llava \
    ov_encoder_llava:26.01 bash

with

    --cap-add IPC_LOCK \
    --ulimit memlock=-1 --ulimit stack=67108864 \
    -v "$(pwd)":/workspace/OV-Encoder-Llava \
    -w /workspace/OV-Encoder-Llava \
    ov_encoder_llava:26.01 bash -c "service ssh restart; bash"
```bash
# Using Docker (recommended)
docker build -t ov_encoder_llava:26.01 .
docker run -it --gpus all --ipc host --net host --privileged \
```
The recommended docker run command uses `--privileged` and `--net host`, which effectively disable container isolation and grant the container near-complete control over the host system and network stack. If the image or any of its dependencies is ever compromised (e.g., via a supply-chain attack or malicious contribution), an attacker inside this container could escape to the host; access host files, services, and credentials; and fully compromise the machine. Please drop `--privileged` and `--net host` here and instead document the minimal set of flags required for GPU access and networking, relying on default isolation wherever possible.
Suggested change: replace

    docker run -it --gpus all --ipc host --net host --privileged \

with

    docker run -it --gpus all --ipc host \
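As a hedged illustration of this advice (the helper name, flag list, and message wording below are mine, not from the PR), a docs check or CI script could flag the isolation-weakening options mechanically:

```shell
# Illustrative helper: warn about docker-run flags that weaken container
# isolation. The flag list here is a small example set, not exhaustive.
audit_docker_cmd() {
  cmd=" $1 "
  for flag in '--privileged' '--net host' '--pid host'; do
    case "$cmd" in
      *" $flag "*) echo "warning: $flag weakens container isolation" ;;
    esac
  done
}

# The command quoted in the review trips two of the checks:
audit_docker_cmd 'docker run -it --gpus all --ipc host --net host --privileged ov_encoder_llava:26.01 bash'
```

Running this on the reviewed command prints warnings for `--privileged` and `--net host`, while the suggested replacement passes clean.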
The LMM Probe Results section displayed benchmark results but lacked reproduction instructions. Users needed clear guidance on using the `llava_next` folder for Stage-2 fine-tuning with the 740K + 800K sample mixed dataset.
Changes
Added reproduction instructions pointing to the `llava_next` folder. The instructions follow the existing collapsible section pattern for consistency.