From 213bd42b1f154c4bfa92fe21e6b25a544054b832 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Wed, 24 Dec 2025 15:31:43 +0000
Subject: [PATCH 1/2] Initial plan

From d94b502cfb071a643ee9f8e3aded5c05944a8d99 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Wed, 24 Dec 2025 15:33:51 +0000
Subject: [PATCH 2/2] Remove Input Method Comparison section and change Case Demonstrations to two rows layout

Co-authored-by: anxiangsir <31175974+anxiangsir@users.noreply.github.com>
---
 README.md | 24 ++++--------------------
 1 file changed, 4 insertions(+), 20 deletions(-)

diff --git a/README.md b/README.md
index ca911990..a368c750 100644
--- a/README.md
+++ b/README.md
@@ -32,24 +32,6 @@ OneVision Encoder is a vision encoder designed for multimodal large language mod
OneVision Encoder Method Overview

-### Input Method Comparison - - - - - - - -
Frame Sampling Input vs Codec Input
- Animated demonstration of traditional uniform frame sampling method for video processing
- Frame Sampling Input
- Traditional uniform frame sampling approach -
- Animated demonstration of efficient codec-based input decomposition with I-frames and P-frames
- Codec Input
- Our efficient codec-based input decomposition -
-
 ### Cluster Discrimination Visualization

@@ -61,11 +43,13 @@ OneVision Encoder is a vision encoder designed for multimodal large language mod
+
+
- Case 4 Demonstration
+ Case 4 Demonstration
Case 4
- Case 6 Demonstration
+ Case 6 Demonstration
Case 6