diff --git a/README.md b/README.md
index ca911990..a368c750 100644
--- a/README.md
+++ b/README.md
@@ -32,24 +32,6 @@ OneVision Encoder is a vision encoder designed for multimodal large language mod
 OneVision Encoder Method Overview

-### Input Method Comparison - - - - - - - -
Frame Sampling Input vs Codec Input
- Animated demonstration of traditional uniform frame sampling method for video processing
- Frame Sampling Input
- Traditional uniform frame sampling approach -
- Animated demonstration of efficient codec-based input decomposition with I-frames and P-frames
- Codec Input
- Our efficient codec-based input decomposition -
-
 ### Cluster Discrimination Visualization

@@ -61,11 +43,13 @@ OneVision Encoder is a vision encoder designed for multimodal large language mod
+
+
- Case 4 Demonstration
+ Case 4 Demonstration
Case 4
- Case 6 Demonstration
+ Case 6 Demonstration
Case 6