8 changes: 4 additions & 4 deletions README.md
@@ -14,9 +14,9 @@

📝 **[Homepage](https://www.lmms-lab.com/onevision-encoder/index.html)**
🤗 **[Models](https://huggingface.co/lmms-lab-encoder/onevision-encoder-large)** |
🤗 **[Datasets](coming)** |
📄 **[Tech Report (coming)]()** |
-📋 **[Model Card](docs/model_card.md)**
+📋 **[Model Card](docs/model_card.md)** |
+📊 **[Data Card](docs/data_card.md)**

</div>

@@ -283,7 +283,7 @@ cd eval_encoder
Then run the following command:

```bash
-bash eval_encoder/shells_eval_ap/eval_ov_encoder_large_16frames.sh
+bash shells_eval_ap/eval_ov_encoder_large_16frames.sh
```

**Sampling-Specific Parameters:**
@@ -320,8 +320,8 @@ torchrun --nproc_per_node=8 --master_port=29512 attentive_probe_codec.py \

**Codec-Specific Parameters:**
- `K_keep`: Number of patches to keep.
-- `cache_dir`: Directory for cached codec patches. This is where the codec-selected patches will be stored/loaded.
- `mv_compensate`: Motion vector compensation method (e.g., `median`).
+- `cache_dir` (optional): Directory for cached codec patches. Use this to specify where codec-selected patches are stored/loaded when you want to persist or reuse them.
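The codec parameters above feed the `torchrun` launch of `attentive_probe_codec.py` shown in the hunk header. A minimal sketch of how such a launch line could be assembled, assuming the CLI flags mirror the parameter names (`--K_keep`, `--cache_dir`, `--mv_compensate` are not confirmed by this diff, and the values here are placeholders):

```shell
# Hypothetical flag names and values, assumed from the parameter list above;
# check the flags actually accepted by attentive_probe_codec.py before running.
CODEC_ARGS="--K_keep 256 --cache_dir ./codec_cache --mv_compensate median"

# Print the assembled launch command rather than executing it.
echo "torchrun --nproc_per_node=8 --master_port=29512 attentive_probe_codec.py ${CODEC_ARGS}"
```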

#### Shared Parameters

5 changes: 4 additions & 1 deletion docs/datacard.md → docs/data_card.md
@@ -1,8 +1,11 @@
# Data Card: OneVision Encoder Training Data

> **📦 Data Availability Notice:** The training data requires approximately **200TB** of storage. We are currently looking for suitable storage solutions. If you need access to the data immediately, please contact [anxiangsir@outlook.com](mailto:anxiangsir@outlook.com).
## Overview

-This document describes the datasets used for training OneVision Encoder. The training data consists of both image and video datasets, totaling approximately 754 million samples.
+This document describes the datasets used for training OneVision Encoder. The training data consists of both image and video datasets.

## Dataset Summary
