diff --git a/docs/OSL.md b/docs/OSL.md
index 498abf2..da3b4b7 100644
--- a/docs/OSL.md
+++ b/docs/OSL.md
@@ -7,6 +7,49 @@ An OSL JSON file is a single JSON object with dataset metadata, a label schema,
 and a `data` array of samples. Each sample points to one or more media inputs
 and can carry task-specific annotations.
 
+## Minimal Valid File
+
+This is the smallest practical shape for a dataset with one video sample:
+
+```json
+{
+  "version": "2.0",
+  "date": "2026-05-19",
+  "dataset_name": "minimal-demo",
+  "description": "",
+  "modalities": ["video"],
+  "metadata": {},
+  "labels": {},
+  "data": [
+    {
+      "id": "clip_0001",
+      "inputs": [
+        {
+          "type": "video",
+          "path": "clips/clip_0001.mp4"
+        }
+      ]
+    }
+  ]
+}
+```
+
+!!! note "Relative paths"
+    Relative `inputs[].path` values are resolved from the folder that contains
+    the JSON file. If you move the JSON without moving its media folders,
+    playback can fail.
+
+## Common Mistakes
+
+| Mistake | Result | Fix |
+|---|---|---|
+| Root JSON is an array | The app rejects the file. | Use one root object with a `data` array. |
+| `data` is missing or not a list | The app rejects the file. | Set `data` to `[]` or a list of sample objects. |
+| Using top-level `questions` for Q/A | Legacy question banks are dropped on save. | Store Q/A in each sample's grouped `answers[]`. |
+| Dense captions use `start_ms`/`end_ms` only | The current dense editor expects point timestamps. | Use `dense_captions[].position_ms`. |
+| Annotation head names do not match root `labels` | Controls may not show the expected labels. | Keep `data[].labels` keys and `events[].head` values aligned with root `labels`. |
+| Relative media paths no longer point to files | Samples load but playback cannot find media. | Keep media beside the JSON or resave after correcting paths. |
+
 ## Top-Level Object
 
 The smallest useful file is a JSON object with `data` as a list. When loading,
diff --git a/docs/annotating.md b/docs/annotating.md
index 16f963b..c265ad4 100644
--- a/docs/annotating.md
+++ b/docs/annotating.md
@@ -1,38 +1,70 @@
 # Annotating
 
+All annotation tabs work on the currently selected sample from the Dataset
+Explorer. The JSON field names below match the canonical [OSL JSON Format](OSL.md)
+page.
+
 ## Classification
 
-1. Select a sample in the Dataset Explorer.
+Use `CLS` for clip-level labels.
+
+1. Select a sample.
 2. Open `CLS`.
-3. Choose labels in each head.
-4. Changes persist immediately when they are effective.
+3. Add or choose label heads and labels.
+4. Select the label values for the current sample.
+
+Effective manual changes are saved immediately into the sample's `labels`
+object. Single-label heads write `{"label": "..."}` and multi-label heads write
+`{"labels": [...]}`. Smart predictions add `confidence_score` until confirmed or
+rejected.
 
 ## Localization
 
+Use `LOC` for point events on the timeline.
+
 1. Select a sample and open `LOC`.
-2. Use spotting buttons to add events at current time.
-3. Edit or delete events from the event table.
-4. Optional: run smart inference for a selected head.
+2. Choose a label head and label.
+3. Move the playhead to the event time.
+4. Use the spotting controls to add the event.
+5. Edit or delete rows in the event table when needed.
+
+Events are stored in `events[]` with `head`, `label`, and `position_ms`.
+Smart inference can add predicted rows with `confidence_score`; confirming a row
+keeps the event and removes only the confidence marker.
 
 ## Description
 
+Use `DESC` for one clip-level caption.
+
 1. Select a sample and open `DESC`.
-2. Edit caption text.
-3. Autosave stores the caption in `captions`.
+2. Enter or edit the caption text.
+3. Wait for autosave or save the project.
+
+The text is stored in `captions[]`. Manual description edits currently write an
+English caption entry with `lang` set to `en`.
 
 ## Dense Description
 
+Use `DENSE` for timestamped text descriptions.
+
 1. Select a sample and open `DENSE`.
-2. Click **Add New Description**.
-3. Enter text in the modal; event is stored at current `position_ms`.
-4. Edit time/text from the table when needed.
+2. Move the playhead to the desired timestamp.
+3. Click **Add New Description**.
+4. Enter text in the modal.
+5. Edit time or text from the table when needed.
+
+Dense descriptions are stored in `dense_captions[]` with `position_ms`, `lang`,
+and `text`. The table keeps rows ordered by timestamp.
 
 ## Question/Answer
 
-1. Open `Q/A`.
-2. Add or select a sample question group.
-3. Use the add/edit dialog to choose a previous dataset question or enter custom text.
-4. Double-click a question group to edit it, or right-click it to edit/remove it.
-5. Click **Answer** to add an answer in a multiline dialog.
-6. Double-click an answer to edit it, or right-click it to edit/remove it.
-7. Answers are stored as grouped `answers` with `question` and `answers[]`.
+Use `Q/A` for grouped questions and one or more answers per question.
+
+1. Select a sample and open `Q/A`.
+2. Click **Add** to create a question group.
+3. Choose a previous dataset question or enter custom question text.
+4. Click **Answer** to add answer text.
+5. Double-click or right-click a question or answer to edit or remove it.
+
+Answers are stored as grouped `answers[]` entries with `question` and
+`answers[]`. The app does not write a top-level `questions` bank.
diff --git a/docs/batch_tools.md b/docs/batch_tools.md
index de355e0..7adade0 100644
--- a/docs/batch_tools.md
+++ b/docs/batch_tools.md
@@ -1,45 +1,59 @@
-# Batch Tools
+# Data Transfer and Batch Tools
 
-The app supports Hugging Face dataset transfer from the **Data** menu and script/API workflows for batch conversion.
+The app supports Hugging Face dataset transfer from the **Data** menu and
+script/API workflows for batch conversion. Dataset JSON inputs follow the
+[OSL JSON Format](OSL.md).
 
 ## In-App Data Menu
 
 ### Download Dataset from HF...
 
-- Opens a dialog for:
-  - repo ID
-  - branch/revision
-  - split
-  - format
-  - output directory
-  - optional token
-  - dry-run mode
-- Supports:
-  - JSON split downloads (`.json`)
-  - Parquet split downloads (`/`)
-- Writes files under `//`.
-- On successful non-dry-run JSON download, source metadata is written into the JSON root:
-  - `hf_repo_id`
-  - `hf_branch`
-  - `hf_split`
+The download dialog asks for:
+
+- repo ID
+- branch/revision
+- split
+- format
+- output directory
+- optional token
+- dry-run mode
+
+It supports JSON split downloads (`.json`) and Parquet split downloads
+(`/`). Files are written under `//`.
+
+For successful non-dry-run JSON downloads, source metadata is written into the
+JSON root:
+
+- `hf_repo_id`
+- `hf_branch`
+- `hf_split`
+
+!!! note "Dry-run support"
+    Dry-run size estimation is available for JSON downloads. Parquet downloads
+    run as real downloads/conversions.
 
 ### Upload Dataset to HF...
 
-Requires an opened dataset JSON from disk.
+Upload requires an opened dataset JSON from disk.
 
 Upload modes:
 
-- **Upload as JSON**: uploads current dataset JSON plus files referenced by `data[].inputs[].path` in one commit.
-- **Parquet + WebDataset**: converts locally, then uploads generated Parquet/shards (shard size configurable).
+- **Upload as JSON** uploads the current dataset JSON plus every file referenced
+  by `data[].inputs[].path`.
+- **Parquet + WebDataset** converts locally, then uploads generated
+  Parquet/WebDataset artifacts.
 
-If repository/branch is missing, the app can prompt to create it and retry.
+If the target repository or branch is missing, the app can prompt to create it
+and retry.
 
 ## CLI Scripts
 
-### Download referenced files
+Run commands from the repository root.
+
+### Download Referenced Files
 
 ```bash
-python test_data/download_osl_hf.py \
+python tools/download_osl_hf.py \
   --repo-id \
   --revision main \
   --split test \
@@ -48,14 +62,32 @@ python test_data/download_osl_hf.py \
   --dry-run
 ```
 
-### Upload referenced files
+### Upload Referenced Files
 
 ```bash
-python test_data/upload_osl_hf.py \
+python tools/upload_dataset_to_hf.py \
   --repo-id \
   --json-path \
   --split test \
-  --revision main
+  --revision main \
+  --format json
+```
+
+### Convert JSON to Parquet + WebDataset
+
+```bash
+python tools/osl_json_to_parquet_webdataset.py \
+  annotations.json \
+  /path/to/media/root \
+  /path/to/output_dataset
+```
+
+### Convert Parquet + WebDataset Back to JSON
+
+```bash
+python tools/parquet_webdataset_to_osl_json.py \
+  /path/to/output_dataset \
+  reconstructed.json
 ```
 
 ## Python Conversion API
@@ -66,3 +98,5 @@ from opensportslib.tools import convert_json_to_parquet, convert_parquet_to_json
 convert_json_to_parquet(json_path="annotations.json", media_root=".", output_dir="out_parquet")
 convert_parquet_to_json(dataset_dir="out_parquet", output_json_path="reconstructed.json")
 ```
+
+For full script options, run any tool with `--help`.
diff --git a/docs/changelog.md b/docs/changelog.md
index 825c32f..86baad5 100644
--- a/docs/changelog.md
+++ b/docs/changelog.md
@@ -1 +1,14 @@
 # Changelog
+
+Release notes for packaged builds are published on GitHub Releases:
+
+- https://github.com/OpenSportsLab/VideoAnnotationTool/releases
+
+## Documentation Notes
+
+- The public site now treats [OSL JSON Format](OSL.md) as the canonical dataset
+  schema reference.
+- Workflow pages link back to the OSL format page instead of duplicating long
+  schema examples.
+- Saving/loading docs describe the current grouped Q/A format and no longer list
+  legacy top-level `questions` as persisted project data.
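Reviewer note on `docs/batch_tools.md` (not part of the patch): the docs say **Upload as JSON** uploads every file referenced by `data[].inputs[].path`, and that relative paths resolve from the folder containing the JSON. A minimal sketch of that lookup, assuming only the field names shown in this diff, could look like the following. The function name `referenced_media_paths` is hypothetical and is not taken from the app or from the `tools/` scripts.

```python
import json
from pathlib import Path


def referenced_media_paths(json_path):
    """Return resolved paths for every data[].inputs[].path in an OSL JSON file.

    Hypothetical helper for illustration; not the app's upload code.
    """
    json_path = Path(json_path)
    dataset = json.loads(json_path.read_text(encoding="utf-8"))
    base_dir = json_path.parent  # relative paths resolve from the JSON's folder

    resolved = []
    for sample in dataset.get("data", []):
        for item in sample.get("inputs", []):
            raw = item.get("path")
            if not raw:
                continue  # skip inputs that carry no path
            path = Path(raw)
            # Absolute paths are kept; relative ones are joined to the JSON folder.
            resolved.append(path if path.is_absolute() else base_dir / path)
    return resolved
```

Checking `p.exists()` for each returned path before an upload is one way to catch the "relative media paths no longer point to files" mistake listed in `docs/OSL.md`.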
diff --git a/docs/getting_started.md b/docs/getting_started.md
index f25559f..8a781aa 100644
--- a/docs/getting_started.md
+++ b/docs/getting_started.md
@@ -1,5 +1,9 @@
 # Getting Started
 
+This walkthrough takes you from an empty project to a saved OSL JSON dataset.
+For field-level JSON details, use [OSL JSON Format](OSL.md) as the canonical
+reference.
+
 ## 1. Launch
 
 Start the app from the repository root:
@@ -10,33 +14,52 @@ python annotation_tool/main.py
 
 You will land on the **Welcome** screen.
 
-## 2. Create Or Open A Dataset
+## 2. Create or Open a Dataset
+
+- Choose **Create New Dataset** to start with a blank OSL JSON project.
+- Choose **Load Dataset** to open an existing `.json` file.
+- Reopen known files from the recent-datasets list when available.
 
-- **Create New Dataset** for a blank OSL dataset.
-- **Load Dataset** to open an existing JSON.
-- You can also reopen files from the recent-datasets list.
+!!! warning "Keep JSON and media paths together"
+    OSL input paths are usually relative to the dataset JSON file. If you move a
+    JSON file without moving the referenced media folders, playback may fail
+    until the paths are fixed or the dataset is saved again in the expected
+    location.
 
 ## 3. Add Samples
 
 In the Dataset Explorer:
 
-- Click **Add Data**.
-- Select files or folders.
-- Selected folders are treated as multi-input samples (for multi-view workflows).
+1. Click **Add Data**.
+2. Select one or more files, or select folders that contain supported files.
+3. Review the sample rows that appear in the tree.
+
+Selected files become separate samples. Selected folders are treated as
+multi-input samples for multi-view workflows. The app stores each input under
+`data[].inputs[]` and infers the input type from the file extension when needed.
 
 ## 4. Annotate
 
-Use the right-side tabs:
+Select a sample in the Dataset Explorer, then use the right-side annotation tabs:
 
-- `CLS` for classification labels
-- `LOC` for timestamped events
-- `DESC` for clip-level captions
-- `DENSE` for timestamped dense captions
-- `Q/A` for per-sample question groups and answers
+| Tab | Use it for | JSON field |
+|---|---|---|
+| `CLS` | Clip-level classification labels | `labels` |
+| `LOC` | Timestamped events | `events` |
+| `DESC` | Clip-level text captions | `captions` |
+| `DENSE` | Timestamped dense captions | `dense_captions` |
+| `Q/A` | Per-sample question groups and answers | `answers` |
+
+See [Annotating](annotating.md) for the per-mode workflow.
 
 ## 5. Save
 
-- `Ctrl+S` saves to the current JSON path.
-- `Ctrl+Shift+S` saves as a new JSON file.
+- **Save Dataset** (`Ctrl+S`) writes to the current JSON path.
+- **Save Dataset As** (`Ctrl+Shift+S`) writes to a new JSON path.
+
+On save, the app normalizes sample IDs, removes empty optional task blocks, and
+rewrites `data[].inputs[].path` relative to the saved JSON location when
+possible. See [Saving and Loading](saving_loading.md) for the full save behavior.
 
-When you close with unsaved changes, you can **Save**, **Save As**, **Close Without Saving**, or **Cancel**.
+When you close with unsaved changes, choose **Save**, **Save As**,
+**Close Without Saving**, or **Cancel**.
diff --git a/docs/saving_loading.md b/docs/saving_loading.md
index e195b78..740642a 100644
--- a/docs/saving_loading.md
+++ b/docs/saving_loading.md
@@ -1,21 +1,42 @@
-# Saving And Loading
+# Saving and Loading
+
+This page covers project file behavior. For the exact JSON structure, see
+[OSL JSON Format](OSL.md).
 
 ## Loading
 
 Use **File > Load Dataset** (`Ctrl+O`) to open an OSL JSON file.
 
-The app normalizes loaded content (for example, missing IDs) while preserving unknown root/sample fields when possible.
+On load, the app validates that the root is a JSON object and that top-level
+`data` is a list. It also fills missing standard root fields, normalizes missing
+or duplicate sample IDs, canonicalizes known input types, and preserves unknown
+root/sample fields when possible.
+
+!!! note "Relative input paths"
+    Relative `data[].inputs[].path` values are resolved from the directory that
+    contains the opened JSON file.
 
 ## Saving
 
-- **Save Dataset** (`Ctrl+S`): writes to current path.
-- **Save Dataset As** (`Ctrl+Shift+S`): writes to a new path.
+- **Save Dataset** (`Ctrl+S`) writes to the current path.
+- **Save Dataset As** (`Ctrl+Shift+S`) writes to a new path.
+
+On write, paths in `data[].inputs[].path` are rewritten relative to the chosen
+save location when possible.
+
+Save/export also:
 
-On write, paths in `data[].inputs[].path` are rewritten relative to the chosen save location.
+- Ensures sample IDs are unique.
+- Recomputes `modalities` from sample inputs.
+- Removes empty optional sample blocks such as `labels`, `events`, `captions`,
+  `dense_captions`, `answers`, and `metadata`.
+- Normalizes Q/A payloads to grouped `answers[]` entries with non-empty answer
+  text.
+- Removes retired smart keys such as `smart_labels` and `smart_events`.
 
 ## Close With Unsaved Changes
 
-If the dataset is dirty and you close/quit, the app prompts:
+If the dataset is dirty and you close or quit, the app prompts:
 
 - **Save**
 - **Save As**
@@ -24,9 +45,12 @@ If the dataset is dirty and you close/quit, the app prompts:
 
 ## What Is Persisted
 
-- Core OSL fields (`labels`, `events`, `captions`, `dense_captions`, `questions`, `answers`)
-- Unknown/custom root/sample fields
+- Standard OSL fields such as `labels`, `events`, `captions`,
+  `dense_captions`, and grouped `answers`.
+- Unknown/custom root and sample fields when they do not conflict with retired
+  fields.
 
-### Not persisted in dataset JSON
+### Not Persisted in Dataset JSON
 
-- Localization `label_colors` (stored in app settings)
+- Legacy top-level `questions` and per-answer `question_id` entries.
+- Localization `label_colors`, which are stored in app settings.
diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md
index 1ae040f..159fceb 100644
--- a/docs/troubleshooting.md
+++ b/docs/troubleshooting.md
@@ -1,6 +1,6 @@
 # Troubleshooting
 
-## App Fails To Start
+## App Fails to Start
 
 - Confirm your environment is active.
 - Reinstall dependencies:
@@ -9,6 +9,32 @@
 pip install -r requirements.txt
 ```
 
+## Dataset JSON Does Not Load
+
+- Confirm the file is valid JSON.
+- Confirm the root value is a JSON object, not an array.
+- Confirm top-level `data` is a list.
+- Check the expected structure in [OSL JSON Format](OSL.md).
+
+Legacy VQA files that use top-level `questions` and per-answer `question_id`
+entries should be converted before editing:
+
+```bash
+python tools/convert_legacy_vqa_to_grouped.py \
+  --input-json old_vqa.json \
+  --output-json grouped_vqa.json
+```
+
+## Media Files Are Missing After Loading
+
+Relative paths in `data[].inputs[].path` are resolved from the directory that
+contains the dataset JSON. If the JSON was moved separately from its media
+folders, move the folders back beside the JSON or update the paths.
+
+!!! tip "Saving can repair path layout"
+    After opening a dataset from the intended folder, **Save Dataset As** can
+    rewrite input paths relative to the new JSON location.
+
 ## Hugging Face Transfer Errors
 
 - Ensure `huggingface_hub` is installed.
@@ -19,15 +45,18 @@ huggingface-cli login
 ```
 
 - For upload failures:
-  - `Repository Not Found`: create repo or let the app create it from the prompt.
-  - `Revision/Branch Not Found`: create branch or let the app create it from the prompt.
+  - `Repository Not Found`: create the repo or let the app create it from the prompt.
+  - `Revision/Branch Not Found`: create the branch or let the app create it from the prompt.
 
 ## Download URL 404 / Not Found
 
-- Verify the URL points to the intended file or folder in the dataset repo.
-- If a previously successful URL is now invalid, reselect/correct it in the dialog.
+- Verify the repo ID, revision, split, and format in the dialog.
+- JSON mode expects `.json`.
+- Parquet mode expects a `/` folder.
+- If a previously successful URL is now invalid, reselect or correct it in the
+  dialog.
 
-## Video Playback Error Or Black Screen
+## Video Playback Error or Black Screen
 
 Some codecs are not decoded by your platform backend. Convert to H.264/AAC MP4:
 
@@ -35,6 +64,9 @@ Some codecs are not decoded by your platform backend. Convert to H.264/AAC MP4:
 ffmpeg -i input.mp4 -vcodec libx264 -acodec aac output.mp4
 ```
 
-## No Playback After Selecting A Row
+## No Playback After Selecting a Row
 
-If the selected input is non-video (for example text metadata), this is expected: the row stays selectable but media playback is not started.
+If the selected input is not playable media for the current backend, the row can
+stay selected while playback does not start. For example, `frames_npy` and
+`tracking_parquet` inputs use specialized renderers, and unsupported text or
+metadata files are not played as video.
diff --git a/mkdocs.yml b/mkdocs.yml
index 601c6ac..9887af4 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -3,6 +3,9 @@ site_description: A PyQt6 GUI tool for analyzing and annotating OSL datasets (Op
 site_author: OpenSportsLab
 repo_url: https://github.com/OpenSportsLab/VideoAnnotationTool
 repo_name: OpenSportsLab/VideoAnnotationTool
+exclude_docs: |
+  README.md
+  assets/README.md
 
 theme:
   name: material
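Reviewer note on the "Dataset JSON Does Not Load" checklist in `docs/troubleshooting.md` (not part of the patch): the three load checks (valid JSON, object root, list-valued `data`) and the legacy `questions` warning are simple enough to script before opening a file in the app. The sketch below is an assumption-laden illustration; `check_osl_root` is a hypothetical name and does not reproduce the app's actual loader or its messages.

```python
import json


def check_osl_root(text):
    """Return a list of problems found in OSL JSON text; an empty list means it passed.

    Hypothetical pre-flight check mirroring the documented load rules only.
    """
    try:
        root = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]

    # The docs state the root must be a JSON object, not an array.
    if not isinstance(root, dict):
        return ["root must be a JSON object, not " + type(root).__name__]

    problems = []
    # The docs state top-level `data` must be a list (an empty list is fine).
    if not isinstance(root.get("data"), list):
        problems.append("top-level `data` must be a list")
    # Legacy question banks are documented as dropped on save.
    if "questions" in root:
        problems.append("legacy top-level `questions` found; convert to grouped `answers[]`")
    return problems
```

For example, `check_osl_root('{"data": []}')` returns an empty list, while a root array is reported as a single problem.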