# CaptionQA Quickstart

Welcome to the CaptionQA quickstart notebook. This guide helps you verify your development environment, explore the dataset utilities bundled with the repository, and establish a baseline workflow for panoramic captioning + QA research.

## 1. Environment setup

CaptionQA uses [uv](https://docs.astral.sh/uv/) for dependency management and targets Python 3.10+. On Windows 11 PowerShell, the recommended bootstrap sequence is:

```powershell
winget install --id astral-sh.uv -e
uv venv captionqa
.\captionqa\Scripts\Activate.ps1
uv pip install --editable .
```

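On macOS or Linux, `winget` is not available: install uv with the standalone installer (`curl -LsSf https://astral.sh/uv/install.sh | sh`) and activate the environment with `source captionqa/bin/activate`. The `uv venv` and `uv pip install` steps are unchanged.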

> **Tip:** Ensure that FFmpeg is installed and available on your `PATH` before attempting to process audio/video assets.
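
The FFmpeg check can also be done from Python. The sketch below uses the standard library's `shutil.which`, which searches `PATH` much as the shell does. The helper name `find_tool` is illustrative and not part of CaptionQA.

```python
import shutil
from typing import Optional


def find_tool(name: str) -> Optional[str]:
    """Return the full path of an executable found on PATH, or None if absent."""
    return shutil.which(name)


ffmpeg_path = find_tool("ffmpeg")
if ffmpeg_path is None:
    print("FFmpeg not found on PATH; install it before processing audio/video assets.")
else:
    print(f"FFmpeg found at: {ffmpeg_path}")
```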

Run the following cell to confirm the Python version and virtual environment information inside the notebook kernel.

```python
import os
import platform
import sys

print(f'Python version: {sys.version.split()[0]}')
print(f'Executable: {sys.executable}')
print(f'Platform: {platform.platform()}')
print(f'Virtual env: {os.environ.get("VIRTUAL_ENV", "<none>")}')
```

If the `Virtual env` field shows `<none>`, activate the `captionqa` environment (or your preferred venv) and restart the Jupyter kernel before proceeding.
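
Note that `VIRTUAL_ENV` is set by the activation script, so a kernel that was registered from a venv but launched from another shell may report `<none>` even though it runs inside the environment. As a complementary check, the sketch below compares `sys.prefix` with `sys.base_prefix`, which differ when the interpreter belongs to a venv. The function name `in_virtualenv` is illustrative.

```python
import os
import sys


def in_virtualenv() -> bool:
    """True when the interpreter's prefix differs from the base installation's prefix."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print(f"Inside a virtual environment: {in_virtualenv()}")
print(f"sys.prefix: {sys.prefix}")
print(f"VIRTUAL_ENV: {os.environ.get('VIRTUAL_ENV', '<none>')}")
```

If the first line prints `True` while `VIRTUAL_ENV` is `<none>`, the kernel is most likely already using the venv's interpreter.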

## 2. Repository paths

The repository expects a `datasets/` directory in the project root. On Windows, the default clone uses a symbolic link that points to `D:/CaptionQA/data`. If you are running inside WSL or a containerized environment, update the link to a valid location or replace it with a regular directory.

```python
from pathlib import Path

repo_root = Path.cwd().resolve()
datasets_dir = repo_root / 'datasets'

print(f'Repository root: {repo_root}')
print(f'datasets/ exists: {datasets_dir.exists()}')
if datasets_dir.exists():
    print(f'datasets/ points to: {datasets_dir.resolve()}')
else:
    print('datasets/ directory is missing. Create it or update the symlink before downloading datasets.')
```

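If `datasets/` is missing or its link points to an inaccessible drive, replace it with a regular directory that suits your platform. The removal step is only needed when a stale link exists; skip it if nothing is there, and do not run it on a directory that already holds downloaded data. For example: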

```powershell
# Windows PowerShell
Remove-Item datasets
New-Item -ItemType Directory -Path datasets
```

```bash
# macOS / Linux
rm -rf datasets
mkdir -p datasets
```
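
The same repair can be sketched in Python so that it behaves the same on every platform. The version below removes `datasets` only when it is a broken symlink and otherwise leaves existing content alone. The helper name `ensure_datasets_dir` is hypothetical, and the call assumes the current working directory is the repository root.

```python
from pathlib import Path


def ensure_datasets_dir(path: Path) -> Path:
    """Make sure `path` is a usable directory, replacing only a broken symlink."""
    if path.is_symlink() and not path.exists():
        # Broken link: the target is unreachable, so remove the link itself.
        path.unlink()
    # No-op when a directory (or a valid link to one) already exists.
    path.mkdir(parents=True, exist_ok=True)
    return path


datasets_dir = ensure_datasets_dir(Path.cwd() / "datasets")
print(f"datasets/ ready at: {datasets_dir.resolve()}")
```

Because a valid link or a populated directory is left untouched, running the cell repeatedly should be safe.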


## 3. Dataset utilities

The `data.download` module centralizes dataset metadata and provides a command-line interface. The next cell lists all datasets currently configured. This is a safe operation that does **not** download any data.

```python
from data.download import DATASETS

for name, task in DATASETS.items():
    print('{:<10s} - {}'.format(name, task.description))
```

Use the CLI from a terminal to download assets once you have obtained access on Hugging Face for the datasets that require it:

```bash
python -m data.download --list
python -m data.download 360x --output datasets --dry-run
python -m data.download leader360v --output datasets
```

The `--dry-run` option prints the operations without performing any downloads, which is useful for verifying credentials and paths.
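
If you prefer to stay inside the notebook, the same invocation can be assembled in Python and handed to `subprocess`. This is a minimal sketch: it uses only the `--output` and `--dry-run` flags shown above, the helper name `build_download_command` is hypothetical, and the actual run is left commented out because it needs the CaptionQA package and the repository root as the working directory.

```python
import subprocess
import sys
from typing import List


def build_download_command(dataset: str, output: str = "datasets", dry_run: bool = True) -> List[str]:
    """Assemble a `python -m data.download` invocation for one dataset."""
    # sys.executable keeps the call on the interpreter the kernel is using.
    command = [sys.executable, "-m", "data.download", dataset, "--output", output]
    if dry_run:
        command.append("--dry-run")
    return command


command = build_download_command("360x")
print("Would run:", " ".join(command))

# Uncomment to execute from the repository root:
# subprocess.run(command, check=True)
```

Defaulting `dry_run` to `True` means a download only starts when you ask for it explicitly.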

## 4. Inspecting a downloaded dataset

After downloading, you can explore the file structure programmatically. The example below demonstrates how to enumerate the top-level contents of the 360x dataset. If the dataset is not yet downloaded, the code will notify you.

```python
from itertools import islice

hr_root = datasets_dir / '360x' / '360x_dataset_HR'
if hr_root.exists():
    entries = sorted(hr_root.iterdir())
    print(f'Found {len(entries)} items under {hr_root}:')
    for path in islice(entries, 10):
        print(' -', path.relative_to(hr_root))
    if len(entries) > 10:
        print('...')
else:
    print('360x high-resolution dataset not found. Run the downloader once you have access.')
```

Repeat the pattern for other datasets, adjusting the root path and traversal depth to match how each one is laid out (extracted archives vs. cloned Git repositories).
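
One way to make the traversal depth adjustable is a small depth-limited walker. The sketch below is illustrative: `walk_limited` is a hypothetical helper, and the `datasets/leader360v` location is an assumption about where the downloader places that dataset, so adjust it to your layout.

```python
from itertools import islice
from pathlib import Path
from typing import Iterator, Tuple


def walk_limited(root: Path, max_depth: int = 2) -> Iterator[Tuple[int, Path]]:
    """Yield (depth, path) pairs under `root`, descending at most `max_depth` levels."""
    def _walk(directory: Path, depth: int) -> Iterator[Tuple[int, Path]]:
        for entry in sorted(directory.iterdir()):
            yield depth, entry
            if entry.is_dir() and depth < max_depth:
                yield from _walk(entry, depth + 1)

    yield from _walk(root, 1)


dataset_root = Path("datasets") / "leader360v"  # assumed location; adjust as needed
if dataset_root.exists():
    # Cap the listing at 50 entries so large datasets do not flood the output.
    for depth, path in islice(walk_limited(dataset_root, max_depth=2), 50):
        print("  " * (depth - 1) + "- " + path.name)
else:
    print(f"{dataset_root} not found. Run the downloader first.")
```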

## 5. Next steps

* Prototype captioning models using your framework of choice (e.g., PyTorch + Hugging Face Transformers).
* Ingest panoramic video clips and convert them into frame sequences or features suitable for model training.
* Integrate QA annotations by aligning temporal segments with generated captions.
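
As a starting point for the frame-sequence step, the sketch below builds an FFmpeg command that samples a clip at a fixed rate using the `fps` video filter. It is not part of CaptionQA: the helper name `build_frame_command`, the example clip path, and the `frames/` output directory are all hypothetical, and extraction only runs if FFmpeg and the clip are actually present.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def build_frame_command(video: Path, out_dir: Path, fps: float = 1.0) -> List[str]:
    """Build an FFmpeg command that writes `fps` frames per second as JPEG files."""
    return [
        "ffmpeg",
        "-i", str(video),
        "-vf", f"fps={fps}",
        str(out_dir / "frame_%06d.jpg"),
    ]


video = Path("datasets") / "example_clip.mp4"  # hypothetical clip path
out_dir = Path("frames") / video.stem          # hypothetical output location
command = build_frame_command(video, out_dir, fps=1.0)
print("Command:", " ".join(command))

if shutil.which("ffmpeg") and video.exists():
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(command, check=True)
else:
    print("Skipping extraction: FFmpeg or the example clip is unavailable.")
```

Sampling at a low rate such as one frame per second keeps storage manageable for long panoramic clips; raise `fps` if your model needs denser temporal coverage.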

This notebook will continue to evolve as the project matures. Feel free to add exploratory experiments, preprocessing utilities, and evaluation routines in subsequent sections.