# Converting your dataset with Pixano

This notebook will help you convert your dataset to Parquet format so that you can access it with Pixano.

## 1. Setting up

### Load imports

In [2]:
from pathlib import Path

from pixano import notebook
from pixano.data import COCOLoader, ImageLoader

## 2. Converting dataset

### Set a dataset

Please provide your dataset information (name, description, and splits) and the target directory where the converted dataset will be written.

In [3]:
name = "COCO Instances"
description = "COCO Instances Dataset"

library_dir = Path("datasets/")
target_dir = library_dir / "coco_instances"

splits = ["train2017", "val2017"]
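Before converting, it can help to confirm that every split you listed actually exists on disk. The helper below is a minimal sketch, not part of Pixano: `missing_split_dirs` is a hypothetical name, and it assumes each split has its own subfolder inside the media directory (for example `coco/media/image/train2017`), which may not match your layout.

```python
from pathlib import Path


def missing_split_dirs(media_dir: Path, splits: list[str]) -> list[str]:
    """Return the splits that have no matching subfolder in media_dir."""
    # Assumption: one subfolder per split, e.g. coco/media/image/train2017
    return [split for split in splits if not (Path(media_dir) / split).is_dir()]


# Example call with the values used in this notebook
splits = ["train2017", "val2017"]
print(missing_split_dirs(Path("coco/media/image"), splits))
```

An empty list means every split folder was found; any names printed are splits to double-check before running a loader.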

### Convert the dataset

*Please note: after the dataset is generated, media directories such as "image" are automatically moved from their source location into the target directory so that Pixano can access them.*

#### a. Image-only dataset

If your dataset contains only images, you can use our predefined `ImageLoader` to convert it to Parquet format.

In [None]:
source_dirs = {
    "image": Path("coco/media/image"),
}

loader = ImageLoader(name, description, source_dirs, target_dir, splits)
loader.convert_dataset()
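To confirm that the conversion wrote something, you can list the Parquet files under the target directory. This is a small sketch that uses only the standard library; `list_parquet_files` is a hypothetical helper, and it makes no assumption about how Pixano organizes files inside the target directory.

```python
from pathlib import Path


def list_parquet_files(dataset_dir: Path) -> list[Path]:
    """Return all .parquet files found anywhere under dataset_dir, sorted."""
    dataset_dir = Path(dataset_dir)
    if not dataset_dir.is_dir():
        # Nothing converted yet, or the path is wrong
        return []
    return sorted(dataset_dir.rglob("*.parquet"))


# Example call with the target directory used in this notebook
for path in list_parquet_files(Path("datasets/coco_instances")):
    print(path)
```

If nothing is printed, the conversion most likely did not run or wrote to a different directory.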

#### b. COCO-like image dataset

If your dataset contains annotations in COCO format, you can use our predefined `COCOLoader` to convert it to Parquet format.

In [4]:
source_dirs = {
    "image": Path("coco/media/image"),
    "objects": Path("coco/annotations"),
}

loader = COCOLoader(name, description, source_dirs, target_dir, splits)
loader.convert_dataset()
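If the conversion fails, a quick look at the annotation file itself can narrow things down. The sketch below checks a COCO-style JSON file for the three top-level keys used by COCO instance annotations (`images`, `annotations`, `categories`) and for annotations that point to unknown images or categories. `check_coco_file` is a hypothetical helper and not part of Pixano, and the file names that `COCOLoader` expects inside the annotations directory are not covered here.

```python
import json
from pathlib import Path

REQUIRED_KEYS = ("images", "annotations", "categories")


def check_coco_file(annotation_file: Path) -> list[str]:
    """Return a list of problems found in a COCO-style annotation file."""
    with open(annotation_file, encoding="utf-8") as f:
        coco = json.load(f)

    # Stop early if the basic structure is missing
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in coco]
    if problems:
        return problems

    image_ids = {image["id"] for image in coco["images"]}
    category_ids = {category["id"] for category in coco["categories"]}

    # Every annotation should reference a known image and a known category
    for ann in coco["annotations"]:
        if ann["image_id"] not in image_ids:
            problems.append(f"annotation {ann['id']}: unknown image_id {ann['image_id']}")
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id {ann['category_id']}")
    return problems
```

Calling `check_coco_file` on one of your annotation files returns an empty list when none of these problems are found. It only checks structure and references, not the contents of each annotation.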

#### c. Custom format dataset

If your dataset contains media or annotations in a custom format, you will need to define your own loader to convert it to Parquet format.

Please take a look at the `template_loader.py` file next to this notebook for inspiration on how to build your own.

## 3. Browsing the dataset

You can now browse the converted dataset in the Pixano Explorer.

*Start the explorer cell, then stop it to run the display function.\
Please restart the notebook when you're done to release the URL used by Pixano Explorer.*

In [None]:
%%sh -s "$library_dir"
# Quote the argument so a library path containing spaces is passed as one argument
pixano-explorer "$1"

In [None]:
notebook.display()