# Importing your datasets with Pixano

This notebook will help you import your datasets from various formats into the Pixano format.

This will allow you to access them with the Pixano Explorer and the Pixano Annotator.

## 1. Setting up

### Install dependencies

This notebook requires installing `pixano`.

If you are running this notebook on your own computer, we strongly recommend creating a dedicated virtual environment for Pixano, like so:

```shell
conda create -n pixano_env python=3.10
conda activate pixano_env
```

```shell
pip install pixano
```

If you are running this notebook in Google Colab, run the cell below to install `pixano`.

In [None]:
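try:
    # google.colab can only be imported inside Google Colab
    import google.colab
    ENV = "colab"
    !pip install pixano
except ImportError:
    # Not in Colab: assume a local Jupyter environment
    ENV = "jupyter"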

### Load dependencies

In [None]:
from pathlib import Path

from pixano.apps import ExplorerApp
from pixano.data import COCOLoader, ImageLoader

## 2. Importing a dataset

### Set the dataset

To get started, please provide information on your dataset (name, description, splits).

Select the path of your Pixano dataset library, and the target import directory for your dataset inside that library.

In [None]:
name = "COCO Instances"
description = "COCO Instances Dataset"
splits = ["train2017", "val2017"]

library_dir = Path("datasets/")
import_dir = library_dir / "coco_instances"

### Import the dataset

Here, you will define the source directories for your dataset's images and annotations, and launch the dataset import.

How Pixano handles **annotations**:
- Annotations will be **transformed to Pixano format** and stored in a database while **keeping the original files intact**.

How Pixano handles **media files**:
- By default, media files such as images and videos will be **referenced by their current path or URL**. This is the best option for **large datasets** and datasets on **remote servers** and **S3 buckets**.
- You can use the `portable` option to **move or download the media files** inside the Pixano format dataset. This is the best option for **smaller datasets**.
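
To make the difference between the two modes concrete, here is a minimal sketch using only the Python standard library. It is not Pixano's implementation: `resolve_media` and the `media` subfolder are hypothetical names chosen for illustration, and the portable branch copies the file, whereas Pixano may move or download it.

```python
import shutil
import tempfile
from pathlib import Path


def resolve_media(src: Path, dataset_dir: Path, portable: bool) -> str:
    """Return the location a dataset record would store for one media file.

    Hypothetical helper for illustration only, not part of Pixano.
    """
    if not portable:
        # Reference mode: keep pointing at the original location.
        return src.as_posix()
    # Portable mode: copy the file inside the dataset and store a relative path.
    media_dir = dataset_dir / "media"  # assumed folder name
    media_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, media_dir / src.name)
    return (Path("media") / src.name).as_posix()


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "source" / "0001.jpg"
    source.parent.mkdir()
    source.write_bytes(b"fake image bytes")
    dataset = Path(tmp) / "dataset"

    referenced = resolve_media(source, dataset, portable=False)
    copied = resolve_media(source, dataset, portable=True)
    copied_exists = (dataset / copied).exists()
```

In reference mode the dataset stays small but breaks if the original files move. In portable mode the dataset is self-contained at the cost of duplicating the media.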

#### a. Image-only dataset

If your dataset contains only images, you can use our predefined `ImageLoader` to import it to Pixano format.

In [None]:
input_dirs = {
    "image": Path("coco/media/image"),
}

loader = ImageLoader(name, description, splits)
loader.import_dataset(input_dirs, import_dir, portable=False)
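
Before launching a long import, it can help to check that the source folders are laid out as expected. The sketch below assumes each split is a subfolder of the image directory (for example `coco/media/image/train2017`). That layout is an assumption, so adapt the check to your own dataset. `missing_split_dirs` is a hypothetical helper, not a Pixano function.

```python
import tempfile
from pathlib import Path


def missing_split_dirs(media_dir: Path, splits: list[str]) -> list[str]:
    """List the splits that have no subfolder in media_dir.

    Assumes one subfolder per split; hypothetical helper, not part of Pixano.
    """
    return [split for split in splits if not (media_dir / split).is_dir()]


# Demonstration on a temporary folder that only contains the train split
with tempfile.TemporaryDirectory() as tmp:
    image_dir = Path(tmp) / "image"
    (image_dir / "train2017").mkdir(parents=True)
    missing = missing_split_dirs(image_dir, ["train2017", "val2017"])
```

With your own data, you would call `missing_split_dirs(input_dirs["image"], splits)` and fix any split it reports before importing.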

#### b. COCO-like image dataset

If your dataset contains images and annotations in COCO format, you can use our predefined `COCOLoader` to import it to Pixano format.

In [None]:
input_dirs = {
    "image": Path("coco/media/image"),
    "objects": Path("coco/annotations"),
}

loader = COCOLoader(name, description, splits)
loader.import_dataset(input_dirs, import_dir, portable=False)
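
If you are unsure whether your annotation files are COCO-like, the sketch below shows the core structure of a COCO instances file (`images`, `annotations`, `categories`) and checks that every annotation points to a known image and category. `check_coco` is a hypothetical helper written for this notebook. It does not reflect the checks `COCOLoader` performs, and the sample values are made up.

```python
REQUIRED_KEYS = ("images", "annotations", "categories")


def check_coco(coco: dict) -> list[str]:
    """Return a list of problems found in a COCO-style dict (empty if none).

    Hypothetical helper for illustration only, not part of Pixano.
    """
    problems = [
        f"missing top-level key: {key}" for key in REQUIRED_KEYS if key not in coco
    ]
    if problems:
        return problems
    image_ids = {image["id"] for image in coco["images"]}
    category_ids = {category["id"] for category in coco["categories"]}
    for ann in coco["annotations"]:
        if ann["image_id"] not in image_ids:
            problems.append(f"annotation {ann['id']}: unknown image_id")
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id")
    return problems


# Minimal made-up sample: one image, one category, one bounding box [x, y, w, h]
sample = {
    "images": [{"id": 1, "file_name": "0001.jpg", "width": 640, "height": 480}],
    "categories": [{"id": 18, "name": "dog"}],
    "annotations": [
        {
            "id": 100,
            "image_id": 1,
            "category_id": 18,
            "bbox": [10.0, 20.0, 200.0, 150.0],
            "area": 30000.0,
            "iscrowd": 0,
        }
    ],
}
problems = check_coco(sample)
```

To check a real file, load it with `json.load` and pass the resulting dict to `check_coco`.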

#### c. Custom format dataset

If your dataset contains media or annotations in a custom format, you will have to define your own loader to import it to Pixano format.

Please take a look at the `template_loader.py` file next to this notebook for inspiration on how to build your own.

Do not hesitate to reach out to us if you think Pixano could benefit from a loader for your dataset, and we will try to add it in a future version.

## 3. Browsing the dataset

With the import complete, you can now browse your dataset with the Pixano Explorer.

You can stop the Explorer app by restarting the notebook kernel.

In [None]:
explorer = ExplorerApp(library_dir)

In [None]:
explorer.display()