# Fetching the Brain Tumor Segmentation Dataset

In this notebook, we will learn:
- how we can use [MONAI Core APIs](https://github.com/Project-MONAI/MONAI) to download the brain tumor segmentation data from the [Medical Segmentation Decathlon](http://medicaldecathlon.com) challenge.
- how we can upload the dataset to Weights & Biases and use it as a dataset artifact.

## 🌴 Setup and Installation

First, let us install the latest versions of MONAI and Weights & Biases.

In [None]:
!pip install -q -U monai wandb

## 🌳 Initialize a W&B Run

We will start a new W&B run to track our experiment.

In [None]:
import wandb

wandb.init(
    project="brain-tumor-segmentation",
    entity="lifesciences",
    job_type="fetch_dataset"
)

## 🍁 Fetching the Dataset using MONAI

The [`monai.apps.DecathlonDataset`](https://docs.monai.io/en/stable/apps.html#monai.apps.DecathlonDataset) class lets us automatically download and extract the data for a task from the [Medical Segmentation Decathlon challenge](http://medicaldecathlon.com/) and generate items for training, validation, or testing. We will use this API in later notebooks to load and transform our datasets automatically.

In [None]:
import os

from monai.apps import DecathlonDataset

# Create the dataset directory if it does not already exist
os.makedirs("./dataset/", exist_ok=True)

# Download the brain tumor segmentation data (download=True) and load its training split
train_dataset = DecathlonDataset(
    root_dir="./dataset/",
    task="Task01_BrainTumour",
    section="training",
    download=True,
    cache_rate=0.0,
    num_workers=4,
)

# Load the validation split; the data is already on disk, so download=False
val_dataset = DecathlonDataset(
    root_dir="./dataset/",
    task="Task01_BrainTumour",
    section="validation",
    download=False,
    cache_rate=0.0,
    num_workers=4,
)

# Load the test split, again reusing the files downloaded above
test_dataset = DecathlonDataset(
    root_dir="./dataset/",
    task="Task01_BrainTumour",
    section="test",
    download=False,
    cache_rate=0.0,
    num_workers=4,
)

In [None]:
print("Train Set Size:", len(train_dataset))
print("Validation Set Size:", len(val_dataset))
print("Test Set Size:", len(test_dataset))
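
The three sizes printed above come from two lists in the task's `dataset.json`. As far as the MONAI documentation describes it, the `training` and `validation` sections are both carved out of the labelled training list, with `val_frac` (default `0.2`) setting the validation share and `seed` keeping the split repeatable, while the `test` section uses the separate unlabelled list. The sketch below illustrates that idea with the standard library only. The helper `split_training_list` and the file names are hypothetical, and MONAI's own shuffling is different, so the cases it selects will not match this sketch.

```python
import random


def split_training_list(items, val_frac=0.2, seed=0):
    """Shuffle `items` deterministically and carve off a validation share.

    Illustrative only: this mirrors the idea of a seeded fractional split,
    not MONAI's exact implementation.
    """
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)  # same seed -> same ordering
    val_length = int(len(items) * val_frac)
    val = [items[i] for i in indices[:val_length]]
    train = [items[i] for i in indices[val_length:]]
    return train, val


# Hypothetical stand-in for the labelled "training" list of a Decathlon task
labelled_cases = [
    {"image": f"imagesTr/case_{i:03d}.nii.gz", "label": f"labelsTr/case_{i:03d}.nii.gz"}
    for i in range(10)
]

train_items, val_items = split_training_list(labelled_cases)
print("train:", len(train_items), "validation:", len(val_items))
```

Because the split depends only on `val_frac` and the seed, loading the `training` and `validation` sections in separate calls (as in the cell above) should still give disjoint sets, provided both calls use the same values.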

## 💿 Upload the Dataset to W&B as an Artifact

[W&B Artifacts](https://docs.wandb.ai/guides/artifacts) can be used to track and version any serialized data as the inputs and outputs of your W&B Runs. For example, a model training run might take a dataset as input and produce a trained model as output.

![](https://docs.wandb.ai/assets/images/artifacts_landing_page2-b6bd49ea5db62eff00f582a95845fed9.png)

Let us now see how we can upload this dataset as a W&B artifact.

In [None]:
artifact = wandb.Artifact(name="decathlon_brain_tumor", type="dataset")
artifact.add_dir(local_path="./dataset/")
wandb.log_artifact(artifact)
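
Since `add_dir` adds every file under `./dataset/` to the artifact, it can help to know how much data that directory holds. It may contain both the downloaded archive and the extracted files, which would roughly double the upload. The following is a minimal standard-library sketch; `summarize_dir` is a hypothetical helper, and the demo runs on a throwaway directory with made-up file names rather than on the real dataset.

```python
import os
import tempfile


def summarize_dir(root):
    """Return (file_count, total_bytes) for every file under `root`."""
    file_count, total_bytes = 0, 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            file_count += 1
            total_bytes += os.path.getsize(os.path.join(dirpath, name))
    return file_count, total_bytes


# Demo on a temporary directory that mimics an archive plus an extracted folder
with tempfile.TemporaryDirectory() as demo_root:
    extracted = os.path.join(demo_root, "extracted")
    os.makedirs(extracted)
    with open(os.path.join(extracted, "volume.bin"), "wb") as f:
        f.write(b"\0" * 1024)
    with open(os.path.join(demo_root, "archive.tar"), "wb") as f:
        f.write(b"\0" * 2048)
    demo_count, demo_bytes = summarize_dir(demo_root)

print(f"{demo_count} files, {demo_bytes} bytes")
```

Calling `summarize_dir("./dataset/")` in this notebook would report the corresponding totals for the directory that was uploaded.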

Finally, we end the run by calling `wandb.finish()`.

In [None]:
wandb.finish()
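
Once the artifact has been logged, a later run can declare it as an input and download it instead of fetching the data again. The sketch below assumes the artifact name used above and the `latest` alias. `fetch_dataset_artifact` is a hypothetical helper, and `run` is assumed to be the object returned by `wandb.init(...)`.

```python
def fetch_dataset_artifact(run, name="decathlon_brain_tumor:latest"):
    """Mark the dataset artifact as an input of `run` and download it.

    Returns the local directory that the artifact files were downloaded to.
    """
    # use_artifact records the dependency between this run and the artifact
    artifact = run.use_artifact(name, type="dataset")
    # download() fetches the files and returns the local path
    return artifact.download()
```

In a later notebook, this might be called as `dataset_dir = fetch_dataset_artifact(wandb.init(project="brain-tumor-segmentation", job_type="train"))`, where the `job_type` value is only an example. The returned directory could then be passed as `root_dir` to `DecathlonDataset` with `download=False`.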