# Downloading Public Datasets

## What you will learn in this tutorial:

* how to initialize a public dataset
* how to download and extract one of the available public datasets
* how to customize the default directory structure
* how to load the dataset into your working memory

## Preparations

We import `pymovements` under the alias `pm` for convenience.

In [None]:
import pymovements as pm

pymovements provides a library of publicly available datasets.

You can browse through the available dataset definitions here:
[Datasets](https://pymovements.readthedocs.io/en/latest/reference/pymovements.datasets.html#module-pymovements.datasets)

For this tutorial we will limit ourselves to the `ToyDataset` due to its minimal space requirements.

Other datasets can be downloaded by simply replacing `ToyDataset` with one of the other available datasets.

## Initialization

First we initialize the dataset by specifying the root data directory.
The dataset is then placed in a subdirectory of the root that is named after the dataset:

In [None]:
dataset = pm.datasets.ToyDataset(root='data/')

dataset.path

If you don't want to create this additional directory and would rather use the root path directly as your dataset path, you can specify `dataset_dirname` explicitly and set it to `.`:

In [None]:
pm.datasets.ToyDataset(root='data/', dataset_dirname='.').path
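To see why `.` gives you the root path itself, here is a minimal sketch of the path resolution using only the standard library. It is not pymovements' actual implementation: `resolve_dataset_path` is a hypothetical helper, and the default directory name `ToyDataset` is assumed from the description above.

```python
from pathlib import Path


def resolve_dataset_path(root: str, dataset_dirname: str = "ToyDataset") -> Path:
    """Hypothetical helper: join the root directory and the dataset directory name."""
    # pathlib drops single-dot components, so joining with "." leaves the root unchanged.
    return Path(root) / dataset_dirname


default_path = resolve_dataset_path("data/")
flat_path = resolve_dataset_path("data/", dataset_dirname=".")

print(default_path.as_posix())  # dataset gets its own subdirectory under the root
print(flat_path.as_posix())     # dataset path is the root itself
```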

## Downloading

The dataset will then be downloaded by calling:

In [None]:
dataset.download()

As we see from the download message, the dataset resource has been downloaded to a downloads directory.

You can get the path to this directory from the `downloads_rootpath` attribute:

In [None]:
dataset.downloads_rootpath

You can also specify a custom directory name during initialization:

In [None]:
pm.datasets.ToyDataset(root='data/', downloads_dirname='my_downloads').downloads_rootpath

## Extracting

You can then extract your downloaded data by calling:

In [None]:
dataset.extract()

Your data is now extracted to the following directory:

In [None]:
dataset.raw_rootpath
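Conceptually, extraction unpacks an archive from the downloads directory into a separate directory for raw data. The sketch below illustrates that idea with the standard library in a temporary directory. It is an assumption-laden illustration rather than what `extract()` does internally: the directory names `downloads` and `raw`, the archive name `toy.zip`, and the file inside it are all made up for the example.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    dataset_path = Path(tmp) / "ToyDataset"
    downloads_dir = dataset_path / "downloads"  # assumed directory name
    raw_dir = dataset_path / "raw"              # assumed directory name
    downloads_dir.mkdir(parents=True)

    # Create a stand-in archive so the example needs no network access.
    archive = downloads_dir / "toy.zip"         # hypothetical archive name
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("data/trial_0_1.csv", "timestamp,x,y\n0,1.0,2.0\n")

    # Unpack the archive; the target directory is created if it is missing.
    shutil.unpack_archive(archive, raw_dir)

    # Collect the extracted files relative to the raw data directory.
    extracted = sorted(
        path.relative_to(raw_dir).as_posix() for path in raw_dir.rglob("*.csv")
    )

print(extracted)
```

Keeping downloaded archives and extracted files in separate directories means the extracted data can be deleted and recreated without downloading again.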

## Loading into memory

Finally we can load the data into our working memory by using the common `load()` method:

In [None]:
dataset.load()

Let's verify that we have correctly scanned the dataset files:

In [None]:
dataset.fileinfo

Wonderful, all of our data has been downloaded, extracted, and loaded successfully!
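If you are curious what scanning dataset files can look like, the following sketch walks a directory and parses metadata from file names with a regular expression. It only illustrates the general idea and is not the code behind `dataset.fileinfo`: the function `scan_files`, the file name pattern, and the field names `text_id` and `page_id` are assumptions chosen for this example.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical file name pattern with two numeric fields.
FILENAME_PATTERN = re.compile(r"trial_(?P<text_id>\d+)_(?P<page_id>\d+)\.csv")


def scan_files(raw_dir: Path) -> list[dict]:
    """Return one record per file whose name matches the pattern."""
    records = []
    for path in sorted(raw_dir.rglob("*.csv")):
        match = FILENAME_PATTERN.fullmatch(path.name)
        if match is None:
            continue  # skip files that do not follow the naming scheme
        records.append({
            "text_id": int(match["text_id"]),
            "page_id": int(match["page_id"]),
            "filepath": path.relative_to(raw_dir).as_posix(),
        })
    return records


with tempfile.TemporaryDirectory() as tmp:
    raw_dir = Path(tmp)
    # Create empty stand-in files; the .txt file is ignored by the scan.
    for name in ("trial_0_1.csv", "trial_0_2.csv", "notes.txt"):
        (raw_dir / name).write_text("")
    records = scan_files(raw_dir)

for record in records:
    print(record)
```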

## What you have learned in this tutorial:

* how to initialize a public dataset
* how to download and extract dataset resources
* how to customize the default directory structure
* how to load the dataset into your working memory