
The KantoData class
===================

Parameters
----------

Printing a dataset shows a summary of its contents. You can do the same
with its parameters or directories:

.. code-block:: python
    :linenos:

    print(dataset.parameters)

Most computationally intensive methods come in three versions:

- ``method``: operates on a single element.
- ``method_r``: operates on n elements and is meant to run remotely.
- ``method_parallel``: parallelised or distributed across a ray cluster.
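
The naming convention for the three method versions in this section can be
illustrated with a toy example. This is only a sketch of the pattern, not
pykanto code: the ``duration*`` functions are hypothetical, and a thread pool
from the standard library stands in for the ray cluster so that the example
runs anywhere.

.. code-block:: python

    from __future__ import annotations

    from concurrent.futures import ThreadPoolExecutor


    def duration(onset: float, offset: float) -> float:
        """``method``: process a single element."""
        return offset - onset


    def duration_r(pairs: list[tuple[float, float]]) -> list[float]:
        """``method_r``: process a batch of n elements (one unit of remote work)."""
        return [duration(on, off) for on, off in pairs]


    def duration_parallel(
        pairs: list[tuple[float, float]], n_chunks: int = 2
    ) -> list[float]:
        """``method_parallel``: split the input into chunks and dispatch each one."""
        size = max(1, -(-len(pairs) // n_chunks))  # ceiling division
        chunks = [pairs[i:i + size] for i in range(0, len(pairs), size)]
        # Stand-in for a ray cluster: a local thread pool runs the batches.
        with ThreadPoolExecutor() as pool:
            results = pool.map(duration_r, chunks)  # keeps the chunk order
            return [d for chunk in results for d in chunk]


    pairs = [(0.0, 0.5), (1.0, 1.25), (2.0, 2.75)]
    durations = duration_parallel(pairs)

The parallel version only orchestrates: it delegates each chunk to the batch
version, which in turn calls the single-element version.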


To get some basic info on the data in the dataset:

.. code-block:: python
    :linenos:
    
    # Check dataset length
    dataset.sample_info()

.. code-block:: none

    Total length: 800
    Vocalisations: 740
    Noise: 60
    Unique IDs: 3

Plot a summary of the dataset:

.. code-block:: python
    :linenos:

    # Plot some information about the dataset
    dataset.plot_summary(variable='all')


Check sample size per individual ID in the dataset:

.. code-block:: python
    :linenos:

    dataset.vocs['ID'].value_counts()

.. code-block:: none

    B119    159
    B108    157
    B163    134
    B226    134
    B216    117
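
Counts like these are often used to keep only the individuals with enough
samples. A minimal sketch with plain pandas follows; the small ``vocs`` table
and the ``min_n`` threshold are made up for illustration and merely stand in
for ``dataset.vocs``.

.. code-block:: python

    import pandas as pd

    # Hypothetical stand-in for ``dataset.vocs``: one row per vocalisation.
    vocs = pd.DataFrame({"ID": ["B119", "B119", "B119", "B108", "B108", "B163"]})

    # value_counts() sorts IDs from most to least frequent.
    counts = vocs["ID"].value_counts()

    # Keep only individuals with at least ``min_n`` vocalisations (assumed value).
    min_n = 2
    well_sampled = counts[counts >= min_n].index.tolist()
    subset = vocs[vocs["ID"].isin(well_sampled)]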
    


You can also create a dataset for which derived data (e.g. spectrograms)
already exists. This might happen if, say, creating a dataset fails but
at least some spectrogram files were saved successfully:

.. code-block:: python
    :linenos:

    DATASET_ID = "BIGBIRD"
    dataset = KantoData(DATASET_ID, DIRS, parameters=params,
                        overwrite_dataset=True, overwrite_data=False)
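
The general idea of reusing files that a previous run already saved can be
sketched as follows. This is not pykanto's implementation and may not match
its exact behaviour: ``build_derived`` is a hypothetical helper, and the way it
treats ``overwrite_data`` is an assumption made for illustration.

.. code-block:: python

    import tempfile
    from pathlib import Path


    def build_derived(keys, out_dir: Path, overwrite_data: bool = False) -> list:
        """Write one file per key, skipping existing files unless overwriting."""
        written = []
        for key in keys:
            target = out_dir / f"{key}.txt"
            if target.exists() and not overwrite_data:
                continue  # reuse the file saved by an earlier run
            target.write_text(f"derived data for {key}")
            written.append(key)
        return written


    with tempfile.TemporaryDirectory() as tmp:
        out_dir = Path(tmp)
        # Pretend an earlier, failed run managed to save this file.
        (out_dir / "a.txt").write_text("saved before the failure")
        written = build_derived(["a", "b"], out_dir)
        kept = (out_dir / "a.txt").read_text()

In this sketch only the missing file is written, and the one left over from
the earlier run is kept untouched.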


.. note::
    You can use any matplotlib colourmap here via the ``cmap`` argument.
    See `colourmaps`_.

.. _colourmaps: https://matplotlib.org/stable/tutorials/colors/colormaps.html
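
As a quick illustration of what a ``cmap`` argument accepts, the sketch below
uses matplotlib directly rather than any pykanto plotting method. The random
array is a made-up placeholder for a spectrogram, and ``"magma"`` is just one
of the registered colourmap names.

.. code-block:: python

    import matplotlib

    matplotlib.use("Agg")  # headless backend, so no display is needed

    import matplotlib.pyplot as plt
    import numpy as np

    # Placeholder data standing in for a spectrogram (frequency x time).
    rng = np.random.default_rng(0)
    spec = rng.random((64, 128))

    fig, ax = plt.subplots()
    # Any registered colourmap name can be passed as ``cmap``.
    image = ax.imshow(spec, origin="lower", aspect="auto", cmap="magma")
    cmap_name = image.get_cmap().name
    plt.close(fig)

``plt.colormaps()`` returns the names of all registered colourmaps if you want
to browse the options.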




Load an existing dataset:

.. code-block:: python

    # Open an existing dataset
    from pykanto.utils.read import load_dataset

    DATASET_ID = 'BIG_BIRD'
    out_dir = DIRS.DATA / "datasets" / DATASET_ID / f"{DATASET_ID}.db"
    dataset = load_dataset(out_dir)

Save the dataset as a .csv file, which is recommended as a backup:

.. code-block:: python

    csv_dir = dataset.DIRS.DATASET.parent
    dataset.to_csv(csv_dir)

Save any new metadata you have generated (vocalisation type labels and
onsets/offsets, for example) to the original .json files, as a backup or to
use with other software:

.. code-block:: python

    from pykanto.utils.write import save_to_jsons

    save_to_jsons(dataset)