# OpenML Datasets

In [None]:
!pip install -U pip numpy pandas jupyter

In [None]:
!pip install -U openml

## Authentication

You first need to find your API key.

* Create a free OpenML account on https://www.openml.org.
* Log in, click your avatar/picture, and open 'API authentication'.
* Your API key is a secret 32-character string.

You can copy this API key into your code (but only if you never share it):

In [None]:
# Set your OpenML API key.
import openml as oml
# Replace the following string with your API key
oml.config.apikey = 'YOUR_API_KEY_HERE'
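
A key pasted into a notebook is easy to leak when the notebook is shared. As a minimal sketch of one alternative, the key can be read from an environment variable instead. The variable name `OPENML_API_KEY` and the helper `load_api_key` are hypothetical choices for this example, not part of openml-python; the length check relies only on the 32-character format described above.

```python
import os


def load_api_key(env_var="OPENML_API_KEY"):
    """Read the API key from an environment variable (hypothetical name)."""
    key = os.environ.get(env_var, "").strip()
    # OpenML API keys are 32 characters long; reject anything else early.
    if len(key) != 32:
        raise ValueError(
            f"Expected a 32-character API key in ${env_var}, "
            f"got {len(key)} characters"
        )
    return key


# Usage (uncomment once the variable is set in your shell):
# import openml as oml
# oml.config.apikey = load_api_key()
```

With this approach the notebook itself never contains the secret, so it can be shared or committed without editing.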

In [None]:
# Make sure we are using the right server
oml.config.server = 'https://www.openml.org/api/v1/xml'
# The API key was set in the previous cell; never leave a real key in a shared notebook
# Suppress sklearn warnings
import warnings
warnings.simplefilter(action="ignore", category=DeprecationWarning)

## Import Dataset

`datasets.get_dataset(data_id)` returns an `OpenMLDataset` object containing the dataset and its meta-data.

In [None]:
# Datasets are fetched by their numeric OpenML ID.
# Set ID to the ID of your own dataset before running this cell.

dataset = oml.datasets.get_dataset(ID)

# Print a summary
print(
    f"This is dataset '{dataset.name}', the target feature is "
    f"'{dataset.default_target_attribute}'"
)
print(f"URL: {dataset.url}")
print(dataset.description[:500])

#### Get the data itself
`OpenMLDataset.get_data()` returns the actual data. With `dataset_format="array"`, as used here, it comes back as NumPy arrays.

In [None]:
import pandas as pd

X, y, categorical_indicator, attribute_names = dataset.get_data(
    dataset_format="array", target=dataset.default_target_attribute
)
eeg = pd.DataFrame(X, columns=attribute_names)
eeg["class"] = y
print(eeg[:10])
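
The cell above ignores `categorical_indicator`, so categorical columns end up in the DataFrame as plain numbers. A minimal sketch of how that flag list could be used to mark them, assuming it holds one boolean per entry in `attribute_names`. The helper `arrays_to_frame` and the demo arrays are made up for illustration and are not part of openml-python.

```python
import numpy as np
import pandas as pd


def arrays_to_frame(X, y, attribute_names, categorical_indicator,
                    target_name="class"):
    """Build a DataFrame and mark the flagged columns as categorical."""
    frame = pd.DataFrame(X, columns=attribute_names)
    for name, is_categorical in zip(attribute_names, categorical_indicator):
        if is_categorical:
            # Encoded category codes become a pandas 'category' column.
            frame[name] = frame[name].astype("category")
    frame[target_name] = y
    return frame


# Tiny made-up arrays standing in for the output of dataset.get_data()
X_demo = np.array([[0.5, 1.0], [1.5, 0.0], [2.5, 1.0]])
y_demo = np.array([0, 1, 0])
demo = arrays_to_frame(X_demo, y_demo, ["signal", "sensor"], [False, True])
print(demo.dtypes)
```

With real data, the same call would take `X`, `y`, `attribute_names` and `categorical_indicator` exactly as returned by `dataset.get_data()`.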

#### Get the meta-data
Every dataset comes with rich meta-data:
* name, version, date, creator, licence, description, ...
* `dataset.qualities` returns 100+ statistical data properties
* `dataset.features` returns all variables and their data types
* tags added by the OpenML community

In [None]:
vars(dataset)
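
`vars(dataset)` prints everything at once, which can be hard to scan. A more targeted look at the meta-data listed above might be sketched as follows, assuming `dataset.qualities` is a dict of quality name to value and `dataset.features` is a dict whose values expose `name` and `data_type`. The helper `summarize_metadata` is hypothetical, and the stand-in object holds made-up values so the sketch runs without a network connection.

```python
from types import SimpleNamespace


def summarize_metadata(dataset,
                       quality_names=("NumberOfInstances", "NumberOfFeatures")):
    """Return selected qualities and a {feature name: data type} mapping."""
    qualities = dataset.qualities or {}
    selected = {q: qualities.get(q) for q in quality_names}
    feature_types = {f.name: f.data_type for f in dataset.features.values()}
    return selected, feature_types


# Stand-in object with made-up values, shaped like the attributes assumed above
fake = SimpleNamespace(
    qualities={"NumberOfInstances": 3.0, "NumberOfFeatures": 2.0},
    features={
        0: SimpleNamespace(name="signal", data_type="numeric"),
        1: SimpleNamespace(name="class", data_type="nominal"),
    },
)
selected, feature_types = summarize_metadata(fake)
print(selected)
print(feature_types)
```

Passing the real `dataset` object instead of `fake` should work the same way if those assumptions hold for your installed version.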

## Further information
That's it. You now know the basics of working with OpenML datasets. If you have further questions, see:

[OpenML Documentation](https://docs.openml.org)  
[Python API Documentation and examples](https://openml.github.io/openml-python)