## DebiAI project creation notebook

This notebook will:

- Convert the Exathlon dataset to a DataFrame format
- Connect to DebiAI
- Create a new project
- Add the dataset to the DebiAI project

Install the required packages before running this notebook:

```bash
pip install -r requirements.txt
pip install debiai
```


### Convert the Exathlon dataset to a DataFrame format

In [4]:
from datasets import create_dataset

# This function will convert the Exathlon dataset into a Pandas DataFrame
dataset = create_dataset()
print(dataset.shape)

(2474, 196)


### Connecting to DebiAI

In [6]:
from debiai import debiai

DEBIAI_URL = "http://localhost:3000/"  # Change this to the URL of your local DebiAI instance
PROJECT_NAME = "Exathlon"

In [7]:
# Connect to the DebiAI server
my_debiai = debiai.Debiai(DEBIAI_URL)
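
If the connection step fails, a quick way to narrow the problem down is to check that something is answering at `DEBIAI_URL` at all. The sketch below uses only the Python standard library; `is_reachable` is a hypothetical helper written for this notebook, not part of the `debiai` package, and it only tells you that an HTTP server responds, not that the server is actually DebiAI.

```python
import http.client
import urllib.error
import urllib.request


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at `url` (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status code
        return True
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure, malformed URL,
        # or a response that is not valid HTTP
        return False
```

Calling `is_reachable(DEBIAI_URL)` before `debiai.Debiai(DEBIAI_URL)` makes it easier to tell a server that is not running apart from a wrong URL or port.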

### Creating a new DebiAI project

In [8]:
# Create the DebiAI project
dataset_project = my_debiai.create_project(PROJECT_NAME)

In [None]:
# Set the project block structure
# The block structure describes the layout of the dataset:
# it defines the type and category of each column in the project.

# The Exathlon structure is as follows:
# - First column: timestamp
# - Last column: ground truth (is_anomaly)
# - The columns in between are the sensor data: inputs (will be grouped)

# Build the inputs
inputs = []
for column in dataset.columns[1:-1]:
    inputs.append({"name": column, "type": "number", "group": "sensors"})

# This block structure is specific to the Exathlon dataset
# It follows the generated DataFrame structure
block_structure = [
    {
        "name": "timestamp",
        "inputs": inputs,
        "groundTruth": [
            {"name": "is_anomaly", "type": "number"},
        ],
    }
]

# Add the dataset block structure to the DebiAI project
dataset_project.set_blockstructure(block_structure)

The project columns should now be visible in the dashboard:
![DebiAI project columns](course_images/debiai_project_columns.png)
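
The block structure construction above can also be wrapped in a small function, which makes the assumptions about the column layout explicit. This is a minimal sketch: `build_block_structure` is a hypothetical helper, and it assumes that the first column is named `timestamp`, the last one `is_anomaly`, and that every column in between is a numeric sensor reading. The sensor names in the example are invented.

```python
def build_block_structure(columns, block_name="timestamp",
                          groundtruth="is_anomaly", group="sensors"):
    """Build a single-block structure from an ordered list of column names.

    Assumed layout: first column = block name, last column = ground truth,
    everything in between = numeric inputs placed in one group.
    """
    columns = list(columns)
    if len(columns) < 3:
        raise ValueError("Expected at least one input column between "
                         "the timestamp and the ground truth")
    if columns[0] != block_name or columns[-1] != groundtruth:
        raise ValueError(
            f"Expected first column {block_name!r} and last column "
            f"{groundtruth!r}, got {columns[0]!r} and {columns[-1]!r}"
        )

    inputs = [
        {"name": name, "type": "number", "group": group}
        for name in columns[1:-1]
    ]
    return [
        {
            "name": block_name,
            "inputs": inputs,
            "groundTruth": [{"name": groundtruth, "type": "number"}],
        }
    ]


# Example with invented sensor names
example = build_block_structure(
    ["timestamp", "sensor_a", "sensor_b", "is_anomaly"]
)
print([column["name"] for column in example[0]["inputs"]])
```

With the real dataset, the call would be `build_block_structure(dataset.columns)`. Failing early on an unexpected layout avoids silently registering the wrong column as ground truth.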

### Adding the dataset to the DebiAI project


In [None]:
# Add the dataset samples to the DebiAI project
dataset_project.add_samples_pd(dataset)
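
If the upload is rejected, a mismatch between the DataFrame columns and the block structure is a likely cause. The sketch below compares the two, assuming the upload expects one column per name declared in the block structure, including the block name itself. `expected_columns` and `compare_columns` are hypothetical helpers, and only the `inputs` and `groundTruth` categories used in this notebook are considered.

```python
def expected_columns(block_structure, categories=("inputs", "groundTruth")):
    """Collect every column name declared in a block structure."""
    names = []
    for block in block_structure:
        # Assumption: the block name itself is expected as a column
        names.append(block["name"])
        for category in categories:
            names.extend(column["name"] for column in block.get(category, []))
    return names


def compare_columns(columns, block_structure):
    """Return (missing, extra) column names, each sorted alphabetically."""
    expected = set(expected_columns(block_structure))
    actual = set(columns)
    return sorted(expected - actual), sorted(actual - expected)
```

For this notebook, `compare_columns(dataset.columns, block_structure)` should return two empty lists when the DataFrame and the block structure agree.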

After refreshing the DebiAI dashboard, you should see that the project has been populated with the 2474 rows of the Exathlon dataset (the row count printed by `dataset.shape` above):
![DebiAI project populated](course_images/debiai_project_nb_samples.png)
You can now start analyzing the dataset in the DebiAI dashboard.
