# Tabular Classification Example

## Introduction

In this notebook, we'll walk through a detailed example of how you can use Velour to evaluate classifications made on a tabular dataset. We'll use `sklearn`'s breast cancer dataset to make a binary prediction about whether a tumor is malignant or benign based on a table of descriptive features (e.g., mean radius and mean texture).

For a conceptual introduction to Velour, [check out our project overview](https://striveworks.github.io/velour/). For a higher-level example notebook, [check out our "Getting Started" notebook](https://github.com/Striveworks/velour/blob/main/examples/getting_started.ipynb).


## Defining Our Datasets

We start by fetching our dataset, dividing it into test/train splits, and uploading both sets to Velour.

In [1]:
from velour import Dataset, Model, Datum, Annotation, GroundTruth, Prediction, Label
from velour.enums import TaskType
from velour.client import Client

from sklearn.model_selection import train_test_split
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.metrics import classification_report
from tqdm import tqdm  # progress bars for the upload loops below

# connect to Velour API
client = Client("http://localhost:8000")

In [None]:
# load data from sklearn
dset = load_breast_cancer()
dset.feature_names

In [None]:
# split datasets
X, y, target_names = dset["data"], dset["target"], dset["target_names"]
X_train, X_test, y_train, y_test = train_test_split(X, y)

# show an example input
X_train.shape, y_train[:4], target_names

In [None]:
# create train dataset in Velour
velour_train_dataset = Dataset(client, "breast-cancer-train")

# create test dataset in Velour
velour_test_dataset = Dataset(client, "breast-cancer-test")

### Adding GroundTruths to our Dataset

Now that our two datasets exist in Velour, we can add `GroundTruths` to each dataset.

In [None]:
# format training groundtruths
training_groundtruths = [
    GroundTruth(
        datum=Datum(
            uid=f"train{i}",
        ),
        annotations=[
            Annotation(
                task_type=TaskType.CLASSIFICATION,
                labels=[Label(key="class", value=target_names[t])]
            )
        ]
    )
    for i, t in enumerate(y_train)
]

# format testing groundtruths
testing_groundtruths = [
    GroundTruth(
        datum=Datum(
            uid=f"test{i}",
        ),
        annotations=[
            Annotation(
                task_type=TaskType.CLASSIFICATION,
                labels=[Label(key="class", value=target_names[t])]
            )
        ]
    )
    for i, t in enumerate(y_test)
]

# add the training groundtruths
for gt in tqdm(training_groundtruths):
    velour_train_dataset.add_groundtruth(gt)

# add the testing groundtruths
for gt in tqdm(testing_groundtruths):
    velour_test_dataset.add_groundtruth(gt)

100%|██████████| 426/426 [00:07<00:00, 58.74it/s]
100%|██████████| 143/143 [00:02<00:00, 56.26it/s]


### Finalizing Our Datasets

Lastly, we finalize both datasets to prep them for evaluation.

In [None]:
velour_train_dataset.finalize()
velour_test_dataset.finalize()

## Defining Our Model

Now that our `Datasets` have been defined, we can describe our model in Velour using the `Model` object.

In [None]:
# fit an sklearn model to our data
pipe = make_pipeline(StandardScaler(), LogisticRegression())
pipe.fit(X_train, y_train)

# get predictions on both of our datasets
y_train_probs = pipe.predict_proba(X_train)
y_test_probs = pipe.predict_proba(X_test)

# show an example output
y_train_probs[:4]

In [None]:
# create our model in Velour
velour_model = Model(client, "breast-cancer-linear-model", delete_if_exists=True)

### Adding Predictions to Our Model

With our model defined in Velour, we can post predictions for each of our `Datasets` to our `Model` object. Each `Prediction` should contain a list of `Labels` describing the prediction and its associated confidence score. Since we're running a classification task, the confidence scores over all prediction classes should sum to (approximately) 1.

In [None]:

# define our predictions
training_predictions = [
    Prediction(
        datum=Datum(
            dataset=velour_train_dataset.name,
            uid=f"train{i}",
        ),
        annotations=[
            Annotation(
                task_type=TaskType.CLASSIFICATION,
                labels=[
                    Label(
                        key="class", 
                        value=target_names[j],
                        score=p,
                    )                        
                    for j, p in enumerate(prob)
                ]
            )
        ]
    )
    for i, prob in enumerate(y_train_probs)
]

testing_predictions = [
    Prediction(
        datum=Datum(
            dataset=velour_test_dataset.name,
            uid=f"test{i}",
        ),
        annotations=[
            Annotation(
                task_type=TaskType.CLASSIFICATION,
                labels=[
                    Label(
                        key="class",
                        value=target_names[j],
                        score=p,
                    )                        
                    for j, p in enumerate(prob)
                ]
            )
        ]
    )
    for i, prob in enumerate(y_test_probs)
]

# add the train predictions
for pred in tqdm(training_predictions):
    velour_model.add_prediction(pred)

# add the test predictions
for pred in tqdm(testing_predictions):
    velour_model.add_prediction(pred)
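
The requirement that each prediction's class scores sum to approximately 1 can be checked before uploading. The sketch below is a minimal, library-independent illustration: `scores_sum_to_one` and its tolerance are hypothetical and not part of Velour, and the probability rows are made-up values standing in for `predict_proba` output.

```python
import math


def scores_sum_to_one(scores, tol=1e-6):
    """Return True if the confidence scores sum to 1 within `tol`."""
    # math.fsum limits floating-point accumulation error
    return math.isclose(math.fsum(scores), 1.0, abs_tol=tol)


# made-up rows standing in for predict_proba output (one row per datum)
example_probs = [
    [0.98, 0.02],
    [0.25, 0.75],
    [0.60, 0.30],  # deliberately invalid: sums to 0.9
]

# collect the indices of rows whose scores do not sum to ~1
invalid_rows = [
    i for i, row in enumerate(example_probs) if not scores_sum_to_one(row)
]
print(invalid_rows)
```

Running a check like this over `y_train_probs` and `y_test_probs` before building the `Prediction` objects would surface malformed rows early.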

### Finalizing Our Model

Finally, we finalize our `Model` to prep it for evaluation.

In [10]:
velour_model.finalize_inferences(velour_train_dataset)
velour_model.finalize_inferences(velour_test_dataset)

## Evaluating Performance

With our `Dataset` and `Model` defined, we're ready to evaluate our performance and display the results. Note that we use the `wait_for_completion` method since all evaluations run as a postgres `BackgroundTask`; this method ensures that the evaluation finishes before we display the results.

In [12]:
train_eval_job = velour_model.evaluate_classification(velour_train_dataset)
train_eval_job.wait_for_completion()
results = train_eval_job.results()

results
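
Conceptually, waiting on a background job amounts to polling its status until it reaches a terminal state. The sketch below illustrates that general pattern only. It is not Velour's implementation, and `wait_until_done`, the `get_status` callable, and the status strings are all hypothetical names chosen for illustration.

```python
import time


def wait_until_done(get_status, interval=0.5, timeout=60.0):
    """Poll get_status() until it returns 'done' or 'failed', or time out."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in ("done", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still '{status}' after {timeout} seconds")
        time.sleep(interval)


# simulated job that reports 'pending' twice before finishing
_statuses = iter(["pending", "pending", "done"])
final_status = wait_until_done(lambda: next(_statuses), interval=0.01)
print(final_status)
```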

In [14]:
results.confusion_matrices

As a brief sanity check, we can compare Velour's outputs against `sklearn`'s own classification report. The two sets of results agree.

In [15]:
y_train_preds = pipe.predict(X_train)
print(classification_report(y_train, y_train_preds, digits=6, target_names=target_names))
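
To make the comparison concrete, the per-class numbers in a classification report can be reproduced by hand from true and predicted labels. The following is a minimal sketch with made-up labels. `per_class_metrics` is a hypothetical helper, and it does not assume anything about the structure of Velour's results object.

```python
def per_class_metrics(y_true, y_pred, label):
    """Compute precision, recall, and F1 for one class from two label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)

    # guard against division by zero when a class is never predicted or never present
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# made-up labels for illustration only
y_true = ["malignant", "malignant", "benign", "benign", "benign", "malignant"]
y_pred = ["malignant", "benign", "benign", "benign", "malignant", "malignant"]

precision, recall, f1 = per_class_metrics(y_true, y_pred, "malignant")
print(f"precision={precision:.6f} recall={recall:.6f} f1={f1:.6f}")
```

Comparing values computed this way against both the Velour metrics and the `sklearn` report is one way to confirm that all three agree.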