# Tutorial 1: A basic avatarization

In this tutorial, we will connect to a server to perform the avatarization of a dataset that does not require any pre-processing. We'll retrieve the anonymized dataset and the associated avatarization report. 

In [None]:
# This is the main file for the Avatar tutorial.
from avatars.manager import Manager
# The following are not necessary to run avatar but are used in this tutorial
from avatars.models import JobKind
from avatars.runner import Results

import pandas as pd
import os

url = os.environ.get("AVATAR_BASE_API_URL", "https://scaleway-prod.octopize.app/api")
username = os.environ.get("AVATAR_USERNAME")
password = os.environ.get("AVATAR_PASSWORD")
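
Because `os.environ.get` returns `None` when a variable is unset, a missing credential would otherwise only surface at authentication time. The following is a minimal sketch of an earlier check. It assumes a hypothetical helper, `missing_env_vars`, which is written for this tutorial and is not part of the `avatars` package.

```python
import os


def missing_env_vars(names, environ=None):
    """Return the names from `names` that are unset or empty in `environ`."""
    # Hypothetical helper for this tutorial; not part of the avatars package.
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


# Check the two credentials read in the cell above.
missing = missing_env_vars(["AVATAR_USERNAME", "AVATAR_PASSWORD"])
if missing:
    print("Set these environment variables before authenticating:", ", ".join(missing))
```

Passing a plain dictionary as `environ` makes the helper easy to try out without touching the real environment.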

In [None]:
# Change this to your actual server endpoint, e.g. base_url="https://avatar.company.com"
manager = Manager(base_url=url)
# Authenticate with the server
manager.authenticate(username, password)

In [None]:
# Verify that we can connect to the API server
manager.auth_client.health.get_health()

## Loading data

We recommend loading your file as a pandas dataframe. This lets you check your data before avatarization and pre-process it if required.

In this tutorial, we use the simple and well-known `iris` dataset to demonstrate the main steps of an avatarization.

In [None]:
df = pd.read_csv("fixtures/iris.csv")

In [None]:
df
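
As an illustration of checking data before avatarization, the sketch below looks at the shape, the missing values and the column types of a dataframe. The small `sample` frame is a hypothetical stand-in for `df`: its column names and values are invented for the example and may not match the fixture file.

```python
import pandas as pd

# Hypothetical stand-in for `df`; the values are made up for illustration.
sample = pd.DataFrame(
    {
        "sepal_length": [5.1, 4.9, None],
        "species": ["setosa", "setosa", "virginica"],
    }
)

n_rows, n_cols = sample.shape              # number of records and variables
missing_per_column = sample.isna().sum()   # count of missing values per column
column_types = sample.dtypes               # inferred type of each column

print(n_rows, n_cols)
print(missing_per_column)
print(column_types)
```

The same three calls can be applied directly to `df` once it is loaded.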

In [None]:
from avatars.runner import Runner

# The runner is the object that will be used to upload data to the server and run the avatarization
runner = manager.create_runner()

# Then upload the data; you can pass either a pandas dataframe or a file
runner.add_table("iris", df)

## Creating and launching an avatarization job

In [None]:
runner.set_parameters("iris", k=5)

In [None]:
avatarization_job = runner.run()  # by default, all jobs are run: avatarization, privacy metrics, signal metrics and report
# You can also choose to run only the avatarization job, for example:
# avatarization_job = runner.run(jobs_to_run=[JobKind.standard])

## Retrieving the completed avatarization job

In [None]:
results = runner.get_all_results()

## Retrieving the avatars

In [None]:
runner.shuffled("iris").head()

## Retrieving the privacy metrics

In [None]:
runner.privacy_metrics("iris")

## Retrieving the signal metrics

In [None]:
runner.signal_metrics("iris")

## Downloading the report

In [None]:
runner.download_report('my_report.pdf')

# How to print an error message 
There are multiple types of errors, and we encourage you to have a look at our [documentation](https://python.docs.octopize.io/latest/user_guide.html#understanding-errors) to understand them.

The most common error occurs when server-side validation prevents a job from running.

The following section shows how to print an error message.

In [None]:
runner = manager.create_runner()
runner.add_table("iris", df)

runner.set_parameters("iris", k=500)  # k is too big (bigger than the dataset!)

runner.run(jobs_to_run=[JobKind.standard])

In [None]:
error_job = runner.get_job(JobKind.standard)
print(error_job.status)
print(error_job.exception)
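
Since the failure above comes from `k` being larger than the dataset, a local check before submitting can save a round trip to the server. The sketch below assumes a hypothetical helper, `k_exceeds_dataset`, written for this tutorial. It only covers the case shown here; the server may apply further validation rules that this check does not capture.

```python
def k_exceeds_dataset(k, n_rows):
    """Return True when k is larger than the number of rows in the dataset."""
    # Hypothetical helper for this tutorial; not part of the avatars package.
    return k > n_rows


# Assumption: 150 rows, as in the classic iris dataset. Use len(df) in practice.
n_rows = 150
for k in (5, 500):
    if k_exceeds_dataset(k, n_rows):
        print(f"k={k} is larger than the dataset ({n_rows} rows)")
    else:
        print(f"k={k} does not exceed the dataset size")
```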