# Introduction
The Python web API implementation in the brevettiai package provides a lightweight programmatic interface to all the functionality that the [Brevetti AI platform](https://platform.brevetti.ai) offers.

This enables high level access for
* automation of tasks on the platform
* tagging of datasets or models
* dataset management
* managing models etc...

This document shows how the API can be used to access datasets and create a model training job.

Web access is granted through your platform user account, allowing you to automate tasks on the platform. In Python this is done through the **BrevettiAI** object.

# Brevetti AI package installation and imports
Install brevettiai using the pip package manager.

In [None]:
pip install brevettiai

In [None]:
# Set up logging: DEBUG for our own code, WARNING for noisy third-party libraries
import logging
log = logging.getLogger(__name__)
logging.basicConfig()
log.root.setLevel(logging.DEBUG)
logging.getLogger("urllib3").setLevel(logging.WARNING)
logging.getLogger("tensorflow").setLevel(logging.WARNING)
logging.getLogger("matplotlib").setLevel(logging.WARNING)
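
The same pattern can be wrapped in a small helper. This is a minimal sketch using only the standard library; the helper name `quiet_loggers` is an assumption made for illustration and is not part of brevettiai.

```python
import logging


def quiet_loggers(names, level=logging.WARNING):
    """Raise the threshold of the named loggers so only `level` and above pass."""
    for name in names:
        logging.getLogger(name).setLevel(level)


logging.basicConfig()
logging.getLogger().setLevel(logging.DEBUG)  # keep our own messages verbose
quiet_loggers(["urllib3", "tensorflow", "matplotlib"])

# Child loggers without their own level inherit it from the closest configured parent
child_level = logging.getLogger("urllib3.connectionpool").getEffectiveLevel()
print(child_level == logging.WARNING)
```

Setting the level on the parent logger is enough because loggers such as `urllib3.connectionpool` fall back to the level of `urllib3` unless they are configured separately.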

# API: BrevettiAI Login

High level access for automation of tasks on the platform, tagging, dataset management, models, etc...

## Platform Login
As on the web page, a login is valid for 60 minutes, after which you need to log in again.
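
For long-running scripts it can help to track the age of the login. The following is a minimal sketch assuming the 60 minute limit stated above; `SessionClock` and its method names are hypothetical and not part of brevettiai.

```python
import time

SESSION_LIFETIME_S = 60 * 60  # 60 minutes, as stated above


class SessionClock:
    """Tracks the age of a login so a script can log in again in time."""

    def __init__(self, lifetime_s=SESSION_LIFETIME_S, margin_s=60, now=time.monotonic):
        self._lifetime_s = lifetime_s
        self._margin_s = margin_s  # log in again slightly before the limit
        self._now = now  # injectable clock, useful for testing
        self._logged_in_at = now()

    def mark_login(self):
        """Record that a fresh login just happened."""
        self._logged_in_at = self._now()

    def needs_login(self):
        """True once the session is within `margin_s` of its lifetime."""
        age = self._now() - self._logged_in_at
        return age >= self._lifetime_s - self._margin_s


clock = SessionClock()
print(clock.needs_login())
```

A script could check `clock.needs_login()` before a batch of calls, create a new `BrevettiAI()` object when it returns True, and then call `clock.mark_login()`.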

In [2]:
# Imports and setup
from brevettiai.platform import BrevettiAI

web = BrevettiAI()

help(web)

Help on PlatformAPI in module brevettiai.platform.web_api object:

class PlatformAPI(builtins.object)
 |  PlatformAPI(username=None, password=None, host=None, remember_me=False)
 |  
 |  Methods defined here:
 |  
 |  __init__(self, username=None, password=None, host=None, remember_me=False)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  create(self, obj: Union[brevettiai.platform.models.dataset.Dataset, brevettiai.platform.models.tag.Tag, brevettiai.platform.models.web_api_types.Model, brevettiai.platform.models.web_api_types.Report], **kwargs)
 |  
 |  create_model(self, name, datasets, settings: brevettiai.platform.models.job.JobSettings = None, model_type=None, tags=None, application: brevettiai.platform.models.web_api_types.Application = None)
 |      Create a model on the platform
 |      
 |      Args:
 |          name:
 |          datasets:
 |          settings:
 |          model_type:
 |          tags:
 |          application:
 |      
 |      

# API: Element access, list datasets, tags, models...

With the web object you can list, manage and edit elements on the web page.
Most of the functions take an id, the GUID of the object to manipulate. To get all elements instead, use None as the id.

Example: to list datasets, tags, and models, call get_... with no id (id=None)

In [3]:
datasets = web.get_dataset()
tags = web.get_tag()
models = web.get_model()

# List the first 10 dataset names
[d.name for d in datasets][:10]

['NeurIPS 2018',
 'NeurIPS 2018 large',
 'Blood Cell Images',
 'Agar plates',
 'NeurIPS vials TRAIN']

For a single dataset, model, or other element, call the get_... functions with an id

In [4]:
dataset = web.get_dataset(datasets[0].id)
dataset

Dataset(id='21263db3-1c9b-456b-be1d-ecfa2afb5d99', bucket='s3://data.criterion.ai/21263db3-1c9b-456b-be1d-ecfa2afb5d99', name='NeurIPS 2018', locked=False, reference='Batch HW0001', notes='', tags=[])
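
Since the get_... functions work with ids, a common step is to go from a known name to the matching element in a listing. The sketch below uses a stand-in `DatasetStub` class with two of the fields visible in the output above; the ids `"a"` and `"b"` are placeholders, and `find_by_name` is a hypothetical helper rather than a brevettiai function.

```python
from dataclasses import dataclass


@dataclass
class DatasetStub:
    """Stand-in carrying two of the fields shown in the Dataset output above."""
    id: str
    name: str


def find_by_name(items, name):
    """Return all items whose name matches exactly (names need not be unique)."""
    return [item for item in items if item.name == name]


# Placeholder listing; with the real API this would be the result of web.get_dataset()
listing = [
    DatasetStub(id="a", name="NeurIPS 2018"),
    DatasetStub(id="b", name="NeurIPS 2018 large"),
]
matches = find_by_name(listing, "NeurIPS 2018 large")
print([d.id for d in matches])
```

Returning a list rather than a single item keeps the caller aware that a name may match zero or several elements.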

# API: Customized Job Settings
Settings are essentially the serialized configuration of a training job algorithm.
Settings configure a training job by exposing parameters that a user can change. They are also included in the default job output, so the parameters of a training job are saved and can be tracked for comparison and audit purposes.

In [5]:
from brevettiai import Job, JobSettings


class MyAlgoObject(JobSettings):
    multiply_factor: float = 2.0
    enable: bool = True

    def __call__(self, x):
        factor = 1.0
        if self.enable:
            factor *= self.multiply_factor
        return x * factor


test_obj = MyAlgoObject(multiply_factor=3.0)

# Settings used for creating the job
print(test_obj, test_obj(2))

extra={} multiply_factor=3.0 enable=True 6.0
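
To illustrate what "serialized configuration" means in practice, here is a minimal sketch of a settings round trip using only the standard library. `AlgoSettings` is a stand-in built on `dataclasses`, not the `JobSettings` class from brevettiai, and the JSON layout shown is an assumption made for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class AlgoSettings:
    """Stand-in settings object with the same two parameters as above."""
    multiply_factor: float = 2.0
    enable: bool = True

    def __call__(self, x):
        return x * (self.multiply_factor if self.enable else 1.0)


original = AlgoSettings(multiply_factor=3.0)
serialized = json.dumps(asdict(original))  # text that could be stored with a job
restored = AlgoSettings(**json.loads(serialized))  # rebuild the settings later

print(serialized)
print(restored == original, restored(2))
```

Because the restored object compares equal to the original, the stored text is enough to reproduce the parameters of a run, which is what makes settings useful for comparison and audit.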


In [6]:
class MyJob(Job):
    settings: MyAlgoObject
    
    def run(self):
        # Override this method; it is called when the job is started
        print(f"Run my custom code using custom parameters : {self.settings.__dict__}")
        print(f"Result on input 2.0: {self.settings(2.0)}")
        # Return the path to model artifacts to upload after the job completes
        return None
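
The relationship between a user-defined `run` and the method that starts the job can be sketched as a template method. The classes below are stand-ins written for illustration under that assumption; they do not reproduce the actual `Job` implementation in brevettiai.

```python
class JobSketch:
    """Stand-in base class: start() wraps the user-defined run()."""

    def __init__(self, settings):
        self.settings = settings
        self.model_path = None

    def run(self):
        raise NotImplementedError("Subclasses provide the actual work")

    def start(self):
        # Call the subclass hook and keep whatever path it returns;
        # None means there are no model artifacts to upload.
        self.model_path = self.run()
        return self.model_path


class DoubleJob(JobSketch):
    def run(self):
        self.result = self.settings["value"] * 2
        return None


job_sketch = DoubleJob({"value": 21})
job_sketch.start()
print(job_sketch.result, job_sketch.model_path)
```

Keeping the bookkeeping in the base class means a subclass only has to describe the work itself.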
        


# API: Create Model Training Job
To enter the job context, create a model either on the platform or programmatically via the web API.

The following code selects a dataset (the second one in the listing) and creates a model (job) with access to that dataset.
The model type (the model_type argument) is the id of a model type on the platform to use.
After the call, the model is available on the website, along with an S3 bucket for the artifacts your job outputs.


When creating a model you can include datasets, tags, and settings that define your model.

In [7]:
# Datasets to add to the created job
datasets = web.get_dataset()[1:2]

model = web.create_model(name=f'Test {web.user["firstName"]} {web.user["lastName"]}',
                         settings=test_obj,
                         datasets=datasets)
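
The slice `[1:2]` used above yields a list with at most one element, and it yields an empty list rather than an error when there is no second entry. The sketch below shows this with plain strings; `pick_one` is a hypothetical helper that fails loudly instead of passing an empty list along.

```python
names = ["NeurIPS 2018", "NeurIPS 2018 large", "Blood Cell Images"]

print(names[1:2])      # a one-element list holding the second entry
print(names[:1][1:2])  # an empty list when there is no second entry


def pick_one(items, index):
    """Return a one-element list for a non-negative index, or raise IndexError."""
    selection = items[index:index + 1]
    if not selection:
        raise IndexError(f"No item at position {index} (only {len(items)} available)")
    return selection


print(pick_one(names, 1))
```

A check like this can be useful before creating a model, since an empty dataset list would otherwise go unnoticed until later.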

## Start job

The model id and the model API key give you access through the Python SDK to read data, upload artifacts, and more. To enable this, model training must be started; this is the same as selecting "Simulate training" on the platform.

In [8]:
# Starting training in simulate mode
job = web.initialize_training(model=model, job_type=MyJob)
print(f"Model url: {web.host}/models/{model.id} (Please check it out :)\n")
print("To access data and model through python SDK use the following")
print(f"Model id: {model.id}")
print(f"Model api key (invalid when job is completed, or model is deleted): {model.api_key}")

INFO:brevettiai.platform.models.job:<class '__main__.MyJob'> initialized


Model url: https://platform.brevetti.ai/models/39ae1701-7272-474a-a9b6-7702037dfb93 (Please check it out :)

To access data and model through python SDK use the following
Model id: 39ae1701-7272-474a-a9b6-7702037dfb93
Model api key (invalid when job is completed, or model is deleted): JTJA4JkW70PbG0LOjD5rhDfE


In [9]:
job.start()

INFO:brevettiai.platform.models.job:Uploading output.json to s3://data.criterion.ai/39ae1701-7272-474a-a9b6-7702037dfb93/artifacts/output.json


Run my custom code using custom parameters : {'extra': {'extra': {}}, 'multiply_factor': 3.0, 'enable': True}
Result on input 2.0: 6.0


INFO:brevettiai.platform.models.job:Uploading output.json to s3://data.criterion.ai/39ae1701-7272-474a-a9b6-7702037dfb93/artifacts/output.json
INFO:brevettiai.platform.models.job:Job completed: modelPath=


# API: Create dataset and upload data
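Datasets can be created and accessed from the command line, or from code. The cells below copy one image from the job's first dataset to a local folder and then upload that folder as a new dataset with the `brevettiai.utils.upload_data` command-line module.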



In [13]:
import os

dataset = job.datasets[0]
samples = dataset.get_image_samples()
file_name_0 = samples.path.iloc[0].split("/")[-1]
os.makedirs("test_upload", exist_ok=True)
job.io.copy(samples.path.iloc[0], os.path.join("test_upload", file_name_0))


INFO:brevettiai.platform.models.dataset:Getting image samples from dataset 'NeurIPS 2018 large' [https://platform.brevetti.ai/data/4679259e-7a0a-4e85-90cf-a52f3451cf38]
INFO:brevettiai.platform.models.dataset:Contents: {('good',): 97, ('missing_cap',): 96, ('failed_cap',): 94}


150582
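
The cell above takes the file name from the last `/`-separated part of the sample path. A minimal sketch of the same step as a reusable function follows; `local_target` is a hypothetical helper and the remote path is an invented example, not a path from the platform.

```python
import os
import posixpath


def local_target(remote_path, folder):
    """Local destination for a remote file, keeping only its file name."""
    file_name = posixpath.basename(remote_path)  # part after the last "/"
    if not file_name:
        raise ValueError(f"{remote_path!r} does not end in a file name")
    return os.path.join(folder, file_name)


# Hypothetical remote path, for illustration only
print(local_target("s3://example-bucket/data/good/img_001.bmp", "test_upload"))
```

Checking for an empty file name guards against paths that end in `/`, where both `split("/")[-1]` and `posixpath.basename` return an empty string.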

In [23]:
os.system("python -m brevettiai.utils.upload_data test_upload --name \"Test dataset: NeurIPS demo data single image\" ")

2
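
A non-zero return value from `os.system`, such as the one printed above, generally signals that the command did not finish successfully. The sketch below builds the same command as an argument list, which avoids escaping quotes by hand, and shows a runner that raises on failure. The helper names are assumptions; the module name and the `--name` option are taken from the cell above, and the upload itself is not executed here.

```python
import subprocess
import sys


def build_upload_command(folder, name):
    """Argument list for the upload module used above; no shell quoting needed."""
    return [sys.executable, "-m", "brevettiai.utils.upload_data", folder, "--name", name]


def run_checked(command):
    """Run a command and raise CalledProcessError on a non-zero exit status."""
    return subprocess.run(command, check=True, capture_output=True, text=True)


command = build_upload_command("test_upload", "Test dataset: NeurIPS demo data single image")
print(command[1:])
```

Calling `run_checked(command)` would start the upload with the interpreter that runs the notebook; on failure, the captured `stderr` of the raised exception shows why the command stopped.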

## NB: Clean up

### Delete job
If the job has not been deployed and you are, for example, just testing interfaces, you may delete it.

In [10]:
# NB: this deletes the model; there is no simple "undo" functionality for this
web.delete(model)

### Delete new dataset

In [None]:
new_dataset = web.get_dataset(name="Test dataset: NeurIPS demo data single image")
web.delete(new_dataset)