[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/PicselliaTeam/picsellia-notebooks/blob/master/training-notebooks/tensorflow2/detection-segmentation/PicselliaTrainingQuickStart.ipynb)

In [1]:
!pip install picsellia_tf2
!pip install picsellia

Collecting picsellia_tf2
  Using cached picsellia_tf2-0.10.27-py3-none-any.whl (1.6 MB)
Collecting tensorflow==2.6.0
  Using cached tensorflow-2.6.0-cp38-cp38-manylinux2010_x86_64.whl (458.4 MB)
Collecting tf-models-official==2.3.0
  Using cached tf_models_official-2.3.0-py2.py3-none-any.whl (840 kB)
Collecting pillow==7.2.0
  Using cached Pillow-7.2.0-cp38-cp38-manylinux1_x86_64.whl (2.2 MB)
Collecting avro-python3==1.9.2.1
  Using cached avro_python3-1.9.2.1-py3-none-any.whl
Collecting contextlib2==0.6.0.post1
  Using cached contextlib2-0.6.0.post1-py2.py3-none-any.whl (9.8 kB)
Collecting pycocotools==2.0.2
  Using cached pycocotools-2.0.2-cp38-cp38-linux_x86_64.whl
Collecting lvis==0.5.3
  Using cached lvis-0.5.3-py3-none-any.whl (14 kB)
Collecting tf-slim==1.1.0
  Using cached tf_slim-1.1.0-py2.py3-none-any.whl (352 kB)
Collecting six==1.15.0
  Using cached six-1.15.0-py2.py3-none-any.whl (10 kB)
Collecting matplotlib==3.3.4
  Using cached matplotlib-3.3.4-cp38-cp38-manylinux1_x86_

🥑 **Welcome To Picsellia Training Quickstart Notebook** 🥑

In this notebook, you will launch a training run from an experiment created on the Picsellia platform and log its evaluation metrics so you can analyse your trained model.

**Step 1, import the Python SDK and the TensorFlow 2 wrapper**

If you do not have the packages installed, you can run:
- `pip install picsellia picsellia_tf2`


In [3]:
!pip install numpy --upgrade

Collecting numpy
  Using cached numpy-1.22.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (16.8 MB)
Installing collected packages: numpy
  Attempting uninstall: numpy
    Found existing installation: numpy 1.19.5
    Uninstalling numpy-1.19.5:
      Successfully uninstalled numpy-1.19.5
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow 2.6.0 requires numpy~=1.19.2, but you have numpy 1.22.2 which is incompatible.
Successfully installed numpy-1.22.2


In [1]:
from picsellia import Client
from picsellia_tf2 import pxl_utils
from picsellia_tf2 import pxl_tf
import os

Error while loading conf file for logging. No logging done.


**Step 2, fetch your experiment files from the Picsellia servers**

Every experiment has a unique identifier that lets you retrieve all the necessary information with one command.

By setting `tree` and `with_artifacts` to `True`, you will automatically get a folder structure like:

- `experiment_name`/
    - checkpoint/
    - config/
    - exported_model/
    - images/
    - metrics/
    - records/
    - results/
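
To make the layout above concrete, here is a minimal local sketch that builds the same folder structure in a temporary directory. The helper `make_experiment_tree` is hypothetical and only for illustration; it is not part of the Picsellia SDK, which creates these folders for you.

```python
import tempfile
from pathlib import Path

# Sub-folder names copied from the list above.
SUBFOLDERS = [
    "checkpoint", "config", "exported_model", "images",
    "metrics", "records", "results",
]


def make_experiment_tree(root, experiment_name):
    """Create the experiment folder layout under `root` (hypothetical helper)."""
    base = Path(root) / experiment_name
    for name in SUBFOLDERS:
        (base / name).mkdir(parents=True, exist_ok=True)
    return base


# Build the tree in a throwaway directory and list what was created.
with tempfile.TemporaryDirectory() as tmp:
    base = make_experiment_tree(tmp, "my-experiment")
    created = sorted(p.name for p in base.iterdir() if p.is_dir())
    print(created)
```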

In [1]:
from picsellia import Client
from picsellia_tf2 import pxl_utils
from picsellia_tf2 import pxl_tf
import os

api_token = ''
project_name = ''
experiment_name = ''

client = Client(
    api_token=api_token,
    organization=None  # Set to an organization name to check out another organization.
)

project = client.get_project(project_name)

# tree=True builds the local folder structure, with_artifacts=True downloads the stored files.
experiment = project.get_experiment(
    name=experiment_name,
    tree=True,
    with_artifacts=True
)

Error while loading conf file for logging. No logging done.
capture/checkpoint
config
model-latest
|[92m▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬[0m|◉ 100% 10582337/10582337    
--*--
checkpoint-data-latest
|[92m▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬[0m|◉ 100% 10752177/10752177    
--*--
checkpoint-index-latest
logs


**Step 3, Dataset fetching** 

Download all the necessary data for your training:
- Dataset
- Annotations

Then generate the labelmap for your model. Picsellia needs the `labelmap` to map your verbose labels *(e.g. "cat", "dog", "hot-dog")* to your categorical labels *(e.g. 1, 2, 3)*.

>
> You can find more info about the labelmap format [here](https://google.com)
>
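
As an illustration of the mapping described above, here is a minimal sketch that assigns integer ids, starting at 1, to a set of label names. The helper `build_labelmap` is hypothetical, and the id-to-name orientation with string keys is an assumption; `experiment.generate_labelmap()` does the real work and its exact format may differ.

```python
def build_labelmap(label_names):
    """Map ids "1", "2", ... to the sorted unique label names (hypothetical helper)."""
    unique_names = sorted(set(label_names))
    return {str(i): name for i, name in enumerate(unique_names, start=1)}


# Duplicate labels collapse to a single entry.
labelmap = build_labelmap(["dog", "cat", "hot-dog", "cat"])

# Reverse lookup: verbose label -> categorical label.
name_to_id = {name: int(i) for i, name in labelmap.items()}

print(labelmap)
print(name_to_id["hot-dog"])
```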

Finally, perform a train/test split. We recommend storing the `train_test_split` inside Picsellia so that you can access the validation interface and have full visibility over your training data 👊

If you want to know more about our train_test_split format, here is the [documentation page](https://google.com)
**(the default split is 0.8 / 0.2 for train / test)**

Once the train and test sets are created, we send the split to the Picsellia platform so it can be visualized later.
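
To show what a 0.8 / 0.2 split looks like, here is a minimal sketch on toy data. The helpers `split_train_eval` and `count_per_category` are hypothetical, the sample list is made up, and reading `y` as a per-category count is an assumption; `experiment.train_test_split()` does the actual split and may work differently.

```python
import random
from collections import Counter


def split_train_eval(items, train_fraction=0.8, seed=0):
    """Shuffle with a fixed seed, then cut the list at `train_fraction`."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def count_per_category(samples, categories):
    """Count how many samples carry each category, in `categories` order."""
    counts = Counter(label for _, label in samples)
    return [counts.get(category, 0) for category in categories]


# Toy (image name, label) pairs, purely illustrative.
samples = [(f"img_{i}.jpg", "cat" if i % 3 else "dog") for i in range(10)]
train, evaluation = split_train_eval(samples)

categories = ["cat", "dog"]
train_split = {
    "x": categories,
    "y": count_per_category(train, categories),
    "image_list": [name for name, _ in train],
}
print(len(train), len(evaluation))
```

Fixing the seed keeps the split reproducible between runs, which makes the logged split easier to compare across experiments.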



In [7]:
experiment.download_annotations()
experiment.download_pictures()
experiment.generate_labelmap()
experiment.log('labelmap', experiment.label_map, 'labelmap', replace=True)
train_list, eval_list, train_list_id, eval_list_id, label_train, label_eval, categories = experiment.train_test_split()

train_split = {
    'x': categories,
    'y': label_train,
    'image_list': train_list_id
}
experiment.log('train-split', train_split, 'bar', replace=True)

test_split = {
    'x': categories,
    'y': label_eval,
    'image_list': eval_list_id
}
experiment.log('test-split', test_split, 'bar', replace=True)

-----
|[92m▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬[0m|◉ 100% 129/129    
--*--


**Step 4, Pre-processing** 

Now we will create the record files needed for training and configure the training with the parameters that you chose for your experiment on Picsellia.

In [8]:
# Hyperparameters chosen for this experiment on the platform.
parameters = experiment.get_log(name='parameters')

# Write train.record and eval.record into the experiment's record directory.
pxl_utils.create_record_files(
    dict_annotations=experiment.dict_annotations,
    train_list=train_list,
    train_list_id=train_list_id,
    eval_list=eval_list,
    eval_list_id=eval_list_id,
    label_path=experiment.label_path,
    record_dir=experiment.record_dir,
    tfExample_generator=pxl_tf.tf_vars_generator,
    annotation_type=parameters['annotation_type'],
)

# Update the pipeline configuration with the experiment's parameters.
pxl_utils.edit_config(
    model_selected=experiment.checkpoint_dir,
    input_config_dir=experiment.config_dir,
    output_config_dir=experiment.config_dir,
    record_dir=experiment.record_dir,
    label_map_path=experiment.label_path,
    num_steps=parameters['steps'],
    batch_size=parameters['batch_size'],
    learning_rate=parameters['learning_rate'],
    annotation_type=parameters['annotation_type'],
    eval_number=5,
    parameters=parameters,
)

Creating record file at capture/records/train.record
annotation type used for the variable generator: rectangle
Successfully created the TFRecords: capture/records/train.record
Creating record file at capture/records/eval.record
annotation type used for the variable generator: rectangle
Successfully created the TFRecords: capture/records/eval.record
learning_rate {
  exponential_decay_learning_rate {
    initial_learning_rate: 0.0010000000474974513
    decay_steps: 500
    decay_factor: 0.8999999761581421
    staircase: true
  }
}
momentum_optimizer_value: 0.8999999761581421

Configuration successfully edited and saved at capture/config
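
The configuration printed above uses an exponentially decaying learning rate with `staircase: true`. As a rough sketch of what that schedule means, assuming the usual staircase formula `initial_learning_rate * decay_factor ** floor(step / decay_steps)`, the function below reproduces it in plain Python. The name `staircase_exponential_decay` is hypothetical, and the defaults are the rounded values from the output (0.0010000000474974513 and 0.8999999761581421 are the float32 representations of 0.001 and 0.9).

```python
def staircase_exponential_decay(step, initial_lr=0.001, decay_steps=500, decay_factor=0.9):
    """Learning rate at `step`; it drops by `decay_factor` once every `decay_steps` steps."""
    return initial_lr * decay_factor ** (step // decay_steps)


# The rate stays flat inside each 500-step window and drops at the boundary.
for step in (0, 499, 500, 1000):
    print(step, staircase_exponential_decay(step))
```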


**Step 5, Training** 

Then just launch training, and go grab a cup of coffee :D 

In [None]:
pxl_utils.train(
        ckpt_dir=experiment.checkpoint_dir, 
        config_dir=experiment.config_dir,
        log_real_time=experiment,
    )

**Step 6, Evaluation**

Now let's evaluate your trained model so you can analyse its performance later.

In [None]:
pxl_utils.export_graph(
    ckpt_dir=experiment.checkpoint_dir, 
    exported_model_dir=experiment.exported_model_dir, 
    config_dir=experiment.config_dir
)
pxl_utils.evaluate(
    experiment.metrics_dir, 
    experiment.config_dir, 
    experiment.checkpoint_dir
)  
conf, evaluation = pxl_utils.get_confusion_matrix(
    input_tfrecord_path=os.path.join(experiment.record_dir, 'eval.record'),
    model=os.path.join(experiment.exported_model_dir, 'saved_model'),
    labelmap=experiment.label_map
)


confusion = {
    'categories': list(experiment.label_map.values()),
    'values': conf.tolist()
}

experiment.log('confusion-matrix', confusion, 'heatmap', replace=True)
experiment.log('evaluation', evaluation, 'evaluation', replace=True) 
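
To illustrate the shape of the `confusion` payload logged above, here is a simplified sketch that builds a confusion matrix from toy class indices. It assumes each prediction has already been matched to one ground-truth object, which is the hard part for detection and is skipped here; the `confusion_matrix` helper and the data are hypothetical, and `pxl_utils.get_confusion_matrix` may compute things differently.

```python
def confusion_matrix(true_ids, pred_ids, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_id, pred_id in zip(true_ids, pred_ids):
        matrix[true_id][pred_id] += 1
    return matrix


# Toy labelmap and already-matched (true, predicted) class indices.
label_map = {"1": "cat", "2": "dog"}
true_ids = [0, 0, 1, 1, 1]
pred_ids = [0, 1, 1, 1, 0]

conf = confusion_matrix(true_ids, pred_ids, num_classes=len(label_map))
confusion = {
    "categories": list(label_map.values()),
    "values": conf,
}
print(confusion)
```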

**Step 7, Exporting and Inference**

This part exports your trained model as a `saved_model` so it can be used in production or for inference in Picsellia.

Inference is run on several images from your test set, and the results are sent to the Picsellia platform so you can visualize them and share them with your collaborators or community.

Then all the evaluation metrics are uploaded to your experiment page so you can visualize all your graphs :)

In [None]:

pxl_utils.infer(
    experiment.record_dir, 
    exported_model_dir=experiment.exported_model_dir, 
    label_map_path=os.path.join(experiment.base_dir,'label_map.pbtxt'), 
    results_dir=experiment.results_dir, 
    from_tfrecords=True, 
    disp=False
    )

metrics = pxl_utils.tf_events_to_dict('{}/metrics'.format(experiment.experiment_name), 'eval')
logs = pxl_utils.tf_events_to_dict('{}/checkpoint'.format(experiment.experiment_name), 'train')
for variable in logs.keys():
    data = {
        'steps': logs[variable]["steps"],
        'values': logs[variable]["values"]
    }
    experiment.log('-'.join(variable.split('/')), data, 'line', replace=True)
experiment.log('metrics', metrics, 'table', replace=True)
experiment.store('model-latest')
experiment.store('config')
experiment.store('checkpoint-data-latest')
experiment.store('checkpoint-index-latest')
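
The loop in the cell above renames each scalar tag by replacing `/` with `-` before logging it as a line chart. Here is a small standalone sketch of that transformation on made-up data; the `to_line_payloads` helper and the tag names are hypothetical, and the real tags depend on what the training run wrote to its event files.

```python
def to_line_payloads(logs):
    """Turn {tag: {"steps": [...], "values": [...]}} into chart payloads keyed by a slash-free name."""
    payloads = {}
    for tag, series in logs.items():
        name = "-".join(tag.split("/"))
        payloads[name] = {"steps": series["steps"], "values": series["values"]}
    return payloads


# Toy scalar series, shaped like the dictionary the loop iterates over.
logs = {
    "Loss/total_loss": {"steps": [0, 100, 200], "values": [1.2, 0.8, 0.5]},
    "learning_rate": {"steps": [0, 100, 200], "values": [0.001, 0.001, 0.001]},
}

payloads = to_line_payloads(logs)
print(sorted(payloads))
```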

