# Calculate the mean average precision (mAP) metric

A ready-to-use script to compute the mean average precision (mAP) metric.

**Input**:
- An existing project annotated with both ground truth and predicted objects. Each predicted object must carry a confidence tag whose value is the model's prediction confidence. The evaluator in this notebook is configured to read the tag named `confidence_pred` (see `confidence_tag_name` in the "Create metric evaluator" step).
- At least one pair of corresponding ground truth and prediction class names, e.g. ("person", "person_pred").

**Output**:
- Average precision (AP) for each corresponding class pair, plus the value aggregated over all pairs (mAP).

## Imports

In [1]:
import supervisely_lib as sly
import os
import collections
from prettytable import PrettyTable
from tqdm import tqdm
from supervisely_lib.metric.map_metric import AP

## Configuration

Edit the following settings to match your own setup.

In [2]:
# Change this field to the name of your team, where target workspace exists.
team_name = "jupyter_tutorials"

# Change this field to the name of your workspace, where target project exists.
workspace_name = "metrics_tutorials"

# Change this field to the name of your target project.
project_name = "map_metric_demo_project"

# Configure the following dictionary so that it matches the pairs of ground truth
# and predicted classes for which the metrics will be calculated.
classes_mapping = {
    "bike": "motorbike_pred",
    "dog": "dog_pred",
    "person": "person_pred",
}

# Minimum intersection over union (IoU) value for which two overlapping objects will
# be considered to have matched. Increase to only take close matches into account;
# decrease to also consider less significant overlaps.
iou_threshold = 0.5

# If you are running this notebook on a Supervisely web instance, the connection
# details below will be filled in from environment variables automatically.
#
# If you are running this notebook locally on your own machine, edit to fill in the
# connection details manually. You can find your access token at
# "Your name on the top right" -> "Account settings" -> "API token".
address = os.environ['SERVER_ADDRESS']
token = os.environ['API_TOKEN']
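For local runs, where `SERVER_ADDRESS` and `API_TOKEN` are usually not set, one option is to fall back to manually entered values instead of editing the two lines above. The sketch below is only an illustration: the helper `read_setting` and both placeholder values are hypothetical and need to be replaced with your own details.

```python
import os


def read_setting(name, fallback, environ=None):
    """Return the environment value for `name`, or `fallback` if unset or empty."""
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    return value if value else fallback


# Placeholder values (assumptions for this sketch): replace with your own details.
address = read_setting("SERVER_ADDRESS", "https://your-supervisely-instance.example")
token = read_setting("API_TOKEN", "<paste your API token here>")
```

On a Supervisely web instance the environment variables take precedence, so the same cell works in both settings.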

## Script setup

Initialize the Supervisely API client to manage your projects remotely.

In [3]:
# Initialize API object
api = sly.Api(address, token)

## Verify input values

Check that the context (team / workspace / project) exists.

In [4]:
team = api.team.get_info_by_name(team_name)
if team is None:
    raise RuntimeError("Team {!r} not found".format(team_name))

workspace = api.workspace.get_info_by_name(team.id, workspace_name)
if workspace is None:
    raise RuntimeError("Workspace {!r} not found".format(workspace_name))
    
project = api.project.get_info_by_name(workspace.id, project_name)
if project is None:
    raise RuntimeError("Project {!r} not found".format(project_name))
    
print("Team: id={}, name={}".format(team.id, team.name))
print("Workspace: id={}, name={}".format(workspace.id, workspace.name))
print("Project: id={}, name={}".format(project.id, project.name))

Team: id=3, name=jupyter_tutorials
Workspace: id=10, name=metrics_tutorials
Project: id=401, name=map_metric_demo_project


## Get source project meta

Project meta contains information about classes and tags.

In [5]:
meta_json = api.project.get_meta(project.id)
meta = sly.ProjectMeta.from_json(meta_json)

# Check that every ground truth and predicted class from classes_mapping exists in the project.
project_classes_names = list(classes_mapping.keys()) + list(classes_mapping.values())

for class_name in project_classes_names:
    if not meta.obj_classes.has_key(class_name):
        raise RuntimeError("Class {!r} not found in source project {!r}".format(class_name, project.name))
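The check above stops at the first missing class. If the mapping is long, it can be handier to see every missing name in one go. Here is a minimal sketch of that idea in plain Python; `find_missing_classes` is a hypothetical helper, and it takes the available class names as a plain list rather than calling any Supervisely API.

```python
def find_missing_classes(classes_mapping, available_names):
    """Return mapped class names (ground truth and predicted) absent from the project."""
    required = list(classes_mapping.keys()) + list(classes_mapping.values())
    available = set(available_names)
    missing = []
    for name in required:
        # Keep the original order and report each missing name only once.
        if name not in available and name not in missing:
            missing.append(name)
    return missing


# Illustrative data only: "motorbike_pred" is deliberately left out.
example_mapping = {"bike": "motorbike_pred", "dog": "dog_pred"}
example_available = ["bike", "dog", "dog_pred"]
missing = find_missing_classes(example_mapping, example_available)
if missing:
    print("Missing classes: {}".format(", ".join(missing)))
```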

## Create metric evaluator

In [6]:
map_evaluator = sly.MAPMetric(classes_mapping, iou_threshold, confidence_tag_name='confidence_pred')
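The `iou_threshold` passed to the evaluator decides whether a predicted object counts as a match for a ground truth object. As a rough illustration only, the sketch below computes IoU for two axis-aligned boxes. The helper `box_iou` and the `(left, top, right, bottom)` box format are assumptions made for this example; the library may compute the overlap differently, for instance on other geometry types.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (left, top, right, bottom)."""
    left = max(box_a[0], box_b[0])
    top = max(box_a[1], box_b[1])
    right = min(box_a[2], box_b[2])
    bottom = min(box_a[3], box_b[3])

    # Width and height are clamped to zero when the boxes do not overlap.
    intersection = max(0.0, right - left) * max(0.0, bottom - top)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


ground_truth = (0, 0, 10, 10)
prediction = (5, 0, 15, 10)  # shifted right by half the box width
iou = box_iou(ground_truth, prediction)  # intersection 50, union 150
print("IoU = {:.3f}, matched at 0.5: {}".format(iou, iou >= 0.5))
```

In this example the two boxes share half of their area each, yet the IoU is only one third, so the pair would not match at a threshold of 0.5.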

## Iterate over all images and calculate the metric for annotation pairs

In [7]:
for dataset in api.dataset.get_list(project.id):
    
    # Report which dataset is being processed.
    print("Processing: project = {!r}, dataset = {!r}".format(project.name, dataset.name), flush=True)
    
    images = api.image.get_list(dataset.id)
    with tqdm(total=len(images), desc="Process annotations") as progress_bar:
        for batch in sly.batched(images):
            image_ids = [image_info.id for image_info in batch]
            ann_infos = api.annotation.download_batch(dataset.id, image_ids)
            
            for ann_info in ann_infos:
                ann = sly.Annotation.from_json(ann_info.annotation, meta)
                # We are using the same annotation on both sides of the metric computation
                # (classes_mapping provides the corresponding classes that we will look for
                # in the annotation), but it is also possible to use different annotations
                # on the left and right, e.g. to compare the source hand-labeled project to a
                # neural network inference result.
                map_evaluator.add_pair(ann, ann)
            
            progress_bar.update(len(batch))

Processing: project = 'map_metric_demo_project', dataset = 'dataset_02'


Process annotations: 100%|██████████| 2/2 [00:00<00:00, 48.75it/s]

Processing: project = 'map_metric_demo_project', dataset = 'dataset_01'



Process annotations: 100%|██████████| 3/3 [00:00<00:00, 21.86it/s]
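The loop above downloads annotations in batches rather than one image at a time. The sketch below shows the general batching pattern in plain Python, assuming that `sly.batched` yields consecutive slices of the input list. `batched_sketch` and its `batch_size` value are illustrative and are not the library's implementation.

```python
def batched_sketch(items, batch_size=2):
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


image_ids = [101, 102, 103, 104, 105]  # made-up ids for illustration
for batch in batched_sketch(image_ids, batch_size=2):
    # One request per batch instead of one request per image.
    print(batch)
```

The last batch may be shorter than the others, which is why the progress bar is advanced by `len(batch)` rather than by a fixed step.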


## Print results with default logger

The results are logged with the default Supervisely logger, so the same code can be used in any custom plugin and the output will be nicely formatted in the task log.

In [8]:
map_evaluator.log_total_metrics()

{"message": "                                                                                ", "timestamp": "2019-04-24T14:48:32.764Z", "level": "info"}
{"message": "***************** Result metrics values for 0.5 IoU threshold ******************", "timestamp": "2019-04-24T14:48:32.768Z", "level": "info"}
{"message": "Start evaluation of macro metrics.", "timestamp": "2019-04-24T14:48:32.769Z", "level": "info"}
{"message": "Finish macro evaluation", "timestamp": "2019-04-24T14:48:32.770Z", "level": "info"}
{"message": "                                                                                ", "timestamp": "2019-04-24T14:48:32.771Z", "level": "info"}
{"message": "*********** Results for pair of classes <<bike <-> motorbike_pred>>  ***********", "timestamp": "2019-04-24T14:48:32.771Z", "level": "info"}
{"message": "Average Precision (AP): 0.6363636363636364", "timestamp": "2019-04-24T14:48:32.772Z", "level": "info"}
{"message": "                                                  

## Print results manually

In [9]:
# Metrics for each pair of classes separately.
results = map_evaluator.get_metrics()

# Metrics aggregated over all pairs of classes from classes_mapping
total_results = map_evaluator.get_total_metrics()

table = PrettyTable(["classes pair", "metrics values"])

def build_values_text(values):
    return ''.join(
        "{}: {}\n".format(metrics_name, value)
        for metrics_name, value in values.items())
    
for gt_class, metric_values in results.items():
    pair_text = "{} <-> {}".format(gt_class, classes_mapping[gt_class])
    table.add_row([pair_text, build_values_text(metric_values)])

table.add_row(["TOTAL", "average-precision: {}".format(total_results[AP])])
print(table.get_string())

+-------------------------+---------------------------------------+
|       classes pair      |             metrics values            |
+-------------------------+---------------------------------------+
| bike <-> motorbike_pred | average-precision: 0.6363636363636364 |
|                         |                                       |
|     dog <-> dog_pred    |         average-precision: 1.0        |
|                         |                                       |
|  person <-> person_pred | average-precision: 0.6363636363636364 |
|                         |                                       |
|          TOTAL          | average-precision: 0.7575757575757575 |
+-------------------------+---------------------------------------+
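To make the numbers in the table easier to interpret, here is a small sketch of how an average precision value can be derived from confidence-ranked detections, and how per-class values combine into the total. It assumes 11-point interpolated AP (the Pascal VOC style). The notebook does not state which variant the library uses, although a value of 7/11 ≈ 0.636 is consistent with that scheme. The function name and the sample detections are hypothetical.

```python
def average_precision_11pt(detections, num_ground_truth):
    """11-point interpolated AP.

    detections: list of (confidence, is_true_positive) pairs for one class.
    num_ground_truth: number of ground truth objects of that class.
    """
    if num_ground_truth == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    precisions, recalls = [], []
    true_positives = 0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        true_positives += 1 if is_tp else 0
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)

    total = 0.0
    for i in range(11):
        level = i / 10
        # Best precision among all points whose recall reaches this level.
        reachable = [p for p, r in zip(precisions, recalls) if r >= level]
        total += max(reachable) if reachable else 0.0
    return total / 11


# Made-up example: 3 ground truth objects, 2 of them found, no false positives.
ap = average_precision_11pt([(0.9, True), (0.8, True)], num_ground_truth=3)
print("AP = {}".format(ap))  # recall stops at 2/3, so 7 of 11 levels score 1.0

# The TOTAL row equals the plain mean of the three per-pair values above.
per_class_ap = [0.6363636363636364, 1.0, 0.6363636363636364]
print("mAP = {}".format(sum(per_class_ap) / len(per_class_ap)))
```

With this scheme, missed ground truth objects lower AP even when every prediction is correct, because the higher recall levels are never reached.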


# Done!