# Annotating ground truth for image segmentation
Here’s how we solve the classic problem of annotating images for training segmentation algorithms.

## The challenge
We have a set of real-life photos of roads:

<table  align="center">
  <tr><td>
    <img src="https://tlk.s3.yandex.net/sdc/photos/0b35956a9afc639a71045f09745096de.jpg"
         alt="Sample road photo"  width="800">
  </td></tr>
  <tr><td align="center">
    <b>Figure 1.</b> Sample road photo
  </td></tr>
</table>

We need to have every traffic sign outlined. Ultimately, we need to get a set of contours, defined by an array of points, that represent the road signs in each photo. Here’s what it may look like in the image:

<table  align="center">
  <tr><td>
    <img src="./img/segmentation_example.png"
         alt="Example of how road sign segmentation can be performed"  width="800">
  </td></tr>
  <tr><td align="center">
    <b>Figure 2.</b> Example of how road sign segmentation can be performed.
  </td></tr>
</table>

In real-world tasks, annotation is usually done with a polygon. We chose to use a rectangular outline to simplify the task so that we can reduce costs and speed things up.
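
To make the "array of points" idea concrete, here is a minimal sketch that turns one rectangular annotation into four pixel corner points. The dictionary keys (`shape`, `left`, `top`, `width`, `height`) and the use of coordinates relative to the image size are assumptions taken from the drawing code at the end of this notebook. The function name and the sample values are hypothetical.

```python
def rectangle_to_contour(region, image_width, image_height):
    """Convert a relative rectangle into four pixel corners, clockwise from top-left."""
    left = region['left'] * image_width
    top = region['top'] * image_height
    right = (region['left'] + region['width']) * image_width
    bottom = (region['top'] + region['height']) * image_height
    return [(left, top), (right, top), (right, bottom), (left, bottom)]


# Hypothetical annotation for an 800x600 photo (assumed format, see the note above)
sample_region = {'shape': 'rectangle', 'left': 0.25, 'top': 0.5, 'width': 0.125, 'height': 0.25}
contour = rectangle_to_contour(sample_region, 800, 600)
print(contour)
```

A polygon annotation would simply carry more points in the same kind of list.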

### Detailed task description
[Task description and how to execute it in the interface.](https://yandex.com/support/toloka-requester/concepts/image-segmentation-overview.html?lang=en)

We'll skip the first project ("Does the image contain a specific object?"), since it's easy to implement using the "verification project" code and would make the example longer than necessary.

Here are the two projects we’re going to implement using the code below: 
- Segmentation project - [Select an object in the image.](https://yandex.com/support/toloka-requester/concepts/image-segmentation-project2.html?lang=en)
- Verification project - [Are the bounding boxes correct?](https://yandex.com/support/toloka-requester/concepts/image-segmentation-project3.html?lang=en)

Control tasks and majority vote aren't used in the segmentation project, because we can’t expect the area annotations provided by different performers to match each other exactly. Instead, we’ll check segmentation results in the verification project, where a different group of performers will determine whether the traffic signs were annotated correctly.
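
The following sketch illustrates why exact matching doesn't work for area annotations: two performers can outline the same sign almost identically and still produce different rectangles. It uses intersection over union (IoU), a common overlap measure that this notebook does not otherwise use. The box values and function name are hypothetical.

```python
def intersection_over_union(a, b):
    """IoU of two rectangles given as (left, top, width, height)."""
    a_right, a_bottom = a[0] + a[2], a[1] + a[3]
    b_right, b_bottom = b[0] + b[2], b[1] + b[3]
    # Width and height of the overlapping area (zero if the boxes don't intersect)
    overlap_w = max(0.0, min(a_right, b_right) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a_bottom, b_bottom) - max(a[1], b[1]))
    intersection = overlap_w * overlap_h
    union = a[2] * a[3] + b[2] * b[3] - intersection
    return intersection / union if union > 0 else 0.0


# Two hypothetical outlines of the same sign, in pixels
box_from_performer_1 = (100, 100, 50, 50)
box_from_performer_2 = (102, 99, 50, 52)

boxes_are_identical = box_from_performer_1 == box_from_performer_2
iou = intersection_over_union(box_from_performer_1, box_from_performer_2)
print(boxes_are_identical, round(iou, 3))
```

The boxes differ by a couple of pixels, so a strict equality check fails even though the overlap is high. Asking people to judge the outline avoids having to pick an overlap threshold.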

### Set up the environment
First of all, you'll need to register in Toloka as a requester. Learn more in [Help.](https://yandex.com/support/toloka-requester/concepts/access.html)

We used the production version in our example, but you can use the Toloka sandbox. 

The second step is to obtain your OAuth token. Learn more in [Help.](https://doc.yandex-team.ru/toloka/doc/concepts/access.html?lang=en)

<table  align="center">
  <tr><td>
    <img src="./img/OAuth.png"
         alt="OAuth"  width="800">
  </td></tr>
  <tr><td align="center">
    <b>Figure 3.</b> How to get an OAuth token from your Profile
  </td></tr>
</table>

In [None]:
token = input("Enter your token:")
if token == '':
    print('The token you entered may be invalid. Please try again.')
else:
    print('OK')

In [None]:
# Prepare an environment and import everything we need
!pip install toloka-kit==0.1.3
!pip install ipyplot

import os
import datetime
import time

import ipyplot
import pandas

import toloka.client as toloka
import toloka.client.project.template_builder as tb

In [None]:
# Create a Toloka client instance
# All API calls will pass through it
toloka_client = toloka.TolokaClient(token, 'PRODUCTION')  # or switch to SANDBOX

# We check the money available in your account, which also checks the validity of the OAuth token
requester = toloka_client.get_requester()
print('You have enough money on your account - ', requester.balance > 3.0)

### Review the dataset
Our dataset is just a collection of image URLs.

In [None]:
# Load the dataset as a series of links
dataset = pandas.read_csv('dataset.tsv', sep='\t')
ipyplot.plot_images(
    [url for url in dataset['image'].sample(n=50)],
    max_images=5,
    img_width=1000
)
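
Before spending money on annotation, it may be worth checking that every row of the dataset really holds an image URL. This is a minimal sketch using only the standard library. It assumes the same tab-separated layout with an `image` column that the cell above reads; the function name and sample rows are hypothetical.

```python
import csv
import io
from urllib.parse import urlparse


def find_bad_rows(tsv_text, column='image'):
    """Return (row_number, value) pairs whose value is not an http(s) URL."""
    bad_rows = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter='\t')
    for row_number, row in enumerate(reader, start=1):
        value = (row.get(column) or '').strip()
        parsed = urlparse(value)
        if parsed.scheme not in ('http', 'https') or not parsed.netloc:
            bad_rows.append((row_number, value))
    return bad_rows


# Hypothetical dataset contents with one broken row
sample_tsv = 'image\nhttps://example.com/road_1.jpg\nnot a url\nhttps://example.com/road_2.jpg\n'
bad_rows = find_bad_rows(sample_tsv)
print(bad_rows)
```

For the real file, the text could be read with `open('dataset.tsv').read()` and passed to the same function.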

---
---
## Create a new segmentation project
In this project, performers select image areas that contain traffic signs.

Learn more on [how to do this in the web interface](https://yandex.com/support/toloka-requester/concepts/image-segmentation-project2.html?lang=en).

At this stage, we need to configure how performers will see the task, write instructions, and define the input and output format. It's important that we write clear instructions with examples to make sure the performers do exactly what we want. We also highly recommend checking the task interface.

In [None]:
# How performers will see the task
project_interface = toloka.project.view_spec.TemplateBuilderViewSpec(
    config=tb.TemplateBuilder(
        view=tb.fields.ImageAnnotationFieldV1(  # component for selecting areas in images
            image=tb.data.InputData(path='image'),  # getter for the input image
            data=tb.data.OutputData(path='result'),  # path for writing output data
            shapes={tb.fields.ImageAnnotationFieldV1.Shape.RECTANGLE: True},  # allow only rectangular areas to be selected
            validation=tb.conditions.RequiredConditionV1()  # at least one area should be selected
        )
    )
)

# You can write instructions and upload them from a file or enter them later in the web interface
# prepared_instruction = open('instruction.html').read().strip()
prepared_instruction = '<b>Draw a rectangle around all the traffic signs in the image.</b>'

# Set up the project
segmentation_project = toloka.project.Project(
    assignments_issuing_type=toloka.project.Project.AssignmentsIssuingType.AUTOMATED,
    public_name='Outline the traffic signs in the image',
    public_description='Outline all traffic signs in the image with a rectangle',
    public_instructions=prepared_instruction,
    # Set up the task: view, input, and output parameters
    task_spec=toloka.project.task_spec.TaskSpec(
        input_spec={'image': toloka.project.field_spec.UrlSpec()},
        output_spec={'result': toloka.project.field_spec.JsonSpec()},
        view_spec=project_interface,
    ),
)

# Call the API to create a new project
segmentation_project = toloka_client.create_project(segmentation_project)
print(f'Created segmentation project with id {segmentation_project.id}')
print(f'To view the project, go to: https://toloka.yandex.com/requester/project/{segmentation_project.id}')

### Review your project and check the task interface

Visit the project page to make sure the task interface is working correctly.
To do this, follow the link in the output above.
In the project interface, click "project actions" in the top right and "preview" in the menu that appears.

<table  align="center">
  <tr><td>
    <img src="./img/segmentation_project_look.png"
         alt="Project interface"  width="800">
  </td></tr>
  <tr><td align="center">
    <b>Figure 4.</b> What the project interface might look like
  </td></tr>
</table>

In the preview page that appears, click "Change input data" and insert an image URL (for example, https://tlk.s3.yandex.net/sdc/photos/0b35956a9afc639a71045f09745096de.jpg) into the "image" field.

<table  align="center">
  <tr><td>
    <img src="./img/segmentation_task_look.png"
         alt="Task interface"  width="800">
  </td></tr>
  <tr><td align="center">
    <b>Figure 5.</b> What the task interface might look like and how to insert images in the preview.
  </td></tr>
</table>

Click the "Instructions" button. Make sure the instructions are shown and that they say what you want them to.

Try to select multiple areas with a rectangle. Click "Submit" and then "View responses". The result window will appear. Check that your results are in the expected format and that the data is being entered correctly. 

<table  align="center">
  <tr><td>
    <img src="./img/segmentation_results_preview.png"
         alt="Results window"  width="800">
  </td></tr>
  <tr><td align="center">
    <b>Figure 6.</b> What the results window might look like
  </td></tr>
</table>
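
The format check on the response can also be scripted once you copy it out of the results window. The sketch below assumes the result is a list of regions with `shape`, `left`, `top`, `width`, and `height` keys, with coordinates relative to the image size (between 0 and 1). That format is inferred from the drawing code at the end of this notebook, so adjust it if your output looks different. The function name and sample data are hypothetical.

```python
def validate_selection(selection):
    """Return a list of problems found in an annotation result; empty if it looks fine."""
    if not isinstance(selection, list) or not selection:
        return ['result should be a non-empty list of regions']
    problems = []
    for index, region in enumerate(selection):
        if not isinstance(region, dict) or region.get('shape') != 'rectangle':
            problems.append(f'region {index}: expected a rectangle')
            continue
        for key in ('left', 'top', 'width', 'height'):
            value = region.get(key)
            if not isinstance(value, (int, float)) or not 0 <= value <= 1:
                problems.append(f'region {index}: {key} should be a number between 0 and 1')
    return problems


# Hypothetical results pasted from the preview
good_result = [{'shape': 'rectangle', 'left': 0.1, 'top': 0.2, 'width': 0.3, 'height': 0.4}]
bad_result = [
    {'shape': 'polygon', 'points': []},
    {'shape': 'rectangle', 'left': 1.5, 'top': 0.2, 'width': 0.3, 'height': 0.4},
]
print(validate_selection(good_result))
print(validate_selection(bad_result))
```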

We strongly recommend checking the task interface and instructions every time you create a project. This helps ensure that the performers will complete the task and that your results will be useful. 

Another good tip is to do a trial run with a small amount of data. Make sure that after running the entire pipeline, you get data in the expected format and quality.

### Add performer skills
A skill can describe any performer characteristic. Skills are defined by a number from 0 to 100. For example, you can record the percentage of correct responses as a skill. Learn more in [Help.](https://yandex.com/support/toloka-requester/concepts/nav.html)
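
As an illustration of the percentage example, a skill value could be derived from a performer's answer counts like this. The sketch is only arithmetic, and the function name is hypothetical. In this notebook, Toloka computes the verification skill itself through a quality control rule.

```python
def skill_from_answers(correct_count, total_count):
    """Percentage of correct responses as an integer skill value from 0 to 100."""
    if total_count <= 0:
        raise ValueError('total_count must be positive')
    value = round(100 * correct_count / total_count)
    # Keep the value inside the range allowed for skills
    return max(0, min(100, value))


skill_value = skill_from_answers(correct_count=17, total_count=20)
print(skill_value)
```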

In our projects, we'll use two skills: 
- **Segmentation skill**: Shows that the performer has completed at least one segmentation task. We'll later exclude these performers from verification tasks so that no one can check their own segmentation.
- **Verification skill**: How good the current performer is compared to others. We'll need this skill later when aggregating the results of the second project.

In [None]:
segmentation_skill = next(toloka_client.get_skills(name='Area selection of road signs'), None)
if segmentation_skill:
    print('Segmentation skill already exists')
else:
    print('Create new segmentation skill')
    segmentation_skill = toloka_client.create_skill(
        name='Area selection of road signs',
        hidden=True,
        public_requester_description={'EN': 'Performer is annotating road signs'},
    )

verification_skill = next(toloka_client.get_skills(name='Segmentation verification'), None)
if verification_skill:
    print('Verification skill already exists')
else:
    print('Create new verification skill')
    verification_skill = toloka_client.create_skill(
        name='Segmentation verification',
        hidden=True,
        public_requester_description={'EN': 'How good a performer is at verifying segmentation tasks'},
    )

### Create a pool in a segmentation project
A pool is a set of paid tasks sent out for completion at the same time.

First, create an instance of the pool and set the basic parameters:
- Payment amount per task page.
- Non-automatic acceptance of results.
- Number of tasks performers will see on one page.
- Performer filter: who can access this task.

In [None]:
segmentation_pool = toloka.pool.Pool(
    project_id=segmentation_project.id,
    private_name='Pool 1',  # Only you can see this information.
    may_contain_adult_content=False,
    will_expire=datetime.datetime.utcnow() + datetime.timedelta(days=365),  # Pool will close after one year
    reward_per_assignment=0.01,     # We set the minimum payment amount for one task page
    auto_accept_solutions=False,    # We will only pay the performer for completing the task,
                                    #    based on the verification results of the second project
    auto_accept_period_day=1,       # Number of days to determine if we'll pay
    assignment_max_duration_seconds=60*20,  # Give performers 20 minutes to complete one task
    defaults=toloka.pool.Pool.Defaults(
        # We don't need overlapping for segmentation tasks
        default_overlap_for_new_task_suites=1,
        default_overlap_for_new_tasks=1,
    ),
)

# Set the number of tasks per page
segmentation_pool.set_mixer_config(real_tasks_count=1, golden_tasks_count=0, training_tasks_count=0)
# Please note that the payment amount specified when creating the pool is the amount the performer receives for completing one page of tasks.
# If you specify 10 tasks per page above, then reward_per_assignment will be paid for completing 10 tasks.

# We'll only show our tasks to English-speaking users because the description of the task is in English.
# This means that only people who speak English will be able to accept this task.
segmentation_pool.filter = toloka.filter.Languages.in_('EN')

print(segmentation_pool.private_name)
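
Because the reward is paid per task page, a rough budget for a pool can be estimated before launch. This is a sketch under simplifying assumptions: it ignores any platform fee and any extra overlap added when rejected tasks are sent out again. The function and parameter names are hypothetical.

```python
import math


def estimate_pool_cost(task_count, tasks_per_page, reward_per_page, overlap=1):
    """Rough payout estimate: number of pages x overlap x reward per page."""
    pages = math.ceil(task_count / tasks_per_page)
    return pages * overlap * reward_per_page


# 20 images, one task per page, as configured for the segmentation pool above
one_per_page = estimate_pool_cost(task_count=20, tasks_per_page=1, reward_per_page=0.01)
# The same 20 images with 10 tasks per page would cost ten times less
ten_per_page = estimate_pool_cost(task_count=20, tasks_per_page=10, reward_per_page=0.01)
print(round(one_per_page, 2), round(ten_per_page, 2))
```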

**Quality control rules**

View a detailed description of our quality control rules [here](https://yandex.com/support/toloka-requester/concepts/control.html).

Each quality control rule consists of the following:
- Collector: How to collect statistics and what metrics can be used in this rule.
- Condition: When the rule will be triggered. Under this condition, only parameters that apply to the collector can be used.
- Action: What to do if the condition is true.
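
The collector, condition, and action structure can be pictured with a small toy model. This is not how Toloka implements rules. It is a sketch that assumes all conditions of a rule must hold at once, and every name in it (`Rule`, `apply_rule`, the statistic keys) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Rule:
    # Each condition inspects the statistics gathered by the collector
    conditions: List[Callable[[Dict[str, float]], bool]]
    # What to do when every condition holds
    action: str

    def triggered(self, stats: Dict[str, float]) -> bool:
        return all(condition(stats) for condition in self.conditions)


def apply_rule(rule: Rule, stats: Dict[str, float]) -> Optional[str]:
    """Return the action to take, or None if the rule is not triggered."""
    return rule.action if rule.triggered(stats) else None


# Toy version of "more than 2 tasks completed and more than 35% rejected"
ban_rule = Rule(
    conditions=[
        lambda stats: stats['total_assignments_count'] > 2,
        lambda stats: stats['rejected_assignments_rate'] > 35,
    ],
    action='restrict access for 15 days',
)

careless = {'total_assignments_count': 5, 'rejected_assignments_rate': 40}
newcomer = {'total_assignments_count': 2, 'rejected_assignments_rate': 50}
print(apply_rule(ban_rule, careless), apply_rule(ban_rule, newcomer))
```

The newcomer is not restricted even with a high rejection rate, because the rule waits until enough assignments have been collected.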

In [None]:
# The first rule in this project restricts pool access for performers who often make mistakes
segmentation_pool.quality_control.add_action(
    collector=toloka.collectors.AcceptanceRate(),
    conditions=[
        # Performer completed more than 2 tasks
        toloka.conditions.TotalAssignmentsCount > 2,
        # and more than 35% of their responses were rejected
        toloka.conditions.RejectedAssignmentsRate > 35,
    ],
    # This action tells Toloka what to do if the condition above is True
    # In our case, we'll restrict access for 15 days
    # Always leave a comment: it may be useful later on
    action=toloka.actions.RestrictionV2(
        scope=toloka.user_restriction.UserRestriction.ALL_PROJECTS,
        duration=15,
        duration_unit='DAYS',
        private_comment='Performer often makes mistakes',  # Only you will see this comment
    )
)

# The second useful rule is "Fast responses". It allows us to filter out performers who respond too quickly.
segmentation_pool.quality_control.add_action(
    # Let's monitor fast submissions for the last 5 completed task pages
    # and define a quick response as one that takes less than 20 seconds
    collector=toloka.collectors.AssignmentSubmitTime(history_size=5, fast_submit_threshold_seconds=20),
    # If we see more than one fast response,
    conditions=[toloka.conditions.FastSubmittedCount > 1],
    # we ban the performer from all our projects for 10 days
    action=toloka.actions.RestrictionV2(
        scope=toloka.user_restriction.UserRestriction.ALL_PROJECTS,
        duration=10,
        duration_unit='DAYS',
        private_comment='Fast responses',  # Only you will see this comment
    )
)

# Another rule we use is for automatically updating skills
# This isn't really about quality, but rules can do a lot of useful things
# We update the segmentation skill for performers who complete at least one task
segmentation_pool.quality_control.add_action(
    collector=toloka.collectors.AnswerCount(),
    # If the performer has at least one accepted assignment,
    conditions=[toloka.conditions.AssignmentsAcceptedCount > 0],
    # set the skill value to 1 (the value is assigned, not incremented)
    action=toloka.actions.SetSkill(skill_id=segmentation_skill.id, skill_value=1),
)

# Recompletion of rejected assignments sends the tasks you rejected to other performers according to the specified rule.
segmentation_pool.quality_control.add_action(
    collector=toloka.collectors.AssignmentsAssessment(),
    # Check if a task was rejected
    conditions=[toloka.conditions.AssessmentEvent == toloka.conditions.AssessmentEvent.REJECT],
    # If the condition is True, add 1 to overlap and open the pool
    action=toloka.actions.ChangeOverlap(delta=1, open_pool=True),
)
print('Quality rules count:', len(segmentation_pool.quality_control.configs))

### Create a pool and review it

Now call the Toloka API to create a pool in the segmentation project.

Afterwards, you can open and explore the pool in the web interface. You'll see there aren't any tasks in it. We'll add them later.

In [None]:
segmentation_pool = toloka_client.create_pool(segmentation_pool)
print(f'To view this pool, visit: https://toloka.yandex.com/requester/project/{segmentation_project.id}/pool/{segmentation_pool.id}')

---
---
## Create a new verification project
In this project, performers will determine if traffic signs were outlined correctly.

Learn how to do this in the [web interface](https://yandex.com/support/toloka-requester/concepts/image-segmentation-project3.html?lang=en).

This will be a standard classification project with only two classes: "OK" and "BAD". We’ll explicitly define these as the allowed output values in order to use them for aggregating results.

In [None]:
# How performers will see the task
verification_interface = toloka.project.view_spec.TemplateBuilderViewSpec(
    config=tb.TemplateBuilder(
        view=tb.view.ListViewV1(  # list of components that should be positioned from top to bottom in the ui
            items=[
                tb.fields.ImageAnnotationFieldV1(  # image and selected areas to verify
                    image=tb.data.InputData(path='image'),
                    data=tb.data.InternalData(path='selection',
                                              default=tb.data.InputData(path='selection')),  # using the input field as default value to display the selected areas
                    disabled=True  # disable adding and deleting areas
                ),
                tb.fields.RadioGroupFieldV1(  # a component for selecting one value out of several options
                    label='Are all traffic signs outlined correctly?',  # label above the options
                    data=tb.data.OutputData(path='result'),  # path for writing output data
                    options=[
                        tb.fields.GroupFieldOption(label='Yes', value='OK'),
                        tb.fields.GroupFieldOption(label='No', value='BAD'),
                    ],
                    validation=tb.conditions.RequiredConditionV1()  # requirement to select one of the options
                )
            ]
        ),
        plugins=[
            tb.plugins.HotkeysPluginV1( # shortcuts for selecting options using the keyboard
                key_1=tb.actions.SetActionV1(data=tb.data.OutputData(path='result'), payload='OK'),
                key_2=tb.actions.SetActionV1(data=tb.data.OutputData(path='result'), payload='BAD')
            )
        ]
    )
)

# You can write instructions and upload them from a file or enter them later in the web interface
# prepared_instruction = open('instruction.html').read().strip()
verification_instruction = '''<b>Look at the image and answer the question:</b><br/>
Are all traffic signs outlined correctly?<br/>
If they are, click Yes.<br/>
If they aren't, click No.<br/>
For example, the road signs here are outlined correctly, so the correct answer is Yes.'''

# Set up the project
verification_project = toloka.project.Project(
    assignments_issuing_type=toloka.project.Project.AssignmentsIssuingType.AUTOMATED,
    public_name='Are the traffic signs outlined correctly?',
    public_description='Look at the image and decide whether or not the traffic signs are outlined correctly',
    public_instructions=verification_instruction,
    # Set up the task: view, input, and output parameters
    task_spec=toloka.project.task_spec.TaskSpec(
        input_spec={
            'image': toloka.project.field_spec.UrlSpec(),
            'selection': toloka.project.field_spec.JsonSpec(),
            'assignment_id': toloka.project.field_spec.StringSpec(),
        },
        # We have to set allowed_values because we'll be using smart mixing to get the results of this project
        output_spec={'result': toloka.project.field_spec.StringSpec(allowed_values=['OK', 'BAD'])},
        view_spec=verification_interface,
    ),
)

# Call the API to create a new project
verification_project = toloka_client.create_project(verification_project)
print(f'Created verification project with id {verification_project.id}')
print(f'To view the project, go to: https://toloka.yandex.com/requester/project/{verification_project.id}')

Again, you can follow the link and check the task interface and instructions. You should see nearly the same interface as in the previous project, only without the ability to select areas.

It's important to make sure that the annotation results from the first project display correctly in the second one. To do this, open the task preview in the first project, outline the signs, click "Submit", and then copy the result.

Now open the preview to the second project. Click "Change input data" and paste the annotation results in the "selection" field.

Click "Apply" and make sure the annotation is displayed correctly.

### Create and set up a pool in the verification project
In the filter, we'll specify that performers don't have the segmentation skill. You can combine multiple conditions using the '&' and '|' operators.

Note that we add two quality control rules with the same collector, but with different conditions and actions.

In [None]:
verification_pool = toloka.pool.Pool(
    project_id=verification_project.id,
    private_name='Pool 1. Road sign verification',  # Only you can see this information.
    may_contain_adult_content=False,
    will_expire=datetime.datetime.utcnow() + datetime.timedelta(days=365),  # Pool will close after one year
    reward_per_assignment=0.01,  # We set the minimum payment amount for one task page
                                 # By default, auto_accept_solutions is on,
                                 # so we'll pay for all tasks
    assignment_max_duration_seconds=60*10,  # Give performers 10 minutes to complete one task
    defaults=toloka.pool.Pool.Defaults(
        # We need an overlap to check the performers among themselves,
        # and we need to set an incremental relabeling (dynamic overlap) value less than max_overlap
        default_overlap_for_new_task_suites=2,
    ),
)

# We'll only show our tasks to English-speaking users because the description of the task is in English.
# We also won't allow our verification tasks to be performed by users who performed segmentation tasks
verification_pool.filter = (
    (toloka.filter.Languages.in_('EN')) &
    (toloka.filter.Skill(segmentation_skill.id) == None)
)

# Set up quality control
# Quality is based on the majority of matching responses from performers who completed the same task.
verification_pool.quality_control.add_action(
    collector=toloka.collectors.MajorityVote(answer_threshold=2),
    # If a performer has 10 or more responses
    # and the responses are correct in less than 50% of cases,
    conditions=[
        toloka.conditions.TotalAnswersCount > 9,
        toloka.conditions.CorrectAnswersRate < 50,
    ],
    # we ban the performer from all our projects for 10 days
    action=toloka.actions.RestrictionV2(
        scope=toloka.user_restriction.UserRestriction.ALL_PROJECTS,
        duration=10,
        duration_unit='DAYS',
        private_comment='Doesn\'t match the majority',  # Only you will see this comment
    )
)

# Set up checking skills using MajorityVote
# We set the performer's skill value to their percentage of correct responses
verification_pool.quality_control.add_action(
    collector=toloka.collectors.MajorityVote(answer_threshold=2, history_size=10),
    conditions=[
        toloka.conditions.TotalAnswersCount > 2,
    ],
    action=toloka.actions.SetSkillFromOutputField(
        skill_id=verification_skill.id,
        from_field='correct_answers_rate',
    ),
)
print('Quality rule count:', len(verification_pool.quality_control.configs))

### Add incremental relabeling and create a pool
For more information on incremental relabeling, check [Help.](https://yandex.com/support/toloka-requester/concepts/dynamic-overlap.html?lang=en)

We need to add incremental relabeling because we'll be using aggregation. The overlap will be based on the verification skill. 
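
The idea behind skill-based incremental relabeling can be sketched as follows: keep asking more performers about a task until the skill-weighted agreement reaches the required confidence or the maximum overlap is hit. The confidence here is simply the winning answer's share of the total skill weight. This is an assumption made for illustration and not Toloka's actual formula. The function names and sample votes are hypothetical.

```python
from collections import defaultdict


def weighted_confidence(votes):
    """votes: list of (answer, skill_weight). Returns (best_answer, confidence)."""
    totals = defaultdict(float)
    for answer, weight in votes:
        totals[answer] += weight
    best_answer = max(totals, key=totals.get)
    total_weight = sum(totals.values())
    confidence = totals[best_answer] / total_weight if total_weight > 0 else 0.0
    return best_answer, confidence


def needs_more_overlap(votes, min_confidence=0.8, max_overlap=5):
    """Decide whether one more performer should answer the task."""
    if not votes:
        return True
    if len(votes) >= max_overlap:
        return False
    _, confidence = weighted_confidence(votes)
    return confidence < min_confidence


# Hypothetical answers with the performers' verification skill as the weight
disputed_task = [('OK', 90), ('BAD', 60)]
settled_task = [('OK', 90), ('OK', 80), ('BAD', 20)]
print(needs_more_overlap(disputed_task), needs_more_overlap(settled_task))
```

With the default values, which mirror `min_confidence=0.8` and `max_overlap=5` in the cell below, the disputed task gets another performer and the settled one does not.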

In [None]:
# Set the task count for one page and turn task shuffling ON to enable incremental relabeling
verification_pool.set_mixer_config(
    real_tasks_count=10,
    golden_tasks_count=0,
    training_tasks_count=0,
    mix_tasks_in_creation_order=True,  # Enable shuffle mode to use incremental relabeling
    force_last_assignment=True,
)
# Create incremental relabeling
verification_pool.set_dynamic_overlap_config(
    type='BASIC',
    max_overlap=5,       # Each task can be completed a maximum of 5 times
    min_confidence=0.8,  # Percentage, where 100% = 1.0
    answer_weight_skill_id=verification_skill.id,  # Incremental relabeling by verification skill
    fields=[toloka.pool.DynamicOverlapConfig.Field(name='result')],
)

verification_pool = toloka_client.create_pool(verification_pool)
print(f'To view this pool, visit: https://toloka.yandex.com/requester/project/{verification_project.id}/pool/{verification_pool.id}')

---
---
## Add tasks and run the projects
At this point, we have two projects and can now start adding tasks from our dataset.

In [None]:
tasks = [
    toloka.task.Task(input_values={'image': url}, pool_id=segmentation_pool.id)
    for url in dataset['image'].values[:20]
]
# Add tasks to a pool
toloka_client.create_tasks(tasks, toloka.task.CreateTasksParameters(allow_defaults=True))
print(f'Populated segmentation pool with {len(tasks)} tasks')
print(f'To view this pool, visit: https://toloka.yandex.com/requester/project/{segmentation_project.id}/pool/{segmentation_pool.id}')

# Open the segmentation pool
segmentation_pool = toloka_client.open_pool(segmentation_pool.id)

You can visit the pool page in the web interface and make sure everything is ok: the number of tasks is correct, the pool is running, and some tasks may already be completed.
<table  align="center">
  <tr><td>
    <img src="./img/segmentation_pool_look.png"
         alt="Pool with tasks"  width="800">
  </td></tr>
  <tr><td align="center">
    <b>Figure 7.</b> How a running pool may look.
  </td></tr>
</table>


Performers in Toloka work really fast, but they still need time to complete their tasks. We’ll have to wait while they complete all the tasks in the segmentation pool. You can view the status of the pool in the web interface, but this is not very convenient in a real-life project.

In [None]:
def wait_pool_for_close(pool):
    sleep_time = 60
    pool = toloka_client.get_pool(pool.id)
    while not pool.is_closed():
        print(
            f'   {datetime.datetime.now().strftime("%H:%M:%S")}\t'
            f'Pool {pool.id} has status {pool.status}.'
        )
        time.sleep(sleep_time)
        pool = toloka_client.get_pool(pool.id)

# Wait for the segmentation pool
print('\nWaiting for the segmentation pool to close')
wait_pool_for_close(segmentation_pool)
print(f'Segmentation pool {segmentation_pool.id} is finally closed!')
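
The waiting loop above polls until the pool closes, however long that takes. In an unattended pipeline you may prefer a variant that gives up after a deadline. The helper below is a generic sketch, not part of Toloka-Kit. Its name and parameters are hypothetical, and the sleep and clock functions are injectable so the logic can be tried without real delays.

```python
import time


def wait_until(is_done, timeout_seconds, poll_interval_seconds=60,
               sleep=time.sleep, clock=time.monotonic):
    """Poll is_done() until it returns True or the timeout expires.

    Returns True if the condition was met and False on timeout.
    """
    deadline = clock() + timeout_seconds
    while not is_done():
        if clock() >= deadline:
            return False
        sleep(poll_interval_seconds)
    return True


# Simulated pool that reports "closed" on the third check
checks = iter([False, False, True])
finished = wait_until(lambda: next(checks), timeout_seconds=5, poll_interval_seconds=0)

# A condition that never becomes true times out immediately with a zero timeout
timed_out = not wait_until(lambda: False, timeout_seconds=0, poll_interval_seconds=0)
print(finished, timed_out)
```

With the client from this notebook, the condition could be `lambda: toloka_client.get_pool(segmentation_pool.id).is_closed()`.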

Do not run the following cells until the segmentation pool is closed.

When all the tasks in the segmentation pool have been completed, we can download the results from the pool and prepare our verification tasks.

The next step is to run the verification pool.

In [None]:
def prepare_verification_tasks():
    verification_tasks = []  # Tasks that we will send for verification
    request = toloka.search_requests.AssignmentSearchRequest(
        status=toloka.assignment.Assignment.SUBMITTED,  # Only take completed tasks that haven't been accepted or rejected
        pool_id=segmentation_pool.id,
    )
    # Create and store new tasks
    for assignment in toloka_client.get_assignments(request):
        verification_tasks.append(
            toloka.task.Task(
                input_values={
                    'image': assignment.tasks[0].input_values['image'],
                    'selection': assignment.solutions[0].output_values['result'],
                    'assignment_id': assignment.id,
                },
                pool_id=verification_pool.id,
            )
        )
    print(f'Generated {len(verification_tasks)} new verification tasks')
    return verification_tasks


def run_verification_pool(verification_tasks):
    verification_tasks_result = toloka_client.create_tasks(
        verification_tasks,
        toloka.task.CreateTasksParameters(allow_defaults=True)
    )
    # Store the mapping from each verification task to its segmentation assignment. We'll need it later.
    task_to_assignment = {}
    for task in verification_tasks_result.items.values():
        task_to_assignment[task.id] = task.input_values['assignment_id']

    # Open the verification pool
    pool = toloka_client.open_pool(verification_pool.id)
    print(f'Verification pool status - {pool.status}')
    return task_to_assignment


# Prepare the tasks
verification_tasks = prepare_verification_tasks()
# Add them to the pool and run the pool
task_to_assignment = run_verification_pool(verification_tasks)

We just launched our verification pool. Let's wait for it to close.

In [None]:
print('\nWaiting for verification pool to close')
wait_pool_for_close(verification_pool)
print(f'Verification pool {verification_pool.id} is finally closed!')

If you have overlap in the pool, you need to aggregate the results. Toloka already offers some aggregation functions.
In this example, we use aggregation by skill. Learn more about aggregation [here](https://yandex.com/support/toloka-requester/concepts/result-aggregation.html?lang=en).

Now let’s start the aggregation process. Wait for it to complete and then get the results. We'll use these results to accept or reject the tasks submitted in the segmentation pool.
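
Before calling the API, it can help to do a dry run that only lists which segmentation assignments would be accepted and which rejected. The sketch below assumes the aggregated verdicts have been collected into a plain dictionary that maps verification task IDs to `'OK'` or `'BAD'`. The function name and the sample IDs are hypothetical.

```python
def plan_decisions(verdicts, task_to_assignment):
    """Split segmentation assignment IDs into (to_accept, to_reject) lists.

    verdicts: verification task id -> 'OK' or 'BAD'
    task_to_assignment: verification task id -> segmentation assignment id
    """
    to_accept, to_reject = [], []
    for task_id, verdict in verdicts.items():
        assignment_id = task_to_assignment.get(task_id)
        if assignment_id is None:
            # Verdict for a task that isn't tracked in the current mapping: skip it
            continue
        if verdict == 'OK':
            to_accept.append(assignment_id)
        else:
            to_reject.append(assignment_id)
    return to_accept, to_reject


# Hypothetical aggregated verdicts and mapping
sample_verdicts = {'task-1': 'OK', 'task-2': 'BAD', 'task-old': 'OK'}
sample_mapping = {'task-1': 'assignment-a', 'task-2': 'assignment-b'}
to_accept, to_reject = plan_decisions(sample_verdicts, sample_mapping)
print(to_accept, to_reject)
```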

In [None]:
def get_aggregation_results():
    print('Start aggregation in the verification pool')
    aggregation_operation = toloka_client.aggregate_solutions_by_pool(
        type=toloka.aggregation.AggregatedSolutionType.WEIGHTED_DYNAMIC_OVERLAP,
        pool_id=verification_pool.id,   # Aggregate in this pool
        answer_weight_skill_id=verification_skill.id,   # Aggregate by this skill
        fields=[toloka.aggregation.PoolAggregatedSolutionRequest.Field(name='result')]  # Aggregate this field
    )

    # This may take some time
    aggregation_operation = toloka_client.wait_operation(aggregation_operation)
    print('Results aggregated')

    # Get aggregated results
    # Set a limit to show how to iterate over aggregation results
    aggregation_result = toloka_client.find_aggregated_solutions(aggregation_operation.id, limit=5)
    verification_results = aggregation_result.items
    # If we have more results, let's get them
    while aggregation_result.has_more:
        aggregation_result = toloka_client.find_aggregated_solutions(
            aggregation_operation.id,
            # We have to establish which id we want to get results from (or else we'll loop back)
            # This is usually the last item id in the previous request
            task_id_gt=aggregation_result.items[-1].task_id,
        )
        verification_results = verification_results + aggregation_result.items
    return verification_results


def set_segmentation_status(verification_results):
    # Reject or accept tasks in the segmentation pool
    print('Started adding results to segmentation tasks')
    for r in verification_results:
        # We need to reject or accept only previously stored assignments
        # If we try to accept or reject an already accepted assignment, an exception will be thrown
        if r.task_id not in task_to_assignment:
            continue
        # Find assignment_id in the input by task_id
        assignment_id = task_to_assignment[r.task_id]
        if r.output_values['result'] == 'OK':
            toloka_client.accept_assignment(assignment_id, "Well done!")
        else:
            toloka_client.reject_assignment(assignment_id, 'The object wasn\'t selected or was selected incorrectly.')
    print('Finished adding results to segmentation tasks')


# Aggregation operation
verification_results = get_aggregation_results()
# Reject or accept tasks in the segmentation pool
set_segmentation_status(verification_results)

We may need multiple iterations over our two pools, so we'll write some simple code to do this in a loop. Let's start it and wait.

Depending on the number of images in the segmentation pool and the time of day, this can take from 10 minutes to almost an hour.

In [None]:
while True:
    print('\nWaiting for segmentation pool to close')
    wait_pool_for_close(segmentation_pool)
    print(f'Segmentation pool {segmentation_pool.id} is finally closed!')

    # Preparing tasks
    verification_tasks = prepare_verification_tasks()

    # Make sure all the tasks are done
    if len(verification_tasks) == 0:
        print('All the tasks in our project are done')
        break

    # Add them to the pool and run the pool
    task_to_assignment = run_verification_pool(verification_tasks)

    print('\nWaiting for verification pool to close')
    wait_pool_for_close(verification_pool)
    print(f'Verification pool {verification_pool.id} is finally closed!')

    # Aggregation operation
    verification_results = get_aggregation_results()
    # Reject or accept tasks in the segmentation pool
    set_segmentation_status(verification_results)


print(f'Results received at {datetime.datetime.now()}')

---
---
## Get the results
Now you can download all the accepted tasks from the segmentation pool and work with them. In this notebook, we'll only show the segmentation results.

In [None]:
!pip3 install pillow
!pip3 install requests
from PIL import Image, ImageDraw
import requests

def get_image(url, selection):
    raw_image = requests.get(url, stream=True).raw
    image = Image.open(raw_image).convert("RGBA")
    regions = Image.new('RGBA', image.size, (255,255,255,0))
    pencil = ImageDraw.Draw(regions)
    for region in selection:
        if region['shape'] != 'rectangle':
            continue
        p1_x = region['left'] * image.size[0]
        p1_y = region['top'] * image.size[1]
        p2_x = (region['left'] + region['width']) * image.size[0]
        p2_y = (region['top'] + region['height']) * image.size[1]
        pencil.rectangle((p1_x, p1_y, p2_x, p2_y), fill=(255, 30, 30, int(255*0.5)))
    image = Image.alpha_composite(image, regions)
    return image

segmentation_result = {}  # We'll store our result here

In [None]:
max_images = 2
images = []

if not segmentation_result:
    request_for_result = toloka.search_requests.AssignmentSearchRequest(
        status=toloka.assignment.Assignment.ACCEPTED,
        pool_id=segmentation_pool.id,
    )

    for assignment in toloka_client.get_assignments(request_for_result):
        segmentation_result[assignment.tasks[0].input_values['image']] = assignment.solutions[0].output_values['result']

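# Show up to max_images results without removing them from segmentation_result
# (popitem() would raise KeyError if fewer than max_images tasks were accepted)
for url, selection in list(segmentation_result.items())[:max_images]:
    images.append(get_image(url, selection))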

ipyplot.plot_images(
    images,
    max_images=max_images,
    img_width=1000
)