# Video collection

### Call to action
If you find a bug or have an idea for a new feature, don't hesitate to [open a new issue on GitHub](https://github.com/Toloka/toloka-kit/issues/new/choose).
Like our library and examples? Star [our repo on GitHub](https://github.com/Toloka/toloka-kit).

Prepare the environment and import everything you'll need.

In [None]:
%%capture
!pip install toloka-kit==0.1.26
!pip install ipython

import datetime
import logging
import sys
import time
import getpass

import pandas

import toloka.client as toloka
import toloka.client.project.template_builder as tb

In [None]:
logging.basicConfig(
    format='[%(levelname)s] %(name)s: %(message)s',
    level=logging.INFO,
    stream=sys.stdout,
)

Create a toloka-client instance. All API calls will go through it. Learn more about the OAuth token in our [Learn the basics example](https://github.com/Toloka/toloka-kit/tree/main/examples/0.getting_started/0.learn_the_basics) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Toloka/toloka-kit/blob/main/examples/0.getting_started/0.learn_the_basics/learn_the_basics.ipynb)

In [None]:
toloka_client = toloka.TolokaClient(getpass.getpass('Enter your OAuth token: '), 'PRODUCTION') # Or switch to 'SANDBOX'
print(toloka_client.get_requester())

## Create a project

In [None]:
new_project = toloka.Project(
    public_name='Record one 5 second video of a hand gesture',
    public_description='Make a short video of your hand moving from one given gesture to another.',
)

Configure the task interface.

Read more about the [Template Builder](https://yandex.ru/support/toloka-tb/index.html?utm_source=github&utm_medium=site&utm_campaign=tolokakit) in the Requester’s Guide.

In [None]:
describe_text = """Record a 5 second video with the following requirements:\n
1. Bright & solid background. Good light.
2. Only 1 hand in the video! No other objects!
3. Show all the following gestures:\n\n"""

main_md = tb.MarkdownViewV1(
    tb.JoinHelperV1([describe_text, '# ', tb.InputData('emoji1'), ' &#10145; ', tb.InputData('emoji2'), '\n\nExample:'])
)

example_gif = tb.ImageViewV1(tb.InputData('examplegif'), max_width=300)

attention_md = tb.MarkdownViewV1("""**Attention!**
    If the video isn\'t satisfactory, the task won\'t be accepted.
    Look at the instructions with examples."""
)

video = tb.MediaFileFieldV1(
    tb.OutputData('path'),
    tb.MediaFileFieldV1.Accept(video=True),
    validation=tb.RequiredConditionV1(hint='Add a video'),
    multiple=False,
)

project_interface = toloka.project.TemplateBuilderViewSpec(
    view=tb.ListViewV1([main_md, example_gif, attention_md, video]),
    plugins=[tb.TolokaPluginV1('scroll', task_width=400)],
)

Set the data specification and attach the task interface to the project.
> Specifications are a description of input data that will be used in a project and the output data that will be collected from the performers.

Read more about [input and output data specifications](https://yandex.ru/support/toloka-tb/operations/create-specs.html?utm_source=github&utm_medium=site&utm_campaign=tolokakit) in the Requester’s Guide.

In [None]:
input_specification = {
    'emoji1': toloka.project.JsonSpec(),
    'emoji2': toloka.project.JsonSpec(),
    'examplegif': toloka.project.UrlSpec(),
}
output_specification = {
    'path': toloka.project.FileSpec(),
}

new_project.task_spec = toloka.project.task_spec.TaskSpec(
        input_spec=input_specification,
        output_spec=output_specification,
        view_spec=project_interface,
)

Write comprehensive instructions.

> With video collection tasks, it’s extra important to specify all the necessary requirements (things like light, background, position, and more). Since we’ll be reviewing assignments, we need to describe all the requirements as clearly as possible. After all, task acceptance and the performers’ earnings depend on how well these instructions are written.

Get more tips on [designing instructions](https://toloka.ai/knowledgebase/instruction?utm_source=github&utm_medium=site&utm_campaign=tolokakit) in our Knowledge Base.

In [None]:
new_project.public_instructions = """Task - make one video where your hand is the actor.<br>
You will get two emojis.<br>
Make sure your face is not in the video.<br>
<br>
<h2>Video recording</h2>
<ul>
<li>Show the first emoji with your hand.</li>
<li>Move or transfer your hand to show the second emoji.</li>
</ul>
Remember, all videos will be checked after uploading, so each video must satisfy all of the criteria.
"""

Create a project via API request.

In [None]:
new_project = toloka_client.create_project(new_project)

## Create a pool

Create the “Record video of emoji” skill, which will be assigned to performers after they complete the pool tasks. You will use it later to set a skill value for performers who complete your task.

In [None]:
video_skill = next(toloka_client.get_skills(name='Record video of emoji'), None)
if video_skill:
    print('Video skill already exists')
else:
    video_skill = toloka_client.create_skill(
        name='Record video of emoji',
        hidden=True,
        public_requester_description={'EN': 'Performer records videos'},
    )

A pool is a set of paid tasks grouped into task pages. These tasks are sent out for completion at the same time.

> All tasks within a pool have the same settings (price, quality control, etc.)

Give the pool any name you find suitable. You are the only one who will see it.


Set the price per task suite (for example, $0.03).

> Video recording tasks may vary in price depending on the required effort. In our case, there are no extra requirements (like outdoor shooting), which means we can set a standard price.

Read more about [pricing principles](https://toloka.ai/knowledgebase/pricing?utm_source=github&utm_medium=site&utm_campaign=tolokakit) in our Knowledge Base.
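
As a rough budgeting aid, the sketch below estimates what a pool pays out to performers: price per task suite times overlap times the number of task suites. The helper name `estimate_pool_payout` and the count of 10 task suites are hypothetical; the price ($0.03) and the overlap (3) mirror the values used in this example. It assumes every assignment is accepted and does not include any platform fee.

```python
from decimal import Decimal


def estimate_pool_payout(price_per_suite: str, overlap: int, task_suites: int) -> Decimal:
    """Return the total reward paid to performers if every assignment is accepted.

    Hypothetical helper for illustration only; platform fees are not included.
    """
    # Decimal avoids the rounding noise that binary floats add to money values.
    return Decimal(price_per_suite) * overlap * task_suites


# Assumed numbers: $0.03 per task suite, overlap of 3, and 10 task suites.
total = estimate_pool_payout('0.03', overlap=3, task_suites=10)
print(f'Estimated payout: ${total}')
```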

In [None]:
# Create a pool
new_pool = toloka.Pool(
    project_id=new_project.id,
    private_name='Record one 5 second video of a hand gesture',
    may_contain_adult_content=False,
    will_expire=datetime.datetime.utcnow() + datetime.timedelta(days=365),
    reward_per_assignment=0.03,
    auto_accept_solutions=False,
    auto_accept_period_day=1,
    assignment_max_duration_seconds=60*15,
    defaults=toloka.Pool.Defaults(default_overlap_for_new_task_suites=3),
    filter=(
        (toloka.filter.Languages.in_('EN')) &
        (toloka.filter.ClientType == 'TOLOKA_APP') &
        (
            # This filter means that access to this pool will be granted either
            # to newbies who don’t have the quality skill yet, or to those who
            # submit at least 75% of tasks correctly.
            (toloka.filter.Skill(video_skill.id) == None) |
            (toloka.filter.Skill(video_skill.id) >= 75 )
        )
    ),
)

new_pool.set_mixer_config(real_tasks_count=1)
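
The skill part of the filter above can be read as a plain predicate. The sketch below only illustrates that logic in ordinary Python; it is not how Toloka evaluates filters, and `can_access_pool` is a hypothetical name.

```python
from typing import Optional


def can_access_pool(skill_value: Optional[float], threshold: float = 75) -> bool:
    """Mirror the skill condition of the pool filter.

    None means the performer has no value for the skill yet (a newcomer).
    """
    return skill_value is None or skill_value >= threshold


print(can_access_pool(None))  # newcomer without the skill
print(can_access_pool(80))    # skill above the threshold
print(can_access_pool(50))    # skill below the threshold
```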

**Set up [Quality control](https://toloka.ai/en/docs/guide/concepts/control?utm_source=github&utm_medium=site&utm_campaign=tolokakit).**
> Since there is no one true answer to a video recording task that can be used as ground truth, post-acceptance is the preferable way to check if the recordings provided are acceptable.

Read more about [quality control principles](https://toloka.ai/knowledgebase/quality-control?utm_source=github&utm_medium=site&utm_campaign=tolokakit) in our Knowledge Base or check out [post-acceptance settings](https://toloka.ai/en/docs/guide/concepts/offline-accept) in the Requester’s Guide.


Set up the Results of assignments review rule. Use this rule to assign a skill value to performers after a specific number of tasks is reviewed.
>  A skill is a characteristic of the performer described by a number between 0 and 100.  In this case, you add the performers’ quality based on their task acceptance rate to the skill you created earlier.

Read more about [performer skills](https://toloka.ai/en/docs/guide/concepts/nav?utm_source=github&utm_medium=site&utm_campaign=tolokakit) in the Requester’s Guide.

In [None]:
"""
new_pool.quality_control.add_action(
    collector=toloka.collectors.AcceptanceRate(),
    conditions=[toloka.conditions.TotalAssignmentsCount > 2,],
    action=toloka.actions.SetSkillFromOutputField(skill_id=video_skill.id, from_field='approvedAssignmentsRate'),
)
"""
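
To make the rule concrete, here is a minimal sketch of the arithmetic behind it. It assumes the skill is simply the percentage of accepted assignments among the reviewed ones, and that it is only set once more than two assignments have been counted, as in the `TotalAssignmentsCount > 2` condition above. `skill_from_reviews` is a hypothetical helper, not part of toloka-kit, and any rounding Toloka applies is not modelled.

```python
from typing import Optional


def skill_from_reviews(accepted: int, rejected: int, min_reviewed: int = 2) -> Optional[float]:
    """Return a 0-100 skill value, or None while too few assignments are reviewed."""
    reviewed = accepted + rejected
    if reviewed <= min_reviewed:
        # Not enough reviewed assignments yet: leave the skill unset.
        return None
    return 100 * accepted / reviewed


print(skill_from_reviews(accepted=3, rejected=1))
print(skill_from_reviews(accepted=1, rejected=1))
```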

Set up the [Fast Responses](https://yandex.ru/support/toloka-requester/concepts/quick-answers.html?utm_source=github&utm_medium=site&utm_campaign=tolokakit) rule. If a performer submits a response too quickly (for example, it’s not possible to record and upload a video in 10 seconds), they will be banned from the project for 3 days.

In [None]:
new_pool.quality_control.add_action(
    collector=toloka.collectors.AssignmentSubmitTime(fast_submit_threshold_seconds=10),
    conditions=[toloka.conditions.FastSubmittedCount > 0],
    action=toloka.actions.RestrictionV2(
        scope=toloka.user_restriction.UserRestriction.PROJECT,
        duration=3,
        duration_unit='DAYS',
        private_comment='Fast response',
    )
)
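
The idea behind the rule can be sketched in plain Python: count the submissions that took less than the threshold and restrict the performer if there is at least one. This is an illustration under assumptions, not Toloka's implementation. The helper names are hypothetical, and whether the real comparison is strict is not stated in this example.

```python
from typing import Iterable


def count_fast_submits(durations_seconds: Iterable[float], threshold_seconds: float = 10) -> int:
    """Count assignments submitted faster than the threshold (strict comparison assumed)."""
    return sum(1 for duration in durations_seconds if duration < threshold_seconds)


def should_restrict(durations_seconds: Iterable[float], threshold_seconds: float = 10) -> bool:
    """Mirror the condition FastSubmittedCount > 0 from the rule above."""
    return count_fast_submits(durations_seconds, threshold_seconds) > 0


# Hypothetical submit times, in seconds, for one performer.
sample_durations = [45, 8, 120]
print(should_restrict(sample_durations))
```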

Create a pool.

In [None]:
new_pool = toloka_client.create_pool(new_pool)

Mobile devices will display the task like this:

<table  align="center">
  <tr><td>
    <img src="./img/performer_interface.png"
         alt="How performers will see your task on mobile"  height="600">
  </td></tr>
  <tr><td align="center">
    <b>Figure 1.</b> How performers will see your task on mobile
  </td></tr>
</table>

Note: In preview mode you won't be able to upload a video and look at the result. This restriction is related to the preview features and doesn't affect performers.

## Prepare and upload tasks
This example uses a small dataset with emojis and GIFs.

The dataset was collected by the Toloka team and is distributed under a Creative Commons Attribution 4.0 International license.
[![License: CC BY 4.0](https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg)](https://creativecommons.org/licenses/by/4.0/).

In [None]:
!curl https://tlk.s3.yandex.net/dataset/video_emoji/data.tsv --output dataset.tsv

dataset = pandas.read_csv('dataset.tsv', sep='\t')
print(dataset)

The raw table is not very readable. Let's take a closer look at the emoji pairs.

In [None]:
from html import unescape

for row in dataset.itertuples():
    print(f'{unescape(row.emoji1)} -> {unescape(row.emoji2)}')
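
The emoji columns appear to hold HTML character references rather than literal characters, which is why `html.unescape` is applied before printing. Here is a small standalone illustration, using sample references chosen for this sketch rather than rows from the dataset:

```python
from html import unescape

# Sample numeric character references (assumed for illustration).
thumbs_up = unescape('&#128077;')    # decimal reference for U+1F44D
thumbs_down = unescape('&#x1F44E;')  # hexadecimal reference for U+1F44E

print(f'{thumbs_up} -> {thumbs_down}')
```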

<table  align="center">
  <tr><td>
    <img src="https://tlk.s3.yandex.net/dataset/video_emoji/video/biguptobigdown.gif"
         alt="Example gif for performers"  height="600">
  </td></tr>
  <tr><td align="center">
    <b>Figure 2.</b> Example gif for performers
  </td></tr>
</table>

Prepare tasks.

In [None]:
tasks = [
    toloka.Task(
        pool_id=new_pool.id,
        input_values={
            'emoji1': row.emoji1,
            'emoji2': row.emoji2,
            'examplegif': row.examplegif,
        },
    )
    for row in dataset.itertuples()
]
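
Before uploading, it can be worth checking that every row carries the three fields declared in the input specification. The sketch below works on plain dictionaries; `find_incomplete_rows` and the sample rows (including the `example.com` URLs) are hypothetical and are not taken from the dataset.

```python
REQUIRED_FIELDS = ('emoji1', 'emoji2', 'examplegif')


def find_incomplete_rows(rows):
    """Return the indexes of rows where a required field is missing or empty."""
    incomplete = []
    for index, row in enumerate(rows):
        if any(not row.get(field) for field in REQUIRED_FIELDS):
            incomplete.append(index)
    return incomplete


# Hypothetical rows: the second one has an empty emoji2 value.
sample_rows = [
    {'emoji1': '&#128077;', 'emoji2': '&#128078;', 'examplegif': 'https://example.com/a.gif'},
    {'emoji1': '&#128077;', 'emoji2': '', 'examplegif': 'https://example.com/b.gif'},
]
print(find_incomplete_rows(sample_rows))
```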

Upload tasks.

In [None]:
created_tasks = toloka_client.create_tasks(tasks, allow_defaults=True)
print(len(created_tasks.items))

Start the pool.

**Important.** Remember that real Toloka performers will complete the tasks.
Double check that everything is correct with your project configuration before you start the pool.

In [None]:
new_pool = toloka_client.open_pool(new_pool.id)
print(new_pool.status)

## Receiving responses

Wait until the pool is completed.

In [None]:
pool_id = new_pool.id
def wait_pool_for_close(pool_id, minutes_to_wait=1):
    sleep_time = 60 * minutes_to_wait
    pool = toloka_client.get_pool(pool_id)
    while not pool.is_closed():
        op = toloka_client.get_analytics([toloka.analytics_request.CompletionPercentagePoolAnalytics(subject_id=pool.id)])
        op = toloka_client.wait_operation(op)
        percentage = op.details['value'][0]['result']['value']
        print(
            f'   {datetime.datetime.now().strftime("%H:%M:%S")}\t'
            f'Pool {pool.id} - {percentage}%'
        )
        time.sleep(sleep_time)
        pool = toloka_client.get_pool(pool.id)
    print('Pool was closed.')

wait_pool_for_close(pool_id)
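
The loop above waits indefinitely. If an upper bound on the waiting time is useful, a generic polling helper with a timeout is one option. The sketch below is standalone and uses only the standard library; `wait_until` and its parameters are hypothetical names, and the fake status sequence stands in for real pool checks.

```python
import time
from typing import Callable


def wait_until(
    is_done: Callable[[], bool],
    interval_seconds: float,
    timeout_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll is_done until it returns True or the timeout elapses.

    Returns True if the condition was met and False on timeout.
    """
    deadline = time.monotonic() + timeout_seconds
    while not is_done():
        if time.monotonic() >= deadline:
            return False
        sleep(interval_seconds)
    return True


# Fake status sequence: "not closed" twice, then "closed".
states = iter([False, False, True])
finished = wait_until(lambda: next(states), interval_seconds=0, timeout_seconds=5)
print(finished)
```

With the client from this notebook, `is_done` could be `lambda: toloka_client.get_pool(pool_id).is_closed()`.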

Since the main quality control method for this kind of task is post-acceptance, you will need to review the tasks after the pool is completed.

You can check the quality of responses and reject and reevaluate incorrect assignments. Performers will get paid only after their assignment is accepted.

There are two ways to review assignments:

- manually
- in a separate Toloka project

> Read more about [processing rejected assignments](https://toloka.ai/en/docs/guide/concepts/reassessment-after-accepting?utm_source=github&utm_medium=site&utm_campaign=tolokakit) in our Requester’s Guide.

Another way to review tasks is to ask other performers to do it. We recommend this option when you have limited resources for checking tasks yourself. Check how it's done in our [object detection example](https://github.com/Toloka/toloka-kit/tree/main/examples/1.computer_vision/object_detection). [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Toloka/toloka-kit/blob/main/examples/1.computer_vision/object_detection/object_detection.ipynb)
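
For the manual route, each verdict eventually turns into an accept or reject call with a comment for the performer. The sketch below assumes the client exposes `accept_assignment(assignment_id, public_comment)` and `reject_assignment(assignment_id, public_comment)`, as toloka-kit's `TolokaClient` is expected to. `review_assignment` and `FakeClient` are hypothetical and exist only so the example runs without network access.

```python
def review_assignment(client, assignment_id: str, is_acceptable: bool, comment: str):
    """Accept or reject one assignment, passing a comment for the performer."""
    if is_acceptable:
        return client.accept_assignment(assignment_id, comment)
    return client.reject_assignment(assignment_id, comment)


class FakeClient:
    """Stand-in client that records calls instead of sending API requests."""

    def __init__(self):
        self.calls = []

    def accept_assignment(self, assignment_id, public_comment):
        self.calls.append(('accept', assignment_id, public_comment))

    def reject_assignment(self, assignment_id, public_comment):
        self.calls.append(('reject', assignment_id, public_comment))


fake = FakeClient()
review_assignment(fake, 'assignment-1', True, 'Well done')
review_assignment(fake, 'assignment-2', False, 'Two hands are visible in the video')
print(fake.calls)
```

In the notebook, the real `toloka_client` would take the place of `fake`, and the assignment IDs would come from `toloka_client.get_assignments`.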

## Showing results

Configure data display.

In [None]:
from IPython import display

results_list = []

for assignment in toloka_client.get_assignments(pool_id=pool_id, status='SUBMITTED'):
    for solution in assignment.solutions:
        results_list.append(solution.output_values)
print(len(results_list))
results_iter = iter(results_list)

Run the cell below multiple times to see different responses.

In [None]:
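res = next(results_iter, None)
if res is not None:
    # Download the attached video into a temporary local file.
    with open('tmp_video_file', 'w+b') as out_f:
        toloka_client.download_attachment(res['path'], out_f)
    # Show the video only when a new file was actually downloaded.
    display.display(display.Video('./tmp_video_file', height=300))
else:
    print('No more results')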