
Examples for training phase of validation task #30

Closed
edufonseca opened this issue May 18, 2017 · 13 comments

@edufonseca
Contributor

As mentioned in #27, the annotation protocol will ideally consist of a training phase followed by a validation phase. In the former, some representative audio examples should be presented to the rater. How should these examples be chosen for each category? Several options:

  1. Clips whose validations were rated PP and that have the highest Freesound ranking
  2. Randomly chosen clips among those validated as PP. These will vary from rater to rater, thus mitigating bias; however, some of them might not be very representative.
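The two selection strategies could be sketched as follows (the clip fields `fs_id`, `vote` and `fs_rank`, and the function itself, are hypothetical illustrations, not the project's actual schema):

```python
import random

def pick_examples(clips, n=5, strategy="top_ranked", seed=None):
    """Pick training examples for one category.

    clips: list of dicts with hypothetical keys 'fs_id', 'vote' and
    'fs_rank' (Freesound ranking, higher = better).
    """
    # Keep only clips validated as Present and Predominant (PP)
    pp = [c for c in clips if c["vote"] == "PP"]
    if strategy == "top_ranked":
        # Option 1: deterministic, fully controlled training set
        pp.sort(key=lambda c: c["fs_rank"], reverse=True)
        return pp[:n]
    # Option 2: random sample, varies per rater to mitigate bias
    rng = random.Random(seed)
    return rng.sample(pp, min(n, len(pp)))
```

Option 1 gives every rater the same, curated set; option 2 trades that control for reduced bias.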
@xavierfav
Contributor

For case 1, no implementation is needed; we just have to add the examples to the ontology.json file.
For case 2, very little implementation is needed.

I think case 1 makes more sense, since we want to control the training phase as much as possible. Providing good examples might be essential.

@ffont
Member

ffont commented May 18, 2017

We can also default to case 2 when there are no examples chosen for case 1.

@jordipons
Contributor

jordipons commented May 22, 2017

I like @ffont's idea as well! It is nice for the "cold start" problem we face.

For case 1, I propose computing a ranking of sounds based on a score that combines PP validations and Freesound ratings.
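A minimal sketch of such a combined score (the weights and the field names `n_pp_votes` and `avg_rating` are arbitrary assumptions, not anything the project has defined):

```python
def score(clip, w_pp=1.0, w_rating=0.5):
    """Hypothetical combined score: number of PP validations plus
    a weighted Freesound average rating (0-5 scale)."""
    return w_pp * clip["n_pp_votes"] + w_rating * clip["avg_rating"]

def rank(clips, **kw):
    """Sort clips by the combined score, best first."""
    return sorted(clips, key=lambda c: score(c, **kw), reverse=True)
```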

@xavierfav
Contributor

I guess you mean Freesound ratings?

@jordipons
Contributor

Yes, thanks. I amended my comment! :)

@edufonseca
Contributor Author

For the crowdsourcing launch (i.e., annotations have been validated by a single rater), it has been suggested that:

  • We could define a number of good sound examples for every category, e.g., from 5 to 10
  • Ideally, all the categories should have examples

The examples could be used as (see #27 to locate these steps in the annotation protocol):

  1. Representative examples to be shown in the explicit part of the training phase
  2. Good examples for the hidden part of the training phase
  3. Good examples to be used for Quality Control in the Validation Phase

How to choose these examples for every category? Currently, it is proposed to:

  1. Consider clips whose annotations are currently rated as PP
  2. Rank them with the Freesound ranking
  3. Since we can't be sure that the Freesound ranking always yields the most representative audio clips for every category, a manual inspection of the resulting list should be done.

@xavierfav
Contributor

Here is a dictionary {<aso_id> : [<fs_id>, <fs_id>, ...] , ...}

which provides PP examples for each AudioSet Ontology category ID.

I have combined Freesound ratings and downloads to sort them.
I manually checked a few categories; it seems to provide good examples.
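Building such a dictionary could look roughly like this (the record fields `aso_id`, `fs_id`, `downloads` and `avg_rating` are assumed names for illustration, not the actual Freesound or FSD schema):

```python
from collections import defaultdict

def build_examples_dict(records):
    """Group candidate clips per AudioSet Ontology category and sort
    each list by a combination of Freesound downloads and average
    rating, best candidates first."""
    by_cat = defaultdict(list)
    for r in records:
        by_cat[r["aso_id"]].append(r)
    return {
        cat: [r["fs_id"] for r in sorted(
            clips,
            key=lambda r: (r["downloads"], r["avg_rating"]),
            reverse=True)]
        for cat, clips in by_cat.items()
    }
```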

As you said, we now need to manually validate some of these examples and put them in the ontology_preCrowd.json file, in the "positive_examples" field for each category.
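Filling the "positive_examples" field could be done with a small script along these lines (it assumes the ontology file is a JSON list of category dicts keyed by "id", which is an assumption about its layout):

```python
import json

def set_positive_examples(path, examples_by_id):
    """Write validated example IDs into the 'positive_examples' field
    of each matching category in the ontology JSON file."""
    with open(path) as f:
        ontology = json.load(f)
    for cat in ontology:
        if cat["id"] in examples_by_id:
            cat["positive_examples"] = examples_by_id[cat["id"]]
    with open(path, "w") as f:
        json.dump(ontology, f, indent=2)
```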

@xavierfav
Contributor

Definitely, the best way would be to implement this functionality on the web platform.
Since we are quite in a rush, I prefer to put the priority on other things (the annotation protocol).

So I will leave the option of filling the JSON file directly with the Freesound examples.

Please start adding and checking examples!

@xavierfav
Contributor

Tool is ready for adding the examples:

  • clone the repo
  • open the Google sheet
  • use the script_examples_for_fsd.py script and fill in the Google sheet

@edufonseca
Contributor Author

After inspecting the tool, it seems good for the task. I have a few suggestions:

  1. The outcome will be a set of (ideally 10) examples for every sound category. They will be used to show something representative to the rater and as verification clips. For both cases, short examples are best, for clarity and simplicity. However, in the proposed list of candidate examples, some clips are even longer than 1 min. Options:
  • Maybe we could add duration to the criteria for creating this list.
  • Or we could tell the subject to focus first on clips shorter than 10 s, for instance.
    But IMO we should not have examples longer than 10 s (ideally).
  2. What is our initial target?
  • Get examples for the current 398 FSD categories?
  • Get examples for the 632 AudioSet categories? This would be great, but it may take a significant amount of additional effort: there may not be enough candidates provided (due to scarce validation), so the user would have to go to Freesound and find them...
    Maybe we can decide this depending on the manpower we have.

@xavierfav
Contributor

  1. Candidate examples updated:
    The same order is kept (using Freesound downloads and ratings), but the results are then organized by duration: sounds shorter than 10 s are presented first, then those between 10 and 20 s, and finally those longer than 20 s.
    This way it will be easier to find short, relevant examples.

  2. Spreadsheet updated:

  • it presents only the 398 categories that we considered during TTs
  • added MULTIPLE PARENTS after the path for the ambiguous cases
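The duration-based reordering described in point 1 could be sketched as follows (the `duration` field name is an assumption; the stable sort is what preserves the downloads/ratings order inside each bucket):

```python
def duration_bucket(seconds):
    """Bucket index used to reorder candidates:
    <10 s first, 10-20 s next, then longer clips."""
    if seconds < 10:
        return 0
    if seconds < 20:
        return 1
    return 2

def reorder_by_duration(clips):
    """Stable sort: keeps the original (downloads/ratings) order
    within each duration bucket."""
    return sorted(clips, key=lambda c: duration_bucket(c["duration"]))
```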

@xavierfav
Contributor

Providing examples for all 398 categories seems to be complicated.
Some categories are hard to distinguish, or just hard to recognize.

I think we don't have time to gather enough examples to have this ready for the platform launch.

@xavierfav
Contributor

  • A lot of examples have been provided (thanks @jordipons and all the contributors).

  • However, it seems that they do not always correspond to a Present and Predominant source in the clips.

  • Moreover, we still need to select the examples that are shown to users and the ones used for Quality Control. In some cases (multiple parents), it would be nice to select examples for all possible parents.

  • A few Freesound IDs did not correspond to any sound on our platform (detailed at the end of this file).

  • From the Admin page it is now possible to edit these examples (Add access to admin page for editing TaxonomyNode fields #79).

@xavierfav xavierfav reopened this Oct 26, 2017
@xavierfav xavierfav added this to the Platform Launch milestone Nov 7, 2017