Add strategy for prioritizing annotations to be voted #23

xavierfav · 2017-05-16T11:12:55Z

In order to have "the best dataset we can at a time t", we have chosen some constraints [TO BE DISCUSSED]:

- vote on all annotation candidates for a sound (in order to get closer to a "complete" annotation for that sound)
- annotation candidates need 2 identical votes to be considered valid
- prioritize sounds with a length < 30 sec
- prioritize sounds with "good quality" (use Freesound downloads and ratings? a descriptor for quality?)

We need to implement a "manager" that selects the annotations and the sounds to be voted on.
Ideally, a priority ranking should be derived from the constraints, and annotations should be proposed to crowd-workers following this ranking.
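
To make the "2 identical votes" constraint concrete, here is a minimal sketch of how the check could look. The function name `agreed_vote` and the vote labels used in the usage note are hypothetical; the platform may represent votes differently.

```python
from collections import Counter

# From the constraint above: 2 identical votes make a candidate valid.
REQUIRED_AGREEMENT = 2


def agreed_vote(votes, required=REQUIRED_AGREEMENT):
    """Return the vote value that reached `required` identical votes, else None.

    `votes` is any list of hashable vote values (hypothetical representation).
    """
    if not votes:
        return None
    # most_common(1) gives the single most frequent value and its count.
    value, count = Counter(votes).most_common(1)[0]
    return value if count >= required else None
```

For example, `agreed_vote(["PP", "NP", "PP"])` returns `"PP"`, while `agreed_vote(["PP", "NP"])` returns `None` because no value has been chosen twice yet.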

ffont · 2017-05-18T15:22:18Z

I agree with what you propose.
You probably want to have these scores precomputed and stored in a property of the Annotation model, because otherwise it's probably complicated to compute all the scores in real time (especially if the score is complex to compute).

I suggest you start by implementing a function which, given an annotation, returns a "priority score". This could be a method of the Annotation class.
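
A rough sketch of what such a method could look like, written as a plain dataclass rather than the real Django model so it runs on its own. The class name, field names, and weights are all assumptions made for illustration; only the "< 30 sec" threshold comes from the constraints listed in this thread.

```python
from dataclasses import dataclass


@dataclass
class AnnotationCandidate:
    """Hypothetical stand-in for the Annotation model."""
    annotation_id: int
    num_votes: int
    sound_duration: float  # seconds

    def priority_score(self) -> float:
        """Higher score means the candidate should be proposed earlier."""
        score = 0.0
        # Assumed weight: a candidate that already has a vote is closer
        # to reaching the 2 identical votes it needs.
        if self.num_votes >= 1:
            score += 1.0
        # Assumed weight: short sounds (< 30 sec) are preferred.
        if self.sound_duration < 30:
            score += 0.5
        return score


def rank(candidates):
    """Return candidates sorted from highest to lowest priority."""
    return sorted(candidates, key=lambda c: c.priority_score(), reverse=True)
```

In the real platform the value returned by `priority_score()` could be stored on the model whenever a vote is added, so the ranking query only has to sort on a precomputed column.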

xavierfav · 2017-07-12T11:18:41Z

For now, the prioritization is based on votes:
Annotations that have at least one vote are prioritized.

In order to include the other constraints listed in this post, we need some Freesound metadata that we don't have in the current platform (ratings, number of downloads).

@ffont
Should we use the API to get this data, or should I load it into our model so we have it in the FSD platform?

Moreover, about the first point: vote on all annotation candidates for a sound (in order to get closer to a "complete" annotation for that sound)
I would say that, as it is now, it is not worth doing: because we have not worked on population and on prioritizing leaf nodes, we would prioritize annotations that are not worth voting on (e.g. voting on "dog bark", "dog" and "animal"). We should first work on how to populate whenever an annotation is considered ground truth.
We have been inspecting ambiguous cases with edufonseca (categories with more than one parent) to see whether or not it makes sense to distinguish two categories, and whether it makes sense to populate to the different parents or not.

ffont · 2017-07-12T11:27:15Z

@xavierfav We should use the API to load the data in the FSD platform ;)
There has always been the idea of writing a management command that iterates over all sounds and gets data from Freesound to store in the JSON field of each sound. I'm not sure if something similar was ever implemented (I guess not). I think this is the way to go: have a command that you can run from time to time to re-sync with Freesound.

When implementing the command, I'd iterate over all sounds in groups of N, and then use the API to make a search restricting the results to the IDs of these sounds (you can "OR" sound IDs in the search filter). Then, using the fields param, you decide which information you want returned and stored in the FSD platform. This way, the number of requests needed is n_sounds/N instead of n_sounds. N could theoretically be set to 150 (max number of search results per page), but the limitation here is the length of the URL (as all filtered sound IDs will be in the URL). I think N=50 should be fine. Otherwise, try lower or higher values.
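
The batching could be sketched roughly like this. The helper names are hypothetical, and the exact filter syntax and field names (`num_downloads`, `avg_rating`) are assumptions that should be checked against the Freesound API documentation. The sketch only builds the request parameters and makes no network calls.

```python
def chunked(ids, size):
    """Yield consecutive groups of at most `size` IDs."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def build_search_params(sound_ids, fields=("id", "num_downloads", "avg_rating")):
    """Build query parameters for one search restricted to `sound_ids`.

    Assumed conventions: IDs are OR-ed inside the `filter` parameter and
    the returned attributes are listed in `fields`.
    """
    id_filter = "id:(" + " OR ".join(str(i) for i in sound_ids) + ")"
    return {
        "filter": id_filter,
        "fields": ",".join(fields),
        # Ask for one page large enough to hold the whole group.
        "page_size": len(sound_ids),
    }


def plan_requests(all_ids, group_size=50):
    """Return one parameter dict per group: about n_sounds/N requests."""
    return [build_search_params(group) for group in chunked(all_ids, group_size)]
```

With 120 sound IDs and N=50, `plan_requests` produces 3 parameter dicts instead of 120 separate requests; the management command would send each one and store the returned values in the JSON field of the matching sound.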

edufonseca · 2017-07-17T20:21:57Z

In the constraints listed in the first comment, it was suggested to prioritize sounds with length < 30 sec. I think we should be more specific here. How about prioritizing (in addition to the other aforementioned constraints):

- sounds with length < 10s (just as in AudioSet). This will presumably yield more PP and also shorter sounds, which may be more useful at this point.
- when the above are exhausted, sounds with length < 20s
- when the above are exhausted, sounds with length < 30s
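
The tiers above could be expressed as a small lookup, sketched here with hypothetical names. Tier 0 is served first, and sounds of 30s or longer fall into a last tier.

```python
import bisect

# Upper limits (in seconds) of the proposed tiers: < 10s, < 20s, < 30s.
DURATION_LIMITS = [10, 20, 30]


def duration_tier(duration_seconds):
    """Return 0 for < 10s, 1 for < 20s, 2 for < 30s, and 3 otherwise.

    bisect_right counts the limits that are <= the duration, so a sound
    of exactly 10s lands in tier 1, matching the strict "<" comparison.
    """
    return bisect.bisect_right(DURATION_LIMITS, duration_seconds)


def order_by_tier(durations_by_sound_id):
    """Return sound IDs sorted so that lower tiers come first."""
    return sorted(durations_by_sound_id,
                  key=lambda sid: duration_tier(durations_by_sound_id[sid]))
```

Sorting by tier rather than by raw duration keeps the order within a tier free for the other criteria (votes, quality) to decide.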

xavierfav · 2017-10-26T11:27:43Z

Sounds with length < 10 sec are prioritized #70

xavierfav assigned xavierfav, jordipons and edufonseca May 16, 2017

edufonseca mentioned this issue May 18, 2017

Annotation protocol for validation task #27

Closed

edufonseca added the validation task label May 18, 2017

xavierfav mentioned this issue Jul 11, 2017

Subset of annotations for FSD 1st release #26

Closed

xavierfav mentioned this issue Jul 14, 2017

Sounds from Freesound are sometimes deleted #55

Closed

xavierfav closed this as completed Oct 26, 2017
