CrowdAnnotationAggregator

Developed By Carlo Bernaschina (GitHub - B3rn475)
www.bernaschina.com

Copyright (c) 2014 Politecnico di Milano
www.polimi.it
Distributed under the LGPL Licence

CrowdAnnotationAggregator is a Java library that aggregates the annotations of a crowdsourcing system, regardless of the specific type of annotation.

This system aggregates annotations that come from different users on the same object/content. It requires every object to be annotated by at least 3 users, every user to annotate at least 3 objects, and every user to annotate each object at most once. Any annotation that does not follow these rules is rejected.
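
The three rules can be checked on a data set before it is handed to the library. The following is a minimal sketch, assuming annotations are reduced to plain (annotator id, content id) pairs; the class `ConstraintCheck` and its method are hypothetical and not part of the library, and the sketch only reports whether a data set is valid, not how the library itself rejects annotations.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of CrowdAnnotationAggregator.
class ConstraintCheck {
    static final int MIN_DEGREE = 3;

    /** Each pair is {annotatorId, contentId}. Returns true when all three rules hold. */
    static boolean satisfiesRules(List<int[]> pairs) {
        Map<Integer, Integer> perAnnotator = new HashMap<>();
        Map<Integer, Integer> perContent = new HashMap<>();
        Set<Long> seen = new HashSet<>();
        for (int[] p : pairs) {
            // Pack the two ids into one key to detect repeated (annotator, content) pairs.
            long key = ((long) p[0] << 32) | (p[1] & 0xffffffffL);
            if (!seen.add(key)) {
                return false; // the same user annotated the same object twice
            }
            perAnnotator.merge(p[0], 1, Integer::sum);
            perContent.merge(p[1], 1, Integer::sum);
        }
        for (int count : perAnnotator.values()) {
            if (count < MIN_DEGREE) return false; // user annotated fewer than 3 objects
        }
        for (int count : perContent.values()) {
            if (count < MIN_DEGREE) return false; // object annotated by fewer than 3 users
        }
        return true;
    }
}
```

A full 3 x 3 grid (three users who each annotate the same three objects) is the smallest data set that passes this check.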

The data are organized as a bipartite graph, in which nodes represent both users (on the right) and objects/contents (on the left), and in which edges represent annotations of a user on an object. The system assigns a weight to every annotation and computes the final annotation as a weighted aggregation of the annotations. The weights are estimated using an iterative approach:

  • Step 1: for every annotation, an estimate is computed that takes into account all the other annotations from different users on the same object.
  • Step 2: the weight of the annotation is estimated taking into account all the (annotation, estimate) pairs of the same user on all the other objects.
    The algorithm stops when convergence is reached (or when a maximum number of steps is reached).
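
The two steps can be illustrated on numeric annotations. This is a minimal sketch under stated assumptions, not the library's implementation: annotations are doubles stored in a dense matrix (every user annotates every object), the aggregation is a weighted mean, and the weight is `1 / (1 + mean absolute error)`. All names (`IterativeWeightingSketch`, `aggregate`) are hypothetical.

```java
import java.util.Arrays;

// Hypothetical, self-contained sketch; none of these names come from the library.
class IterativeWeightingSketch {

    /**
     * a[u][o] is the numeric annotation of user u on object o.
     * Assumes a dense matrix with at least 2 users and 2 objects.
     */
    static double[] aggregate(double[][] a, int maxSteps, double epsilon) {
        int users = a.length;
        int objects = a[0].length;
        double[][] w = new double[users][objects];
        for (double[] row : w) Arrays.fill(row, 1.0); // default initial weight

        for (int step = 0; step < maxSteps; step++) {
            // Step 1: estimate each annotation from the other users on the same object.
            double[][] est = new double[users][objects];
            for (int o = 0; o < objects; o++) {
                for (int u = 0; u < users; u++) {
                    double num = 0, den = 0;
                    for (int v = 0; v < users; v++) {
                        if (v == u) continue; // leave the current user out
                        num += w[v][o] * a[v][o];
                        den += w[v][o];
                    }
                    est[u][o] = num / den;
                }
            }
            // Step 2: re-estimate each weight from the same user's
            // (annotation, estimate) pairs on all the other objects.
            double[][] next = new double[users][objects];
            double maxChange = 0;
            for (int u = 0; u < users; u++) {
                for (int o = 0; o < objects; o++) {
                    double error = 0;
                    for (int p = 0; p < objects; p++) {
                        if (p != o) error += Math.abs(a[u][p] - est[u][p]);
                    }
                    next[u][o] = 1.0 / (1.0 + error / (objects - 1));
                    maxChange = Math.max(maxChange, Math.abs(next[u][o] - w[u][o]));
                }
            }
            w = next;
            if (maxChange < epsilon) break; // convergence reached
        }

        // Final annotation per object: weighted mean over all users.
        double[] result = new double[objects];
        for (int o = 0; o < objects; o++) {
            double num = 0, den = 0;
            for (int u = 0; u < users; u++) {
                num += w[u][o] * a[u][o];
                den += w[u][o];
            }
            result[o] = num / den;
        }
        return result;
    }
}
```

With two users who always answer 1 and one who always answers 5, the dissenting user ends up with a lower weight, so the result for each object falls below the unweighted mean of about 2.33.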

It solves the problem by relying on two custom functions that depend on the type of annotation and must be implemented for each specific case.

Base Classes

Content
This class represents the object that is going to be annotated. By default it has only one field, id, which is used to identify it. The id must be greater than 0.
The id is the only field used for identification, so two Content objects with the same id are treated as the same object.
If you need to add information, you can implement a class that inherits from it and add the fields you need.
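
The idea of identity by id can be sketched as follows. `SketchContent` is a stand-in written for this example, not the library's `Content` class: the `long` id type, the constructor, and the exception thrown for a non-positive id are all assumptions. `ImageContent` is a hypothetical subclass that adds one field.

```java
// Stand-in for the library's Content class (assumed shape, not the real source).
class SketchContent {
    private final long id;

    SketchContent(long id) {
        if (id <= 0) {
            // Assumption: how the real class reacts to an invalid id is not documented here.
            throw new IllegalArgumentException("id must be greater than 0");
        }
        this.id = id;
    }

    long getId() {
        return id;
    }

    // Identity depends on the id only, so subclasses with extra fields stay interchangeable.
    @Override
    public boolean equals(Object other) {
        return other instanceof SketchContent && ((SketchContent) other).id == id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }
}

// Hypothetical subclass that carries extra information about the content.
class ImageContent extends SketchContent {
    final String url;

    ImageContent(long id, String url) {
        super(id);
        this.url = url;
    }
}
```

Under these assumptions, two `ImageContent` objects with the same id but different URLs compare as equal and occupy a single slot in a `HashSet`.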

Annotator
This class represents the user that annotates a content. It has an id that is used to identify it. This id must be greater than 0 (0 means no Annotator and is accessible via Annotator.NONE).
The id identifies the Annotator, so two Annotators with the same id are treated as the same Annotator.
If you need to add information, you can implement a class that inherits from it and add the fields you need.

Annotation (Abstract)
This class represents the annotation given by an Annotator to a Content. It is parameterized on both Content and Annotator, so you can use information stored in custom Content and Annotator implementations. If you do not need that, you can use BaseAnnotation.
This is an abstract class, so you need to implement your own Annotation. It has to contain the data needed by both the Aggregator and the CoherenceEstimator.

AggregationManager
This is the heart of the system: it is the class responsible for managing the algorithm.
It is parameterized on both Annotation and Content (the Content type of the Annotation must be the same as the Content type of the manager).
If you are not going to create a custom Content, you can use the BaseAggreationManager, which is parameterized only on the Annotation, which must be a class derived from BaseAnnotation.
It is a Map<Annotation, Double>, so it lets you insert both the annotations and a starting value for the weight associated with each Annotation (set the weight to 1 to use the default initialization).
It is designed as an asynchronous object, so once the start() method is called it may return immediately. Completion is signaled through the Listener callback.
Whether the object is asynchronous depends on the Aggregator and the CoherenceEstimator: if at least one of them is asynchronous, the object is asynchronous.

Aggregator
This is the class responsible for aggregating a group of annotations, taking into account the weight given to each annotation.
It is parameterized on both the Annotation and the Content. If you are not going to use a custom Content, you can use BaseAggregator, in which case the Annotation must be a subclass of BaseAnnotation.
You must define your own Aggregator that implements an algorithm valid for your particular kind of Annotation.
During the process many objects of this type are instantiated, one for each content, so you can be sure that all the annotations passed to it relate to the same content.
You have to implement the method aggregate, which takes as parameter an annotator to skip in the process.
This method can be asynchronous, but it has to call the method postAggregate at the end of the process. The Annotation passed to postAggregate must be related to the same Annotator passed as the skip parameter.
You can access the current annotations because the object is a Collection of Annotation.
You can access the current weights using the method getWeights.
If for performance reasons you want to pre-compute some values at the beginning of the process, you can do that by overriding the method initializingAggregation. It can be asynchronous; you must invoke postInitializingAggregation at the end of the initialization.
If you have overridden the method initializingAggregation and you want to clean up temporary properties at the end, you can override the method endingAggregation. It can be asynchronous; you must invoke postEndingAggregation at the end of the teardown.

LinearAggregator
If your aggregation algorithm is linear, in the sense that (A+B)+C = A+(B+C), you can use the LinearAggregator (or the BaseLinearAggregator) instead of the standard Aggregator to obtain better performance. You must implement the methods sumAllAnnotations and subtractAnnotation.
sumAllAnnotations has to sum all the annotations and report the total aggregated annotation. It can be asynchronous; you must invoke postSubtractedAnnotation at the end of the process.
subtractAnnotation has to subtract a given annotation from the total one using the given weight. It can be asynchronous; you must invoke postSubtractAnnotation at the end of the process.
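
The performance gain comes from computing the total once and then removing one annotation at a time, instead of re-aggregating all the other annotations for every skipped annotator. A minimal sketch for a weighted mean of numeric annotations follows; the class `LinearLeaveOneOutSketch` and the signatures of `sumAll` and `subtract` are assumptions made for this example and do not reproduce the library's `sumAllAnnotations` and `subtractAnnotation`.

```java
// Hypothetical sketch; class and method names are not the library's.
class LinearLeaveOneOutSketch {

    /** Returns {sum of weight * value, sum of weights} over all annotations of one content. */
    static double[] sumAll(double[] values, double[] weights) {
        double weightedSum = 0;
        double weightSum = 0;
        for (int i = 0; i < values.length; i++) {
            weightedSum += weights[i] * values[i];
            weightSum += weights[i];
        }
        return new double[]{weightedSum, weightSum};
    }

    /** Weighted mean of all annotations except the given one, derived from the total. */
    static double subtract(double[] total, double value, double weight) {
        return (total[0] - weight * value) / (total[1] - weight);
    }
}
```

For n annotations on a content, this computes all n leave-one-out aggregates in O(n) operations, where re-aggregating from scratch for each skipped annotator would take O(n^2).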

CoherenceEstimator
This is the class responsible for computing the coherence among the aggregated annotations.
It is parameterized on both the Annotation and the Content.
You must define your own CoherenceEstimator that implements an algorithm valid for your particular kind of Annotation.
During the process many objects of this type are instantiated, one for each annotator, so you can be sure that all the annotations passed to it relate to the same Annotator.
You have to implement the method estimate, which takes as parameter a content to skip in the process.
This method can be asynchronous, but it has to call the method postEstimate at the end of the process. You have to pass to this method the content that you have skipped and the weight estimate as a Double.
If for performance reasons you want to pre-compute some values at the beginning of the process, you can do that by overriding the method initializingEstimation. It can be asynchronous; you must invoke postInitializingEstimation at the end of the initialization.
If you have overridden the method initializingEstimation and you want to clean up temporary properties at the end, you can override the method endingEstimation. It can be asynchronous; you must invoke postEndingEstimation at the end of the teardown.

LinearCoherenceEstimator
If your estimation algorithm is linear, in the sense that (A+B)+C = A+(B+C), you can use the LinearCoherenceEstimator instead of the standard CoherenceEstimator to obtain better performance. You must implement the method comparePair, which has to compare annotations in pairs. It can be asynchronous; you must invoke postCompairPair at the end of the process.
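
As an illustration of pairwise comparison, the sketch below scores boolean (annotation, estimate) pairs of one annotator and derives, for each content, a weight from the pairs on all the other contents. It is written under assumptions: the class `PairCoherenceSketch`, the signature given here to `comparePair`, and the choice of mean agreement as the weight are all hypothetical and not taken from the library.

```java
// Hypothetical sketch; the signatures below are assumptions, not the library's API.
class PairCoherenceSketch {

    /** Score of one (annotation, estimate) pair: 1 when they agree, 0 otherwise. */
    static double comparePair(boolean annotation, boolean estimate) {
        return annotation == estimate ? 1.0 : 0.0;
    }

    /**
     * For one annotator, returns one weight per content: the mean agreement
     * over all the other contents. Requires at least 2 contents.
     */
    static double[] weights(boolean[] annotations, boolean[] estimates) {
        int n = annotations.length;
        double[] scores = new double[n];
        double total = 0;
        for (int i = 0; i < n; i++) {
            scores[i] = comparePair(annotations[i], estimates[i]);
            total += scores[i]; // each pair is compared only once
        }
        double[] result = new double[n];
        for (int i = 0; i < n; i++) {
            // Skip content i by removing its score from the total.
            result[i] = (total - scores[i]) / (n - 1);
        }
        return result;
    }
}
```

Because each pair is scored once and then subtracted from the running total, the cost per annotator grows linearly with the number of contents.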

AggregatorFactory
Implement this interface to let the AggregationManager build new Aggregators and new CoherenceEstimators without needing to know their concrete classes.

Example

For a basic example (that is not asynchronous), see the classes in it.polimi.crowdannotationaggregator.examples.bool and it.polimi.crowdannotationaggregator.junit.bool.