Templates

Oana Inel edited this page Aug 9, 2016 · 16 revisions

One feature of our framework is that users can create and upload their own question templates. The implementation will probably proceed in three stages:

  1. Upload templates specifically for each platform
  2. Create and edit a JSON template online, which will be transformed to the platform specific format
  3. An online template builder

Right now, we’re in the first stage. In this document, we’ll describe how to create templates for the two platforms that are included in the standard version of our framework: CrowdFlower and Amazon Mechanical Turk.

Currently, the following templates are provided in the platform:

* Task used in the Semantic Web Journal submission - Dumitrache, A., Inel, O., Timmermans, B., Ortiz, C., Sips, R.-J., Aroyo, L.: Empirically-derived Methodology for Crowdsourcing Ground Truth

Useful tips and tricks for creating templates can be found in our Code Snippets.


Current implementation

CrowdFlower

This platform uses its own format, called CML. Please refer to the CrowdFlower documentation or their online question builder to see what it looks like. Parameters that have to be replaced by, for instance, terms in a relex-structured-sentence (one of the text formats we use for IBM’s Watson) have to be written in this format: {{terms_first_text}}, where each underscore implies a deeper level in the array that’s in the sentence’s ‘content’ field. References work the same way for other formats. CSS and JavaScript have to be uploaded under the same name and will be included automatically.
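To illustrate the placeholder convention, here is a minimal Python sketch of how a parameter such as {{terms_first_text}} could be resolved against a unit's 'content' field. This is not the framework's own code: the function names, the sample content and the sample question are all assumptions made for illustration, and the sketch assumes that no key in the content contains an underscore itself.

```python
import re

# Matches placeholders such as {{terms_first_text}}.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def resolve(content, path):
    """Follow an underscore-separated path into nested dictionaries."""
    value = content
    for key in path.split("_"):
        value = value[key]  # each underscore goes one level deeper
    return value


def fill_template(template, content):
    """Replace every {{...}} placeholder with the value found in content."""
    return PLACEHOLDER.sub(lambda m: str(resolve(content, m.group(1))), template)


# Hypothetical 'content' field of a sentence and a hypothetical question.
content = {"terms": {"first": {"text": "aspirin"}, "second": {"text": "headache"}}}
template = "Does {{terms_first_text}} treat {{terms_second_text}}?"
filled = fill_template(template, content)
print(filled)
```

A path that does not exist in the content raises a KeyError here; a real implementation might prefer to leave the placeholder untouched or report the missing field.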

Amazon Mechanical Turk

Mechanical Turk uses HTML for its questions, but some special rules apply. For an HTML template to work correctly with our framework, only the HTML inside the form should be included, so you can leave out the <head>, <body> and <form> tags and start directly with the <input> fields. CSS and JavaScript have to be included in <style> and <script> tags. References to external CSS and JS are allowed, but only if the asset is hosted on a server that supports SSL (https://). The format of the parameters is the same as with CrowdFlower (see above). Every <input> name has to be: {uid}_fieldname. Multiple questions on a single page are also supported; please check out our RelDir template for an example. Since this will be handled by the framework automatically in the future, we won’t go into this here.
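The rules above can be checked mechanically before uploading a template. The following Python sketch uses the standard library's html.parser to flag wrapper tags, <input> names without the {uid}_ prefix, and external assets that are not served over https. It is not part of the framework: the class and function names are hypothetical, the example.org URLs are placeholders, and it assumes that {uid} appears literally in the template and that every src or href should be an absolute https:// URL.

```python
from html.parser import HTMLParser

# Tags the framework is expected to supply itself (assumption based on the rules above).
WRAPPER_TAGS = {"html", "head", "body", "form"}


class TemplateChecker(HTMLParser):
    """Collects rule violations found in a Mechanical Turk template fragment."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in WRAPPER_TAGS:
            self.problems.append(f"<{tag}> should be left out")
        if tag == "input" and not (attrs.get("name") or "").startswith("{uid}_"):
            self.problems.append(f"input name {attrs.get('name')!r} lacks the {{uid}}_ prefix")
        # External scripts use src, external stylesheets use href.
        url = attrs.get("src") if tag == "script" else attrs.get("href") if tag == "link" else None
        if url and not url.startswith("https://"):
            self.problems.append(f"external asset {url!r} is not served over https")


def check_template(fragment):
    """Return a list of human-readable problems; an empty list means no rule was violated."""
    checker = TemplateChecker()
    checker.feed(fragment)
    checker.close()
    return checker.problems


good = '<input type="radio" name="{uid}_direction" value="Choice1"><script src="https://example.org/a.js"></script>'
bad = '<form><input name="direction"><script src="http://example.org/a.js"></script></form>'
print(check_template(good))
print(check_template(bad))
```

The checker only reports problems and never rewrites the template, so it can be run safely on any fragment.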

Vectors

Right now, creating custom rules for how annotation vectors are generated has to be done in the source code (in the Workerunit class).

Future implementation

Vectors will normally be generated based on any multiple choice elements in the template. All the possible options are included in the vector. For a single annotation, the value of a field will be 1 if the worker selected that option, and 0 if they didn’t. The aggregated values form the vector of a unit. For example (Relation Direction):

"entity/text/medical/relex-structured-sentence/2425" : {
  "Choice1" : 0,
  "Choice2" : 7,
  "Choice3" : 5
}
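The aggregation described above can be sketched in a few lines of Python. This is an illustration only, not the framework's implementation (which lives in the source code): the function names and the sample judgments are made up, and the sketch assumes that "aggregated" means summing the 0/1 annotation vectors of all workers for a unit.

```python
from collections import Counter


def annotation_vector(options, selected):
    """One worker's annotation: 1 for each selected option, 0 for the rest."""
    return {option: int(option in selected) for option in options}


def unit_vector(options, judgments):
    """Sum the annotation vectors of all workers into the vector of a unit."""
    total = Counter({option: 0 for option in options})  # keep options nobody chose
    for selected in judgments:
        total.update(annotation_vector(options, selected))
    return dict(total)


# Hypothetical data: three workers judging the same unit.
options = ["Choice1", "Choice2", "Choice3"]
judgments = [{"Choice2"}, {"Choice2", "Choice3"}, {"Choice3"}]
vector = unit_vector(options, judgments)
print(vector)
```

Options that no worker selected stay in the vector with the value 0, which matches the rule that all possible options are included.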

For some tasks, you’d want to have special rules that don’t correspond one-to-one to the QuestionTemplate. An example of this is our Factor Span task, which generates vectors like these:

"term1" : {
  "[WORD_-3]" : 0,
  "[WORD_-2]" : 1, 
  "[WORD_-1]" : 3,
  "[WORD_+1]" : 0, 
  "[WORD_+2]" : 0, 
  "[WORD_+3]" : 0, 
  "[WORD_OTHER]" : 0, 
  "[NIL]" : 3, 
  "[CHECK_FAILED]" : 2 
}

These vectors are based on which words the worker selected in a sentence (we ask them whether a specific term is complete). We are still discussing ways to make it possible for the user to create custom vector rules like this one.
