Getting Started

Mortimerp9 edited this page Aug 17, 2012 · 2 revisions

Setup and Installation

Please see the README file for instructions on installing and configuring Clockwork Raven.


Depending on how Clockwork Raven is set up, you may need to use your LDAP credentials, or use credentials given to you by your Clockwork Raven administrator.

Front Page

This is the front page. It lists the most recent evaluations.

[Screenshot: List of Evaluations]

Creating an Evaluation

Click the button at the top to create a new evaluation. If your evaluation is similar to an existing one, use the "Copy" button to create a copy of that evaluation (you'll still upload new data and give it a new title). On the new evaluation page, you can set the options for your evaluation.

[Screenshot: New Evaluation Page]

  • Name: This is the name used internally in the Clockwork Raven system.
  • Production?: If this box is checked, this evaluation will be submitted to the real Mechanical Turk server, get real responses, and cost real money. If it's unchecked, it will be submitted to the sandbox server, where you can preview it and submit responses to it yourself at no cost.
  • Title: This is the name shown to Mechanical Turk judges.
  • Note: This is a note shown in the Clockwork Raven system. It should give a technical description of what the evaluation is for.
  • Data: Choose a data file. This should be a TSV or CSV file where the first row contains the column headers and each remaining row represents a task you'd like done. For example, if you'd like to have 1000 tweets categorized, your CSV would have 1001 rows: the first row would be the header "Tweet ID" (or whatever you want to label it), and the remaining rows would be tweet IDs. Alternatively, you can use a JSON file. The JSON file should be an array of objects, where each object has exactly the same keys. Each object represents one task.
  • Description: This is the description Mechanical Turk users will see when they search for the task.
  • Keywords: Comma-separated list of keywords to help users find the task.
  • Qualification should be left as "Trusted" unless you're willing to sacrifice quality in exchange for cheaper, faster results.
    • None: Any user can complete the task. This yields low-quality results, but gets results fast with a low payment.
    • Masters: Any user who Mechanical Turk has marked as a categorization expert. This is an intermediate between allowing all users and only allowing trusted users.
    • Trusted: Only users who have been flagged as trusted in the Clockwork Raven system can complete this task. This yields high-quality results, but results come in slower and these users command a higher payment. You'll also have to add some users as trusted (we'll see how to do this later) before this option is useful.
  • Duration, Lifetime, and Auto-approve can usually be left as-is.
    • Duration: The amount of time, in seconds, a user has to complete the task. Give plenty of time -- an hour or more.
    • Lifetime: The amount of time, in seconds, this evaluation will be active on Mechanical Turk.
    • Auto-approve: The amount of time, in seconds, after a response is submitted that it will automatically be approved if you haven't explicitly rejected it. This is visible to workers, and many workers will not work on tasks where this is set to more than a day.
  • Payment: The amount paid per task. For trusted workers, a reasonable payment is $0.25-$0.35 for tasks without free-response questions and about $0.50 for tasks with free-response questions.
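As a rough illustration of the data file formats described under "Data" above, the following Python sketch writes the same three tasks once as a CSV file and once as a JSON file. The column name "Tweet ID", the file names, the tweet IDs, and the helper functions `write_csv` and `write_json` are all made up for this example; only the overall shape (a header row followed by one row per task, or an array of objects with identical keys) comes from the description above.

```python
import csv
import json
import os
import tempfile

# Hypothetical task data: one dict per task, all with the same keys.
tasks = [
    {"Tweet ID": "1001"},
    {"Tweet ID": "1002"},
    {"Tweet ID": "1003"},
]

def write_csv(path, rows):
    """Write a header row followed by one row per task."""
    fieldnames = list(rows[0].keys())
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

def write_json(path, rows):
    """Write an array of objects; every object must have the same keys."""
    keys = set(rows[0].keys())
    for row in rows:
        if set(row.keys()) != keys:
            raise ValueError("every task must have exactly the same keys")
    with open(path, "w") as f:
        json.dump(rows, f, indent=2)

# Write both variants into a temporary directory.
out_dir = tempfile.mkdtemp()
csv_path = os.path.join(out_dir, "tasks.csv")
json_path = os.path.join(out_dir, "tasks.json")
write_csv(csv_path, tasks)
write_json(json_path, tasks)
```

Either file could then be chosen in the "Data" field. Checking that all objects share the same keys before writing the JSON file is a convenience of this sketch, not something the upload form is known to do for you.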

Template Designer

After you've set up your evaluation, you'll be taken to the template designer:

[Screenshot: Template Builder]

Use the "Add Item" button at the bottom to add items to your template. There are five kinds of items you can add:

  • Header: This is a textual header displayed in large text.
  • Text: This is an arbitrary snippet of text or HTML. You can use the "Insert Reference" button to interpolate a field from your data (or just use the syntax {{field name}}). For example, if your data was

foo     bar     baz
hello   world   stuff
thing   thang   thung

you could enter "foo is: {{foo}}. bar is: {{bar}}. baz is: {{baz}}." into a text item in your template, and the {{variables}} would be filled in with the values of the current task -- so for the first task in this data, the text section would read "foo is: hello. bar is: world. baz is: stuff."

  • Template: This lets you select one of Clockwork Raven's pre-built templates. Each template takes parameters. You can choose which column of your data should be used to provide that parameter, or give a literal value. In the screenshot above, the template has an embedded tweet. The ID of this tweet is taken from the "Control Tweet ID" column in the data.
  • Multiple-choice question: This is a multiple-choice question asked of the workers who complete your task. If you provide numeric values for your options, Clockwork Raven will automatically calculate the average value across your responses. In the example above we associate "vanilla" with a score of 1 and "chocolate" with a score of -1, so the average score will tell us whether the judges prefer vanilla or chocolate overall.
  • Free-response question: This lets the worker enter any free-form text in a text box. If the "required" checkbox is checked, workers will not be able to submit their responses without entering something in the box. Note that evaluations with free-response questions take longer for workers, so you'll need to pay more.
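To make the {{field name}} interpolation used by text items more concrete, here is a minimal Python sketch of the idea. It is not Clockwork Raven's own implementation: the `render` function, the regular expression, and the choice to strip whitespace inside the braces are assumptions made for illustration.

```python
import re

# Matches {{field name}}, allowing spaces inside the field name.
FIELD_PATTERN = re.compile(r"\{\{(.*?)\}\}")

def render(template, task):
    """Replace each {{field}} with the task's value for that field."""
    def substitute(match):
        field = match.group(1).strip()
        return str(task[field])  # raises KeyError for unknown fields
    return FIELD_PATTERN.sub(substitute, template)

template = "foo is: {{foo}}. bar is: {{bar}}. baz is: {{baz}}."
first_task = {"foo": "hello", "bar": "world", "baz": "stuff"}
print(render(template, first_task))
# prints: foo is: hello. bar is: world. baz is: stuff.
```

Each task (each row of the data file) is rendered with the same template, so only the substituted values change from one task to the next.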

On this page, you also select which columns of your data are "metadata." Metadata items aren't shown to Mechanical Turk judges, but you can use them on the result analysis page to filter or segment results, just like multiple-choice questions.

Reviewing Evaluations

After creating a task, you can review it:

[Screenshot: Evaluation Review]

and use the "preview random task" link to see a sample task:

[Screenshot: Sample Task]

Submitting Evaluations

When you're satisfied, use the "Submit to MTurk" button to submit the task to Mechanical Turk. You'll be asked to confirm the total cost of the evaluation, and then the task will be sent off. This process can take several minutes for long tasks.
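Before confirming, it can help to estimate the total yourself. The sketch below is a back-of-the-envelope calculation only: `estimate_cost` is a hypothetical helper, and `fee_rate` is a placeholder for whatever commission Mechanical Turk charges on top of worker payments (it defaults to zero here because the actual rate is not stated in this guide). The figure Clockwork Raven asks you to confirm may be computed differently.

```python
from decimal import Decimal

def estimate_cost(num_tasks, payment_per_task, fee_rate=Decimal("0")):
    """Estimate total cost: tasks * payment, plus an optional fee fraction.

    fee_rate is a placeholder assumption (e.g. Decimal("0.1") for 10%).
    """
    base = Decimal(num_tasks) * Decimal(payment_per_task)
    return base * (Decimal("1") + Decimal(fee_rate))

# 1000 tasks at $0.25 each, ignoring any fees.
print(estimate_cost(1000, "0.25"))
```

Using `Decimal` rather than floating-point numbers avoids rounding surprises when working with currency amounts.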

When it's done, press Continue to return to the evaluation page. You'll now see a counter of the number of submitted responses at the bottom of the page (you'll need to refresh the page for the number to update).

Closing Evaluations and Importing Results

When you're satisfied with the number of results, use the "Close" button to close the evaluation on Mechanical Turk and import the results into Clockwork Raven.

Viewing Results

You'll then get a "View Results" button:

[Screenshot: Submitted Task]

which will take you to the results page:

[Screenshot: Results Page]

where you can control a chart of responses and view individual responses.

The top half of the page is the chart interface. Use the "Chart" options to select what to chart, and the "Segment By" options to break each bar down into sub-sections.

The "Display" option sets the height of the bars. By default, the height of each bar is the number of responses in that segment. You can also normalize the heights (useful if you're using the "Segment By" option to segment the bars), or display the average value of a multiple-choice question across the responses represented by the bar.
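The difference between raw counts and normalized heights can be sketched in a few lines of Python. The response data, the field names "flavor" and "country", and both helper functions are invented for this example, and the sketch assumes that normalizing means scaling every bar so its segments sum to 1, which may not match Clockwork Raven's chart exactly.

```python
from collections import Counter, defaultdict

# Hypothetical responses: the charted answer plus a metadata segment.
responses = [
    {"flavor": "vanilla", "country": "US"},
    {"flavor": "vanilla", "country": "UK"},
    {"flavor": "chocolate", "country": "US"},
    {"flavor": "vanilla", "country": "US"},
]

def segment_counts(rows, chart_by, segment_by):
    """Count responses per bar (chart_by), split by segment (segment_by)."""
    bars = defaultdict(Counter)
    for row in rows:
        bars[row[chart_by]][row[segment_by]] += 1
    return bars

def normalize(bars):
    """Scale each bar so that its segments sum to 1."""
    return {
        bar: {seg: n / sum(segs.values()) for seg, n in segs.items()}
        for bar, segs in bars.items()
    }

counts = segment_counts(responses, "flavor", "country")
shares = normalize(counts)
```

With these four responses, the "vanilla" bar holds two US responses and one UK response, so after normalizing its segments become two thirds and one third, which makes bars of very different sizes easier to compare.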

If the multiple-choice question you're charting has values associated with its options, you can view the average value below the chart. You can also filter results using the "Filter By" options.
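The average value is simply the mean of the numeric values attached to the chosen options. A minimal Python sketch, reusing the vanilla/chocolate scoring from the template designer section, looks like this. The `average_score` helper and the sample answers are hypothetical.

```python
# Hypothetical option values, following the vanilla/chocolate example.
option_values = {"vanilla": 1, "chocolate": -1}

def average_score(answers, values):
    """Average the numeric values of the chosen options."""
    if not answers:
        raise ValueError("no responses to average")
    return sum(values[a] for a in answers) / len(answers)

answers = ["vanilla", "vanilla", "chocolate", "vanilla"]
print(average_score(answers, option_values))  # prints 0.5
```

A positive average here means judges leaned toward vanilla, a negative one toward chocolate, and a value near zero means the responses were roughly split.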

The bottom half of the page is the response review interface. You can view, sort, and search the responses. If the table is too wide for your browser window, use the "Show/Hide columns" options to the right of the search bar to hide some columns. The left-most cell of each row contains the actions for that response. You can approve and reject responses. Note that rejecting a response only withholds payment within the time set by "Auto-approve" when you created the evaluation. If you reject a response within this time limit, the worker will not be paid for the response. If you reject the response after the "Auto-approve" time has elapsed, the worker will still be paid, but the rejected result will not show up in the chart at the top of the page. You can also ban workers, which will prevent them from responding to any Clockwork Raven evaluations in the future, and you can trust workers, which will allow them to work on evaluations that are restricted to trusted workers only.
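The payment rule for rejections can be summarized as a small Python function. This is only a sketch of the rule as described above: the `worker_is_paid` helper and its parameters are hypothetical, and treating a rejection made exactly at the deadline as too late is an assumption.

```python
from datetime import datetime, timedelta

def worker_is_paid(submitted_at, rejected_at, auto_approve_seconds):
    """Return True if the worker is still paid for this response.

    rejected_at is None when the response was never rejected.
    """
    if rejected_at is None:
        return True  # never rejected, so the response is paid
    deadline = submitted_at + timedelta(seconds=auto_approve_seconds)
    # Assumption: a rejection at or after the deadline comes too late.
    return rejected_at >= deadline

submitted = datetime(2012, 8, 17, 12, 0, 0)
one_day = 24 * 60 * 60  # auto-approve window in seconds

# Rejected after 2 hours: inside the window, so the worker is not paid.
print(worker_is_paid(submitted, submitted + timedelta(hours=2), one_day))
# Rejected after 2 days: outside the window, so the worker is still paid.
print(worker_is_paid(submitted, submitted + timedelta(days=2), one_day))
```

In practice this means responses should be reviewed before the auto-approve window closes if you want rejections to affect payment.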