![Minions](minions.png)

# WHY???

- Label 50,000 rows of data
- Transcribe scans from my grandfather's journal
- Copy text from business cards
- Draw 10,000 sheep ![sheep](sheep.png)
- Translate SMS to English ("CANT 2DAY 8 2 MUCH CU L8R")
- Solve CAPTCHAs
- Get summaries or addresses from web pages
- Make human-machine hybrid centaurs!
- Get complimented on your new haircut
- ...

# Prerequisites

1. `pip install boto`
2. Use your Amazon / AWS account to log in at mturk.com
3. Get the **root** access keys for your account.
   1. Go to console.aws.amazon.com
   2. Click on your name, choose _Security Credentials_
   3. Ignore subtle hints that you should set up IAM Users
   4. Create a new Access Key and Secret
   5. Store them in a file called `.aws`:
      ```
      [default]
      aws_access_key_id = AKIA...
      aws_secret_access_key = ...
      ```
4. Optionally, fund your account.

In [12]:
import ConfigParser as configparser  # Python 3: import configparser

# Let's quickly get our credentials:
parser = configparser.SafeConfigParser()
parser.read(".aws")
access_key = parser.get("default", "aws_access_key_id")
secret_key = parser.get("default", "aws_secret_access_key")
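
One caveat with the cell above: `parser.read()` silently skips files it cannot find, so a missing `.aws` file only surfaces later as a confusing `NoSectionError`. A minimal sketch of a stricter loader follows. The helper name `load_credentials` is made up for this example and is not part of boto or the original notebook.

```python
import os

try:
    import configparser                   # Python 3
except ImportError:
    import ConfigParser as configparser   # Python 2


def load_credentials(path=".aws", profile="default"):
    """Return (access_key, secret_key) or fail loudly.

    Hypothetical helper: same file layout as above, but a missing
    file raises immediately instead of being ignored.
    """
    if not os.path.isfile(path):
        raise IOError("credentials file not found: {}".format(path))
    parser = configparser.RawConfigParser()
    parser.read(path)
    return (
        parser.get(profile, "aws_access_key_id"),
        parser.get(profile, "aws_secret_access_key"),
    )
```

With this in place, `access_key, secret_key = load_credentials()` would replace the four lines above.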

In [13]:
from boto.mturk.connection import MTurkConnection

MTURK_HOST = 'mechanicalturk.amazonaws.com'
SANDBOX_HOST = 'mechanicalturk.sandbox.amazonaws.com'


mturk = MTurkConnection(
    aws_access_key_id=access_key,
    aws_secret_access_key=secret_key,
    host=SANDBOX_HOST
)

How to piece things together:

![structure](structure.png)

![Matryoshka](matryoshka.jpeg)

In [101]:
from boto.mturk.question import (
    QuestionForm, Question, QuestionContent, FormattedContent,
    AnswerSpecification, SelectionAnswer
)

In [40]:
# Let's do things bottom-up!

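# Each selection is a (text shown to the worker, identifier returned) pair.
choices = (
    ("An Animal", 'animal'),
    ("Something Technical (or a book about technical things)", 'tech'),
    ("Something entirely different", 'other')
)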

answers = SelectionAnswer(
    style="radiobutton",
    selections=choices,
    type="text",
    other=False
)

answer_spec = AnswerSpecification(answers)

In [126]:
content = QuestionContent()
content.append_field("Title", "What do you see in this image?")

In [127]:
content.append(FormattedContent("<img src='...' />"))

In [128]:
question = Question(identifier=1, content=content, answer_spec=answer_spec)
form = QuestionForm([question])
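
To make the nesting concrete, here is a rough sketch of the XML these objects wrap, built with the standard library only. It is an illustration, not boto's output: the element names are assumed from the MTurk QuestionForm schema, the namespace and the `FormattedContent` part are left out, and `sketch_question_form` is a made-up name.

```python
import xml.etree.ElementTree as ET


def sketch_question_form(identifier, title, choices, style="radiobutton"):
    """Build a simplified QuestionForm tree (assumed element names)."""
    form = ET.Element("QuestionForm")
    question = ET.SubElement(form, "Question")
    ET.SubElement(question, "QuestionIdentifier").text = str(identifier)

    # QuestionContent: what the worker sees.
    content = ET.SubElement(question, "QuestionContent")
    ET.SubElement(content, "Title").text = title

    # AnswerSpecification: how the worker may answer.
    spec = ET.SubElement(question, "AnswerSpecification")
    answer = ET.SubElement(spec, "SelectionAnswer")
    ET.SubElement(answer, "StyleSuggestion").text = style
    selections = ET.SubElement(answer, "Selections")
    for text, value in choices:
        selection = ET.SubElement(selections, "Selection")
        ET.SubElement(selection, "SelectionIdentifier").text = value
        ET.SubElement(selection, "Text").text = text
    return form


form_xml = sketch_question_form(
    1,
    "What do you see in this image?",
    [("An Animal", "animal"), ("Something entirely different", "other")],
)
print(ET.tostring(form_xml).decode("utf-8"))
```

Reading the printed XML from the outside in gives the same order as the matryoshka: form, question, then content and answer specification side by side.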

## We need to put our images somewhere.

In [20]:
from boto.s3.connection import S3Connection

s3 = S3Connection(access_key, secret_key)
bucket = s3.create_bucket("pythonimages")

On AWS, configure the bucket policy to allow public access:

![Policy](policy.png)

```
{
	"Version": "2008-10-17",
	"Statement": [
		{
			"Sid": "AllowPublicRead",
			"Effect": "Allow",
			"Principal": {
				"AWS": "*"
			},
			"Action": "s3:GetObject",
			"Resource": "arn:aws:s3:::pythonimages/*"
		}
	]
}
```
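
If you use more than one bucket, the policy above can be generated instead of hand-edited. A small sketch, assuming a made-up helper name `public_read_policy`; the policy fields are copied from the JSON above and only the bucket name varies.

```python
import json


def public_read_policy(bucket_name):
    """Return the public-read policy from above as a JSON string."""
    policy = {
        "Version": "2008-10-17",
        "Statement": [
            {
                "Sid": "AllowPublicRead",
                "Effect": "Allow",
                "Principal": {"AWS": "*"},
                "Action": "s3:GetObject",
                # Only objects inside this bucket become readable.
                "Resource": "arn:aws:s3:::{}/*".format(bucket_name),
            }
        ],
    }
    return json.dumps(policy, indent=2)


print(public_read_policy("pythonimages"))
```

The printed string can be pasted into the policy editor shown in the screenshot. boto's `Bucket.set_policy` should accept it as well, but the original notebook sets the policy by hand, so treat that route as untested here.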

In [50]:
k = bucket.new_key("myfile")
k.set_contents_from_filename("img/1.jpeg")
print(k.generate_url(expires_in=0, query_auth=False, force_http=True))

http://pythonimages.s3.amazonaws.com/myfile
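
The printed URL follows a fixed pattern (bucket name as subdomain, key as path), so the URLs for a batch of images can be worked out without asking S3. A minimal sketch, assuming simple key names that need no URL escaping; `public_url` and the `N.jpeg` key names are made up for illustration.

```python
def public_url(bucket_name, key_name):
    """Plain-HTTP URL of a publicly readable S3 object.

    Assumes key_name contains no characters that need URL quoting.
    """
    return "http://{}.s3.amazonaws.com/{}".format(bucket_name, key_name)


# Hypothetical keys "1.jpeg" ... "8.jpeg", one per image.
urls = {n: public_url("pythonimages", "{}.jpeg".format(n)) for n in range(1, 9)}
print(urls[1])
```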


## Glueing things together

In [99]:
def create_question(number):
    # Per-image upload, disabled here so every HIT reuses the image uploaded above:
    # k = bucket.new_key(number)
    # k.set_contents_from_filename("img/{}.jpeg".format(number))
    # url = k.generate_url(expires_in=0, query_auth=False, force_http=True)
    url = 'http://pythonimages.s3.amazonaws.com/myfile'

    choices = (
        ("An Animal", 'animal'),
        ("Something Technical (or a book about technical things)", 'tech'),
        ("Something entirely different", 'other')
    )

    answers = SelectionAnswer(
        style="radiobutton",
        selections=choices,
        type="text",
        other=False
    )

    answer_spec = AnswerSpecification(answers)    
    
    content = QuestionContent()
    content.append_field("Title", "What do you see in this image?")
    content.append(FormattedContent("<img src='{}' alt='python-related image' />".format(url)))

    question = Question(identifier=1, content=content, answer_spec=answer_spec)

    form = QuestionForm([question])

    return form


In [100]:
mturk.create_hit(
    title="Categorise this image",
    question=create_question(1),
    duration=3600,
    reward=0.02
)

# And now, we wait.

## Wait, we're still in sandbox mode.

Fuck sandbox mode. Let's spend money.

In [111]:
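mturk_for_real = MTurkConnection(
    aws_access_key_id=access_key,
    aws_secret_access_key=secret_key,
    host=MTURK_HOST
)

# Rebind `mturk` so every call from here on goes to the production host,
# not the sandbox connection created earlier.
mturk = mturk_for_real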

submitted_hits = {}

for image in range(1, 9):
    result = mturk.create_hit(
        title="Categorise this image",
        question=create_question(image),
        duration=300,
        reward=0.02
    )
    
    submitted_hits[result[0].HITId] = image
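
Before spending money it helps to know roughly how much. A back-of-the-envelope sketch, assuming one assignment per HIT; `estimate_cost` is a made-up helper, and `fee_rate` is a placeholder because Amazon's commission is not stated in this notebook.

```python
from decimal import Decimal


def estimate_cost(num_hits, reward, assignments_per_hit=1, fee_rate="0"):
    """Total worker rewards plus a proportional fee.

    fee_rate is a placeholder (0 means "rewards only"); look up the
    current MTurk commission before relying on the total.
    """
    rewards = Decimal(str(reward)) * num_hits * assignments_per_hit
    return rewards * (1 + Decimal(str(fee_rate)))


# The loop above submits 8 HITs at a reward of 0.02 each.
print(estimate_cost(num_hits=8, reward=0.02))
```

`Decimal` is used instead of floats so that small rewards add up without rounding noise.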

In [118]:
print(submitted_hits)

In [None]:
reviewable_hits = mturk.get_reviewable_hits()
print("Found {} Hits!".format(len(reviewable_hits)))

result = {}

In [122]:
for hit in reviewable_hits:
    if hit.HITId in submitted_hits:
        image = submitted_hits[hit.HITId]
        print("Reviewing HIT {} for image {}".format(hit.HITId, image))
        # A hit may have several assignments!
        for assignment in mturk.get_assignments(hit.HITId):
            # An assignment may have several questions.
            for answer in assignment.answers[0]:
                # A question may have several answers (eg. checkboxes)
                result[image] = answer.fields[0]

            # Make sure your minions get paid: approve every assignment,
            # not just the last one seen.
            mturk.approve_assignment(assignment.AssignmentId)

In [123]:
print(result)

{}
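
The review loop stores one answer per image, so with several assignments per HIT the last worker wins. One common alternative is to collect every answer and take a majority vote. A minimal sketch using made-up sample data; `majority_vote` is a hypothetical helper and the pairs would come from the loop above (for example by appending `(image, answer.fields[0])` to a list).

```python
from collections import Counter, defaultdict


def majority_vote(labelled_answers):
    """Map each image to its most common answer.

    labelled_answers: iterable of (image, answer) pairs, one per
    assignment. Ties go to whichever answer Counter lists first.
    """
    votes = defaultdict(Counter)
    for image, answer in labelled_answers:
        votes[image][answer] += 1
    return {
        image: counts.most_common(1)[0][0]
        for image, counts in votes.items()
    }


# Made-up sample data, not real worker output.
sample = [(1, "animal"), (1, "animal"), (1, "tech"), (2, "other")]
print(majority_vote(sample))
```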
