# Annotating Training Data With MTurk

## Pre-requisites
If you haven't already, you'll need to set up MTurk and AWS accounts that are linked together to use MTurk with Python. The MTurk account will be used to post tasks to the MTurk crowd, and the AWS account will be used to connect to MTurk via the API and to provide access to any additional AWS resources needed to execute your task.

1. If you don't have an AWS account already, visit https://aws.amazon.com and create an account you can use for your project.
2. If you don't have an MTurk Requester account already, visit https://requester.mturk.com and create a new account.

After you've set up your accounts, you will need to link them together. While logged in to both the root of your AWS account and your MTurk account, visit https://requester.mturk.com/developer to link them.

From your AWS console, create a new AWS IAM User or select an existing one you plan to use. Add the AmazonMechanicalTurkFullAccess policy to your user. Then select the Security Credentials tab, create a new Access Key, and copy the Access Key and Secret Access Key for future use.

If you haven't installed the awscli yet, install it with pip (`pip install awscli`) and configure a profile using the access key and secret key above (`aws configure --profile mturk`).

To post tasks to MTurk for Workers to complete you will first need to add funds to your account that will be used to reward Workers. Visit https://requester.mturk.com/account to get started with as little as $1.00.

This notebook also uses boto3 to call the MTurk API and xmltodict to parse Worker responses. Install both as shown below.

In [14]:
!pip install boto3



In [15]:
!pip install xmltodict



## Overview
Amazon Mechanical Turk allows you to post tasks for Workers to complete at https://worker.mturk.com. To post a task to MTurk you create an HTML form that includes the information you want them to provide. In this example we'll be asking Workers to rate the sentiment of Tweets on a scale of 1 (negative) to 10 (positive).

MTurk has a Sandbox environment that can be used for testing. Workers won't see your tasks in the Sandbox, but you can log in at https://workersandbox.mturk.com and complete them yourself to test the task interface. We recommend testing in the Sandbox first to make sure your task returns the data you need before moving to the Production environment. There is no cost to use the Sandbox environment.
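One way to make the Sandbox-first advice harder to skip is to resolve the endpoint through a small helper that defaults to the Sandbox. This is only a minimal sketch: the names `ENVIRONMENTS` and `pick_environment` are hypothetical, and the endpoint URLs are the ones this notebook configures.

```python
# Endpoints as configured in this notebook; the helper itself is illustrative only.
ENVIRONMENTS = {
    "production": "https://mturk-requester.us-east-1.amazonaws.com",
    "sandbox": "https://mturk-requester-sandbox.us-east-1.amazonaws.com",
}


def pick_environment(use_production=False):
    """Return (name, endpoint), defaulting to the free Sandbox."""
    name = "production" if use_production else "sandbox"
    return name, ENVIRONMENTS[name]


name, endpoint = pick_environment()
print(name, endpoint)
```

Because Production must be requested explicitly, a forgotten argument results in Sandbox HITs rather than paid ones.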

In [16]:
import boto3
import xmltodict
import json
import os
from datetime import datetime
import random
import pandas as pd 


In [17]:
create_hits_in_production = False
environments = {
        "production": {
            "endpoint": "https://mturk-requester.us-east-1.amazonaws.com",
            "preview": "https://www.mturk.com/mturk/preview"
        },
        "sandbox": {
            "endpoint": "https://mturk-requester-sandbox.us-east-1.amazonaws.com",
            "preview": "https://workersandbox.mturk.com/mturk/preview"
        },
}
mturk_environment = environments["production"] if create_hits_in_production else environments["sandbox"]

session = boto3.Session(profile_name='mturk')

client = session.client(
    service_name='mturk',
    region_name='us-east-1',
    endpoint_url=mturk_environment['endpoint'],
)

In [18]:
# This will return your current MTurk balance if you are connected to Production.
# If you are connected to the Sandbox it will return $10,000.
print(client.get_account_balance()['AvailableBalance'])

10000.00


## Define your task
For this project we are going to get the sentiment of a set of tweets that we plan to train a model to evaluate. We will create an MTurk Human Intelligence Task (HIT) for each tweet.

## Handle the combination of pics

In [23]:
survey_groups = pd.read_csv('survey_groups.csv')
imagePath = "https://my-image-repo-520.s3.amazonaws.com/uploads"
# Print the row index, column position, image file name, and public URL for every cell.
# Spaces in file names are replaced with "+" to build the URL.
for index, row in survey_groups.iterrows():
    for i, col in enumerate(survey_groups.columns):
        imgName = row[col]
        print(index, i, imgName, "{}/{}".format(imagePath, imgName.strip().replace(" ", "+")))

0 0 Alternet Page - Wallet.png https://my-image-repo-520.s3.amazonaws.com/uploads/Alternet+Page+-+Wallet.png
0 1 Alternet Page - Polar Bear.png https://my-image-repo-520.s3.amazonaws.com/uploads/Alternet+Page+-+Polar+Bear.png
0 2 Book - NRA.png https://my-image-repo-520.s3.amazonaws.com/uploads/Book+-+NRA.png
0 3 Book - Adidas.png https://my-image-repo-520.s3.amazonaws.com/uploads/Book+-+Adidas.png
0 4 Breitbart Page - Fasion.png https://my-image-repo-520.s3.amazonaws.com/uploads/Breitbart+Page+-+Fasion.png
0 5 Breitbart Page - NRA.PNG https://my-image-repo-520.s3.amazonaws.com/uploads/Breitbart+Page+-+NRA.PNG
0 6 Daily Kos Page - Adidas.png https://my-image-repo-520.s3.amazonaws.com/uploads/Daily+Kos+Page+-+Adidas.png
0 7 Daily Kos Page - Impeach.png https://my-image-repo-520.s3.amazonaws.com/uploads/Daily+Kos+Page+-+Impeach.png
0 8 Food - NRA.png https://my-image-repo-520.s3.amazonaws.com/uploads/Food+-+NRA.png
0 9 Food - bp.png https://my-image-repo-520.s3.amazonaws.com/uploads/Food

24 11 NewsMax Page - Polar Bear.png https://my-image-repo-520.s3.amazonaws.com/uploads/NewsMax+Page+-+Polar+Bear.png
24 12 NYT Page - Impeach.png https://my-image-repo-520.s3.amazonaws.com/uploads/NYT+Page+-+Impeach.png
24 13 NYT Page - Wallet.png https://my-image-repo-520.s3.amazonaws.com/uploads/NYT+Page+-+Wallet.png
24 14 NYT Page2 - Impeach.png https://my-image-repo-520.s3.amazonaws.com/uploads/NYT+Page2+-+Impeach.png
24 15 NYT Page2 - Fasion.png https://my-image-repo-520.s3.amazonaws.com/uploads/NYT+Page2+-+Fasion.png
25 0 Alternet Page - bp.png https://my-image-repo-520.s3.amazonaws.com/uploads/Alternet+Page+-+bp.png
25 1 Alternet Page - Liberal Quiz.png https://my-image-repo-520.s3.amazonaws.com/uploads/Alternet+Page+-+Liberal+Quiz.png
25 2 Book - Impeach.png https://my-image-repo-520.s3.amazonaws.com/uploads/Book+-+Impeach.png
25 3 Book - Fasion.png https://my-image-repo-520.s3.amazonaws.com/uploads/Book+-+Fasion.png
25 4 Breitbart Page - Fasion.png https://my-image-repo-520.s3

53 9 Food - Fasion.png https://my-image-repo-520.s3.amazonaws.com/uploads/Food+-+Fasion.png
53 10 NewsMax Page - Adidas.png https://my-image-repo-520.s3.amazonaws.com/uploads/NewsMax+Page+-+Adidas.png
53 11 NewsMax Page - Liberal Quiz.png https://my-image-repo-520.s3.amazonaws.com/uploads/NewsMax+Page+-+Liberal+Quiz.png
53 12 NYT Page - NRA.png https://my-image-repo-520.s3.amazonaws.com/uploads/NYT+Page+-+NRA.png
53 13 NYT Page - Wallet.png https://my-image-repo-520.s3.amazonaws.com/uploads/NYT+Page+-+Wallet.png
53 14 NYT Page2 - Impeach.png https://my-image-repo-520.s3.amazonaws.com/uploads/NYT+Page2+-+Impeach.png
53 15 NYT Page2 - Wallet.png https://my-image-repo-520.s3.amazonaws.com/uploads/NYT+Page2+-+Wallet.png
54 0 Alternet Page - bp.png https://my-image-repo-520.s3.amazonaws.com/uploads/Alternet+Page+-+bp.png
54 1 Alternet Page - NRA.png https://my-image-repo-520.s3.amazonaws.com/uploads/Alternet+Page+-+NRA.png
54 2 Book - Impeach.png https://my-image-repo-520.s3.amazonaws.com/u

79 11 NewsMax Page - Liberal Quiz.png https://my-image-repo-520.s3.amazonaws.com/uploads/NewsMax+Page+-+Liberal+Quiz.png
79 12 NYT Page - Impeach.png https://my-image-repo-520.s3.amazonaws.com/uploads/NYT+Page+-+Impeach.png
79 13 NYT Page - Adidas.png https://my-image-repo-520.s3.amazonaws.com/uploads/NYT+Page+-+Adidas.png
79 14 NYT Page2 - Impeach.png https://my-image-repo-520.s3.amazonaws.com/uploads/NYT+Page2+-+Impeach.png
79 15 NYT Page2 - Fasion.png https://my-image-repo-520.s3.amazonaws.com/uploads/NYT+Page2+-+Fasion.png
80 0 Alternet Page - Fasion.png https://my-image-repo-520.s3.amazonaws.com/uploads/Alternet+Page+-+Fasion.png
80 1 Alternet Page - NRA.png https://my-image-repo-520.s3.amazonaws.com/uploads/Alternet+Page+-+NRA.png
80 2 Book - Polar Bear.png https://my-image-repo-520.s3.amazonaws.com/uploads/Book+-+Polar+Bear.png
80 3 Book - Adidas.png https://my-image-repo-520.s3.amazonaws.com/uploads/Book+-+Adidas.png
80 4 Breitbart Page - Adias.PNG https://my-image-repo-520.s3.

108 4 Breitbart Page - Fasion.png https://my-image-repo-520.s3.amazonaws.com/uploads/Breitbart+Page+-+Fasion.png
108 5 Breitbart Page - Impeach.png https://my-image-repo-520.s3.amazonaws.com/uploads/Breitbart+Page+-+Impeach.png
108 6 Daily Kos Page - Fasion.png https://my-image-repo-520.s3.amazonaws.com/uploads/Daily+Kos+Page+-+Fasion.png
108 7 Daily Kos Page - Polar Bear.png https://my-image-repo-520.s3.amazonaws.com/uploads/Daily+Kos+Page+-+Polar+Bear.png
108 8 Food - Polar Bear.png https://my-image-repo-520.s3.amazonaws.com/uploads/Food+-+Polar+Bear.png
108 9 Food - Fasion.png https://my-image-repo-520.s3.amazonaws.com/uploads/Food+-+Fasion.png
108 10 NewsMax Page - Wallet.png https://my-image-repo-520.s3.amazonaws.com/uploads/NewsMax+Page+-+Wallet.png
108 11 NewsMax Page - Polar Bear.png https://my-image-repo-520.s3.amazonaws.com/uploads/NewsMax+Page+-+Polar+Bear.png
108 12 NYT Page - Polar Bear.png https://my-image-repo-520.s3.amazonaws.com/uploads/NYT+Page+-+Polar+Bear.png
108 13

135 0 Alternet Page - Wallet.png https://my-image-repo-520.s3.amazonaws.com/uploads/Alternet+Page+-+Wallet.png
135 1 Alternet Page - Polar Bear.png https://my-image-repo-520.s3.amazonaws.com/uploads/Alternet+Page+-+Polar+Bear.png
135 2 Book - Liberal Quiz.png https://my-image-repo-520.s3.amazonaws.com/uploads/Book+-+Liberal+Quiz.png
135 3 Book - Adidas.png https://my-image-repo-520.s3.amazonaws.com/uploads/Book+-+Adidas.png
135 4 Breitbart Page - Adias.PNG https://my-image-repo-520.s3.amazonaws.com/uploads/Breitbart+Page+-+Adias.PNG
135 5 Breitbart Page - Impeach.png https://my-image-repo-520.s3.amazonaws.com/uploads/Breitbart+Page+-+Impeach.png
135 6 Daily Kos Page - Fasion.png https://my-image-repo-520.s3.amazonaws.com/uploads/Daily+Kos+Page+-+Fasion.png
135 7 Daily Kos Page - Impeach.png https://my-image-repo-520.s3.amazonaws.com/uploads/Daily+Kos+Page+-+Impeach.png
135 8 Food - Polar Bear.png https://my-image-repo-520.s3.amazonaws.com/uploads/Food+-+Polar+Bear.png
135 9 Food - bp.p

162 14 NYT Page2 - NRA.png https://my-image-repo-520.s3.amazonaws.com/uploads/NYT+Page2+-+NRA.png
162 15 NYT Page2 - Adidas.png https://my-image-repo-520.s3.amazonaws.com/uploads/NYT+Page2+-+Adidas.png
163 0 Alternet Page - bp.png https://my-image-repo-520.s3.amazonaws.com/uploads/Alternet+Page+-+bp.png
163 1 Alternet Page - Polar Bear.png https://my-image-repo-520.s3.amazonaws.com/uploads/Alternet+Page+-+Polar+Bear.png
163 2 Book - NRA.png https://my-image-repo-520.s3.amazonaws.com/uploads/Book+-+NRA.png
163 3 Book - Adidas.png https://my-image-repo-520.s3.amazonaws.com/uploads/Book+-+Adidas.png
163 4 Breitbart Page - Fasion.png https://my-image-repo-520.s3.amazonaws.com/uploads/Breitbart+Page+-+Fasion.png
163 5 Breitbart Page - Wildlife.png https://my-image-repo-520.s3.amazonaws.com/uploads/Breitbart+Page+-+Wildlife.png
163 6 Daily Kos Page - bp.png https://my-image-repo-520.s3.amazonaws.com/uploads/Daily+Kos+Page+-+bp.png
163 7 Daily Kos Page - NRA.png https://my-image-repo-520.s3.a
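
The URL construction used in the cell above can be pulled into a small helper so the same logic is available when the HITs are created. This is a sketch; the function name `image_url` is hypothetical, and it handles only what the notebook handles (surrounding whitespace and spaces), not other characters that may need escaping.

```python
def image_url(base_url, image_name):
    """Join the bucket prefix and a file name, encoding spaces as '+'."""
    return "{}/{}".format(base_url.rstrip("/"), image_name.strip().replace(" ", "+"))


base = "https://my-image-repo-520.s3.amazonaws.com/uploads"
print(image_url(base, "Alternet Page - Wallet.png"))
```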

MTurk accepts an XML document containing the HTML that will be displayed to Workers. Workers will see this HTML for each item that is submitted. The sample HTML for the original sentiment task can be downloaded from [here](https://s3.amazonaws.com/mturk/samples/jupyter-examples/SentimentQuestion.html); it contains a variable ${content} that is replaced with a different tweet when the HIT is created. The code below instead loads `survey.html` from the same directory as this notebook, which uses numbered variables such as ${url_1}, ${website_1}, and ${ad_name_1} that are replaced when each HIT is created.

Here the HTML is loaded and inserted into the XML Document.

In [26]:
# Read the survey HTML and close the file handle when done
with open('./survey.html', 'r', encoding="utf-8") as html_file:
    html_layout = html_file.read()
QUESTION_XML = """<HTMLQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2011-11-11/HTMLQuestion.xsd">
        <HTMLContent><![CDATA[{}]]></HTMLContent>
        <FrameHeight>650</FrameHeight>
        </HTMLQuestion>"""
question_xml = QUESTION_XML.format(html_layout)
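
Before sending the document to MTurk it can be worth checking locally that the wrapper is well-formed XML. The sketch below uses only the standard library; `wrap_html_question` is a hypothetical helper, and the sample HTML is a stand-in for `survey.html`. One known pitfall it guards against is HTML containing `]]>`, which would end the CDATA section early.

```python
import xml.etree.ElementTree as ET

NS = "http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2011-11-11/HTMLQuestion.xsd"


def wrap_html_question(html, frame_height=650):
    """Wrap HTML in an HTMLQuestion document (hypothetical helper)."""
    if "]]>" in html:
        raise ValueError("HTML contains ']]>', which would close the CDATA section early")
    return (
        '<HTMLQuestion xmlns="{}">'
        "<HTMLContent><![CDATA[{}]]></HTMLContent>"
        "<FrameHeight>{}</FrameHeight>"
        "</HTMLQuestion>"
    ).format(NS, html, frame_height)


# Stand-in HTML; the notebook would pass the contents of survey.html instead.
xml_doc = wrap_html_question("<html><body><p>Hello</p></body></html>")
root = ET.fromstring(xml_doc)  # raises ParseError if the XML is malformed
content = root.find("{%s}HTMLContent" % NS).text
print(content)
```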

In Mechanical Turk each task is represented by a Human Intelligence Task (HIT), which is an individual item you want annotated by one or more Workers together with the interface that should be displayed. The definition below requests that nine Workers review each item, that the HIT remain live on the worker.mturk.com website for no more than an hour, and that Workers provide a response for each item in less than ten minutes. Each response has a reward of \$0.15, so the total Worker reward for each HIT would be \$1.35 plus \$0.27 in MTurk fees at a 20% fee rate. An appropriate title, description, and keywords are also provided to let Workers know what is involved in this task.

In [27]:
TaskAttributes = {
    'MaxAssignments': 9,                  # How many Workers will complete each HIT
    'LifetimeInSeconds': 60*60,           # How long the task will be available on the MTurk website (1 hour)
    'AssignmentDurationInSeconds': 60*10, # How long Workers have to complete each item (10 minutes)
    'Reward': '0.15',                     # The reward you will offer Workers for each response
    'Title': 'Answer questions about ads',
    'Keywords': 'survey, ad',
    'Description': 'Rate the relevancy of an ad from 1 to 5'
}
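
The cost of a batch follows directly from these attributes. The sketch below is a rough estimate only: `estimate_cost` is a hypothetical helper, and the 20% `fee_rate` default is an assumption that should be checked against MTurk's current pricing. `Decimal` is used so the cents add up exactly.

```python
from decimal import Decimal


def estimate_cost(num_hits, max_assignments, reward, fee_rate="0.20"):
    """Return (rewards, fees, total) for a batch; fee_rate is an assumed value."""
    rewards = Decimal(reward) * max_assignments * num_hits
    fees = rewards * Decimal(fee_rate)
    return rewards, fees, rewards + fees


# Ten HITs with nine assignments each at $0.15 per response
rewards, fees, total = estimate_cost(num_hits=10, max_assignments=9, reward="0.15")
print(rewards, fees, total)
```

Comparing `total` with the balance returned by `get_account_balance` before posting to Production helps avoid a batch that fails part-way through.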


## Create the tasks
Here a HIT is created for each row of `survey_groups` (the first ten rows in this run) so that it can be completed by Workers. Prior to creating the HIT, the image URL, website name, and ad name for each of the row's images are inserted into the Question XML content. The HIT Id returned for each task is stored in a results list so that we can retrieve the results later.
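
Since the substitution is plain string replacement, a placeholder that is misspelled or never filled will be shown to Workers verbatim. The sketch below shows one way to fill `${name}` placeholders and then list any that remain. The helper names and the `example.com` URL are hypothetical; the placeholder names follow the pattern used in this notebook.

```python
import re

# Matches ${name} where name is letters, digits, or underscores
PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def fill_placeholders(template, values):
    """Replace each ${name} that has a value; leave unknown names untouched."""
    return PLACEHOLDER.sub(lambda m: str(values.get(m.group(1), m.group(0))), template)


def unfilled_placeholders(text):
    """Return the sorted names of placeholders still present in the text."""
    return sorted(set(PLACEHOLDER.findall(text)))


template = '<img src="${url_1}"> ${website_1}: ${ad_name_1} <img src="${url_2}">'
filled = fill_placeholders(
    template,
    {"url_1": "https://example.com/a.png", "website_1": "NYT", "ad_name_1": "Wallet"},
)
print(filled)
print(unfilled_placeholders(filled))
```

Running a check like `unfilled_placeholders` on each question before calling `create_hit` catches a missing column before any HIT is posted.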

In [28]:
import random
results = []
hit_type_id = ''
numberOFImage = 16
imagePath = "https://my-image-repo-520.s3.amazonaws.com/uploads"

for index, row in survey_groups.head(n=10).iterrows():
    result = {}
    question = question_xml
    for i, col in enumerate(survey_groups.columns): 
        imgName  = row[col]
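        # Normalize the file name into "<website> - <ad>" before splitting.
        # Note: replace('Ad', '') also removes "Ad" from names such as "Adidas".
        to_split = (imgName.replace('.png', '').replace('.PNG', '')
                    .replace('Page2', '').replace('Page', '').replace('Ad', '')
                    .replace('Book', 'NYT').replace('Food', 'NYT')
                    .replace('Fasion', 'Fashion').replace('bp', 'BP')
                    .replace('Impeach', 'Impeachment'))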

        pageName, adName = to_split.split(' - ')
        question = question.replace('${{url_{0}}}'.format(i+1), "{}/{}".format(imagePath, imgName.strip().replace(" ", "+")))
        question = question.replace('${{website_{0}}}'.format(i+1), pageName.strip())
        question = question.replace('${{ad_name_{0}}}'.format(i+1), adName.strip())
        result['image{}'.format(i+1)] = imgName
    response = client.create_hit(
        **TaskAttributes,
        Question = question
    )
#     print(response)
    hit_type_id = response['HIT']['HITGroupId']
        
    result['id'] = index + 1
    result['hit_id'] = response['HIT']['HITId']
    results.append(result)

print("You can view the HITs here:")
print(mturk_environment['preview'] + "?groupId={}".format(hit_type_id))

# Create the output directory if it does not exist yet
os.makedirs("result/", exist_ok=True)
    
now = datetime.now()

dt_string = now.strftime("%d-%m-%Y-%H-%M-%S")
with open('result/result-{}.json'.format(dt_string), 'w') as outfile:
    json.dump(results, outfile)

You can view the HITs here:
https://workersandbox.mturk.com/mturk/preview?groupId=3OPURK5EJIR1K151UATASLAW781FD8
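
Because the HIT Ids are written to a timestamped file, they can be reloaded in a later session instead of relying on the in-memory `results` list. The following is a sketch with hypothetical helper names; it writes a sample file to a temporary directory purely so the example is self-contained, and the sample HIT Id is a placeholder.

```python
import glob
import json
import os
import tempfile


def latest_results_file(directory="result"):
    """Return the most recently modified result-*.json path, or None."""
    paths = glob.glob(os.path.join(directory, "result-*.json"))
    return max(paths, key=os.path.getmtime) if paths else None


def load_results(path):
    """Load the list of result dicts saved by the notebook."""
    with open(path, "r") as infile:
        return json.load(infile)


# Demonstration against a temporary directory with placeholder data
with tempfile.TemporaryDirectory() as tmp:
    sample = [{"id": 1, "hit_id": "EXAMPLE_HIT_ID"}]
    with open(os.path.join(tmp, "result-01-01-2020-00-00-00.json"), "w") as outfile:
        json.dump(sample, outfile)
    restored = load_results(latest_results_file(tmp))
    print(restored)
```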


In [50]:
results

[{'image1': 'Alternet Page - Wallet.png',
  'image2': 'Alternet Page - Polar Bear.png',
  'image3': 'Book - NRA.png',
  'image4': 'Book - Adidas.png',
  'image5': 'Breitbart Page - Fasion.png',
  'image6': 'Breitbart Page - NRA.PNG',
  'image7': 'Daily Kos Page - Adidas.png',
  'image8': 'Daily Kos Page - Impeach.png',
  'image9': 'Food - NRA.png',
  'image10': 'Food - bp.png',
  'image11': 'NewsMax Page - bp.png',
  'image12': 'NewsMax Page - Liberal Quiz.png',
  'image13': 'NYT Page - Impeach.png',
  'image14': 'NYT Page - Fasion.png',
  'image15': 'NYT Page2 - Impeach.png',
  'image16': 'NYT Page2 - Adidas.png',
  'id': 1,
  'hit_id': '3OPLMF3EU5J6A8V6SJTQVFD05OGLN9'},
 {'image1': 'Alternet Page - Fasion.png',
  'image2': 'Alternet Page - Polar Bear.png',
  'image3': 'Book - Liberal Quiz.png',
  'image4': 'Book - bp.png',
  'image5': 'Breitbart Page - Adias.PNG',
  'image6': 'Breitbart Page - Wildlife.png',
  'image7': 'Daily Kos Page - Wallet.png',
  'image8': 'Daily Kos Page - Imp

## Get Results
Depending on the task, results will be available anywhere from a few minutes to a few hours. Here we retrieve the status of each HIT and the responses that have been provided by Workers.
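
The retrieval loop in this section approves only assignments that took more than 30 seconds between acceptance and submission, and rejects the rest. That rule can be isolated and tested without calling MTurk, as in this sketch. The function name `took_long_enough` and the sample timestamps are hypothetical; the 30-second threshold comes from the notebook.

```python
from datetime import datetime, timedelta

MIN_SECONDS = 30  # threshold used by this notebook's retrieval loop


def took_long_enough(accept_time, submit_time, min_seconds=MIN_SECONDS):
    """True when the Worker spent strictly more than min_seconds on the assignment."""
    return (submit_time - accept_time).total_seconds() > min_seconds


start = datetime(2019, 11, 10, 15, 0, 0)
print(took_long_enough(start, start + timedelta(seconds=45)))
print(took_long_enough(start, start + timedelta(seconds=30)))
```

Note that the comparison is strict, so an assignment submitted after exactly 30 seconds is not accepted.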

In [11]:
# Free-text demographic fields that are copied through unchanged
DEMOGRAPHIC_FIELDS = {"age", "gender", "race", "zipCode", "Hispanic", "education", "occupation"}
# Likert options mapped to a score from 1 to 5
LIKERT_SCORES = {"strong_disagree": 1, "disagree": 2, "Unsure": 3, "agree": 4, "strong_agree": 5}
# Question prefixes used in the form and the key each one is stored under
LIKERT_KEYS = {"d1": "feelAboutAd", **{"q{}".format(n): "q{}".format(n) for n in range(1, 17)}}

def getAnsewer(answer_dict):
    """Flatten the parsed QuestionFormAnswers XML into a dict of field -> value."""
    answer = {}
    for ans in answer_dict['QuestionFormAnswers']['Answer']:
        identifier = ans['QuestionIdentifier']
        if identifier in DEMOGRAPHIC_FIELDS:
            answer[identifier] = ans["FreeText"]
            continue
        # Radio buttons arrive as "<question>.<option>" with FreeText "true" when selected
        question, _, option = identifier.partition('.')
        if question in LIKERT_KEYS and option in LIKERT_SCORES and ans["FreeText"] == "true":
            answer[LIKERT_KEYS[question]] = LIKERT_SCORES[option]
    return answer
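
One thing to watch for when iterating over the parsed answers: xmltodict returns a single dict rather than a list when an element occurs only once, so a response with exactly one `Answer` would not iterate as expected. The sketch below shows a small normalizing helper; `as_list` is a hypothetical name, and the dicts are hand-written stand-ins for parsed output rather than real Worker data.

```python
def as_list(value):
    """Return value as a list: [] for None, [value] for a single item."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


# Stand-ins for what a parser might return with one Answer and with several
single = {"QuestionFormAnswers": {"Answer": {"QuestionIdentifier": "age", "FreeText": "30"}}}
many = {"QuestionFormAnswers": {"Answer": [
    {"QuestionIdentifier": "age", "FreeText": "30"},
    {"QuestionIdentifier": "gender", "FreeText": "female"},
]}}

for parsed in (single, many):
    answers = as_list(parsed["QuestionFormAnswers"]["Answer"])
    print([a["QuestionIdentifier"] for a in answers])
```

The survey in this notebook has many fields, so the list form is the normal case; the helper matters mainly for short test forms.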

In [12]:
# with open('result/result-10-11-2019 15:51:20.json', 'r') as f:
#     results = json.load(f)
    
for item in results:
    
    # Get the status of the HIT
    hit = client.get_hit(HITId=item['hit_id'])
    item['status'] = hit['HIT']['HITStatus']

    # Get a list of the Assignments that have been submitted by Workers
    assignmentsList = client.list_assignments_for_hit(
        HITId=item['hit_id'],
        AssignmentStatuses=['Submitted', 'Approved'],
        MaxResults=100
    )

    assignments = assignmentsList['Assignments']
    item['assignments_submitted_count'] = len(assignments)

    answers = []
    for assignment in assignments:
    
        # Retrieve the attributes for each Assignment
        worker_id = assignment['WorkerId']
        assignment_id = assignment['AssignmentId']
        accept_time = assignment['AcceptTime']
        submit_time = assignment['SubmitTime']
        deltaTime = submit_time-accept_time  
        
        if deltaTime.total_seconds() > 30:
            # Retrieve the value submitted by the Worker from the XML
            answer_dict = xmltodict.parse(assignment['Answer'])
    #         print(answer_dict)
            answer = getAnsewer(answer_dict)
            answer['duration'] = deltaTime.total_seconds()
    #         print (answer)
            answers.append(answer)

            # Approve the Assignment (if it hasn't already been approved)
            if assignment['AssignmentStatus'] == 'Submitted':
                client.approve_assignment(
                    AssignmentId=assignment_id,
                    OverrideRejection=False
                )
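        elif assignment['AssignmentStatus'] == 'Submitted':
            # Reject only assignments still awaiting review; an assignment that
            # has already been approved cannot be rejected.
            print('Reject assignment= {} with workerid={} and hitid={}'.format(assignment_id,worker_id,item['hit_id']))
            client.reject_assignment(
                AssignmentId=assignment_id,
                RequesterFeedback='You did not finish the assignment properly'
            )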
    
    # Add the answers that have been retrieved for this item
    item['answers'] = answers

print(json.dumps(results,indent=2))

[
  {
    "image1": "Alternet Page - Wallet.png",
    "image2": "Alternet Page - Polar Bear.png",
    "image3": "Book - NRA.png",
    "image4": "Book - Adidas.png",
    "image5": "Breitbart Page - Fasion.png",
    "image6": "Breitbart Page - NRA.PNG",
    "image7": "Daily Kos Page - Adidas.png",
    "image8": "Daily Kos Page - Impeach.png",
    "image9": "Food - NRA.png",
    "image10": "Food - bp.png",
    "image11": "NewsMax Page - bp.png",
    "image12": "NewsMax Page - Liberal Quiz.png",
    "image13": "NYT Page - Impeach.png",
    "image14": "NYT Page - Fasion.png",
    "image15": "NYT Page2 - Impeach.png",
    "image16": "NYT Page2 - Adidas.png",
    "id": 1,
    "hit_id": "3VEI3XUCZQW57OFP789PZLJHKVCPR3",
    "status": "Assignable",
    "assignments_submitted_count": 0,
    "answers": []
  },
  {
    "image1": "Alternet Page - Fasion.png",
    "image2": "Alternet Page - Polar Bear.png",
    "image3": "Book - Liberal Quiz.png",
    "image4": "Book - bp.png",
    "image5": "Breitb