# Table of Contents
* [Submitting HITs](#Submitting-HITs)
	* [Building URLs for images on s3](#Building-URLs-for-images-on-s3)
	* [submitting HITs in groups](#submitting-HITs-in-groups)
* [Reviewing HITs](#Reviewing-HITs)
* [Ignore](#Ignore)


In [1]:
import boto
import boto.mturk.connection as tc
import boto.mturk.question as tq
from keysTkingdom import mturk_ai2
import pickle

# Submitting HITs

## Building URLs for images on s3

In [125]:
def load_book_info():
    with open('breakdowns.pkl', 'rb') as f:
        # Under Python 3, this Python 2 pickle may need pickle.load(f, encoding='latin1').
        book_breakdowns = pickle.load(f)

    with open('pdfs/page_ranges.csv') as f:
        ranges = f.readlines()
    # Each line is space-separated: the book file name followed by its page numbers.
    range_lookup = {line.split(' ')[0]: [int(num) for num in line.strip().split(' ')[1:]] for line in ranges}
    return book_breakdowns, range_lookup

def form_hit_url(book_name, page_n):
    book_name_no_ext = book_name.replace('.pdf', '_')
    baseurl = 'https://s3-us-west-2.amazonaws.com/ai2-vision-turk-data/textbook-annotation-test/build/index.html'
    full_url = baseurl + '?url={}{}.jpeg&id={}'.format(book_name_no_ext, page_n, page_n)
    return full_url

def make_book_group_urls(book_groups, book_group, ranges):
    group_urls = []
    for tb in book_groups[book_group]:
        start, end = ranges[tb]  # each entry must hold exactly two numbers: [start, end]
        for page_n in range(start, end):
            group_urls.append(form_hit_url(tb, page_n))
    return group_urls
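
The page-range lookup and the per-group page enumeration above can be tried without the real `breakdowns.pkl` or `pdfs/page_ranges.csv`. The following is a minimal sketch, assuming each range line has the form `<book file> <start> <end>`; the book names, the group name, and the helpers `parse_ranges` and `page_ids` are hypothetical stand-ins, not part of the notebook. It uses `line.split()` rather than `split(' ')`, so it also tolerates repeated spaces.

```python
# Hypothetical sample data; the real values live in breakdowns.pkl
# and pdfs/page_ranges.csv.
sample_lines = [
    "Book_A.pdf 10 13\n",
    "Book_B.pdf 1 3\n",
]
sample_groups = {"demo_group": ["Book_A.pdf", "Book_B.pdf"]}


def parse_ranges(lines):
    """Map each book file name to its list of page numbers [start, end]."""
    lookup = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        lookup[parts[0]] = [int(num) for num in parts[1:]]
    return lookup


def page_ids(groups, group_name, lookup):
    """List (book, page) pairs for a group; the end page is excluded, as in range()."""
    pairs = []
    for book in groups[group_name]:
        start, end = lookup[book]
        for page_n in range(start, end):
            pairs.append((book, page_n))
    return pairs


sample_lookup = parse_ranges(sample_lines)
sample_pairs = page_ids(sample_groups, "demo_group", sample_lookup)
print(sample_pairs)
```

Because `range(start, end)` stops before `end`, the last page listed in the range file does not get a URL; if that page should be annotated too, the loop would need `range(start, end + 1)`.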

In [6]:
book_groups,ranges = load_book_info()

In [7]:
daily_sci_urls = make_book_group_urls(book_groups, 'daily_sci', ranges)
spectrum_sci_urls = make_book_group_urls(book_groups, 'spectrum_sci', ranges)

In [98]:
daily_sci_urls[500:600]

['https://s3-us-west-2.amazonaws.com/ai2-vision-turk-data/textbook-annotation-test/build/index.html?url=Daily_Science_Grade_4_Evan_Moor_153.jpeg&id=153',
 'https://s3-us-west-2.amazonaws.com/ai2-vision-turk-data/textbook-annotation-test/build/index.html?url=Daily_Science_Grade_4_Evan_Moor_154.jpeg&id=154',
 'https://s3-us-west-2.amazonaws.com/ai2-vision-turk-data/textbook-annotation-test/build/index.html?url=Daily_Science_Grade_4_Evan_Moor_155.jpeg&id=155',
 'https://s3-us-west-2.amazonaws.com/ai2-vision-turk-data/textbook-annotation-test/build/index.html?url=Daily_Science_Grade_4_Evan_Moor_156.jpeg&id=156',
 'https://s3-us-west-2.amazonaws.com/ai2-vision-turk-data/textbook-annotation-test/build/index.html?url=Daily_Science_Grade_4_Evan_Moor_157.jpeg&id=157',
 'https://s3-us-west-2.amazonaws.com/ai2-vision-turk-data/textbook-annotation-test/build/index.html?url=Daily_Science_Grade_4_Evan_Moor_158.jpeg&id=158',
 'https://s3-us-west-2.amazonaws.com/ai2-vision-turk-data/textbook-annotatio

## submitting HITs in groups

In [90]:
def creat_single_hit(url):
    """
    Create a single external-question HIT that loads the provided url.
    Relies on the module-level `mturk` connection.
    """
    title = "Annotate Science Textbook"
    description = "Choose which category a text entry best belongs to"
    keywords = ['image', 'science']
    frame_height = 1000  # the height of the iframe holding the external hit
    amount = 0.05  # reward per assignment, in USD
#     duration = 3600

    questionform = tq.ExternalQuestion(url, frame_height)

    create_hit_result = mturk.create_hit(
        title = title,
        description = description,
        keywords = keywords,
        question = questionform,
        reward = boto.mturk.price.Price(amount=amount),
#         max_assignments=3,
        max_assignments=1,
#         duration = duration
    )
    return create_hit_result


def create_hits_from_pages(page_links):
    """Create one HIT per url in page_links."""
    for url in page_links:
        creat_single_hit(url)

In [64]:
sandbox_host = 'mechanicalturk.sandbox.amazonaws.com' 
mturk = tc.MTurkConnection(
    aws_access_key_id = mturk_ai2.access_key,
    aws_secret_access_key = mturk_ai2.access_secret_key,
    host = sandbox_host,
    debug = 1 # debug = 2 prints out all requests.
)
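
The connection above points at the requester sandbox, where HITs cost nothing and no real workers see them. A minimal sketch of keeping that choice explicit follows; the helper `choose_mturk_host` and the constant names are hypothetical, and the production host shown is the standard endpoint for this legacy MTurk API, stated here as an assumption about the target environment.

```python
# Hypothetical helper: keep the sandbox/production choice in one place.
SANDBOX_HOST = 'mechanicalturk.sandbox.amazonaws.com'
PRODUCTION_HOST = 'mechanicalturk.amazonaws.com'


def choose_mturk_host(use_sandbox=True):
    """Return the MTurk host name; defaults to the sandbox to avoid accidental spend."""
    return SANDBOX_HOST if use_sandbox else PRODUCTION_HOST


print(choose_mturk_host())
```

The returned string could be passed as the `host` argument of `tc.MTurkConnection`, so that switching to production is a single deliberate flag change.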

In [68]:
creat_single_hit(daily_sci_urls[85])

In [123]:
create_hits_from_pages(daily_sci_urls[502:504])

# Reviewing HITs

In [102]:
r_hits = mturk.get_reviewable_hits()

In [121]:
for hit in r_hits:
    assignments = mturk.get_assignments(hit.HITId)
    for assignment in assignments:
        for answers in assignment.answers:
            print(answers[0].fields)

[u'Daily_Science_Grade_4_Evan_Moor_153.jpeg']
[u'Daily_Science_Grade_4_Evan_Moor_154.jpeg']


In [63]:
# batch_results_df = pd.read_csv(data_dir+results_csv)
# print(batch_results_df.shape)
# batch_results_df.head(2)

# Ignore

In [64]:
# grouped_results_df = batch_results_df.groupby('Input.image_url')
# for image_response in grouped_results_df:
#     print(image_response[1]['Answer.NumberOfItems'])

In [79]:
# hit_type_1 = (
#     "Annotate Science Textbook",
#     "Choose which category a text entry best belongs to",
#     boto.mturk.price.Price(amount=0.05),
#     3600,
#     ['image', 'science']
# )

# my_hits = list(mturk.get_all_hits())

# for hit in my_hits:
#     mturk.disable_hit(hit.HITId)

# my_hit = list(mturk.get_all_hits())[0]

# hitidr = mturk.register_hit_type(*hit_type_1)

In [99]:
# my_hits is only assigned in the commented-out cell above, so fetch the HITs here.
for hit in mturk.get_all_hits():
    mturk.disable_hit(hit.HITId)

Choosing the right price for your HITs is crucial, and it can be tricky to figure out when you’re first starting. This is where those who use Mechanical Turk as a digital sweatshop are separated from those who use it as a fair and equitable way to employ other people. Many turkers consider it unethical to pay under $0.10 per minute. That works out to a $6.00 hourly wage, which is still below the US federal minimum wage of $7.25 (and many states set a higher minimum). Turkers pay close attention to price when deciding whether a HIT is worth their time. As one turker said in a survey, “…I figure a good task is one I can make 10 to 12 cents a minute on.” If you want your HITs done quickly and by high-quality turkers (and trust me, you do!), make sure you pay your turkers fairly. A quick rule of thumb:

Fair Pay = $0.10 x (Average Number Of Minutes Per Assignment)
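
The rule of thumb above is simple arithmetic, and a small helper makes it easy to check a reward before creating HITs. This is a minimal sketch; the function name `fair_pay` and its parameters are hypothetical, and the $0.10 default is the per-minute rate quoted in the text.

```python
def fair_pay(avg_minutes, rate_per_minute=0.10):
    """Suggested reward in USD for an assignment that takes avg_minutes on average."""
    if avg_minutes < 0:
        raise ValueError("avg_minutes must be non-negative")
    return round(avg_minutes * rate_per_minute, 2)


# Reward for a task that takes about three minutes.
three_minute_reward = fair_pay(3)
# Reward for a task that takes about half a minute.
half_minute_reward = fair_pay(0.5)
print(three_minute_reward, half_minute_reward)
```

By this rule, the $0.05 reward used when creating HITs earlier in the notebook is fair only if an assignment takes about 30 seconds on average.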