Submission Format

For a given task, your system should produce a target-language description for each image, 
formatted as follows:

<METHOD NAME> <IMAGE ID> <DESCRIPTION> <TASK> <TYPE>

Where:
METHOD NAME is the name of your method.
IMAGE ID is the identifier of the test image.
DESCRIPTION is the output generated by your system (either a translation or an independently generated description).
TASK is one of the following flags: 1 (for translation task), 2 (for image description task), 3 (for both). 
The choice here will indicate how your descriptions will be evaluated. 
Option 3 means they will be evaluated both as a translation task and as an image description task.
TYPE is either C or U, where C indicates "constrained", i.e. using only the resources provided by the task organisers, and U indicates "unconstrained".
Each field should be delimited by a single tab character.
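
The format rules above can be sketched as a small helper that builds one submission line. This is only an illustration: the name `format_submission_line` and its argument names are hypothetical, and rejecting tabs and newlines inside a field is an assumption made here so that the five-column layout cannot be broken by the description text.

```python
VALID_TASKS = {'1', '2', '3'}   # 1: translation, 2: image description, 3: both
VALID_TYPES = {'C', 'U'}        # C: constrained, U: unconstrained


def format_submission_line(method_name, image_id, description, task, sub_type):
    """Join the five fields with single tabs after basic validation (hypothetical helper)."""
    task = str(task)
    if task not in VALID_TASKS:
        raise ValueError('TASK must be 1, 2 or 3, got {!r}'.format(task))
    if sub_type not in VALID_TYPES:
        raise ValueError('TYPE must be C or U, got {!r}'.format(sub_type))
    fields = [method_name, image_id, description, task, sub_type]
    for field in fields:
        # assumption: a tab or newline inside a field would break the layout
        if '\t' in field or '\n' in field:
            raise ValueError('field contains a tab or newline: {!r}'.format(field))
    return '\t'.join(fields)


print(format_submission_line('Moses', '1007129816', 'Ein Mann arbeitet .', 1, 'C'))
```

Validating before joining means a malformed record fails early instead of producing a file with shifted columns.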


Submission Requirements

Each participating team can submit at most 2 systems for each of the task variants (so up to 4 submissions). 
Submissions should be sent by email to Lucia Specia (lspecia@gmail.com). Please name your files using the following pattern:
INSTITUTION-NAME_TASK-NAME_METHOD-NAME_TYPE, where:

INSTITUTION-NAME is an acronym/short name for your institution, e.g. SHEF

TASK-NAME is one of the following: 1 (translation), 2 (description), 3 (both).

METHOD-NAME is an identifier for your method in case you have multiple methods for the same task, e.g. 2_NeuralTranslation, 2_Moses

TYPE is either C or U, where C indicates "constrained", i.e. using only the resources provided by the task organisers, 
and U indicates "unconstrained".

For instance, a constrained submission from team SHEF for task 2 using method "Moses" could be named SHEF_2_Moses_C.
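
As a rough sketch of the naming pattern, the helper below assembles a file name from its four parts and checks the two fields that have a fixed set of values. The function name `submission_file_name` is hypothetical and not part of the task instructions.

```python
def submission_file_name(institution, task, method, sub_type):
    """Build INSTITUTION-NAME_TASK-NAME_METHOD-NAME_TYPE (hypothetical helper)."""
    task = str(task)
    if task not in ('1', '2', '3'):
        raise ValueError('TASK-NAME must be 1, 2 or 3, got {!r}'.format(task))
    if sub_type not in ('C', 'U'):
        raise ValueError('TYPE must be C or U, got {!r}'.format(sub_type))
    # the parts are joined with underscores, as in the SHEF example
    return '_'.join([institution, task, method, sub_type])


print(submission_file_name('SHEF', 2, 'Moses', 'C'))  # prints SHEF_2_Moses_C
```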

You are invited to submit a short paper (4 to 6 pages) to WMT describing your method(s). 
Submitting a paper is optional. If you choose not to, we ask you to provide a summary and/or an appropriate 
reference describing your method(s) that we can cite in the WMT overview paper.

In [1]:
import os
import codecs

In [6]:
BASEDIR='/media/1tb_drive/multilingual-multimodal/flickr30k/train/processed'

test_sens = ['this is a test', 'this is another test']
test_img_ids = ['img1', 'img2']

# submission_dir = os.path.join(BASEDIR, 'final_submissions/test_submission')

# SUBMISSION 1 -- NMT min-risk baseline
submission_dir = '/media/1tb_drive/Dropbox/projects/wmt_multimodal_2016/final_submission/'

# test_output_file = '/media/1tb_drive/Dropbox/projects/wmt_multimodal_2016/final_submission/test.min-risk-NMT-baseline.hyps.out'
test_output_file = '/media/1tb_drive/Dropbox/projects/wmt_multimodal_2016/final_submission/test.multimodal-FULL-DEV-TEST-summed.hyps.out'

# test_output_file = os.path.join(submission_dir, 'test.multimodal-summed.hyps.out')

image_id_file='/media/1tb_drive/multilingual-multimodal/flickr30k/img_features/f30k-translational/test_images.txt'

INSTITUTION_NAME = 'DCU'
TASK_NAME = '1'
METHOD_NAME = 'min-risk-multimodal'
TYPE = 'C'

submission_file_name = '{}_{}_{}_{}'.format(INSTITUTION_NAME, TASK_NAME, METHOD_NAME, TYPE)
output_file = os.path.join(submission_dir, submission_file_name)

In [7]:
def parse_img_ids(img_filename):
    with codecs.open(img_filename, encoding='utf8') as inp:
        img_ids = inp.read().strip().split('\n')
    
    # strip the file extension (e.g. '.jpg') to get the bare image ID
    img_ids = [os.path.splitext(img_id)[0] for img_id in img_ids]
    return img_ids



def process_file_for_wmt16(input_filename, input_img_filename, split_tabs=False):
    with codecs.open(input_filename, encoding='utf8') as inp:
        lines = inp.read().strip().split('\n')
        if split_tabs:
            lines = [l.split('\t')[0] for l in lines]
        
    img_ids = parse_img_ids(input_img_filename)
    
    # sanity check: there must be exactly one hypothesis per image ID
    print(len(img_ids))
    print(len(lines))
    assert len(img_ids) == len(lines)
        
    output_lines = ['\t'.join([METHOD_NAME, img_id, hyp, TASK_NAME, TYPE]) 
                    for img_id, hyp in zip(img_ids, lines)]
    
    return output_lines
    
# Output line format (fields delimited by a single tab character):
#     <METHOD NAME> <IMAGE ID> <DESCRIPTION> <TASK> <TYPE>

In [8]:
out_lines = process_file_for_wmt16(test_output_file, image_id_file, split_tabs=True)

1000
1000


In [9]:
with codecs.open(output_file, 'w', encoding='utf8') as out:
    for l in out_lines:
        out.write(l+'\n')
    print('Wrote output to: {}'.format(output_file))

Wrote output to: /media/1tb_drive/Dropbox/projects/wmt_multimodal_2016/final_submission/DCU_1_min-risk-multimodal_C
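
Before emailing the file, it may be worth reading it back and confirming that every line still has five tab-delimited fields. The following is a minimal sketch under that assumption. `check_submission_file` is a hypothetical name, and the checks only cover what the format description states (field count, TASK and TYPE values).

```python
import io


def check_submission_file(path):
    """Return the number of lines after checking each has five valid fields."""
    count = 0
    with io.open(path, encoding='utf8') as inp:
        for line_no, line in enumerate(inp, 1):
            fields = line.rstrip('\n').split('\t')
            if len(fields) != 5:
                raise ValueError('line {}: expected 5 fields, found {}'.format(
                    line_no, len(fields)))
            # TASK is the fourth field and TYPE the fifth
            if fields[3] not in ('1', '2', '3') or fields[4] not in ('C', 'U'):
                raise ValueError('line {}: invalid TASK or TYPE'.format(line_no))
            count += 1
    return count
```

Called on the file written above, e.g. `check_submission_file(output_file)`, the returned count should match the number of test images.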


In [10]:
out_lines[:10]

[u'min-risk-multimodal\t1007129816\tEin Mann mit einem orangen Hut starrt etwas auf etwas .\t1\tC',
 u'min-risk-multimodal\t1009434119\tEin Softballspieler l\xe4uft auf einem gr\xfcnen Rasen vor einem wei\xdfen Zaun .\t1\tC',
 u'min-risk-multimodal\t101362133\tEin M\xe4dchen in Karateanz\xfcgen Spielkleidung mit einem Stock vor einem Stock .\t1\tC',
 u'min-risk-multimodal\t102617084\tF\xfcnf Personen mit Winterkleidung und Helmen stehen im Schnee und mit Bergen im Hintergrund .\t1\tC',
 u'min-risk-multimodal\t10287332\tLeute fahren das Dach eines Hauses .\t1\tC',
 u'min-risk-multimodal\t1039637574\tEin Mann in schwarz-hellbraune Kleidung machen eine Gruppe von M\xe4nnern in legerer Kleidung .\t1\tC',
 u'min-risk-multimodal\t1043819504\tEine Gruppe Leute steht vor einem Stuhl .\t1\tC',
 u'min-risk-multimodal\t1043910339\tEin Junge in einem roten Trikot versucht sich auf einem Teller .\t1\tC',
 u'min-risk-multimodal\t1044798682\tEin Mann arbeitet an einem Geb\xe4ude .\t1\tC',
 u'min-risk

In [7]:
be = parse_img_ids(image_id_file)

In [8]:
be[:10]

[u'1007129816',
 u'1009434119',
 u'101362133',
 u'102617084',
 u'10287332',
 u'1039637574',
 u'1043819504',
 u'1043910339',
 u'1044798682',
 u'1071201387']

In [5]:
te[:10]

[u'1007129816.jpg',
 u'1009434119.jpg',
 u'101362133.jpg',
 u'102617084.jpg',
 u'10287332.jpg',
 u'1039637574.jpg',
 u'1043819504.jpg',
 u'1043910339.jpg',
 u'1044798682.jpg',
 u'1071201387.jpg']