You need to respect the following constraints:

- The separator between the figure ID and the concepts has to be a tab character

- The separator between the UMLS concepts has to be a semicolon (;)

- The same concept cannot be specified more than once for a given figure ID

- Each figure ID of the testset must be included in the submission file exactly once (even if there are no concepts)
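
The four constraints above can also be checked in one pass with plain Python. The following is a minimal sketch, not the official checker: the function name `validate_submission` and its arguments are hypothetical, and it assumes the submission is available as a list of text lines and the test set as a list of figure IDs.

```python
def validate_submission(lines, test_ids):
    """Return a list of human-readable problems found in a submission.

    `lines` are raw submission lines; `test_ids` are the expected figure IDs.
    Both names are illustrative assumptions, not part of the challenge tooling.
    """
    problems = []
    seen = set()
    for number, line in enumerate(lines, start=1):
        parts = line.rstrip("\n").split("\t")
        # Exactly one tab may separate the figure ID from the concept list.
        if len(parts) > 2:
            problems.append(f"line {number}: more than one tab separator")
            continue
        figure_id = parts[0]
        # Concepts are separated by semicolons; the list may be empty.
        concepts = parts[1].split(";") if len(parts) == 2 and parts[1] else []
        if figure_id in seen:
            problems.append(f"line {number}: figure ID {figure_id!r} appears more than once")
        seen.add(figure_id)
        if len(concepts) != len(set(concepts)):
            problems.append(f"line {number}: duplicate concept for {figure_id!r}")
    expected = set(test_ids)
    # Every test figure must be present, and no foreign IDs may appear.
    for figure_id in sorted(expected - seen):
        problems.append(f"missing figure ID {figure_id!r}")
    for figure_id in sorted(seen - expected):
        problems.append(f"unknown figure ID {figure_id!r}")
    return problems


if __name__ == "__main__":
    sample = ["fig1\tC0000001;C0000002\n", "fig2\n"]
    print(validate_submission(sample, ["fig1", "fig2"]))
```

An empty list means no problem was detected. A line that uses spaces instead of a tab shows up as an unknown figure ID, because the whole line is then read as the ID.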

In [1]:
import pandas

In [2]:
# the separator between the figure ID and the concepts has to be a tab character.
results = pandas.read_csv(filepath_or_buffer='results.csv', delimiter='\t', names=['figure', 'concepts'])

In [3]:
testing = pandas.read_csv(filepath_or_buffer='testing.csv', delimiter='\t', names=['figure'])

In [4]:
# the separator between the UMLS concepts has to be a semicolon (;).
results.head()

Unnamed: 0,figure,concepts
0,fsurg-04-00047-g002,C0016538;C0033363;C0221055;C0262878;C0309989;C...
1,AJC-17-412-g001,C0006290;C0014245;C0040578;C0040580;C0043246;C...
2,gr7_PMC5545870,C0014245;C0034196;C0038351;C0038354;C0153418;C...
3,SaudiMedJ-38-541-g004,C0009924;C0027651;C0030797;C0034606;C0040405;C...
4,ijn-12-2179Fig1,C0005889;C0017547;C0041618;C0043194;C0221055;C...


In [5]:
testing.head()

Unnamed: 0,figure
0,12864_2017_3726_Fig1_HTML
1,13071_2017_2412_Fig3_HTML
2,13071_2017_2412_Fig4_HTML
3,41598_2017_1378_Fig2_HTML
4,41598_2017_1378_Fig3_HTML


In [6]:
# each figure ID of the testset must be included in the submission file exactly once (even if there are no concepts).
results.describe()

Unnamed: 0,figure,concepts
count,9938,9938
unique,9938,9775
top,bi-2017-003002_0008,C0221055;C1550557;C1706368
freq,1,47


In [7]:
testing.describe()

Unnamed: 0,figure
count,9938
unique,9938
top,bi-2017-003002_0008
freq,1


In [8]:
results.count()

figure      9938
concepts    9938
dtype: int64

In [9]:
testing.count()

figure    9938
dtype: int64

In [10]:
results.figure.unique().size

9938

In [11]:
testing.figure.unique().size

9938

In [12]:
set(results['figure']) - set(testing['figure'])

set()

In [13]:
set(testing['figure']) - set(results['figure'])

set()

In [14]:
# the same concept cannot be specified more than once for a given figure ID.
for figure, concepts in results.values:
    tmp = concepts.split(';')
    total = len(tmp)
    unique = len(set(tmp))
    if total != unique:
        # report which figure has duplicated concepts, not only the counts
        print(figure, total, unique)

In [15]:
!pip install scikit-learn
!pip install scipy



In [16]:
import sys, argparse, string
import csv
import warnings

In [17]:
from sklearn.metrics import f1_score

In [18]:
def main(candidate_file, gt_file):

    # Hide warnings
    warnings.filterwarnings('ignore')

    # Concept stats
    min_concepts = sys.maxsize
    max_concepts = 0
    total_concepts = 0
    concepts_distrib = {}

#     # Parse arguments
#     parser = argparse.ArgumentParser()
#     parser.add_argument('candidate_file', help='path to the candidate file to evaluate')
#     parser.add_argument('gt_file', help='path to the ground truth file')
#     args = parser.parse_args()

    # Read files
    print('Input parameters\n********************************')

    print('Candidate file is "' + candidate_file + '"')
    candidate_pairs = readfile(candidate_file)

    print('Ground Truth file is "' + gt_file + '"')
    gt_pairs = readfile(gt_file)

    # Define max score and current score
    max_score = len(gt_pairs)
    current_score = 0

    # Check there are the same number of pairs between candidate and ground truth
    if len(candidate_pairs) != len(gt_pairs):
        print('ERROR : Candidate does not contain the same number of entries as the ground truth!')
        exit(1)

    # Evaluate each candidate concept list against the ground truth
    print('Processing concept sets...\n********************************')

    i = 0
    for image_key in candidate_pairs:

        # Get candidate and GT concepts
        candidate_concepts = candidate_pairs[image_key].upper()
        gt_concepts = gt_pairs[image_key].upper()

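        # Split concept string into concept array
        # Manage empty concept lists
        # NOTE: the split below uses ',', whereas the submission format separates
        # concepts with ';'. Each semicolon-separated list is therefore treated as a
        # single concept here (hence the '1 : 9938' distribution in the output of In [21]).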
        if gt_concepts.strip() == '':
            gt_concepts = []
        else:
            gt_concepts = gt_concepts.split(',')

        if candidate_concepts.strip() == '':
            candidate_concepts = []
        else:
            candidate_concepts = candidate_concepts.split(',')

        # Manage empty GT concepts (ignore in evaluation)
        if len(gt_concepts) == 0:
            max_score -= 1
        # Normal evaluation
        else:
            # Concepts stats
            total_concepts += len(gt_concepts)

            # Global set of concepts
            all_concepts = sorted(list(set(gt_concepts + candidate_concepts)))

            # Calculate F1 score for the current concepts
            y_true = [int(concept in gt_concepts) for concept in all_concepts]
            y_pred = [int(concept in candidate_concepts) for concept in all_concepts]

            f1score = f1_score(y_true, y_pred, average='binary')

            # Increase calculated score
            current_score += f1score

        # Concepts stats
        nb_concepts = str(len(gt_concepts))
        if nb_concepts not in concepts_distrib:
            concepts_distrib[nb_concepts] = 1
        else:
            concepts_distrib[nb_concepts] += 1

        if len(gt_concepts) > max_concepts:
            max_concepts = len(gt_concepts)

        if len(gt_concepts) < min_concepts:
            min_concepts = len(gt_concepts)

        # Progress display
        i += 1
        if i % 1000 == 0:
            print(i, '/', len(gt_pairs), ' concept sets processed...')

    # Print stats
    print('Concept statistics\n********************************')
    print('Number of concepts distribution')
    print_dict_sorted_num(concepts_distrib)
    print('Least concepts in set :', min_concepts)
    print('Most concepts in set :', max_concepts)
    print('Average concepts in set :', total_concepts / len(candidate_pairs))

    # Print evaluation result
    print('Final result\n********************************')
    print('Obtained score :', current_score, '/', max_score)
    print('Mean score over all concept sets :', current_score / max_score)
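
For reference, the per-figure score computed inside the loop can be reproduced without scikit-learn. The sketch below is an assumption-laden illustration rather than the official evaluator: the helper name `concept_f1` is hypothetical, and it splits on `;` (the separator required by the submission constraints) instead of the `,` used in `main`.

```python
def concept_f1(candidate, ground_truth, separator=";"):
    """F1 between two separator-delimited concept strings (case-insensitive).

    Hypothetical helper: it mirrors the binary F1 over the union of both
    concept sets, assuming ';' as the concept separator.
    """
    cand = {c.strip() for c in candidate.upper().split(separator) if c.strip()}
    truth = {c.strip() for c in ground_truth.upper().split(separator) if c.strip()}
    true_positives = len(cand & truth)
    if true_positives == 0:
        # No overlap (or an empty set on either side) scores zero.
        return 0.0
    precision = true_positives / len(cand)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(concept_f1("C0000001;C0000002", "C0000001;C0000003"))
```

Note that `main` skips figures whose ground-truth list is empty and lowers the maximum score accordingly; a caller of this helper would need to do the same rather than count the returned 0.0.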

In [19]:
# Read a tab-separated ImageID - concepts pair file
def readfile(path):
    try:
        pairs = {}
        with open(path) as csvfile:
            reader = csv.reader(csvfile, delimiter='\t', quoting=csv.QUOTE_NONE)
            for row in reader:
                # We have an ID and a set of concepts (possibly empty)
                if len(row) == 2:
                    pairs[row[0]] = row[1]
                # We only have an ID
                elif len(row) == 1:
                    pairs[row[0]] = ''
                else:
                    print('File format is wrong, please check your run file')
                    exit(1)

        return pairs
    except FileNotFoundError:
        print('File "' + path + '" not found! Please check the path!')
        exit(1)

In [20]:
# Print 1-level key-value dictionary, sorted (with numeric key)
def print_dict_sorted_num(obj):
    keylist = [int(x) for x in list(obj.keys())]
    keylist.sort()
    for key in keylist:
        print(key, ':', obj[str(key)])

In [21]:
# Sanity check: score the submission against itself, so the mean score must be 1.0
main('results.csv', 'results.csv')

Input parameters
********************************
Candidate file is "results.csv"
Ground Truth file is "results.csv"
Processing concept sets...
********************************
1000 / 9938  concept sets processed...
2000 / 9938  concept sets processed...
3000 / 9938  concept sets processed...
4000 / 9938  concept sets processed...
5000 / 9938  concept sets processed...
6000 / 9938  concept sets processed...
7000 / 9938  concept sets processed...
8000 / 9938  concept sets processed...
9000 / 9938  concept sets processed...
Concept statistics
********************************
Number of concepts distribution
1 : 9938
Least concepts in set : 1
Most concepts in set : 1
Average concepts in set : 1.0
Final result
********************************
Obtained score : 9938.0 / 9938
Mean score over all concept sets : 1.0


In [22]:
"""
---------------
IMPORTANT LINKS
---------------

http://www.crowdai.org/challenges/imageclef-2018-caption-concept-detection/submissions
http://ceur-ws.org/Vol-1866/
http://clef2018.clef-initiative.eu
http://imageclef.org/2018
http://imageclef.org/2018/caption
http://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html
"""


In [23]:
"""
Title:

Concept Detection on Medical Images using Deep Learning with Depthwise Separable Convolutions.

Description:

Convolutional neural networks are considered among the best classifiers for single-label image classification. In this task, we adapt a convolutional neural network with transfer learning to a multi-label classification task. In particular, we use Xception, a deep convolutional neural network architecture inspired by the Inception network. Inception was primarily designed for the ImageNet competition; Xception replaces its Inception modules with depthwise separable convolutions.

The CNN architecture of Xception is based entirely on depthwise separable convolution layers and has 36 convolutional layers forming the feature extraction base of the network. Since this is an image classification task, the convolutional base is followed by a logistic regression layer. In addition, we inserted fully-connected layers before the logistic regression layer, as explored in the system presentation section. The 36 convolutional layers are structured into 14 modules, all of which have linear residual connections around them, except for the first and last modules.

In order to train the CNN model we used Jupyter Notebook and Keras on Amazon AWS through the AWS Deep Learning AMI. Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Keras is a deep learning framework and high-level library that uses TensorFlow as its back-end engine. Finally, the AWS Deep Learning AMIs provide machine learning practitioners and researchers with the infrastructure and tools to accelerate deep learning in the cloud, at any scale.

The overall process, covering both training and testing, took about 24 hours, and our final model achieved an F1-score of 22.06% on the validation data.

Retrieval Type:

Mixed (Textual and Visual)

Run Type:

Automatic

Primary Run?

Checked

Other information:

Nothing to add here.

Additional resources used

No additional resources were used.

Choose File:

/home/eualin/Desktop/results.csv

https://www.crowdai.org/challenges/imageclef-2018-caption-concept-detection/submissions
"""
