# ImageCLEF Medical Caption Task 2019

The [ImageCLEF 2019 Concept Detection Task](https://www.imageclef.org/2019/medical/caption/) is a large-scale multi-label classification task that aims to identify medical terms (concepts) in radiology images. Your goal is to implement a system that tags each medical image with the abnormalities it shows, represented by [Unified Medical Language System (UMLS)](https://www.nlm.nih.gov/research/umls/index.html) concept IDs.

The AUEB NLP Group won the competition with [this paper](http://nlp.cs.aueb.gr/pubs/paper_136.pdf). This assignment was prepared by Vasiliki Kougia and John Pavlopoulos.

## Get the Data

* The data are hosted on Google Drive.

* You can download the training and validation data from <https://drive.google.com/uc?id=1UOccw0VNCiRTwaQSEJMhiYWXhizvKptX> and the test data from <https://drive.google.com/uc?id=1diO2apPPFJeTH8CGcd3S55OUNTXtJVu2>. Alternatively, you can use the [gdown](https://github.com/wkentaro/gdown) utility; the file IDs are `1UOccw0VNCiRTwaQSEJMhiYWXhizvKptX` (training and validation data) and `1diO2apPPFJeTH8CGcd3S55OUNTXtJVu2` (test data).

* The data are organised as follows:

  * `training-set`: 56,629 training images.
  
  * `validation-set`: 14,157 validation images.

  * `Last` (rename it to `test-set`): 10,000 images used for testing. The test images have no annotations and will be used for assessment.
  
  * `train_concepts.csv`: the image IDs of the training set with their gold (i.e., known correct) tags, separated with `;`.
  
  * `val_concepts.csv`: the validation image IDs with their gold tags, separated with `;`.

  * `string_concepts.csv`: all the available tag IDs and their corresponding names, separated with tabs.
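
A minimal sketch of reading the annotation files with Python's standard `csv` module. The description above does not say how an image ID is separated from its tag list, so the tab used here is an assumption (`field_sep` is a parameter you may need to change after inspecting the files). The function names and the sample rows are made up for illustration.

```python
import csv
import io


def read_image_tags(handle, field_sep="\t", tag_sep=";"):
    """Map each image ID to its list of tag IDs.

    Assumes one image per row: the ID, then `field_sep`, then the tags
    joined by `tag_sep`. Inspect the real file; `field_sep` may differ.
    """
    image_tags = {}
    reader = csv.reader(handle, delimiter=field_sep, quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row:
            continue  # skip blank lines
        image_id = row[0]
        has_tags = len(row) > 1 and row[1]
        image_tags[image_id] = row[1].split(tag_sep) if has_tags else []
    return image_tags


def read_tag_names(handle, field_sep="\t"):
    """Map each tag ID to its name (tab-separated, as described above)."""
    reader = csv.reader(handle, delimiter=field_sep, quoting=csv.QUOTE_NONE)
    return {row[0]: row[1] for row in reader if len(row) >= 2}


# Made-up rows that only illustrate the assumed layout.
sample_concepts = io.StringIO("img_001\tC0000001;C0000002\nimg_002\tC0000002\n")
sample_names = io.StringIO("C0000001\tfirst concept\nC0000002\tsecond concept\n")

image_tags = read_image_tags(sample_concepts)
tag_names = read_tag_names(sample_names)
print(image_tags)
print(tag_names)
```

With the real files you would pass an open file instead of the in-memory samples, for example `open("train_concepts.csv", newline="", encoding="utf-8")`.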

## Data Exploration

* Explore your data.

* Plot some images.

* For those images, fetch their tag IDs and their tag names.

* How many tags are there in total?

* Which ones are the most frequent?

* How many tags are there per image?
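
The counting questions above can be answered with a few lines of standard-library Python. This is only a sketch: `image_tags` is a hypothetical dictionary from image ID to tag IDs (toy values here), which you would build from `train_concepts.csv`.

```python
from collections import Counter
from statistics import mean

# Hypothetical toy annotations: image ID -> list of tag IDs.
image_tags = {
    "img_001": ["C0000001", "C0000002"],
    "img_002": ["C0000002"],
    "img_003": ["C0000002", "C0000003", "C0000001"],
}

# How often each tag occurs across all images.
tag_counts = Counter(tag for tags in image_tags.values() for tag in tags)

num_distinct_tags = len(tag_counts)            # how many tags in total
most_frequent = tag_counts.most_common(2)      # the most frequent tags
tags_per_image = [len(tags) for tags in image_tags.values()]

print("distinct tags:", num_distinct_tags)
print("most frequent:", most_frequent)
print("tags per image (min, mean, max):",
      min(tags_per_image), mean(tags_per_image), max(tags_per_image))
```

A histogram of `tags_per_image` is usually more informative than the mean alone, since tag counts per image tend to be skewed.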

## Data Preprocessing

* Preprocess the images so that you can use them as input.

* You may have to preprocess the labels as well.
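
One common way to preprocess labels for multi-label classification is to turn each image's tag list into a fixed-length 0/1 vector with one position per tag. The sketch below assumes that representation; the helper names are hypothetical, and libraries such as scikit-learn (`MultiLabelBinarizer`) offer an equivalent transformation.

```python
def build_tag_index(image_tags):
    """Assign a stable position to every tag seen in the annotations."""
    vocabulary = sorted({tag for tags in image_tags.values() for tag in tags})
    return {tag: position for position, tag in enumerate(vocabulary)}


def to_multi_hot(tags, tag_index):
    """Encode a list of tag IDs as a 0/1 vector over the tag vocabulary."""
    vector = [0] * len(tag_index)
    for tag in tags:
        position = tag_index.get(tag)
        if position is not None:  # tags missing from the index are ignored
            vector[position] = 1
    return vector


# Hypothetical toy annotations: image ID -> list of tag IDs.
train_tags = {
    "img_001": ["C0000001", "C0000002"],
    "img_002": ["C0000003"],
}
tag_index = build_tag_index(train_tags)
encoded = {image_id: to_multi_hot(tags, tag_index)
           for image_id, tags in train_tags.items()}
print(encoded)
```

Build the index from the training annotations only, then reuse it for the validation set, so that the vector positions mean the same thing everywhere.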

## Build a Baseline

* Think of a baseline classifier against which you can measure your efforts.

* That could be a classifier that always produces the most frequent labels.

* Alternatively (and probably better), it could be a classifier that samples from the labels based on their frequency.
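
Both baselines can be sketched with the standard library. The number of tags to emit per image (`k`) is an assumption that you would choose yourself, for instance from the average number of tags per training image; the function names and toy data are hypothetical.

```python
import random
from collections import Counter


def count_tags(train_image_tags):
    """Count how often each tag occurs in the training annotations."""
    return Counter(tag for tags in train_image_tags.values() for tag in tags)


def most_frequent_baseline(train_image_tags, k):
    """Return the k most frequent training tags, to be used for every image."""
    return [tag for tag, _ in count_tags(train_image_tags).most_common(k)]


def sampling_baseline(train_image_tags, k, rng):
    """Sample k distinct tags with probability proportional to frequency."""
    counts = count_tags(train_image_tags)
    tags = list(counts)
    weights = [counts[tag] for tag in tags]
    k = min(k, len(tags))  # cannot return more distinct tags than exist
    chosen = set()
    while len(chosen) < k:
        # Draw one tag at a time and keep only distinct ones.
        chosen.add(rng.choices(tags, weights=weights, k=1)[0])
    return sorted(chosen)


# Hypothetical toy annotations: image ID -> list of tag IDs.
toy_train = {
    "img_001": ["C0000001", "C0000002"],
    "img_002": ["C0000002"],
    "img_003": ["C0000002", "C0000003", "C0000001"],
}
print(most_frequent_baseline(toy_train, 2))
print(sampling_baseline(toy_train, 2, random.Random(0)))
```

Passing a seeded `random.Random` instance keeps the sampling baseline reproducible between runs.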

## Build a Neural Network

* You can use any of the neural network architectures you have seen in class, or any other architecture you may like.

* You are free to try pretrained models. However, be warned that they may demand considerable resources.
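
Whatever architecture you choose, a multi-label model typically outputs one score per tag, which then has to be turned into a set of predicted tags. The sketch below assumes scores in the range 0 to 1 (for example, from sigmoid outputs) and a fixed threshold of 0.5; both are assumptions, and the threshold is something you could tune on the validation set. The function name and toy values are hypothetical.

```python
def decode_tags(scores, index_to_tag, threshold=0.5):
    """Return the tags whose score reaches the threshold.

    `scores[i]` is assumed to be the model's score for `index_to_tag[i]`.
    """
    return [index_to_tag[i] for i, score in enumerate(scores) if score >= threshold]


# Hypothetical tag order and model output for one image.
index_to_tag = ["C0000001", "C0000002", "C0000003"]
scores = [0.91, 0.12, 0.50]

predicted = decode_tags(scores, index_to_tag)
print(predicted)
```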

## Assessment

* For each validation image, measure the [F1 score](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html) of the predicted tags. You can use the evaluation code of the competition, which can be found on the official site, in the Evaluation Methodology section of the [competition web page](https://www.imageclef.org/2019/medical/caption/). That is probably the best course of action, as it avoids getting bogged down in differences between F1 score implementations.

* Calculate the average for all the *validation* images.

* Keep in mind that the best F1 score achieved in the competition was 0.282 on the *test* images.

* If nothing else, you can read the [paper](http://nlp.cs.aueb.gr/pubs/paper_136.pdf) to get an outline of possible solutions (but you can try your own; you do not have to, and are not expected to, mimic previous work).

* After tuning your model on the validation images, you will use it on the test images.

* You must submit:

  * Your notebook, indicating the score you achieved on the validation set.

  * A file with your solutions on the test set, as explained in the Submission Instructions section of the competition web page.
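
For quick checks during development, the per-image F1 score and its average over the validation images can be sketched as below. This is not the official evaluation code: the treatment of edge cases (for example, scoring 1.0 when both the predicted and the gold tag sets are empty) is an assumption, so report scores computed with the competition script. The function names and toy data are hypothetical.

```python
def f1_per_image(predicted, gold):
    """F1 between the predicted and the gold tag sets of one image."""
    predicted, gold = set(predicted), set(gold)
    if not predicted and not gold:
        return 1.0  # assumption: nothing to predict and nothing predicted
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def mean_f1(predictions, gold):
    """Average the per-image F1 over every image in the gold annotations."""
    scores = [f1_per_image(predictions.get(image_id, []), tags)
              for image_id, tags in gold.items()]
    return sum(scores) / len(scores)


# Hypothetical toy data: image ID -> list of tag IDs.
gold = {"img_001": ["C0000001", "C0000002"], "img_002": ["C0000002"]}
predictions = {"img_001": ["C0000001"], "img_002": ["C0000003"]}

print(round(mean_f1(predictions, gold), 4))
```

Images missing from `predictions` are scored as if no tags were predicted for them, which makes forgotten images lower the average instead of silently disappearing from it.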

## Honor Code

You understand that this is an individual assignment, and as such you must carry it out alone. You may seek help on the Internet, by Googling or searching Stack Overflow for general questions pertaining to the use of Python and pandas libraries and idioms. However, it is not right to ask direct questions that relate to the assignment and where people will actually solve your problem by answering them. You may discuss the assignment with your fellow students in order to better understand the questions, if they are not clear enough, but you should not ask them to share their answers with you, or to help you by giving specific advice.