# Named entity recognition project
The goal of this project is to detect persons, organizations and locations mentioned in text.
To do this, we predict a label for each word (token).
The data looks something like this:

| Word           | Label |
|----------------|-------|
| Germany        | I-LOC |
| 's             | O     |
| representative | O     |
| to             | O     |
| the            | O     |
| European       | I-ORG |
| Union          | I-ORG |
| 's             | O     |
| veterinary     | O     |
| committee      | O     |
| Werner         | I-PER |
| Zwingmann      | I-PER |
| said           | O     |
| on             | O     |
| Wednesday      | O     |
| consumers      | O     |
| should         | O     |
| buy            | O     |
| sheepmeat      | O     |
| from           | O     |
| countries      | O     |
| other          | O     |
| than           | O     |
| Britain        | I-LOC |
| until          | O     |
| the            | O     |
| scientific     | O     |
| advice         | O     |
| was            | O     |
| clearer        | O     |
| .              | O     |


Here the label I-LOC refers to locations, I-ORG to organizations and I-PER to persons. All other words are labeled with the O tag. Multiple consecutive words with the same label are considered part of the same entity, e.g. _Werner Zwingmann_ is a single person entity. This topic will be covered in detail during the upcoming lectures, but you can already start planning how you would solve the problem, and you should already be able to build a simple model for the task.
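
The grouping rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `extract_entities` is made up here, and the code assumes that identical consecutive labels always belong to one entity, exactly as stated above, and handles no other tagging conventions.

```python
def extract_entities(tokens, tags):
    """Group consecutive tokens that share the same non-O tag into entities.

    Returns a list of (start, end, label, text) tuples, with `end` exclusive.
    """
    entities = []
    start = None  # index where the current entity began, or None
    # The extra "O" at the end is a sentinel that closes a sentence-final entity.
    for i, tag in enumerate(list(tags) + ["O"]):
        if start is not None and tag != tags[start]:
            label = tags[start].split("-", 1)[-1]  # "I-PER" -> "PER"
            entities.append((start, i, label, " ".join(tokens[start:i])))
            start = None
        if start is None and tag != "O":
            start = i
    return entities


tokens = ["Germany", "'s", "European", "Union", "Werner", "Zwingmann"]
tags = ["I-LOC", "O", "I-ORG", "I-ORG", "I-PER", "I-PER"]
for entity in extract_entities(tokens, tags):
    print(entity)
```

Note that with this rule two different entities of the same type that appear directly next to each other would be merged into one.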

## Data
Training data can be downloaded from https://github.com/glample/tagger .
The data can be converted to JSON format with read_ner.ipynb .

The data is already divided into sentences and tokenized, so you don't have to do any preprocessing. The produced JSON files contain one dictionary per sentence. Each dictionary has a list of tokens and the corresponding list of labels, e.g.:
  {
    "text": [
      "EU",
      "rejects",
      "German",
      "call",
      "to",
      "boycott",
      "British",
      "lamb",
      "."
    ],
    "tags": [
      "I-ORG",
      "O",
      "O",
      "O",
      "O",
      "O",
      "O",
      "O",
      "O"
    ]
  },
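
As a rough sketch of how such data could be read and turned into integer ids, consider the Python snippet below. It assumes that the top level of each file is a JSON list of these dictionaries, which the example above suggests but does not confirm. The inline string stands in for a real file, and `build_index`, `<PAD>` and `<UNK>` are names chosen here for illustration.

```python
import json

# Inline stand-in for one converted file; the actual file names depend on read_ner.ipynb.
raw = """[
  {"text": ["EU", "rejects", "German", "call", "to", "boycott", "British", "lamb", "."],
   "tags": ["I-ORG", "O", "O", "O", "O", "O", "O", "O", "O"]}
]"""

sentences = json.loads(raw)  # for a real file: json.load(open(path, encoding="utf-8"))


def build_index(items, specials=()):
    """Map each distinct item to an integer id, reserving the first ids for specials."""
    index = {s: i for i, s in enumerate(specials)}
    for item in items:
        index.setdefault(item, len(index))
    return index


word2id = build_index((w for s in sentences for w in s["text"]), specials=("<PAD>", "<UNK>"))
tag2id = build_index(t for s in sentences for t in s["tags"])

# Each sentence becomes a pair (word ids, tag ids); unseen words map to <UNK>.
encoded = [
    ([word2id.get(w, word2id["<UNK>"]) for w in s["text"]], [tag2id[t] for t in s["tags"]])
    for s in sentences
]
print(encoded[0])
```

In practice the vocabularies would be built from the training set only, so that words seen only in the development or test data fall back to the unknown-word id.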


# Milestones
## 1.1 Predicting word labels independently
The first part is to train a classifier which assigns a label to each input word independently. Evaluate the results at the token level and at the entity level.
Report your results with different network hyperparameters.
Also discuss whether token-level accuracy is a reasonable metric. How do the pretrained word embeddings influence your predictions?
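
To make the two evaluation levels concrete, here is a minimal Python sketch of token-level accuracy and entity-level precision, recall and F1. It is not an official scorer: the function names are invented for this example, and an entity counts as correct only if its span and label match the gold standard exactly, which is an assumption about how strict the entity-level evaluation should be.

```python
def spans(tags):
    """Return the set of (start, end, label) entities in one tag sequence."""
    found, start = set(), None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last entity
        if start is not None and tag != tags[start]:
            found.add((start, i, tags[start]))
            start = None
        if start is None and tag != "O":
            start = i
    return found


def token_accuracy(gold, pred):
    """Fraction of tokens whose predicted tag equals the gold tag."""
    pairs = [(g, p) for gs, ps in zip(gold, pred) for g, p in zip(gs, ps)]
    return sum(g == p for g, p in pairs) / len(pairs)


def entity_prf(gold, pred):
    """Exact-match precision, recall and F1 over entity spans."""
    tp = n_gold = n_pred = 0
    for gs, ps in zip(gold, pred):
        g, p = spans(gs), spans(ps)
        tp += len(g & p)
        n_gold += len(g)
        n_pred += len(p)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


gold = [["I-ORG", "O", "O", "O", "O", "O", "O", "O", "O"]]
all_o = [["O"] * 9]  # a baseline that never predicts an entity
print(token_accuracy(gold, all_o))
print(entity_prf(gold, all_o))
```

Comparing the two numbers for the all-O baseline is a useful reference point when interpreting your own results.
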
## 1.2 Expand context
Modify your network in such a way that it is able to utilize the surrounding context of the word. This can be done, for instance, with a convolutional or recurrent layer (recurrent neural networks will be discussed later).
Analyze different neural network architectures and hyperparameters. How does utilizing the surrounding context influence the predictions?
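
One simple way to picture "surrounding context" is as a fixed-size window around each word, which is also what a convolutional layer of the same width sees at each position. The Python sketch below builds such windows; `context_windows`, the `size` parameter and the `<PAD>` token are illustrative choices, not part of the project code.

```python
def context_windows(tokens, size=2, pad="<PAD>"):
    """Return, for every token, the token itself plus `size` neighbours on each side."""
    padded = [pad] * size + list(tokens) + [pad] * size
    return [padded[i:i + 2 * size + 1] for i in range(len(tokens))]


for window in context_windows(["EU", "rejects", "German"], size=1):
    print(window)
```

Padding keeps the number of windows equal to the number of tokens, so every word still receives exactly one prediction.
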
## 2.1 Analyze false positive predictions
Look at the entities your model predicts that are not present in the gold standard data. Can you see any patterns in the misclassifications? Do the predicted entities share some similarities with the real entities?
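
A possible starting point for this analysis is to collect every predicted entity that has no exact match in the gold data and then count them by label or by surface text. The following Python sketch does this under a few assumptions: predictions are given as one tag list per sentence, an exact span and label match is required, and the `predictions` shown are made-up values, not real model output.

```python
from collections import Counter


def entities(tokens, tags):
    """Return the set of (start, end, label, text) entities in one sentence."""
    found, start = set(), None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last entity
        if start is not None and tag != tags[start]:
            found.add((start, i, tags[start], " ".join(tokens[start:i])))
            start = None
        if start is None and tag != "O":
            start = i
    return found


def false_positives(sentences, predictions):
    """Collect predicted entities that do not appear in the gold annotation."""
    errors = []
    for sent, pred_tags in zip(sentences, predictions):
        gold = entities(sent["text"], sent["tags"])
        pred = entities(sent["text"], pred_tags)
        errors.extend(sorted(pred - gold))
    return errors


sentences = [{
    "text": ["EU", "rejects", "German", "call", "to", "boycott", "British", "lamb", "."],
    "tags": ["I-ORG", "O", "O", "O", "O", "O", "O", "O", "O"],
}]
# Hypothetical model output, invented for this example.
predictions = [["I-ORG", "O", "I-LOC", "O", "O", "O", "I-LOC", "O", "O"]]

errors = false_positives(sentences, predictions)
by_label = Counter(label for _, _, label, _ in errors)
by_text = Counter(text for _, _, _, text in errors)
print(errors)
print(by_label.most_common())
```

Sorting the counts by frequency makes recurring error types easier to spot than reading individual sentences.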

## 2.2 Analyze the convolutional kernels
Using the example code shown during the lectures, analyze where your convolutional kernels are activating. Are there any clear person-name-related kernels, etc.?
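
Independently of the lecture code, the basic idea can be illustrated in plain Python: slide one kernel over the word vectors of a sentence and see which word windows give the highest activation. The sketch below is a toy version with made-up two-dimensional vectors and a made-up kernel, without bias or nonlinearity; in your own analysis the weights would come from your trained model.

```python
def kernel_activations(vectors, kernel):
    """Slide one convolutional kernel over a sentence.

    vectors: one embedding (list of floats) per token.
    kernel: `width` weight rows, each with the same length as an embedding.
    Returns one activation per window position (no bias, no nonlinearity).
    """
    width = len(kernel)
    activations = []
    for i in range(len(vectors) - width + 1):
        window = vectors[i:i + width]
        activations.append(sum(w * x
                               for k_row, v_row in zip(kernel, window)
                               for w, x in zip(k_row, v_row)))
    return activations


def top_windows(tokens, activations, width, n=3):
    """Return the n token windows with the highest activation."""
    ranked = sorted(range(len(activations)), key=lambda i: activations[i], reverse=True)
    return [(" ".join(tokens[i:i + width]), activations[i]) for i in ranked[:n]]


tokens = ["said", "Werner", "Zwingmann", "on"]
vectors = [[0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # toy embeddings
kernel = [[1.0, 0.0], [1.0, 0.0]]  # toy kernel of width 2

acts = kernel_activations(vectors, kernel)
print(top_windows(tokens, acts, width=2, n=1))
```

Running the same ranking over a whole dataset, one kernel at a time, gives a list of the text windows each kernel responds to most strongly.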

## 3. Add character level information to your model
Add a convolutional layer which reads each word as a sequence of characters. Concatenate its output with the word embeddings before feeding them into the other layers of your model.
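
Before a convolutional layer can read characters, each word has to be turned into a fixed-length sequence of character ids. The Python sketch below shows one way to do this; the function names, the reserved ids 0 (padding) and 1 (unknown character) and the `max_len` cut-off are assumptions made for this example.

```python
PAD_ID, UNK_ID = 0, 1


def build_char_index(words):
    """Map each distinct character to an id, reserving 0 for padding and 1 for unknown."""
    char2id = {"<PAD>": PAD_ID, "<UNK>": UNK_ID}
    for word in words:
        for ch in word:
            char2id.setdefault(ch, len(char2id))
    return char2id


def encode_chars(word, char2id, max_len):
    """Convert a word into exactly `max_len` character ids (truncate or pad on the right)."""
    ids = [char2id.get(ch, UNK_ID) for ch in word[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))


char2id = build_char_index(["EU", "rejects"])
print(encode_chars("EU", char2id, max_len=4))
print(encode_chars("rejects", char2id, max_len=4))
print(encode_chars("Ez", char2id, max_len=4))
```

A sentence then becomes a matrix of shape (number of words, `max_len`), which a character embedding layer followed by a convolution and pooling can reduce to one vector per word.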
