# Named entity recognition
In this tutorial, we'll show you how to get started with a Named Entity Recognition (NER) project.

## What is named entity recognition<a id='ner'></a>

The first key concept to understand is Named Entity Recognition (NER) itself. NER is a sub-task of information extraction that seeks to locate and classify named entities mentioned in unstructured text into predefined categories such as names, organizations, locations, medical codes, time expressions, and quantities. NER has many use cases, such as anonymizing documents that contain personal data. This is especially relevant for protected data such as medical documents or court decisions. To make things more concrete, let's examine an example. Let's say we have this text to label:

![](https://raw.githubusercontent.com/kili-technology/kili-playground/master/recipes/img/getting_started/ner-no-labels.png)

We want to mark which words in this document are names, nouns, adjectives, and verbs:

![](https://raw.githubusercontent.com/kili-technology/kili-playground/master/recipes/img/getting_started/ner-labeled.png)

For a human, recognizing nouns, adjectives, and verbs is easy. A machine, however, has to learn to recognize them from examples, and training a machine learning model requires large quantities of annotated assets. With Kili, you can label your data efficiently. In this tutorial, we will create a NER project step by step.
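To make "locating and classifying entities" concrete, here is a minimal sketch in plain Python. It is a toy dictionary lookup, not a trained NER model: the `LEXICON`, the category names, and the sample sentence are all made up for illustration. It also shows the anonymization use case mentioned above, by replacing each detected span with its category.

```python
import re

# Hypothetical toy lexicon mapping surface forms to categories (not a trained model).
LEXICON = {
    "Alice": "NAME",
    "Paris": "LOCATION",
    "Kili": "ORGANIZATION",
}


def find_entities(text, lexicon=LEXICON):
    """Return sorted (begin, end, category, content) tuples for each lexicon match."""
    entities = []
    for term, category in lexicon.items():
        # \b keeps "Paris" from matching inside a longer word such as "Parisian".
        for match in re.finditer(rf"\b{re.escape(term)}\b", text):
            entities.append((match.start(), match.end(), category, match.group()))
    return sorted(entities)


def anonymize(text, entities):
    """Replace each entity span with its category, working from right to left
    so that earlier offsets stay valid while the text changes length."""
    for begin, end, category, _ in sorted(entities, reverse=True):
        text = text[:begin] + f"[{category}]" + text[end:]
    return text


text = "Alice moved to Paris to work at Kili."
entities = find_entities(text)
print(entities)
print(anonymize(text, entities))
```

A real project would replace the lexicon lookup with a model trained on annotated assets, which is what the labeling steps in this tutorial produce.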

Following are the steps in this tutorial; feel free to jump ahead some steps if you've already done some previous tutorials:

1. [Connecting to Kili](#connecting)
2. [Creating a new named entity recognition project](#creating-project)
3. [Importing assets](#importing-assets)
4. [Labeling](#labeling)
5. [Exporting labels](#exporting-labels)
6. [Quality Management](#quality-management)
7. [More advanced concepts](#advanced-concepts)


## Connecting to Kili <a id="connecting"></a>

The first step is to be able to connect to the platform.

Ask your organization admin to create your profile. When your profile is ready, depending on the authentication implementation you can:
- Sign up and set your password at first login.
- Sign up using the temporary password provided to you by the admin.

If you use the SaaS version of Kili, by default you use the Auth0 login identification, or your company's authentication, if it has been implemented:

<img src="https://raw.githubusercontent.com/kili-technology/kili-playground/master/recipes/img/getting_started/kili-login.png"/>

**Note:**<br>
If you use Kili on premise, you will probably use your own authentication.

If everything succeeds, you should arrive at the projects list page:

![](https://raw.githubusercontent.com/kili-technology/kili-playground/master/recipes/img/getting_started/project-list.png)

## Creating a new named entity recognition project <a id='creating-project'></a>

You can either create a project from the graphical user interface or from the API.

To create a new named entity recognition project from the graphical user interface:
1. Click on the Kili logo in the top-left corner of the screen to get to your project list.
2. From the project list, click `Create New`.
3. Type your project name and description.
4. Select your asset type (`Text`).
5. Select your project type (`Text Named-Entities Recognition`).
6. Click `Save`.

![](https://raw.githubusercontent.com/kili-technology/kili-playground/master/recipes/img/getting_started/new-project.png)

<details>
<summary style="display: list-item;"> Follow these instructions to create a NER project from the API</summary>

From the API, you can create a project with a single call, which allows you to store and share project interfaces:

1. Authenticate:

```python
# Authentication
import os

# !pip install kili # uncomment if you don't have Kili installed already
from kili.client import Kili

api_endpoint = os.getenv('KILI_API_ENDPOINT') # If you use Kili SaaS, use the url 'https://cloud.kili-technology.com/api/label/v2/graphql'

kili = Kili(api_endpoint=api_endpoint)
```

2. Set up project interface:

```python
interface = {
	"jobs": {
		"JOB_0": {
			"mlTask": "NAMED_ENTITIES_RECOGNITION",
			"instruction": "Categories",
			"required": 1,
			"isChild": False,
			"isVisible": True,
			"content": {
				"categories": {
					"INTERJECTION": {
						"name": "Interjection",
						"children": [],
						"color": "#0755FF"
					},
					"NOUN": {
						"name": "Noun",
						"children": [],
						"color": "#EEBA00"
					}
				},
				"input": "radio"
			}
		}
	}
}
```

3. Call the method `create_project`: <a id='command'></a>
```python
result = kili.create_project(
    title='Project Title',
    description='Project Description',
    input_type='TEXT',
    json_interface=interface
)
print(result)
```

When you run the `create_project` command, it outputs the unique identifier of the project. This identifier is used to access and modify the project from the API:

```python
Out: {'id': 'ckm4pmqmk0000d49k6ewu2um5'}
```

Another way to get this project identifier is to look at the URL you're in:

![](https://raw.githubusercontent.com/kili-technology/kili-playground/master/recipes/img/url_project.png)
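
The interface dictionary from step 2 can also be generated from a list of category names, which helps when a project has many entity types. The sketch below is only an illustration: `build_ner_interface` is a hypothetical helper, not part of the Kili SDK, and the last two colors in the palette are arbitrary additions. The job settings mirror the example interface above.

```python
def build_ner_interface(category_names, colors=None, job_name="JOB_0"):
    """Build an interface dict with a single NER job and one category per name."""
    # The first two colors come from the example interface; the others are arbitrary.
    colors = colors or ["#0755FF", "#EEBA00", "#3CD876", "#FF4F6A"]
    categories = {
        # Category keys are upper-cased names, as in the example ("NOUN" -> "Noun").
        name.upper().replace(" ", "_"): {
            "name": name,
            "children": [],
            "color": colors[index % len(colors)],
        }
        for index, name in enumerate(category_names)
    }
    return {
        "jobs": {
            job_name: {
                "mlTask": "NAMED_ENTITIES_RECOGNITION",
                "instruction": "Categories",
                "required": 1,
                "isChild": False,
                "isVisible": True,
                "content": {"categories": categories, "input": "radio"},
            }
        }
    }


interface = build_ner_interface(["Interjection", "Noun"])
print(list(interface["jobs"]["JOB_0"]["content"]["categories"]))
```

The resulting dictionary can be passed as `json_interface` to `create_project` in the same way as the hand-written one.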
</details>

This creates a project with a simple interface with two types of Named Entities: "Interjection" and "Noun". Once logged in, you can see your project in the list of projects: 

![](https://raw.githubusercontent.com/kili-technology/kili-playground/master/recipes/img/getting_started/project-list.png)

Click on your project name. You'll see the project overview:

![](https://raw.githubusercontent.com/kili-technology/kili-playground/master/recipes/img/getting_started/project-overview-text.png)

If you want to modify or view the interface, go to the project Settings page.

Here, you can make modifications using a form:

![](https://raw.githubusercontent.com/kili-technology/kili-playground/master/recipes/img/getting_started/settings-form.png)

... or using a project JSON file:

![](https://raw.githubusercontent.com/kili-technology/kili-playground/master/recipes/img/getting_started/settings-json.png)

For information on how to dynamically modify your interface, refer to [Customizing project interface](https://docs.kili-technology.com/docs/customizing-project-interface).

If you want to go back to the list of projects, click on the `Kili Technology` logo in the top bar.

### Named entities relations

Another task related to information extraction is Named Entities Relation.
Using Named Entities Relation, you can create relationships between your named entities.

To add and configure a Named Entities Relation job:
1. From your project Settings page, click on **Add a new job** and then select "Named Entities Relation".
2. Set up relationships between your object classes.

**Note:**<br>
Some classes must already exist in your project.

![](https://raw.githubusercontent.com/kili-technology/kili-playground/master/recipes/img/getting_started/ner-configuration.png)

## Importing assets <a id='importing-assets'></a>

The next step is to import assets.

You can import assets using either the graphical user interface or the Kili API.

To import assets using the graphical user interface:
1. From the project Queue page, click on `Add assets`.
2. Select whether the files that you want to add to the project are `Hosted by Kili` or should be uploaded `From remote storage`.

**Note:**<br>
Files hosted in remote storage must be provided in the form of a list in .csv format. For additional information and examples, refer to [Adding assets to project](https://docs.kili-technology.com/docs/adding-assets-to-project).<br>
If you want to upload files from remote storage and your license doesn't allow that, contact us at [support@kili-technology.com](mailto:support@kili-technology.com).

3. Simply drag and drop your files to the designated area or click on "click to upload" and upload the files manually.


<img src="https://raw.githubusercontent.com/kili-technology/kili-playground/master/recipes/img/getting_started/upload-assets.png"/>

<details>
<summary style="display: list-item;"> Follow these instructions to import data using the API </summary>

Simply call the [append_many_to_dataset() function](https://python-sdk-docs.kili-technology.com/latest/asset/#kili.mutations.asset.__init__.MutationsAsset.append_many_to_dataset):

```python
kili.append_many_to_dataset(
    project_id="<your project id>", 
    content_array=["<raw text OR path/url of a text file>"],
    external_id_array=["<your identifier of the text asset>"]
)
```


```python
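# Example: create the project with the interface defined earlier, then add an asset

result = kili.create_project(
    title='Project Title',
    description='Project Description',
    input_type='TEXT',
    json_interface=interface
)

project_id = result['id']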

kili.append_many_to_dataset(
    project_id=project_id, 
    content_array=["https://raw.githubusercontent.com/kili-technology/kili-playground/master/recipes/img/car_2.jpg"],
    external_id_array=["car_2.jpg"]
)
```
    
```python
Out: {'id': 'ckm4pmqmk0000d49k6ewu2um5'}
```
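
When the texts to annotate live in a local folder, the two arrays expected by `append_many_to_dataset` can be prepared with a few lines of standard-library Python. This is a sketch under assumptions: `collect_text_assets` is a hypothetical helper, the folder name in the usage comment is a placeholder, and it assumes that raw text is accepted in `content_array` for a text project (check the SDK reference for your version).

```python
from pathlib import Path


def collect_text_assets(folder):
    """Return (content_array, external_id_array) for every .txt file in `folder`.

    Files are sorted by name so that both arrays stay aligned and the
    upload order is reproducible.
    """
    paths = sorted(Path(folder).glob("*.txt"))
    content_array = [path.read_text(encoding="utf-8") for path in paths]
    external_id_array = [path.name for path in paths]
    return content_array, external_id_array


# Usage (hypothetical folder name), once `kili` and `project_id` are defined:
# content_array, external_id_array = collect_text_assets("my_texts")
# kili.append_many_to_dataset(
#     project_id=project_id,
#     content_array=content_array,
#     external_id_array=external_id_array,
# )
```

Using the file name as the external identifier makes it easy to match exported labels back to the original files.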
</details>

## Labeling <a id='labeling'></a>

When you create a project, you automatically become an admin of the project. As admin, you can immediately start labeling assets. If you want to add other project members, refer to [Managing project members](https://docs.kili-technology.com/docs/managing-project-members).

### Opening an asset to start labeling

From the project Queue page, click on the specific asset.
Alternatively, click on the "To do" tab, and then click on **Start labeling**.

You will see the labeling interface:

![](https://raw.githubusercontent.com/kili-technology/kili-playground/master/recipes/img/getting_started/labeling-interface-text.png)

### How to label?

1. Select the entity category (in our example, noun or interjection). You can do that by:
   - Clicking on the round button next to the category name.
   - Using the keyboard shortcut (in our example, "i" for interjection and "n" for noun).
2. Highlight the word or words you want to annotate in your text.
3. Click on **Submit**.

![](https://raw.githubusercontent.com/kili-technology/kili-playground/master/recipes/img/getting_started/ner_annotation.gif)

<details>
<summary style="display: list-item;"> Follow these instructions to add a label from the API </summary>

For that, you need to know the identifier of the asset. You can get it either from the URL when you have an asset open:

![](https://raw.githubusercontent.com/kili-technology/kili-playground/master/recipes/img/asset_id_url.png)

or from the API, by retrieving the assets of the project:

```python
assets = kili.assets(
    project_id=project_id,
    fields=['id']
)
asset_id = assets[0]['id']
print(asset_id)
```

Then define a `json_response` such as:

```python
json_response = {
    "JOB_0": {
        "annotations": [
            {
                "beginId": "__default__",
                "beginOffset": 252,
                "categories": [
                    {
                        "name": "NOUN",
                        "confidence": 100
                    }
                ],
                "content": "Proin",
                "endId": "__default__",
                "endOffset": 257,
                "mid": "2021050514450488-34689"
            }
        ]
    }
}
```

and pass it to `append_labels` together with the asset identifier:

```python
kili.append_labels(
    json_response_array=[json_response],
    asset_id_array=[asset_id]
)
```

Output: `[{'id': 'ckm4pmzlj0009d49k1avaeubv'}]`
    
</details>
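
When labels are added from the API, the character offsets have to match the asset text exactly. The sketch below shows one way to compute them from the text itself. It follows the convention visible in the example response, where `endOffset` equals `beginOffset` plus the length of `content`. `make_annotation` is a hypothetical helper, the sample text is made up, and the `mid` identifier from the example is left out here, so add one if your setup requires it.

```python
def make_annotation(text, content, category, start=0):
    """Build one NER annotation for the first occurrence of `content` in `text`.

    `start` lets you skip earlier occurrences of the same word.
    """
    begin = text.find(content, start)
    if begin == -1:
        raise ValueError(f"{content!r} not found in text")
    return {
        "beginId": "__default__",
        "beginOffset": begin,
        "categories": [{"name": category, "confidence": 100}],
        "content": content,
        "endId": "__default__",
        # End offset is exclusive: text[beginOffset:endOffset] == content.
        "endOffset": begin + len(content),
    }


text = "Lorem ipsum dolor sit amet. Proin porta."
annotation = make_annotation(text, "Proin", "NOUN")
json_response = {"JOB_0": {"annotations": [annotation]}}
print(annotation["beginOffset"], annotation["endOffset"])
```

Deriving the offsets from the text, rather than typing them by hand, avoids annotations that point at the wrong characters.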

## Exporting labels <a id='exporting-labels'></a>

There are two ways to export your labeled project using the graphical user interface:
- From the project Queue page, click on the ellipsis icon (...) and then select “Export all labels”.
- From the project Queue page, select assets whose labels you want to export and then click on the export button at the bottom-right side of the page.


<img src="https://raw.githubusercontent.com/kili-technology/kili-playground/master/recipes/img/getting_started/export-all-labels.png"/>

From the "Export data" popup window, you can customize two export parameters:

- Label format
- Labels to export (scope of exported labels)

<img src="https://raw.githubusercontent.com/kili-technology/kili-playground/master/recipes/img/getting_started/export-data-popup.png"/>

For detailed information on exporting labels, refer to [Exporting project data](https://docs.kili-technology.com/docs/exporting-project-data).

<details>
<summary style="display: list-item;"> Follow these instructions to export labels from the API </summary>

```python
labels = kili.labels(
    project_id=project_id
)
```

To hide sensitive data in the list, use the function below:

```python
def hide_sensitive(label):
    label['author'] = {
        'email': 'email of the author of the label',
        'id': 'identifier of the author of the label',
    }
    return label

result_hidden = [hide_sensitive(label) for label in labels]
result_hidden
```

Our API uses GraphQL. Simply choose the fields you want to fetch by specifying a list:

```python
labels = kili.labels(
    project_id=project_id,
    fields=['id', 'createdAt', 'labelOf.externalId']
)
assert len(labels) > 0
labels
```

    [{'labelOf': {'externalId': 'car_2.jpg'},
      'id': 'ckm4pmzlj0009d49k1avaeubv',
      'createdAt': '2021-03-11T10:10:20.984Z'}]


Of course, you have plenty more options/filters. To access this information, type:


```python
help(kili.labels)
```
</details>
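
Once the labels are exported, a common next step is to flatten them into simple rows for training or analysis. The sketch below assumes each exported label is a dictionary with a `jsonResponse` key shaped like the `json_response` shown in the Labeling section. `extract_entities` is a hypothetical helper, and the sample label reuses the values from that example instead of a real export.

```python
def extract_entities(labels, job_name="JOB_0"):
    """Flatten labels into (content, category, beginOffset, endOffset) rows."""
    rows = []
    for label in labels:
        # Labels without a response for this job are skipped silently.
        job = label.get("jsonResponse", {}).get(job_name, {})
        for annotation in job.get("annotations", []):
            for category in annotation["categories"]:
                rows.append(
                    (
                        annotation["content"],
                        category["name"],
                        annotation["beginOffset"],
                        annotation["endOffset"],
                    )
                )
    return rows


# Sample label built from the example response in this tutorial.
sample_labels = [
    {
        "id": "ckm4pmzlj0009d49k1avaeubv",
        "jsonResponse": {
            "JOB_0": {
                "annotations": [
                    {
                        "beginOffset": 252,
                        "endOffset": 257,
                        "content": "Proin",
                        "categories": [{"name": "NOUN", "confidence": 100}],
                    }
                ]
            }
        },
    }
]
print(extract_entities(sample_labels))
```

With real data, `sample_labels` would be replaced by the result of `kili.labels(...)`, provided that `jsonResponse` is among the requested fields.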

## Quality Management <a id='quality-management'></a>

To make sure that your model performs well, it's essential that your annotations are of good quality.
Using Kili, you have two main ways to measure the quality of annotations: consensus and honeypot.

**Consensus** is the measure of agreement between annotations from different annotators.
**Honeypot** is measured by comparing the annotations of your annotators to a specified *gold standard* that you set before annotators start labeling your assets.
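
To give an intuition for both metrics, the sketch below scores the agreement between two sets of NER annotations as the share of exactly matching spans (Jaccard overlap). This is an illustrative assumption, not the formula Kili uses, and the annotations are made up. Comparing two annotators gives a consensus-style score, while comparing one annotator to a gold standard gives a honeypot-style score.

```python
def span_set(annotations):
    """Reduce annotations to a set of (beginOffset, endOffset, category) triples."""
    return {
        (a["beginOffset"], a["endOffset"], a["categories"][0]["name"])
        for a in annotations
    }


def agreement(annotations_a, annotations_b):
    """Jaccard overlap of exact-match spans: 1.0 is identical, 0.0 is disjoint."""
    a, b = span_set(annotations_a), span_set(annotations_b)
    if not a and not b:
        return 1.0  # Two empty annotation sets agree trivially.
    return len(a & b) / len(a | b)


# Made-up annotations: the gold standard has two nouns, the annotator found one.
gold_standard = [
    {"beginOffset": 0, "endOffset": 5, "categories": [{"name": "NOUN"}]},
    {"beginOffset": 28, "endOffset": 33, "categories": [{"name": "NOUN"}]},
]
annotator = [
    {"beginOffset": 0, "endOffset": 5, "categories": [{"name": "NOUN"}]},
]
print(agreement(gold_standard, annotator))
```

Exact matching is strict: a span that is off by one character counts as a disagreement, so production metrics often use partial-overlap scoring instead.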

To configure quality management metrics, go to the project Settings page and click on “Quality Management”:


![](https://raw.githubusercontent.com/kili-technology/kili-playground/master/recipes/img/getting_started/settings-quality-management.png)

For detailed information, refer to [Quality management](https://docs.kili-technology.com/docs/quality-management).

## More advanced concepts <a id='advanced-concepts'></a>

- [Importing predictions](https://docs.kili-technology.com/recipes/importing-labels-and-predictions)
- [More on Named Entities Recognition](https://docs.kili-technology.com/docs/named-entities-recognition)
- [Reviewing the labels](https://docs.kili-technology.com/docs/reviewing-labeled-assets)
- [Issue/Question system](https://docs.kili-technology.com/docs/handling-questions-and-issues)

For more information on how to operate the Kili API, refer to our [GraphQL API documentation](https://docs.kili-technology.com/reference/graphql-api) or our [SDK reference](https://python-sdk-docs.kili-technology.com/).