# Using PyLighter for NER annotations

This notebook is a step-by-step guide to getting started with PyLighter.

The task in this example is to annotate verbs, persons, organisations, and localities. To do so, let's define the following dataset:

## Defining the corpus

In [None]:
# Defining the corpus of documents to use throughout this notebook
corpus = [
    "PyLighter is an annotation tool for NER tasks directly on Jupyter. "
    + "It aims on helping data scientists easily and quickly annotate datasets. "
    + "This tool was developed by Paylead.",
    "PayLead is a fintech company specializing in transaction data analysis. "
    + "Paylead brings retail and banking together, so customers get rewarded when they buy. " 
    + "Welcome to the data-for-value economy."
]

## Start annotating

In [None]:
# Import the PyLighter annotation tool
from pylighter import Annotation

In [None]:
# Start annotating
Annotation(corpus)

## Changing the labels

NER tasks vary widely, so by default the labels are named _l1_, _l2_, etc.
You could simply remember that _l1_ stands for _verb_, _l2_ for _person_, and so on, but it is easier to rename the labels, like this:

In [None]:
labels_names = ["Verb", "Person", "Org", "Loc"]

In [None]:
annotation = Annotation(corpus, labels_names=labels_names)

Now you can annotate the corpus with the correct label names.

## Retrieving the results

At this point, you should have finished annotating, but you may wonder how to retrieve your annotations. There are two ways:
- Clicking on the save button
- Accessing the labelled corpus directly

### Using the save button

When you click the save button, your annotations are saved to the file _annotation.csv_. They are also saved automatically once you have finished annotating the whole corpus.

The CSV file has two columns: the first contains each element of the corpus, and the second contains its annotation, character by character, in IOB2 format.
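To make the character-by-character IOB2 layout concrete, here is a minimal sketch that builds such a tag sequence from labelled spans. The helper `spans_to_char_iob2` and the `B-<label>` / `I-<label>` / `O` tag strings are assumptions made for illustration; they are not part of PyLighter, and the exact strings PyLighter writes may differ.

```python
def spans_to_char_iob2(text, spans):
    """Hypothetical helper (not part of PyLighter).

    Builds one IOB2 tag per character of `text`.
    `spans` is a list of (start, end, label) tuples, with `end` exclusive.
    """
    tags = ["O"] * len(text)  # every character starts outside any entity
    for start, end, label in spans:
        tags[start] = f"B-{label}"  # first character of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"  # remaining characters of the entity
    return tags


text = "Paylead rewards"
tags = spans_to_char_iob2(text, [(0, 7, "Org")])

# Show each character next to its tag
for char, tag in zip(text, tags):
    print(repr(char), tag)
```

Under these assumptions, the first character of an entity carries a `B-` tag, the following ones carry `I-` tags, and everything else, spaces included, is tagged `O`.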

Note: You can change the path of the save file.

In [None]:
path_to_save_file = "annotation.csv"  # This is the default path

In [None]:
Annotation(
    corpus,
    labels_names=labels_names,
    save_path=path_to_save_file
)

You can now read the contents of that file like this:

In [None]:
import pandas as pd
pd.read_csv(path_to_save_file, sep=";")

### Direct access to the annotations

If you access annotation.labels, you will see the labels of your annotations. You can then do whatever you want with them.

In [None]:
my_annotation = annotation.labels

If you expect to modify them, it is strongly recommended to make a deep copy of annotation.labels:

In [None]:
from copy import deepcopy
my_annotation = deepcopy(annotation.labels)
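As one example of what you might do with the labels, the sketch below groups per-character IOB2 tags back into entities. It assumes that the labels for a document form a list of strings such as `"B-Org"`, `"I-Org"`, and `"O"`, one per character; both that layout and the helper `char_iob2_to_entities` are assumptions made for illustration, so check the actual content of annotation.labels before relying on it.

```python
def char_iob2_to_entities(text, tags):
    """Hypothetical helper (not part of PyLighter).

    Groups per-character IOB2 tags into (entity_text, label) pairs.
    """
    entities = []
    start, label = None, None
    for i, tag in enumerate(tags):
        # Still inside the current entity: keep scanning
        if tag.startswith("I-") and tag[2:] == label:
            continue
        # Any other tag closes the entity that was open, if any
        if start is not None:
            entities.append((text[start:i], label))
            start, label = None, None
        # A B- tag opens a new entity
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    # Close an entity that runs to the end of the text
    if start is not None:
        entities.append((text[start:], label))
    return entities


# Sample data standing in for one document and its labels (assumed layout)
sample_text = "Hi Bob"
sample_tags = ["O", "O", "O", "B-Person", "I-Person", "I-Person"]
entities = char_iob2_to_entities(sample_text, sample_tags)
print(entities)
```

If annotation.labels is aligned with the corpus, one list of tags per document, the same helper could be applied to each pair from `zip(corpus, my_annotation)`.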

## Final note

At this point, you should have everything you need to start annotating. 

Bear in mind that if you come across more specific use cases, PyLighter offers more tools that are likely to meet your needs. You can read more about them in the README.md or in the Advanced usage demo.