# API test

See: https://github.com/informagi/REL/

The public Radboud API at https://rel.cs.ru.nl/api can be used directly. Alternatively, you can run your own API server with:

```
docker run \
    -p 5555:5555 \
    -v $PWD/data/:/workspace/data \
    --rm -it informagi/rel \
    python -m REL.server --bind 0.0.0.0 --port 5555 /workspace/data wiki_2019
```

Note that you need to run this command from the REL directory. The system will look for the 2019 Wikipedia data, stored in `REL/data/wiki_2019`.

(Command taken from the GitHub page mentioned above.)

In [1]:
import requests
import time

API_URL = "https://rel.cs.ru.nl/api"
# API_URL = "http://0.0.0.0:5555"
text_doc = "If you're going to try, go all the way - Charles Bukowski"
text_doc = """Vladimir Putin’s decision to order Russian nuclear forces to be put on high alert is a “bone-chilling development”, United Nations chief Antonio Guterres said.
Speaking in New York, the UN secretary-general said the once “unthinkable” prospect of nuclear conflict was back within the realm of possibility.
He added that the UN will allocate a further $40 million from its Central Emergency Response fund to ramp up humanitarian aid for Ukraine.
The funds will help get critical supplies of food, water, medicines and other vital supplies into the country, as well as providing cash assistance to those in need, he said
"""

In [2]:
# Example Entity Linking (EL)
start_time = time.time()
el_result = requests.post(API_URL, json={
    "text": text_doc,
    "spans": []
}).json()
print(f"elapsed time: {round(time.time()-start_time, 1)} seconds")

elapsed time: 6.2 seconds


In [3]:
[ result[2:5] for result in el_result ]

[['Vladimir Putin', 'Vladimir_Putin', 0.9243598334905461],
 ['Russian', 'Russia', 0.4643396647555494],
 ['United Nations', 'United_Nations', 0.962285854706128],
 ['Antonio Guterres', 'António_Guterres', 0.38727825954758405],
 ['New York', 'New_York_City', 0.5235067088881321],
 ['UN', 'United_Nations', 0.9128588007908953],
 ['UN', 'United_Nations', 0.9132827401506348],
 ['Ukraine', 'Ukraine', 0.5323545395081096]]
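The slice `result[2:5]` above picks the mention, the linked entity and the confidence score out of each row. As a minimal sketch, the same slice can be wrapped in named records so that low-scoring links are easier to filter. The names `Link`, `to_links` and `low_confidence` are hypothetical helpers, not part of REL, and the offsets in the sample rows are placeholders.

```python
from typing import NamedTuple


class Link(NamedTuple):
    """One linked mention; the field names are illustrative, not part of REL."""
    mention: str
    entity: str
    confidence: float


def to_links(rows):
    """Convert API rows to Link records using the same [2:5] slice as above."""
    return [Link(*row[2:5]) for row in rows]


def low_confidence(links, threshold=0.5):
    """Return the links whose confidence score falls below the threshold."""
    return [link for link in links if link.confidence < threshold]


# Two rows taken from the output above; the values at index 0 and 1 are
# placeholders (assumed here to be offsets, which the output does not show).
sample_rows = [
    [0, 14, "Vladimir Putin", "Vladimir_Putin", 0.9243598334905461],
    [35, 7, "Russian", "Russia", 0.4643396647555494],
]
links = to_links(sample_rows)
uncertain = low_confidence(links)
print([link.mention for link in uncertain])
```

With the real response, `to_links(el_result)` would be used instead of the sample rows.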

In [4]:
# missed entity

text_doc[372:403]

'Central Emergency Response fund'

Note: the API processes text in case-insensitive mode by default. Case-sensitive processing might pick up this entity, but the mode does not appear to be configurable through the API: `REL/server.py` line 67 reads `text, spans = self.read_json(post_data)`, so `text` and `spans` are the only JSON fields it accepts.

In [5]:
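# Example Entity Disambiguation (ED) for a given span
# Caution: the example in the REL README passes spans as (start, length),
# e.g. (41, 16); the (start, end) pair used below may explain the empty result.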
start_time = time.time()
ed_result = requests.post(API_URL, json={
    "text": text_doc,
    "spans": [(372, 403)]
}).json()
print(f"elapsed time: {round(time.time()-start_time, 1)} seconds")

elapsed time: 0.2 seconds


In [6]:
ed_result

[]

In [7]:
text_doc[372:403]

'Central Emergency Response fund'
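Hard-coded offsets such as 372 and 403 are easy to get wrong, so it may help to derive the span from the text itself. The sketch below assumes the `(start, length)` span convention used in the REL README example, where `(41, 16)` marks "Charles Bukowski". `find_span` is a hypothetical helper, not part of REL.

```python
def find_span(text, mention):
    """Return (start, length) of the first occurrence of mention, or None."""
    start = text.find(mention)
    if start == -1:
        return None
    return (start, len(mention))


example = "If you're going to try, go all the way - Charles Bukowski"
span = find_span(example, "Charles Bukowski")
print(span)  # (41, 16)

# The span can be checked by slicing the text with start and start + length.
start, length = span
assert example[start:start + length] == "Charles Bukowski"
```

Under the same assumption, `find_span(text_doc, "Central Emergency Response fund")` would give the span to pass in the `spans` field.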


## Efficiency tests with REL code

The script `REL/scripts/efficiency_test.py` measures run times (in seconds). Instructions:

1. go to the REL directory: `cd REL`
2. activate the project's Python environment: `source ../venv3/bin/activate`
3. optional: install the REL software: `pip install .`
4. run the script: `python3 scripts/efficiency_test.py`

Result:

| NER   | Wiki | Model       | Time MD (s) | Time ED (s) | Precision MD | Recall MD | Precision ED | Recall ED |
|:-----:|:----:|:-----------:|:-----------:|:-----------:|:------------:|:---------:|:------------:|:---------:|
| Flair | 2014 | Without GPU |        47.0 |         7.7 |        88.2% |     73.5% |        70.6% |     66.2% |
| Flair | 2019 | Without GPU |        50.0 |         6.1 |        88.7% |     74.3% |        66.3% |     65.7% |
| Bert  | 2014 | Without GPU |        59.7 |         8.6 |        17.9% |     26.7% |        71.0% |     20.6% |
| Bert  | 2019 | Without GPU |        57.0 |         8.4 |        17.9% |     26.7% |        66.3% |     19.5% |
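The table lists precision and recall separately. For a single figure per row, the two can be combined into an F1 score (their harmonic mean), as in the sketch below. The resulting values are derived from the table, not measured separately, and they are not directly comparable to the macro and micro F1 scores that Gerbil reports. `f1_score` is a hypothetical helper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) in percent for the MD step, copied from the table above.
md_scores = {
    ("Flair", 2014): (88.2, 73.5),
    ("Flair", 2019): (88.7, 74.3),
    ("Bert", 2014): (17.9, 26.7),
    ("Bert", 2019): (17.9, 26.7),
}

for (ner, wiki), (precision, recall) in md_scores.items():
    print(f"{ner} {wiki}: MD F1 = {f1_score(precision, recall):.1f}%")
```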

# Performance tests with Gerbil

We cannot use https://gerbil.aksw.org/gerbil/config because port 1235 is closed by a firewall. Verify this with the command `nmap -p 1235 HOSTNAME`.
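Where `nmap` is not installed, the port check can be approximated from Python with a plain TCP connection attempt. This is a minimal sketch: `port_is_open` is a hypothetical helper, and unlike `nmap` it only reports whether the connection succeeded, so it cannot tell a closed port from a filtered one.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, `port_is_open("HOSTNAME", 1235)` would be run from the machine that has to reach the middleware, with `HOSTNAME` replaced as in the `nmap` command.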

1. go to the project directory: `cd $HOME/software/gerbil`
2. start Gerbil: `./start.sh`
3. open Gerbil in a browser: http://0.0.0.0:1234/gerbil/config
4. go to the middleware directory: `cd $HOME/projects/rel20/REL/scripts/gerbil_middleware`
5. start the middleware: `mvn clean -Dmaven.tomcat.port=1235 tomcat:run`
6. go to the REL directory: `cd $HOME/projects/rel20/REL`
7. start the server: `python3 REL/server.py --ed-model ed-wiki-2019 data wiki_2019`

Note that steps 3, 5 and 7 need to run simultaneously. Then enter this information in the Gerbil configuration form:

| Field            | Value |
| ---------------- | ----- |
| url:             | http://0.0.0.0:1234/gerbil/config |
| Experiment type  | A2KB|
| Matching         | Ma - strong annotation match|
| Annotator Name   | Test |
| Annotator URI    | http://0.0.0.0:1235/gerbil-spotWrapNifWS4Test/myalgorithm |
| Dataset          | AIDA CoNLL--Test B |


Results:

|  Exp | NER   | Wiki | Model       | server    | Time  | InKB Macro F1 | InKB Micro F1: ER |      |
| ---- | ----- | ---- | ----------- | --------- | ----  | ------------- | ----------------- |  --- |
| paper| Flair | 2014 |             |           |       | 0.813         | 0.833             |      |
| paper| Flair | 2019 |             |           |       | 0.786         | 0.805             |      |
| 19.3 | Flair | 2014 | Without GPU | server.py | 0.525 | 0             | 0                 |      |
| 19.4 | Flair | 2019 | Without GPU | server.py | 0.512 | 0             | 0                 |      |
| 19.2 | Bert  | 2014 | Without GPU | server.py | 0.510 | 0             | 0                 |      |
| 19.1 | Bert  | 2019 | Without GPU | server.py | 0.490 | 0             | 0                 |      |
| 19.7 | Bert  | 2014 | Without GPU | docker+   | 0.536 | 0             | 0                 |      |
| 19.6 | Flair | 2014 | Without GPU | docker    | 0.682 | 0.635         | 0.725             |      |
| 19.5 | Flair | 2019 | Without GPU | docker    | 0.685 | 0.697         | 0.703             |  <== |
| 22.5 | Flair | 2019 | Without GPU | server.py | 2.873 | 0.251         | 0.250             |  <== |
| 26.4 | Flair | 2014 | Without GPU | docker    | 2.908 | 0.620         | 0.589             |      |

"paper" refers to Van Hulst et al. (2020): [REL: An Entity Linker Standing on the Shoulders of Giants](https://arxiv.org/pdf/2006.01969.pdf) (Table 1).

"docker+" is a new Docker image built from the modified REL.

The 19.X experiments were run on processors that are four times as fast as those of the 22.X experiment.

What causes the differences between the two lines marked <==?
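One way to investigate this question is to send the same document to both servers and compare the links they return. The sketch below only covers the comparison step. It assumes the rows have the layout seen in the API test (confidence score at index 4) and compares the four fields before the score. `compare_runs` is a hypothetical helper, and the sample rows are made up for illustration, not measured output.

```python
def link_set(rows):
    """Reduce API rows to a set of their first four fields, ignoring scores."""
    return {tuple(row[:4]) for row in rows}


def compare_runs(rows_a, rows_b):
    """Return the links found only in run A, only in run B, and in both."""
    a, b = link_set(rows_a), link_set(rows_b)
    return {"only_a": a - b, "only_b": b - a, "both": a & b}


# Illustrative rows only: in practice each list would be the JSON response
# of one server for the same input text.
rows_docker = [
    [0, 14, "Vladimir Putin", "Vladimir_Putin", 0.92],
    [35, 7, "Russian", "Russia", 0.46],
]
rows_server = [
    [0, 14, "Vladimir Putin", "Vladimir_Putin", 0.90],
]

report = compare_runs(rows_docker, rows_server)
for key, links in report.items():
    print(key, sorted(links))
```

Links that appear in only one run point at the documents and mentions where the two setups diverge, which narrows down whether the gap comes from mention detection or from disambiguation.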

Comparison with paper results

| Exp   | NER   | Wiki | server | dataset   | Time  | InKB Macro F1 | InKB Micro F1 |
| ----- | ----- | ---- | ------ | --------- | ----- | ------------- | ------------- |
| paper | Flair | 2014 |        | AIDA-B    |       | **0.813**         | **0.833**         |
| 26.4  | Flair | 2014 | docker | AIDA-B    | 2.908 | 0.620         | 0.589         |
| 29.7  | Flair | 2014 | server | AIDA-B    | 2.942 | 0.292         | 0.249         |
| paper | Flair | 2014 |        | MSNBC     |       | **0.732**         | **0.744**         |
| 29.1  | Flair | 2014 | docker | MSNBC     | 3.296 | 0.700         | 0.708         |
| 29.8  | Flair | 2014 | server | MSNBC     | 3.154 | 0.265         | 0.245         |
| paper | Flair | 2014 |        | OKE-2015  |       | 0.615         | 0.648         |
| 29.5  | Flair | 2014 | docker | OKE-2015  | 0.715 | 0.613         | 0.643         |
| 29.9  | Flair | 2014 | server | OKE-2015  | 0.703 | 0.214         | 0.253         |
| paper | Flair | 2014 |        | OKE-2016  |       | 0.575         | 0.588         |
| 29.6  | Flair | 2014 | docker | OKE-2016  | 0.534 | **0.822**         | **0.824**         |
| 29.10 | Flair | 2014 | server | OKE-2016  | 0.645 | 0.267         | 0.248         |
| paper | Flair | 2014 |        | N3Reuters |       | 0.468         | 0.497         |
| 29.0  | Flair | 2014 | docker | N3Reuters | 1.390 | 0.459         | 0.473         |
| 29.11 | Flair | 2014 | server | N3Reuters | 1.621 | 0.294         | 0.259         |
| paper | Flair | 2014 |        | N3-RSS-500|       | 0.359         | 0.343         |
| 29.2  | Flair | 2014 | docker | N3-RSS-500| 0.468 | 0.361         | 0.339         |
| 29.12 | Flair | 2014 | server | N3-RSS-500| 0.516 | 0.198         | 0.121         |
| paper | Flair | 2014 |        | Derczynski|       | 0.381         | 0.412         |
| 29.3  | Flair | 2014 | docker | Derczynski| 0.413 | 0.405         | 0.400         |
| 29.13 | Flair | 2014 | server | Derczynski| 0.451 | 0.321         | 0.169         |
| paper | Flair | 2014 |        | KORE50    |       | **0.601**         | **0.616**         |
| 29.4  | Flair | 2014 | docker | KORE50    | 0.312 | 0.401         | 0.429         |
| 29.14 | Flair | 2014 | server | KORE50    | 0.375 | 0.173         | 0.200         |
