Can obtain nested tags #11

PabloMosUU · 2021-09-22T13:46:16Z

If you input this text into Deduce:
'ADHD Adres: Naamlaan 100 Woonplaats: 3512AB Apeldoorn Tel: 088-1234567'
and run deduce.annotate_text, you will obtain:
'ADHD <PERSOON Adres: <LOCATIE Naamlaan 100> Woonplaats: 3512AB Apeldoorn Tel>: <TELEFOONNUMMER 088-1234567>'
which includes nested tags. There is clearly a separate problem here: the entire string from "Adres" to "Tel" is detected as a person's name. The problem I'm pointing out, however, is that after detecting that span, Deduce then finds a LOCATIE tag inside the PERSOON tag, so the final output contains nested tags, which should not be allowed.

This should be easy to fix by moving the flatten_text call, which currently happens within the "names" deidentification, to the very end of the annotate_text method, right before the final text is returned. Do you agree with this?
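
To make the proposal concrete, here is a minimal sketch of what a final flattening pass could look like. It is not Deduce's actual flatten_text: the function name flatten_nested_tags is hypothetical, the `<LABEL text>` tag format is inferred from the output above, and the choice to keep the outermost label and drop inner tag markers is an assumption. It also assumes the tagged text contains no literal ">" characters.

```python
import re

# Assumed tag format, inferred from the example output: "<LABEL text>"
TAG_OPEN = re.compile(r"<([A-Z]+) ")


def flatten_nested_tags(text: str) -> str:
    """Hypothetical helper: keep outermost tags, strip markers of inner tags."""
    out = []
    depth = 0  # number of tags currently open
    i = 0
    while i < len(text):
        m = TAG_OPEN.match(text, i)
        if m:
            if depth == 0:
                out.append(m.group(0))  # keep the outermost opening marker
            depth += 1
            i = m.end()
        elif text[i] == ">" and depth > 0:
            depth -= 1
            if depth == 0:
                out.append(">")  # keep the outermost closing marker
            i += 1
        else:
            out.append(text[i])  # ordinary character, copied unchanged
            i += 1
    return "".join(out)


annotated = (
    "ADHD <PERSOON Adres: <LOCATIE Naamlaan 100> Woonplaats: "
    "3512AB Apeldoorn Tel>: <TELEFOONNUMMER 088-1234567>"
)
print(flatten_nested_tags(annotated))
```

Running such a pass once, as the last step before the text is returned, would cover nesting introduced by any annotator rather than only by the names step.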

vmenger · 2021-09-23T16:40:46Z

Thanks, I'll let you know once I have found the time to understand exactly what is happening in this issue/PR.

PabloMosUU · 2021-09-24T07:23:23Z

The main question is why this:
"Adres: Naamlaan 100 Woonplaats: 3512AB Apeldoorn Tel"
gets annotated as a single PERSOON by annotate_names.

However, once you accept that this is the case, what happens next is quite simple: "Naamlaan 100", which lies inside the text already annotated as PERSOON, gets recognized as an address and annotated as well, so we end up with nested tags.
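
A small check like the following could serve as a regression test for this behaviour. It is only a sketch: max_tag_depth is a hypothetical helper, not part of Deduce, and it assumes the `<LABEL text>` tag format seen in the output above, with no literal ">" inside tagged text.

```python
import re

# Assumed tag format, inferred from the example output: "<LABEL text>"
TAG_OPEN = re.compile(r"<[A-Z]+ ")


def max_tag_depth(text: str) -> int:
    """Hypothetical helper: return the deepest tag nesting level in text."""
    depth = 0
    deepest = 0
    i = 0
    while i < len(text):
        m = TAG_OPEN.match(text, i)
        if m:
            depth += 1
            deepest = max(deepest, depth)
            i = m.end()
        else:
            if text[i] == ">" and depth > 0:
                depth -= 1  # closes the most recently opened tag
            i += 1
    return deepest


nested = (
    "ADHD <PERSOON Adres: <LOCATIE Naamlaan 100> Woonplaats: "
    "3512AB Apeldoorn Tel>: <TELEFOONNUMMER 088-1234567>"
)
# A depth greater than 1 means some tag sits inside another tag.
print(max_tag_depth(nested) > 1)
```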

PabloMosUU assigned vmenger Sep 22, 2021

PabloMosUU mentioned this issue Sep 23, 2021

fix bugs related to adjacent tags and nested tags #13

Merged

PabloMosUU linked a pull request Sep 23, 2021 that will close this issue

fix bugs related to adjacent tags and nested tags #13

Merged

vmenger closed this as completed Oct 5, 2021
