Can obtain nested tags #11

Closed
PabloMosUU opened this issue Sep 22, 2021 · 2 comments · Fixed by #13

@PabloMosUU
Collaborator

If you input this text into Deduce:
'ADHD Adres: Naamlaan 100 Woonplaats: 3512AB Apeldoorn Tel: 088-1234567'
and run deduce.annotate_text, you will obtain:
'ADHD <PERSOON Adres: <LOCATIE Naamlaan 100> Woonplaats: 3512AB Apeldoorn Tel>: <TELEFOONNUMMER 088-1234567>'
which includes nested tags. There is clearly a problem in that the entire string from "Adres" to "Tel" is detected as a person's name. However, the problem I'm pointing out here is a different one: having detected that span, Deduce then finds a LOCATIE tag within the PERSOON tag, so the final output contains nested tags, which should not be allowed.

This should be easy to fix by moving the flatten_text call, which currently happens within the "names" deidentification, to the very end of the annotate_text method, right before the final text is returned. Do you agree with this?
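To make the idea concrete, here is a minimal sketch of what a final flattening pass could do. It is not Deduce's actual flatten_text: the function name `flatten_nested_tags` is hypothetical, it assumes tags always have the form `<LABEL text>`, and it assumes `<` and `>` never occur in the text except as tag delimiters. It keeps the outermost tag and drops any tags nested inside it.

```python
def flatten_nested_tags(text: str) -> str:
    """Keep only outermost <LABEL ...> tags; strip the markup of inner tags.

    Hypothetical helper for illustration, not part of Deduce.
    """
    out = []
    depth = 0  # number of tags currently open
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "<":
            depth += 1
            if depth > 1:
                # Inner tag: skip "<LABEL " and keep only the enclosed text.
                space = text.find(" ", i)
                if space == -1:
                    raise ValueError("malformed tag: no space after label")
                i = space + 1
                continue
            out.append(ch)
        elif ch == ">" and depth > 0:
            depth -= 1
            if depth == 0:
                # Only the closing bracket of an outermost tag is kept.
                out.append(ch)
        else:
            out.append(ch)
        i += 1
    return "".join(out)


nested = (
    "ADHD <PERSOON Adres: <LOCATIE Naamlaan 100> Woonplaats: "
    "3512AB Apeldoorn Tel>: <TELEFOONNUMMER 088-1234567>"
)
print(flatten_nested_tags(nested))
```

Run once on the final text, a pass like this removes nesting no matter which annotator introduced it. Whether the outer or the inner label should win is a separate design choice; this sketch simply keeps the outer one.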

@vmenger
Owner

vmenger commented Sep 23, 2021

Thanks, I'll let you know once I have found the time to understand exactly what is happening in this issue/PR.

@PabloMosUU
Collaborator Author

The main question is why this:
"Adres: Naamlaan 100 Woonplaats: 3512AB Apeldoorn Tel"
gets annotated as a single PERSOON by annotate_names.

However, once you accept that this is the case, what is happening is quite simple: "Naamlaan 100", which lies within the text already annotated as PERSOON, gets recognized as an address and annotated in turn, so we end up with nested tags.
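A small check like the following could serve as a regression test for this behaviour. It is only a sketch: `max_tag_depth` is a hypothetical helper, not a Deduce function, and it assumes `<` and `>` appear in the text only as tag delimiters. A depth greater than 1 means the output contains nested tags.

```python
def max_tag_depth(text: str) -> int:
    """Return the deepest nesting level of <...> tags in an annotated text.

    Hypothetical helper for illustration, not part of Deduce.
    """
    depth = 0
    deepest = 0
    for ch in text:
        if ch == "<":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == ">" and depth > 0:
            depth -= 1
    return deepest


# The output reported in this issue: LOCATIE sits inside PERSOON.
reported = (
    "ADHD <PERSOON Adres: <LOCATIE Naamlaan 100> Woonplaats: "
    "3512AB Apeldoorn Tel>: <TELEFOONNUMMER 088-1234567>"
)
print(max_tag_depth(reported))
```

A test could then assert that the depth of the text returned by annotate_text never exceeds 1.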

@vmenger vmenger closed this as completed Oct 5, 2021