Cannot detect values? #21

Closed
Fati-Hei opened this issue Oct 7, 2021 · 3 comments

Comments

Fati-Hei commented Oct 7, 2021

No description provided.

Collaborator

plison commented Oct 11, 2021

Could you give a concrete example with some simple sentences we could try out? The code you provide looks a priori fine (apart from the fact that you also need to create a FunctionAnnotator for your st_detector function and run it on your documents).

Collaborator

plison commented Oct 13, 2021

But as far as I can see, this doesn't seem to be a problem with skweak, but with the functions standards_detector and st_detector that you implemented.

For instance, the st_detector function relies on having separate tokens for the "NS-EN" and the numbers that come after it -- which means it won't work on phrases such as "NS-EN12845". And your function is also limited to handling two tokens (since you only check whether the current token starts with a digit), so it's not surprising that it doesn't recognise the full phrase NS-EN 12845 2020.

Collaborator

plison commented Oct 15, 2021

Well, it's simply that the loop you have written in your function does not properly handle two consecutive tokens with numerical values.
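
To make the points above concrete, here is a minimal sketch of a detector loop that accepts a fused token such as "NS-EN12845" and also extends over any number of consecutive numeric tokens. It is not the original st_detector (which is not shown in this thread): the name `st_detector_sketch`, the `PREFIX` pattern and the "STANDARD" label are all hypothetical, and it works on a plain list of token strings rather than a spaCy document, assuming "NS-EN" arrives as a single token.

```python
import re

# Hypothetical pattern: the prefix "NS-EN", optionally fused with digits ("NS-EN12845").
PREFIX = re.compile(r"^NS-EN(\d+)?$")


def st_detector_sketch(tokens, label="STANDARD"):
    """Yield (start, end, label) spans over a list of token strings (end is exclusive)."""
    i = 0
    while i < len(tokens):
        match = PREFIX.match(tokens[i])
        if match:
            end = i + 1
            # Extend over every consecutive token that starts with a digit,
            # instead of checking only the single token after the prefix.
            while end < len(tokens) and tokens[end][:1].isdigit():
                end += 1
            # Emit a span only if a number is present, either fused into the
            # prefix token or found in the following tokens.
            if end > i + 1 or match.group(1):
                yield i, end, label
            i = end
        else:
            i += 1


spans = list(st_detector_sketch(["See", "NS-EN", "12845", "2020", "for", "details"]))
print(spans)
```

In skweak, the same logic would presumably iterate over the tokens of a spaCy document (using `tok.text`) and be wrapped in a FunctionAnnotator, as mentioned earlier in this thread, before being run on the documents.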

plison closed this as completed Oct 15, 2021