Error in the .start and .end attributes #15
Comments
Hi Felipe, I've also had this problem before. It seems like a bug in cogroo4.
Apparently the _preproc function changes the number of characters in the text. Could this be the source of the bug? Take a look. Using the same input:
we can see:
What do you think?
I got a satisfactory result using the following function to run Cogroo4py with
Perhaps we can implement a more sophisticated version of it in the library.
Good catch! So this is a problem caused by the pre-processing of texts in https://github.com/gpassero/cogroo4py/blob/master/python/cogroo4py/cogroo.py#L219. If I remember right, bad things happened when this wasn't done, but I can't remember exactly what. A comment justifying this step would have been nice.
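To illustrate the mechanism discussed above, here is a minimal sketch of how a length-changing preprocessing step shifts offsets, and how an index map can translate them back to the original text. The function names `preprocess_with_map` and `to_original_span` are hypothetical, and collapsing whitespace is only an assumed stand-in; what cogroo4py's `_preproc` actually does may differ.

```python
import re


def preprocess_with_map(text):
    """Hypothetical preprocessing: collapse each whitespace run to one space.

    Returns the processed text plus a list that maps every character
    position in the processed text back to its position in the original.
    """
    out_chars = []
    index_map = []
    for match in re.finditer(r"\s+|\S", text):
        chunk = match.group()
        out_chars.append(" " if chunk.isspace() else chunk)
        index_map.append(match.start())
    return "".join(out_chars), index_map


def to_original_span(start, end, index_map):
    """Translate a non-empty [start, end) span from processed to original offsets."""
    return index_map[start], index_map[end - 1] + 1


original = "Oi.  Refiro-me   ao texto."
processed, index_map = preprocess_with_map(original)

# Offsets computed on the processed text, as an analyzer would report them.
start = processed.find("texto")
end = start + len("texto")

# Used directly on the original text, the offsets point at the wrong slice.
naive_slice = original[start:end]

# Mapped back through index_map, they select the intended token.
fixed_start, fixed_end = to_original_span(start, end, index_map)
fixed_slice = original[fixed_start:fixed_end]
```

The drift grows with every character removed before the token, which would be consistent with short inputs looking correct and longer ones not.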
I'm not sure whether this is specific to cogroo4py or whether it applies to Cogroo as a whole.
Here is the code showing the error:
CODE1:
OUTPUT1:
CODE2:
OUTPUT2:
As shown, there appears to be an error in the .start and .end attributes in cogroo4py when the input sentence is longer. When only the excerpt containing the error is used, the returned values are correct. So it may depend on whether or not there is a '.' before 'Refiro-me'. I also ran one last test:
CODE3:
OUTPUT3:
Reproducibility information:
Code was run in the Jupyter Notebook in VS Code.
OS: Windows 10 Home
Python 3.10
Execution takes place in a Python virtual environment
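For anyone who wants to check their own input for this problem, a small sanity check along these lines may help. It is only a sketch: it works on plain `(lexeme, start, end)` tuples rather than on cogroo4py objects, and the helper name `find_misaligned_spans` and the sample offsets are hypothetical.

```python
def find_misaligned_spans(text, tokens):
    """Return the tokens whose [start, end) slice of text differs from the lexeme.

    tokens is an iterable of (lexeme, start, end) tuples; how these are
    extracted from an analyzer's output is left to the caller.
    """
    return [
        (lexeme, start, end, text[start:end])
        for lexeme, start, end in tokens
        if text[start:end] != lexeme
    ]


text = "Oi. Refiro-me ao texto."
# Illustrative offsets: the last token is shifted by one character.
tokens = [("Oi", 0, 2), ("Refiro-me", 4, 13), ("texto", 18, 23)]

misaligned = find_misaligned_spans(text, tokens)
for lexeme, start, end, actual in misaligned:
    print(f"{lexeme!r}: text[{start}:{end}] is {actual!r}")
```

An empty result means every reported span matches its token in the text that was checked.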