Feature/spacy detector #64
Nice work, thanks! I haven't had a chance to look through it properly, but two things popped out:
I look forward to playing with this next week! Thanks for the commits! 😁
100%. This is highly unfinished. I am planning to write tests shortly, to get coverage to the same level as before this PR (or ideally higher). I just wanted to open a draft PR to get early feedback.
Yes, that also occurred to me as I was playing with it. Will implement it 👍!
I have now implemented the changes pointed out above and added some initial tests for the detector. At the moment the build fails on Python 3.5 because spaCy doesn't support it, so I'm wondering how you want to handle that. Should I add the spaCy detector as an "extra", or should I allow the tests to fail? I believe we can make Travis skip tests for specific environments, but that will require some investigation. Adding it as an extra requirement would let anyone who only wants "vanilla" scrubadub keep using it as is, while anyone who wants the spaCy transformer models (which are arguably much slower) could run: pip install scrubadub[spacy]
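To make the "extra" idea concrete, here is a minimal sketch of how it might look. The `EXTRAS_REQUIRE` mapping and the `spacy_available` helper are hypothetical names for illustration, not code from this PR; the mapping is the kind of value one would pass to `setuptools.setup(extras_require=...)`, and the unpinned `spacy` requirement is an assumption.

```python
import importlib.util

# Hypothetical extras mapping, of the kind passed to
# setuptools.setup(extras_require=EXTRAS_REQUIRE).
# With it, `pip install scrubadub[spacy]` would also install spaCy.
EXTRAS_REQUIRE = {
    "spacy": ["spacy"],  # assumption: no version pin
}


def spacy_available():
    """Return True if the optional spaCy dependency can be imported."""
    return importlib.util.find_spec("spacy") is not None
```

A detector module could check `spacy_available()` before registering itself, so a "vanilla" install keeps working without spaCy.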
Ah. Yeah, including it as an extra seems very sensible. Python 3.5 is almost end of life, but I believe it's still the Python in Ubuntu LTS 18.04... Don't worry about it in the MR, I'll work something out...
Hi @thomasbird, I believe I am happy with the initial functionality, and I have added the tests. On Python 3.8 coverage is over 98%; however, since the functionality doesn't work on 3.5, coverage decreases there (I set the tests to skip). Let me know what you think would be the best approach for Coveralls. Happy to hear your thoughts on the MR in general.
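For reference, skipping on unsupported interpreters can be sketched roughly like this with the standard library's `unittest.skipIf`. The class name, the placeholder test, and the 3.6 minimum are assumptions for illustration; the thread only establishes that spaCy does not work on 3.5.

```python
import sys
import unittest

# Assumption: spaCy needs at least Python 3.6; the thread only says 3.5 fails.
SPACY_SUPPORTED = sys.version_info >= (3, 6)


@unittest.skipIf(not SPACY_SUPPORTED, "spaCy detector is unavailable on this Python")
class SpacyDetectorTest(unittest.TestCase):
    """Hypothetical test case; the real detector tests would go here."""

    def test_runs_only_where_supported(self):
        # Only reached when the class-level skip condition is False.
        self.assertTrue(SPACY_SUPPORTED)
```

Skipped tests still leave the detector's lines unexecuted on 3.5, which is why coverage drops for that environment.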
Awesome work! Thanks for the commits!
Thanks for merging this. I believe this closes #18.
Current stats for the old entity detector:
Current stats for the new spaCy detector: