Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code Improvements and split by apostrophe #3

Open
wants to merge 8 commits into
base: master
from

Conversation

@Mte90
Copy link

Mte90 commented Oct 1, 2019

No description provided.

@Mte90 Mte90 mentioned this pull request Oct 1, 2019
@Mte90

This comment has been minimized.

Copy link
Author

Mte90 commented Oct 8, 2019

just a ping

@dabinat

This comment has been minimized.

Copy link
Owner

dabinat commented Oct 20, 2019

Sorry for the delay with this and thanks for making these changes. Just so that I understand your code correctly, why does the split not occur when reading from the dictionary file? Is that just an efficiency thing or is there a specific compatibility reason for it?

Splitting on apostrophes should occur after they have been normalized from curly quotes, otherwise some will slip through. I also think splitting on apostrophes should be behind a flag, defaulting to off. It would be weird for English and probably other languages too, so it should only be on for people who need that behavior.

@Mte90

This comment has been minimized.

Copy link
Author

Mte90 commented Oct 20, 2019

In italian we don't use curly apostrophes so the risk of that there isn't.
I think that they are used only in english has an alternative version of " and normalize them as single apostrophe will create a lot of issues.

I can add a flag without problems to enable the split by a parameter is not a problem.

About the dictionary file I didn't used in my case because the readme of the CV scraper don't mention it.

@Mte90

This comment has been minimized.

Copy link
Author

Mte90 commented Oct 21, 2019

New parameter added :-)

@Mte90

This comment has been minimized.

Copy link
Author

Mte90 commented Nov 22, 2019

I want to work on other tickets but just before to do news stuff I want to see if it is something missing there.

Copy link
Owner

dabinat left a comment

Thanks for the cleanup, looks good except for the parts I noted.

word_usage.py Show resolved Hide resolved
word_usage.py Show resolved Hide resolved
word_usage.py Show resolved Hide resolved
@Mte90

This comment has been minimized.

Copy link
Author

Mte90 commented Nov 24, 2019

Refreshed the code :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.