improve handling of punctuation and camel case #108

Closed
johnv02139 opened this issue Sep 6, 2016 · 1 comment

Comments

@johnv02139
Contributor

I've found some issues with how punctuation is handled.

For some punctuation found within a word, it's much better to remove it than to replace it with a space. For example, "Bob's Burgers" should become "Bobs Burgers", not "Bob s Burgers".

Interestingly, I had a problem even with the example given. It works great with "Marvels.Agents.of.S.H.I.E.L.D." but not with "Marvel's Agents of S.H.I.E.L.D."

I think trying for one big do-it-all regex is probably too complicated. I have a proposed fix that handles a few cases one at a time.
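
To illustrate the "one case at a time" idea, here is a minimal sketch in Python. It is not the fix that was actually proposed or merged; the function name `normalize_title`, the order of the steps, and the choice to collapse dotted acronyms into a single word (so "S.H.I.E.L.D." becomes "SHIELD") are all assumptions made for illustration.

```python
import re


def normalize_title(text: str) -> str:
    """Hypothetical helper: clean punctuation in a title in small steps."""
    # Step 1: drop apostrophes that sit inside a word ("Bob's" -> "Bobs")
    # instead of turning them into a space.
    text = re.sub(r"(?<=\w)['\u2019](?=\w)", "", text)

    # Step 2 (assumption): collapse dotted acronyms such as "S.H.I.E.L.D."
    # into one word by removing the dots between single letters.
    text = re.sub(
        r"\b(?:[A-Za-z]\.){2,}",
        lambda m: m.group(0).replace(".", ""),
        text,
    )

    # Step 3: any punctuation still left becomes a space.
    text = re.sub(r"[^\w\s]|_", " ", text)

    # Step 4: collapse runs of whitespace and trim the ends.
    return " ".join(text.split())


print(normalize_title("Bob's Burgers"))
print(normalize_title("Marvel's Agents of S.H.I.E.L.D."))
print(normalize_title("Marvels.Agents.of.S.H.I.E.L.D."))
```

Under these assumptions, both Marvel spellings end up as the same string, and each step can be tested or changed without touching the others.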

@eprenamer
Member

Resolved in #119
