Support for whitespace symbols in the words.dict file #48

Merged
merged 2 commits into from Jul 19, 2017

ernestum
Contributor

If you do character-wise rather than word-wise prediction, you might want to predict whitespace characters. But the way the words.dict file is currently parsed, it crashes when an entry is a whitespace symbol. This pull request should fix that.
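
To illustrate the kind of failure described here, a minimal sketch follows. It assumes each line of words.dict has the form `<token> <id>` separated by a single space, which is not spelled out in this thread, and the function names `parse_dict_line` and `load_dict` are hypothetical. It is not necessarily how this pull request implements the fix. A plain `line.split()` on a line whose token is a space returns only the id field, so unpacking into two values fails; splitting on the last space keeps the token intact.

```python
def parse_dict_line(line):
    """Split one '<token> <id>' line into (token, id), keeping whitespace tokens.

    Assumption: the id is the last field and is preceded by a single space.
    """
    # Strip only the line terminator so a token that is itself a space survives.
    body = line.rstrip("\r\n")
    # Split on the LAST space: everything before it is the token,
    # even when the token is a space or a tab.
    token, sep, index = body.rpartition(" ")
    if not sep:
        raise ValueError(f"malformed line: {line!r}")
    return token, int(index)


def load_dict(lines):
    """Build an id -> token mapping from an iterable of lines."""
    mapping = {}
    for line in lines:
        if not line.strip():
            continue  # skip completely blank lines
        token, index = parse_dict_line(line)
        mapping[index] = token
    return mapping
```

For example, `load_dict(open("words.dict", encoding="utf-8"))` would build the mapping from a file, assuming that path and format.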

@HendrikStrobelt
Owner

HendrikStrobelt commented Jul 19, 2017

Thanks for adding this. I will merge it.
As a best practice, we recommend encoding whitespace as an extra word such as '<space>' or '__'.
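
A rough sketch of that recommendation, applied when preparing character-level data: whitespace characters are replaced by placeholder words before the dictionary is written, so every line splits cleanly into two fields. Only `<space>` comes from the comment above; `<newline>`, `<tab>`, the function names, and the `<token> <id>` line format are assumptions made for illustration.

```python
# Placeholder words for whitespace characters.
# '<space>' is the name suggested above; the other two are hypothetical.
PLACEHOLDERS = {" ": "<space>", "\n": "<newline>", "\t": "<tab>"}


def encode_chars(text):
    """Turn a string into a list of character tokens with whitespace made visible."""
    return [PLACEHOLDERS.get(ch, ch) for ch in text]


def build_dict_lines(tokens):
    """Assign ids in order of first appearance and format '<token> <id>' lines."""
    vocab = {}
    for token in tokens:
        vocab.setdefault(token, len(vocab))
    return [f"{token} {index}" for token, index in vocab.items()]
```

With this encoding no dictionary entry contains whitespace, so even a parser that splits each line on whitespace sees exactly two fields.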

@HendrikStrobelt HendrikStrobelt merged commit 8b5f7ad into HendrikStrobelt:master Jul 19, 2017
@ernestum
Contributor Author

Cool, thanks! I thought about that solution too, but only after I had implemented the fix.
