Can OpenNMT-TF be used for a Neural Spell Checker? #115

Closed

mzeidhassan opened this issue Apr 30, 2018 · 7 comments
@mzeidhassan

Hello OpenNMT-tf team,

Can OpenNMT-TF be used to train a neural spell-checker model?

Thanks

@guillaumekln
Contributor

Hello,

I don't have specific experience with spell checkers, but if you can frame the task as a classic sequence-to-sequence problem, then yes, you can use OpenNMT-tf as is. Otherwise, the code should be friendly enough to customize.
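
A minimal sketch of the data side of that sequence-to-sequence framing, assuming two line-aligned parallel text files where each source line is a misspelled sentence and the matching target line is its correction (the file names and toy pairs below are illustrative, not from OpenNMT-tf):

```python
# Sketch: line-aligned source/target files for a seq2seq spell checker.
# File names and the toy pairs are assumptions for illustration; real data
# would be generated at scale, e.g. by corrupting clean text.
pairs = [
    ("I lvoe to paly with my cat", "I love to play with my cat"),
    ("She is goign home", "She is going home"),
]

with open("src-train.txt", "w", encoding="utf-8") as src, \
     open("tgt-train.txt", "w", encoding="utf-8") as tgt:
    for noisy, clean in pairs:
        src.write(noisy + "\n")   # misspelled sentence, one per line
        tgt.write(clean + "\n")   # corrected sentence on the same line number
```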

@jsenellart
Copy link
Contributor

Hello, you may want to check this as an entry point: http://nlp.seas.harvard.edu/papers/aesw2016.pdf

@mzeidhassan
Author

Thanks @guillaumekln and @jsenellart for your replies and advice. I appreciate it.

@ghost

ghost commented Nov 21, 2019

Hello,
I understand that this is an old topic now, but I just wanted to add a quick note on this.
With the BERT paper, and with this Seq2Seq library, it is possible to create a spell checker using OpenNMT by introducing "masking". In BERT they mask some words and replace some other words with random words. Using the same approach, we can purposely replace some characters with other characters to teach the model to fix them. Using 12 encoder-decoder CNN layers did the trick for me.
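
A rough sketch of that character-corruption idea, assuming a simple random-replacement scheme (the noise rate, alphabet, and function name are assumptions, not code from BERT or OpenNMT-tf):

```python
import random
import string

def corrupt(sentence, p_replace=0.1, alphabet=string.ascii_lowercase):
    """Randomly replace a fraction of the letters to create a noisy source sentence."""
    chars = list(sentence)
    for i, c in enumerate(chars):
        if c.isalpha() and random.random() < p_replace:
            chars[i] = random.choice(alphabet)  # swap in a wrong character on purpose
    return "".join(chars)

clean = "I love to play with my cat"
# Each corrupted copy becomes a source line; the clean sentence is the shared target.
noisy_variants = [corrupt(clean) for _ in range(5)]
```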

@mzeidhassan
Author

mzeidhassan commented Nov 22, 2019

Thanks @ridhwan-saal for your update. Do you have any code to share? This sounds really interesting. Thanks again for the useful update. Maybe you should write a Medium article about it. It would also be great if you could share the paper you are referring to.

@ghost

ghost commented Nov 24, 2019

This is the BERT paper I'm referring to.
Basically, the concept I derived from the paper (it is not exactly how it's done in the paper, but it is what I did) is to purposefully replace some characters with wrong ones. For example:
I love to play with my cat
would turn into characters
I l o v e t o p l a y w i t h m y c a t
Then I would create 5 training cases from this sentence:
I l o v r t o p l e y w i h h m y c a t
I l u v e t u p l a a w e t h m y c a t
... etc

The target for these 5 sentences would be
I love to play with my cat
which is the same sentence we began with. This way the model learns to deduce the sentence on its own, and at the same time it learns to fix some characters.
In my case the mapping was from characters to words, but you can also do word-to-word at the character level. Example:
I love to play with my cat
becomes
I <sep> l o v e <sep> t o <sep> p l a y <sep> w i t h <sep> m y <sep> c a t <sep> <eos>
this way you teach the model to fix each word on its own, instead of trying to rewrite the whole sentence. It will depend on your use case. Let me know if you need more clarification.
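
A small sketch of the word-by-word, character-level formatting described above; the <sep>/<eos> tokens follow the comment, everything else is an assumption:

```python
def to_char_level(sentence, sep="<sep>", eos="<eos>"):
    """Split each word into space-separated characters and join words with <sep>."""
    pieces = []
    for word in sentence.split():
        pieces.append(" ".join(word))  # "love" -> "l o v e"
        pieces.append(sep)
    pieces.append(eos)
    return " ".join(pieces)

print(to_char_level("I love to play with my cat"))
# -> I <sep> l o v e <sep> t o <sep> p l a y <sep> w i t h <sep> m y <sep> c a t <sep> <eos>
```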

@mzeidhassan
Author

Thanks a million @ridhwan-saal. I really appreciate you taking the time to give such a great and detailed explanation.
