Can OpenNMT-TF be used for a Neural Spell Checker? #115

Closed

mzeidhassan opened this issue Apr 30, 2018 · 7 comments
@mzeidhassan

Hello OpenNMT-tf team,

Can OpenNMT-TF be used to train a neural spell-checker model?

Thanks

@guillaumekln
Contributor

Hello,

I don't have specific experience with spell checkers, but if you can frame the task as a classic sequence-to-sequence problem, then yes, you can use OpenNMT-tf as is. Otherwise, the code should be friendly enough to customize.
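
A minimal sketch of the data side of that sequence-to-sequence framing, assuming two line-aligned parallel text files where each source line is a misspelled sentence and the matching target line is its correction (the file names and toy pairs below are illustrative, not from OpenNMT-tf):

```python
# Sketch: line-aligned source/target files for a seq2seq spell checker.
# File names and the toy pairs are assumptions for illustration; real data
# would be generated at scale, e.g. by corrupting clean text.
pairs = [
    ("I lvoe to paly with my cat", "I love to play with my cat"),
    ("She is goign home", "She is going home"),
]

with open("src-train.txt", "w", encoding="utf-8") as src, \
     open("tgt-train.txt", "w", encoding="utf-8") as tgt:
    for noisy, clean in pairs:
        src.write(noisy + "\n")   # misspelled sentence, one per line
        tgt.write(clean + "\n")   # corrected sentence on the same line number
```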

@jsenellart
Copy link
Contributor

Hello, you may want to check this as an entry point: http://nlp.seas.harvard.edu/papers/aesw2016.pdf

@mzeidhassan
Author

Thanks @guillaumekln and @jsenellart for your replies and advice. I appreciate it.

@ghost

ghost commented Nov 21, 2019

Hello,
I understand that this is an old topic now, but I just wanted to add a quick note on this.
With the BERT paper, and with this Seq2Seq library, it is possible to create a spell checker using OpenNMT by introducing "masking". In BERT they mask some words and replace some other words with random words. Using the same approach, we can purposely replace some characters with other characters to teach the model to fix them. Using 12 encoder-decoder CNN layers did the trick for me.
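
A rough sketch of that character-corruption idea, assuming a simple random-replacement scheme (the noise rate, alphabet, and function name are assumptions, not code from BERT or OpenNMT-tf):

```python
import random
import string

def corrupt(sentence, p_replace=0.1, alphabet=string.ascii_lowercase):
    """Randomly replace a fraction of the letters to create a noisy source sentence."""
    chars = list(sentence)
    for i, c in enumerate(chars):
        if c.isalpha() and random.random() < p_replace:
            chars[i] = random.choice(alphabet)  # swap in a wrong character on purpose
    return "".join(chars)

clean = "I love to play with my cat"
# Each corrupted copy becomes a source line; the clean sentence is the shared target.
noisy_variants = [corrupt(clean) for _ in range(5)]
```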

@mzeidhassan
Author

mzeidhassan commented Nov 22, 2019

Thanks @ridhwan-saal for your update. Do you have any code to share? This sounds really interesting. Thanks again for the useful update. Maybe you should write a Medium article about it. It would also be great if you could share the paper you are referring to.

@ghost

ghost commented Nov 24, 2019

This is the BERT paper I'm referring to.
Basically, the concept I derived from the paper (it is not exactly how it's done in the paper, but it is what I did) is to purposefully replace some characters with wrong ones. For example:
I love to play with my cat
would turn into characters
I l o v e t o p l a y w i t h m y c a t
Then I would create 5 training cases from this sentence:
I l o v r t o p l e y w i h h m y c a t
I l u v e t u p l a a w e t h m y c a t
... etc

The target for these 5 sentences would be
I love to play with my cat
which is the same sentence we began with. This way the model learns to deduce the sentence on its own, and at the same time it learns to fix some characters.
In my case the mapping was from characters to words, but you can also do word-to-word at the character level. Example:
I love to play with my cat
becomes
I <sep> l o v e <sep> t o <sep> p l a y <sep> w i t h <sep> m y <sep> c a t <sep> <eos>
this way you teach the model to fix each word on its own, instead of trying to rewrite the whole sentence. It will depend on your use case. Let me know if you need more clarification.
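
A small sketch of the word-by-word, character-level formatting described above; the <sep>/<eos> tokens follow the comment, everything else is an assumption:

```python
def to_char_level(sentence, sep="<sep>", eos="<eos>"):
    """Split each word into space-separated characters and join words with <sep>."""
    pieces = []
    for word in sentence.split():
        pieces.append(" ".join(word))  # "love" -> "l o v e"
        pieces.append(sep)
    pieces.append(eos)
    return " ".join(pieces)

print(to_char_level("I love to play with my cat"))
# -> I <sep> l o v e <sep> t o <sep> p l a y <sep> w i t h <sep> m y <sep> c a t <sep> <eos>
```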

@mzeidhassan
Author

Thanks a million @ridhwan-saal. I really appreciate you taking the time to give such a great and detailed explanation.
