Sentence-level Markov model (or, reconstructing Moby-Dick using a neural network) #99
Comments
The premise reminds me of [my phrase-chain project](https://github.com/enkiv2/misc/tree/master/phrasechain), which I used in NaNoGenMo 2016. It seems like your version probably has better results since it takes more words into account?
Thanks for sharing this! The two approaches could possibly be combined by running the neural network on phrases rather than sentences. The training script should work without modification on any linguistic unit (phrases, clauses, paragraphs, etc.)—the corpus just has to be prepared differently. Doing it at the phrase level might make the strangeness more immediate because you would have to read fewer words on average before getting to something that differs from the original text. I'm not sure how well the particular model I used would do at assembling phrases into syntactically correct sentences, though.
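For illustration, re-cutting the corpus at the phrase level might look something like the sketch below. The choice of delimiters and the regex are assumptions, not anything taken from the existing training script.

```python
# Hypothetical phrase-level corpus preparation: split each sentence at
# commas, semicolons, and colons rather than stopping at sentence
# boundaries. The delimiters chosen here are an assumption.
import re

def to_phrases(sentence):
    """Split a sentence into comma/semicolon/colon-delimited phrases."""
    parts = re.split(r"(?<=[,;:])\s+", sentence)
    return [p.strip() for p in parts if p.strip()]

print(to_phrases("It is a way I have of driving off the spleen, and regulating the circulation."))
# ['It is a way I have of driving off the spleen,', 'and regulating the circulation.']
```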
I split each chapter of *Moby-Dick* into sentences, then used a neural network to try to guess what order the sentences should appear in. I call the result [*Mboy-Dcki*](https://raw.githubusercontent.com/jeffbinder/sentence-level-markov/master/mboydcki.txt).
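A minimal sketch of that splitting step might look like this. The filename, the crude chapter split, and the use of NLTK's Punkt tokenizer are all assumptions rather than the repo's actual preprocessing.

```python
# Hypothetical preprocessing: split a plain-text copy of the novel into
# chapters, then each chapter into sentences with NLTK's Punkt tokenizer.
import nltk

nltk.download("punkt", quiet=True)
nltk.download("punkt_tab", quiet=True)  # older/newer NLTK use different resource names

with open("mobydick.txt", encoding="utf-8") as f:  # hypothetical filename
    raw = f.read()

# Crude chapter split on the "CHAPTER " headings used in the Gutenberg text.
chapters = [c.strip() for c in raw.split("CHAPTER ") if c.strip()]
sentences_by_chapter = [nltk.sent_tokenize(ch) for ch in chapters]
```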
This is essentially a Markov chain model that works at the level of sentences rather than words or tokens. Such a model cannot be trained directly, so I created an encoder-decoder-type recurrent neural network that takes in the last 25 characters of a sentence and tries to guess what the first 25 characters of the next sentence will be. I then used this network to compute the probabilities for each pair of sentences.
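A rough sketch of that idea (not the repo's actual architecture) is below: a character-level encoder-decoder trained on end-of-sentence/start-of-next-sentence windows, plus a scoring function for sentence pairs. The layer sizes, one-hot encoding, space padding, and training settings are all assumptions for illustration.

```python
# Sketch of a character-level encoder-decoder: read the last 25 characters
# of one sentence, predict a distribution over the first 25 characters of
# the next. Everything here (sizes, padding, epochs) is illustrative.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

WINDOW = 25
all_text = "".join(s for ch in sentences_by_chapter for s in ch)
chars = sorted(set(all_text) | {" "})
char_to_idx = {c: i for i, c in enumerate(chars)}
n_chars = len(chars)

def encode(s, from_end):
    """One-hot encode a fixed-length window from the start or end of a sentence."""
    window = s[-WINDOW:].rjust(WINDOW) if from_end else s[:WINDOW].ljust(WINDOW)
    x = np.zeros((WINDOW, n_chars), dtype="float32")
    for i, c in enumerate(window):
        x[i, char_to_idx[c]] = 1.0
    return x

# Training pairs: (end of sentence i) -> (start of sentence i+1), within each chapter.
pairs = [(a, b) for ch in sentences_by_chapter for a, b in zip(ch, ch[1:])]
X = np.stack([encode(a, from_end=True) for a, _ in pairs])
Y = np.stack([encode(b, from_end=False) for _, b in pairs])

model = keras.Sequential([
    keras.Input(shape=(WINDOW, n_chars)),
    layers.LSTM(256),                     # encoder: summarize the sentence ending
    layers.RepeatVector(WINDOW),          # feed that summary to every decoder step
    layers.LSTM(256, return_sequences=True),
    layers.TimeDistributed(layers.Dense(n_chars, activation="softmax")),
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.fit(X, Y, batch_size=64, epochs=20)

def pair_log_prob(sent_a, sent_b):
    """Log-probability the model assigns to sent_b's opening, given sent_a's ending."""
    probs = model.predict(encode(sent_a, from_end=True)[None], verbose=0)[0]
    target = encode(sent_b, from_end=False)
    return float(np.sum(target * np.log(probs + 1e-9)))
```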
It actually sort of works—at the very least, it picks the right sentence a little more often than chance would dictate. But the point, of course, is in the interesting ways it fails.
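One simple way to turn those pairwise scores into a reordered chapter is a greedy chain: start from the chapter's opening sentence and repeatedly append the unused sentence the model rates most likely to come next. This greedy assembly is just one plausible reading of the approach; the actual generation procedure is in the repo.

```python
# Greedy assembly sketch (an assumption about the generation step): start
# from the first sentence and repeatedly pick the highest-scoring unused
# successor according to pair_log_prob from the sketch above.
def reorder(sentences):
    remaining = list(range(1, len(sentences)))
    order = [0]
    while remaining:
        prev = sentences[order[-1]]
        best = max(remaining, key=lambda j: pair_log_prob(prev, sentences[j]))
        order.append(best)
        remaining.remove(best)
    return [sentences[i] for i in order]

rearranged_chapter = " ".join(reorder(sentences_by_chapter[0]))
```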
Code and a more detailed explanation are [here](https://github.com/jeffbinder/sentence-level-markov).