New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vanish Like: The Novel #107

Open
cpressey opened this Issue Nov 16, 2017 · 6 comments

Comments

Projects
None yet
3 participants
@cpressey

cpressey commented Nov 16, 2017

This novel generator is in the vein of "mathematical exploration", rather like perm-comb-finder, which resulted in 3×C(21,3)+2×C(215,2)=50000: The Novel in 2014. So this will begin with an expository section before getting to the code and result.

Exposition

Apparently — I should note that my understanding here comes from a past age, from books I read before the Internet was ubiquitous, and I have not researched any of this using modern resources to confirm or deny or modify any of it — apparently the word ABRACADABRA is supposed to mean "vanish like the word", and in a magic charm, you'd write it like this:

ABRACADABRA
ABRACADABR
ABRACADAB
ABRACADA
ABRACAD
ABRACA
ABRAC
ABRA
ABR
AB
A

And whatever you're trying to banish would, by the power of sympathetic magic, also vanish, just like the word ABRACADABRA vanishes there, just like it says it will.

Now, this sort of vanishing trick seems amenable to computer generation. Say I wanted to start with a paragraph, and remove words one by one until they're all gone, and assemble a "vanishing novel" this way.

w.r.t. NaNoGenMo, the significant part here is not figuring out how to implement this; it's quite straightforward. The significant part is determining how many words the first paragraph should have so that the total number of words in the novel is as close to 50,000 as possible, without going under.

Now, ABRACADABRA has 11 letters, so the total number of letters in the charm shown above is 1+2+3+4+5+6+7+8+9+10+11 = 66.

OK, so we already know we have a computer available (I mean, this is NaNoGenMo, right) and addition is cheap and 50,000 is not that big, so we could write a brute-force solver like

110 T=0
120 I=1
130 T=T+I
140 IF T>=50000 THEN GOTO 170
150 I=I+1
160 GOTO 130
170 PRINT T

But that seems inelegant. Surely there's a closed form for "sum of integers between 1 and n", and that would be more elegant.

Indeed there is. Folklore has it that Gauss came up with it in class one day when the teacher assigned summing the numbers from 1 to 100 manually as a tedious punishment.

But I don't know if I should believe that, because folklore also has it that Gauss's IQ was 220, but I figured out the closed form for this in math class one day too (but because I was bored, not because I was looking for a way to save myself labour) and my IQ is certainly nowhere near 220. Heck, 220 doesn't even sound like a real IQ. I should probably look some of this stuff up.

But my point is, it's not that hard to figure out, I guess? If you feel so inclined, you could try to tackle it yourself before reading further.

The first observation is that when you are adding successive integers, you can visualize each integer n as a row of n blocks, and the sum as a stacking of these rows. Indeed, you can just look at that ABRACADABRA charm up there to see what I mean.

Now, you can plainly see, such a stacking fits in a square, in fact takes up roughly half the square, so the closed form must be roughly n²/2.

But - an area of exactly n²/2 would have a "smooth" diagonal from the upper-right corner to the bottom-left corner of that square. But the letters along the diagonal form a "jaggy" line. Each row overhangs the diagonal by half a character (sliced diagonally). And there are n rows. So we must add n/2 to our formula. Then we get

n²/2 + n/2

which can be rewritten as

(n² + n)/2

which can be rewritten as

(n)(n + 1)/2

which is often how this closed form is presented.

Verily! Now that we have over-explained something you could've just gone and looked up, back to applying this to NaNoGenMo. We want to solve

(n)(n + 1)/2 = 50000

We can multiply both sides by 2 to get

(n)(n + 1) = 100000

And then multiply the two terms on the LHS to get

n² + n = 100000

And then subtract 100000 from both sides to get

n² + n - 100000 = 0

which has a form which may look familiar; it's a quadratic equation. And we have a formula for solving one of those, for which I have no anecdote except that they tried to make us memorize it in that same math class but, while parts have stuck in my memory, like the ±, those parts had become hopelessly scrambled over time, so this one I did have to look up. Here it is:

n = (-b ± sqrt(b² - 4ac)) / 2a

In our case, a = 1, b = 1, and c = -100000. Plugging those in, we get

n = (-1 ± sqrt(1² - 4×1×-100000)) / 2×1

Simplifying

n = (-1 ± sqrt(400001)) / 2

Taking the square root, we see the two solutions are

n = (-1 + 632.456) / 2
n = (-1 - 632.456) / 2

That is,

n = {315.728, -316.728}

Now, while exploring what it might mean for a piece of text to have a negative number of words is a fascinating possibility for a conceptual exercise, I feel it might be out of scope for this entry, and we will discard the negative value. Similarly, exploring what it might mean for a piece of text to have a fractional number of words feels like it might be similarly out of scope, so we round up, and we're left with

n = 316

We plug it back in to see that

(316² + 316) / 2 = 50086

is indeed quite close to 50,000, and

(315² + 315) / 2 = 49770

shows that it is as close as you can get without going under. Yay!

So now, we write 316 words and that's a paragraph, and the computer can spit out 315 more paragraphs, each with one less word, and we put them all together, and we have a generated novel, fit for a NaNoGenMo green "completed" label.

Code

This is in Python 2.7. It takes the initial paragraph as input, as a text file, with one sentence per line.

import random
import sys

class Sentence(object):
    def __init__(self, s):
        s = s.strip()
        assert s.endswith(('.', '?'))
        self.ender = s[-1]
        self.words = s[0:-1].split(' ')

    def __str__(self):
        words = list(self.words)
        words[0] = words[0][0].upper() + words[0][1:]
        while words[-1].endswith(('.', ',', '?')):
            words[-1] = words[-1][0:-1]
        return ' '.join(words) + self.ender

    def reduce(self):
        del self.words[random.randint(0, len(self.words) - 1)]

if __name__ == '__main__':
    sentences = [Sentence(line) for line in sys.stdin]
    initial_word_count = sum([len(sentence.words) for sentence in sentences])
    assert initial_word_count >= 316, initial_word_count

    while sentences:
        print '  '.join([str(s) for s in sentences]) + '\n'
        random.choice(sentences).reduce()
        sentences = [s for s in sentences if s.words]

Result

Well, here is the thing, right? After writing this code and doing this writeup, I thought it would be easy to write a 316-word paragraph on the theme of vanishing to use as input for this generator. But I'm struggling with it. It's really a good thing I didn't try to do NaNoWriMo this year -- if I can't even get 316 words together, imagine how hard I'd fail with 50K.

Update: finished novel is here.

@hugovk

This comment has been minimized.

Show comment
Hide comment
@hugovk

hugovk Nov 16, 2017

Member

Now, while exploring what it might mean for a piece of text to have a negative number of words is a fascinating possibility for a conceptual exercise, I feel it might be out of scope for this entry, and we will discard the negative value.

Fascinating indeed; do -50,000 words a negative novel make?

Member

hugovk commented Nov 16, 2017

Now, while exploring what it might mean for a piece of text to have a negative number of words is a fascinating possibility for a conceptual exercise, I feel it might be out of scope for this entry, and we will discard the negative value.

Fascinating indeed; do -50,000 words a negative novel make?

@cpressey

This comment has been minimized.

Show comment
Hide comment
@cpressey

cpressey Nov 17, 2017

AL: I think this issue should have a cyan "preview" label.

BETTY: I don't. He hasn't shown what the novel is going to look like.

AL: So if you show none of your code, only a few words of its output, you get a cyan "preview" label -- but if you show all of your code and none of its output, you don't?

BETTY: Don't be naïve, Al. No one cares about the code.

cpressey commented Nov 17, 2017

AL: I think this issue should have a cyan "preview" label.

BETTY: I don't. He hasn't shown what the novel is going to look like.

AL: So if you show none of your code, only a few words of its output, you get a cyan "preview" label -- but if you show all of your code and none of its output, you don't?

BETTY: Don't be naïve, Al. No one cares about the code.

@hugovk

This comment has been minimized.

Show comment
Hide comment
@hugovk

hugovk Nov 17, 2017

Member

Vanish Like: The Novel
Vanish Like: The
Vanish Like:
Vanish

Member

hugovk commented Nov 17, 2017

Vanish Like: The Novel
Vanish Like: The
Vanish Like:
Vanish

@hugovk hugovk added the preview label Nov 17, 2017

@enkiv2

This comment has been minimized.

Show comment
Hide comment
@enkiv2

enkiv2 Nov 17, 2017

enkiv2 commented Nov 17, 2017

@cpressey

This comment has been minimized.

Show comment
Hide comment
@cpressey

cpressey Nov 20, 2017

Finished novel is here.

cpressey commented Nov 20, 2017

Finished novel is here.

@hugovk hugovk added the completed label Nov 20, 2017

@cpressey

This comment has been minimized.

Show comment
Hide comment
@cpressey

cpressey Nov 20, 2017

Couldn't you find exactly 316 words of public domain material about vanishment or disappearance?

Not for this work, thanks for understanding.

cpressey commented Nov 20, 2017

Couldn't you find exactly 316 words of public domain material about vanishment or disappearance?

Not for this work, thanks for understanding.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment