Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.Sign up
GitHub is where the world builds software
Millions of developers and companies build, ship, and maintain their software on GitHub — the largest and most advanced development platform in the world.
This novel generator is in the vein of "mathematical exploration", rather like perm-comb-finder, which resulted in 3×C(21,3)+2×C(215,2)=50000: The Novel in 2014. So this will begin with an expository section before getting to the code and result.
Apparently — I should note that my understanding here comes from a past age, from books I read before the Internet was ubiquitous, and I have not researched any of this using modern resources to confirm or deny or modify any of it — apparently the word ABRACADABRA is supposed to mean "vanish like the word", and in a magic charm, you'd write it like this:
And whatever you're trying to banish would, by the power of sympathetic magic, also vanish, just like the word ABRACADABRA vanishes there, just like it says it will.
Now, this sort of vanishing trick seems amenable to computer generation. Say I wanted to start with a paragraph, and remove words one by one until they're all gone, and assemble a "vanishing novel" this way.
w.r.t. NaNoGenMo, the significant part here is not figuring out how to implement this; it's quite straightforward. The significant part is determining how many words the first paragraph should have so that the total number of words in the novel is as close to 50,000 as possible, without going under.
Now, ABRACADABRA has 11 letters, so the total number of letters in the charm shown above is 1+2+3+4+5+6+7+8+9+10+11 = 66.
OK, so we already know we have a computer available (I mean, this is NaNoGenMo, right) and addition is cheap and 50,000 is not that big, so we could write a brute-force solver like
But that seems inelegant. Surely there's a closed form for "sum of integers between 1 and n", and that would be more elegant.
Indeed there is. Folklore has it that Gauss came up with it in class one day when the teacher assigned summing the numbers from 1 to 100 manually as a tedious punishment.
But I don't know if I should believe that, because folklore also has it that Gauss's IQ was 220, but I figured out the closed form for this in math class one day too (but because I was bored, not because I was looking for a way to save myself labour) and my IQ is certainly nowhere near 220. Heck, 220 doesn't even sound like a real IQ. I should probably look some of this stuff up.
But my point is, it's not that hard to figure out, I guess? If you feel so inclined, you could try to tackle it yourself before reading further.
The first observation is that when you are adding successive integers, you can visualize each integer n as a row of n blocks, and the sum as a stacking of these rows. Indeed, you can just look at that ABRACADABRA charm up there to see what I mean.
Now, you can plainly see, such a stacking fits in a square, in fact takes up roughly half the square, so the closed form must be roughly n²/2.
But - an area of exactly n²/2 would have a "smooth" diagonal from the upper-right corner to the bottom-left corner of that square. But the letters along the diagonal form a "jaggy" line. Each row overhangs the diagonal by half a character (sliced diagonally). And there are n rows. So we must add n/2 to our formula. Then we get
which can be rewritten as
which can be rewritten as
which is often how this closed form is presented.
Verily! Now that we have over-explained something you could've just gone and looked up, back to applying this to NaNoGenMo. We want to solve
We can multiply both sides by 2 to get
And then multiply the two terms on the LHS to get
And then subtract 100000 from both sides to get
which has a form which may look familiar; it's a quadratic equation. And we have a formula for solving one of those, for which I have no anecdote except that they tried to make us memorize it in that same math class but, while parts have stuck in my memory, like the ±, those parts had become hopelessly scrambled over time, so this one I did have to look up. Here it is:
In our case, a = 1, b = 1, and c = -100000. Plugging those in, we get
Taking the square root, we see the two solutions are
Now, while exploring what it might mean for a piece of text to have a negative number of words is a fascinating possibility for a conceptual exercise, I feel it might be out of scope for this entry, and we will discard the negative value. Similarly, exploring what it might mean for a piece of text to have a fractional number of words feels like it might be similarly out of scope, so we round up, and we're left with
We plug it back in to see that
is indeed quite close to 50,000, and
shows that it is as close as you can get without going under. Yay!
So now, we write 316 words and that's a paragraph, and the computer can spit out 315 more paragraphs, each with one less word, and we put them all together, and we have a generated novel, fit for a NaNoGenMo green "completed" label.
This is in Python 2.7. It takes the initial paragraph as input, as a text file, with one sentence per line.
import random import sys class Sentence(object): def __init__(self, s): s = s.strip() assert s.endswith(('.', '?')) self.ender = s[-1] self.words = s[0:-1].split(' ') def __str__(self): words = list(self.words) words = words.upper() + words[1:] while words[-1].endswith(('.', ',', '?')): words[-1] = words[-1][0:-1] return ' '.join(words) + self.ender def reduce(self): del self.words[random.randint(0, len(self.words) - 1)] if __name__ == '__main__': sentences = [Sentence(line) for line in sys.stdin] initial_word_count = sum([len(sentence.words) for sentence in sentences]) assert initial_word_count >= 316, initial_word_count while sentences: print ' '.join([str(s) for s in sentences]) + '\n' random.choice(sentences).reduce() sentences = [s for s in sentences if s.words]
Well, here is the thing, right? After writing this code and doing this writeup, I thought it would be easy to write a 316-word paragraph on the theme of vanishing to use as input for this generator. But I'm struggling with it. It's really a good thing I didn't try to do NaNoWriMo this year -- if I can't even get 316 words together, imagine how hard I'd fail with 50K.
Update: finished novel is here.
Fascinating indeed; do -50,000 words a negative novel make?
AL: I think this issue should have a cyan "preview" label.
BETTY: I don't. He hasn't shown what the novel is going to look like.
AL: So if you show none of your code, only a few words of its output, you get a cyan "preview" label -- but if you show all of your code and none of its output, you don't?
BETTY: Don't be naïve, Al. No one cares about the code.
Couldn't you find exactly 316 words of public domain material about vanishment or disappearance?…
On Fri, Nov 17, 2017 at 6:50 AM Hugo ***@***.***> wrote: Vanish Like: The Novel Vanish Like: The Vanish Like: Vanish — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#107 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAd6GchgxjNdD8JwEeIaJaki4cMT7jyzks5s3XMggaJpZM4QgjMf> .