NaNoGenLab: an experiment a day during November #10

Open
cpressey opened this issue Oct 20, 2014 · 89 comments

Comments

@cpressey

I intend to participate again this year.

If you are wondering why I used the word "again" in the previous sentence, it may help to understand that the account I was using last year has since been converted into an organization.

I don't know what I intend to do, yet, but the end result better consist of 50,000 of something that I can make a fair argument are "words" or I will surely be forced to pack my bags and catch the next Greyhound out of town in my shame.

@cpressey
Author

I will not have as much time to commit to this this year as I did last year.

It might end up just being the first two days after Hallowe'en.

It will not use Markov processes. It might use a genetic algorithm with a Levenshtein distance as part of its fitness function. Or it might just use a grammar-based generator (a recursive-descent parser "in reverse", whatever you call that.) Don't know yet.

It will not use Twitter. It might use Project Gutenberg. But also, it might not.

@cpressey
Author

This is just me thinking out loud.

It occurs to me that you can write an unremarkable generator which generates a remarkable novel, or a remarkable generator which generates an unremarkable novel. Or both or neither of course, but what I'm getting at is, you can concentrate your efforts on either side.

Is a program that generates a single novel still a novel-generator? If, as the rulebook suggests, a script that downloads a novel from Project Gutenberg and spits it out is a novel generator, then, yes it is.

So, you could participate in both NaNoGenMo and NaNoWriMo by opening a text file like the following and banging on it throughout the month:

#!/usr/bin/env python3
# The whole "novel" is one string literal that the script prints verbatim.
print("""
Owls and Lollipops
==================

It was a dark and stormy night.  My buddy had already parked his car when suddenly
(continued for 49984 more words)
""")

And if you get stuck, just print "meow" for the remaining words. Or, more interestingly, use some less trite application of logic and looping and whatnot to construct parts of the text. I think you could call this a hybrid novel/novel-generator.

@cpressey
Author

cpressey commented Nov 1, 2014

Causes of insufficient moisture: Cutting the curd too fine or breaking up his furniture, perhaps...! Oh, my friend, my fortune is made. This made it possible to issue to each workman a shovel which would hold a load of 21 pounds of whatever it could be. Yet it is upon feeling, more often than thinking, that animals act; and every act White Fang with his foot. He says that some soft body of the Yellow Thing, and did seem as that it screamed to rage amid the entrails thereof; so wondrous was the fury and energy of that trusted Weapon. If only he could have taken counsel with someone with someone not bound his hands behind his back. Is your husband smoking, my dear. I thank you, he said. A hissing tongue of flame leapt in. If you won't, I'll give us no clue.

@cpressey
Author

cpressey commented Nov 1, 2014

That was just an experiment. The program which produced it, and other experiments, will eventually be placed into:

https://github.com/catseye/NaNoGenLab

@MichaelPaulukonis

regarding eliza v eliza: Did you see the Eliza twitterbot that was talking to GamerGaters?

What would happen if we hooked up Eliza to twitter, went back to twitter for another comment, something apropos.....

@MichaelPaulukonis

Here is my novel-generating algorithm:

  1. Propose an idea to @cpressey
  2. Wait for it to be implemented
  3. ...
  4. Profit!

@cpressey
Author

cpressey commented Nov 5, 2014

Playing requests now in the bandstand! 15 dollars a day, weddings, parties... bongo jams a speciality!

@MichaelPaulukonis Unfortunately, and even though ElizaRBarr is my new hero, I don't do Twitter. (I realize I must be in the distinct minority, here.) So you might be waiting a long time. This variation on the algorithm might be more efficient:

  1. Propose an idea to @cpressey
  2. Wait for any idea, any idea at all to be implemented
  3. ...
  4. AssertionError: profit <= 0

@MichaelPaulukonis

cpressey changed the title: Gimme an "I"! Gimme an "N"! Gimme a "T"! (etc etc "What's that spell?" etc etc) → NaNoGenLab: an experiment a day during November (Nov 6, 2014)
@cpressey
Author

cpressey commented Nov 6, 2014

NaNoGenLab: an experiment a day during November

Meditating on @dariusk's statement, encouraged by @hugovk, and inspired by @MichaelPaulukonis's tireless research into Propp and Fortran V, I have settled on setting the goal for myself for this NaNoGenMo to be to produce, aside from awkward sentences like this one, one experiment in the generation, transformation, and general mutilation of text (and images, yes those too) per day, on average, for the month of November 2014.

With the following caveats:

  • "Experiment" is defined about as strictly as "novel" is.
  • Experiments are not expected to always be successful (of course).
  • Some of these experiments are going to be derivative of previous experiments.
  • I have about 14 currently, so I'm ahead of the clock, but I may run out of ideas.
  • I may also run out of available time.
  • Ideally, the novel submitted (near the end of the month, surely) will incorporate several of the experiments. But if I don't have enough time available, I may just pick one and run it to 50K words.
  • In the course of running these experiments, various supporting tools and corpora are accumulating in the lab as well. I won't, generally, consider these to be experiments for the purpose of counting the number of experiments I've done (unless I get desperate.)

Everything in the NaNoGenLab is in the public domain, so feel free to steal it, build on it, sell it, don't even credit me, whatever. I'm trying to use only verifiably public-domain external resources as well (Project Gutenberg, chroniclingamerica, and images in Wikimedia's "PD" categories, so far.)

@MichaelPaulukonis

produce ... one experiment ... per day
I have about 14 currently

good GOD, man!

@dariusk
Owner

dariusk commented Nov 6, 2014

I'm into it.

@cpressey
Author

cpressey commented Nov 7, 2014

@MichaelPaulukonis We are doing science so hard right now!!

I guess I didn't mention that I have a fairly long commute which constitutes a large chunk of my free time. If I can bang it out while on the train, it's an "experiment".

And y'understand that some of these are going to be just bloody awful.

It, was-was,
It, was-was,
the-the best!
the-the best!

of, times,
of-of times,
of-of times-times of-of it,
it it was.

the, worst,
the-the worst,
the-the worst-worst the-the of,
of of times.

@MichaelPaulukonis

Sea-Shanty "Tale of Two Cities" is admirable, admiral.

What happens when that's pointed at something smaller, like a short story? It would get longer, wouldn't it? Or does the shanty stop after three verses?

https://www.youtube.com/watch?v=FfiQAvvmXvc

@cpressey
Author

cpressey commented Nov 7, 2014

It'll keep producing verses as long as there are more words in the input, the main problem is that the input words are given on the command-line, you'll have to use xargs or something if you want to give it a text file as input, oh and if the number of input words isn't divisible by 4 it'll crash at the end, oh and it doesn't filter out punctuation so unless you like your shanties with extra punctuation in them you'll have to filter those characters out first LOOK WE ARE DOING SCIENCE HERE, NOT ENGINEERING, O-KAY?

If I get to 30 experiments and have time remaining, I'll clean it up. And try to add more templates for verses, too (currently there are only two.)
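For anyone who wants the engineering anyway, here is a rough sketch of what a cleaned-up version might look like. It is not the lab's actual script: the two verse templates are inferred from the sample verses posted earlier in the thread, and the choices to strip punctuation with `string.punctuation`, to drop leftover words that don't fill a verse, and to alternate templates are all assumptions made for illustration.

```python
import string


def clean_words(text):
    """Split text into words, stripping leading/trailing punctuation."""
    words = (w.strip(string.punctuation) for w in text.split())
    return [w for w in words if w]


def verse_a(a, b, c, d):
    # Template inferred from the first sample verse ("It, was-was, ...").
    return "\n".join([
        f"{a}, {b}-{b},",
        f"{a}, {b}-{b},",
        f"{c}-{c} {d}!",
        f"{c}-{c} {d}!",
    ])


def verse_b(a, b, c, d):
    # Template inferred from the second and third sample verses.
    return "\n".join([
        f"{a}, {b},",
        f"{a}-{a} {b},",
        f"{a}-{a} {b}-{b} {a}-{a} {c},",
        f"{c} {c} {d}.",
    ])


def shanty(text):
    words = clean_words(text)
    # Assumption: drop leftover words so the count is a multiple of 4
    # (padding the last verse would be another option).
    words = words[: len(words) - len(words) % 4]
    templates = [verse_a, verse_b]
    verses = []
    for i in range(0, len(words), 4):
        template = templates[(i // 4) % len(templates)]  # alternate templates
        verses.append(template(*words[i:i + 4]))
    return "\n\n".join(verses)


if __name__ == "__main__":
    print(shanty("It was the best of times, it was the worst of times"))
```

Reading from a string rather than the command line sidesteps the xargs problem; a file could be fed in with `open(path).read()`.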

Here's some output from this morning's experiment:

survival concise impartial house chemical advise unusual false especial because metal worse impartial increase commercial crease differential close mathematical cause !

It may amuse to try to guess the method used before reading the report. What's frightening is how simple it is.

@MichaelPaulukonis

@cpressey
Author

cpressey commented Nov 8, 2014

Indeed! I do hope I'll be part of the mad control group that doesn't get its world taken over when the time comes.

But wouldn't a mad engineer spend their time doing mad calcs to prove that their mad blueprint meets the mad specifications and doesn't violate any mad codes...?

(snaps fingers) Mad blueprint generator! Hmm...

@cpressey
Author

cpressey commented Nov 8, 2014

Here is some output from this morning's experiment, btw.

turn four, for or of to the they this is in an and are as a by It with walk want when who who, grow great dear leave travel three, those, town, work, work forced followed crowded roads rags, alms. all sell sex, six see beg being time native able female employ every These their either mothers honest object fight up, Spain, infants instead Pretender passenger sustenance streets, stroling through thieves themselves helpless beggars country, children, Barbadoes. cabbin-doors melancholy livelihood, importuning

@MichaelPaulukonis

"two hours ago" it says to me at 9 am on an EST Saturday. Where are you, and how can you start so early?

I'm only online because my wife is out of the house and the kids are playing Angry Birds while I tweak some code (nobody has eaten breakfast yet).

I did add more configurable genders, and buff up the wordbank passing mechanism.

Tiny tiny minuscule tweaks. Bigger projects keep getting shunted to the side....

@cpressey
Author

cpressey commented Nov 8, 2014

Well it is important to remember that NaNoGenLab is just one of many (mad) arms of the entire vast (and mad) Cat's Eye Technologies Lab Complex, which spans several hundred square kilometers inside the hollow Earth and has (mad) openings to the surface near Calgary, Reykjavik, Krasnoyarsk, and Venice. But I'm working remotely from near Oxford right now.

Actually, that reminds me that I really ought to open a bug report about this whole "Na = National" thing, because when I glanced over participants' GitHub profiles, I counted at least 6 countries. Well, at least I've found a workaround that works really well for me ("just ignore it") so it's kind of low priority I guess.

Anyway, here's some output from this afternoon's experiment.

Alice and Bob saw a ghost looking pensive.

Then one day, Alice turned to Bob and said, "Bob, do you remember that one time when we saw a ghost looking pensive?" Bob smiled. "Of course I do, Alice."

Then one day, Bob turned to Alice and said, "Alice, do you remember that one time when we remembered that time when we saw a ghost looking pensive?" Alice smiled. "Of course I do, Bob."

Then one day, Bob turned to Alice and asked, "Alice, do you think that one day we will wonder if we'd ever remember that time when we remembered that time when we remembered that time when we remembered that time when we saw a ghost looking pensive?" "I don't know, Bob," said Alice.

Then one day, Bob turned to Alice and asked, "Alice, do you think that one day we will remember that time when we remembered that time when we remembered that time when we remembered that time when we saw a ghost looking pensive?" "I don't know, Bob," said Alice.

Then one day, Bob turned to Alice and said, "Alice, do you remember that one time when we remembered that time when we remembered that time when we saw a ghost looking pensive?" Alice smiled. "Of course I do, Bob."

Then one day, Alice turned to Bob and said, "Bob, do you remember that one time when we remembered that time when we remembered that time when we remembered that time when we saw a ghost looking pensive?" Bob smiled. "Of course I do, Alice."

[edit: fixed names that were incorrectly assigned in parts of the dialogue]
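The experiment's own method isn't spelled out in this thread, but text of this shape could plausibly come from something like the sketch below. Everything here is a guess for illustration: the function names, the random walk on nesting depth, and the decision to cover only the "do you remember" variant (the "do you think that one day we will" paragraphs are left out).

```python
import random

EVENT = "we saw a ghost looking pensive"


def memory(depth):
    """Wrap the base event in `depth` layers of 'we remembered that time when'."""
    return "we remembered that time when " * depth + EVENT


def paragraph(speaker, listener, depth):
    return (
        f'Then one day, {speaker} turned to {listener} and said, '
        f'"{listener}, do you remember that one time when {memory(depth)}?" '
        f'{listener} smiled. "Of course I do, {speaker}."'
    )


def story(n_paragraphs, seed=0):
    rng = random.Random(seed)
    lines = ["Alice and Bob saw a ghost looking pensive."]
    depth = 0
    for _ in range(n_paragraphs):
        # Pick who speaks and who listens in a random order.
        speaker, listener = rng.sample(["Alice", "Bob"], 2)
        lines.append(paragraph(speaker, listener, depth))
        # Random walk on the nesting depth, biased upward, never below zero.
        depth = max(0, depth + rng.choice([-1, 1, 1]))
    return "\n\n".join(lines)


if __name__ == "__main__":
    print(story(5))
```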

@dariusk
Owner

dariusk commented Nov 8, 2014

  1. this event is a NaNoWriMo take-off, so that bug is a WONTFIX as it's based on an upstream dependency
  2. we never said which nation

@MichaelPaulukonis

East Germany? Liechtenstein? The Monastic State of the Teutonic Knights?

Good ol' Bob and Alice!

@cpressey
Author

cpressey commented Nov 9, 2014

Scotireland? Spexico? Grome?

The latest experiment has gone horribly wrong I'm afraid; I'm just lucky nothing exploded, I suppose.

Well, it's not that bad, it's just that it needs a very specific input text before I will be able to make use of it. The text needs to be 215 words long and all the words need to be unique.

So I find myself working on a poem... only 89 words so far, and I've already used "you", "no", "for", "at", "the", and "and". Tricky business, this poetry stuff.

@ikarth

ikarth commented Nov 9, 2014

You can always take an arbitrarily large text, walk through it grabbing only the unique ones that you don't already have, and stop when you have 215.

http://www.peterbe.com/plog/uniqifiers-benchmark
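A minimal sketch of that walk, assuming that "unique" means case-insensitive and that surrounding punctuation doesn't count (both are choices made here, not anything specified above). The function name is made up for the example.

```python
import string


def first_unique_words(text, n):
    """Return the first n distinct words of text, in order of appearance."""
    seen = set()
    result = []
    for raw in text.split():
        # Normalise so that "The" and "the," count as the same word.
        word = raw.strip(string.punctuation).lower()
        if word and word not in seen:
            seen.add(word)
            result.append(word)
            if len(result) == n:
                break
    return result


if __name__ == "__main__":
    print(first_unique_words("The cat and the dog and the bird.", 4))
```

If the text runs out before n distinct words turn up, the list simply comes back short, so a caller wanting exactly 215 should check the length.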

@cpressey
Author

cpressey commented Nov 9, 2014

@ikarth What, and end up with gibberish? I don't think so.

Oh, wait... right. Well anyway, poem's written now. No going back.

The idea (detailed here) was to try to answer the question: If we wanted to submit a novel to NaNoGenMo that was exactly 50,000 words in length, and we wanted to generate it using only permutations or combinations (with repetitions allowed, or not) of r words drawn from a set of n words, which combinatoric method, and what values of r and n should we pick? And let's ignore trivial solutions like P(50000, 1).

Turns out (unless there was a flaw in my maths) that you cannot get 50,000 out of a single non-trivial combinatoric function, although C(317, 2) = 50086 which is quite close, although also, as I realized somewhat late in the research, that is just the number of ways to pick two elements out of 317; if you wanted to count all those elements picked, it would actually be r times that. The closest, taking that into account, is 2*P(159,2) = 50244.

So this led me to ask (and instruct my computer to find out) if there were any two non-trivial combinatoric expressions of the latter sort that, when added up, totalled 50000. Turns out, yes: 3*C(21,3) + 2*C(215,2) = 50000. (And because choose has a symmetry in it, there are three other possibilities, using r = 19 in the first C and/or r = 214 in the second C.)
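The arithmetic is easy to check with Python's `math.comb` and `math.perm` (3.8+). The sketch below first asserts the figures quoted above, then does a brute-force search over pairs of r*C(n,r) terms. The search is a reconstruction, not the script actually used, and it assumes "non-trivial" means 2 <= r <= n-1.

```python
from math import comb, perm


def words_c(n, r):
    """Total words when every r-combination of n words is written out."""
    return r * comb(n, r)


def words_p(n, r):
    """Total words when every r-permutation of n words is written out."""
    return r * perm(n, r)


# Figures quoted in the comment above.
assert comb(317, 2) == 50086
assert words_p(159, 2) == 50244
assert words_c(21, 3) + words_c(215, 2) == 50000
# The symmetry: r and n - r + 1 give the same word count.
assert words_c(21, 19) == words_c(21, 3)
assert words_c(215, 214) == words_c(215, 2)


def solutions(target=50000):
    """Find pairs ((n1, r1), (n2, r2)) whose word counts sum to target."""
    # Map each achievable word count to the (n, r) pairs that produce it.
    counts = {}
    n = 3
    # r = 2 gives the smallest count for a given n, so stop once it overshoots.
    while words_c(n, 2) < target:
        for r in range(2, n):  # assumption: non-trivial means 2 <= r <= n - 1
            total = words_c(n, r)
            if total < target:
                counts.setdefault(total, []).append((n, r))
        n += 1
    found = []
    for total, pairs in counts.items():
        for a in pairs:
            for b in counts.get(target - total, []):
                if a <= b:  # report each unordered pair once
                    found.append((a, b))
    return sorted(found)


if __name__ == "__main__":
    for a, b in solutions():
        print(a, b)
```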

Directly I collected 21 unique words roughly meaning "section of a text", and wrote a 215-unique-word poem, and threw together something to pull all the combinations and output Markdown, and the result is:

3×C(21,3)+2×C(215,2)=50000: The Novel
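For the curious, the word count can be confirmed by actually pulling the combinations with `itertools.combinations`. This is only a sketch: the vocabularies below are numbered placeholders standing in for the 21 section words and the 215-word poem, and the real generator's layout and Markdown output are not reproduced.

```python
from itertools import combinations


def combo_words(words, r):
    """Yield every word of every r-combination of `words`, in order."""
    for combo in combinations(words, r):
        yield from combo


# Hypothetical placeholder vocabularies, not the real word lists.
sections = [f"section{i}" for i in range(21)]
poem = [f"word{i}" for i in range(215)]

novel = list(combo_words(sections, 3)) + list(combo_words(poem, 2))
print(len(novel))  # 50000
```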

@cpressey
Author

cpressey commented Nov 9, 2014

And @ikarth, just to let you know, your suggestion was not made in vain.

@cpressey
Author

Some recent results:

Recursively expanding templates without localizing the variables first:

sheep can't understand sheep can't understand meerkat and sheep can't understand sheep can't understand meerkat can't understand sheep can't understand meerkat can't understand sheep can't understand sheep can't understand meerkat and sheep can't understand sheep can't understand meerkat can't understand sheep can't understand meerkat is no sheep can't understand sheep can't understand meerkat can't understand sheep can't understand meerkat is the sheep can't understand sheep can't understand meerkat and sheep can't understand sheep can't understand meerkat can't understand sheep can't understand meerkat is no sheep can't understand sheep can't understand meerkat can't understand sheep can't understand meerkat of sheep can't understand sheep can't understand meerkat and sheep can't understand sheep can't understand meerkat can't understand sheep can't understand meerkat is the sheep can't understand sheep can't understand meerkat can't understand sheep can't understand meerkat of sheep can't understand sheep can't understand meerkat and sheep can't understand meerkat!

Converting a binary file into a great big number and then treating that number as a phone number mnemonic (you know, like 1-800-GET-LOST):

V LEEK OB X KHZ K WIG IGOR M MEMO YOU O GIN WIKI BUS OK WHAT ALVA VA GELT X SUN M VEX ZN E PINES FM VOLT WOO I IR YAK H IO FIG HI VAIN WU WOLFS EX WOW X TAX M TRULY LUNG I MING R JIG X BOY A CI X TASK I HUI H SITED HG M K K HI OW EGO O CRY EU TI VEX MY LEARY O OB M WADE TL O O GAY LOUT M K K LYX PM NAG EMIT NORA SLY UZI SOLE V TO LYX BLOB X K BLU LN JR PL R SWAIN X SQ THOSE H JR I ODOM JAG MET CANOED MU LN PEI I IKE LU PEGS H K K IBM X MUD TWIG YO HG M BIC AH H LN K CAIN APACE M LN IBM JR M IBM R NB HUBERT DOE RAVELS MYS K H CI I H TOFU HOLY HUI SHY WHY O NOE MYS M TORY M WOO FELONY H NOE GOBI KALI K POI TARE HG BOB H IPAD CLAY K LN MAY MG FLAP X X BARD H TL HOOK TY TI RYE KOOK V MN FOODS HZ X CYST V SOAVE X FIT INK H MN A PORK O HUBS H MEND MG PALLS X MAY ROCHE GAB I I LINGO I OH I GINNY ZELIG X MN X X BRISK TOO MAZE O TY TONES X I K H KHZ TH FOOT IRK WU E K OILY PL STYLI TYCOON LU A OATH M LEVI RX ROOM MG K PL MACY ORR MN H GAY OFT V IO MAVIN MIA I INS NORA FLUTE URDU SHUT O SO SONY X LI PLACE M K TELL E IO I ROB I I I GAMY H LID O GIGS HI DILLS MN M KHZ X FEY KIT HACKS H THAN OB I I OAR DAVIS BELL M K ZIBO AM H LN H LU CHEW EH X LOB X YE LAOS H K HA TAMI I X I LN MODEM AH PU TOGS K K GOO GOG FOOL X TWO IN GWYN MG V LAIN X MEEK R FIT I I RELY YO STEIN X H MN JR BIG LN I AGREE I A AXE O I H HZ H SPREE GLIB X I AT SWAPS OS H K LAD HE HUCK AH K LEN SLUT I HG K MEG M OSCAR KC QUILLS SO ROMPS I X JR X H LBJ PL HI SOPS K HE I IN KIEL DI TOIL K GS MGM ZR GOTH I MANX OB SWIM M LU I K H V COO X MIX KENS H GULP M SO FLINT HONK LODE V AMOS X H M ADO DOOR K LN V SNAG M K HG M PL K WOES KC IO WAGS ZN X K GS M LO EBERT E JILT O O ENG OK COD H K YAK H KW QUAD YO HG M MHZ YELLOW MIX NP PET MOSS K EYCK H EMMY X CADGE CID YO A CHEN K TRIAD I YAK X K KC IO JUN M DIET CULL IO I LUSH TY H LIKE X I AZANA OK LN X ANON LE UGH HA IO I DOCK TIPS H DIOR X GOD IO COT M NEW YUK I OUR A DRAB I HUE O UM JR LUCRE MY LTD OLAF O IN X AWAY CAW GEE STOATS TULSA UNIT GET DUCHY K H GET V GUY YUKON 
DUO I I KW ENRAGE HI OK SIKH M MUG HG CLOD ANN ERGO HE IBIZA V IBO CLOT M NOSY EU MALT K ZN OK ONYX INVAR H FIB H TRAPS BANNS X H MICH CZAR MN LO JOVE JIG FIRM X MES M TEA X COIF TATUM O KOOK ON I CAROL GO BAWDY M GUT OB O TOD LLAMA M LE COZY X WHO IR HA OMIT HUNKS TH K H RIO K H M HOSES I WII EH I NINA V IO ELI OGDEN ZN STOWS X H HULL I BRET K H RYE OK H SLY DIOR V LI GRUS X K JAZZ NAG LN X I DI ILLS PIG TUXES H MN HI IO IVY SWASH I CUBS PU CLIO K UGH GOA MY ETTA VEILS AX PIS NEST TI X HUS SLOB I K UTOPIA V KC I I ETTA I GS H GO LN HENS PM PALE GULP A WRIT O FIVE ZR ION X HULL K MONA I OW HG LASH V LICIT O DISCO V MY MONK FM IGOR ZN TY ENDUE KEGS SI ETON A WIZ X UTE H RIB STANK MY SHUN MN MN X GEO HE TWIG MU X ZN GS TRULY GIBED X WU EH DOUBT HUNG I IO ADDERS I GONNA IF ADZ GEO EH R BIT ISM E CARLO H HIKE M THETA X ROUTE O LOOP K RAGING O HUI LITHE DENY A UGLY I H DOLT FIB IT OHIO LI UBANGI LO I EH X X K FIT O AVIOR R CELIA HUE GOD BOLT MUD WEIR K H GYM

I believe I did say that some of these were going to be bloody awful.

@MichaelPaulukonis

LASH V LICIT O DISCO V MY MONK FM has actually been one of my favorite radio stations for a few years now.

@enkiv2

enkiv2 commented Nov 11, 2014

You could grab out only the sections of that number that are dictionary
words, I guess.


@cpressey
Author

@enkiv2 Ironically, all of those words did come from /usr/share/dict/words -- I don't know why it contains single letters, but apparently it does. Abbreviations, too.

Actually, thinkinaboutit, doesn't a real dictionary usually have entries for "J" and "FM" and such too?

Anyway, I know what you mean, and yes you could throw the whole thing through a filter to clean it up, but then it would lose an interesting property. As it is now, you ought to be able to reconstruct the original binary file from the words.

The whole thing was a hack of course (I'm starting to regret the experiment-a-day goal; it's like speed chess; don't play speed chess, Bobby, it'll ruin you) and I doubt it generates an "optimal" phone number mnemonic. I'm pretty sure it would be possible to do better with some kind of dynamic programming ish solution. (And I'm pretty sure the same applies to a number of other experiments I've done so far as well.)
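As a rough illustration of the dynamic-programming idea and of the reversibility property, here is one way it could go. This is not the experiment's code. It assumes the file's number is written in base 8 so that every digit lands on a keypad key that has letters (2 through 9), it prepends a 0x01 byte so leading zero bytes survive, and it uses a tiny made-up vocabulary in place of /usr/share/dict/words. "Optimal" here just means fewest words.

```python
KEYPAD = {"2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
          "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ"}
LETTER_TO_KEY = {ch: key for key, letters in KEYPAD.items() for ch in letters}


def word_to_keys(word):
    """Translate a word into the keypad digits that spell it."""
    return "".join(LETTER_TO_KEY[ch] for ch in word.upper())


def bytes_to_keys(data):
    """Encode bytes as keypad digits 2-9 (base 8, each digit offset by 2)."""
    # Assumption: a leading 0x01 byte protects any leading zero bytes.
    number = int.from_bytes(b"\x01" + data, "big")
    digits = []
    while number:
        number, d = divmod(number, 8)
        digits.append(str(d + 2))
    return "".join(reversed(digits))


def keys_to_bytes(keys):
    """Invert bytes_to_keys."""
    number = 0
    for k in keys:
        number = number * 8 + (int(k) - 2)
    raw = number.to_bytes((number.bit_length() + 7) // 8, "big")
    return raw[1:]  # drop the protective 0x01 byte


def mnemonic(keys, vocabulary):
    """Split `keys` into the fewest vocabulary words (dynamic programming)."""
    by_keys = {}
    for word in vocabulary:
        by_keys.setdefault(word_to_keys(word), word.upper())
    longest = max(map(len, by_keys))
    # best[i] holds the shortest word list spelling keys[:i], or None.
    best = [[]] + [None] * len(keys)
    for i in range(1, len(keys) + 1):
        for j in range(max(0, i - longest), i):
            word = by_keys.get(keys[j:i])
            if word and best[j] is not None:
                if best[i] is None or len(best[j]) + 1 < len(best[i]):
                    best[i] = best[j] + [word]
    return best[-1]


# Hypothetical vocabulary; one single letter per key guarantees a solution.
VOCAB = ["a", "d", "g", "j", "m", "p", "t", "w", "get", "lost", "owl", "novel"]

keys = bytes_to_keys(b"hi")
words = mnemonic(keys, VOCAB)
restored = keys_to_bytes("".join(word_to_keys(w) for w in words))
assert restored == b"hi"
```

Because each word maps back to exactly one digit string, the words alone are enough to rebuild the file, which is the property that filtering the output would throw away.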

@MichaelPaulukonis Your frequent musical references are sorely tempting me to write a synthesized music generator as one of the experiments. Arguing that a piece of music is a novel is probably beyond even my own post-modernist-conceptual-non-media-specific (lack-of-)sensibilities, though.

@cpressey
Author

@ikarth: to follow up on some things you mentioned on other issue-threads:

Of course, "we're going to try as many techniques as possible" also counts as a single, strong technique.

Really? Here I thought it was a way of procrastinating until I came across a solid concept...

Or maybe it was an experiment in answering the question "Where is the bar set?" by throwing the bar across the room.

Elsewhere, you also said

The valuable thing, to my mind, of having a completely algorithmic process is that it's easy to recreate the process exactly.

As a SCIENTIST I should agree whole-heartedly with the idea that the results ought to be reproducible!

But in my own experiments, too often I've just gone and used Python's pseudo-random number generator without choosing or recording the seed... so the output is not, technically speaking, reproducible. (Not without some sort of brute-force search that I'm sure no one wants to do.) Although it's usually obvious that you're obtaining similar results... (and JavaScript's PRNG doesn't even let you seed it, last time I checked; you have to use one written in JavaScript if you want to do that.)

Need to write some kind of seed-chooser-and-recorder device as a piece of lab equipment. Ah, but there'll be time for that later. I still have one or two more silly ideas, and as long a commute as always...

@ikarth

ikarth commented Nov 25, 2014

Or maybe it was an experiment in answering the question "Where is the bar set?" by throwing the bar across the room.

This is probably the best description of this whole event that anyone has come up with.

Need to write some kind of seed-chooser-and-recorder device as a piece of lab equipment. Ah, but there'll be time for that later. I still have one or two more silly ideas, and as long a commute as always...

I just went to a lot of trouble to set up a stored seed for my own project. Of course, in my case, I had the extra incentive of writing a pure functional system, so the random shuffling was the first thing that broke perfect repeatability.

On the other hand, a lost random seed may be the closest the computer can come to the impermanent: an artifact that has never been generated before and may never be generated again.

@enkiv2

enkiv2 commented Nov 25, 2014

If you need a static seed to generate a worthwhile novel, that's a bug in
your generator -- much better to make the generator robust enough that
it'll occasionally generate gems if you run it enough. That seems to be
more in line with the experimentation going on here.


@cpressey
Author

@enkiv2 I was thinking, something like this. (This is untested and should be considered pseudo-code.)

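import os, random, sys
from datetime import datetime

def autoseed():
    # Take the seed from the environment if set; otherwise pick one at random.
    seed = os.getenv('NANOGENLAB_SEED')
    # int() so that replaying a logged seed seeds the PRNG identically.
    seed = random.randint(0, 1000000) if seed is None else int(seed)
    # Record the seed so a run that produces a gem can be reproduced.
    with open('seed.log', 'a') as f:
        f.write('%s: %s: %s\n' % (sys.argv[0], datetime.now(), seed))
    random.seed(seed)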

This way, it doesn't get in the way, but you can set a specific seed if you want, and (maybe more importantly) when it does produce a gem it will at least write the seed somewhere so that you have a better chance at reproducing it. (Of course, there are yet other variables like "what version of the script was I using", "what input files was I using", etc.)
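To make the payoff concrete, here is a small self-contained sketch of replaying a run from a recorded seed. The `generate` function and its vocabulary are stand-ins invented for the example; it uses a private `random.Random` instance rather than the module-level generator so that nothing else in the program can disturb the sequence.

```python
import random


def generate(seed, n=5):
    """Stand-in for an experiment: pick n words using a privately seeded PRNG."""
    rng = random.Random(seed)
    vocabulary = ["owl", "lollipop", "ghost", "cheese", "shanty", "meow"]
    return [rng.choice(vocabulary) for _ in range(n)]


first = generate(12345)
replay = generate(12345)
# Same seed, same script, same inputs: same output.
assert first == replay
print(" ".join(first))
```

As noted above, the seed is only one of the variables; a change to the script or its input files would still break the replay.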

@enkiv2

enkiv2 commented Nov 25, 2014

The benefit of having the seed is that it's shorter than the novel that
gets generated -- but it still makes more sense to host the interesting
generated novels than to host a list of seeds and the versions that
produced them. To go back to the science angle, I don't think that the seed
should be a variable we need to control for in any case -- because
controlling for the seed isn't reproduction so much as it's history.


@cpressey
Author

We might be talking at cross-purposes here, a bit... hosting the seeds instead of the generated result is definitely not what I had in mind. Maybe I should clarify that, beyond the playing-science trope of chanting "Reproducibility! Yes! (Remember cold fusion, after all!)", my own it'd-actually-be-a-nice-thing-to-have use case for this would be when I have just run

./experiment.py | less

and pressed q before realizing that, wait, that one was kind of cool, I wonder how it ends? OH WELL, GONE NOW.

@MichaelPaulukonis

I worked on getting a random key last year, and just let it slide by the wayside this year.

In the field of generative visual art, it is really really really a nice thing to have.

Also, for unit-testing.

@hugovk
Collaborator

hugovk commented Nov 26, 2014

50,000 Meows was developed test-first.

@cpressey
Author

Tests? This is a lab! This is no place for tests...

Regarding seeds, I'm now inclined to play devil's advocate and suggest that just maybe, in this modern age where everything we do is archived forever in the cloud, we should be grateful for all the ephemerality we can get? Shrug?

Anyway, latest experiment is here and it is a total flop, by which I mean a total success, by which I mean that artists often run experiments but the hypothesis is almost always "I hypothesize that if I try this, the result will be pleasing enough, or at least the experience will be rewarding enough, that it was worth the effort of trying it."

hugovk added the preview label on Nov 27, 2014
@cpressey
Author

Just to report: I had a vague goal that I'd produce a cut-up novel of some sort -- with four experiments run in the name of doing so -- and you can see how far I got with that here:

https://github.com/catseye/NaNoGenLab/blob/master/sensible-paste-up/sample-cheese.jpg

I think it has promise ("cheese, stirring it until is is pneumonia", for example, and "FOUNDER PRECISE ARTIST") but about a week ago I decided it really deserves deeper thought about composition, and better engineering, than remaining time allows. One thing, for example, that would be nice to do, would be to run each snippet through an OCR, and use that information (somehow) when choosing a place to paste it.

Plus running it for, what, 200 pages (or whatever would feel sufficiently 50,000-words-ish) would result in a massive file which I'd have to host somewhere and, ehh, that'd just be more hassle right now. So, maybe next year.

@ikarth

ikarth commented Nov 29, 2014

That idea has potential, but I can see why you're holding off.

@cpressey
Author

I got to try my hand at procedural image processing anyway, which is not something I'd ever really done before. And learned a bit about using PIL. So that's something.

It has been a good year. A fun month -- an exhausting month, in many ways --

fwiw I do not recommend the experiment-a-day approach, unless you just have way too many ideas and want to surprise yourself by how quick-and-dirty you are willing to code, to get them down.

And, stupidly, I seem to have even more ideas now. Arrgh.

Well, next year... arrgh. Next year is eleven months away! Well, what about the off-season? Dunno; last year after NaNoGenMo I (mercifully?) lost my taste for generated text, but now...

One thing I'm tempted to do is to extract the possibly-useful "lab equipment" into some kind of reusable library-slash-suite of utilities. The name NaNoGenLib suggests itself, but maybe that's a bit presumptuous. KTLN, a toolkit for unnatural language processing also suggests itself, especially if I can think of a better backronym than "Kitten Talks Like Nixon". Shrug?

also fwiw @hugovk I don't think this issue deserves a "Completed" tag, due to its tangential nature. I've been following your lead and opening separate issues for each novel. And actually, since I uploaded them all as gists, a handy index / summary can be found here: https://gist.github.com/cpressey/

@cpressey
Author

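more fwiw: At the request of a friend, I translated the uniquifier experiment (https://github.com/catseye/NaNoGenLab/tree/master/uniquified-novel) to JavaScript and put it online here: Text Uniquifier (http://catseye.tc/installation/Text_Uniquifier).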

Also, this is neither here nor there, but I just noticed that:

  • NaNoWriMo's slogan is "The world needs your novel";
  • NaNoWriMo does not require you to share your novel with anyone at all at the end;
  • NaNoGenMo does not make any claims about whether the world needs or does not need your generator or the novels it generates;
  • NaNoGenMo does require you to share your generator and at least one novel at the end.

I noticed this while looking for public results from NaNoWriMo this year (y'know, to compare notes, sort of.) I haven't yet found any, although granted I haven't spent a lot of time hunting yet. The NaNoWriMo site has links to authors' websites, most of whom are "published for-reals" and have, at best, a link to an ebook for you to purchase -- sometimes, from a draft completed during NaNoWriMo.

Take this for what you will, my only point is: different.

@MichaelPaulukonis

If anybody is interested in text manipulation in the off-season, it is an interest of mine, and I would enjoy collaborating or bouncing ideas around.

@hugovk removed the preview label Nov 30, 2014
@hugovk
Collaborator

hugovk commented Nov 30, 2014

@cpressey It had a "preview" label, but I've now de-labelled it.

@ikarth

ikarth commented Nov 30, 2014

@MichaelPaulukonis There's probably enough interest to establish some kind of communication channel for that, if someone organizes it.

@moonmilk

@cpressey "I do not recommend the experiment-a-day approach"

I'll just put this here... https://www.flickr.com/photos/ranjit/collections/72157627384812764/

@moonmilk

@MichaelPaulukonis I am interested year-round!

@cpressey
Author

@moonmilk Indeed. I think I would get funny looks from the other commuters if I were to try that on the train. (well, funniER.)

@MichaelPaulukonis Consider me interested too, at least enough to lurk on said communications channel...

@MichaelPaulukonis

I started up an out-of-season repo last year @ https://github.com/TextGenTex/TextGenTex

I'm certainly open to "better" communication channels.

@enkiv2

enkiv2 commented Dec 1, 2014

I'm definitely interested in off-season experiments as well. I've
registered #nanogenmo on freenode (since this is a primary channel, I'll
cede ownership to @dariusk if he requests it). IRC seems like it would open
up some interesting avenues of experimentation, seeing as how interactive
text generators could interact with each other organically :-)

@moonmilk

moonmilk commented Dec 1, 2014

How about a google group for the off-season text stuff? They're free and
fairly user-friendly.

IRC is nice, but even if someone is archiving it, it's much harder to look
through the archives and learn stuff than from the more structured records
of a google group or other mailing list.

-r

@enkiv2

enkiv2 commented Dec 1, 2014

I'd join a google group for this if someone produced one. I'm treating
these issue threads as mailing lists anyhow.

@MichaelPaulukonis

https://groups.google.com/d/forum/generativetext

or

generativetext@googlegroups.com

assuming I've set up the settings correctly. Which seems unlikely.

@enkiv2

enkiv2 commented Dec 1, 2014

It seems OK, other than being a private group.

