Resources #1
Use ConceptNet to start writing your own Tale-Spin-like stories. "There was a kitten. The kitten was someone's pet. The kitten wanted to explore. Surfing the web is used for exploring. You need to connect to the internet in order to surf the web." |
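A minimal sketch of how the kitten example above could be produced. The triples are hand-written and hypothetical, in the style of ConceptNet relations (`IsA`, `Desires`, `UsedFor`, `HasPrerequisite`); a real run would pull them from ConceptNet instead. The `tell` function and the sentence templates are my own illustration, not part of ConceptNet.

```python
# Hypothetical, hand-written triples in the style of ConceptNet relations.
FACTS = [
    ("kitten", "IsA", "pet"),
    ("kitten", "Desires", "explore"),
    ("surf the web", "UsedFor", "explore"),
    ("surf the web", "HasPrerequisite", "connect to the internet"),
]

# One sentence template per relation; {a} is the subject, {b} the object.
TEMPLATES = {
    "IsA": "The {a} was someone's {b}.",
    "Desires": "The {a} wanted to {b}.",
    "UsedFor": "You can {a} in order to {b}.",
    "HasPrerequisite": "You need to {b} in order to {a}.",
}


def tell(start, facts=FACTS, max_sentences=10):
    """Walk the facts breadth-first from `start`, one sentence per fact."""
    sentences = [f"There was a {start}."]
    frontier, used = [start], set()
    while frontier:
        topic = frontier.pop(0)
        for fact in facts:
            a, rel, b = fact
            if fact in used or len(sentences) >= max_sentences:
                continue
            if a == topic:
                nxt = b  # follow the fact forward from its subject
            elif rel == "UsedFor" and b == topic:
                nxt = a  # a goal leads to a means of reaching it
            else:
                continue
            used.add(fact)
            sentences.append(TEMPLATES[rel].format(a=a, b=b))
            frontier.append(nxt)
    return sentences


if __name__ == "__main__":
    print(" ".join(tell("kitten")))
```

Each fact is used at most once, so the walk always terminates; getting to 50,000 words is mostly a matter of feeding it far more triples.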
I also feel like there may be relevant things in Michael Cook's list of procedural generation tutorials. |
I recommend, in addition to ConceptNet, CMU NELL. It's another ontology (There's also the potential for some of us to use IBM Watson, probably. I
|
Project Gutenberg is now being mirrored on Github -- https://github.com/GITenberg -- although exactly what benefit that may or may not bring to you, the potential NaNoGenMo participant, I cannot say. I mention it mainly because I'm not sure if it was around last time -- I only noticed it recently (a month or two ago.) |
Generate a sufficiently complicated graph, and then explicate it for 50,000 words with wordgraph? https://wordgraph.readthedocs.org/en/latest/ |
http://www.ark.cs.cmu.edu/TweetNLP/
|
Got a postcard from Blurb yesterday: if you want to print your finished novel, get 20% off using the code Expires November 30, so you'll have to finish a little early... |
2nding Tweet NLP--I'm doing some twitter-related stuff, and hacked around on it via Clojure for a bit last night. It's not terribly well-documented outside command-line stuff, but the source code is clear and well-commented. I also can't recommend NLTK strongly enough. It's definitely intimidating if you're not familiar with NLP, but that's mostly because it's got so much stuffed into it--wordnet, several phenomenal corpora (including a bunch of Project Gutenberg stuff), markov generators, & more. NLTK3 is even Py3K-compatible, for all of you who really want to put some unicode in your novels/hate |
2nding @swizzard about NLTK (though I just uninstalled NLTK3 in favor of 2.04 because Markov generators are not included in NLTK3), and adding Pattern to the mix. Pattern has linguistics (for parsing and information extraction based on chunk/word patterns), web-search (Google, Bing, Yahoo, Twitter, Wikipedia), web-crawling, language modelling (TF-IDF), classification, and commonsense reasoning. They're pretty approachable using the included examples. The website has some usage examples. patent-generator uses the search module from Pattern to generate patents from literary texts. On the literary side I would like to mention two resources on Ubuweb: its Anthology of Conceptual Writing and /ubu editions. There's The first thousand numbers classified in alphabetical order by Claude Closky, Name, A Novel by Toadex Hobgrammathon and All the Numbers from Numbers by Kenneth Goldsmith, and possibly more stuff which could have (or has) been done by computer. |
I had no idea they'd taken Markov generators out of NLTK3! That's such a bummer. Pattern sounds great though. I'll definitely be looking into it. |
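If the missing Markov generator is the only thing keeping you on the older NLTK, a word-level chain is small enough to write by hand. This is a minimal standard-library sketch, not NLTK's implementation; `build_chain`, `generate`, and the `order` and `seed` parameters are names chosen here for illustration.

```python
import random
from collections import defaultdict


def build_chain(words, order=2):
    """Map each run of `order` words to the list of words that follow it."""
    chain = defaultdict(list)
    for i in range(len(words) - order):
        chain[tuple(words[i:i + order])].append(words[i + order])
    return chain


def generate(chain, length=50, seed=None):
    """Random-walk the chain and return `length` words joined by spaces."""
    rng = random.Random(seed)
    states = list(chain)
    state = rng.choice(states)  # raises IndexError if the chain is empty
    out = list(state)
    while len(out) < length:
        followers = chain.get(state)
        if not followers:
            # Dead end (this state only ended the corpus): restart elsewhere.
            state = rng.choice(states)
            out.extend(state)
            continue
        nxt = rng.choice(followers)
        out.append(nxt)
        state = state[1:] + (nxt,)
    return " ".join(out[:length])


if __name__ == "__main__":
    corpus = "the cat sat on the mat and the cat ran off with the hat".split()
    print(generate(build_chain(corpus, order=1), length=20))
```

Higher `order` values stay closer to the source text; on a small corpus anything above 2 mostly reproduces it verbatim.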
What's a good way to create a PDF book from plain text? Are there any handy (Python) scripts to generate a half-decent PDF by throwing a bunch of text at it? Or is it better to export to PDF from Libre/Open Office? |
@hugovk This is a biased opinion of course, but I like http://johnmacfarlane.net/pandoc/ for document format conversions. It's written in Haskell, which may or may not be your cup of tea, but you can shell it out from Python. I'm sure there are Python-specific tools for generating PDFs too (generating a PDF is actually not that difficult, especially if it's just text, no images), but I don't know any offhand. |
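Shelling out to pandoc from Python could look roughly like this. It is a sketch that assumes `pandoc` is on your PATH; the file names are placeholders, and `pandoc_command` and `convert` are helper names made up for this example. Note that pandoc needs a LaTeX engine installed to write PDF output.

```python
import shutil
import subprocess


def pandoc_command(source, target):
    """Build the argument list; pandoc infers formats from the extensions."""
    return ["pandoc", source, "-o", target]


def convert(source, target):
    """Run pandoc, failing loudly if it is missing or exits with an error."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc not found on PATH")
    subprocess.run(pandoc_command(source, target), check=True)


if __name__ == "__main__":
    # Placeholder file names: swap in whatever your generator writes out.
    print(" ".join(pandoc_command("novel.txt", "novel.pdf")))
```

Passing the command as a list rather than a single string avoids quoting problems with file names that contain spaces.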
@hugovk – I've tried, with limited success, to use ReportLab. It's very confusing and hard to get nice formatting. I would suggest generating a .txt file, then using Word/InDesign to format and export, or Ghostscript on the command line, or called from Python at the end of your script. |
@hugovk, Consider generating markup instead of text. I typically generate html, open it in a browser, and then use the print to PDF function. This time around I might generate LaTeX markup instead, which renders very nicely to PDF. |
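Generating HTML instead of plain text takes very little code. A minimal sketch, assuming the novel is already a list of paragraph strings; `to_html` is a made-up helper name, and the page skeleton is just one reasonable choice.

```python
import html


def to_html(title, paragraphs):
    """Wrap paragraphs in a bare-bones HTML page, escaping &, < and >."""
    body = "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)
    safe_title = html.escape(title)
    return (
        "<!DOCTYPE html>\n<html>\n<head>\n"
        '<meta charset="utf-8">\n'
        f"<title>{safe_title}</title>\n"
        "</head>\n<body>\n"
        f"<h1>{safe_title}</h1>\n{body}\n"
        "</body>\n</html>\n"
    )


if __name__ == "__main__":
    page = to_html("My Generated Novel", ["It was a dark & stormy night."])
    with open("novel.html", "w", encoding="utf-8") as f:
        f.write(page)
```

Open the resulting file in a browser and use print to PDF from there.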
@hugovk Seconding Darius on LaTeX - my preferred intermediary format from plain text is Fletcher Penney's MultiMarkdown, which is reasonably painless once up and running. Happy to help if I can! |
Thanks for all the suggestions! I think I'll keep it simple and just use plain text and PDF. ReportLab looks like a powerful library (and yes, confusing) for making PDFs in Python. Here's one script intended for turning Python source code into a PDF, but Here's another example (via). Finally, here's another that doesn't use ReportLab or any special libraries. |
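For a sense of what the no-library route involves, here is a minimal sketch that writes a one-page PDF by hand. It is my own illustration, not the linked script, and it makes several simplifying assumptions: ASCII text, US Letter page size, the built-in Courier font, and no line wrapping or page breaks. `text_to_pdf` is a name chosen for this example.

```python
def text_to_pdf(lines, font_size=12, leading=14):
    """Return the bytes of a single-page PDF showing `lines` in Courier."""

    def esc(s):
        # Backslashes and parentheses must be escaped inside PDF strings.
        return s.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")

    # Content stream: start near the top-left, then one line per T* step.
    ops = ["BT", f"/F1 {font_size} Tf", f"{leading} TL", "72 720 Td"]
    ops += [f"({esc(line)}) Tj T*" for line in lines]
    ops.append("ET")
    stream = "\n".join(ops).encode("ascii", "replace")

    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>",
        b"<< /Length %d >>\nstream\n" % len(stream) + stream + b"\nendstream",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Courier >>",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of each object, for the xref
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"

    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # every xref entry is exactly 20 bytes
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


if __name__ == "__main__":
    with open("novel.pdf", "wb") as f:
        f.write(text_to_pdf(["Chapter 1", "", "There was a kitten."]))
```

A whole novel would need one page object per page plus your own line wrapping, which is about the point where ReportLab or pandoc starts to look attractive.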
In case anybody is particularly masochistic, here's a science fiction plot
|
Actually, MultiMarkdown looks good, but can someone point me in the right direction for going from there via LaTeX to PDF? |
From the man page for pdftex(1): NAME SYNOPSIS DESCRIPTION
create PDF files as well as DVI files.
TeX engine.
PDF output has been enabled. The pdftex command uses the equivalent of the
initex and virtex commands. In this installation, if the links exist, they
graphics formats. pdfTeX cannot include PostScript or Encapsulated
|
Quoting Anyway I believe the question was how the whole mmd -> LaTeX -> PDF pipeline would look. |
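A rough sketch of the mmd -> LaTeX -> PDF pipeline as two command-line steps driven from Python. Treat the details as assumptions to check against `multimarkdown --help` on your install: I'm assuming a `multimarkdown` binary that accepts `-t latex` and `-o`, and that `pdflatex` is on your PATH. `pipeline_commands` and `run_pipeline` are hypothetical helper names. As I understand it, MultiMarkdown may emit only a LaTeX fragment unless the document's metadata asks for a complete document, so the second step can fail on a file with no metadata.

```python
import subprocess
from pathlib import Path


def pipeline_commands(markdown_path):
    """Return the commands for MultiMarkdown -> LaTeX -> PDF, in order."""
    src = Path(markdown_path)
    tex = src.with_suffix(".tex")
    return [
        # Assumed flags: -t picks the output format, -o names the output file.
        ["multimarkdown", "-t", "latex", "-o", str(tex), str(src)],
        ["pdflatex", str(tex)],
        # A second pass lets LaTeX resolve the table of contents and references.
        ["pdflatex", str(tex)],
    ]


def run_pipeline(markdown_path):
    """Run each step in turn, stopping at the first failure."""
    for cmd in pipeline_commands(markdown_path):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    for cmd in pipeline_commands("novel.md"):
        print(" ".join(cmd))
```

Printing the commands first, as the example does, is a cheap way to confirm the file names before anything runs.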
All this talk about generating PDFs got me remembering some old code I wrote to do just that, in Lua, which I decided to dig out of the attic and throw up on Github today. Of course, it doesn't do any of that layout kind of stuff with the spacing and the kerning and the orphans and the gutters and the suchlike, but if anyone is planning on using Lua to generate a 500-page long piece of concrete poetry ... well, it might be marginally more useful than the kinds of things my cat throws up, anyway. |
For PDF support with potentially rich layout that's also developer-friendly, I recommend using CSS3 Paged Media: http://alistapart.com/article/building-books-with-css3. The best tools are commercial and expensive, but they typically have trials. |
For I got an alligator as a pet, I had python generate really simple markdown, used a simple online converter to turn the markdown to html, and then printed the html to a PDF. It wasn't pretty but it was easy! |
If you want or need HTML to PDF creation, the easiest way I have found is to use PhantomJS, I've written a HTML2PDF as a service. You can click the deploy to Heroku and have your own running in minutes. http://github.com/optional-is/html2pdf |
Since we've brought up interactive novels, Curveship is a system for interactive narrative simulation written in Python. I imagine you could conceivably use it as part of a non-interactive novel generation process. https://github.com/nickmontfort/curveship |
If you want OCR text/images from newspaper pages, I made a simple Python wrapper around the Chronicling America API, full of scanned newspapers. From each search result you get an The Library of Congress is also on Flickr, along with over a million images from the British Library, 2.6m Internet Archive Book Images and scores more also in Flickr Commons. |
Another corpus that people might look at is the archive of all state of the If nothing else, a markov model fed with the speeches of very dissimilar
|
http://www.qdl.qa/en has lots of scanned material (letters from the India Company, etc.) although you'd have to figure out how to best scrape it and OCR it if you wanted to use the actual words. It was brought to my attention by this BBC News article. Incidentally, if you are looking for a name for your generator, you could probably do worse than naming it "Warris Ali": |
I found this handy script that takes MultiMarkdown and uses http://ianhocking.com/2013/06/23/writing-a-novel-using-markdown-part-two/ |
There's also https://rawgit.com/
Compare: |
Here's a tip, in case anyone is looking for last-minute text source ideas. Maybe this is common knowledge, but I only found out how to do this recently ... You know how Twitter search results only return tweets from the last week or so? They're making improvements on the web and desktop interface, but last time I checked, the REST API was still loading the "one-week" index and not the full index you can access elsewhere. So what to do? Well, topsy.com has a searchable full index of tweets that can be sorted by date. The web interface is limited in various ways, though, and a "pro" account is super expensive. BUT the website's search interface gets its results from Topsy's API, and the request URL (search.js) includes an API key. You can query that directly and get JSON. You can also tweak the parameters to get up to 100 results per chunk. I thought maybe the API key was temporary since it's just exposed, but I've been using the same one for about a week. Again, maybe this is common knowledge, but if not, hope it helps someone else! |
A non-text-generation resource: Noticed that we're getting near the end of the month and a couple people mentioned not knowing how to upload their source code. For those who want to upload their source code to Github, but have no idea how to use git, a couple of resources: http://www.sourcetreeapp.com/ Both of these will give you a GUI that lets you interface with git without having to learn all of the commands. Makes it easy for a beginner to get started with source code control. You can, of course, just put up a zip file with your source code, or post it as a gist, but I wanted everyone to be aware of some of the resources that make things a lot easier. |
This is a little late in the game, but maybe someone will do something with Somebody could probably use this to generate stories out of individual
|
For those of you who want to convert your text output to another format (and don't already have a library to do it in code) some tools that I've found: |
sorry this is very late but I was asked to bring these resources over here. maybe useful for next year? a list of nouns, and some word lists that are presumably hacker resource files... but i found the long list of names really exhaustive and useful. |
https://github.com/marcoguerini/DepecheMood/releases Terms with their
|
Hi, I assume this (NaNoGenMo 2014) has ended already. Was the deadline Dec 1, 2014? |
@jkervinen Yep, it ran from 12:01am GMT on Nov 1st 2014 to 12:01am GMT Dec 1st 2014. Welcome back in November 2015. There's also an out-of-season email group: #155 |
@hugovk Thanks, participating in November 2015, and checking the email group! |
Not long until NaNoGenMo 2015! And just in time, @samplereality is amongst the instructors of a free, online, six-week course on Electronic Literature, starting today. Sounds perfect for NaNoGenMo inspiration!
https://www.edx.org/course/electronic-literature-davidsonx-d004x |
This is an open issue where you can comment and add resources that might come in handy for NaNoGenMo.
There are already a ton of resources on the old resources thread for the 2013 edition.