Entry: The Days Left Forebodings and Water #119

Open
lizadaly opened this Issue Nov 22, 2016 · 2 comments


lizadaly commented Nov 22, 2016

Blackout generates pages of text from book or newspaper scans in the style of Newspaper Blackout Poetry, popularized by Austin Kleon (cf. A Humument by Tom Phillips).

Blackout does the following:

  1. Take, as input, an image of text, from a newspaper or book.
  2. Run OCR against the image, identifying the words and their bounding boxes.
  3. Feed the extracted text into a natural language parser, categorizing each part of speech.
  4. Given one of many randomly selected Tracery grammars, select words from the current page that match the parts of speech of that grammar.
  5. Draw around those words and "scribble" out all other text on the page image.
  6. Output the final page as a new image.
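Steps 4 and 5 can be sketched roughly as follows. This is only an illustration, not Blackout's actual code: it assumes OCR and part-of-speech tagging have already produced a list of words with bounding boxes and tags, and it stands in for an expanded Tracery grammar with a plain list of tags. The names `Word`, `select_words`, and `split_page` are hypothetical, and the greedy left-to-right match is an assumed strategy.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Word:
    text: str
    pos: str                         # coarse part-of-speech tag, e.g. "NOUN"
    box: Tuple[int, int, int, int]   # (left, top, right, bottom) in pixels


def select_words(words: Sequence[Word], pattern: Sequence[str]) -> Optional[List[int]]:
    """Greedily match each tag in `pattern` to the next word with that tag.

    Words are assumed to be in reading order. Returns the indices of the
    kept words, or None if the page cannot satisfy the pattern.
    """
    kept: List[int] = []
    start = 0
    for wanted in pattern:
        for i in range(start, len(words)):
            if words[i].pos == wanted:
                kept.append(i)
                start = i + 1
                break
        else:
            return None  # ran out of words before filling this slot
    return kept


def split_page(words: Sequence[Word], pattern: Sequence[str]):
    """Return (words to outline, words to scribble out), or None on no match."""
    kept = select_words(words, pattern)
    if kept is None:
        return None
    kept_set = set(kept)
    keep = [words[i] for i in kept]
    scribble = [w for i, w in enumerate(words) if i not in kept_set]
    return keep, scribble


# Hypothetical page: the tags and boxes are made up for illustration.
page = [
    Word("the", "DET", (10, 10, 40, 30)),
    Word("days", "NOUN", (45, 10, 90, 30)),
    Word("were", "VERB", (95, 10, 140, 30)),
    Word("left", "VERB", (145, 10, 185, 30)),
    Word("in", "ADP", (190, 10, 205, 30)),
    Word("water", "NOUN", (210, 10, 265, 30)),
]
result = split_page(page, ["DET", "NOUN", "VERB", "NOUN"])
if result is not None:
    keep, scribble = result
    print(" ".join(w.text for w in keep))
```

If a page cannot fill the pattern, the sketch returns `None`; a caller could then try another grammar or skip the page.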

Pen width, line length, line direction, number of strokes, and stroke opacity are all randomly fuzzed. The pen color is always black, except in rare cases, when it is blood red.
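One way to picture the fuzzing is a small generator of stroke parameters for each scribbled-out word. This is a sketch under assumptions, not the project's implementation: the numeric ranges, the 2% chance of red, the RGB value used for "blood red", and the names `Stroke` and `random_strokes` are all invented for illustration.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple

BLACK = (0, 0, 0)
BLOOD_RED = (138, 3, 3)  # assumed RGB value; the source only says "blood red"


@dataclass
class Stroke:
    width: int                   # pen width in pixels
    length: float                # stroke length in pixels
    angle_deg: float             # tilt away from horizontal
    opacity: float               # 0.0 (transparent) to 1.0 (opaque)
    color: Tuple[int, int, int]  # RGB


def random_strokes(rng: random.Random, box_width: int,
                   red_chance: float = 0.02) -> List[Stroke]:
    """Generate a randomly fuzzed set of pen strokes for one word box.

    All ranges below are assumptions chosen for the example.
    """
    # Pick the pen color once so all strokes over a word share it.
    color = BLOOD_RED if rng.random() < red_chance else BLACK
    count = rng.randint(3, 8)  # number of strokes
    return [
        Stroke(
            width=rng.randint(2, 6),
            length=box_width * rng.uniform(0.9, 1.2),
            angle_deg=rng.uniform(-4.0, 4.0),
            opacity=rng.uniform(0.6, 1.0),
            color=color,
        )
        for _ in range(count)
    ]


# Seeding the generator makes a given page reproducible.
strokes = random_strokes(random.Random(2016), box_width=120)
print(len(strokes), strokes[0].color)
```

Passing in a `random.Random` instance rather than using the module-level functions keeps the randomness controllable, which helps when regenerating a page that turned out well.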

The work:

"The Days Left Forebodings and Water"

The source material is A Vindication of the Rights of Woman by Mary Wollstonecraft (1792).

Read The Days Left Forebodings and Water. It is 45 pages long and consists of entries that were generated randomly, then hand-picked and ordered on November 9, 2016.

(The full NaNoGenMo entry of ~50,000 words is a 9.3GB PDF of nearly 10,000 pages and is no longer available for download.)

Full source code and more examples: https://github.com/lizadaly/blackout

enkiv2 Nov 22, 2016

Oh man, this is so cool. Blackout poetry has been done before but only as
redaction of particular words as far as I can tell; actually scribbling out
portions of an image of a page is a really neat idea.

On Mon, Nov 21, 2016 at 8:22 PM Liza Daly notifications@github.com wrote:

Blackout https://github.com/lizadaly/blackout generates pages of text
from book or newspaper scans in the style of Newspaper Blackout Poetry
http://newspaperblackout.com/, popularized by Austin Kleon
https://twitter.com/austinkleon (c.f. A Humument
http://tomphillipshumument.tumblr.com/ by Tom Phillips).

Blackout does the following:

  1. Take, as input, an image of text, from a newspaper or book.
  2. Run OCR https://github.com/jflesch/pyocr against the image,
    identifying the words and their bounding boxes.
  3. Feed the extracted text into a natural language parser
    https://spacy.io/, categorizing each part of speech.
  4. Given one of many randomly selected Tracery
    https://github.com/aparrish/pytracery grammars, select words from
    the current page that match the parts of speech of that grammar.
  5. Draw around those words and "scribble" out all other text on the
    page image.
  6. Output the final page as a new image.

Pen width, line length, line direction, number of strokes, and stroke
opacity are all randomly fuzzed. The pen color is always black, except in
rare cases it is blood red.

The work:
"The Days Left Forebodings and Water"
https://github.com/lizadaly/blackout/blob/master/images/title.png?raw=true

The source material is A Vindication of the Rights of Women
https://en.wikipedia.org/wiki/A_Vindication_of_the_Rights_of_Woman by
Mary Wollstonecraft (1792).

Read The Days Left Forebodings and Water
https://s3.amazonaws.com/worldwritable/nanogenmo2016-short.pdf. 45
pages long, consists of entries that were generated randomly, but
hand-picked and ordered on November 9, 2016.
https://github.com/lizadaly/blackout/blob/master/images/4.png?raw=true
https://github.com/lizadaly/blackout/blob/master/images/3.png?raw=true
https://github.com/lizadaly/blackout/blob/master/images/7.png?raw=true

(The full NaNoGenMo entry of ~50,000 words is a 9.3GB PDF
https://s3.amazonaws.com/worldwritable/nanogenmo2016-9g-long.pdf of
nearly 10,000 pages. You almost certainly do not want to download it.)

Full source code and more examples https://github.com/lizadaly/blackout




anjabeth Nov 27, 2016

Whoa, I love this! Particularly what you did with making it look like "real" blackout poetry with the penstrokes and everything. Haven't read the whole "The Days Left Forebodings and Water", but I'm excited to (and what a great title)

