
RWET Midterm Project

Scripts:

  • setup.sh: creates a virtualenv environment and installs the editdistance module.
  • run.sh: activates the virtualenv environment, preprocesses the texts and generates the poems.

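Four programs:

  • cleanup_text.py: splits a source text on punctuation instead of its original lines
  • cmudict_to_json.py: converts cmudict to JSON format
  • process_voynich.py: reads a single transcription from Voynich manuscript data files
  • translate.py: translates a manuscript into rhythmically plausible lines from a source text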

Three generated poems:

cleanup_text.py

cleanup_text.py reads a text from standard input, joins all lines together, and then inserts a newline after each punctuation mark: period (.), exclamation mark (!), question mark (?), comma (,), dash (-), colon (:) and semicolon (;). The modified text is printed to standard output.

$ ./cleanup_text.py --help
usage: cleanup_text.py [-h]

Cleans up a source text by splitting on punctuation instead of original lines

optional arguments:
  -h, --help  show this help message and exit

Example

$ echo 'What time. Such food.' | ./cleanup_text.py
What time.
Such food.
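
The behaviour described above can be sketched in a few lines of Python. This is a minimal illustration, not the code in cleanup_text.py: the function name `split_on_punctuation` and the exact regular expression are assumptions, and the real script may treat whitespace or dashes differently.

```python
import re

def split_on_punctuation(text):
    """Join all lines, then start a new line after each punctuation mark.

    Hypothetical helper; the punctuation set comes from the README text.
    """
    # Collapse newlines and runs of whitespace into single spaces.
    joined = " ".join(text.split())
    # After any of . ! ? , - : ; drop trailing spaces and insert a newline.
    broken = re.sub(r"([.!?,\-:;])\s*", r"\1\n", joined)
    return broken.strip()

if __name__ == "__main__":
    import sys
    print(split_on_punctuation(sys.stdin.read()))
```

Note that under this assumption a hyphenated word is also broken at the hyphen, since the dash is in the punctuation list.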

cmudict_to_json.py

cmudict_to_json.py reads a cmudict specification of word pronunciations from standard input and writes the equivalent JSON to standard output.

$ ./cmudict_to_json.py --help
usage: cmudict_to_json.py [-h]

Converts cmudict to JSON format

optional arguments:
  -h, --help  show this help message and exit

Example

$ ./cmudict_to_json.py 
;;; this is a comment
WHATEVER W HH AE0 T EH1 V ER0
{
  "whatever": [
    [
      "W", 
      "HH", 
      "AE0", 
      "T", 
      "EH1", 
      "V", 
      "ER0"
    ]
  ]
}
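
A conversion like the one shown above could be sketched as follows. This is an assumption about the approach, not the contents of cmudict_to_json.py: the function name `cmudict_to_dict` is hypothetical, and the handling of alternate entries such as `WORD(1)` (which cmudict uses for additional pronunciations) is a guess at what the list-of-lists output is for.

```python
import json
import re

def cmudict_to_dict(lines):
    """Map each lowercased word to a list of pronunciations (phoneme lists)."""
    entries = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and ';;;' comment lines.
        if not line or line.startswith(";;;"):
            continue
        word, *phonemes = line.split()
        # Assumption: fold alternates like WORD(1) into the base word.
        word = re.sub(r"\(\d+\)$", "", word).lower()
        entries.setdefault(word, []).append(phonemes)
    return entries

if __name__ == "__main__":
    import sys
    print(json.dumps(cmudict_to_dict(sys.stdin), indent=2))
```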

process_voynich.py

process_voynich.py reads a Voynich manuscript transcription from standard input. Where the file holds several transcriptions of the same manuscript line, it keeps one of them, replaces all special markers with spaces, and prints the result to standard output.

$ ./process_voynich.py --help
usage: process_voynich.py [-h]

Reads a single transcription from Voynich manuscript data files

optional arguments:
  -h, --help  show this help message and exit

Example

$ ./process_voynich.py 
<f1r.P1.1;H>       fachys.ykal.ar.ataiin.shol.shory.cth!res.y.kor.sholdy!-
<f1r.P1.1;C>       fachys.ykal.ar.ataiin.shol.shory.cthorys.y.kor.sholdy!-
#
<f1r.P1.2;H>       sory.ckhar.o!r.y.kair.chtaiin.shar.are.cthar.cthar.dan!-
<f1r.P1.2;C>       sory.ckhar.o.r.y.kain.shtaiin.shar.ar*.cthar.cthar.dan!-
fachys ykal ar ataiin shol shory cth res y kor sholdy  
sory ckhar o r y kair chtaiin shar are cthar cthar dan
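
The example above is consistent with keeping the first transcription listed for each manuscript line and turning every non-letter into a space. The sketch below implements that reading. It is an assumption based only on the sample input and output: the name `first_transcriptions`, the regular expression, and the choice of "first one wins" are not taken from process_voynich.py.

```python
import re

# Matches lines such as "<f1r.P1.1;H>       fachys.ykal...": locus, transcriber, text.
LINE_RE = re.compile(r"^<([^;>]+);(\w)>\s+(.*)$")

def first_transcriptions(lines):
    """Keep the first transcription seen for each locus, markers as spaces."""
    seen = set()
    result = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # comment lines ("#") and anything else unrecognised
        locus, _transcriber, text = match.groups()
        if locus in seen:
            continue  # a later transcriber's version of the same line
        seen.add(locus)
        # Assumption: every character that is not a letter is a marker.
        result.append(re.sub(r"[^A-Za-z]", " ", text))
    return result

if __name__ == "__main__":
    import sys
    for cleaned in first_transcriptions(sys.stdin):
        print(cleaned)
```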

translate.py

translate.py reads an ancient untranslated manuscript from standard input and prints a rhythmically plausible translation into English to standard output, based on syllabic analysis. In addition to the manuscript, the program reads a dictionary of pronunciations from the file given by the --dictionary argument and a source text providing possible translations from the file given by the --source argument. The output is plain text by default, or HTML when --html is given.

$ ./translate.py --help
usage: translate.py [-h] -d DICTIONARY [--html] [-i IMAGE] [-s SOURCE]

Translates a manuscript into rhythmically plausible lines from source text

optional arguments:
  -h, --help            show this help message and exit
  -d DICTIONARY, --dictionary DICTIONARY
                        the pronunciation dictionary
  --html                output to html
  -i IMAGE, --image IMAGE
                        the html image to precede the text
  -s SOURCE, --source SOURCE
                        the source text

Example

$ ./translate.py --dictionary temp/cmudict07a.json --source temp/odyssey.txt
fachys ykal ar ataiin shol shory cth res y kor sholdy  
sory ckhar o r y kair chtaiin shar are cthar cthar dan
fachys ykal ar ataiin shol shory cth res y kor sholdy
ULYSSES ORDERED THEM ABOUT AND MADE THEM DO THEIR WORK QUICKLY, (10)

sory ckhar o r y kair chtaiin shar are cthar cthar dan
AND THE BOWLS IN WHICH HE WAS MIXING WINE FELL FROM HIS HANDS, (6)
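
One way to read "syllabic analysis" is sketched below. It is a guess at the idea, not the algorithm in translate.py. Syllables per manuscript word are estimated by counting vowel groups, syllables per English word are counted from the stress digits on cmudict vowel phonemes, and the source line whose syllable-count sequence is closest by edit distance is chosen. All function names are hypothetical. setup.sh installs the editdistance module, so an edit distance plausibly plays some role, but this sketch uses its own small implementation to stay self-contained.

```python
import re

def dict_syllables(word, pronunciations):
    """Syllables in an English word: phonemes ending in a stress digit."""
    variants = pronunciations.get(word.lower())
    if not variants:
        return None  # word missing from the dictionary
    return sum(1 for phoneme in variants[0] if phoneme[-1].isdigit())

def guess_syllables(word):
    """Rough guess for a manuscript word: count vowel groups (assumption)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def levenshtein(a, b):
    """Edit distance between two sequences, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

def best_line(manuscript_line, source_lines, pronunciations):
    """Return (distance, line) for the closest source line, or None."""
    target = [guess_syllables(w) for w in manuscript_line.split()]
    best = None
    for line in source_lines:
        words = re.findall(r"[A-Za-z']+", line)
        counts = [dict_syllables(w, pronunciations) for w in words]
        if not counts or None in counts:
            continue  # skip lines with words the dictionary does not cover
        distance = levenshtein(target, counts)
        if best is None or distance < best[0]:
            best = (distance, line)
    return best
```

Here `pronunciations` is expected to have the shape produced by cmudict_to_json.py, a mapping from lowercase word to a list of phoneme lists.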