Commits on Aug 26, 2014
  1. t-SNE applied to Ohhla corpus

    jhave committed Aug 26, 2014
    my laptop threw an -11 error with all 57k rap songs
    so i analyzed 33k and after 18 hours it coughed up
    a massive csv and a set of images
Commits on Aug 23, 2014
  1. TSNE classification on poetryfoundation corpus

    jhave committed Aug 23, 2014
    Dimensionality reduction using:
    t-Distributed Stochastic Neighbor Embedding (t-SNE) in sklearn
    
    Read about TSNE : http://homepage.tudelft.nl/19j49/t-SNE.html
    
    Adapted from:
    http://nbviewer.ipython.org/urls/gist.githubusercontent.com/AlexanderFabisch/1a0c648de22eff4a2a3e/raw/59d5bc5ed8f8bfd9ff1f7faa749d1b095aa97d5a/t-SNE.ipynb
    
    Output image:
    bdp.glia.ca/plots/2014-08-23_16h42_TSNE_poetryFoundation_NO_ANNOTATIONALL.png
    
    it threw an error when outputting the annotated version
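A minimal sketch of the kind of reduction described above, using `sklearn.manifold.TSNE`. The commit does not say which features were fed to t-SNE, so the TF-IDF vectors, the tiny placeholder documents, and the parameter values here are all assumptions for illustration, not the original pipeline.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.manifold import TSNE

# Placeholder documents; the commit used the Poetry Foundation corpus.
docs = [
    "the sea and the salt wind",
    "salt wind over the grey sea",
    "a city of glass and neon",
    "neon light on city glass",
    "bread and butter on the table",
    "the table set with bread",
    "rain on the window at night",
    "night rain against the window",
]

# Turn each document into a dense TF-IDF vector (an assumed feature choice).
features = TfidfVectorizer().fit_transform(docs).toarray()

# perplexity must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=3, init="random", random_state=0)
embedding = tsne.fit_transform(features)

print(embedding.shape)  # one (x, y) point per document
```

Each row of `embedding` is a 2-D point that can be scattered on a plot. On a corpus of thousands of poems the perplexity would normally be set much higher than in this toy run.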
Commits on Aug 20, 2014
  1. SPREEDER: basically same as before but tweaked to make the originals …

    jhave committed Aug 20, 2014
    …less discernible and then ran a screen grab to make a video.
    
    Using Python, NLTK, Alchemy, pattern.en, and pyenchant
    to analyze and perform word replacement
    on a corpus of 10,119 poems scraped from the poetryFoundation
    and generate 10,118 poems in 66 minutes.
    
    There is a real-time hour-long screen-grab output
    of the trace window in SublimeText
    as the poetry-gen program runs.
    
    http://bdp.glia.ca/spreed-speed-screen-reading-one-hour-real-time-poetry-generation-screengrab/
Commits on Aug 17, 2014
  1. Markov chaining ... super simple....

    jhave committed Aug 17, 2014
    found it online... then generated a bit of Bernstein
    http://bdp.glia.ca/markov-bern/
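The commit does not include the script that was found online, so the following is only a rough sketch of what word-level Markov chaining usually looks like. The function names and the sample sentence are made up for illustration.

```python
import random
from collections import defaultdict


def build_chain(text, order=1):
    """Map each run of `order` words to the list of words that follow it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return dict(chain)


def generate(chain, length=20, seed=None):
    """Walk the chain from a random starting state for up to `length` words."""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))
    output = list(state)
    while len(output) < length:
        followers = chain.get(state)
        if not followers:  # dead end: this state only occurs at the end of the text
            break
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)


# Placeholder text; any plain-text corpus would do.
sample = "the poem is a machine and the machine is a poem and the poem is made of words"
chain = build_chain(sample)
line = generate(chain, length=12, seed=1)
print(line)
```

Raising `order` makes the output stick closer to the source text; `order=1` gives the loosest, most scrambled result.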
Commits on Aug 14, 2014
  1. SYN-SCI-RAP

    jhave committed Aug 14, 2014
    http://bdp.glia.ca/syn-sci-rap
    synset + science terms + rap stuff
Commits on Aug 4, 2014
  1. Smaller Words (shrink-gapped): not much change

    jhave committed Aug 4, 2014
    except to increase the shortness of words just a bit
    blog excerpts here: http://bdp.glia.ca/smaller-words-shrink-gapped
    
    output reads like
    Robert Creeley becoming Samuel Beckett in Gertrude Stein's gut.
Commits on Aug 3, 2014
Commits on Jul 31, 2014
  1. import_utilities includes SYNSET replacement, and sundry functions

    jhave committed Jul 31, 2014
    (just uploading as i am about to mutate it prior to using ohhla rap_mouth reservoir)
  2. FIXED Hatcher bug. Created FAKE_authors.

    jhave committed Jul 31, 2014
    (becuz who wants to read a poem written by a bot...unless it's good)
    USED pattern.en to catch article and conjugation problems. (partially working but much more robust than my meagre hacks...)
Commits on Jul 30, 2014
  1. Hatcher: this code creates an irregular repetition

    jhave committed Jul 30, 2014
    of word words words
    in the style of Ian Hatcher
    due to an error
    in how i was catching a special case of "me."
    and other 3 letter words that are not supposed to be replaced as if they were states.
    
    Read (if you wish) 10k+ poems generated by this code here:
    http://bdp.glia.ca/poems/2014-07-30_22_poetryFoundation_generatedPOEMS_ALL.html
Commits on Jul 23, 2014
  1. 10,118 poems generated in 4.62 hours

    jhave committed Jul 23, 2014
    using similar techniques to before
    ALCHEMY: to identify entities, relations
    NLTK for POS and synset
Commits on Jul 20, 2014
  1. starting to send poems to ALCHEMY

    jhave committed Jul 20, 2014
    ran into daily transaction limit after 6,063 items
    results stashed in json/ALCHEMY_POEMS_2014-07-20_19
  2. BIO generation code + results

    jhave committed Jul 20, 2014
    reasonable. not impeccable.
Commits on Jul 17, 2014
  1. some bios v.1

    jhave committed Jul 17, 2014
  2. BASIC Bio construction using keyword entities from Alchemy

    jhave committed Jul 17, 2014
    -- for now very randomized....
Commits on Jul 16, 2014
  1. BIO pre-pipeline pipe: parse-html, send to Alchemy, convert to JSON, …

    jhave committed Jul 16, 2014
    …read back in...
    
    Next steps: feature extraction, generation
  2. using ALCHEMY to get a basic json from poetry foundation txt files

    jhave committed Jul 16, 2014
    keywords, entities, category, concepts, relations
    -- writing a json of returned data for each poem
    in a date-time stamped folder
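A hedged sketch of the per-poem JSON dump described above. The folder pattern follows the `ALCHEMY_POEMS_2014-07-20_19` name mentioned in a later commit. The function name, the poem id, and the example values are hypothetical, and no API call is made here.

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path


def dump_poem_results(results, base_dir, prefix="ALCHEMY_POEMS", now=None):
    """Write one JSON file per poem into a date-time stamped folder."""
    now = now or datetime.now()
    folder = Path(base_dir) / f"{prefix}_{now:%Y-%m-%d_%H}"
    folder.mkdir(parents=True, exist_ok=True)
    for poem_id, data in results.items():
        with open(folder / f"{poem_id}.json", "w", encoding="utf-8") as fh:
            json.dump(data, fh, ensure_ascii=False, indent=2)
    return folder


# Hypothetical response shape: the five keys are the fields named in the commit,
# the values are made up for illustration.
fake_results = {
    "poem_0001": {
        "keywords": ["river", "stone"],
        "entities": [],
        "category": "arts_entertainment",
        "concepts": ["Poetry"],
        "relations": [],
    }
}

out_dir = dump_poem_results(
    fake_results, tempfile.mkdtemp(), now=datetime(2014, 7, 16, 19)
)
print(out_dir.name)  # ALCHEMY_POEMS_2014-07-16_19
```

Stamping the folder by hour keeps each run's results separate, so a run cut short by a rate limit does not overwrite an earlier one.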
Commits on Jul 15, 2014
  1. removed unicode bug from feature extraction and json dump

    jhave committed Jul 15, 2014
    ran random forest classifier on poetry foundation
    (after long struggle i abandoned the parsing data because it was throwing errors)
    -- results terrible but pathway now open for retrieving word usage of stopwords for lyrics and shakespeare etc... and eventually arriving at a classifier that can guess author
Commits on Jul 14, 2014
  1. redid the JSON dump

    jhave committed Jul 14, 2014
    with stop word frequency
    and binarized lists
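One way to read "stop word frequency and binarized lists" is a relative frequency per stop word plus a 0/1 presence list. That reading is an assumption, as are the short stop-word list and the function name in this sketch.

```python
import json
from collections import Counter

# A tiny illustrative stop-word list; a real run would use a fuller one.
STOP_WORDS = ["the", "and", "of", "a", "in", "to", "is"]


def stop_word_features(text):
    """Return relative stop-word frequencies and a 0/1 presence list."""
    tokens = text.lower().split()
    counts = Counter(t.strip(".,;:!?\"'") for t in tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty input
    frequency = {w: counts[w] / total for w in STOP_WORDS}
    binarized = [1 if counts[w] else 0 for w in STOP_WORDS]
    return {"stop_word_frequency": frequency, "binarized": binarized}


features = stop_word_features("The rose is a rose and the end of the line.")
print(json.dumps(features, indent=2))
```

Both structures are plain dicts and lists, so they serialize with `json.dumps` without any custom encoder.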
Commits on Jul 13, 2014
Commits on Jul 8, 2014
  1. utilities

    jhave committed Jul 8, 2014
  2. JSON - classifier : buggy

    jhave committed Jul 8, 2014
    rewrote code to output json
    and now it's broken
Commits on Jul 7, 2014
  1. lyrics, lyrics clean (with json unicode bug) and ohhla synset generat…

    jhave committed Jul 7, 2014
    …or (it makes crap poems very very fast)
    
    another painful moment in the inertial evolution of nothing
  2. frustrating: went nowhere tried to stuff the words_per_line string re…

    jhave committed Jul 7, 2014
    …turned from csv into numeric form, mart succeeded in converting it but then, it seems to have bizarrely shuffled indices.... as i said went nowhere....
Commits on Jul 5, 2014
  1. downloaded 56k rap songs in txt format

    jhave committed Jul 5, 2014
    and built a rudimentary generator
    to write poems based on their
    part-of-speech format
    
    note: lemma('purring')
    http://www.clips.ua.ac.be/pages/pattern-en
    
    and for future
    http://www.clips.ua.ac.be/pages/pattern-en
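A rough sketch of generating lines from a part-of-speech format: keep the tag sequence of a source line and draw a new word for each tag from a reservoir. The hand-tagged toy data stands in for real tagger output, and the function names are hypothetical, not the repo's code.

```python
import random
from collections import defaultdict

# Hand-tagged toy data (Penn Treebank-style tags) standing in for tagger output.
tagged_lines = [
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
    [("a", "DT"), ("river", "NN"), ("sings", "VBZ")],
    [("every", "DT"), ("street", "NN"), ("burns", "VBZ")],
]


def build_reservoir(lines):
    """Group every word in the corpus under its part-of-speech tag."""
    reservoir = defaultdict(list)
    for line in lines:
        for word, tag in line:
            reservoir[tag].append(word)
    return reservoir


def rewrite_line(template, reservoir, rng):
    """Keep the template's tag sequence, but draw a fresh word for each tag."""
    return " ".join(rng.choice(reservoir[tag]) for _, tag in template)


reservoir = build_reservoir(tagged_lines)
rng = random.Random(0)
new_line = rewrite_line(tagged_lines[0], reservoir, rng)
print(new_line)
```

With a real corpus the tagged lines would come from a tagger such as NLTK's, and the reservoir would be large enough that the output rarely repeats the source line.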
Commits on Jul 2, 2014
  1. Pre class Tweals

    tijptjik committed Jul 2, 2014
  2. Big Data

    tijptjik committed Jul 2, 2014
  3. Random Forest implementation to look for author...

    jhave committed Jul 2, 2014
    Conclusion: too sparse in Y
    But intriguing.
    
    TODO: get the parsed metre data and line lists into calculations...
Commits on Jul 1, 2014
  1. Plotting a few features against the date of publication.

    jhave committed Jul 1, 2014
    Feasible but not revelatory.
    
    Next step: put the metrical analysis through a random forest.
    And send up the website urls to alchemy api to extract text.
Commits on Jun 30, 2014
  1. ok. this is what i wanted to push:

    jhave committed Jun 30, 2014
    a rudimentary but feasible plot method
    that can be expanded to include other features.