Code to collect lexical blends from my Euralex 2012 paper.

This is code accompanying the following paper:

Paul Cook. Using social media to find English lexical blends. To appear in Proceedings of the 15th EURALEX International Congress (EURALEX 2012), Oslo, Norway.


You will need a Twitter account.

To build a collection of lexical blend candidates:

First, set TWITTER_USERNAME and PASSWORD in collecttweets.bash to your Twitter username and password.

Then to collect tweets run this:

$ bash collecttweets.bash

This will create files with names like 120721072955.json. (The filenames correspond to the date and time the process was started.) Each line of each file is a JSON string representing a tweet.
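To sanity-check the collected files before extracting candidates, something like the following minimal sketch may help. It is not part of this repository: the helper name iter_tweets is hypothetical, and it assumes only what is stated above (one JSON string per line), plus the assumption that each tweet object carries its message under a "text" key. Lines that fail to parse (for example, a line cut off when the collection process was stopped) are skipped.

```python
import json


def iter_tweets(lines):
    """Yield one dict per line that parses as a JSON object.

    Hypothetical helper, not part of the repository. Blank lines,
    malformed lines, and non-object JSON values are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            obj = json.loads(line)
        except ValueError:
            # Truncated or malformed line; ignore it.
            continue
        if isinstance(obj, dict):
            yield obj


# Made-up sample lines standing in for the contents of a collected file.
sample_lines = [
    '{"text": "Having brunch today"}\n',
    '\n',
    '{"text": "cut off mid-tw',
    '{"text": "Stuck in a smog alert"}\n',
]

tweets = list(iter_tweets(sample_lines))
texts = [tweet.get("text") for tweet in tweets]  # assumes a "text" key
```

With a real file, pass the open file object instead of sample_lines, e.g. `with open("120721072955.json") as f: tweets = list(iter_tweets(f))`.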

Then once you've got a bunch of tweets, you can get the candidate blends using this:

$ cat 120721072955.json | python | python aspell_list.txt.gz

Each line of the output has the following information:

blend_candidate source_word1 source_word2 Regex_match URL_of_tweet text_of_tweet
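For downstream processing, each output line can be split into the six fields listed above. The sketch below is illustrative only and rests on assumptions this README does not confirm: that fields are separated by whitespace (tabs or spaces), and that only the final field (the tweet text) may itself contain whitespace. The names BlendCandidate and parse_candidate_line are hypothetical, and the sample line is made up.

```python
from collections import namedtuple

# Field order follows the description above; the names are hypothetical.
BlendCandidate = namedtuple(
    "BlendCandidate",
    ["blend", "source_word1", "source_word2", "regex_match", "url", "text"],
)


def parse_candidate_line(line):
    """Split one output line into six fields, or return None.

    Assumes whitespace-separated fields where only the last field
    (the tweet text) may contain whitespace.
    """
    parts = line.rstrip("\n").split(None, 5)
    if len(parts) != 6:
        return None
    return BlendCandidate(*parts)


# Made-up example line, not real output from the scripts.
sample = (
    "brunch breakfast lunch br.*unch "
    "http://twitter.com/example/status/1 Having brunch today\n"
)
record = parse_candidate_line(sample)
```

If the actual output uses a different separator, or if other fields can contain spaces, the split call would need to be adjusted accordingly.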