Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP
Code to collect lexical blends from my Euralex 2012 paper.
Python Shell
branch: master

Fetching latest commit…

Cannot retrieve the latest commit at this time

Failed to load latest commit information.
README.md
aspell_list.txt.gz
blendinterps.py
collecttweets.bash
getblendinterpcands.py
gpl.txt
neos.re
neos.track
tokeniser.py
tweetswithre.py

README.md

This is code accompanying the following paper:

Paul Cook. Using social media to find English lexical blends. To appear in Proceedings of the 15th EURALEX International Congress (EURALEX 2012), Oslo, Norway.

Requirements:

langid.py: https://github.com/saffsd/langid.py

A Twitter account: https://twitter.com/

To build a collection of lexical blend candidates:

First edit TWITTER_USERNAME and PASSWORD in collecttweets.bash to a Twitter username and password

Then to collect tweets run this:

$ bash collecttweets.bash

This will create a bunch of files with names like this: 120721072955.json (The filenames correpsond to the date and time the process was started.) Each line of each file is a JSON string representing a tweet.

Then once you've got a bunch of tweets, you can get the candidate blends using this:

$ cat 120721072955.json | python tweetswithre.py --searchres=neos.re | python getblendinterpcands.py aspell_list.txt.gz

Each line of the output has the following information blend_candidate source_word1 source_word2 Regex_match URL_of_tweet text_of_tweet

Something went wrong with that request. Please try again.