A descriptive temperature scale, based on tweets that read "it's (hot/cold) as ______"


A Curseword-Based Scale for Temperature

From August 2017 to January 2018, 5,400 geolocated tweets containing phrases like "it's hot as hell" or "it's cold as a bitch" were collected, and the outdoor temperature was recorded for each.

The result is a derived curseword-based temperature scale: "it's hot as hell" is 86°F, statistically speaking.

By @jimwebb, and presented at Hack and Tell DC (Presentation: Keynote | PDF). Thanks to @metasemantic for inspiration and code.

Observed median temperature, by phrase (n>10)

[Figure: All Temps]

Most popular phrases and median temperatures (°F)

| Phrase | Count | Median Temp (°F) |
| --- | --- | --- |
| (it's) cold as fuck | 623 | 41 |
| cold as shit | 497 | 45 |
| cold as hell | 466 | 48 |
| cold as balls | 78 | 37 |
| cold as a bitch | 43 | 37 |
| hot as hell | 549 | 86 |
| hot as fuck | 495 | 86 |
| hot as shit | 261 | 86 |
| hot as balls | 197 | 84 |
| hotter than hell | 58 | 85 |
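
A table like this one can be produced by grouping temperatures by phrase and taking the median of each group. The following is a minimal sketch of that idea, not the code used for this project; the function name `median_temp_by_phrase` and the sample rows are made up for illustration.

```python
from collections import defaultdict
from statistics import median


def median_temp_by_phrase(rows, min_count=10):
    """Return {phrase: (count, median_temp_f)} for phrases seen more than min_count times."""
    temps = defaultdict(list)
    for phrase, temp_f in rows:
        temps[phrase].append(temp_f)
    return {
        phrase: (len(values), median(values))
        for phrase, values in temps.items()
        if len(values) > min_count
    }


# Hypothetical (phrase, temperature in °F) pairs, not taken from the dataset.
sample = [
    ("hot as hell", 84.0),
    ("hot as hell", 86.0),
    ("hot as hell", 91.0),
    ("cold as balls", 37.0),
]
summary = median_temp_by_phrase(sample, min_count=2)
print(summary)  # {'hot as hell': (3, 86.0)}
```

The `min_count` threshold plays the role of the `n>10` cutoff in the chart heading: phrases with too few tweets are dropped before medians are reported.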

Detail: hot subjects (n>5)

Displayed with frequency, median (black), and 95% confidence interval (gray)

[Figure: Hot phrases]
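
This README does not say how the 95% range around each median was computed. One common way to get such a range is a percentile bootstrap, sketched below as an assumption rather than as the method behind the chart. The function name `bootstrap_median_ci`, its parameters, and the sample temperatures are all hypothetical.

```python
import random
from statistics import median


def bootstrap_median_ci(values, n_resamples=2000, level=0.95, seed=0):
    """Return (low, median, high) using a percentile bootstrap of the median."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    n = len(values)
    # Resample with replacement and record the median of each resample.
    medians = sorted(
        median(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    tail = (1.0 - level) / 2.0
    low = medians[int(tail * n_resamples)]
    high = medians[min(n_resamples - 1, int((1.0 - tail) * n_resamples))]
    return low, median(values), high


# Hypothetical temperatures (°F) for a single phrase.
temps = [80, 82, 84, 85, 86, 86, 88, 90, 91, 95]
low, mid, high = bootstrap_median_ci(temps)
print(f"median {mid}°F, 95% range {low} to {high}°F")
```

With only a handful of tweets per subject the range gets wide, which is one reason to apply a minimum count such as `n>5` before plotting.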

Detail: cold subjects (n>5)

Displayed with frequency, median (black), and 95% confidence interval (gray)

[Figure: Cold phrases]

"Hell" can be hot or cold

"Cold as hell" (48°F) and "hot as hell" (86°F) exist together, and hell isn't the only subject with this duality:

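| Phrase | When Hot (°F) | When Cold (°F) |
| --- | --- | --- |
| dick | 92 | 36 |
| satan's balls | 90 | 34 |
| a witch's tit | 90 | 26 |
| a bitch | 89 | 38 |
| a mf | 88 | 42 |
| a mother | 88 | 29 |
| ever | 87 | 59 |
| f | 87 | 45 |
| tits | 85 | 40 |
| shit | 85 | 45 |
| hell | 85 | 47 |
| fuck | 84 | 41 |
| balls | 84 | 36 |
| heck | 78 | 28 |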
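
The table above keeps only subjects that show up in both hot and cold phrases and lists a median for each side. Here is a rough sketch of that pivot, assuming each observation has already been reduced to a `(polarity, subject, temperature)` triple. The function name `dual_use_subjects` and the sample data are hypothetical.

```python
from collections import defaultdict
from statistics import median


def dual_use_subjects(observations):
    """Return (subject, hot_median, cold_median) rows for subjects used both ways.

    observations: iterable of (polarity, subject, temp_f), polarity is 'hot' or 'cold'.
    Rows are sorted by hot median, highest first.
    """
    temps = defaultdict(lambda: {"hot": [], "cold": []})
    for polarity, subject, temp_f in observations:
        temps[subject][polarity].append(temp_f)
    rows = [
        (subject, median(t["hot"]), median(t["cold"]))
        for subject, t in temps.items()
        if t["hot"] and t["cold"]  # keep only subjects seen in both contexts
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical observations, not taken from the dataset.
observations = [
    ("hot", "hell", 85), ("hot", "hell", 87), ("cold", "hell", 47),
    ("hot", "heck", 78), ("cold", "heck", 28),
    ("cold", "boogers", 30),
]
rows = dual_use_subjects(observations)
print(rows)  # [('hell', 86.0, 47), ('heck', 78, 28)]
```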

"Boogers" are always cold; "The Devil's Dick" is always hot

Some subjects are used in one context (either hot or cold), but not both. "Boogers" and "Mars" are always cold; "the devil's dick" and "two rats (fucking)" are always hot:

[Figure: Hot/Cold Bias]
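
One simple way to put a number on this bias is the share of a subject's uses that occur in hot phrases: 1.0 means always hot, 0.0 means always cold. This README does not define the bias measure used in the chart, so the sketch below is an assumption, and `hot_share` and the sample data are hypothetical.

```python
from collections import Counter


def hot_share(uses):
    """Return {subject: fraction of its uses that are 'hot'}.

    uses: iterable of (polarity, subject), polarity is 'hot' or 'cold'.
    """
    hot = Counter()
    total = Counter()
    for polarity, subject in uses:
        total[subject] += 1
        if polarity == "hot":
            hot[subject] += 1
    return {subject: hot[subject] / total[subject] for subject in total}


# Hypothetical uses, not taken from the dataset.
uses = [
    ("cold", "boogers"), ("cold", "boogers"),
    ("hot", "the devil's dick"),
    ("hot", "hell"), ("cold", "hell"),
]
shares = hot_share(uses)
print(shares)  # {'boogers': 0.0, "the devil's dick": 1.0, 'hell': 0.5}
```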

Data set

The dataset is provided in data/collected-tweets.csv; the tweets with temperatures added are in a SQLite database, data/collected-tweets.db. Contact me for the raw tweets (250MB).
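
The table and column names inside the SQLite file are not documented here, so a reasonable first step is to list them. This sketch uses only Python's standard `sqlite3` module. The helper name `describe_tables` is made up, and the demo runs against an in-memory database with a hypothetical table so it works without the data file.

```python
import sqlite3


def describe_tables(conn):
    """Return {table_name: [column names]} for every table in the database."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    schema = {}
    for table in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [column[1] for column in columns]
    return schema


# Demo on an in-memory database with a hypothetical table.
# For the real data, connect with sqlite3.connect("data/collected-tweets.db").
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example (id INTEGER, phrase TEXT, temp_f REAL)")
schema = describe_tables(conn)
print(schema)  # {'example': ['id', 'phrase', 'temp_f']}
```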

Data collection

python src/0-scrape.py

Requires an empty folder named raw_firehose in the working directory and a file named access_tokens.json containing your Twitter API keys:

{
    "key":"XXXXXXXXXXXXXXXXXXXXXXXXX",
    "secret":"XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
    "access":"XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
    "access_secret":"XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
}
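
Before starting a long scrape it can help to confirm that both prerequisites are in place. The check below is a hypothetical helper, not part of this repository. It assumes only what is stated above: a folder named raw_firehose, and an access_tokens.json file with the four keys shown.

```python
import json
import os

REQUIRED_KEYS = ("key", "secret", "access", "access_secret")


def check_scrape_prerequisites(base_dir="."):
    """Return a list of problems that would stop the scraper; empty means ready."""
    problems = []
    folder = os.path.join(base_dir, "raw_firehose")
    if not os.path.isdir(folder):
        problems.append("missing folder: raw_firehose")
    elif os.listdir(folder):
        problems.append("raw_firehose is not empty")

    tokens_path = os.path.join(base_dir, "access_tokens.json")
    try:
        with open(tokens_path) as fh:
            tokens = json.load(fh)
    except FileNotFoundError:
        problems.append("missing file: access_tokens.json")
    except json.JSONDecodeError:
        problems.append("access_tokens.json is not valid JSON")
    else:
        if not isinstance(tokens, dict):
            problems.append("access_tokens.json should contain a JSON object")
        else:
            for name in REQUIRED_KEYS:
                if not tokens.get(name):
                    problems.append(f"access_tokens.json lacks a value for {name!r}")
    return problems


for problem in check_scrape_prerequisites():
    print(problem)
```

An empty result means the folder and the credentials file both look usable; it does not verify that the keys are accepted by Twitter.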

Processing

Set up SQLite database for processing (one-time)

php src/1-dbsetup.php

Parse the tweets and organize by phrasing

php src/2-collect-tweets.php
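
The actual parsing lives in the PHP script. As an illustration of the idea only, here is a Python sketch that pulls a hot/cold polarity and a subject out of tweet text with a regular expression. The pattern and the name `parse_phrase` are assumptions, and the real script may match differently.

```python
import re

# Matches "it's hot as X", "its cold as X", "it's hotter than X", "it's colder than X".
PATTERN = re.compile(
    r"\bit['’]?s\s+(?:(?P<adj>hot|cold)\s+as|(?P<cmp>hotter|colder)\s+than)"
    r"\s+(?P<subject>[^.!?,\n]+)",
    re.IGNORECASE,
)


def parse_phrase(text):
    """Return (polarity, subject) for the first matching phrase, or None."""
    match = PATTERN.search(text)
    if match is None:
        return None
    word = (match.group("adj") or match.group("cmp")).lower()
    polarity = "hot" if word.startswith("hot") else "cold"
    # The subject runs to the next punctuation mark or line break.
    subject = match.group("subject").strip().lower()
    return polarity, subject


print(parse_phrase("It's hot as hell."))  # ('hot', 'hell')
print(parse_phrase("its colder than a witch's tit, honestly"))  # ('cold', "a witch's tit")
print(parse_phrase("nice day"))  # None
```

This version takes everything up to the next punctuation mark as the subject, so "it's hot as hell out here" would yield "hell out here"; a real pipeline would need extra cleanup to group such variants.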

Poll the Dark Sky API for temperatures (edit the file to add your Dark Sky API key)

php src/3-get-weather.php
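
For readers who want to see the shape of a historical lookup, the sketch below builds a Dark Sky Time Machine style URL and reads the temperature from a decoded response. It is an illustration in Python, not a port of the PHP script. The URL layout and the `currently.temperature` field reflect my understanding of the Dark Sky API, and the function names, coordinates, timestamp, and sample payload are hypothetical. No network request is made.

```python
def time_machine_url(api_key, latitude, longitude, unix_time):
    """Build a request URL for the conditions at one place and time."""
    return (
        "https://api.darksky.net/forecast/"
        f"{api_key}/{latitude},{longitude},{int(unix_time)}"
        "?exclude=minutely,hourly,daily,alerts,flags"  # keep only the 'currently' block
    )


def temperature_from_response(payload):
    """Return the point-in-time temperature from a decoded JSON response, or None."""
    return payload.get("currently", {}).get("temperature")


# Hypothetical inputs: placeholder key, a point in Washington, DC, and a 2017 timestamp.
url = time_machine_url("YOUR_KEY", 38.9072, -77.0369, 1504281600)
print(url)

# Hypothetical response fragment, assuming the API's default US units (°F).
sample_payload = {"currently": {"temperature": 86.2}}
print(temperature_from_response(sample_payload))  # 86.2
```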