# Not all polls are retweeted equally

Somebody on Twitter noted that polls showing a tighter election result were retweeted more often.

In [1]:
%%HTML
<blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">All polls aren&#39;t tweeted equally. RTs from <a href="https://twitter.com/britainelects">@britainelects</a> acct: <br>ICM - Con +11 (380 RTs)<br>ORB - Con +9 (473)<br>Survation - Con + 1 (4,300)</p>&mdash; Rob Ford (@robfordmancs) <a href="https://twitter.com/robfordmancs/status/871126287754113024">June 3, 2017</a></blockquote> <script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>

Let's use the Twitter API to confirm this observation.

### Getting our stuff together

In [2]:
%matplotlib notebook

import matplotlib.pyplot as plt
from twython import Twython
import json
import re

### Configuring our Twitter client

In [3]:
with open('./creds.json') as f:  # close the file once the credentials are read
    CREDS = json.load(f)
twitter = Twython(CREDS['consumer_key'], CREDS['consumer_secret'], CREDS['access_token'], CREDS['access_token_secret'])

### Downloading the tweets
I'm going to use the [@BritainElects](http://twitter.com/britainelects) feed because it's convenient. Let's grab as many tweets as we can with one API call (200) and keep only those that start with "Westminster voting intention".

In [4]:
tweets = twitter.get_user_timeline(screen_name='britainelects', count=200)
KEEP_KEYS = [ 'created_at', 'retweet_count', 'text' ]
westminster = [ { k: t[k] for k in KEEP_KEYS } for t in tweets if t['text'].startswith('Westminster voting intention') ]
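
To make the shape of the filtered result concrete, here is a minimal sketch that runs the same comprehension on two made-up tweets. The `sample_tweets` list and everything in it are hypothetical stand-ins for what the API returns, not real data.

```python
# Hypothetical stand-ins for the dicts returned by the timeline call
sample_tweets = [
    {'id': 1, 'created_at': 'Sat Jun 03 12:00:00 +0000 2017', 'retweet_count': 120,
     'text': 'Westminster voting intention:\n\nCON: 45%\nLAB: 34%'},
    {'id': 2, 'created_at': 'Sat Jun 03 13:00:00 +0000 2017', 'retweet_count': 15,
     'text': 'Some unrelated tweet'},
]

KEEP_KEYS = ['created_at', 'retweet_count', 'text']

# Keep only polling tweets, and only the three fields we care about
filtered = [
    {k: t[k] for k in KEEP_KEYS}
    for t in sample_tweets
    if t['text'].startswith('Westminster voting intention')
]

print(len(filtered))
print(sorted(filtered[0]))
```

Only the first made-up tweet survives the filter, and its `id` field is dropped because it is not in `KEEP_KEYS`.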

We'll define this function to parse out the voting intention for each party and return the Conservative lead over Labour.

In [5]:
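# Captures an upper-case party label and its percentage share
PATTERN = r"([A-Z]+):? ([0-9]+)%"
def calc_diff(text):
    # Map each party label to its share, then return the CON lead over LAB
    intention = dict(re.findall(PATTERN, text))
    return int(intention['CON']) - int(intention['LAB'])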
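
As a quick check of the parsing, here is a sketch that applies the same pattern to a made-up tweet. The `sample` text is invented for illustration and only assumes the "PARTY: NN%" layout that the pattern implies; real tweets may differ.

```python
import re

PATTERN = r"([A-Z]+):? ([0-9]+)%"

def calc_diff(text):
    # Map each party label to its share, then return the CON lead over LAB
    intention = dict(re.findall(PATTERN, text))
    return int(intention['CON']) - int(intention['LAB'])

# Hypothetical tweet text, invented for illustration
sample = "Westminster voting intention:\n\nCON: 40% (-6)\nLAB: 39% (+5)\nLDEM: 8% (-)"

pairs = re.findall(PATTERN, sample)
diff = calc_diff(sample)
print(pairs)
print(diff)
```

The bracketed changes such as `(-6)` are ignored because they are not followed by a `%` sign. Note that a tweet missing either `CON` or `LAB` would raise a `KeyError` here.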

Now we can plot a graph of retweets against this difference. I'll use a logarithmic scale for the y-axis.

In [6]:
xs = [ calc_diff(t['text']) for t in westminster ]
ys = [ t['retweet_count'] for t in westminster ]
plt.scatter(xs, ys)
plt.title('Retweets of @BritainElects')
plt.yscale('log')
plt.yticks([100, 500, 1000, 5000, 10000])
plt.xlabel('CON - LAB % intention share')
plt.ylabel('Number of retweets')

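
To see why a logarithmic axis helps, take the three figures quoted in the tweet at the top: the retweet counts span more than an order of magnitude, so on a linear axis the two smaller values would be squashed together near the bottom. A small sketch using only those quoted numbers:

```python
import math

# (CON lead in points, retweets) as quoted in the embedded tweet
quoted = [(11, 380), (9, 473), (1, 4300)]

# log10 of each retweet count, which is roughly the position on a log axis
logs = [round(math.log10(rts), 2) for _, rts in quoted]

for (lead, rts), log_rts in zip(quoted, logs):
    print(lead, rts, log_rts)
```

On the log scale the three polls are spread over about one unit rather than being dominated by the single large value.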