Commit

Abstention detection changed
Todo stuff
frabcus committed Sep 17, 2003
1 parent e37e0a8 commit f81c593
Showing 7 changed files with 138 additions and 83 deletions.
2 changes: 1 addition & 1 deletion build/getlogs
@@ -10,7 +10,7 @@ cd logs
mkdir -p tmpdownload
cd tmpdownload

rsync -v --progress --delete -az -e "ssh" $USER:./www.publicwhip.org.uk_logs/ .
rsync -v --progress --delete -az -e "ssh" $USER:./www.publicwhip.org.uk_logs/*.gz .

for X in access_log.*.gz
do
17 changes: 2 additions & 15 deletions errata.txt
@@ -1,24 +1,11 @@
Hansard errata
--------------

Ask about Gareth Thomas ambiguity:
http://www.publications.parliament.uk/pa/cm200203/cmhansrd/cm030120/debtext/30120-25.htm#30120-25_div56

27 Nov 2002, Division 10 has wrong ayes count or wrong vote list
- it says 341 ayes, but 240 are listed!
- it says 240 ayes, but 341 are listed!
http://www.publications.parliament.uk/pa/cm200203/cmhansrd/vo021127/debtext/21127-30.htm

Why does a division 73 appear twice here, and where is division 74? (3 Feb 2003):
Why does a division 73 appear twice (identical!) here, and where is division 74? (3 Feb 2003):
http://www.publications.parliament.uk/pa/cm200203/cmhansrd/cm030203/debindx/30203-x.htm
http://www.publications.parliament.uk/pa/cm200203/cmhansrd/cm030204/debindx/30204-x.htm

Ask about corrections of divisions - are they fixed up properly in bound volume?
329 2002-10-28 is correction of 329 2002-10-23
99 2003-03-06 is correction of 99 2003-03-04

Trivial
-------

Ask about 253 2003-06-24 division labelled wrongly in index


78 changes: 71 additions & 7 deletions ideas.txt
@@ -1,10 +1,3 @@
Data integrity
--------------

Search for MPs who voted on both sides in one division
Check for an MP voting both for and against! See "Abstention" here:
http://www.parliament.uk/documents/upload/p09.pdf

Party politics
--------------

@@ -37,6 +30,10 @@ Table of all MPs which have never voted - interesting to see
Tables of worst attendance record (after those with an excuse, such as
ministry posts, speaker, ill, SF...)

"Performance tests" for government - turning excessive monitoring and
testing back onto them.
corruptometer, loyaltometer, evilness, sleepometer, waffle-meter

Data analysis (using existing data)
-------------

@@ -56,6 +53,14 @@ variation." - we could do this with MP clustering.
Improve clustering distance algorithm
See J Vaughan suggestions

Colour dots in cluster diagram by how many times they have voted.
Brighter colours for more relevant data - i.e. the more intersections
with others' votes there are.

Play with stuff in vector search article
http://www.perl.com/pub/a/2003/02/19/engine.html
In particular PDL for speeding up octave algebra stuff

> Idea 2. Darren suggested that the reason Tony Blair is an outlier
> in the java app is coz he only turns up to votes he thinks are
> going to be controversial, hence ones that people are probably
@@ -86,6 +91,9 @@ Why did this happen? Anomalies in Hansard. Email them to complain.
Find three line whip definition
Infer no Whip if 10% +- from base? Or at least +-1

Animated cluster diagram over last 15 years.
3 month window moving week by week

Additional numeric data
-----------------------

@@ -104,11 +112,13 @@ It is worth looking for MPs who spoke but did not vote. This is a good
way to detect active abstentions. It may also have all sorts of other
interesting meanings.
division.php?date=2003-06-10&number=224&showall=yes
Count how many times MP spoke in a debate, or on the day

Integrate parliamentary majority, and look for correlations with
rebelliousness? Majorities here:
http://www.psr.keele.ac.uk/area/uk/mps.htm
(Should be no correlation, as reselection more important?)
Plot majority as a colour on the cluster diagram

Analyse if MPs who are "sir" vote differently in any way
first check data integrity that title always has "Sir" for knights
@@ -127,9 +137,27 @@ Collate all MPs articles in newspapers
Regional analysis. Scotland, NI, Wales, North v South. Urban v.
Rural.

Area of land for constituency. This gives a "ruralness" measure.
Population of constituency.

Make cluster diagram for just divisions relating to one issue. Or
for one person's interested issues. Plot point on cluster diagram for
issues themselves.

Value of the vote. What is the monetary expenditure cost of agreeing
the motion? Graph against time spent discussing, and see how silly the
correlation is.

Measure lobbying power behind each issue (expenditure by interested
parties). Again, correlate to time spent on it.

Additional text content
-----------------------

Issue sub-selector. User can log in, name an issue, and say which way
votes should have gone to satisfy him on that issue. Get all manner of
people to make issues for next general election.

Software to follow legislation from Queen's speech

Group votes by department, so you can see areas of interest (Sirius
@@ -151,6 +179,10 @@ Link to draft of Bill which is being debated
Usability
---------

Email reports to people when search queries change
e.g. When your MP has voted. When he has rebelled. When an issue is
voted on, and so on.

Show majority in division table - sort by which ones majority was least on?

Link from MP to other sources of info
@@ -162,6 +194,38 @@ Link from search engine to

Links to other political resource websites

hansard.php - takes links to days and chunks, does a redirect
reduce bandwidth, and do tracking of where people link through to

Log failed searches so we can improve the search engine

Detect MS Java applet and upgrade it
FastCGI if our load gets high
mod_gzip to reduce bandwidth

Usability (some sort of done - this is just some notes)
- make website name link back to homepage
- consider link titles http://www.useit.com/alertbox/980111.html
- about the authors, so feels personal to people
- consider breadcrumb trail
- about section (not all FAQ?)
- company name/logo at topleft, search at topright
- search input box on front page (http://www.useit.com/alertbox/20010513.html)
- print stylesheet media="print" removing menus

Physical gimmicks
-----------------

Actually post a whipping sheet to MPs. This would arrive every week at
the same time as their party whipping sheet. It would tell them how
many voters in their constituency have registered with organisations which
would like them to vote particular ways.

Make big wall chart of cluster diagram - colour, pretty
Maybe even sell it to people

Newsletter (may be better than blog that you have)

About one MP
------------

56 changes: 14 additions & 42 deletions todo.txt
@@ -23,72 +23,36 @@ Facility to register and define voting subset for a particular issue
Do Iraq subselection ourselves
Do climate change subselection

Investigate EDMs
Investigate EDMs, legality of using them

Look at majorities

Website
-------

Put news (at least headlines) on front page

Make cluster diagram clearer in "highlights" section on front page

Sort cluster diagram name entries by surname - rather than no order

Think more about excess motion text on bills

Consider changing support@ email addr to something less corporate sounding

Move distance metric stuff so it uses the initial data, not the munged
data from the cluster diagram

Change our use of the word "rebel" more consistently

Trim some opinion waffle out of Cluster text

News/blog of observations we make about things
Some kind of news system
Newsletter
Some kind of comments system
Solicit help in some way?
Email address more prominent everywhere

Log failed searches so we can improve the search engine

Colour blind people, or indeed blind people, need a better rebel marker
than redness in MPs division list. Boldness is one idea.

Log file grabbing for permanent keeps?

hansard.php - takes links to days and chunks, does a redirect
reduce bandwidth, and do tracking of where people link through to

Find logo

Detect MS Java applet and upgrade it

FastCGI if our load gets high
mod_gzip to reduce bandwidth

Usability (some sort of done - this is just some notes)
- make website name link back to homepage
- consider link titles http://www.useit.com/alertbox/980111.html
- about the authors, so feels personal to people
- license link violates by popping up new window - should be in main window
- consider breadcrumb trail
- about section (not all FAQ?)
- company name/logo at topleft, search at topright
- search input box on front page (http://www.useit.com/alertbox/20010513.html)
- print stylesheet media="print" removing menus

Make big wall chart of cluster diagram - colour, pretty
Maybe even sell it to people

Play with stuff in vector search article
http://www.perl.com/pub/a/2003/02/19/engine.html
In particular PDL for speeding up octave algebra stuff

Scraper
-------

Check out tapiR, see if useful

Finish last few divisions that you don't have right
Missing division 74

@@ -99,9 +63,17 @@ Check "Question accordingly..." fits with our counting
Tally vote numbers in text and check they fit with our counting

Deal with when an MP voted twice in one division
Search for MPs who voted on both sides in one division
Check for an MP voting both for and against! See "Abstention" here:
http://www.parliament.uk/documents/upload/p09.pdf

Also I need to tidy the whole thing up to be more usable. Reduce the
number of commands, make the pipeline more straightforward, and so it
doesn't go wrong if you do things in the wrong order.

Make one script I have to run which just does everything (backup logs,
backup sf cvs repository, get latest divisions, upload to db)

Improve motion text extraction


39 changes: 23 additions & 16 deletions website/division.php
@@ -1,5 +1,5 @@
<?php
# $Id: division.php,v 1.1 2003/08/14 19:35:48 frabcus Exp $
# $Id: division.php,v 1.2 2003/09/17 12:01:32 frabcus Exp $

# The Public Whip, Copyright (C) 2003 Francis Irving and Julian Todd
# This is free software, and you are welcome to redistribute it under
@@ -70,7 +70,12 @@
by pw_mp.party, vote order by party, vote");
print "<h2>Party Summary</h2>";
print "<p>Votes by party, bold entries are a guess at the party
whip, red entries a guess at rebels.</p>";
whip, red entries a guess at rebels. Abstentions are calculated
from the expected turnout, which is a statistical estimate based on the
average proportionate turnout for that party in all divisions. A
negative abstention indicates that more members of that party than
expected voted; this is always relative, so it could be that another
party has failed to turn out <i>en masse</i>.</p>";

# Precalc values
$ayes = array();
@@ -102,8 +107,9 @@

# Make table
print "<table><tr class=\"headings\"><td>Party</td><td>Ayes</td><td>Noes</td><td>Turnout</td>";
print "<td>Expected</td><td>Extra Turnout</td></tr>";
$allparties = array_unique(array_merge(array_keys($ayes), array_keys($noes)));
print "<td>Expected</td><td>Abstain</td></tr>";
#$allparties = array_unique(array_merge(array_keys($ayes), array_keys($noes)));
$allparties = array_keys($alldivs);
usort($allparties, "strcasecmp");
$votes = array_sum(array_values($ayes)) + array_sum(array_values($noes));
if ($votes <> $turnout)
@@ -125,20 +131,21 @@

$alldiv = $alldivs[$party];
$expected = round($votes * ($alldiv / $alldivs_total), 1);
$extra = number_format(100 * $total / ($votes * ($alldiv / $alldivs_total)) - 100, 1);
if ($extra > 0)
$abstentions = $expected - $total;
$classabs = "normal";
if (abs($abstentions) >= 2) { $classabs = "important"; }

if ($aye > 0 or $noe > 0 or $abstentions >= 2)
{
$extra = "+" . $extra;
$prettyrow = pretty_row_start($prettyrow);
print "<td>" . pretty_party($party) . "</td>";
print "<td class=\"$classaye\">$aye</td>";
print "<td class=\"$classnoe\">$noe</td>";
print "<td>$total</td>";
print "<td>$expected</td>";
print "<td class=\"$classabs\">$abstentions</td>";
print "</tr>";
}

$prettyrow = pretty_row_start($prettyrow);
print "<td>" . pretty_party($party) . "</td>";
print "<td class=\"$classaye\">$aye</td>";
print "<td class=\"$classnoe\">$noe</td>";
print "<td>$total</td>";
print "<td>$expected</td>";
print "<td class=\"percent\">$extra%</td>";
print "</tr>";
}
print "</table>";
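
The row logic in this hunk can be restated outside PHP. The Python sketch below mirrors it under stated assumptions: `alldivs`, `ayes` and `noes` stand in for the PHP arrays of the same names, while the function name `party_rows` and the returned dict layout are invented for illustration. It is not the site's code, and Python's `round` may differ from PHP's on exact halves.

```python
def party_rows(alldivs, ayes, noes):
    """Return the party-table rows for one division.

    alldivs: party -> total turnout across all divisions
    ayes, noes: party -> vote counts in this division
    (Hypothetical inputs mirroring $alldivs, $ayes and $noes in the PHP.)
    """
    alldivs_total = sum(alldivs.values())
    votes = sum(ayes.values()) + sum(noes.values())
    if alldivs_total == 0:
        return []  # guard added for the sketch: no history to compare against

    rows = []
    # Case-insensitive ordering, as usort with strcasecmp does in the PHP.
    for party in sorted(alldivs, key=str.lower):
        aye = ayes.get(party, 0)
        noe = noes.get(party, 0)
        total = aye + noe
        expected = round(votes * alldivs[party] / alldivs_total, 1)
        # Rounded only to avoid float noise; the PHP subtracts directly.
        abstentions = round(expected - total, 1)
        important = abs(abstentions) >= 2
        # Show parties that voted, plus parties that stayed away in numbers.
        if aye > 0 or noe > 0 or abstentions >= 2:
            rows.append({
                "party": party,
                "ayes": aye,
                "noes": noe,
                "turnout": total,
                "expected": expected,
                "abstentions": abstentions,
                "important": important,
            })
    return rows


# Made-up numbers, not real division data.
rows = party_rows(
    alldivs={"Lab": 600, "Con": 300, "LDem": 100},
    ayes={"LDem": 10},
    noes={"Lab": 90},
)
for row in rows:
    print(row["party"], row["expected"], row["abstentions"], row["important"])
```

With these made-up numbers the Con row is listed even though no Con member voted, because its whole expected turnout shows up as abstentions.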

26 changes: 25 additions & 1 deletion website/news.php
@@ -1,5 +1,5 @@
<? $title = "News"; include "header.inc";
# $Id: news.php,v 1.3 2003/09/12 09:41:43 frabcus Exp $
# $Id: news.php,v 1.4 2003/09/17 12:01:33 frabcus Exp $

# The Public Whip, Copyright (C) 2003 Francis Irving and Julian Todd
# This is free software, and you are welcome to redistribute it under
@@ -12,6 +12,30 @@
*/
?>

<h2>Detecting abstentions - 16 September 2003 by Francis</h2>
<p>Quite often members deliberately refrain from voting in a division,
even if they are in the House and so could have done so. Conversely, on an
important vote, the whip of one party will deliberately try and get a
higher turnout. A while ago Becka suggested a way of detecting these
effects.</p>

<p>You add up the turnouts for each party across <b>all</b> divisions
and end up with a percentage expected vote share per party. Then you
calculate, given the total turnout for this particular division, what
the percentage would lead you to expect. If the number of voters in
the party is much different from your expectation, then something
interesting is happening.</p>

<p>This calculation has been in Public Whip for a while, manifest as a
mysterious column of numbers on the party table in the division listing.
I've hopefully made it a bit clearer, using the terminology of
abstentions, and displaying high-abstention parties even if nobody in them
voted. Have a look at the recent <a
href="division.php?date=2003-09-10&number=307">Iraq and the UN vote</a>,
where the Lib Dems proposed a motion. You can see from the large
abstention number for the Conservatives that the party whip must have
been to abstain. Indeed none of them voted at all.</p>
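
The first step the news item describes, adding up turnouts per party across all divisions to get an expected vote share, might look like the Python sketch below. The record layout (one `(division_id, party)` pair per vote cast) and the function name `expected_shares` are assumptions made for illustration, not the Public Whip schema.

```python
from collections import Counter


def expected_shares(votes):
    """Return each party's share of all votes cast across all divisions.

    votes: iterable of (division_id, party) pairs, one per vote cast
    (hypothetical layout, chosen for this sketch).
    """
    per_party = Counter(party for _division, party in votes)
    total = sum(per_party.values())
    if total == 0:
        return {}
    return {party: count / total for party, count in per_party.items()}


# Toy data: two divisions with ten votes each.
votes = (
    [(1, "Lab")] * 7 + [(1, "Con")] * 2 + [(1, "LDem")] * 1
    + [(2, "Lab")] * 5 + [(2, "Con")] * 4 + [(2, "LDem")] * 1
)
shares = expected_shares(votes)

# Expected voters per party in a new division with 40 votes in total.
turnout = 40
for party, share in sorted(shares.items()):
    print(party, round(turnout * share, 1))
```

Multiplying each share by the turnout of one division gives the expected number of voters for that party; a large gap between that figure and the actual count is the signal the post is describing.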

<h2>Which Gareth Thomas? - 12 September 2003 by Francis</h2>
<p>One of the things I'm doing at the moment is improving the quality of
data for the current parliament. There are sometimes omissions or
3 changes: 2 additions & 1 deletion website/publicwhip.css
@@ -1,4 +1,4 @@
/* $Id: publicwhip.css,v 1.3 2003/09/09 14:26:08 frabcus Exp $
/* $Id: publicwhip.css,v 1.4 2003/09/17 12:01:33 frabcus Exp $
The Public Whip, Copyright (C) 2003 Francis Irving and Julian Todd
This is free software, and you are welcome to redistribute it under
@@ -68,6 +68,7 @@ hr.topline */
table td.rebel { background-color: #ee7777; }
table td.whip { font-weight: bold; }
table td.percent { text-align: right; }
table td.important { font-style: italic; font-weight: bold; }

table { margin: 0;
padding: 0;
