Rail fare increases: Charts explain passengers' frustration
In January 2019, as rail fares increased, we published an analysis of official data showing that rail users were paying more for worsening delays, staff shortages and, in some areas, an ageing fleet of carriages.
The analysis included scripts in both R (analysis, visualisation) and Python (scraping).
This is the fifth story the data unit has done on rail fare rises. In August 2018 we reported "Commuters 'pay fifth of salary' on season ticket", and 12 months before that we reported "Commuters to pay £100 more in 2018". In January 2017 we published "Rail fares: Who are the season ticket winners and losers?" and in September 2016 we published "Rail season tickets cost 10% of net pay".
Get the data
The full tweets data is not included here because it is too large for GitHub. However, the filtered file of tweets from November 20 onwards is included.
- ORR: Delays by cause and operator, 2008 to present (XLS)
- CSV: Delay minute totals by year
- CSV: Breakdown of minutes by type and operator, 2011 vs 2018
- XLS: Tweets mentioning sorry or apologies or apologise by operator
- XLS: Tweets mentioning compensation or delay repay by operator
Quotes and interviews
- Stewart Frank, commuter
- James Vasey, Bradford Rail Users Group
- Spokesperson, The Office of Rail and Road (ORR)
- Darren Shirley, Campaign for Better Transport (CBT)
- Paul Plummer, chief executive, the Rail Delivery Group
Visualisation
- Tree map: Rail delays by cause and responsibility
- Grouped bar chart: Train delays due to staff shortages, 2017 vs 2018
- Bar chart: Percentage of tweets saying 'sorry', 'apologies' or 'apologise' between November 20 and December 19 by train operator
- Column chart: Compensation claims made by Northern Rail passengers during 2018, by period
- Line chart: Age of rolling stock by operator, 2008-2018
- Table: Rise in monthly rail season ticket fares, by route
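
The bar chart of apology tweets rests on a simple share calculation: for each operator, the number of tweets containing 'sorry', 'apologies' or 'apologise' divided by that operator's total tweets. The original analysis was done in R; what follows is only a minimal Python sketch of the same idea. The function name `apology_share`, the (operator, text) row layout and the sample tweets are all assumptions made for illustration, not taken from the repository.

```python
import re
from collections import defaultdict

# Matches the three apology words named in the chart title, case-insensitively.
# Whole words only, so "apologised" would NOT be counted by this sketch.
APOLOGY_PATTERN = re.compile(r"\b(sorry|apologies|apologise)\b", re.IGNORECASE)


def apology_share(tweets):
    """Return {operator: percentage of tweets containing an apology word}.

    `tweets` is an iterable of (operator, text) pairs (a hypothetical layout).
    """
    totals = defaultdict(int)
    apologies = defaultdict(int)
    for operator, text in tweets:
        totals[operator] += 1
        if APOLOGY_PATTERN.search(text):
            apologies[operator] += 1
    return {op: 100 * apologies[op] / totals[op] for op in totals}


# Invented sample rows for illustration only; not from the real dataset.
sample = [
    ("Operator A", "Sorry for the delay to your journey this morning."),
    ("Operator A", "The 08:15 is running on time."),
    ("Operator B", "Apologies, this service has been cancelled."),
    ("Operator B", "We apologise for the disruption."),
]
shares = apology_share(sample)
```

Whether the real analysis matched whole words or substrings is not stated here, so treat the word-boundary choice as one possible reading.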
Scripts and code
- The notebook traindelays details the process of analysing ORR data on train delays.
- The R notebook 7periodcomparison takes the periodic data published by the ORR and produces totals for 7 periods, so that delays in the year to date can be compared with the same 7 periods in previous years.
- Python script to scrape Twitter accounts
- The R markdown file traintweetsrmdonly details the process of analysing tweets by train company accounts. This is not saved as a notebook because the resulting HTML file is over 40MB!
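
The 7-period comparison described above can be sketched as follows. This is not the notebook's code (which is in R) but a hedged Python illustration. It assumes each record holds a year label, a period number and a delay-minutes figure, and that the 7 periods in question are periods 1 to 7 of each year. The function name `totals_to_period` and the sample figures are invented.

```python
from collections import defaultdict


def totals_to_period(records, last_period=7):
    """Sum delay minutes per year over periods 1..last_period only.

    `records` is an iterable of (year, period, minutes) tuples
    (a hypothetical layout, not the ORR file's actual columns).
    """
    totals = defaultdict(float)
    for year, period, minutes in records:
        # Ignore later periods so every year covers the same span.
        if 1 <= period <= last_period:
            totals[year] += minutes
    return dict(totals)


# Invented figures for illustration only.
sample = [
    ("2017-18", 1, 100.0),
    ("2017-18", 7, 50.0),
    ("2017-18", 8, 999.0),  # outside the 7-period window, so excluded
    ("2018-19", 1, 120.0),
    ("2018-19", 7, 80.0),
]
totals = totals_to_period(sample)
```

Restricting every year to the same periods keeps the comparison like for like: a partial current year is never set against a full previous year.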