This repository has been archived by the owner on Jan 29, 2019. It is now read-only.

alphagov/govuk_programme_analysis


Intro

This is a simple script that scrapes the counts of 2xx, 3xx, 4xx and 5xx status codes for various apps from GOV.UK graphite. These counts are then saved as a .csv file with the format:

name, timestamp, 2xx, 3xx, 4xx, 5xx
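
As a rough illustration of that format, the sketch below writes one row with the standard library's csv module. The header line, the function name `write_report` and the example values are assumptions made for illustration; they are not taken from the script itself.

```python
import csv
import sys

# Column order as described above. Whether the real script writes a header
# line is an assumption.
HEADER = ["name", "timestamp", "2xx", "3xx", "4xx", "5xx"]


def write_report(rows, out):
    """Write the header followed by one CSV line per row to a file-like object."""
    writer = csv.writer(out)
    writer.writerow(HEADER)
    writer.writerows(rows)


# Hypothetical values, for illustration only.
example_rows = [["Example app", 1490572800, 1200, 30, 4, 1]]
write_report(example_rows, sys.stdout)
```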

Requirements

This script requires Python 2.7 and uses only packages from the standard library (urllib2, datetime, json and csv). Because urllib2 exists only in Python 2, the script will not run unmodified under Python 3.

Installation

Either clone this repository:

git clone git@github.com:alphagov/govuk_programme_analysis.git

or copy just the raw script (status_counts.py).

Usage

This should be run from the terminal using python from this directory:

cd ~/govuk/govuk_programme_analysis # Or wherever you've copied the script
python status_counts.py

and produce output like:

Getting stats for:
    Whitehall frontend
    Manuals frontend
    Government frontend
[...]
    Whitehall Admin
    Content Store
Got data for 39 apps.
All done, 2017-04-04_status_code_report.csv created

How it works

The script makes a series of calls to Graphite's render API, asking it for JSON output. The JSON is then parsed and written out as CSV (the render API can produce CSV natively, but JSON is easier to work with).
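
The parsing step can be sketched as follows. With format=json, the render API returns a list of series, each holding a "target" and a list of [value, timestamp] datapoints. The sample response, the target name in it and the helper `parse_count` are hypothetical; the real script may extract the values differently.

```python
import json

# Hypothetical response text, shaped like Graphite's JSON render output.
SAMPLE_RESPONSE = """
[{"target": "example.http_2xx",
  "datapoints": [[1234.0, 1490572800]]}]
"""


def parse_count(response_text):
    """Return (timestamp, count) for the first non-null datapoint, or None."""
    for series in json.loads(response_text):
        for value, timestamp in series["datapoints"]:
            if value is not None:  # Graphite reports missing data as null
                return timestamp, value
    return None


print(parse_count(SAMPLE_RESPONSE))
```

Returning None when there is no usable datapoint lets the caller decide whether to skip the app or record an empty cell.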

The API call to graphite uses several parameters & functions:

  • from=-2weeks The start of the window to produce output for (2 weeks ago)
  • until=-1weeks The end of the window to produce output for (1 week ago)
  • target= The expression to produce output for, built from:
    • sumSeries Sums the series it is given into a single series (here it wraps the hitcount function)
      • hitcount Estimates the count of events from the rates recorded at the stats-path
        • stats-path The path to estimate for
        • intervalString The interval to estimate over (value: 1week)
  • format=json Requests JSON-formatted output
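
Put together, the parameters above could be assembled into a request URL along these lines. This is a minimal sketch: the base URL is a placeholder (the real GOV.UK Graphite host is not given here), and the function names `build_target` and `build_render_url` are hypothetical.

```python
try:
    from urllib import urlencode        # Python 2, which the script targets
except ImportError:
    from urllib.parse import urlencode  # Python 3

# Placeholder base URL; an assumption, not the real Graphite host.
GRAPHITE_RENDER_URL = "https://graphite.example.com/render"


def build_target(stats_path, interval="1week"):
    """Wrap a stats-path in hitcount, then in sumSeries."""
    return 'sumSeries(hitcount({0},"{1}"))'.format(stats_path, interval)


def build_render_url(stats_path):
    """Return a render API URL covering the week that ended one week ago."""
    params = [
        ("from", "-2weeks"),
        ("until", "-1weeks"),
        ("target", build_target(stats_path)),
        ("format", "json"),
    ]
    return GRAPHITE_RENDER_URL + "?" + urlencode(params)


print(build_render_url("example.stats.path"))
```

Using urlencode percent-encodes the parentheses, quotes and wildcards in the target, so the expression survives being passed as a query string.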

Graphite paths

Graphite uses dot-delimited paths to organise its metrics. For each app/status code combination we need to generate a path. To do this we split the Graphite path into three sections:

  • The host
  • The app-path
  • The status code

The hosts are of the form stats.hostname-*.nginx_logs, where the hostname is something like frontend. The * is a wildcard that tells Graphite to match every host of that name (for example frontend-1, frontend-2), so that the counts can be aggregated across all of them. The app-path is the full name of the app (for example calculators_publishing_service_gov_uk) and the status code is, for example, http_2xx.
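
The three sections can be joined into a full path as sketched below. The function name `build_stats_path` is hypothetical, the http_3xx, http_4xx and http_5xx names are assumed by analogy with http_2xx, and pairing the frontend hostname with the calculators app is only an example.

```python
# http_2xx comes from the text above; the other three are assumed by analogy.
STATUS_CODES = ["http_2xx", "http_3xx", "http_4xx", "http_5xx"]


def build_stats_path(hostname, app_path, status_code):
    """Join the host, app-path and status code into one Graphite path."""
    host = "stats.{0}-*.nginx_logs".format(hostname)  # * matches every host
    return ".".join([host, app_path, status_code])


# Example pairing of hostname and app; the real mapping lives in the script.
for code in STATUS_CODES:
    print(build_stats_path("frontend", "calculators_publishing_service_gov_uk", code))
```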

The apps

  • Metadata API
  • Asset Manager
  • Collections Publisher
  • Contacts Admin
  • Content API
  • Content performance manager
  • Content Tagger
  • Email Alert Api
  • HMRC manuals API
  • Imminence
  • Local links manager
  • Manuals Publisher
  • Maslow
  • Need API
  • Policy Publisher
  • Publisher
  • Publishing API
  • Release
  • Search admin
  • Short URL manager
  • Signon
  • Specialist publisher
  • Specialist Publisher
  • Support (api)
  • Transition
  • Travel advice publisher
  • Content Store
  • Calculators frontend
  • Calendars frontend
  • Smartanswers frontend
  • Feedex (support form)/Feedback
  • Government frontend
  • Info frontend
  • Manuals frontend
  • Specialist frontend
  • Mapit
  • Rummager (search API)
  • Whitehall Admin
  • Whitehall frontend

Not currently included

  • Info pages frontend - this doesn't seem to exist as a separate app.
  • Feedback forms - this doesn't seem to exist as a separate app.
  • Bouncer - this produces status codes for a large number of domains (for example 'directgov' and 'businesslink') and it is unclear which statistics should be gathered for it.

About

Scripts for helping the programme analysis team
