Content Performance Manager

A data warehouse that stores content and content metrics, to help content owners measure and improve content on GOV.UK.

This repository contains:

  • Extract, transform, load (ETL) processes for populating the data warehouse
  • An internal tool for exploring the data (AKA the sandbox)
  • Content performance API (docs)

Data is combined from multiple sources, including the publishing platform, user analytics, user feedback, and readability indicators.
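As a rough illustration of how data from several sources might be combined, here is a minimal ETL sketch. It is not the repository's implementation: the source names, the field names and the in-memory `warehouse` array are all assumptions made for the example.

```ruby
# Extract: hypothetical per-source data keyed by page path.
# In a real pipeline each of these would come from an external service.
def extract
  {
    publishing: { "/example-page" => { title: "Example page" } },
    analytics: { "/example-page" => { pageviews: 120 } },
    feedback: { "/example-page" => { feedback_comments: 3 } },
  }
end

# Transform: merge the attributes from every source into one row per page.
def transform(sources)
  paths = sources.values.flat_map(&:keys).uniq
  paths.map do |path|
    sources.values.reduce({ base_path: path }) do |row, source|
      row.merge(source.fetch(path, {}))
    end
  end
end

# Load: append the rows to a store (an Array stands in for the database).
def load_rows(rows, warehouse)
  warehouse.concat(rows)
end

warehouse = []
load_rows(transform(extract), warehouse)
```

After this runs, `warehouse` holds one combined row per page path.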

Nomenclature

  • Data warehouse: the database where we store all the metrics.
  • ETL: extract, transform, load; how we get data into the data warehouse.
  • Fact: a record containing measurements or metrics.
  • Dimension: a characteristic that provides context for a fact (such as the time it was extracted, or the content item it belongs to).
  • Star schema: the way we structure data in the data warehouse, using fact and dimension tables.
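The terms above can be sketched in plain Ruby. This is only an illustration of the star-schema idea: the struct names, attributes and sample values are hypothetical and are not taken from the application's actual tables.

```ruby
require "date"

# Hypothetical dimensions: each row gives context for a fact.
DateDimension = Struct.new(:id, :date)
ItemDimension = Struct.new(:id, :base_path, :title)

# Hypothetical fact: measurements plus foreign keys pointing at dimensions.
MetricFact = Struct.new(:date_id, :item_id, :pageviews, :feedback_comments)

dates = [
  DateDimension.new(1, Date.new(2018, 9, 23)),
  DateDimension.new(2, Date.new(2018, 9, 24)),
]
items = [
  ItemDimension.new(1, "/example-page", "Example page"),
  ItemDimension.new(2, "/another-page", "Another page"),
]
facts = [
  MetricFact.new(1, 1, 120, 3),
  MetricFact.new(2, 1, 80, 1),
  MetricFact.new(2, 2, 45, 0),
]

# Join facts to one dimension through a foreign key, then sum a metric
# for each value of the chosen dimension attribute.
def total_by(facts, dimension, foreign_key, attribute, metric)
  rows_by_id = dimension.map { |row| [row.id, row] }.to_h
  facts
    .group_by { |fact| rows_by_id.fetch(fact.public_send(foreign_key)).public_send(attribute) }
    .transform_values { |group| group.sum { |fact| fact.public_send(metric) } }
end

pageviews_by_page = total_by(facts, items, :item_id, :base_path, :pageviews)
pageviews_by_date = total_by(facts, dates, :date_id, :date, :pageviews)
```

The same fact rows answer both questions (per page and per day) because the context lives in the dimensions rather than in the facts themselves.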

Dependencies

Setting up the application

Using the GDS development VM

See the getting started guide for instructions about setting up and running your development VM.

In the development VM, go to:

cd /var/govuk/govuk-puppet/development-vm

Then run:

bowl content-performance-manager

The application can be accessed from:

http://content-performance-manager.dev.gov.uk

Running the test suite

To run the test suite:

$ bundle exec rake

Alternatively, you can use Guard (see its list of commands):

$ bundle exec guard

Populating data

If you are a GOV.UK developer using the development VM, you can run the replication script to populate the database.

To run the ETL process locally, you need to set up Google Analytics credentials in development.

Updating the API

All changes

Any time you change what the API accepts as input or returns as output, you need to update the OpenAPI spec and documentation.

Backwards incompatible changes

The API is currently in alpha, so users should expect backwards incompatible changes without warning.

When the API is live, we will follow the GDS API technical and data standards:

  • make backwards compatible changes where possible
  • use a version number as part of the URL when making backwards incompatible changes
  • make a new endpoint available for significant changes
  • provide notices for deprecated endpoints
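One way to picture versioned URLs and deprecation notices is the small resolver below. It is a sketch under stated assumptions, not how this API is implemented: the `/api/v1/...` path layout, the version registry and the wording of the notice are all hypothetical.

```ruby
# Hypothetical registry of API versions (illustrative values only).
API_VERSIONS = {
  "v1" => { deprecated: true, successor: "v2" },
  "v2" => { deprecated: false, successor: nil },
}.freeze

# Assumed URL layout: /api/<version>/<resource>
VERSIONED_PATH = %r{\A/api/(?<version>v\d+)/(?<resource>.+)\z}

# Resolve a request path to a response hash, attaching a notice
# when the requested version is marked as deprecated.
def resolve(path)
  match = VERSIONED_PATH.match(path)
  return { status: 404 } unless match && API_VERSIONS.key?(match[:version])

  version = API_VERSIONS.fetch(match[:version])
  response = { status: 200, version: match[:version], resource: match[:resource] }
  if version[:deprecated]
    response[:notice] = "#{match[:version]} is deprecated; use #{version[:successor]}"
  end
  response
end
```

Keeping the version in the path lets an old and a new endpoint run side by side while clients migrate.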

Licence

MIT License