Content Performance Manager
A data warehouse that stores content and content metrics, to help content owners measure and improve content on GOV.UK.
This repository contains:
- Extract, transform, load (ETL) processes for populating the data warehouse
- An internal tool for exploring the data (AKA the sandbox)
- Content performance API (docs)
- Data warehouse: the database where we store all the metrics.
- ETL: extract, transform, load - how we get data into the data warehouse.
- Fact: a record containing measurements/metrics.
- Dimension: a characteristic that provides context for a fact (such as the time it was extracted, or the content item it belongs to).
- Star schema: the way we structure data in the data warehouse using fact and dimension tables.
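The terms above can be illustrated with a minimal, in-memory sketch. It is not the application's actual schema or ETL code: every name here (`DateDimension`, `ItemDimension`, `MetricFact`, `extract_rows`, `transform_rows`, `load_rows`, `pageviews_by_path`) and all of the sample data are hypothetical, chosen only to show how facts reference dimensions.

```ruby
# Hypothetical star schema held in memory; names and fields are illustrative only.
DateDimension = Struct.new(:id, :date)
ItemDimension = Struct.new(:id, :base_path, :title)
MetricFact    = Struct.new(:date_id, :item_id, :pageviews)

# Extract: stand-in for rows fetched from an external source (made-up data).
def extract_rows
  [
    { date: "2018-01-01", base_path: "/vat-rates", title: "VAT rates", pageviews: "120" },
    { date: "2018-01-01", base_path: "/tax-codes", title: "Tax codes", pageviews: "45" },
    { date: "2018-01-02", base_path: "/vat-rates", title: "VAT rates", pageviews: "98" },
  ]
end

# Transform: coerce raw string counts into integers.
def transform_rows(rows)
  rows.map { |row| row.merge(pageviews: row[:pageviews].to_i) }
end

# Load: find or create each dimension, then store a fact that points at both.
def load_rows(rows)
  dates = {}
  items = {}
  facts = rows.map do |row|
    date = dates[row[:date]] ||= DateDimension.new(dates.size + 1, row[:date])
    item = items[row[:base_path]] ||=
      ItemDimension.new(items.size + 1, row[:base_path], row[:title])
    MetricFact.new(date.id, item.id, row[:pageviews])
  end
  { dates: dates.values, items: items.values, facts: facts }
end

# Query: join facts to the item dimension and total the pageviews per path.
def pageviews_by_path(warehouse)
  paths = warehouse[:items].map { |item| [item.id, item.base_path] }.to_h
  warehouse[:facts]
    .group_by { |fact| paths.fetch(fact.item_id) }
    .transform_values { |facts| facts.sum(&:pageviews) }
end

warehouse = load_rows(transform_rows(extract_rows))
puts pageviews_by_path(warehouse).inspect
```

Each fact stores only measurements plus the ids of its dimensions, so descriptive attributes such as the title are held once per content item rather than once per measurement.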
Setting up the application
Using the GDS development VM
See the getting started guide for instructions about setting up and running your development VM.
In the development VM, go to:
The application can be accessed from:
Running the test suite
To run the test suite:
$ bundle exec rake
Alternatively, use Guard to re-run tests as files change:
$ bundle exec guard
If you are a GOV.UK developer using the development VM, you can run the replication script to populate the database.
To run the ETL process locally, you need to set up Google Analytics credentials in development.
Updating the API
Anytime you change what the API accepts as input or returns as output, you need to update the OpenAPI spec and documentation.
Backwards incompatible changes
Currently the API is in alpha, so users should expect backwards incompatible changes without warning.
When the API is live, we will follow the GDS API technical and data standards:
- make backwards compatible changes where possible
- use a version number as part of the URL when making backwards incompatible changes
- make a new endpoint available for significant changes
- provide notices for deprecated endpoints
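As a rough sketch of what a version number in the URL and a deprecation notice could look like in practice, the helper below parses a versioned path and returns a notice for retired versions. The `/api/v<N>/...` path layout, the `VERSIONED_PATH` pattern, the `DEPRECATED_VERSIONS` list and both method names are assumptions made for illustration; the API's real URL structure is not described here.

```ruby
# Hypothetical path layout: "/api/v<version>/<resource>". Not the API's confirmed URLs.
VERSIONED_PATH = %r{\A/api/v(?<version>\d+)/(?<resource>.+)\z}

# Hypothetical list of versions that have been superseded.
DEPRECATED_VERSIONS = [1].freeze

# Splits a versioned path into its version number and resource,
# or returns nil when the path carries no version.
def parse_versioned_path(path)
  match = VERSIONED_PATH.match(path)
  return nil unless match

  { version: match[:version].to_i, resource: match[:resource] }
end

# Returns a notice for deprecated versions, or nil for current ones.
def deprecation_notice(version)
  return nil unless DEPRECATED_VERSIONS.include?(version)

  "API v#{version} is deprecated; please migrate to a newer version."
end

request = parse_versioned_path("/api/v1/metrics")
puts deprecation_notice(request[:version]) if request
```

Keeping the version in the path lets an old and a new version of an endpoint be served side by side while clients migrate.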