
Content Data API

A data warehouse that stores content and content metrics, to help content owners measure and improve content on GOV.UK.

This repository contains:

Data is combined from multiple sources, including the publishing platform, user analytics, and user feedback.

Introduction

Live examples

Nomenclature

  • Data warehouse: the database where we store all the metrics.
  • ETL: extract, transform, load. This is how we get data into the data warehouse.
  • Fact: a record containing measurements or metrics.
  • Dimension: a characteristic that provides context for a fact (such as the time it was extracted, or the content item it belongs to).
  • Star schema: the way we structure data in the data warehouse, using fact and dimension tables.
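
The terms above can be illustrated with a minimal, in-memory sketch. It is not this application's actual schema or ETL code: the struct names, field names, and sample rows are all hypothetical, chosen only to show how raw rows are extracted, transformed into typed values, and loaded as facts that point at shared dimensions.

```ruby
require "date"

# Hypothetical, simplified stand-ins for dimension and fact tables.
DateDimension = Struct.new(:id, :date)
EditionDimension = Struct.new(:id, :base_path, :title)
MetricFact = Struct.new(:date_id, :edition_id, :pageviews, :feedback_comments)

# Extract: raw rows as they might arrive from a source system (made-up data).
RAW_ROWS = [
  { date: "2019-08-01", base_path: "/vat-rates", title: "VAT rates", pageviews: "120", feedback: "3" },
  { date: "2019-08-02", base_path: "/vat-rates", title: "VAT rates", pageviews: "95", feedback: "1" },
  { date: "2019-08-01", base_path: "/bank-holidays", title: "Bank holidays", pageviews: "300", feedback: "0" },
].freeze

# Transform and load: parse the raw strings, reuse or create one dimension
# record per distinct date and content item, and emit one fact per row.
def load_star_schema(rows)
  dates = {}
  editions = {}

  facts = rows.map do |row|
    date = dates[row[:date]] ||= DateDimension.new(dates.size + 1, Date.parse(row[:date]))
    edition = editions[row[:base_path]] ||=
      EditionDimension.new(editions.size + 1, row[:base_path], row[:title])

    MetricFact.new(date.id, edition.id, Integer(row[:pageviews]), Integer(row[:feedback]))
  end

  { dates: dates.values, editions: editions.values, facts: facts }
end

# Query: look up the dimension first, then sum the facts that reference it.
def total_pageviews(warehouse, base_path)
  edition = warehouse[:editions].find { |e| e.base_path == base_path }
  return 0 unless edition

  warehouse[:facts].select { |f| f.edition_id == edition.id }.sum(&:pageviews)
end
```

In this sketch, `total_pageviews(load_star_schema(RAW_ROWS), "/vat-rates")` sums the two facts recorded for that content item. The point of the layout is that each fact stays small and numeric, while descriptive attributes live once in a dimension.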

Technical documentation

This is a Ruby on Rails application that stores performance metrics and content changes over time, and exposes this information via an API. It is built on a PostgreSQL 9.6 database.
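
As a rough illustration of exposing metrics over time, the sketch below turns daily measurements into a JSON time series for a date range. It is an assumption-laden example, not the API defined in openapi.yaml: the method name, the response fields, and the sample numbers are all hypothetical.

```ruby
require "date"
require "json"

# Hypothetical daily measurements for one content item (made-up numbers).
DAILY_PAGEVIEWS = {
  Date.new(2019, 8, 1) => 120,
  Date.new(2019, 8, 2) => 95,
  Date.new(2019, 8, 3) => 130,
}.freeze

# Build a JSON payload for an inclusive date range.
# The field names are illustrative and are not taken from openapi.yaml.
def time_series_json(daily, from:, to:)
  points = daily.select { |date, _| (from..to).cover?(date) }
                .sort_by { |date, _| date }
                .map { |date, value| { date: date.iso8601, value: value } }

  JSON.generate({
    metric: "pageviews",
    total: points.sum { |point| point[:value] },
    time_series: points,
  })
end
```

Calling `time_series_json(DAILY_PAGEVIEWS, from: Date.new(2019, 8, 1), to: Date.new(2019, 8, 2))` returns a JSON string containing the two matching days and their total.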

Dependencies

Running the application

See the getting started guide for instructions about setting up and running your development VM.

cd /var/govuk/govuk-puppet/development-vm
bowl content-data-api

The application can be accessed at http://content-data-api.dev.gov.uk and runs on port 3235 in your development environment.

Running the test suite

To run the test suite:

$ bundle exec rake

Populating data

If you are a GOV.UK developer using the development VM, you can run the replication script to populate the database.

Run ETL processes locally

Licence

MIT License
