govuk-one-login/data-analytics-platform

DI Data Analytics Platform

Data and Analytics platform which will enable the implementation of the OneLogin reporting strategy.

Prerequisites

Install development tools

The project uses the current (as of 03/05/2023) LTS of Node, version 18. The GDS recommendation is to use nvm to manage Node versions - installation instructions can be found here.

Core
  • AWS SAM CLI - for running SAM commands
  • Node - for lambda development and running npm commands
  • Docker - for running sam local
  • Checkov - for validating IaC code. Install on GDS Macs in the terminal by running pip3 install checkov
Optional
  • AWS CLI - for interacting with AWS on the command line
  • GitHub CLI - for interacting with GitHub on the command line. Can do some things not possible via the GUI, such as running workflows that have not been merged to main

Set up commit signing

Commits will be rejected by GitHub if they are not signed using an SSH or GPG key. SSH keys do not support expiration or revocation so GPG is preferred. Follow the instructions here to generate a key and set it up with GitHub. You may need to install gpg first - on a GDS Mac open the terminal and run brew install gpg.

Set up husky hooks

Husky is used to run githooks, specifically pre-commit and pre-push. To install the hooks, run npm run husky:install. After this, the hooks defined under the .husky directory will automatically run when you commit or push.* The lint-staged library is used to only run certain tasks when certain files are modified.

Config can be found in the lint-staged block in package.json. Note that lint-staged works by passing the list of matched staged files to the command defined, which is why the commands in package.json are e.g. prettier --write, with no file, directory or glob arguments (usually if you wanted to run prettier you would need such an argument, e.g. prettier --write . or prettier --check src). More information can be found here.
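For illustration, a lint-staged block of roughly this shape (the globs and commands here are hypothetical, not necessarily the repository's exact config) maps file patterns to the commands that receive the staged file list:

```json
{
  "lint-staged": {
    "*.ts": ["prettier --write", "eslint --fix"],
    "*.{yml,yaml,json}": ["prettier --write"]
  }
}
```

Because lint-staged appends the matched filenames itself, each command is listed without any path or glob argument.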

* Git LFS hooks also live in this directory - see section below

Set up Git LFS

If you intend to make changes to any of the large binary files in this repository (currently just *.tar.gz and *.jar) then you will need to install Git LFS. This is necessary as GitHub blocks files larger than 100 MiB.

If you do not install Git LFS you will only get the pointer files and not the actual data. This is not a problem unless you want to edit these files. See this section of the GitHub docs for more information

Git LFS also uses hooks, specifically post-checkout, post-commit, post-merge and pre-push. In the case of the latter, husky also uses this hook which is why the file at .husky/_/pre-push contains both husky and Git LFS code. Note that the Git LFS hooks are in the husky directory because husky was installed in the repository before Git LFS and so that directory structure was already in place. Manually editing the hooks was necessary due to the clash on pre-push, and this comment was the general direction taken.
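As a rough sketch (not the file's exact contents), the merged .husky/_/pre-push hook runs both tools in sequence, along these lines:

```sh
#!/bin/sh
# Illustrative sketch only - the real hook is generated by husky and
# then hand-edited to also run the Git LFS pre-push hook.

# Git LFS portion: refuse to push if git-lfs is missing
command -v git-lfs >/dev/null 2>&1 || {
  echo >&2 "git-lfs was not found on your path; large files will not be pushed correctly."
  exit 2
}
git lfs pre-push "$@"

# husky portion: run the project's pre-push tasks (task name hypothetical)
npm run test
```

If either portion exits non-zero, the push is aborted, which is the behaviour both tools rely on.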

Repository structure

Lambdas

The lambdas and supporting code are written in TypeScript and built with esbuild.

Individual lambda handlers (and unit tests) can be found in subdirectories of the src/handlers directory. Common and utility code can be found in the src/shared directory.

In addition, files to support running lambdas with sam local invoke are in the sam-local-examples directory.

IaC

IaC code is written in AWS SAM (a superset of CloudFormation templates) and deployed as SAM applications.

IaC code can be found in the iac directory. There are currently two applications, each with its own subdirectory (main and quicksight-access). In each there is a base file, base.yml, which contains everything except the Resources section. In the resources/ subdirectory, there are YAML files containing all the stack resources, grouped by functional area.

A package.json script, iac:build, concatenates all these files for a particular application into a single top-level template.yaml file that is expected by SAM and Secure Pipelines. The script requires an argument for which application you wish to build, e.g. npm run iac:build -- main. To build all applications at once (useful for linting and scanning), an additional npm script, iac:buildall, exists which puts the template files it builds into the (git ignored) iac-dist directory.
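Conceptually, iac:build is just a concatenation step. The stand-in sketch below (all paths, file contents and resource names are invented for illustration, not the real templates) shows the idea of stitching a base file and per-functional-area resource files into one template:

```shell
# Illustration only: build a single template.yaml from a base file plus
# per-functional-area resource fragments (all names here are made up;
# for simplicity the base here also carries the Resources: header).
mkdir -p /tmp/iac-demo/resources
printf 'AWSTemplateFormatVersion: "2010-09-09"\nResources:\n' > /tmp/iac-demo/base.yml
printf '  DemoBucket:\n    Type: AWS::S3::Bucket\n' > /tmp/iac-demo/resources/storage.yml
printf '  DemoQueue:\n    Type: AWS::SQS::Queue\n' > /tmp/iac-demo/resources/queues.yml
# Concatenate the base and every resource file into the final template
cat /tmp/iac-demo/base.yml /tmp/iac-demo/resources/*.yml > /tmp/iac-demo/template.yaml
```

The real script additionally takes the application name (main or quicksight-access) to select which subdirectory of iac to build.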

The AWS SAM config is at samconfig.toml.

Workflows

Workflows that enable GitHub Actions can be found in the .github/workflows directory. Below is a list of workflows. The ✳️ symbol at the start of a workflow name indicates that it can be run manually.

| Name | File | Purpose |
| ---- | ---- | ------- |
| Deploy to an AWS environment | deploy-to-aws.yml | Deploys to a deployable AWS environment (dev, build, test) |
| ✳️ Deploy to the test environment | deploy-to-test.yml | Deploys IaC and lambda code to the test AWS |
| ✳️ Deploy to the dev environment | deploy-to-dev.yml | Deploys IaC and lambda code to the dev AWS |
| Deploy to the build environment | deploy-to-build.yml | Deploys IaC and lambda code to the build AWS |
| ✳️ Test and validate iac and lambdas | test-and-validate.yml | Runs linting, formatting and testing of lambda code, and linting and scanning of IaC code |
| ✳️ Upload Athena files to S3 | upload-athena-files.yml | Uploads athena scripts for a particular environment (under athena-scripts) to S3 |
| ✳️ Pull request deploy and test | pull-request-deploy-and-test.yml | Deploys a pull request branch to the feature environment and runs integration tests when a pull request is opened, reopened or updated |
| ✳️ Pull request tear down | pull-request-tear-down.yml | Tears down the feature environment when a pull request is merged or otherwise closed |
| Upload testing image to ECR | upload-testing-image.yml | Builds a testing dockerfile in tests/scripts/ and uploads the image to ECR |
| ✳️ Upload testing images to ECR | upload-testing-images.yml | Builds one or more testing dockerfiles in tests/scripts/ and uploads the images to ECR. Which dockerfiles to build can be specified via inputs |
| SonarCloud Code Analysis | code-quality-sonarcloud.yml | Runs a SonarCloud analysis on the repository |
| ✳️ Run flyway command on redshift | run-flyway-command.yml | Runs a specified flyway command on the redshift database in a specified environment. For more on how to use this workflow see the README here |
| ✳️ Add Quicksight user | add-quicksight-user.yml | Provides an interface to add a user to Cognito and Quicksight by invoking the quicksight-add-users lambda |
| ✳️ Add Quicksight users from spreadsheet | add-quicksight-users.yml | Reads the DAP account management spreadsheet and attempts to add users to Cognito and Quicksight |
| ✳️ Deploy to the production preview environment | deploy-to-production-preview.yml | Deploys to the production-preview environment |
| SAM deploy | sam-deploy.yml | Performs a SAM deploy to an environment without secure pipelines (feature, production-preview) |
| ✳️ Upload Flyway files to S3 | upload-flyway-files.yml | Uploads flyway files for a particular environment (under redshift-scripts/flyway) to S3 |
| ✳️ Export analysis from Quicksight | quicksight-export.yml | Exports a Quicksight analysis to S3 using the asset bundle APIs |
| ✳️ Import analysis to Quicksight | quicksight-import.yml | Imports a Quicksight analysis from S3 using the asset bundle APIs |

Testing

Unit tests

Unit testing is done with Jest and the lambdas should all have associated unit tests (*.spec.ts).

  • npm run test - run all tests under src/
  • jest consumer - run a specific test or tests
    • anything after jest is used as a regex match - so in this example consumer causes jest to match all tests under the src/handlers/txma-event-consumer/ directory (and any other directory that might have consumer in its name)

Integration tests

TODO

Test reports

After running unit or integration tests, a test report called index.html will be available in the test-report directory. This behaviour is provided by jest-stare and configured in jest.config.js.
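The wiring for this typically goes through Jest's reporters option. A jest.config.js fragment along these lines would produce that behaviour (the options shown are illustrative; check the repository's actual jest.config.js):

```javascript
// Sketch of a jest.config.js using jest-stare as an additional reporter.
// resultDir tells jest-stare where to write index.html.
module.exports = {
  reporters: [
    'default',
    ['jest-stare', { resultDir: 'test-report' }],
  ],
};
```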

Linting, formatting and validation

Lambdas

Linting and formatting are handled by ESLint and Prettier (with an EditorConfig file) respectively. typescript-eslint is used to allow these tools to work with TypeScript.

  • npm run lint:check - run linting and formatting checks and print warnings
  • npm run lint:fix - run linting and formatting checks and (attempt to) automatically fix issues

IaC

AWS SAM can perform validation and linting of CloudFormation files, and checkov can find misconfigurations. Prettier is used to ensure consistent formatting of the YAML of the SAM templates.

  • npm run iac:lint - run validation and linting checks and print warnings
  • npm run iac:scan - run checkov scan and print warnings
  • npm run iac:format:check - run formatting checks and print warnings
  • npm run iac:format:fix - run formatting checks and automatically fix issues

Scripts

Prettier is used to ensure consistent formatting of the script files in the scripts/ directory. The ability to format shell scripts comes from the prettier-plugin-sh library.

  • npm run scripts:format:check - run formatting checks and print warnings
  • npm run scripts:format:fix - run formatting checks and automatically fix issues

Building and running

Lambdas

  • npm run build - build (transpile, bundle, etc.) lambdas into the dist directory

Lambdas can be run locally with sam local invoke. A few prerequisites:

  • Docker is running
  • The lambda you wish to run has been built into a .js file (npm run build)
  • The lambda you wish to run is defined in CloudFormation and has been built into the top-level template.yaml file (npm run iac:build)
    • You can use the CloudFormation resource name (e.g. AthenaGetConfigLambda or EventConsumerLambda) to refer to the lambda in the invoke command
  • SAM application has been built (sam build)
    • Order matters here - this command copies the lambda JS into .aws-sam/, so make sure npm run build has been run beforehand
  • You have defined a JSON file (ideally here) containing the event you wish to be the input event of the lambda (unless you don't need an input event)
  • You have added any environment variables you need the lambda to take to env.json

An example invocation might be

npm run build
npm run iac:build
sam build

# invoke with no input event or environment vars
sam local invoke EventConsumerLambda

# invoke specifying both an input event and environment variables
sam local invoke EventConsumerLambda --env-vars sam-local-examples/env.json --event sam-local-examples/txma-event-consumer/valid.json
A note on args
  • The --env-vars arg takes the path to a JSON file with any environment vars you want the lambda to have access to (via node process.env). Find these (and define more) in per-function objects within the main object in sam-local-examples/env.json
  • The --event arg takes the path to a JSON file with the input event you want the lambda to have. Find these (and define more) in per-function subdirectories under sam-local-examples/
  • A different template file path can be specified with the --template-file flag
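For illustration, env.json groups variables per function, keyed by the CloudFormation resource name (the variable names and values below are made up):

```json
{
  "EventConsumerLambda": {
    "SOME_SETTING": "some-value"
  },
  "AthenaGetConfigLambda": {
    "ANOTHER_SETTING": "another-value"
  }
}
```

Inside the lambda these values are then readable via process.env as usual.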

SAM local can also be used to generate events. An example invocation might be sam local generate-event sqs receive-message or sam local generate-event s3 put. You can run sam local generate with no args for a list of supported services.

IaC

AWS SAM can build the YAML template. Artifacts will be placed into .aws-sam/. If you wish the lambda code to be included, it must first have been built into a .js file (npm run build). An example invocation might be

sam build

which will build template.yaml and use the lambda code in dist/. A different template file path can be specified with the --template-file flag and a different lambda code directory by changing the CodeUri global property in template.yaml.

Deploying and environments

Deployment is done via Secure Pipelines*. Deployments go through the Secure Pipelines SAM deployment stack, and after the SAM deployment tests are run in Secure Pipelines testing containers.

The deployment of the platform is currently split into two applications, main and quicksight-access (each having its own subdirectory in iac). This split was needed to overcome a hard character limit we had hit on the programmatic permissions boundary (used by lambdas), caused by the number of AllowedServices in the SAM deployment stack. The solution was to create a second SAM deployment stack to hold some of the AllowedServices and split off part of the IaC into its own application deployed by that stack (the Cognito and Quicksight functionality, as it was the source of the most recently requested permissions that had put our permissions boundary over the limit).

From a Secure Pipelines point of view, environments can be split into two types: 'higher' and 'lower' environments. The lower environments are test, dev and build**. The higher environments are staging, integration and production. More information can be found using the Secure Pipelines link above, but the key differences are that the lower environments are the only ones that can be deployed to directly from GitHub, while deployment to the higher environments relies on 'promotion' from a lower environment, specifically the build environment. In addition, the higher environment lambdas are triggered by real TxMA event queues***, whereas lower environments use placeholder queues that we create and must put our own test events onto.

* With the exception of the feature environment - see section below

** Strictly speaking, test and dev do not form part of the Secure Pipelines build system which takes an application that is deployed to build all the way to production via the other higher environments. Our test and dev environments are disconnected sandboxes; however they still use Secure Pipelines to deploy directly from GitHub

*** An important exception is that dev is connected to the real TxMA staging queue. This is intended to be temporary since at time of writing we do not have the higher environments set up. Once our own staging account is ready, it will receive the real TxMA staging queue and dev will get a placeholder queue

Lower Environments

Test

Our test environment is a standalone environment and can therefore be used as a sandbox. A dedicated GitHub Action Deploy to the test environment exists to enable this. It can be manually invoked on a chosen branch by finding it in the GitHub Actions tab and using the Run workflow button.

Dev

Our dev environment is also a standalone environment and can therefore be used as a sandbox. A dedicated GitHub Action Deploy to the dev environment exists to enable this, allowing manual deploys like the one for test.

Additionally, the action will automatically run after a merge into the main branch after a Pull Request is approved.

Build

The build environment is the entry point to the Secure Pipelines world. It is sometimes referred to as the 'Initial Account' in Secure Pipelines, as it is the first account on the journey to Production, and it has unique needs (compared with the higher environments) such as the ability to deploy to it from GitHub.

A GitHub Action Deploy to the build environment exists to enable this. The action cannot be invoked manually like the one for dev, only by merging into the main branch after a Pull Request is approved.

Higher Environments

Higher environment config

Because they use real TxMA event queues (from external AWS accounts and not in our IaC code), deployment to higher environments* relies on the following AWS Systems Manager parameters being available in the target account:

| Name | Description |
| ---- | ----------- |
| TxMAEventQueueARN | ARN of the TxMA event queue which triggers the txma-event-consumer lambda |
| TxMAKMSKeyARN | ARN of the TxMA KMS key needed for the txma-event-consumer lambda |

* These parameters are also required in the dev account for the reasons mentioned above (dev currently having the real TxMA staging queue). They are additionally required in the production preview account as it also has a real TxMA queue.

You can see these values being referenced in the template files in the following way:

'{{resolve:ssm:TxMAEventQueueARN}}'
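For example, a queue-triggered function in the template might wire the parameter in roughly like this (the resource and event names here are illustrative, not the repository's exact template):

```yaml
EventConsumerLambda:
  Type: AWS::Serverless::Function
  Properties:
    Events:
      TxMAEvent:
        Type: SQS
        Properties:
          # Resolved from the SSM parameter at deploy time
          Queue: '{{resolve:ssm:TxMAEventQueueARN}}'
```

Using a dynamic reference means the external queue ARN never has to be hardcoded into the template.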

See the AWS Systems Manager documentation for how to create the parameters.

Parameter values can be found on this page - recall that our dev environment currently takes the values assigned to staging on that page.

Staging

The staging environment is the first higher environment and so cannot be directly deployed to. When a deployment pipeline is successful in the build environment, the artifact will be put in a promotion bucket in the build account, which is polled by staging. When staging picks up a new build it is deployed to that environment.

Integration and Production

The integration and production environments are the second (and final) level of higher environment. They behave like the staging environment in the sense that they cannot be deployed to directly but instead poll for promoted artifacts from a lower environment. The difference between them and staging is that the promotion bucket integration and production poll is the one in the staging account.

Other Environments

The following accounts are not in secure pipelines.

Feature

The feature environment is a standalone environment for the purpose of testing GitHub pull requests. It has a GitHub Action Pull request deploy and test which deploys there and then runs integration tests. This deployment is not done via Secure Pipelines but with manual sam deploy commands. Likewise, the tests are not run with the Secure Pipelines testing container approach, but are instead invoked manually with npm run. This action can be manually invoked, but will also automatically run when a pull request is opened, reopened or updated. Unlike other environments, feature has a second GitHub Action Pull request tear down which completely deletes the stacks. Like the first action it can be manually invoked, but will also automatically run when a pull request is merged or otherwise closed. Automatic running has been disabled until DAC-1862 is done.

To perform the deployments and tear downs we use a special role in the feature environment called dap-feature-tear-down-role. It is not in the IaC because it causes one of the following issues:

  • Without a DeletionPolicy, it gets deleted while the stack is being deleted and so the deletion fails part way through as there are then no longer the permissions to do the deletion
  • With an appropriate DeletionPolicy this doesn't happen, but instead the next stack creation fails because the resource already exists
Production Preview

The production preview environment is another standalone environment that exists outside of Secure Pipelines (like the feature environment it is deployed with a manual sam deploy). It has a GitHub Action Deploy to the production preview environment but no corresponding tear down one.

The deployments use a special role in the production preview environment, dap-production-preview-deploy-role, much like the role in feature.

Production preview has a real TxMA queue in addition to its placeholder queue and so requires the SSM parameters mentioned above in the Higher environment config section.

Config for cross account data sync

Because production preview and staging are used for cross account data sync, they have a single SSM parameter holding the name of the cross account data sync role. They use this to allow access to their SQS queues and usage of their KMS keys to enable the cross account data sync process.

| Name | Description |
| ---- | ----------- |
| CrossAccountDataSyncRoleARN | ARN of the role allowing cross account data sync |

Additional Documents

For a guide to how and why certain development decisions, coding practices, etc. were made, please refer to the Development Decisions document.

For a list of TODOs for the project, please see the TODOs document.