This service allows developers to detect production issues by automating the process of querying the Data Lake and alerting on undesirable results.
You will need to be able to run queries against the Data Lake using AWS Athena, so you'll need to request dataLakeQuerying permissions via Janus if you do not already have access.
Currently this service has read-only access to the clean.pageviews table only. If you need to run queries against a different table, you'll need to add further permissions to the CloudFormation template in this repository.
- Write your query and test it using Athena.
- The Athena pricing model is based on data scanned, so try not to scan more data than necessary.
- Add case object MyFeature extends Feature { ??? } to the Features object. You will need to provide:
  - A feature id (used when triggering the lambda)
  - A list of platforms which the feature is relevant for (defaults to iOS and Android)
  - A function which returns your query (to be run by Athena) and a list of checks to run when the query completes
- Add your feature to allFeaturesWithMonitoring (also in the Features object).
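As a rough illustration of the steps above, the sketch below shows the general shape a feature definition might take. It is self-contained and does not use this repository's real types: the Platform, CheckResult and MonitoringQuery definitions, the Feature trait's members, and the column names in the SQL (received_date, platform) are all assumptions made for illustration. Check the actual Feature trait and the clean.pageviews schema before copying anything.

```scala
import scala.util.Try

// Hypothetical stand-ins for this repository's real types (assumptions, not the actual API).
sealed trait Platform { def id: String }
case object Ios extends Platform { val id = "ios" }
case object Android extends Platform { val id = "android" }

// Outcome of a single check on the query results.
final case class CheckResult(passed: Boolean, message: String)

// A query for Athena plus the checks to run over the rows it returns.
final case class MonitoringQuery(
  sql: String,
  checks: List[List[String]] => List[CheckResult]
)

trait Feature {
  def id: String                                      // used when triggering the lambda
  def platforms: List[Platform] = List(Ios, Android)  // default platforms
  def monitoringQuery(platform: Platform): MonitoringQuery
}

case object MyFeature extends Feature {
  val id = "my_feature"

  def monitoringQuery(platform: Platform): MonitoringQuery = {
    // Restricting the query to one day (assumed here to be a partition column)
    // keeps the amount of data scanned, and so the Athena cost, down.
    val sql =
      s"""SELECT count(*) AS impressions
         |FROM clean.pageviews
         |WHERE received_date = current_date - interval '1' day
         |AND platform = '${platform.id}'""".stripMargin

    // Fails when the query reports no impressions (or returns no parsable count).
    def atLeastOneImpression(rows: List[List[String]]): List[CheckResult] = {
      val impressions = rows.headOption
        .flatMap(_.headOption)
        .flatMap(cell => Try(cell.toLong).toOption)
        .getOrElse(0L)
      List(CheckResult(impressions > 0, s"$id on ${platform.id}: $impressions impressions"))
    }

    MonitoringQuery(sql, atLeastOneImpression)
  }
}

object Features {
  // Features listed here are the ones picked up for monitoring.
  val allFeaturesWithMonitoring: List[Feature] = List(MyFeature)
}
```

In this sketch the check is a plain function over the returned rows, which makes it easy to exercise with hand-written rows before running anything against Athena.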
- Ensure you are on the correct AWS region (eu-west-1). This can be achieved by using the AWS_REGION environment variable:
  export AWS_REGION=eu-west-1
- Obtain developerPlayground and ophan Janus credentials.
- Run
  sbt "test:runMain com.gu.datalakealerts.integration.FullTestWithLocalPolling friction_screen android"
  (passing in the relevant Feature and Platform ids).
Note that when running locally or in the CODE environment, all alerts will be sent to the anghammarad.test.alerts Google Group (instead of the team who maintains the specified production stack). This allows you to send test alerts in these environments without spamming your team.
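The routing rule described above can be pictured with a small sketch. This is not the service's implementation: the AlertTarget types, the AlertRouting object and the "PROD" stage name are hypothetical, and only the test group name is taken from this README.

```scala
// Hypothetical sketch of stage-based alert routing (names are assumptions, not the service's API).
sealed trait AlertTarget
final case class GoogleGroup(name: String) extends AlertTarget
final case class StackOwners(stack: String) extends AlertTarget

object AlertRouting {
  // Test group named in this README.
  val testGroup: AlertTarget = GoogleGroup("anghammarad.test.alerts")

  // Only alerts from the production stage (assumed to be called "PROD") reach the team
  // that maintains the stack; local and CODE runs go to the test group.
  def targetFor(stage: String, productionStack: String): AlertTarget =
    if (stage == "PROD") StackOwners(productionStack) else testGroup
}
```

Defaulting to the test group for any stage other than production means a misconfigured or unknown stage cannot page a real team by accident.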
Monitoring checks run at 12:00 UTC every weekday. To confirm that monitoring has been scheduled correctly for your feature:
- Run
sbt "test:runMain com.gu.datalakealerts.integration.ConfirmTaskWillBeScheduled"
- Confirm that your feature (and platform) are listed in the output.
If you want to confirm that your monitoring task is scheduled correctly in production, or see the results of a successful query, you can view the logs using Kibana. Hint: searching these logs for your feature_id might save you some time.