Run Lighthouse audits on URLs, and write the results daily into a BigQuery table.
Steps (needs rewrite)
- Clone the repo.
- Run `npm install` in the directory.
- Install the Google Cloud SDK.
- Authenticate with `gcloud auth login`.
- Create a new GCP project.
- Enable the Cloud Functions API and the BigQuery API.
- Create a new dataset in BigQuery.
- Run `gcloud config set project <projectId>` in the command line.
- In `config.json`, update the list of `source` URLs and IDs, set `projectId` to your GCP project ID, and set `datasetId` to the BigQuery dataset ID.
- Run `gcloud functions deploy launchLighthouse --trigger-topic launch-lighthouse --memory 2048 --timeout 540 --runtime=nodejs8`.
- Run `gcloud pubsub topics publish launch-lighthouse --message all` to audit all URLs in the source list.
- Run `gcloud pubsub topics publish launch-lighthouse --message <source.id>` to audit just the URL with the given ID.
- Verify with Cloud Functions logs and a BigQuery query that the performance data ended up in BQ. Might take some time, especially the first run when the BQ table needs to be created.
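Before deploying, it can help to sanity-check the edited configuration. The sketch below is not part of the repo: it assumes `config.json` holds `projectId`, `datasetId`, and a `source` list whose entries have `id` and `url` fields (the entry field names are inferred from the `<source.id>` placeholder above). `validateConfig` is a hypothetical helper, and treating `all` as a reserved ID is an assumption based on the special `all` message.

```javascript
// Hypothetical config.json contents; the field names inside `source` are assumed.
const config = {
  projectId: 'my-gcp-project',
  datasetId: 'lighthouse',
  source: [
    { id: 'home', url: 'https://www.example.com/' },
    { id: 'blog', url: 'https://www.example.com/blog/' }
  ]
};

// Returns a list of human-readable problems; an empty list means the config looks usable.
function validateConfig(cfg) {
  const errors = [];
  if (!cfg.projectId) errors.push('projectId is missing');
  if (!cfg.datasetId) errors.push('datasetId is missing');
  if (!Array.isArray(cfg.source) || cfg.source.length === 0) {
    errors.push('source must be a non-empty list');
    return errors;
  }
  const seen = new Set();
  cfg.source.forEach((entry, i) => {
    if (!entry.id) errors.push(`source[${i}] has no id`);
    if (!/^https?:\/\//.test(entry.url || '')) errors.push(`source[${i}] has no valid url`);
    // Assumption: "all" is reserved because it is the message that audits every URL.
    if (entry.id === 'all') errors.push(`source[${i}] uses the reserved id "all"`);
    if (seen.has(entry.id)) errors.push(`duplicate id: ${entry.id}`);
    seen.add(entry.id);
  });
  return errors;
}

console.log(validateConfig(config)); // prints []
```

Running a check like this locally catches a missing dataset ID or a duplicated source ID before a deployment and a Pub/Sub round trip are spent on it.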
How it works
When you deploy the Cloud Function to GCP, it waits for specific messages to be pushed into the `launch-lighthouse` Pub/Sub topic (the topic is created automatically when the function is deployed).
When a message corresponding to a URL defined in `config.json` arrives, the function starts a Lighthouse instance and runs the basic audit on that URL.
This audit is then parsed into a BigQuery schema and written into a BigQuery table named `report` under the dataset you created.
The BigQuery schema currently only includes items that have a "weight", i.e. those that impact the scores also provided in the audit.
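The weight filter can be pictured roughly as follows. This is a sketch, not the function's actual parser: it relies on the Lighthouse result object exposing `categories[...].auditRefs` (each with an `id` and a `weight`) and an `audits` map keyed by audit ID, and the flat row layout and column names are hypothetical. `weightedAuditRows` is an illustrative name.

```javascript
// Trimmed-down stand-in for a Lighthouse result object (LHR).
const lhr = {
  requestedUrl: 'https://www.example.com/',
  fetchTime: '2019-01-01T00:00:00.000Z',
  categories: {
    performance: {
      score: 0.82,
      auditRefs: [
        { id: 'first-contentful-paint', weight: 3 },
        { id: 'speed-index', weight: 4 },
        { id: 'network-requests', weight: 0 } // informational only, no score impact
      ]
    }
  },
  audits: {
    'first-contentful-paint': { score: 0.9 },
    'speed-index': { score: 0.75 },
    'network-requests': { score: null }
  }
};

// Keep only audits whose weight is above zero, i.e. those that affect a category score.
// The row shape is a hypothetical flat layout, not the repo's real schema.
function weightedAuditRows(result) {
  const rows = [];
  Object.keys(result.categories).forEach(categoryId => {
    const category = result.categories[categoryId];
    category.auditRefs
      .filter(ref => ref.weight > 0)
      .forEach(ref => {
        const audit = result.audits[ref.id] || {};
        rows.push({
          url: result.requestedUrl,
          fetch_time: result.fetchTime,
          category: categoryId,
          category_score: category.score,
          audit_id: ref.id,
          weight: ref.weight,
          score: audit.score === undefined ? null : audit.score
        });
      });
  });
  return rows;
}

console.log(weightedAuditRows(lhr).map(row => row.audit_id));
```

Rows shaped like this could then be handed to the BigQuery client for insertion into the `report` table; the zero-weight `network-requests` audit is dropped along the way.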
You can also send the message `all` to the Pub/Sub topic, in which case the Cloud Function triggers a new invocation of itself for every URL in the list, so the Lighthouse processes start in parallel.
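The message handling described above can be sketched as a small planning step. The code below only decides what to do with a message; publishing to Pub/Sub and running Lighthouse are deliberately left out. It assumes the payload arrives base64-encoded in a `data` field, as Pub/Sub-triggered functions receive it, and that `config.json` has a `source` list of `{ id, url }` entries. `decodeMessage` and `planForMessage` are hypothetical names, not functions from the repo.

```javascript
// Hypothetical config shape: `source` is a list of { id, url } entries.
const config = {
  source: [
    { id: 'home', url: 'https://www.example.com/' },
    { id: 'blog', url: 'https://www.example.com/blog/' }
  ]
};

// Pub/Sub delivers the message payload base64-encoded in `data`.
function decodeMessage(pubsubMessage) {
  return Buffer.from(pubsubMessage.data, 'base64').toString().trim();
}

// Decide what to do with one message:
// - 'all'     -> republish one message per source ID (fan-out, one invocation each)
// - known ID  -> audit that single URL
// - otherwise -> ignore the message
function planForMessage(pubsubMessage, cfg) {
  const message = decodeMessage(pubsubMessage);
  if (message === 'all') {
    return { action: 'fanOut', messages: cfg.source.map(s => s.id) };
  }
  const match = cfg.source.find(s => s.id === message);
  if (match) {
    return { action: 'audit', id: match.id, url: match.url };
  }
  return { action: 'ignore', message };
}

// Helper for local experiments: wrap plain text the way Pub/Sub would.
const encode = text => ({ data: Buffer.from(text).toString('base64') });

console.log(planForMessage(encode('all'), config));
console.log(planForMessage(encode('blog'), config));
```

Because each republished ID triggers its own invocation, the audits run side by side rather than one after another inside a single function call.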
The main problem is with the Performance audit. With default settings, the Lighthouse instances aren't meant for heavy lifting, so the results don't necessarily reflect the actual performance costs of the site. Configuration for network conditions still needs to be added.
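One possible direction for that network configuration is a custom Lighthouse config with explicit throttling settings. The sketch below is not what the function currently does: `buildLighthouseConfig` is a hypothetical helper, and the numbers are illustrative values for a slow mobile connection that would need tuning for the sites being audited.

```javascript
// Build a Lighthouse config that extends the defaults and pins throttling settings.
// `network` is a hypothetical parameter object introduced for this sketch.
function buildLighthouseConfig(network) {
  return {
    extends: 'lighthouse:default',
    settings: {
      // 'simulate' applies the throttling values to a simulated page load.
      throttlingMethod: 'simulate',
      throttling: {
        rttMs: network.rttMs,
        throughputKbps: network.throughputKbps,
        cpuSlowdownMultiplier: network.cpuSlowdownMultiplier
      }
    }
  };
}

// Illustrative values only.
const slowMobile = buildLighthouseConfig({
  rttMs: 150,
  throughputKbps: 1638.4,
  cpuSlowdownMultiplier: 4
});

console.log(JSON.stringify(slowMobile, null, 2));
```

A config object like this would be passed as the config argument when Lighthouse is launched programmatically, so every audited URL is measured under the same declared conditions.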
Running this setup is extremely low-cost. You should be able to stay within the free tier for a long while, assuming you don't fire the functions dozens of times per day.
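A back-of-the-envelope calculation shows where the "dozens of times per day" limit comes from. The sketch assumes a monthly free allowance of 400,000 GB-seconds of compute for Cloud Functions, a figure that should be checked against current pricing, and it ignores the separate CPU and invocation allowances. The 60-second run time is also an assumption; 2048 MB and 540 seconds come from the deploy command. `invocationsWithinAllowance` is a hypothetical helper.

```javascript
// How many runs fit into a monthly GB-second allowance?
function invocationsWithinAllowance(memoryMb, secondsPerRun, freeGbSeconds) {
  const gbSecondsPerRun = (memoryMb / 1024) * secondsPerRun;
  return Math.floor(freeGbSeconds / gbSecondsPerRun);
}

// Assumed allowance; verify against the current Cloud Functions pricing page.
const FREE_GB_SECONDS = 400000;

// Worst case: every run uses the full 540-second timeout at 2048 MB.
const worstCase = invocationsWithinAllowance(2048, 540, FREE_GB_SECONDS);
// Assumed typical case: a run finishes in about 60 seconds.
const typical = invocationsWithinAllowance(2048, 60, FREE_GB_SECONDS);

console.log(worstCase); // 370 runs per month, roughly 12 per day
console.log(typical);   // 3333 runs per month
```

Keep in mind that the `all` message costs one invocation per URL in the source list, so a long list uses up the allowance faster than the message count suggests.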