---
title: GOV.UK GA4 (BigQuery export)
weight: 2
last_reviewed_on: 2024-05-23
review_in: 6 months
---
# GOV.UK GA4 (BigQuery export)
We export our [GOV.UK Google Analytics data](/data-sources/ga/ga4/) and store it in Google BigQuery to enable more detailed analysis and the use of this data in other tools.
A [flattened dataset](/data-sources/ga/ga4-flat/) is created from the live site data. This contains most, but not all, of the same fields as the raw data.
The flattened dataset is easier and more efficient to query, so use it for most analysis and reporting.
## Access
Everyone with a @digital.cabinet-office.gov.uk account (all GDS staff) has access to this data by default.
In addition, everyone who requests and is granted access to the GOV.UK GA4 data will be given access to query this raw dataset.
More information can be found in our [GA access policy](/processes/ga-access/#what-we-provide-access-to).
## Location
There are 3 GA4 datasets. These correspond to the integration, staging, and production (live) GOV.UK websites.
The GA4 data for the live GOV.UK site is located in BigQuery in the `ga4-analytics-352613.analytics_330577055` dataset.
The GA4 data for the staging site is located in BigQuery in the `ga4-analytics-352613.analytics_330580593` dataset.
The GA4 data for the integration site is located in BigQuery in the `ga4-analytics-352613.analytics_294475112` dataset.
These datasets all consist of date-sharded tables: a new table is created each day with the suffix `YYYYMMDD`.
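
Because each day is a separate table, a query spanning several days usually reads a wildcard table and filters on `_TABLE_SUFFIX`. The following is a minimal sketch that builds such a query as a string for the live dataset. It assumes the tables follow the standard GA4 export naming `events_YYYYMMDD`, which this page does not state explicitly, and the function name `build_daily_event_count_query` is hypothetical.

```python
from datetime import date

# Live GOV.UK dataset, as listed above.
LIVE_DATASET = "ga4-analytics-352613.analytics_330577055"


def build_daily_event_count_query(
    start: date, end: date, dataset: str = LIVE_DATASET
) -> str:
    """Return a SQL string that counts events per day across date-sharded tables.

    Assumption: tables are named events_YYYYMMDD (standard GA4 export naming).
    """
    if start > end:
        raise ValueError("start must not be after end")
    start_suffix = start.strftime("%Y%m%d")
    end_suffix = end.strftime("%Y%m%d")
    return (
        "SELECT event_date, COUNT(*) AS events\n"
        f"FROM `{dataset}.events_*`\n"
        f"WHERE _TABLE_SUFFIX BETWEEN '{start_suffix}' AND '{end_suffix}'\n"
        "GROUP BY event_date\n"
        "ORDER BY event_date"
    )


print(build_daily_event_count_query(date(2024, 5, 1), date(2024, 5, 7)))
```

Filtering on `_TABLE_SUFFIX` limits which daily tables BigQuery scans. Under the naming assumption above, the numeric range also leaves out any table whose suffix starts with `intraday_`.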
Our Google Analytics properties export GOV.UK data several times a day.
The data for the current day is temporarily stored in intraday tables.
At the end of the day, BigQuery automatically moves the data in the intraday tables to a date table (suffixed `YYYYMMDD`) and deletes the intraday tables in question.
New intraday tables are then created and populated throughout the following day.
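
The daily and intraday split described above can be sketched as a small helper that picks the table to read for a given date. This is illustrative only: it assumes the standard GA4 export names `events_YYYYMMDD` and `events_intraday_YYYYMMDD`, and it treats "today" as the only intraday date, ignoring any delay before the daily table appears and any timezone difference. The function name `events_table_for` is hypothetical.

```python
from datetime import date
from typing import Optional

# Live GOV.UK dataset, as listed above.
LIVE_DATASET = "ga4-analytics-352613.analytics_330577055"


def events_table_for(
    day: date, today: Optional[date] = None, dataset: str = LIVE_DATASET
) -> str:
    """Return the fully qualified table ID expected to hold events for `day`."""
    today = today or date.today()
    if day > today:
        raise ValueError("no export exists for a future date")
    suffix = day.strftime("%Y%m%d")
    # Assumption: current-day data sits in events_intraday_YYYYMMDD,
    # and completed days sit in events_YYYYMMDD.
    prefix = "events_intraday_" if day == today else "events_"
    return f"{dataset}.{prefix}{suffix}"


print(events_table_for(date(2024, 5, 22), today=date(2024, 5, 23)))
print(events_table_for(date(2024, 5, 23), today=date(2024, 5, 23)))
```

Passing `today` explicitly keeps the helper deterministic, which is useful in scheduled jobs and tests.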
The GA4 datasets are all stored within the [GA4 analytics project](/gcp/#ga4-analytics).
For more information on the Google Cloud Platform projects, see our [GCP Project Documentation](/gcp/).
## Schema
The tables in these datasets use the default [GA4 BigQuery Export schema](https://support.google.com/analytics/answer/7029846):
| field name | type | mode |
| --- | --- | --- |
| event_date | STRING | NULLABLE |
| event_timestamp | INTEGER | NULLABLE |
| event_name | STRING | NULLABLE |
| event_params | RECORD | REPEATED |
| event_previous_timestamp | INTEGER | NULLABLE |
| event_value_in_usd | FLOAT | NULLABLE |
| event_bundle_sequence_id | INTEGER | NULLABLE |
| event_server_timestamp_offset | INTEGER | NULLABLE |
| user_id | STRING | NULLABLE |
| user_pseudo_id | STRING | NULLABLE |
| privacy_info | RECORD | NULLABLE |
| user_properties | RECORD | REPEATED |
| user_first_touch_timestamp | INTEGER | NULLABLE |
| user_ltv | RECORD | NULLABLE |
| device | RECORD | NULLABLE |
| geo | RECORD | NULLABLE |
| app_info | RECORD | NULLABLE |
| traffic_source | RECORD | NULLABLE |
| stream_id | STRING | NULLABLE |
| platform | STRING | NULLABLE |
| event_dimensions | RECORD | NULLABLE |
| ecommerce | RECORD | NULLABLE |
| items | RECORD | REPEATED |
| collected_traffic_source | RECORD | NULLABLE |
| is_active_user | BOOLEAN | NULLABLE |
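
Several of these fields, such as `event_params`, are REPEATED RECORDs, so a single event row carries a list of key and value pairs and not one column per parameter. The sketch below shows one way to read a parameter from such a list in Python, assuming each entry is a dict with a `key` and a `value` record holding `string_value`, `int_value`, `float_value` and `double_value`, as in the GA4 export schema. The helper name `get_event_param` and the sample data are hypothetical.

```python
from typing import Any, Optional

# Sub-fields of event_params.value in the GA4 export schema.
VALUE_FIELDS = ("string_value", "int_value", "float_value", "double_value")


def get_event_param(event_params: list, key: str) -> Optional[Any]:
    """Return the first non-null value stored under `key`, or None if absent."""
    for param in event_params:
        if param.get("key") != key:
            continue
        value = param.get("value") or {}
        for field in VALUE_FIELDS:
            if value.get(field) is not None:
                return value[field]
    return None


# Made-up row fragment for illustration only, not real GOV.UK data.
sample_params = [
    {
        "key": "page_location",
        "value": {"string_value": "https://www.gov.uk/", "int_value": None,
                  "float_value": None, "double_value": None},
    },
    {
        "key": "ga_session_id",
        "value": {"string_value": None, "int_value": 1716450000,
                  "float_value": None, "double_value": None},
    },
]

print(get_event_param(sample_params, "page_location"))
print(get_event_param(sample_params, "ga_session_id"))
```

In SQL the equivalent step is usually an `UNNEST(event_params)` subquery. The flattened dataset avoids this for most common fields.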
## Retention
The integration and staging GA4 datasets have a default table expiry in BigQuery of 30 days.