![ga4](https://www.google-analytics.com/collect?v=2&tid=G-6VDTYWLKX6&cid=1&en=page_view&sid=1&dl=statmike%2Fvertex-ai-mlops%2Farchitectures%2Ftracking%2Fsetup%2Fga4&dt=GA4+Setup.ipynb)

## GA4 Setup

**Goal** 

Count how many times each document is viewed without tracking anything more than the view - no user information!

**Constraints**

This repository is primarily markdown documents (`.md`) and Jupyter Notebooks (`.ipynb`), both of which are static when viewed. Viewers include IDEs like VS Code, JupyterLab, GitHub.com, and Colab.

**Approach**

Include a tracking pixel as an image at the top of each document. As a viewer renders the markdown, it loads each image (`![](path/to/image)`), including the pixel.

The pixel's path is a GA4 Measurement Protocol URL that includes the tracking information. The only information passed to the tracking pixel is the document name and path within the repository. **Use dummy values for session and user.**

**Storage**

Google Analytics can automatically export to BigQuery, either daily or by streaming.

---
## Google Analytics Setup (GA4)

First, create a Google Analytics Account and a Property:
- Go to [Google Analytics](https://analytics.google.com) and log in
- Go to `Admin` (lower left corner)
- `+ Create Account`
    1. Account name = vertex-ai-mlops, click `Next`
    2. Property name = github, click `Next`
    3. Optionally fill out business info, click `Create`
- Select the Account and Property, then on Property menu:
    - Click `Data Streams`
    - Click `Add stream` and select `Web`
        - Website URL is https://www.github.com, but can be anything!
        - Stream name = github
        - Make sure `Enhanced measurement` is selected
        - Click `Create stream`
        - You may be prompted to Install your Google tag, dismiss this setup by clicking `X`
    - Note the Measurement ID for this stream
    - Retrieve the API secret for this Measurement ID
        - Under `Web stream details` select `Measurement Protocol API secrets`
            - Navigation if needed: Admin > Account > Property > Data Streams > github (name assigned above) > Measurement Protocol API secrets
        - Select `Create` (upper right)
        - Nickname = vertex-ai-mlops, Click `Create`
        - Note the Secret value (but do not store in notebook!)
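
One way to keep the secret out of the notebook is to read it from the environment at run time. The following is a minimal sketch of that idea; the variable name `GA4_API_SECRET` and the helper `load_api_secret` are hypothetical and not part of this repository.

```python
import os
from typing import Mapping


def load_api_secret(env: Mapping[str, str] = os.environ,
                    name: str = "GA4_API_SECRET") -> str:
    """Return the API secret from an environment mapping.

    `GA4_API_SECRET` is an assumed variable name chosen for this sketch.
    """
    secret = env.get(name, "").strip()
    if not secret:
        # Fail loudly instead of falling back to a value stored in the notebook.
        raise RuntimeError(f"Set the {name} environment variable first.")
    return secret
```

Set the variable in the shell that launches Jupyter, then call `load_api_secret()` in any cell that needs the value, so the secret never appears in saved notebook source.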

---
## Create Tracking Pixels

Tracking pixels are URLs built from the document's information and the Measurement ID created during the Google Analytics setup.

>A seemingly not well documented version of the Measurement Protocol, version 2 (`&v=2`), exists.  I found these blogs and tips online for it:
>- https://www.optimizesmart.com/what-is-measurement-protocol-in-google-analytics-4-ga4/
>- https://stackoverflow.com/questions/59264782/analytics-track-custom-events-in-new-webapp




The main URL is: https://www.google-analytics.com/collect

Parameters are appended to this URL as a query string, the first introduced by `?` and the rest joined with `&`:
- `v=2` - specifies version 2 of the Measurement Protocol
    - [Reference](https://developers.google.com/analytics/devguides/collection/protocol/v1/parameters#v)
- `tid=<value here>` - the Measurement ID, which points the information to the property we created above
    - [Reference](https://developers.google.com/analytics/devguides/collection/protocol/v1/parameters#tid)
- `cid=1` - the user's client id, required when not sending `uid` (user id); here it is set to a dummy value of 1 for all users/clients
    - [Reference](https://developers.google.com/analytics/devguides/collection/protocol/v1/parameters#cid)
- `en=page_view` - the event name, here a page view
    - [Reference](https://support.google.com/analytics/answer/9216061#)
- `sid=1` - the session id, which is required but set to a dummy value of 1 (no cookies)
    - [Reference](https://developers.google.com/analytics/devguides/collection/protocol/v1/parameters)
- `dt=` - the name of the file; make sure it is URL encoded (a space is `%20`)
    - [Reference](https://developers.google.com/analytics/devguides/collection/protocol/v1/parameters#dt)
- `dl=` - the path to the file; make sure it is URL encoded
    - [Reference](https://developers.google.com/analytics/devguides/collection/protocol/v1/parameters#dl)
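
The parameters listed above can be assembled with Python's standard library. This is a minimal sketch rather than the code used in this repository: the function names `tracking_pixel_url` and `tracking_pixel_markdown` are hypothetical, and `G-XXXXXXX` in the usage note is a placeholder for a real Measurement ID.

```python
from urllib.parse import quote, urlencode

COLLECT_URL = "https://www.google-analytics.com/collect"


def tracking_pixel_url(measurement_id: str, doc_path: str, doc_title: str) -> str:
    """Build the collect URL for one document with dummy client and session ids."""
    params = {
        "v": "2",               # protocol version
        "tid": measurement_id,  # Measurement ID of the data stream
        "cid": "1",             # dummy client id, shared by all viewers
        "en": "page_view",      # event name
        "sid": "1",             # dummy session id
        "dl": doc_path,         # path of the document within the repository
        "dt": doc_title,        # file name of the document
    }
    # quote_via=quote percent-encodes spaces as %20 and "/" as %2F.
    return f"{COLLECT_URL}?{urlencode(params, quote_via=quote)}"


def tracking_pixel_markdown(url: str, alt: str = "ga4") -> str:
    """Wrap the URL in markdown image syntax so viewers request it on render."""
    return f"![{alt}]({url})"
```

For example, `tracking_pixel_markdown(tracking_pixel_url("G-XXXXXXX", "statmike/vertex-ai-mlops/architectures/tracking/setup/ga4", "GA4 Setup.ipynb"))` yields an image line of the same shape as the one at the top of this document, with the space in the title encoded as `%20`.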


The tracking pixels are automatically added to all `.md` and `.ipynb` files in this repository using the notebook [tracking_ga4_add.ipynb](./tracking_ga4_add.ipynb).
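
The linked notebook holds the actual implementation. As a rough illustration of the idea only, the sketch below prepends a pixel line to markdown text and inserts a leading markdown cell into notebook JSON. Both function names are hypothetical, and the cell layout assumes nbformat 4 (notebooks at nbformat 4.5 or later may also expect a cell `id`).

```python
import json


def add_pixel_to_markdown(text: str, pixel: str) -> str:
    """Prepend the pixel line unless the document already starts with it."""
    if text.startswith(pixel):
        return text
    return f"{pixel}\n\n{text}"


def add_pixel_to_notebook(nb_json: str, pixel: str) -> str:
    """Insert a markdown cell holding the pixel as the first notebook cell."""
    nb = json.loads(nb_json)
    cells = nb.setdefault("cells", [])
    if cells:
        first = cells[0]
        # A cell's source may be a string or a list of strings; join handles both.
        source = "".join(first.get("source", []))
        if first.get("cell_type") == "markdown" and source.startswith(pixel):
            return nb_json  # pixel already present, leave the notebook untouched
    cells.insert(0, {"cell_type": "markdown", "metadata": {}, "source": [pixel]})
    return json.dumps(nb, indent=1)
```

Both helpers are idempotent, so running them over the repository more than once should not stack duplicate pixels.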



---
## GA4 Export To BigQuery

This is a process you set up that runs continuously, not just one time.

**References**
- [GA4 BigQuery Export](https://support.google.com/analytics/answer/9358801?hl=en&utm_id=ad)
- [GA4 Setup BigQuery Export](https://support.google.com/analytics/answer/9823238?hl=en&ref_topic=9359001#zippy=%2Cin-this-article)

Setup Process:
- Note: Use the same login for Google Analytics and GCP.  This login for GCP needs owner access to the BigQuery project that will be used and the editor role for the Google Analytics Property created above.
- Go to [Google Analytics](https://analytics.google.com) and log in
- Go to `Admin` (lower left corner)
- Select Account = vertex-ai-mlops (created above)
- Select Property = github (created above)
    - Select `BigQuery Links` under `Product Links`
    - Click `Link`
    - Click `Choose a BigQuery project`
    - Select a project from the list, click `Confirm`
    - Select a location from the list (US multi-region here), click `Next`
    - Select `Configure data streams and events` and select the data stream named github (created above). No need to exclude any events.
    - Click `Done`
    - Select Frequency - both `Daily` and `Streaming`
    - Click `Next`
    - Click `Submit`


---
## Data in BigQuery

**Dataset**

The process above creates a dataset in the chosen BigQuery project named `analytics_########`, where `########` is the property ID.

**Tables**

The daily tables are named `events_YYYYMMDD`.  
The streaming tables are named `events_intraday_YYYYMMDD`.

These are sharded tables that can be read individually or using a wildcard.  The read can be filtered with a `WHERE` clause that uses `_TABLE_SUFFIX`. [Reference](https://cloud.google.com/bigquery/docs/querying-wildcard-tables)
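
A wildcard read filtered on `_TABLE_SUFFIX` might look like the sketch below, which only builds the SQL text. The function name is hypothetical, and the query assumes the document path sent as `dl` appears in the export under the `page_location` key of `event_params`; check a sample row before relying on it.

```python
from datetime import date


def page_view_counts_sql(project_id: str, property_id: str,
                         start: date, end: date) -> str:
    """Build a query counting page_view events per document over daily tables."""
    # Dataset name follows the analytics_<property id> pattern described above.
    table = f"{project_id}.analytics_{property_id}.events_*"
    return f"""
SELECT
  (SELECT value.string_value
   FROM UNNEST(event_params)
   WHERE key = 'page_location') AS page_location,  -- assumed key for `dl`
  COUNT(*) AS views
FROM `{table}`
WHERE _TABLE_SUFFIX BETWEEN '{start:%Y%m%d}' AND '{end:%Y%m%d}'
  AND event_name = 'page_view'
GROUP BY page_location
ORDER BY views DESC
""".strip()
```

Because the suffix filter compares strings, a range such as `'20240101'` to `'20240131'` also leaves out the `events_intraday_YYYYMMDD` tables, whose suffixes begin with `intraday_`. The resulting text can be passed to the `google-cloud-bigquery` client, for example `bigquery.Client(project=PROJECT_ID).query(sql)`, with values coming only from trusted configuration since they are interpolated directly into the SQL.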

---
## Review Data In BigQuery

- open in colab section
- setup: project, parameters (BQ project = PROJECT_ID), client
- list datasets with analytics_
- list tables in dataset(s)
- return sample from a table
- provide a link to console to view
- add a screenshot
