![ga4](https://www.google-analytics.com/collect?v=2&tid=G-6VDTYWLKX6&cid=1&en=page_view&sid=1&dl=statmike%2Fvertex-ai-mlops%2Farchitectures%2Ftracking&dt=tracking_ga4.ipynb)

# GA Tracking For Repository

**Goal** 

Count how many times each document is viewed without tracking anything more than the view - no user information!

**Constraints**

This repository is primarily Markdown documents and Jupyter notebooks, both of which are static when viewed.  To track a view, the technique is to include a tracking pixel that loads when the document is rendered.  The only information passed to the tracking pixel is the document name and its path within the repository.

**GA Setup**

This required setting up a GA account and property:
- GA > Admin
- Create Account:
    - account name = vertex-ai-mlops
    - Property Setup / Create Property
        - Property Name = github
        - Data Stream Setup
            - Add Stream (Web)
                - stream name = github
                - stream url = https://www.github.com (can be anything)
                - Save Measurement ID for this stream


---
## Working with GA4 Measurement Protocol

The GA4 Measurement Protocol does not seem to have a GET-based call that returns a pixel.  Because it needs a POST (for example with curl), it is not simple to use in static documents.  It is still good for tracking custom events, which may be helpful for tracking parts of the code as they are executed.

https://developers.google.com/analytics/devguides/collection/protocol/ga4/policy

In [1]:
import json
import base64
import urllib.parse
import requests

In [98]:
measurement_id = 'G-6VDTYWLKX6'
api_secret = '*********' ## Retrieve from GA: admin > account > property > data streams > stream > Measurement Protocol API Secrets > Secret value 
url = f'https://www.google-analytics.com/mp/collect?measurement_id={measurement_id}&api_secret={api_secret}'
print(url)

https://www.google-analytics.com/mp/collect?measurement_id=G-6VDTYWLKX6&api_secret=*********


In [99]:
body = {
    "client_id": "x",
    "events": [
        {
            "name": "open_file",
            "params": {
                "path": "example/path",
                "file": "file.md"
            }
        }
    ]
}
type(body)

dict

In [100]:
response = requests.post(url, json = body)

In [101]:
response

<Response [204]>
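A 204 here only shows that the request was accepted; as far as I can tell, the `/mp/collect` endpoint does not report whether the event itself was well formed. Google also documents a validation endpoint at `/debug/mp/collect` that returns validation messages instead of recording the event. The sketch below wraps both in small helpers. The function names (`mp_url`, `open_file_event`, `send_event`) are hypothetical, and it uses the standard library's `urllib.request` rather than `requests` only to stay dependency-free.

```python
import json
import urllib.parse
import urllib.request

GA_HOST = "https://www.google-analytics.com"


def mp_url(measurement_id, api_secret, debug=False):
    """Build the collect URL; debug=True targets the validation endpoint."""
    path = "/debug/mp/collect" if debug else "/mp/collect"
    query = urllib.parse.urlencode(
        {"measurement_id": measurement_id, "api_secret": api_secret}
    )
    return f"{GA_HOST}{path}?{query}"


def open_file_event(path, file, client_id="x"):
    """Same body shape as the cell above: a single `open_file` event."""
    return {
        "client_id": client_id,
        "events": [{"name": "open_file", "params": {"path": path, "file": file}}],
    }


def send_event(measurement_id, api_secret, body, debug=False):
    """POST the JSON body and return (HTTP status, response text)."""
    request = urllib.request.Request(
        mp_url(measurement_id, api_secret, debug),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status, response.read().decode("utf-8")
```

Calling `send_event(measurement_id, api_secret, open_file_event("example/path", "file.md"), debug=True)` should return the validation response for the same body that was posted above, without recording an event.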

---
## Encoding Tips

In [102]:
json_body = json.dumps(body)
type(json_body), json_body

(str,
 '{"client_id": "x", "events": [{"name": "open_file", "params": {"path": "example/path", "file": "file.md"}}]}')

In [105]:
utf_body = json_body.encode('utf-8')
type(utf_body), utf_body

(bytes,
 b'{"client_id": "x", "events": [{"name": "open_file", "params": {"path": "example/path", "file": "file.md"}}]}')

In [108]:
url_body = urllib.parse.quote_plus(utf_body)
type(url_body), url_body

(str,
 '%7B%22client_id%22%3A+%22x%22%2C+%22events%22%3A+%5B%7B%22name%22%3A+%22open_file%22%2C+%22params%22%3A+%7B%22path%22%3A+%22example%2Fpath%22%2C+%22file%22%3A+%22file.md%22%7D%7D%5D%7D')

In [109]:
b64_body = base64.b64encode(utf_body)
type(b64_body), b64_body

(bytes,
 b'eyJjbGllbnRfaWQiOiAieCIsICJldmVudHMiOiBbeyJuYW1lIjogIm9wZW5fZmlsZSIsICJwYXJhbXMiOiB7InBhdGgiOiAiZXhhbXBsZS9wYXRoIiwgImZpbGUiOiAiZmlsZS5tZCJ9fV19')
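To check that the encodings above are lossless, each one can be reversed and compared with the original `body`. This is a small self-contained sketch using only the standard library. The URL-safe base64 variant at the end is an addition not used elsewhere in this notebook; it is shown because standard base64 output can contain `+` and `/`, which would need escaping inside a URL.

```python
import base64
import json
import urllib.parse

body = {
    "client_id": "x",
    "events": [
        {"name": "open_file", "params": {"path": "example/path", "file": "file.md"}}
    ],
}
utf_body = json.dumps(body).encode("utf-8")

# URL encoding round trip: quote_plus accepts bytes, unquote_plus returns str
url_body = urllib.parse.quote_plus(utf_body)
from_url = json.loads(urllib.parse.unquote_plus(url_body))

# Base64 round trip: b64decode returns bytes, decoded back to str for json
b64_body = base64.b64encode(utf_body)
from_b64 = json.loads(base64.b64decode(b64_body).decode("utf-8"))

# URL-safe base64 uses '-' and '_' in place of '+' and '/'
urlsafe_body = base64.urlsafe_b64encode(utf_body).decode("ascii")
from_urlsafe = json.loads(base64.urlsafe_b64decode(urlsafe_body).decode("utf-8"))

print(from_url == body, from_b64 == body, from_urlsafe == body)
```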

---
## Tracking Pixel with GA Measurement Protocol (Universal Analytics)

Load a pixel with a url like:
```
https://www.google-analytics.com/collect?
v=1
&cid=1
&tid=UA-xxx-y
&t=pageview
&dp=path%2Fto%2Ffile
&dt=file.ext
```
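A URL of this shape can be assembled with `urllib.parse.urlencode`, which handles the percent-encoding of the path. This is a minimal sketch; the function name `ua_pixel_url` is hypothetical and `UA-xxx-y` is the same placeholder tracking id used above.

```python
from urllib.parse import urlencode


def ua_pixel_url(tracking_id, page_path, page_title, client_id="1"):
    """Build a Universal Analytics pageview pixel URL (Measurement Protocol v1)."""
    params = {
        "v": "1",            # protocol version
        "cid": client_id,    # client id, fixed so no user is identified
        "tid": tracking_id,  # UA property id
        "t": "pageview",     # hit type
        "dp": page_path,     # document path
        "dt": page_title,    # document title
    }
    # urlencode percent-encodes values (quote_plus by default), so '/' becomes %2F
    return "https://www.google-analytics.com/collect?" + urlencode(params)


print(ua_pixel_url("UA-xxx-y", "path/to/file", "file.ext"))
```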

**References**
- Directly in the GA docs [here](https://developers.google.com/analytics/devguides/collection/protocol/v1/email)
- Good blog [here](https://mjpitz.com/blog/2020/07/17/repo-impression-tracking/)

**Issue**
- GA4 replaces UA in 2023, so this is not worth implementing as a solution

---
## Tracking Pixel with GA Measurement Protocol (GA4) - v2

## IMPLEMENTED IN THIS REPOSITORY

A seemingly not well documented version of the Measurement Protocol, version 2 (`&v=2`), exists.  I discovered these blogs and tips online for it:
- https://www.optimizesmart.com/what-is-measurement-protocol-in-google-analytics-4-ga4/
- https://stackoverflow.com/questions/59264782/analytics-track-custom-events-in-new-webapp

It looks like it is possible to use the same technique as the tracking pixel with UA (above) by replacing the version with 2 and the `tid` with a GA4 measurement ID.

**Notes**
- `sid` is the session id and seems to be required for the event data to show up in the BigQuery exports as well as in reports (other than real time).  Use sid=1 to force this?

In [5]:
measurement_id = 'G-6VDTYWLKX6'
pwd = !pwd
pwd = pwd[0].replace('/home/jupyter/', 'statmike/')
file_name = 'tracking_ga4.ipynb'

url = 'https://www.google-analytics.com/collect?v=2'
track_parms = f'&tid={measurement_id}&cid=1&en=page_view&sid=1'
pass_parms = f"&dt={urllib.parse.quote_plus(file_name)}&dl={urllib.parse.quote_plus(pwd)}"
click = url + track_parms + pass_parms

print(click)

https://www.google-analytics.com/collect?v=2&tid=G-6VDTYWLKX6&cid=1&en=page_view&sid=1&dt=tracking_ga4.ipynb&dl=statmike%2Fvertex-ai-mlops%2Farchitectures%2Ftracking
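To place the pixel in a Markdown document or notebook cell, the URL can be wrapped in an image tag. The sketch below is one way to generate such a tag outside a notebook, without the `!pwd` magic. The function name `ga4_pixel_markdown` and the `alt` default are hypothetical, and the parameter set simply mirrors the URL printed above, with `dl` emitted before `dt`.

```python
from urllib.parse import urlencode


def ga4_pixel_markdown(measurement_id, repo_path, file_name, alt="ga4"):
    """Return a Markdown image tag that loads a GA4 (v=2) page_view pixel."""
    params = {
        "v": "2",               # protocol version 2
        "tid": measurement_id,  # GA4 measurement id (G-...)
        "cid": "1",             # fixed client id, no user information
        "en": "page_view",      # event name
        "sid": "1",             # fixed session id (see Notes above)
        "dl": repo_path,        # document location: path within the repository
        "dt": file_name,        # document title: file name
    }
    url = "https://www.google-analytics.com/collect?" + urlencode(params)
    return f"![{alt}]({url})"


print(
    ga4_pixel_markdown(
        "G-6VDTYWLKX6",
        "statmike/vertex-ai-mlops/architectures/tracking",
        "tracking_ga4.ipynb",
    )
)
```

The printed tag has the same shape as the pixel at the top of this document.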


---
## Tracking Clicks

A redirection tool is a great way to gather clicks (conversions).  This is called a beacon.

A GitHub project that combines beacon data with returning a tracking pixel can be viewed [here](https://github.com/igrigorik/ga-beacon).  It is built for UA and has not been updated for GA4.  It also documents an issue with using a tracking pixel on GitHub caused by GitHub's image caching.

A direct redirection tool is called aRT - see go/art
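The redirection idea can be sketched as a pure function: the link in a document points at a redirect service, the service records the click, then answers with a 302 to the real destination. Everything here is an assumption for illustration, including the `to` query parameter, the `link_click` event name, and the host allowlist (added so the service cannot be used as an open redirect). This is not how ga-beacon or aRT work internally.

```python
from urllib.parse import parse_qs, urlsplit


def handle_click(request_url, allowed_hosts):
    """Return (status, headers, event) for a click-tracking redirect request.

    `event` is a Measurement Protocol style event dict to record,
    or None when the request is rejected.
    """
    query = parse_qs(urlsplit(request_url).query)
    target = query.get("to", [""])[0]  # hypothetical parameter name
    host = urlsplit(target).hostname   # None when target is empty or relative
    if host not in allowed_hosts:
        return 400, {}, None
    event = {"name": "link_click", "params": {"target": target}}
    return 302, {"Location": target}, event


status, headers, event = handle_click(
    "/r?to=https%3A%2F%2Fgithub.com%2Fstatmike", {"github.com"}
)
print(status, headers["Location"])
```

A real service would post `event` to GA (for example with the Measurement Protocol call shown earlier) before sending the redirect response.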

---
## GA to BigQuery

- https://support.google.com/analytics/answer/9358801?hl=en&utm_id=ad
- GA > Admin > Account = vertex-ai-mlops > property = github > one of following (both go to same place)
    - Setup Assistant > Advanced Setup (optional) > Link To BigQuery > Setup BigQuery Link
    - Product Links > BigQuery Links
- BigQuery Links > Link
    - Choose a BigQuery Project: vertex-ai-mlops-369716 (it automatically sees projects in GCP under same login as GA = statmike@)
    - Confirm
    - location = US
    - Next
    - Frequency = Daily
    - Next
    - Submit
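Once the daily export is running, view counts per document can be queried from the exported tables. The sketch below only builds the SQL text. It assumes the standard GA4 export layout (a dataset named `analytics_<property id>` with daily `events_YYYYMMDD` tables and a repeated `event_params` field), and it assumes that the pixel's `dl` and `dt` values land in the `page_location` and `page_title` event parameters, which should be verified against the exported data. The function name is hypothetical and the dataset name must be supplied by the caller.

```python
def page_view_counts_sql(project, dataset):
    """Build a BigQuery query that counts page_view events per document."""
    return f"""
SELECT
  (SELECT value.string_value FROM UNNEST(event_params)
   WHERE key = 'page_location') AS page_location,
  (SELECT value.string_value FROM UNNEST(event_params)
   WHERE key = 'page_title') AS page_title,
  COUNT(*) AS views
FROM `{project}.{dataset}.events_*`
WHERE event_name = 'page_view'
GROUP BY page_location, page_title
ORDER BY views DESC
"""


# Placeholder dataset name: replace with the analytics_<property id> dataset
sql = page_view_counts_sql("vertex-ai-mlops-369716", "analytics_PROPERTY_ID")
print(sql)
```

The resulting string can be pasted into the BigQuery console or passed to a BigQuery client.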
