The pipeline has four steps:

1. A public HTTP cloud function, handle_telemetry_event, receives POST requests and saves each event as a JSON file in GCS, under a path specific to each dataset and table combination (a sketch follows this list).
2. A separate cloud function, compose_telemetry_events, runs on a schedule to consolidate the single-event files into bigger files (also sketched below).
3. A BigQuery data transfer job ingests the GCS files into daily <dataset>.raw_<table>_yyyymmdd tables in BigQuery.
4. Scheduled BigQuery queries parse the raw JSON into daily <dataset>.parsed_<table>_yyyymmdd tables.
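A minimal sketch of what handle_telemetry_event might look like; the bucket name, path layout, and response shape here are assumptions for illustration, not the exact implementation in the repo:

import json
import uuid
from datetime import datetime, timezone

from google.cloud import storage

BUCKET_NAME = "gcp-telemetry-example-landing"  # assumed bucket name

storage_client = storage.Client()


def handle_telemetry_event(request):
    """HTTP cloud function: save one telemetry event as a JSON file in GCS."""
    event = request.get_json()

    # Build a path specific to the dataset/table combination so the
    # downstream transfer job can pick up files per table.
    now = datetime.now(timezone.utc)
    blob_path = (
        f"{event.get('gcs_custom_prefix', 'default')}/"
        f"{event['dataset']}/{event['table']}/"
        f"{now:%Y%m%d}/{uuid.uuid4().hex}.json"
    )

    bucket = storage_client.bucket(BUCKET_NAME)
    # Newline-terminated so composed files stay valid newline-delimited JSON.
    bucket.blob(blob_path).upload_from_string(
        json.dumps(event) + "\n", content_type="application/json"
    )

    return json.dumps({"status": "ok", "gcs_path": blob_path})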
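GCS can merge objects server-side with the compose API (up to 32 sources per call), which is one way compose_telemetry_events could do its consolidation; again a hedged sketch, with the bucket name and prefix assumed:

from google.cloud import storage

BUCKET_NAME = "gcp-telemetry-example-landing"  # assumed bucket name

storage_client = storage.Client()


def compose_telemetry_events(event, context):
    """Background cloud function: consolidate single-event files via GCS compose."""
    bucket = storage_client.bucket(BUCKET_NAME)

    # Compose accepts at most 32 source objects per call, so work in batches.
    blobs = list(bucket.list_blobs(prefix="andrewm4894/dataset_a/table_a1/"))
    for i in range(0, len(blobs), 32):
        batch = blobs[i : i + 32]
        destination = bucket.blob(f"composed/table_a1_batch_{i // 32}.json")
        destination.compose(batch)
        # Remove the single-event files once they are consolidated.
        for blob in batch:
            blob.delete()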
cURL example POST request
curl --location --request POST 'https://us-east1-gcp-telemetry-example.cloudfunctions.net/handle_telemetry_event' \
--header 'Content-Type: application/json' \
--data-raw '{
"gcs_custom_prefix": "andrewm4894",
"project": "gcp-telemetry-example",
"dataset": "dataset_a",
"table": "table_a2",
"event_type": "dev",
"event_key": "mykey",
"event_data": "{'\''some_value'\'':'\''some_key'\''}"
}'
Python example POST request
import json

import requests

url = "https://us-east1-gcp-telemetry-example.cloudfunctions.net/handle_telemetry_event"

event_data = {
    "some_string": "a string",
    "some_list": ["a", "b", "c"],
    "some_int": 42,
    "some_float": 0.99,
    "some_dict": {"some_list": ["foo", "bar"]},
}

data = {
    "gcs_custom_prefix": "andrewm4894",
    "project": "gcp-telemetry-example",
    "dataset": "dataset_a",
    "table": "table_a1",
    "event_type": "dev",
    "event_key": "mykey",
    "event_data": json.dumps(event_data),
}

payload = json.dumps(data)
headers = {"Content-Type": "application/json"}

response = requests.post(url, headers=headers, data=payload)
print(response.json())
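Note that event_data is itself passed through json.dumps, so the top-level payload stays flat while arbitrarily nested structures travel as a JSON string and get unpacked later by the scheduled parsing queries.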
Example query against one of the resulting parsed tables:

SELECT
  *
FROM
  `gcp-telemetry-example.dataset_a.parsed_table_a1_20201122`

/*
timestamp,event_type,event_key,a1_key1,a1_key2
2020-11-22 11:00:00.197725,example,example,a1_value1,a1_value2
2020-11-22 04:30:00.095965,example,example,a1_value1,a1_value2
2020-11-22 18:00:00.628708,example,example,a1_value1,a1_value2
2020-11-22 20:00:00.212486,example,example,a1_value1,a1_value2
2020-11-22 03:30:00.141171,example,example,a1_value1,a1_value2
*/
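The same query can also be run from Python with the google-cloud-bigquery client; a small sketch, with the table name taken from the example above:

from google.cloud import bigquery

client = bigquery.Client(project="gcp-telemetry-example")

query = """
SELECT *
FROM `gcp-telemetry-example.dataset_a.parsed_table_a1_20201122`
"""

# Run the query and print a few columns from each row.
for row in client.query(query).result():
    print(row["timestamp"], row["event_type"], row["event_key"])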
Terraform is used to create all GCP resources.
All project Stackdriver logs are exported, via a log sink, to a BigQuery dataset called stackdriver.