Monitoring, Auditing and Reporting System (MARS)
Open Cloud Shell
Clone the https://github.com/ROIGCP/Mars repo
Command: git clone https://github.com/ROIGCP/Mars
Command: cd Mars
Make sure you have a project set
Command: gcloud config set project YOURPROJECTNAME
Bucket named PROJECTID-bucket1
Command: gcloud storage buckets create gs://${GOOGLE_CLOUD_PROJECT}-bucket1 --location=us-central1 --soft-delete-duration=0d
Required APIs enabled (also enabled via the script in run-cloud.sh)
Command:
gcloud services enable dataflow.googleapis.com cloudfunctions.googleapis.com \
run.googleapis.com cloudbuild.googleapis.com eventarc.googleapis.com pubsub.googleapis.com \
containerregistry.googleapis.com
Create a Service Account called marssa
Command: gcloud iam service-accounts create marssa
Grant marssa@PROJECTID.iam.gserviceaccount.com the roles/editor role
(NOTE: for production, reduce this to roles/dataflow.worker plus access to only the resources the job requires - the GCS bucket, BigQuery, etc.)
Command: gcloud projects add-iam-policy-binding $GOOGLE_CLOUD_PROJECT --member serviceAccount:marssa@$GOOGLE_CLOUD_PROJECT.iam.gserviceaccount.com --role roles/editor
Command: gcloud projects add-iam-policy-binding $GOOGLE_CLOUD_PROJECT --member serviceAccount:marssa@$GOOGLE_CLOUD_PROJECT.iam.gserviceaccount.com --role roles/dataflow.worker
BigQuery Dataset called "mars"
Command: bq mk mars
BigQuery Table called "activities" - starting schema
Command: bq mk --schema timestamp:STRING,ipaddr:STRING,action:STRING,srcacct:STRING,destacct:STRING,amount:NUMERIC,customername:STRING -t mars.activities
Schema (if you want to create it manually):
timestamp:STRING,
ipaddr:STRING,
action:STRING,
srcacct:STRING,
destacct:STRING,
amount:NUMERIC,
customername:STRING
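For reference, a row in this schema is just a flat record. The sketch below shows what one row looks like as a Python dict and how it could be streamed into mars.activities with the BigQuery client library; the field values are purely illustrative and are not taken from the Moonbank data.

```python
from google.cloud import bigquery

# Illustrative row matching the mars.activities schema; values are made up.
row = {
    "timestamp": "2023-01-15T09:30:00Z",
    "ipaddr": "203.0.113.10",
    "action": "transfer",
    "srcacct": "1000234",
    "destacct": "1000987",
    "amount": "250.00",          # NUMERIC accepts a number or numeric string
    "customername": "Jane Smith",
}

client = bigquery.Client()
# insert_rows_json performs a streaming insert; the table id is project.dataset.table
errors = client.insert_rows_json(f"{client.project}.mars.activities", [row])
print(errors or "row inserted")
```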
Run the Local Version (in Cloud Shell)
(also installs the required components - review the scripts and code before running)
Command: ./run-local.sh
Run the Cloud Version (in Cloud Shell)
(also installs the required components - review the scripts and code before running)
Command: ./run-cloud.sh
Buckets with Moonbank Data
Sample Data Bucket (7x small files): gs://mars-sample
If you are running in PluralSight there is a 5 GB limit, so don't use the production bucket as a source there
Production Data Bucket (300+ larger files): gs://mars-production
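The actual batch pipeline code is in the repo and should be reviewed there before running. Purely as orientation, a minimal Beam batch pipeline over these buckets might look like the sketch below; this is an assumed outline, not the repo's code, and the CSV column order and file layout are assumptions.

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

SCHEMA = ("timestamp:STRING,ipaddr:STRING,action:STRING,srcacct:STRING,"
          "destacct:STRING,amount:NUMERIC,customername:STRING")

COLUMNS = ["timestamp", "ipaddr", "action", "srcacct",
           "destacct", "amount", "customername"]

def parse_csv(line):
    # Assumes each CSV line has the columns in the same order as the schema.
    return dict(zip(COLUMNS, line.split(",")))

def run():
    opts = PipelineOptions()  # add --runner=DataflowRunner etc. for the cloud version
    with beam.Pipeline(options=opts) as p:
        (p
         | "Read" >> beam.io.ReadFromText("gs://mars-sample/*")
         | "Parse" >> beam.Map(parse_csv)
         | "Write" >> beam.io.WriteToBigQuery(
               "mars.activities",
               schema=SCHEMA,
               write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND))

if __name__ == "__main__":
    run()
```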
Make a copy of this Looker Studio dashboard and point it at your project.dataset.table
URL: https://datastudio.google.com/reporting/3f79b633-ac24-43b3-86c8-41f386ea514a
The streaming examples (located in /streaming/) have been adjusted to read from a Pub/Sub topic and write into BigQuery
However, they only write the data into a single column (message:STRING) in a table named raw
Streaming inserts expect data formatted in JSON instead of CSV
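In other words, the existing transform only wraps the raw payload in a one-field JSON object. A minimal sketch of that mapping (an assumed illustration, not the repo's exact processline):

```python
def processline(message):
    # Current behavior: the whole Pub/Sub payload lands in the single STRING
    # column of mars.raw, so the row is just {"message": <payload>}.
    line = message.decode("utf-8") if isinstance(message, bytes) else message
    return {"message": line}
```

The last step of this section replaces this with a parser that fills every column of the activities schema.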
BigQuery Dataset called mars and a table raw
Command: bq mk mars
Command: bq mk --schema message:STRING -t mars.raw
Subscribe to the Mars Activity Topic (if you have access to roigcp-mars project)
Command: gcloud pubsub subscriptions create mars-activities --topic projects/moonbank-mars/topics/activities
Alternate (if you don't have access to the topic): create a Pub/Sub topic and subscription in your own project, and post messages to it yourself (see the publishing sketch after the commands below)
Command: gcloud pubsub topics create activities-topic
Command: gcloud pubsub subscriptions create activities-subscription --topic=activities-topic
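If you created your own topic, you can post test messages with the Pub/Sub client library. The sketch below assumes the activities-topic created above; the CSV payload is illustrative and its column order is an assumption, not taken from the real feed.

```python
from google.cloud import pubsub_v1

project_id = "YOURPROJECTNAME"   # replace with your project id
topic_id = "activities-topic"    # topic created above

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path(project_id, topic_id)

# Illustrative CSV payload in the assumed activities column order.
payload = "2023-01-15T09:30:00Z,203.0.113.10,transfer,1000234,1000987,250.00,Jane Smith"

future = publisher.publish(topic_path, payload.encode("utf-8"))
print(f"Published message {future.result()}")
```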
Run the Local Version (in Cloud Shell)
(also installs the required components - Review the scripts and code BEFORE running)
Command: cd streaming
Command: ./run-stream-local.sh
Run the Cloud Version (in Cloud Shell)
(also installs the required components)
(Review the script and mars-cloud.py BEFORE running)
Command: ./run-stream-cloud.sh
Adjust the transformation function (processline) to create the JSON that represents the row, and change the pipeline to insert that row into the mars.activities table
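A hedged sketch of what the adjusted processline could look like, assuming the Pub/Sub messages are the same comma-separated activity records as the batch files (column order assumed to match the schema); the real message format may differ, so inspect a few messages before relying on this.

```python
import apache_beam as beam

COLUMNS = ["timestamp", "ipaddr", "action", "srcacct",
           "destacct", "amount", "customername"]

def processline(message):
    """Turn one CSV-formatted Pub/Sub message into a JSON-style row dict."""
    line = message.decode("utf-8") if isinstance(message, bytes) else message
    return dict(zip(COLUMNS, line.strip().split(",")))

def parse_and_write(messages):
    """Parse messages and write them to mars.activities instead of mars.raw."""
    return (messages
            | "Parse" >> beam.Map(processline)
            | "Write" >> beam.io.WriteToBigQuery(
                  "mars.activities",
                  write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND))
```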
