- This pipeline uses App Engine, Pub/Sub, and BigQuery.
- The app retrieves messages from `PUBSUB_SUBSCRIPTION` and writes them to `BIGQUERY_TABLE`.
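The retrieve-and-write loop described above might look roughly like the following. This is a minimal sketch, not the actual `main.py`: it assumes each message payload is a UTF-8 JSON object, that `BIGQUERY_TABLE` is set in the environment to a `dataset.table` or `project.dataset.table` ID, and that the `google-cloud-pubsub` and `google-cloud-bigquery` packages are installed. The helper names `message_to_row` and `run` are hypothetical.

```python
import json
import os


def message_to_row(data: bytes) -> dict:
    """Decode one Pub/Sub payload into a row (assumes a UTF-8 JSON object)."""
    row = json.loads(data.decode("utf-8"))
    if not isinstance(row, dict):
        raise ValueError("expected one JSON object per message")
    return row


def run(timeout=None):
    """Pull from PUBSUB_SUBSCRIPTION and stream each message into BIGQUERY_TABLE."""
    # Imported here so message_to_row works without the client libraries.
    from google.cloud import bigquery, pubsub_v1

    project_id = os.environ["PROJECT_ID"]
    table_id = os.environ["BIGQUERY_TABLE"]  # assumed to be set; see lead-in
    bq = bigquery.Client(project=project_id)
    subscriber = pubsub_v1.SubscriberClient()
    path = subscriber.subscription_path(
        project_id, os.environ["PUBSUB_SUBSCRIPTION"]
    )

    def callback(message):
        try:
            row = message_to_row(message.data)
        except ValueError:
            message.ack()  # drop malformed payloads rather than redeliver them
            return
        errors = bq.insert_rows_json(table_id, [row])
        if errors:
            message.nack()  # insert failed: let Pub/Sub redeliver
        else:
            message.ack()

    future = subscriber.subscribe(path, callback=callback)
    try:
        future.result(timeout=timeout)  # timeout=None blocks indefinitely
    finally:
        future.cancel()
```

Calling `run()` starts the loop. Acknowledging only after a successful insert means a failed write is retried, at the cost of possible duplicate rows.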
- Create the service account:

      gcloud iam service-accounts create [NAME]
- Grant permissions to the service account:

      gcloud projects add-iam-policy-binding [PROJECT_ID] \
        --member "serviceAccount:[NAME]@[PROJECT_ID].iam.gserviceaccount.com" \
        --role "roles/owner"

- Generate the key file:

      gcloud iam service-accounts keys create service-account.json \
        --iam-account [NAME]@[PROJECT_ID].iam.gserviceaccount.com
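Before moving on, it can help to confirm that the generated file really is a service account key. The sketch below is an optional sanity check, not part of the project: `check_key_file` is a hypothetical helper, and it relies only on the `type`, `client_email`, and `private_key` fields that service account key files contain.

```python
import json


def check_key_file(path):
    """Return the account email if the file looks like a service account key."""
    with open(path, encoding="utf-8") as f:
        key = json.load(f)
    if key.get("type") != "service_account":
        raise ValueError(f"{path} is not a service account key")
    # Both fields are needed for the client libraries to authenticate.
    missing = [name for name in ("client_email", "private_key") if not key.get(name)]
    if missing:
        raise ValueError(f"{path} is missing fields: {', '.join(missing)}")
    return key["client_email"]
```

For example, `check_key_file("service-account.json")` should return an address of the form `[NAME]@[PROJECT_ID].iam.gserviceaccount.com`.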
- If you don't have `virtualenv`, install it using pip:

      sudo pip install virtualenv
- Create an isolated Python environment and install the dependencies:

      virtualenv env
      source env/bin/activate
      pip install -r requirements.txt
- Export the environment variables:

      export GOOGLE_APPLICATION_CREDENTIALS="./service-account.json"
      export PROJECT_ID=knowledge-prototype
      export PUBSUB_TOPIC=cyton-data
      export PUBSUB_SUBSCRIPTION=cyton-data
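A script that depends on these variables can fail early with a clear message when one is missing. The following is a minimal sketch of such a check; `load_config` and `REQUIRED` are hypothetical names, and whether `main.py` does anything similar is not shown in this document.

```python
import os

# The four variables exported in the step above.
REQUIRED = (
    "GOOGLE_APPLICATION_CREDENTIALS",
    "PROJECT_ID",
    "PUBSUB_TOPIC",
    "PUBSUB_SUBSCRIPTION",
)


def load_config(env=None):
    """Return the required settings as a dict, or raise listing what is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED}
```

Passing a plain dict as `env` makes the check easy to exercise without touching the real environment.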
- Run:

      python main.py
- Load a newline-delimited JSON data file into BigQuery:

      bq load --source_format="NEWLINE_DELIMITED_JSON" \
        knowledge-prototype:rtda.cytonData [PATH_TO_DATA_FILE] [PATH_TO_SCHEMA_FILE]
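The `NEWLINE_DELIMITED_JSON` source format expects one JSON object per line. A data file in that shape could be produced with something like the sketch below; `to_ndjson` and `write_ndjson` are hypothetical helpers, and the row contents depend on the table schema, which is not given here.

```python
import json


def to_ndjson(rows):
    """Serialize rows as newline-delimited JSON: one compact object per line."""
    # json.dumps escapes newlines inside strings, so each row stays on one line.
    return "".join(json.dumps(row, separators=(",", ":")) + "\n" for row in rows)


def write_ndjson(rows, path):
    """Write rows to path in the format expected by bq load."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(to_ndjson(rows))
```

The resulting file is what goes in place of `[PATH_TO_DATA_FILE]` in the command above.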