confluentinc/etl

etl

Code for ETL data pipelines

** Authenticate BigQuery

  • scp BQ credentials to the VM: src_bash{gcloud compute scp /.confluentR.config --project [project_id] --zone [zone] [name]:/}
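
The placeholders in the command above can be filled in with your own values; the project, zone, and VM name below are hypothetical, and the command is echoed as a dry run rather than executed:

```shell
# Hypothetical values -- substitute your own project, zone, and VM name.
PROJECT_ID="my-project"
ZONE="us-central1-a"
VM_NAME="etl-vm"
# Dry run: print the gcloud command instead of executing it.
echo gcloud compute scp /.confluentR.config \
  --project "$PROJECT_ID" --zone "$ZONE" "$VM_NAME":/
```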

** Set up python3 environment in VM and install packages

  • Install virtualenv if you don't have it yet: sudo pip install virtualenv
  • Create virtual env: virtualenv -p /usr/bin/python3 env
  • Activate virtual env: . ~/etl/env/bin/activate
  • Install google cloud package: pip install google-cloud-bigquery
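
The steps above can be run as one script. This sketch uses Python's built-in venv module as a stand-in for virtualenv (equivalent for this purpose), and creates the environment in a throwaway directory instead of ~/etl:

```shell
set -e
ENV_DIR="$(mktemp -d)/env"    # throwaway location for illustration; the steps above use ~/etl/env
python3 -m venv "$ENV_DIR"    # stand-in for: virtualenv -p /usr/bin/python3 env
. "$ENV_DIR/bin/activate"     # same as the activate step above
python -c 'import sys; print(sys.prefix)'    # prints the env path, confirming activation
# pip install google-cloud-bigquery          # network step; run it inside the activated env
```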

** Create a cron job to get it scheduled

  • chmod 755 ~/etl/dedup_stitch_sfdc.py
  • crontab -e
  • Add this line to have it run at noon (12 p.m.) every day: 0 12 * * * . ~/etl/exec.sh
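
The cron entry sources ~/etl/exec.sh, whose contents aren't shown in this README. A minimal version would activate the virtualenv and run the dedup script; the body below is an assumption, written to a local file here for illustration:

```shell
# Sketch of exec.sh; the real ~/etl/exec.sh may differ.
cat > exec.sh <<'EOF'
#!/bin/sh
# Assumed contents: activate the env, then run the deduplication script.
. "$HOME/etl/env/bin/activate"
python "$HOME/etl/dedup_stitch_sfdc.py"
EOF
chmod 755 exec.sh   # match the chmod 755 applied to dedup_stitch_sfdc.py above
```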
