one-data-cookie/mds4all-elt

mds4all

A minimalistic, free modern data stack, and therefore one for all.

mds4all-elt

ELT layer of mds4all.

Setup

### BigQuery
# Install the gcloud CLI first: https://cloud.google.com/sdk/docs/install
# Log in and capture the active project ID
$ gcloud auth login
$ PROJECT_ID=$(gcloud config get-value project)

# Create a service account and grant it BigQuery roles
$ gcloud iam service-accounts create bigquery-sa --display-name="BigQuery SA"
$ gcloud projects add-iam-policy-binding "$PROJECT_ID" --member="serviceAccount:bigquery-sa@${PROJECT_ID}.iam.gserviceaccount.com" --role="roles/bigquery.user"
$ gcloud projects add-iam-policy-binding "$PROJECT_ID" --member="serviceAccount:bigquery-sa@${PROJECT_ID}.iam.gserviceaccount.com" --role="roles/bigquery.dataEditor"

# Download the service account key to the local machine
# (keep bigquery-sa.json out of version control)
$ gcloud iam service-accounts keys create bigquery-sa.json --iam-account=bigquery-sa@${PROJECT_ID}.iam.gserviceaccount.com
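
The downloaded key usually has to be made visible to whatever tool talks to BigQuery next. A minimal sketch, assuming the tool picks up Google Application Default Credentials through the `GOOGLE_APPLICATION_CREDENTIALS` environment variable (whether a given loader or adapter does so is an assumption to check against its settings). The helper name `use_bigquery_key` is hypothetical.

```shell
# Hypothetical helper: export the absolute path of the service account key
# so that tools using Application Default Credentials can find it.
use_bigquery_key() {
  key_file="${1:-bigquery-sa.json}"
  # Fail early when the key has not been downloaded yet
  if [ ! -f "$key_file" ]; then
    echo "key file not found: $key_file" >&2
    return 1
  fi
  # Resolve the directory to an absolute path so the variable survives a later cd
  key_dir=$(cd "$(dirname "$key_file")" && pwd) || return 1
  GOOGLE_APPLICATION_CREDENTIALS="$key_dir/$(basename "$key_file")"
  export GOOGLE_APPLICATION_CREDENTIALS
}
```

Usage, from the directory that holds the key: `use_bigquery_key bigquery-sa.json`.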

### Meltano
# Test Meltano locally
$ pip install meltano
$ meltano init meltano
$ cd meltano
$ meltano add extractor tap-spreadsheets-anywhere
$ meltano add loader target-bigquery
# add the tap and target configuration to meltano.yml
$ meltano elt tap-spreadsheets-anywhere target-bigquery
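
A pipeline can only use plugins that were added to the project, so a quick pre-flight check can catch a mismatched extractor or loader name before a run. A minimal sketch, assuming plugin entries appear in `meltano.yml` as `name: <plugin>` lines; the function name `check_plugins` is hypothetical and not part of Meltano.

```shell
# Hypothetical pre-flight check: confirm every named plugin is listed in the
# given project file before starting a pipeline run.
check_plugins() {
  project_file="$1"
  shift
  for plugin in "$@"; do
    # Match lines such as "  - name: tap-spreadsheets-anywhere"
    if ! grep -Eq "^[[:space:]]*-?[[:space:]]*name:[[:space:]]*${plugin}[[:space:]]*\$" "$project_file"; then
      echo "plugin not found in $project_file: $plugin" >&2
      return 1
    fi
  done
}
```

Usage: `check_plugins meltano.yml tap-spreadsheets-anywhere target-bigquery` returns non-zero if either plugin has not been added.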

### dbt
# Test dbt locally
$ pip install dbt-bigquery
$ dbt init
# set up profiles.yml (BigQuery connection) and the project files
$ dbt debug
$ dbt run
$ dbt test
$ dbt docs generate
$ dbt docs serve
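
The dbt steps above can also be chained so that a failure stops the sequence early. A minimal sketch: the function name `dbt_pipeline` and the `DBT_BIN` override are hypothetical conveniences, not part of dbt. `dbt docs serve` is left out because it keeps running until interrupted.

```shell
# Hypothetical wrapper: run the dbt steps in order and stop at the first failure.
# DBT_BIN is an assumed override for the executable (defaults to dbt).
dbt_pipeline() {
  dbt_bin="${DBT_BIN:-dbt}"
  for step in "debug" "run" "test" "docs generate"; do
    echo "==> dbt $step"
    # $step is left unquoted on purpose so "docs generate" splits into two arguments
    if ! "$dbt_bin" $step; then
      echo "dbt $step failed" >&2
      return 1
    fi
  done
}
```

Usage, from inside the dbt project directory: `dbt_pipeline && dbt docs serve`.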

Limitations

  • Public repository
  • Public documentation
  • Warehousing capped at 100 GB of storage and 1 TB of queries

Inspiration