{% include navigation.html %}
The Google Big Query Loader loads IESI metadata into BigQuery so that it can be used for analytics and machine learning.
In the first phase, the focus is on script result metadata.
The architecture is based on a publish/subscribe pattern in which the Big Query Loader (BQL) runs as a separate service from the IESI core solution.
- upon completion of a script, a message is published on a Google Pub/Sub topic
- the published message contains the run id of the script, which is unique within the IESI solution
- the BQL subscribes to this topic and will pick up the message
- the script results are retrieved from the IESI REST server and inserted into BigQuery
Make sure that the following Google Cloud Platform APIs are enabled for the project that you are deploying to:
For Terraform, set the service account credentials, for example in `setenv.sh`:
`export GOOGLE_APPLICATION_CREDENTIALS="service-account.json"`
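A minimal sketch of what a `setenv.sh` script might contain, assuming it is meant to hold the shell variables that the Terraform commands on this page reference (`tf_state_bucket`, `tf_state_path`, `tf_target_project`). All values are placeholders and are not taken from the IESI repository.

```shell
#!/usr/bin/env bash
# setenv.sh (sketch): source this file before running Terraform.
# All values below are placeholders; replace them with your own.

# Service account key file used to authenticate against Google Cloud.
export GOOGLE_APPLICATION_CREDENTIALS="service-account.json"

# Variables referenced by the terraform commands on this page.
export tf_state_bucket="my-terraform-state-bucket"  # placeholder GCS bucket name
export tf_state_path="iesi/bql/terraform.tfstate"   # placeholder path inside the bucket
export tf_target_project="my-gcp-project-id"        # placeholder project id
```

Sourcing the file (`. ./setenv.sh`) rather than executing it keeps the variables available in the current shell session.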
The Terraform folder contains the `backend.tf` configuration file, which needs the following information:
- bucket: the Google Cloud Storage (GCS) bucket to store the state. If you do not have a bucket yet, you can create a new one.
- path: the file path in the bucket where the state information is stored. You can change this value to match your own way of working.
You need to provide Terraform with the necessary values for these variables (see documentation):
- interactively
- by editing the `backend.tf` file
- via the command line:
  `terraform init -backend-config="bucket=${tf_state_bucket}" -backend-config="path=${tf_state_path}"`
Now you can initialize the Terraform deployment using `terraform init`.
Before performing the deployment, review the necessary variables, which are listed in the `variables.tf` file:
- credentials_file: path to the credentials JSON file
- project: ID of the project to deploy to
You need to provide Terraform with the necessary values for these variables (see documentation):
- by editing the `terraform.tfvars` file
- via the command line:
  `terraform apply -var="credentials_file=${GOOGLE_APPLICATION_CREDENTIALS}" -var="project=${tf_target_project}"`
Next, you can plan and execute the Terraform deployment using `terraform plan` and `terraform apply`.
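The init, plan and apply steps above could be combined into one script, sketched below. It assumes the variables `tf_state_bucket`, `tf_state_path`, `tf_target_project` and `GOOGLE_APPLICATION_CREDENTIALS` are already exported. The function name `deploy_bql`, the `DRY_RUN` switch and the plan file name `tfplan` are illustrative choices, not part of the IESI repository.

```shell
#!/usr/bin/env bash
# Sketch: run the Terraform sequence described on this page from the Terraform folder.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

deploy_bql() {
  # Abort with a clear message if a required variable is missing.
  : "${tf_state_bucket:?set tf_state_bucket}"
  : "${tf_state_path:?set tf_state_path}"
  : "${tf_target_project:?set tf_target_project}"
  : "${GOOGLE_APPLICATION_CREDENTIALS:?set GOOGLE_APPLICATION_CREDENTIALS}"

  # Initialize the backend that stores the Terraform state.
  run terraform init \
    -backend-config="bucket=${tf_state_bucket}" \
    -backend-config="path=${tf_state_path}" || return 1

  # Save the plan to a file (hypothetical name) so apply executes exactly what was planned.
  run terraform plan \
    -var="credentials_file=${GOOGLE_APPLICATION_CREDENTIALS}" \
    -var="project=${tf_target_project}" \
    -out=tfplan || return 1

  run terraform apply tfplan
}
```

Calling `DRY_RUN=1 deploy_bql` prints the three commands without executing them, which is a cheap way to check the variable values before touching the project.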
This will deploy the following:
- Pub/Sub topic: iesi-scriptresults
- Pub/Sub subscription: iesi-scriptresults-bigquery
- BigQuery dataset and relevant tables: iesi_results
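One possible way to check that the resources listed above exist after the deployment is sketched below. It assumes the Google Cloud SDK (`gcloud` and `bq`) is installed and authenticated and that `tf_target_project` holds the target project id. The function name `verify_bql_deployment` and the `DRY_RUN` switch are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: verify the Pub/Sub and BigQuery resources created by the Terraform deployment.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

verify_bql_deployment() {
  # Abort with a clear message if the project id is missing.
  : "${tf_target_project:?set tf_target_project}"

  # Each describe command fails if the resource does not exist.
  run gcloud pubsub topics describe iesi-scriptresults \
    --project="${tf_target_project}" || return 1
  run gcloud pubsub subscriptions describe iesi-scriptresults-bigquery \
    --project="${tf_target_project}" || return 1

  # List the tables in the iesi_results dataset.
  run bq ls "${tf_target_project}:iesi_results"
}
```

Running `DRY_RUN=1 verify_bql_deployment` first shows the exact commands that would be executed against the project.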
The Java component supports some of the individual deployment actions. However, it is currently advised to use the Terraform deployment and to use these functions only for advanced operations interventions. In the future, these actions may be improved and upgraded to support all required operations.
todo