Dedicated project for IESI and GCP integration

metadew/iesi-gcp

Google Big Query Loader

The Google Big Query Loader loads IESI metadata so that it can be used for analytics and machine learning.

Focus

In the first phase, the focus is on script result metadata.

Architecture

The architecture is based on a publish/subscribe pattern in which the Big Query Loader (BQL) runs as a separate service from the IESI core solution.

  • upon completion of a script, a message is published on a Google Pub/Sub topic
  • the published message contains the run id of the script, which is unique within the IESI solution
  • the BQL subscribes to this topic and picks up the message
  • the script results are retrieved from the IESI REST server and inserted into BigQuery
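
The flow above can be exercised by hand with the gcloud CLI. The following is a minimal sketch, not the component's actual implementation: the JSON shape and the `runId` field name are assumptions (the source only states that the message carries the run id), and the function names are hypothetical. The topic and subscription names are the ones created by the Terraform deployment described in this README.

```shell
# Build the message payload for a finished script run.
# ASSUMPTION: the real payload format is not documented here; a single
# "runId" field is used for illustration. The run id is assumed to contain
# no characters that need JSON escaping.
build_message() {
  run_id="$1"
  printf '{"runId":"%s"}' "$run_id"
}

# Publish the message to the topic (requires an authenticated gcloud CLI).
publish_script_result() {
  gcloud pubsub topics publish iesi-scriptresults \
    --message="$(build_message "$1")"
}

# Pull one message from the subscription for manual inspection,
# acknowledging it so that it is not redelivered.
pull_script_result() {
  gcloud pubsub subscriptions pull iesi-scriptresults-bigquery \
    --limit=1 --auto-ack
}
```

Calling `publish_script_result <run id>` followed by `pull_script_result` should show the message passing through the topic. Note that pulling with `--auto-ack` consumes the message, so the BQL will not receive it.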

Deploying the solution

API

Make sure that the following Google Cloud Platform APIs are enabled for the project that you are deploying to:

Credentials

For Terraform:

export GOOGLE_APPLICATION_CREDENTIALS="service-account.json"

Using Terraform

Helpers

setenv.sh
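
The contents of setenv.sh are not reproduced here. As a hypothetical sketch only, such a helper could export the variables that the Terraform commands in this README refer to. Every value below is a placeholder, not one of the project's real settings.

```shell
# setenv.sh (hypothetical sketch). Source it into the current shell:
#   . ./setenv.sh
# Each variable keeps a value that is already set and otherwise falls back
# to a placeholder that must be replaced.
export GOOGLE_APPLICATION_CREDENTIALS="${GOOGLE_APPLICATION_CREDENTIALS:-service-account.json}"
export tf_state_bucket="${tf_state_bucket:-my-terraform-state-bucket}"
export tf_state_path="${tf_state_path:-iesi-gcp/terraform.tfstate}"
export tf_target_project="${tf_target_project:-my-gcp-project-id}"
```

Sourcing the file, rather than executing it, is what makes the exported variables visible to later commands in the same shell.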

Store state in a Google Cloud Storage Backend

The Terraform folder contains the backend.tf configuration file which needs the following information:

  • bucket: the Google Cloud Storage (GCS) bucket to store the state. If you do not have a bucket yet, you can create a new one.
  • path: the file path in the bucket where to store the state information. You can update this value to respect your own way of working.

Through Terraform, you need to provide the necessary values for these variables (see documentation):

  • interactively
  • by editing the backend.tf file
  • via the command line: terraform init -backend-config="bucket=${tf_state_bucket}" -backend-config="path=${tf_state_path}"

Now, you can initialize the Terraform deployment using terraform init.
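
The command-line option can be wrapped in a small function that refuses to run when a value is missing. This is a sketch under assumptions: the function name and the `TERRAFORM` override variable are hypothetical conveniences, while the `bucket` and `path` keys are the ones named above.

```shell
# Initialize Terraform with the backend settings taken from the environment.
# TERRAFORM is a hypothetical override; setting it to "echo" prints the
# arguments instead of running terraform (a dry run).
init_backend() {
  if [ -z "${tf_state_bucket:-}" ] || [ -z "${tf_state_path:-}" ]; then
    echo "set tf_state_bucket and tf_state_path first" >&2
    return 1
  fi
  ${TERRAFORM:-terraform} init \
    -backend-config="bucket=${tf_state_bucket}" \
    -backend-config="path=${tf_state_path}"
}
```

Running `TERRAFORM=echo init_backend` is a quick way to check the values before touching the real state bucket.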

Perform the deployment

Before performing the deployment, review the necessary variables, which are listed in the variables.tf file:

  • credentials_file: path to the credentials JSON file
  • project: ID of the project to deploy to

Through Terraform, you need to provide the necessary values for these variables (see documentation):

  • by editing the terraform.tfvars file
  • via the command line: terraform apply -var="credentials_file=${GOOGLE_APPLICATION_CREDENTIALS}" -var="project=${tf_target_project}"

Next, you can plan and execute the Terraform deployment using terraform plan and terraform apply.
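
One way to chain the two steps is to save the plan to a file and apply that file, so that what is applied is exactly what was reviewed. The sketch below assumes the two variables above are supplied on the command line. The function name, the `tfplan` file name, and the `TERRAFORM` override variable are hypothetical.

```shell
# Plan and apply the deployment with the variables taken from the environment.
# TERRAFORM is a hypothetical override; setting it to "echo" prints the
# arguments instead of running terraform (a dry run).
deploy() {
  if [ -z "${GOOGLE_APPLICATION_CREDENTIALS:-}" ] || [ -z "${tf_target_project:-}" ]; then
    echo "set GOOGLE_APPLICATION_CREDENTIALS and tf_target_project first" >&2
    return 1
  fi
  # Write the plan to a file so that apply executes exactly what was planned.
  ${TERRAFORM:-terraform} plan -out=tfplan \
    -var="credentials_file=${GOOGLE_APPLICATION_CREDENTIALS}" \
    -var="project=${tf_target_project}" || return 1
  # Variables are stored in the saved plan, so apply takes no -var flags.
  ${TERRAFORM:-terraform} apply tfplan
}
```

Applying a saved plan does not prompt for confirmation, so inspect the plan output before letting the second step run in an automated setting.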

This will deploy the following:

  • Pub/Sub topic: iesi-scriptresults
  • Pub/Sub subscription: iesi-scriptresults-bigquery
  • BigQuery dataset and relevant tables: iesi_results
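
After the apply finishes, the resources listed above can be checked from the command line. This is a sketch that assumes the gcloud and bq CLIs are installed and already point at the target project. The function name and the `GCLOUD` and `BQ` override variables are hypothetical.

```shell
# Check that the deployed resources exist; stops at the first failure.
# GCLOUD and BQ are hypothetical overrides; setting them to "echo" prints
# the commands' arguments instead of running them (a dry run).
verify_deployment() {
  ${GCLOUD:-gcloud} pubsub topics describe iesi-scriptresults || return 1
  ${GCLOUD:-gcloud} pubsub subscriptions describe iesi-scriptresults-bigquery || return 1
  # List the tables that were created in the dataset.
  ${BQ:-bq} ls iesi_results
}
```

A non-zero exit status from `verify_deployment` means at least one resource is missing or not accessible with the current credentials.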

Using the component itself

The Java component supports some of the individual deployment actions. However, it is currently advised to use the Terraform deployment and to use these functions only for advanced ops interventions. In the future, these actions may be improved and upgraded to support all needed operations.

Running the solution

todo