An example of TFX intended to work with Vertex AI in Google Cloud

License

Notifications You must be signed in to change notification settings

iht/vertex-tfx-pipeline
TFX pipelines with Vertex AI

Setup

You will need a Google Cloud project with owner permissions, and the Google Cloud SDK configured to use that project. The simplest option is to run everything from the Cloud Shell of your Google Cloud project, which comes with the Google Cloud SDK configured by default.

Set up the Google Cloud project

This repository contains Terraform code in the terraform directory to set up Vertex AI and all the required APIs and permissions in the Google Cloud project.

Please check the README.md in the terraform/ directory for more details. You only need to run the Terraform code once.

Prepare the data

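Upload the dataset to Cloud Storage and load it into BigQuery, replacing <PROJECT_ID> with your project ID. Set the variable on its own line, so that the shell expands $PROJECT_ID in the commands that follow:

PROJECT_ID=<PROJECT_ID>

gcloud storage cp data/creditcard.csv.gz gs://$PROJECT_ID/data/

bq load --project_id $PROJECT_ID --autodetect --source_format=CSV --replace=true data_playground.transactions gs://$PROJECT_ID/data/creditcard.csv.gz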
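If PROJECT_ID is empty or still holds the placeholder, the commands above silently target a path such as gs:///data/. The following is a minimal sketch of a guard against that. The function name data_uri is hypothetical and not part of this repository; it only builds the destination URI used above and fails when the project ID is missing.

```shell
# Hypothetical helper (not part of the repository): print the Cloud Storage URI
# used by the commands above, or fail when the project ID is empty or is still
# the literal placeholder.
data_uri() {
  case "$1" in
    "" | "<PROJECT_ID>")
      echo "PROJECT_ID is not set" >&2
      return 1
      ;;
  esac
  echo "gs://$1/data/creditcard.csv.gz"
}
```

With this in place, a call such as uri="$(data_uri "$PROJECT_ID")" stops early on a missing project ID, and the resulting value can be passed as the last argument of bq load.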

Running the pipeline

Python version

Please use Python 3.7, 3.8 or 3.9. Older versions (e.g. 3.6) and newer versions (e.g. 3.10) will not work with TFX. For more details, please check the TFX documentation.

At the time of writing, the Cloud Shell has Python 3.9. You can check your Python version by running the following command:

python --version
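The check can also be automated. The sketch below assumes the 3.7 to 3.9 range stated above; the function name is_supported_python is hypothetical, and the range should be adjusted if the TFX version pinned in requirements.txt supports other interpreters.

```shell
# Hypothetical helper: succeed only for the "major.minor" versions that this
# README lists as working with TFX.
is_supported_python() {
  case "$1" in
    3.7 | 3.8 | 3.9) return 0 ;;
    *) return 1 ;;
  esac
}

# Ask the interpreter for its own major.minor version.
# The result is empty when python is not on the PATH.
current="$(python -c 'import sys; print("%d.%d" % sys.version_info[:2])' 2>/dev/null || true)"

if is_supported_python "$current"; then
  echo "Python $current is supported"
else
  echo "Python ${current:-unknown} is outside the 3.7-3.9 range" >&2
fi
```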

Once you have made sure you have the correct Python version, create a virtualenv:

python -m venv tfxenv

Activate it:

source ./tfxenv/bin/activate

Then install the dependencies listed in requirements.txt:

pip install -r requirements.txt

Run the pipeline

Edit the scripts in the scripts directory so that they point to your project ID and region of choice.

The playground branch of this repository contains incomplete code that you need to finish, as an exercise to learn the ropes of TFX pipelines.

To run the pipeline in Google Cloud, run the provided script from the top-level directory of the repository:

./scripts/launch_google_cloud.sh
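Because the script is meant to be started from the top-level directory, a small pre-flight check can catch the case where it is launched from somewhere else. This is only a sketch: the function name is_repo_root is hypothetical, and it assumes nothing beyond the script path shown above.

```shell
# Hypothetical helper: succeed when the given directory (default: the current
# one) contains the launch script at the path used in this README.
is_repo_root() {
  [ -f "${1:-.}/scripts/launch_google_cloud.sh" ]
}
```

For example, is_repo_root && ./scripts/launch_google_cloud.sh launches the pipeline only when the current directory is the top level of the repository.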
