
Kubeflow pipelines for Genie

These scripts generate datasets and train models with Genie on a Kubeflow cluster. They are tailored to our internal use, but maybe you'll find them useful too!

This is not officially released software. Please do not file issues, sorry.

Setup

Copy config.sample to config, and edit the values as described inside.
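
A minimal sketch of this step, assuming config uses shell-style KEY=value assignments (the registry path below is a placeholder, not a real value):

cp config.sample config
# then edit config and fill in your values, for example:
# COMMON_IMAGE=myregistry.azurecr.io/genie-common:latest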

Building the Docker images

The Docker images are split into a base image and a regular Kubeflow image, which is used by each pipeline step. There is an additional image that includes the Jupyter notebook code.

To build the base image, set the COMMON_IMAGE field in the config file, then run ./rebuild-common-image.sh.

To build the regular image, use ./rebuild-image.sh. See ./rebuild-image.sh --help for options.

The scripts build the images and upload them to ACR (Azure Container Registry).
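
Putting the two steps together, a typical build-and-push sequence looks like this:

# base image; set COMMON_IMAGE in config first
./rebuild-common-image.sh
# regular Kubeflow image; see ./rebuild-image.sh --help for options
./rebuild-image.sh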

Uploading pipelines

Use:

python3 upload_pipeline.py ${pipeline_function} ${pipeline_name}

to upload a pipeline to your Kubeflow cluster. ${pipeline_function} should be the name of a Python function in pipelines.py.
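
For example, to upload a hypothetical function train_pipeline under the name genie-train (both names are illustrative; train_pipeline would need to exist in pipelines.py):

python3 upload_pipeline.py train_pipeline genie-train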

You can also upload all pipelines at the same time via upload_all_pipelines.sh. You need to set the following keys before running the script: THINGPEDIA_DEVELOPER_KEY, AZURE_SP_APP_ID, AZURE_SP_TENANT_ID, and AZURE_SP_PASSWORD.
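
A minimal sketch, assuming the keys are read from the environment (all values are placeholders):

export THINGPEDIA_DEVELOPER_KEY=<your Thingpedia developer key>
export AZURE_SP_APP_ID=<Azure service principal app id>
export AZURE_SP_TENANT_ID=<Azure service principal tenant id>
export AZURE_SP_PASSWORD=<Azure service principal password>
./upload_all_pipelines.sh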

Running jobs

Refer to the Kubeflow documentation to learn how to run jobs after uploading the pipeline.
