Scalable pipeline processing and infrastructure management.
- Scalable pipeline processing using datapackage pipelines.
- Infrastructure management using Kubernetes.
Please refer to the datapackage-pipelines documentation for full details about the pipelines.
This project contains a sample pipeline called `noise`, which generates some noise.
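As an illustration of the kind of row transform such a processor performs, here is a library-free, hypothetical sketch (the function name, the `scale` parameter, and the `noise` field are assumptions for this example, not the actual code in noise.py):

```python
import random

def add_noise(rows, scale=1.0):
    """Yield each input row with an added 'noise' field drawn from [-scale, scale].

    Hypothetical stand-in for a noise-generating row transform;
    the real processor in noise.py may differ.
    """
    for row in rows:
        out = dict(row)
        out["noise"] = random.uniform(-scale, scale)
        yield out

rows = [{"id": 1}, {"id": 2}]
noisy = list(add_noise(rows, scale=0.5))
```

Processors in this style stream rows lazily, so they can handle datasets larger than memory.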
The pipelines are defined in the `pipeline-spec.yaml` file. Each step's `run` attribute can point to a local Python file implementing the datapackage-pipelines processor interface; see `noise.py` for an example.
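As a sketch only (the step parameters and output path here are hypothetical, not taken from this repo), a minimal `pipeline-spec.yaml` entry wiring a local processor might look like:

```yaml
noise:
  pipeline:
    - run: noise          # resolves to the local noise.py processor
      parameters:
        sample-size: 100  # hypothetical parameter passed to the processor
    - run: dump.to_path   # standard datapackage-pipelines processor
      parameters:
        out-path: data
```

The top-level key (`noise`) becomes the pipeline ID used by `dpp run`.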
Install some dependencies (the following should work on recent versions of Ubuntu / Debian):

```
sudo apt-get install -y python3.6 python3-pip python3.6-dev libleveldb-dev libleveldb1v5
sudo pip3 install pipenv
```

Install the app dependencies:

```
pipenv install
```

Activate the virtualenv:

```
pipenv shell
```

Get the list of available pipelines:

```
dpp
```

Run a pipeline:

```
dpp run <PIPELINE_ID>
```
Run the pipelines using Kubernetes jobs.
- Terminal with the `kubectl` command authenticated to a Kubernetes cluster. You should have some running nodes (verify using `kubectl get nodes`).
The following example job configurations are available; you can use them and modify them according to your requirements:

- `k8s-job.yaml` - a simple job, running once to completion
- `k8s-scheduled-job.yaml` - a scheduled job, running daily; before each run it syncs the latest data generated by the job defined in `k8s-job.yaml`
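As a hedged example of what such a job configuration might contain (the job name, container image, and command below are placeholders, not the repo's actual `k8s-job.yaml`), a minimal Kubernetes Job that runs a pipeline once to completion could look like:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: noise-pipeline                # placeholder job name
spec:
  template:
    spec:
      containers:
        - name: pipelines
          image: example/pipelines    # placeholder image with dpp installed
          command: ["dpp", "run", "./noise"]
      restartPolicy: Never            # let the Job controller handle retries
  backoffLimit: 2                     # retry a failed pipeline run up to twice
```

The scheduled variant wraps a similar pod template in a `batch/v1` `CronJob` with a `schedule` field (e.g. a daily cron expression).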
Run the job:

```
kubectl apply -f k8s-job.yaml
```
To modify the job and re-run it, delete the old job first:

```
kubectl delete job <JOB_NAME>
kubectl delete cronjob <CRON_JOB_NAME>
```