C2Dv2 continues the concept of bringing the algorithm to the data, allowing both public and private datasets to be used with algorithms. While previous versions relied on external components (Provider -> Operator Service (running in K8) -> multiple Operator Engines (each running in its own K8 namespace)), C2Dv2 is embedded entirely in ocean-node.
It takes a modular approach, allowing multiple compute engines to be connected to the same ocean-node instance. These compute engines can be internal (Docker, or K8 if ocean-node runs in a K8 environment) or external (in the future we could integrate projects like Bacalhau, iExec, etc. (TBD)).
- allow multiple C2D engines to be connected to the same ocean-node
- allow multiple jobs (stages) in a workflow
- allow a job to depend (or not) on a previous stage -> parallel/serial jobs
A workflow defines one or more jobs to be executed. Each job may depend on a previous job.
[
  {
    index: number
    jobId: generated by orchestrator
    runAfter: if defined, wait for specific jobId to finish
    input: [
      {
        index: number
        did?: string
        serviceId?: string
        files?: filesObject
      }
    ],
    algorithm: {
      did?: string
      serviceId?: string
      files?: filesObject
      rawcode?: string
      container?: {
        entrypoint: string
        image: string
        tag: string
      }
    }
  }
]
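As an illustration, here is a TypeScript sketch of this structure. Field names follow the schema above; the exact types used in ocean-node may differ.

```typescript
// Hypothetical TypeScript shapes mirroring the workflow schema above.
interface FilesObject {
  type: string            // e.g. 'url', 'ipfs', 'arweave'
  [key: string]: unknown  // storage-specific fields
}

interface JobInput {
  index: number
  did?: string
  serviceId?: string
  files?: FilesObject
}

interface JobAlgorithm {
  did?: string
  serviceId?: string
  files?: FilesObject
  rawcode?: string
  container?: {
    entrypoint: string
    image: string
    tag: string
  }
}

interface WorkflowJob {
  index: number
  jobId: string         // generated by the orchestrator
  runAfter?: string     // jobId of the stage this job waits for
  input: JobInput[]
  algorithm: JobAlgorithm
}

type Workflow = WorkflowJob[]

// Example: job 1 runs only after job 0 has finished (serial);
// jobs without runAfter can run in parallel.
const workflow: Workflow = [
  {
    index: 0,
    jobId: 'job-0',
    input: [{ index: 0, did: 'did:op:123', serviceId: '0' }],
    algorithm: {
      rawcode: 'print("hello")',
      container: { entrypoint: 'python $ALGO', image: 'python', tag: '3.11' }
    }
  },
  {
    index: 1,
    jobId: 'job-1',
    runAfter: 'job-0',
    input: [{ index: 0, files: { type: 'url', url: 'https://example.com/data.csv' } }],
    algorithm: { did: 'did:op:456', serviceId: '0' }
  }
]
```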
Formerly known as "operator-service", this layer handles the interaction between the ocean-node core layer and the different execution environments.
In a nutshell, it should (see the sketch after this list):
- expose the list of compute environments for all engines
- expose the list of running jobs and limits (max concurrent jobs, etc.)
- take a new job (created by the startJob core handler)
- determine which module to use (Docker, K8, Bacalhau, etc.)
- insert the workflow in the database
- signal the engine handler to take over the job execution
- read the workflow status when the c2d getStatus core command is called
- serve the job results when c2d getJobResult is called
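A rough sketch of how the orchestration layer might dispatch a new job to the matching engine; class and method names are illustrative, not the actual ocean-node API.

```typescript
// Illustrative sketch of the orchestration layer; the real ocean-node
// class and method names may differ.
import { randomUUID } from 'crypto'

type Workflow = object[]   // see the schema above
interface ComputeEnvironment { id: string; maxJobs: number }
interface JobStatus { jobId: string; status: string }

interface ComputeEngine {
  getEnvironments(): Promise<ComputeEnvironment[]>
  queueJob(jobId: string, workflow: Workflow): Promise<void>
  getJobStatus(jobId: string): Promise<JobStatus>
}

interface WorkflowDatabase {
  insertWorkflow(jobId: string, engine: string, workflow: Workflow): Promise<void>
}

class C2DOrchestrator {
  constructor(
    private engines: Map<string, ComputeEngine>,   // 'docker' | 'k8' | 'bacalhau' ...
    private db: WorkflowDatabase
  ) {}

  // Called by the startJob core handler.
  async startJob(engineType: string, workflow: Workflow): Promise<string> {
    const engine = this.engines.get(engineType)
    if (!engine) throw new Error(`No engine registered for "${engineType}"`)
    const jobId = randomUUID()
    await this.db.insertWorkflow(jobId, engineType, workflow)  // persist before execution
    await engine.queueJob(jobId, workflow)                     // engine takes over from here
    return jobId
  }
}
```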
Since, due to technical constraints, both internal modules (Docker and K8) will use docker images for data provisioning (the old pod-configuration) and results publishing (the old pod-publishing), the orchestration layer will also expose two new core commands:
- c2dJobStatusUpdate (called by both pod-configuration and pod-publishing to update the job status)
- c2dJobPublishResult (called by pod-publishing when results have to be uploaded)
When any pod-* container calls one of these endpoints, we must verify the signature and respond accordingly.
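The exact verification scheme is not fixed here; the following is only a hedged sketch in which each pod-* signs its payload with the per-job private key, and the node recovers the signer and compares it with the address stored when the job was created (ethers v6 API assumed).

```typescript
// Hedged sketch: verify that a c2dJobStatusUpdate call really comes from the
// pod-* container that owns the job.
import { verifyMessage } from 'ethers'

interface StatusUpdateCommand {
  jobId: string
  status: number
  signature: string   // produced by the pod with its per-job private key
}

async function handleJobStatusUpdate(
  cmd: StatusUpdateCommand,
  getJobAddress: (jobId: string) => Promise<string>  // address stored when the job was created
): Promise<boolean> {
  const expected = await getJobAddress(cmd.jobId)
  // Assumption: the pod signs a deterministic message, e.g. jobId + status.
  const signed = `${cmd.jobId}${cmd.status}`
  const recovered = verifyMessage(signed, cmd.signature)
  return recovered.toLowerCase() === expected.toLowerCase()
}
```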
Payment will be based on an escrow contract. The orchestrator will (see the sketch after this list):
- compute sum(maxDuration) over all jobs in the workflow
- calculate the required fee (depends on the previous step, token, environment, etc.)
- lock the amount in the escrow contract
- wait until all jobs are finished (successfully or not)
- calculate the actual duration spent
- compute the proof
- withdraw the payment, submit the proof, and release the difference back to the customer
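A minimal sketch of this flow, assuming a per-second price and a simple escrow interface; both are illustrative, not the final contract design.

```typescript
// Illustrative sketch of the escrow-based payment flow.
interface EscrowContract {
  lock(jobId: string, amount: bigint): Promise<void>
  // Assumption: withdrawing less than the locked amount releases the rest to the customer.
  withdraw(jobId: string, amount: bigint, proof: string): Promise<void>
}

interface PricedJob { maxDuration: number }   // seconds

// Required fee = sum(maxDuration) over all jobs * price per second of the environment.
function requiredFee(jobs: PricedJob[], pricePerSecond: bigint): bigint {
  const totalMaxDuration = jobs.reduce((sum, j) => sum + j.maxDuration, 0)
  return BigInt(totalMaxDuration) * pricePerSecond
}

async function settleJob(
  escrow: EscrowContract,
  jobId: string,
  actualDuration: number,     // measured once all jobs have finished
  pricePerSecond: bigint,
  proof: string               // proof of execution computed by the orchestrator
): Promise<void> {
  const owed = BigInt(actualDuration) * pricePerSecond
  // Withdraw only what was consumed; the escrow releases the locked
  // difference back to the customer.
  await escrow.withdraw(jobId, owed, proof)
}
```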
A C2D engine is a piece of code that handles C2D jobs running on a specific orchestration implementation.
This document focuses on the internal compute engines:
- Docker based (host with a Docker environment installed)
- K8 based (if ocean-node runs inside a K8 cluster)
An engine that uses external services (like Bacalhau) follows the same logic, but will most likely interact with remote APIs.
An engine is responsible for (see the sketch after this list):
- storing workflows and each job's status (so that on restart we can resume or continue running flows)
- queueing new jobs
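A small sketch of what that persistence/queue responsibility could look like; type and method names are hypothetical.

```typescript
// Hypothetical sketch of the engine's persistence responsibility,
// so jobs can be resumed after an ocean-node restart.
type JobState = 'queued' | 'provisioning' | 'running' | 'publishing' | 'done' | 'failed'

interface StoredJob {
  jobId: string
  state: JobState
  workflow: object     // the workflow as received from the orchestrator
  updatedAt: number
}

interface JobStore {
  upsert(job: StoredJob): Promise<void>
  listUnfinished(): Promise<StoredJob[]>
}

// On engine startup, reload anything that was not finished and resume it.
async function resumeOnRestart(
  store: JobStore,
  resume: (job: StoredJob) => Promise<void>
): Promise<void> {
  const pending = await store.listUnfinished()
  for (const job of pending) {
    await resume(job)   // e.g. re-queue, re-attach to a running container, or restart a stage
  }
}
```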
This module requires the Docker service to be installed at the host level.
It will leverage the Docker API to (see the sketch after this list):
- create the job volume (with quotas)
- start the provisioning container (aka pod-configuration)
- monitor its status
- create the yaml for the algorithm, with hardware constraints (CPU, RAM)
- pass devices for GPU environments
- start the algorithm container
- monitor algorithm health & timeout constraints
- stop the algorithm if the quota is exceeded
- start the publishing container
- delete the job volume
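Below is a minimal sketch of the container-creation step, assuming the dockerode client; image names, limits and labels are illustrative.

```typescript
// Minimal sketch of the container-creation step, assuming the dockerode client.
import Docker from 'dockerode'

const docker = new Docker()   // talks to the local Docker socket

async function startAlgorithmContainer(jobId: string, image: string, entrypoint: string) {
  // Job volume (size quotas depend on the storage driver in use).
  await docker.createVolume({ Name: `c2d-${jobId}` })

  const container = await docker.createContainer({
    Image: image,
    Entrypoint: entrypoint.split(' '),
    Labels: { c2dJobId: jobId },
    HostConfig: {
      Memory: 1024 * 1024 * 1024,        // 1 GiB RAM limit
      NanoCpus: 1_000_000_000,           // 1 CPU
      Binds: [`c2d-${jobId}:/data`]      // mount the job volume
      // For GPU environments, HostConfig.DeviceRequests would also be set here.
    }
  })
  await container.start()
  return container
}
```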
title C2Dv2 message flow for docker module
User -> Ocean-node: start c2d job
Ocean-node -> Orchestration-class: start c2d job
Orchestration-class -> Orchestration-class: determine module, insert workflow and random private key in db
Orchestration-class -> Docker-engine: queue job
Docker-engine -> Docker_host_api: create job volume
Docker-engine -> Docker-engine: create yaml for pod-configuration, set private key
Docker-engine -> Docker_host_api: start pod-configuration
Pod_configuration -> Pod_configuration: starts ocean-node as pod-config
Pod_configuration -> Ocean-node: call c2dJobProvision
Ocean-node -> Pod_configuration: return workflow
Pod_configuration -> Pod_configuration : download inputs & algo
Pod_configuration -> Ocean-node: call c2dJobStatusUpdate
Ocean-node -> Docker-engine: download success, start algo
Docker-engine -> Docker-engine: create yaml for algo
Docker-engine -> Docker_host_api: start algo container
Docker-engine -> Docker-engine: monitor algo container, stop if timeout
Docker-engine -> Docker-engine: create yaml for pod-publishing, set private key
Docker-engine -> Docker_host_api: start pod-publishing
Docker_host_api -> Pod-Publishing: start as docker container
Pod-Publishing -> Pod-Publishing : prepare output
Pod-Publishing -> Ocean-node: call c2dJobPublishResult
Pod-Publishing -> Ocean-node: call c2dJobStatusUpdate
This module requires access to K8 credentials (or auto-detects them if ocean-node already runs in a K8 cluster).
It will leverage the K8 API to (see the sketch after this list):
- create the job volume (with quotas)
- start the provisioning container (aka pod-configuration)
- monitor its status
- create the yaml for the algorithm, with hardware constraints (CPU, RAM)
- pass devices for GPU environments
- start the algorithm container
- monitor algorithm health & timeout constraints
- stop the algorithm if the quota is exceeded
- start the publishing container
- delete the job volume
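A rough sketch of submitting the algorithm step as a K8 Job, assuming the @kubernetes/client-node package; the manifest fields are illustrative and the client call signature varies between package versions.

```typescript
// Rough sketch of submitting the algorithm step as a K8 Job.
import * as k8s from '@kubernetes/client-node'

async function startAlgorithmJob(jobId: string, image: string, namespace: string) {
  const kc = new k8s.KubeConfig()
  kc.loadFromDefault()                  // or loadFromCluster() when running inside a pod
  const batchApi = kc.makeApiClient(k8s.BatchV1Api)

  const job: k8s.V1Job = {
    apiVersion: 'batch/v1',
    kind: 'Job',
    metadata: { name: `c2d-algo-${jobId}` },
    spec: {
      backoffLimit: 0,
      template: {
        spec: {
          restartPolicy: 'Never',
          containers: [{
            name: 'algorithm',
            image,
            resources: {
              limits: { cpu: '1', memory: '1Gi' }   // hardware constraints (CPU, RAM)
            },
            volumeMounts: [{ name: 'job-volume', mountPath: '/data' }]
          }],
          volumes: [{
            name: 'job-volume',
            persistentVolumeClaim: { claimName: `c2d-${jobId}` }  // PVC created with a quota
          }]
        }
      }
    }
  }

  // Note: the call signature differs across client versions (positional vs. object arguments).
  await batchApi.createNamespacedJob(namespace, job)
}
```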
For efficient communication between ocean-node and the two containers, the easiest way is to use the p2p/http API. Thus, each pod-* will run an ocean-node instance (each with a randomly generated per-job key) and connect to the main ocean-node instance. The main ocean-node instance's peerNodeId or HTTP API endpoint will be inserted in the yaml, and each pod-* will use a private key, also exposed in the yaml.
So, each pod-* yaml will contain the following envs (see the sketch after this list):
- nodePeerId ?
- nodeHttpApi ?
- privateKey
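A small sketch of the environment section injected into each pod-* container spec; variable names follow the list above, values are purely illustrative.

```typescript
// Sketch of the environment variables injected into each pod-* container spec.
function podEnvironment(jobPrivateKey: string) {
  return [
    { name: 'nodePeerId', value: '16Uiu2HA...' },              // main ocean-node peer id (placeholder)
    { name: 'nodeHttpApi', value: 'http://ocean-node:8000' },  // or the main node's HTTP API endpoint
    { name: 'privateKey', value: jobPrivateKey }               // per-job random key
  ]
}
```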
In the past, pod-configuration was a standalone repo, built as a docker image. In this implementation, it will be ocean-node with a different entrypoint (entry_configuration.js).
Implementation (see the sketch after this list):
- call ocean-node/c2dJobProvision and get the workflow's input section
- download all assets
- call the ocean-node/c2dJobStatusUpdate core command to update the status (provisioning finished or errors)
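A hedged sketch of what entry_configuration.js could do; the endpoint path, payloads, status codes and the signing scheme are assumptions derived from the steps above, not the actual implementation.

```typescript
// Hedged sketch of entry_configuration.js.
import { Wallet } from 'ethers'

const nodeApi = process.env.nodeHttpApi!
const wallet = new Wallet(process.env.privateKey!)
const jobId = process.env.jobId!          // hypothetical: how the pod learns its job id

async function main() {
  // 1. fetch the workflow's input section from the main node
  const res = await fetch(`${nodeApi}/directCommand`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ command: 'c2dJobProvision', jobId })
  })
  const { inputs } = await res.json()

  // 2. download all assets into the job volume
  for (const input of inputs) {
    await downloadAsset(input, `/data/inputs/${input.index}`)
  }

  // 3. report status back (provisioning finished or failed)
  const status = 30                                           // "provisioning done" (illustrative code)
  const signature = await wallet.signMessage(`${jobId}${status}`)
  await fetch(`${nodeApi}/directCommand`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ command: 'c2dJobStatusUpdate', jobId, status, signature })
  })
}

async function downloadAsset(input: { index: number }, destination: string) {
  // storage-specific download (url, ipfs, arweave, ...) omitted in this sketch
}

main().catch((e) => { console.error(e); process.exit(1) })
```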
In the past, pod-publishing was a standalone repo, built as a docker image. In this implementation, it will be ocean-node with a different entrypoint (entry_publishing.js).
Implementation (see the sketch after this list):
- read the output folder
- if multiple files or folders are detected, create a zip containing all of them
- call the ocean-node/c2dJobPublishResult core command and let ocean-node handle storage
- call the ocean-node/c2dJobStatusUpdate core command to mark the job as done
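A hedged sketch of what entry_publishing.js could do; the endpoint path, status codes and the zip step are assumptions derived from the steps above, not the actual implementation.

```typescript
// Hedged sketch of entry_publishing.js.
import { readdirSync } from 'fs'

const nodeApi = process.env.nodeHttpApi!
const jobId = process.env.jobId!          // hypothetical env, as in pod-configuration

async function main() {
  const entries = readdirSync('/data/outputs')

  // If there is more than one file/folder, archive everything into a single zip
  // (e.g. with a library such as archiver); a single file is uploaded as-is.
  const resultPath =
    entries.length > 1 ? await zipOutputs('/data/outputs') : `/data/outputs/${entries[0]}`

  // Hand the result over to the main ocean-node, which decides where to store it.
  await postCommand('c2dJobPublishResult', { jobId, resultPath })

  // Mark the job as done.
  await postCommand('c2dJobStatusUpdate', { jobId, status: 70 })   // 70 = "done" (illustrative code)
}

async function postCommand(command: string, payload: object) {
  await fetch(`${nodeApi}/directCommand`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ command, ...payload })
  })
}

async function zipOutputs(folder: string): Promise<string> {
  // zip implementation omitted in this sketch
  return '/data/outputs.zip'
}

main().catch((e) => { console.error(e); process.exit(1) })
```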