Orchestracto is a scheduling system that can help you run your regular pipelines easily.

## Installation and initialization

- Create a new virtualenv (or use an existing one) and install the provided wheel. Python >=3.11 is required.
```bash
python -m venv /path/to/venv
source /path/to/venv/bin/activate
python -m pip install ./orchestracto-0.0.0-py3-none-any.whl
```
- Or use a Jupyter kernel image with orchestracto preinstalled:
`cr.eu-north1.nebius.cloud/e00faee7vas5hpsh3s/solutions/orchestracto:v1`

Now the `orc` command-line tool is available:

In [2]:
!orc --help

Let's initialize a new orchestracto instance. You need a directory on your Tracto cluster for that (preferably somewhere more reliable than a directory in `//tmp`).

In [4]:
!orc init //tmp/orc-example

Now you have some directories prepared:

In [6]:
!yt list //tmp/orc-example

Let's start a scheduler process:

In [8]:
!orc --orc-dir //tmp/orc-example scheduler start --detached

## Workflows

Let's open this directory in the navigation UI. In its `workflows` directory you can create a new workflow, just as you would create a notebook or a simple document.
The name of the document becomes the `workflow_id` that identifies it within this orchestracto instance. Let's use `my_wf` as the workflow id here.
You will see a JSON editor with a dummy workflow. Let's replace it with something more interesting:
```JSON
{
	"triggers": [
		{
			"trigger_type": "cron",
			"params": {
				"cron_expression": "*/10 * * * *"
			}
		}
	],
	"steps": [
		{
			"step_id": "assert_foo_bar",
			"task_type": "docker",
			"task_params": {
				"env": {
					"FOO": "BAR"
				},
				"command": "python3 -c 'import json, os; assert os.environ[\"FOO\"] == \"BAR\"; print(json.dumps({\"key1\": \"value1\"}))' >&2",
				"docker_image": "docker.io/library/python:3.11"
			},
			"max_retries": 3,
			"min_retry_interval_seconds": 10,
			"outputs": [
				{
					"name": "key1"
				}
			]
		},
		{
			"step_id": "get_prev_step_out",
			"task_type": "docker",
			"task_params": {
				"command": "python3 -c 'import os; the_arg = os.environ[\"ORC_PARAM_the_arg\"]; print(f\"Hello, {the_arg}\")' >&2",
				"docker_image": "docker.io/library/python:3.11"
			},
			"args": [
				{
					"name": "the_arg",
					"src_type": "step_output",
					"src_ref": "assert_foo_bar.key1"
				}
			],
			"secrets": [
				{
					"key": "YT_TOKEN",
					"value_ref": "YT_TOKEN"
				}
			],
			"depends_on": [
				"assert_foo_bar"
			]
		}
	]
}
```
### Dependencies
Here we define a workflow consisting of two steps: `assert_foo_bar` and `get_prev_step_out`. The second step depends on the first one, as you can see in its `depends_on` parameter.
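
To make the `depends_on` relationship concrete, here is a minimal local sketch that derives a valid step order from a workflow document. It is only an illustration of the dependency graph, not how the orchestracto scheduler is implemented; the helper name `execution_order` is hypothetical, and the JSON is a trimmed copy of the example above.

```python
import json
from graphlib import TopologicalSorter

# Trimmed copy of the example workflow: only the fields relevant to ordering.
WORKFLOW_JSON = """
{
    "steps": [
        {"step_id": "assert_foo_bar"},
        {"step_id": "get_prev_step_out", "depends_on": ["assert_foo_bar"]}
    ]
}
"""


def execution_order(workflow: dict) -> list[str]:
    """Return step ids in an order that respects every `depends_on` edge.

    Hypothetical helper for local sanity checks; not part of orchestracto.
    """
    steps = workflow["steps"]
    known_ids = {step["step_id"] for step in steps}
    graph = {}
    for step in steps:
        deps = step.get("depends_on", [])
        unknown = set(deps) - known_ids
        if unknown:
            raise ValueError(
                f"{step['step_id']} depends on unknown steps: {sorted(unknown)}"
            )
        # TopologicalSorter expects a mapping: node -> set of predecessors.
        graph[step["step_id"]] = set(deps)
    # static_order() raises graphlib.CycleError on circular dependencies.
    return list(TopologicalSorter(graph).static_order())


order = execution_order(json.loads(WORKFLOW_JSON))
print(order)
```

A check like this can catch a misspelled `step_id` or a dependency cycle before the document is saved into the `workflows` directory.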

### Task types
Both steps have the same task type: `docker`. This task type runs a command (`task_params->command`) in a Docker container (the image is set in `task_params->docker_image`). The command should not write anything to stdout; use stderr instead. If your Docker registry is private, add a secret like the following (the document referenced by `value_ref` must contain either `username` and `password` fields or an `auth` field):
```json
{
    "key": "docker_auth",
    "value_src_type": "cypress_docker_creds",
    "value_ref": "//cypress/path/to/document"
}
```

There is also a `notebook` task type. It runs a Jupyter notebook in an existing kernel. Its configuration looks something like this:
```json
{
    "task_type": "notebook",
    "task_params": {
        "notebook_path": "//some/path/to/notebook",
        "yt_jupyter_kernel": "my_lovely_kernel"
    }
}
```

### Output params and args
Steps can have output params. These must be declared in the `outputs` section of the step. They can be used as input arguments for child steps: the `get_prev_step_out` step uses an output of the `assert_foo_bar` step (see its `args` section). Outputs must be printed to stderr as JSON on the last line of the program's output.
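
The one-liners in the example workflow can be unpacked into a small step script. The sketch below follows the conventions described above (logs go to stderr, outputs are a JSON object on the last line, arguments arrive as `ORC_PARAM_<name>` environment variables, as in the `ORC_PARAM_the_arg` example). The helper names `log`, `read_arg` and `emit_outputs` are hypothetical and not part of orchestracto.

```python
import json
import os
import sys


def log(message: str) -> None:
    """Write a diagnostic line to stderr, leaving stdout untouched."""
    print(message, file=sys.stderr)


def read_arg(name: str, environ=os.environ) -> str:
    """Read a step argument from the ORC_PARAM_<name> environment variable."""
    return environ[f"ORC_PARAM_{name}"]


def emit_outputs(outputs: dict, stream=sys.stderr) -> None:
    """Print the outputs as a single JSON line.

    Call this last so that the JSON is the final line of the step's output.
    """
    print(json.dumps(outputs), file=stream)


if __name__ == "__main__":
    log("doing some work...")
    # The keys must match the names declared in the step's `outputs` section.
    emit_outputs({"key1": "value1"})
```

A child step that declares an argument named `the_arg` would then call `read_arg("the_arg")` to obtain the value.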

### Retries
Failing steps can be retried: up to `max_retries` times, with at least `min_retry_interval_seconds` seconds between attempts.

### Secrets
Steps also have a `secrets` section. For now only one kind of secret is available: the `YT_TOKEN` that orchestracto was launched with. It is exposed to the step in the environment variable `YT_SECURE_VAULT_YT_TOKEN`.
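
Inside a step, the token can be read like any other environment variable. A minimal sketch, assuming the step declares the `YT_TOKEN` secret as in the example workflow; the helper name `get_yt_token` is hypothetical.

```python
import os

# Variable name as documented above.
TOKEN_ENV_VAR = "YT_SECURE_VAULT_YT_TOKEN"


def get_yt_token(environ=os.environ) -> str:
    """Return the token passed to the step, failing loudly if it is missing."""
    token = environ.get(TOKEN_ENV_VAR)
    if not token:
        raise RuntimeError(
            f"{TOKEN_ENV_VAR} is not set; does the step declare the YT_TOKEN secret?"
        )
    return token
```

Avoid logging the returned value, since step stderr ends up in job logs.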

### Triggers
This workflow also has a trigger. Two trigger types are available so far: `cron` and `node_update`. In the example above, the `cron` trigger runs the workflow every 10 minutes. The `node_update` trigger watches a Cypress node for changes. Here is a configuration example:
```json
{
    "trigger_type": "node_update",
    "params": {
        "node_path": "//some/cypress/path"
    }
}
```
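
If you generate workflow documents programmatically, the two trigger shapes can be built with small helpers. This is a sketch: the function names are hypothetical, the field names are taken from the examples above, and combining several triggers in one workflow is an assumption suggested by `triggers` being a list.

```python
import json


def cron_trigger(cron_expression: str) -> dict:
    """Build a `cron` trigger entry for the `triggers` list."""
    return {"trigger_type": "cron", "params": {"cron_expression": cron_expression}}


def node_update_trigger(node_path: str) -> dict:
    """Build a `node_update` trigger entry for the `triggers` list."""
    return {"trigger_type": "node_update", "params": {"node_path": node_path}}


# In standard cron syntax, "*/10" in the minute field matches every tenth minute.
minutes_matched = list(range(0, 60, 10))

triggers = [
    cron_trigger("*/10 * * * *"),
    node_update_trigger("//some/cypress/path"),
]
print(json.dumps({"triggers": triggers}, indent=4))
```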

## Manual run
You can run the workflow manually in the workflow editor UI or via the CLI tool:

In [11]:
!orc --orc-dir //tmp/orc-example workflow run my_wf

This creates a run request that is handled by the scheduler process.
You can track the execution progress on the `Runs` tab of the workflow's page.
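
The same CLI call can be made from a Python script, for example from another pipeline. This is a minimal sketch that only wraps the command shown above; the helper names are hypothetical, and what `orc` prints on success is not specified here, so the output is returned as-is.

```python
import shlex
import subprocess


def build_run_command(orc_dir: str, workflow_id: str) -> list[str]:
    """Build the argument list for `orc ... workflow run <workflow_id>`."""
    return ["orc", "--orc-dir", orc_dir, "workflow", "run", workflow_id]


def run_workflow(orc_dir: str, workflow_id: str) -> subprocess.CompletedProcess:
    """Submit a run request; raises CalledProcessError if `orc` exits non-zero."""
    return subprocess.run(
        build_run_command(orc_dir, workflow_id),
        check=True,
        capture_output=True,
        text=True,
    )


command = build_run_command("//tmp/orc-example", "my_wf")
print(shlex.join(command))
```

Passing the arguments as a list avoids shell quoting issues if a workflow id or directory ever contains unusual characters.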

## Logs
Log access will be improved soon. For now, you can find execution logs in the workflow and step execution operations, which can be located by their `run_id` (filter operations by title; the `run_id` is part of it). There you can find job stderr links (on the `Jobs` tab).