[FEATURE REQUEST] Orchestration job metadata #62683

Open
max-arnold opened this issue Sep 14, 2022 · 4 comments · May be fixed by #66073
Labels: Feature, needs-triage, Runners

max-arnold commented Sep 14, 2022

Is your feature request related to a problem? Please describe.

For reporting/observability purposes it would be very useful to add custom metadata (key/value pairs) to Salt jobs. This metadata should be propagated back to job returns (be visible on the event bus) and could be used to track and filter particular job types using this metadata.

Turns out, this feature already exists in Salt:

Unfortunately, salt-run has no way to add metadata, and it is not possible to pass metadata from an orchestration job to child jobs that generate new JIDs and produce separate returns.

Describe the solution you'd like

My suggestion is to add the job metadata to runner jobs to achieve feature parity with other job types (salt, salt-call, scheduled):

  1. Add a new CLI arg to salt-run, named either --metadata (as used by salt) or --set-metadata (as used by salt-call, where --metadata does something else; see "Add --set-metadata option to salt-call", #37362).
  2. Automatically inherit this metadata by orchestration sub-jobs that generate new JIDs and produce separate return events (salt.function, salt.state, salt.runner, salt.parallel_runners, salt.wheel, including nested orchestrations).
  3. Optionally, allow adding/extending job metadata for orchestration sub-jobs explicitly (in addition to inherited metadata), e.g.:
# orch/myorch.sls
Sleep:
  salt.function:
    - name: test.sleep
    - metadata:
        baz: qux
    - tgt: "*"
    - arg:
      - 1
salt-run state.orch orch.myorch --metadata '{foo: bar}'

The return event from the Sleep sub-job should carry the metadata {foo: bar, baz: qux}, while the return from the orchestration itself should carry just {foo: bar}.
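The inherit-then-extend behaviour described above can be sketched in a few lines of Python. This is only an illustration of the proposed semantics, not Salt code: `merge_metadata` is a hypothetical helper, and letting explicit sub-job keys win over inherited ones on a conflict is an assumption that the proposal does not spell out.

```python
def merge_metadata(inherited, explicit=None):
    """Combine metadata inherited from the parent job with sub-job metadata.

    Hypothetical helper: explicit keys extend the inherited ones and,
    by assumption, override them when the same key appears in both.
    """
    merged = dict(inherited or {})  # copy, so the parent's metadata is untouched
    merged.update(explicit or {})
    return merged


# Metadata given on the command line for the orchestration itself
orch_metadata = {"foo": "bar"}

# Metadata for the Sleep sub-job: inherited plus its own "metadata" argument
sleep_metadata = merge_metadata(orch_metadata, {"baz": "qux"})

# A sub-job without explicit metadata simply inherits the parent's
plain_metadata = merge_metadata(orch_metadata)
```

Copying the inherited dictionary keeps the orchestration's own return at {foo: bar} even after sub-jobs extend it.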

I believe this ability to add cross-cutting job metadata will significantly improve job traceability for complex orchestrations that spawn new sub-jobs.
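As a rough illustration of the filtering this would enable, the sketch below selects job events by metadata. It assumes that the metadata ends up under a top-level `metadata` key of the event payload; that placement, the `matches_metadata` helper, and the sample JIDs are all hypothetical.

```python
def matches_metadata(event, wanted):
    """Return True if the event carries every key/value pair in `wanted`.

    Assumes (hypothetically) that job metadata is exposed under a
    top-level "metadata" key of the event payload.
    """
    metadata = event.get("metadata") or {}
    return all(metadata.get(key) == value for key, value in wanted.items())


# Made-up event payloads: an orchestration, its sub-job, and an unrelated job
events = [
    {"jid": "1", "fun": "runner.state.orch", "metadata": {"foo": "bar"}},
    {"jid": "2", "fun": "test.sleep", "metadata": {"foo": "bar", "baz": "qux"}},
    {"jid": "3", "fun": "test.ping"},
]

# Every job that belongs to the {foo: bar} orchestration run
traced = [e["jid"] for e in events if matches_metadata(e, {"foo": "bar"})]

# Only the sub-job that added its own metadata
sleep_only = [e["jid"] for e in events if matches_metadata(e, {"baz": "qux"})]
```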

Describe alternatives you've considered
An alternative is to add a parent job ID to every sub-job spawned by the orchestration. It is less flexible though because it is a single field and doesn't allow adding arbitrary attributes.

P.S. Check whether it is possible to supply metadata via salt-api.
P.P.S. It may also be worth passing the metadata down to the Jinja context for complete pass-through observability.

max-arnold added the Feature and needs-triage labels on Sep 14, 2022
garethgreenaway self-assigned this on Sep 23, 2022
@max-arnold
Contributor Author

@garethgreenaway Is there any chance of seeing this feature implemented in the near future (perhaps after 3006 is released)?

We have some ideas on how to use it in Salt Grafana: https://turtletraction-oss.gitlab.io/salt-grafana/

@garethgreenaway
Contributor

@max-arnold Thanks for the reminder. Seems like a good thing to plan for 3007.

@max-arnold
Contributor Author

Ping

@max-arnold
Contributor Author

max-arnold commented Jan 14, 2024

@garethgreenaway Hey! Hope you are well.

It looks like I have to roll up my sleeves and try to implement this feature on my own.

Would you be kind enough to give me some pointers into the code? I do not need help with trivial things like CLI options, but pointers to where in the codebase this metadata context is best passed would be very helpful:

  1. Where should it be passed to a runner job? Is it runner.RunnerClient or runner.Runner? Should it go in the __init__ method, run, or cmd? What about async runner calls?
  2. What is the best way to pass (inherit) the metadata in runner sub-jobs spawned from an orchestration?
  3. Am I missing other ways to spawn runner sub-jobs (that generate their own return data) besides the five orch state functions?
  4. What is the safest backward-compatible way to override it in a sub-job? Do I have to add the metadata keyword to all orch state functions (salt.function, salt.state, salt.runner, salt.parallel_runners, salt.wheel), or is there a more DRY approach?
  5. Are any changes required in salt-api to support the metadata? I'm assuming not.
  6. Are there other corner cases I do not know about that would prevent this feature from being cross-cutting across all Salt code? Salt-SSH minions, for example?

UPD for (3): I found at least one convoluted way to spawn a runner sub-job outside of orchestration:

salt-run salt.cmd saltutil.runner config.get '[id]'

It produces two job returns:

salt/run/20240114111539772526/new	{
    "_stamp": "2024-01-14T11:15:40.239526",
    "fun": "runner.salt.cmd",
    "fun_args": [
        "saltutil.runner",
        "config.get",
        [
            "id"
        ]
    ],
    "jid": "20240114111539772526",
    "user": "salt"
}
salt/run/20240114111540819548/new	{
    "_stamp": "2024-01-14T11:15:40.844612",
    "fun": "runner.config.get",
    "fun_args": [
        "id"
    ],
    "jid": "20240114111540819548",
    "user": "UNKNOWN"
}
salt/run/20240114111540819548/ret	{
    "_stamp": "2024-01-14T11:15:40.850252",
    "fun": "runner.config.get",
    "fun_args": [
        "id"
    ],
    "jid": "20240114111540819548",
    "return": "minion_master",
    "success": true,
    "user": "UNKNOWN"
}
salt/run/20240114111539772526/ret	{
    "_stamp": "2024-01-14T11:15:40.857119",
    "fun": "runner.salt.cmd",
    "fun_args": [
        "saltutil.runner",
        "config.get",
        [
            "id"
        ]
    ],
    "jid": "20240114111539772526",
    "return": "minion_master",
    "success": true,
    "user": "salt"
}
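To make the "two job returns" observation concrete, here is a small Python sketch that groups the events above by the JID embedded in each tag. The payloads are trimmed copies of the output shown; the tag layout `salt/run/<jid>/<stage>` is taken from that output, and the variable names are purely illustrative.

```python
from collections import defaultdict

# (tag, payload) pairs, trimmed from the event bus output above
events = [
    ("salt/run/20240114111539772526/new", {"fun": "runner.salt.cmd", "user": "salt"}),
    ("salt/run/20240114111540819548/new", {"fun": "runner.config.get", "user": "UNKNOWN"}),
    ("salt/run/20240114111540819548/ret", {"fun": "runner.config.get", "user": "UNKNOWN"}),
    ("salt/run/20240114111539772526/ret", {"fun": "runner.salt.cmd", "user": "salt"}),
]

# Group payloads by JID, keyed by stage ("new" or "ret")
jobs = defaultdict(dict)
for tag, payload in events:
    _prefix, _kind, jid, stage = tag.split("/")
    jobs[jid][stage] = payload

# One (function, user) pair per job
summary = {jid: (stages["new"]["fun"], stages["new"]["user"]) for jid, stages in jobs.items()}

for jid, (fun, user) in summary.items():
    print(jid, fun, user)
```

The grouping yields two independent jobs, and nothing in the inner job's payload ties it back to the outer one (its user is even reported as UNKNOWN), which is the gap that inherited metadata is meant to close.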

max-arnold linked a pull request on Feb 16, 2024 that will close this issue