feature: Capture run ID #8458
Comments
Thanks for filing @menzenski!
I can imagine this, though we'd prefer to keep the run ID as a UUID to avoid having to create an Alembic migration script, since in Postgres it uses the builtin Uniqueness of Let me know if those restrictions work for you and your workflow, or if you'd need support for arbitrary strings.
I'm certain we could pass down a |
…tom run UUIDs Related: * Closes #8458
Sorry - I wasn't clear in my original message. I wasn't trying to propose that an orchestrator external to meltano should be able to set the meltano run ID. Rather, I was thinking about something like this:
Or similar - it seems that the meltano/src/meltano/core/job/job.py Line 112 in 2988899
singer_state property).
For our use case, as long as it was available as an environment variable here, when the meltano/src/meltano/cli/run.py Lines 153 to 209 in 2988899
@menzenski Would this have a different value to the FWIW if you wanna check out the approach, I was able to experiment with a |
@edgarrmondragon sorry for my delayed response here, I was out of office and missed your update - the draft PR https://github.com/meltano/meltano/pull/8459/files looks awesome, that'd totally work for our use case. (I confirmed that Argo Workflows is using v4 UUID strings).
Thanks for confirming @menzenski. I'm already in the process of beta testing Meltano 3.4.0 but I could probably slip #8459 in if the team accepts it.
@edgarrmondragon I put Meltano 3.4.0 into production today - we're using this new It works great! Huge quality-of-life improvement for us. Thanks so much for implementing this!
I'm glad that it's helpful!
Feature scope
CLI (options, error messages, logging, etc.)
Description
We run meltano in Kubernetes using Argo Workflows.
We use the Argo Workflows workflow archive, so we have workflow execution data saved in Postgres there.
We use the Meltano Postgres system database, so we have Meltano job run data saved in Postgres also.
But we don't have a way to join these two "job execution" data sets together. That is, we don't have a way to link "this specific argo workflow executed this specific Meltano run". We would like to be able to do that.
I am not sure exactly what would make the most sense here, or what would be easiest relative to existing behavior/functionality. But some things I can think of:
- meltano run could accept a --run-id=abc123 CLI argument or similar, which could be persisted as part of the runs table record for that run.
- meltano run could expose the run ID of the current job as an environment variable (MELTANO_RUN_ID or similar), which we could capture upon completion of the job and persist in the Argo Workflows archive.

Being able to join these two sets of "job run data" would be really valuable to us and I'd be happy to try to contribute to this effort.
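
To make the first option concrete, here is a minimal sketch of how an orchestrator-side wrapper might build the command line. It assumes the --run-id flag behaves as proposed in this issue and that the value must be a UUID, as discussed in the comments. The helper name build_meltano_run_command and the example plugin names are hypothetical, and nothing here actually invokes Meltano.

```python
import uuid


def build_meltano_run_command(workflow_uid: str, blocks: list[str]) -> list[str]:
    """Build an argv list for `meltano run` carrying an external run ID.

    Assumption: `--run-id` is the flag proposed in this issue and accepts
    a UUID string. `workflow_uid` stands in for whatever UUID the
    orchestrator (for example Argo Workflows) assigns to the workflow.
    """
    # uuid.UUID raises ValueError when the value is not a valid UUID,
    # so arbitrary strings such as "abc123" are rejected up front.
    run_id = uuid.UUID(workflow_uid)
    return ["meltano", "run", f"--run-id={run_id}", *blocks]


if __name__ == "__main__":
    # Hypothetical plugin names, used only for illustration.
    cmd = build_meltano_run_command(
        "123e4567-e89b-42d3-a456-426614174000",
        ["tap-example", "target-example"],
    )
    print(" ".join(cmd))
```

Since the same UUID would then appear in both the Argo Workflows archive and the Meltano runs table, it could serve as the join key between the two data sets.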