-
Notifications
You must be signed in to change notification settings - Fork 808
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf(sql): Store child pipelines as references #3986
perf(sql): Store child pipelines as references #3986
Conversation
@@ -830,10 +832,39 @@ class SqlExecutionRepository( | |||
stages.forEach { storeStageInternal(ctx, it, executionId) } | |||
} | |||
} finally { | |||
// Restore original object state. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the fact that we need to do this seems a bit wild to me - I wonder if it's worth just objectMapper copying the execution on the way in here so we are free to mutate it as we see fit..
unless the expectation is that the repository mutates to execution to set Ids on it or something...
anyway not super relevate to the overall change in this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was an optimization. Saving the execution is a high-throughput (as far as Orca is concerned) operation, and serialization of executions is very high cost.
the approach definitely makes sense to me |
// throw an exception, we'll continue to load the execution with [PipelineRefTrigger] and let downstream | ||
// consumers throw exceptions if they need to. We don't want to throw here as it would break pipeline list | ||
// operations, etc. | ||
log.error("Attempted to load parent execution for '${execution.id}', but it no longer exists: ${trigger.parentExecutionId}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I would make this a warn
so it doesn't show up sentry, etc.
Also, it's not really an error here, as you state. It's an error at a time when someone downstream needs it
I remember someone mentioning that some users have (dozens/hundreds?!) parent pipelines. What happens in those cases, does the perf suffer a lot? Otherwise, i think it makes a ton of sense |
@marchello2000 Perf suffers to the point until where MySQL's (or any SQL's) buffer space runs out -- in some pathological community deployments, people are seeing pipeline payloads going as high as 1GB because of the embedded pipeline trigger. When this buffer space runs out, it causes the query to fail and, as a result, anything in Orca failing since it's an irrecoverable error until someone increases the buffer size... which is not always possible. |
This is sensible to me. |
10ca824
to
8dc6210
Compare
8dc6210
to
23b1561
Compare
Changes pipeline triggers to be stored as a reference, rather than as the full execution. This makes individual pipelines smaller to store, trading additional SQL queries to hydrate the full `PipelineExecution`.
23b1561
to
ab5c739
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is now ready for review.
import com.netflix.spinnaker.orca.pipeline.model.support.mapValue | ||
import com.netflix.spinnaker.orca.sql.pipeline.persistence.PipelineRefTrigger | ||
|
||
class PipelineRefTriggerDeserializerSupplier : CustomTriggerDeserializerSupplier { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needed to be made as a CustomTriggerDeserializerSupplier
because TriggerDeserializer
is in orca-core
, whereas PipelineRefTrigger
is an implementation detail of orca-sql
only.
I would encourage the community to not consider use cases like nested pipelines as pathological or abnormal. As Spinnaker adoption grows, especially in the enterprise, you will find usage patterns that are different from yours -- an open source pipelining system needs to be at least somewhat flexible to pipeline patterns that users are naturally likely to gravitate towards. This is an issue that has become existential for us (Salesforce). I'm not sure why the decision was made to store json as a blob in Orca, but this is leading to huge performance issues culminating in failures. |
Since this is reverted back, what is the solution? Is there another PR to address the issue? This is becoming a big issue. Also, somewhere in 1.20, the no of artifacts produced out of a deployment manifest stage has increased many folds. So, that compounds the issue of child contexts in the pipelines. |
Changes pipeline triggers to be stored as a reference, rather than as the full execution. This makes individual pipelines smaller to store, trading additional SQL queries to hydrate the full `PipelineExecution`.
Changes pipeline triggers to be stored as a reference, rather than as the full execution.
This makes individual pipelines smaller to store, trading additional SQL queries to hydrate
the full
PipelineExecution
.