refactor: Use polling model for workflow phase metric #4557

Merged
merged 7 commits into from Nov 24, 2020

Conversation

simster7
Member

Fixes: #4551

Signed-off-by: Simon Behar simbeh7@gmail.com

Signed-off-by: Simon Behar <simbeh7@gmail.com>
Comment on lines -57 to -61
const enoughTimeForInformerSync = 1 * time.Second

const semaphoreConfigIndexName = "bySemaphoreConfigMap"

Member Author

Some pork-barrel style changes in this PR

go wfc.metrics.RunServer(ctx)
go wait.Until(wfc.syncWorkflowPhaseMetrics, 5*time.Second, ctx.Done())
Member Author

I think 5 seconds is a good balance here

Contributor

image

So real-time for Prometheus means up to 15s old. Plus whatever delay the app has.

Every 15s would mean Prometheus would be up to 30s out of date. @jessesuen I'd like to do as little polling as possible.

Signed-off-by: Simon Behar <simbeh7@gmail.com>
@simster7 simster7 marked this pull request as ready for review November 19, 2020 19:45
@alexec alexec added this to the v2.12 milestone Nov 23, 2020
Contributor

@alexec alexec left a comment

Without @jessesuen to discuss, can we set the polling interval to 1m? I'd rather have lower system load than up-to-date values. I will approve that.

@simster7
Member Author

Without @jessesuen to discuss, can we set the polling interval to 1m? I'd rather have lower system load than up-to-date values. I will approve that.

I'm okay waiting for @jessesuen to discuss. In my opinion 1m is too high, especially since the load is minimal since we read from the informer.

@alexec
Contributor

alexec commented Nov 24, 2020

since the load is minimal since we read from the informer

Good point.

This is a blocking issue for v2.12, so I suggest we set it to 15s as a compromise and then discuss with @jesse next week.

@simster7 simster7 merged commit 4531d79 into argoproj:master Nov 24, 2020
alexcapras pushed a commit to alexcapras/argo that referenced this pull request Dec 2, 2020
Signed-off-by: github@finnesand.no <github@finnesand.no>

feat(ui): Add Template/Cron workflow filter to workflow page. Closes argoproj#4532 (argoproj#4543)

Signed-off-by: Tianchu Zhao <evantczhao@gmail.com>

feat(executor): Auto create s3 bucket if not present.

Signed-off-by: Alex Capras <alexcapras@gmail.com>

Apply codegen

Signed-off-by: Alex Capras <alexcapras@gmail.com>

Add argo-e2e label to test wf

Signed-off-by: Alex Capras <alexcapras@gmail.com>

chore: Updated stress test YAML (argoproj#4569)

Signed-off-by: Alex Collins <alex_collins@intuit.com>

docs: Updated kubectl apply command in manifests README (argoproj#4577)

Signed-off-by: Stefan Gloutnikov <stefan@gloutnikov.com>

feat(controller): Make MAX_OPERATION_TIME configurable. Close argoproj#4239 (argoproj#4562)

Signed-off-by: Alex Collins <alex_collins@intuit.com>

docs: Fix a typo in example (argoproj#4590)

Signed-off-by: Takayoshi Nishida <takayoshi.nishida@gmail.com>

feat(controller): Retry transient offload errors. Resolves argoproj#4464 (argoproj#4482)

Signed-off-by: Alex Collins <alex_collins@intuit.com>

fix(server): use the correct name when downloading artifacts (argoproj#4579)

Signed-off-by: Daniel Herman <dherman@factset.com>

fix(server): serve artifacts directly from disk to support large artifacts (argoproj#4589)

Signed-off-by: Daniel Herman <dherman@factset.com>

fix(executor): Handle sidecar killing in a process-namespace-shared pod (argoproj#4575)

Signed-off-by: Daisuke Taniwaki <daisuketaniwaki@gmail.com>

docs: Add JSON schema for IDE validation (argoproj#4581)

Signed-off-by: Paul Brabban <paul.brabban@gmail.com>

refactor: Use polling model for workflow phase metric (argoproj#4557)

Signed-off-by: Simon Behar <simbeh7@gmail.com>

Addressing reviewers comments

Signed-off-by: Alex Capras <alexcapras@gmail.com>

Addressing reviewers comments

docs: Minor typo fix (argoproj#4610)

Signed-off-by: Paavo Pokkinen <paavo.pokkinen@vaimo.com>

fix(controller): Prevent tasks with names starting with digit to use either 'depends' or 'dependencies' (argoproj#4598)

Signed-off-by: terrytangyuan <terrytangyuan@gmail.com>

fix(docs): Bring minio chart instructions up to date (argoproj#4586)

Signed-off-by: Ranga Krishnan <ranga@bei.re>

fix(executor): Fixed waitMainContainerStart returning prematurely. Closes argoproj#4599 (argoproj#4601)

Signed-off-by: fsiegmund <siegmund@slb.com>

feat(controller): Enhanced artifact repository ref. See argoproj#3184 (argoproj#4458)

Signed-off-by: Alex Collins <alex_collins@intuit.com>

fix: Null check pagination variable (argoproj#4617)

Signed-off-by: Simon Behar <simbeh7@gmail.com>

fix: Perform fields filtering server side (argoproj#4595)

Signed-off-by: Simon Behar <simbeh7@gmail.com>

fix(server): Correct webhook event payload marshalling. Fixes argoproj#4572 (argoproj#4594)

Signed-off-by: Alex Collins <alex_collins@intuit.com>

feat(ui): Add columns--narrower-height to AttributeRow (argoproj#4371)

fix: Fix TestCleanFieldsExclude (argoproj#4625)

Signed-off-by: Simon Behar <simbeh7@gmail.com>

fix(argo-server): fix global variable validation error with reversed dag.tasks (argoproj#4369)

Signed-off-by: chenyu.zheng <chenyu.zheng@hulu.com>

fix: derive jsonschema and fix up issues, validate examples dir… (argoproj#4611)

Signed-off-by: Paul Brabban <paul.brabban@gmail.com>

fix(ui): Reference secrets in EnvVars. Fixes argoproj#3973  (argoproj#4419)

Signed-off-by: Alejandro Tejera <aletepe@gmail.com>

fix(ui): Fix Snyk issues (argoproj#4631)

Signed-off-by: Alex Collins <alex_collins@intuit.com>

feat(executor): More informative log when executors do not support output param from base image layer (argoproj#4620)

Signed-off-by: terrytangyuan <terrytangyuan@gmail.com>

Codegen patch. Signed off by alexcapras@gmail.com

Codegen patch. Signed off by alexcapras@gmail.com

Delete test.patch

Signed-off-by: Alex Capras <alexcapras@gmail.com>
alexec pushed a commit that referenced this pull request Dec 3, 2020
Signed-off-by: Simon Behar <simbeh7@gmail.com>
@alexec
Copy link
Contributor

alexec commented Dec 3, 2020

Successfully merging this pull request may close these issues.

workflow metrics is not correct in v2.12.0-rc2 and latest master branch
2 participants