atc: behaviour: emit tasks waiting prometheus metric #5448
Conversation
Initially I had some concerns about using a gauge (the number of tasks currently waiting, which can increase and decrease) instead of a counter (the number of attempts to schedule a task container which have failed because all workers were at capacity), because I thought there could be some misbehaviour when multiple web nodes were in play. So I tested it out manually.
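For context, the two shapes look like this with the standard prometheus/client_golang API (a minimal sketch; the metric names here are illustrative, not necessarily the ones this PR registers):

import "github.com/prometheus/client_golang/prometheus"

// Gauge: the number of tasks currently waiting; it goes up and down.
var tasksWaiting = prometheus.NewGauge(prometheus.GaugeOpts{
    Name: "concourse_tasks_waiting", // illustrative name
    Help: "Number of tasks currently waiting for a worker.",
})

// Counter: cumulative failed placement attempts; it only goes up.
var placementFailures = prometheus.NewCounter(prometheus.CounterOpts{
    Name: "concourse_task_placement_failures_total", // illustrative name
    Help: "Scheduling attempts that found all workers at capacity.",
})

func init() {
    prometheus.MustRegister(tasksWaiting, placementFailures)
}

A task bumps the gauge with tasksWaiting.Inc() when it starts waiting and calls tasksWaiting.Dec() once it is placed, whereas the counter would simply be incremented once per failed attempt.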
diff --git a/docker-compose.yml b/docker-compose.yml
index 5618630a7..aeac1ed14 100644
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -11,7 +11,7 @@ services:
POSTGRES_USER: dev
POSTGRES_PASSWORD: dev
- web:
+ web-1:
build: .
image: concourse/concourse:local
volumes:
@@ -20,6 +20,7 @@ services:
depends_on: [db]
ports:
- 8080:8080
+ - 9100:9100
environment:
CONCOURSE_LOG_LEVEL: debug
CONCOURSE_POSTGRES_HOST: db
@@ -32,20 +33,53 @@ services:
CONCOURSE_CLUSTER_NAME: dev
CONCOURSE_CLIENT_SECRET: Y29uY291cnNlLXdlYgo=
CONCOURSE_TSA_CLIENT_SECRET: Y29uY291cnNlLXdvcmtlcgo=
+ CONCOURSE_CONTAINER_PLACEMENT_STRATEGY: limit-active-tasks
+ CONCOURSE_MAX_ACTIVE_TASKS_PER_WORKER: 1
+ CONCOURSE_PROMETHEUS_BIND_IP: "0.0.0.0"
+ CONCOURSE_PROMETHEUS_BIND_PORT: "9100"
+ CONCOURSE_PEER_ADDRESS: web-1
+
+ web-2:
+ build: .
+ image: concourse/concourse:local
+ volumes:
+ - .:/src
+ command: web
+ depends_on: [db]
+ ports:
+ - 8081:8080
+ - 9101:9100
+ environment:
+ CONCOURSE_LOG_LEVEL: debug
+ CONCOURSE_POSTGRES_HOST: db
+ CONCOURSE_POSTGRES_USER: dev
+ CONCOURSE_POSTGRES_PASSWORD: dev
+ CONCOURSE_POSTGRES_DATABASE: concourse
+ CONCOURSE_EXTERNAL_URL: http://localhost:8081
+ CONCOURSE_ADD_LOCAL_USER: test:test,guest:guest
+ CONCOURSE_MAIN_TEAM_LOCAL_USER: test
+ CONCOURSE_CLUSTER_NAME: dev
+ CONCOURSE_CLIENT_SECRET: Y29uY291cnNlLXdlYgo=
+ CONCOURSE_TSA_CLIENT_SECRET: Y29uY291cnNlLXdvcmtlcgo=
+ CONCOURSE_CONTAINER_PLACEMENT_STRATEGY: limit-active-tasks
+ CONCOURSE_MAX_ACTIVE_TASKS_PER_WORKER: 1
+ CONCOURSE_PROMETHEUS_BIND_IP: "0.0.0.0"
+ CONCOURSE_PROMETHEUS_BIND_PORT: "9100"
+ CONCOURSE_PEER_ADDRESS: web-2
worker:
build: .
image: concourse/concourse:local
command: worker
privileged: true
- depends_on: [web]
+ depends_on: [web-1,web-2]
ports:
- 7777:7777
- 7788:7788
stop_signal: SIGUSR2
environment:
CONCOURSE_LOG_LEVEL: debug
- CONCOURSE_TSA_HOST: web:2222
+ CONCOURSE_TSA_HOST: web-1:2222
# avoid using loopbacks
CONCOURSE_BAGGAGECLAIM_DRIVER: overlay
diff --git a/hack/overrides/prometheus.yml b/hack/overrides/prometheus.yml
index bd4ceb2ae..71f2dc762 100644
--- a/hack/overrides/prometheus.yml
+++ b/hack/overrides/prometheus.yml
@@ -8,13 +8,6 @@
version: '3'
services:
- web:
- environment:
- CONCOURSE_PROMETHEUS_BIND_IP: "0.0.0.0"
- CONCOURSE_PROMETHEUS_BIND_PORT: "9100"
- ports:
- - '9100:9100'
-
prometheus:
image: prom/prometheus
entrypoint:
@@ -30,7 +23,8 @@ services:
- job_name: 'concourse'
static_configs:
- targets:
- - 'web:9100'
+ - 'web-1:9100'
+ - 'web-2:9100'
" > config.yml
exec prometheus \
And what was striking was that it totally worked! I guess the logic that acquires a lock when attempting to schedule a container actually synchronizes through the database -- I was concerned we might "over-report" the number of waiting tasks since two different web nodes might be attempting to start the same task.
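As a rough illustration of why that works: a lock taken through Postgres serializes across processes, so two web nodes contending for the same task resolve the race in the database rather than in memory. A generic sketch of the advisory-lock pattern (not necessarily Concourse's actual lock implementation; db and lockID are stand-ins):

// only one of the competing web nodes gets the lock for a given ID
var acquired bool
err := db.QueryRow(`SELECT pg_try_advisory_lock($1)`, lockID).Scan(&acquired)
if err != nil || !acquired {
    return // another node is already scheduling this task
}
defer db.Exec(`SELECT pg_advisory_unlock($1)`, lockID)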
I ran

for i in {1..10}; do
    fly -t web2 execute -c web/wats/fixtures/smoke-task.yml -i some-input=web/wats/fixtures/some-input &
    fly -t dev execute -c web/wats/fixtures/smoke-task.yml -i some-input=web/wats/fixtures/some-input &
done

where dev was localhost:8080 and web2 was localhost:8081. And what I saw in prometheus was quite promising!

Each web node schedules its own containers. Now I need to attempt a similar exercise using job builds instead of one-offs (fly execute).
This continues to resist my attempts to break it - similar good behaviour when I trigger a bunch of job builds.
Before I get into it, the implementation itself looks pretty much gold - very idiomatic, small diff, no big WTFs, and it works! Very well done.
I'd love to see some documentation for these metrics, maybe around here (source) or here (source) -- would you mind submitting patches for those?
In terms of testing, I think you could certainly add unit tests for a few things.
- you're periodically forwarding the state of a global gauge to the emitter - there are tests for this behaviour in https://github.com/concourse/concourse/blob/master/atc/metric/periodic_test.go (full disclosure, I have an open PR that adds tests here, too!); a sketch of such a gauge follows this list
- you're configuring an entry to be exposed by the prometheus endpoint - there are tests for this sort of thing in https://github.com/concourse/concourse/blob/master/atc/metric/emitter/prometheus_test.go
- something I find very interesting and wasn't obvious to me until I extensively tested manually was the concurrency-safety of this approach. A test around the worker client might have made this super clear! Around https://github.com/concourse/concourse/blob/master/atc/worker/client_test.go#L659-L667 maybe.
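For reference, a concurrency-safe, max-tracking gauge of the kind being forwarded could look roughly like this (inferred from the Inc/Dec/Max usage in the tests discussed below, not the actual atc/metric implementation):

import "sync"

// Gauge tracks a current value plus the peak observed since the last read,
// so a periodic emitter doesn't miss short-lived spikes between scrapes.
type Gauge struct {
    mu  sync.Mutex
    cur float64
    max float64
}

func (g *Gauge) Inc() {
    g.mu.Lock()
    defer g.mu.Unlock()
    g.cur++
    if g.cur > g.max {
        g.max = g.cur
    }
}

func (g *Gauge) Dec() {
    g.mu.Lock()
    defer g.mu.Unlock()
    g.cur--
}

// Max returns the peak since the last call and resets it to the current value.
func (g *Gauge) Max() float64 {
    g.mu.Lock()
    defer g.mu.Unlock()
    m := g.max
    g.max = g.cur
    return m
}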
Finally, there's a merge conflict in the release notes - probably my bad for taking so long to get to this review. I was crunching on releasing 5.5.11, sorry! Anyway, I'm sure a simple rebase would take care of it.
This configuration allows using an EU-based NewRelic account. Signed-off-by: Syamala Umamaheswaran <shykart2203@gmail.com>
The NewRelicEmitter URL is the complete URL, not the base URL configured in the NewRelicConfig. Signed-off-by: Syamala Umamaheswaran <shykart2203@gmail.com>
Add NewRelicEmitter changes to release notes Signed-off-by: Syamala Umamaheswaran <shykart2203@gmail.com>
Hi @pivotal-jamie-klassen
* NOTE: we don't log failed type conversions. This is probably something we want to start doing once we allow users to bypass dex. For now it's probably ok to fail silently because our token structure will be consistent for the time being. https://github.com/concourse/concourse/projects/49#card-31357785 https://github.com/concourse/concourse/projects/49#card-36796693 Co-authored-by: Ciro S. Costa <cscosta@pivotal.io> Signed-off-by: Josh Winters <jwinters@pivotal.io>
Using the limit-active-tasks placement strategy gives the perfect opportunity to dynamically scale workers based on tasks running. This metric makes it easier for operators to implement such a scaling. #5057 Co-authored-by: Torben Neufeldt <torben.neufeldt@oss.volkswagen.com> Signed-off-by: Hannes Hasselbring <hannes@oss.volkswagen.com>
Co-authored-by: Torben Neufeldt <torben.neufeldt@oss.volkswagen.com> Co-authored-by: Syamala Umamaheswaran <shyam@oss.volkswagen.com> Signed-off-by: Hannes Hasselbring <hannes@oss.volkswagen.com>
Co-authored-by: Syamala Umamaheswaran <shyam@oss.volkswagen.com> Signed-off-by: Hannes Hasselbring <hannes@oss.volkswagen.com>
Co-authored-by: Hannes Hasselbring <hannes@oss.volkswagen.com> Signed-off-by: Syamala Umamaheswaran <shyam@oss.volkswagen.com>
- To avoid test pollution, all previously registered collectors are unregistered in the garbage collection test Co-authored-by: Hannes Hasselbring <hannes@oss.volkswagen.com> Signed-off-by: Syamala Umamaheswaran <shyam@oss.volkswagen.com>
…to be available - Make metrics test more stable Co-authored-by: Hannes Hasselbring <hannes@oss.volkswagen.com> Signed-off-by: Syamala Umamaheswaran <shyam@oss.volkswagen.com>
- introduce wait group so we know the context is cancelled before running the tests Signed-off-by: Syamala Umamaheswaran <shyam@oss.volkswagen.com>
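That wait-group pattern is roughly the following (illustrative, not the exact test code):

var wg sync.WaitGroup
wg.Add(1)
go func() {
    defer wg.Done()
    // ... wait for the condition under test, then cancel ...
    cancel()
}()
wg.Wait() // past this point the context is guaranteed to be cancelled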
@pivotal-jamie-klassen we have added tests as mentioned in the last review. However, our test that tests
- Close sshClient Conn on keepalive -> SendRequest operation error - Add 5 second timeout for keepalive -> SendRequest operation Signed-off-by: Sameer Vohra <vohra.sam@gmail.com>
- pull out keepalive from tsa client - add test coverage for new behaviour Signed-off-by: Sameer Vohra <vohra.sam@gmail.com>
- Changed the timeout from 5s to 5m, as 5s might be too aggressive for remote workers that don't have reliable connections. This timeout is to detect the condition where we observed SendRequest to hang indefinitely, which can still be reliably detected with a timeout of 5m. Signed-off-by: Sameer Vohra <vohra.sam@gmail.com>
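Since x/crypto/ssh's SendRequest takes no context, bounding it looks roughly like the usual channel-plus-timer pattern (a sketch with illustrative names, not the actual tsa code):

errs := make(chan error, 1)
go func() {
    // we only care whether the keepalive request returns at all
    _, _, err := sshClient.SendRequest("keepalive", true, nil)
    errs <- err
}()
select {
case err := <-errs:
    if err != nil {
        sshClient.Close() // close the Conn on keepalive error
    }
case <-time.After(5 * time.Minute):
    sshClient.Close() // SendRequest hung; treat the connection as dead
}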
tl;dr - making this properly testable may require some dependency-breaking refactors, or maybe a more sophisticated stub on the

@shyamz-22 I don't have time to go deeply into a solution right now, but I will come back to this; I just want to bring the challenge to the surface. The full scenario we're trying to simulate is this:

For the record, I was able to get the test to consistently pass by writing it like this:

Context("when waiting for worker to be available", func() {
    BeforeEach(func() {
        // simulate no worker ever having capacity
        fakePool.FindOrChooseWorkerForContainerReturns(nil, nil)

        // in the background, wait for the tasks-waiting gauge to reach 1,
        // then cancel the context so the blocked call under test returns
        go func() {
            Eventually(metric.TasksWaiting.Max, 6*time.Second).Should(Equal(float64(1)))
            cancel()
        }()
    })

    It("metric task waiting is gauged and lock is released", func() {
        Expect(metric.TasksWaiting.Max()).To(Equal(float64(0)))
        Expect(fakeLock.ReleaseCallCount()).To(Equal(fakeLockFactory.AcquireCallCount()))
    })
})

However, I strongly recommend against keeping this approach, because instead of failing it just hangs indefinitely. But it serves as an interesting illustration of how to orchestrate the concurrent processes to achieve the simulation described.
@pivotal-jamie-klassen thanks for the reply. We were of the same opinion as you, and we also had an idea for a refactor to make this behaviour easier to test. Could we explore this refactor, or would you suggest a lead on how we can proceed? Thanks
Signed-off-by: Taylor Silva <tsilva@pivotal.io>
We should switch this over to using the API directly as well, since it can make the build logs very noisy. Signed-off-by: Taylor Silva <tsilva@pivotal.io>
When using -node, everything in SynchronizedBeforeSuite is only run on one node, so the other nodes don't have a pointer to kubeClient; it was never set on those nodes! Putting it in the suite's BeforeEach() ensures each node initializes the kubeClient. Signed-off-by: Taylor Silva <tsilva@pivotal.io>
Replaced with directly calling the API for the logs of each pod Signed-off-by: Taylor Silva <tsilva@pivotal.io>
@pivotal-jamie-klassen Thank you so much for the feedback. I have made some changes here. Please consider this as a draft. Changes:

Some differences between the implementation and the draft here: previously, the order of execution was

However, steps 1 and 2 are needed only when a worker is found, and there are no tests proving that the order matters. The current refactor translates roughly to the following (the release of the lock is omitted; you can find it in the code):
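Sketched with stand-in names (pool, lockFactory, IncreaseActiveTasks, and the polling interval are illustrative, not the actual Concourse identifiers or signatures):

func chooseTaskWorker(ctx context.Context) (worker.Worker, error) {
    for {
        w, err := pool.FindOrChooseWorkerForContainer(ctx)
        if err != nil {
            return nil, err
        }
        if w != nil {
            // steps 1 and 2 happen only once a worker is actually found
            lock, acquired, err := lockFactory.Acquire(activeTasksLockID)
            if err != nil || !acquired {
                continue // another node won the race; choose again
            }
            w.IncreaseActiveTasks()
            lock.Release()
            return w, nil
        }
        // no worker has capacity: gauge the wait, poll, try again
        metric.TasksWaiting.Inc()
        select {
        case <-ctx.Done():
            metric.TasksWaiting.Dec()
            return nil, ctx.Err()
        case <-time.After(workerAvailabilityPollingInterval):
            metric.TasksWaiting.Dec()
        }
    }
}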
Observations:
➜ concourse git:(draft-waiting-tasks-refactor) gocyclo atc/worker/client.go
16 worker (*client).RunTaskStep atc/worker/client.go:249:1
12 worker (*client).chooseTaskWorker atc/worker/client.go:557:1
Please let me know how to proceed further.
Make k8s-topgun produce less output
Signed-off-by: James Thomson <jthomson@pivotal.io>
k8s/topgun: update prometheus dependencies and disable worker
@shyamz-22 I quite like the reorganizing of that file, as well as the various

I had one concern, and this mostly shows the limited degree to which I actually understand the affected components of Concourse. By acquiring the lock after choosing the worker, I can imagine this reducing the thread-safety of the worker-choosing routine. Since the worker-choosing part is no longer synchronized, can multiple web nodes race each other and do anything weird? I think the answer is no, because the only code path that actually calls

Since you are making the retry and reporting intervals configurable, there are some jobs in our pipeline that will complain unless the appropriate configuration is exposed via the helm chart and bosh release. So it would be helpful to make PRs to those repos as well, though I can help with that.

Otherwise, the configuration doesn't actually need to be passed all the way to the operator. One could still externalize the intervals from the method under test but have them always be constant in production. I suggest considering this because it seems like the kind of configuration an operator would have no idea how or why to tune -- indeed I can't think of why I would ever change that variable! Maybe you can?
My points of confusion in achieving this are:

This is why I decided to pass it as a configuration. I cannot think of any other way, as my understanding of all the components and code conventions of Concourse is not strong.
@shyamz-22 sorry, I was waving my hands there. What I meant was that instead of adding these parameters, you could make a change around here, adding some lines defining constants:

var workerAvailabilityPollingInterval = 5 * time.Second
var workerStatusPublishInterval = 1 * time.Minute

and then the code here could instead look like

workerClient := worker.NewClient(pool, workerProvider, compressionLib, workerAvailabilityPollingInterval, workerStatusPublishInterval)

and similarly here. Does this make sense? I'm also open to making those things operator-configurable if you think there's a precedent, I just can't think of any.
@pivotal-jamie-klassen I also cannot find a valid reason why an operator would need to configure the polling interval. I will do as you suggested by adding defaults and merge the changes. Hopefully this closes the PR 🤞 😄
- Externalize worker polling and status configuration to have more stable tests Signed-off-by: Syamala Umamaheswaran <shyam@oss.volkswagen.com>
Thank you for being so responsive and thorough, and really taking the time for craftsmanship. It makes a huge difference!
@pivotal-jamie-klassen thank you too for providing enough information in each review, really appreciate it :)
Existing Issue
Fixes #5057
Why do we need this PR?
Using the limit-active-tasks placement strategy gives the perfect opportunity
to dynamically scale workers based on tasks running. This metric makes
it easier for operators to implement such a scaling.
Changes proposed in this pull request
- Emit a tasks waiting metric when using the limit-active-tasks placement strategy

Contributor Checklist
Not sure if we need tests for Metrics. Looking through the code we didn't find any tests for metrics. Should we add some?
Reviewer Checklist
New config flags added? Ensure that they are added to the BOSH and Helm packaging; otherwise, ignored for the integration tests (for example, if they are Garden configs that are not displayed in the --help text).