atc: behaviour: emit tasks waiting prometheus metric #5448

Merged 54 commits on May 12, 2020

Commits on May 1, 2020

  1. add configuration for NewRelic insights url

    This configuration allows the use of an EU-based NewRelic account.
    
    Signed-off-by: Syamala Umamaheswaran <shykart2203@gmail.com>
    shyamz-22 committed May 1, 2020
    cbed34e
  2. add release notes

    Signed-off-by: Syamala Umamaheswaran <shykart2203@gmail.com>
    shyamz-22 committed May 1, 2020
    9197cd2
  3. fix test by using correct assertion

    NewRelicEmitter URL is the complete URL, not the base URL as configured in the NewRelicConfig
    
    Signed-off-by: Syamala Umamaheswaran <shykart2203@gmail.com>
    shyamz-22 committed May 1, 2020
    002d8a3
  4. Merge release notes with upstream changes

    Signed-off-by: Syamala Umamaheswaran <shykart2203@gmail.com>
    shyamz-22 committed May 1, 2020
    511bcb5

Commits on May 2, 2020

  1. Merge changes from upstream branch

    Add NewRelicEmitter changes to release notes
    
    Signed-off-by: Syamala Umamaheswaran <shykart2203@gmail.com>
    shyamz-22 committed May 2, 2020
    97e925b

Commits on May 4, 2020

  1. atc: handle groups claims properly

    * NOTE: we don't log failed type conversions. This is probably something
    we want to start doing once we allow users to bypass dex. For now it's
    probably ok to fail silently because our token structure will be
    consistent for the time being.
    
    https://github.com/concourse/concourse/projects/49#card-31357785
    https://github.com/concourse/concourse/projects/49#card-36796693
    
    Co-authored-by: Ciro S. Costa <cscosta@pivotal.io>
    Signed-off-by: Josh Winters <jwinters@pivotal.io>
    Josh Winters and Ciro S. Costa committed May 4, 2020
    013cfb3

Commits on May 5, 2020

  1. atc: behaviour: emit tasks waiting prometheus metric

    Using the limit-active-tasks placement strategy gives the perfect opportunity
    to dynamically scale workers based on tasks running. This metric makes
    it easier for operators to implement such scaling.
    
    #5057
    
    Co-authored-by: Torben Neufeldt <torben.neufeldt@oss.volkswagen.com>
    Signed-off-by: Hannes Hasselbring <hannes@oss.volkswagen.com>
    2 people authored and shyamz-22 committed May 5, 2020
    b647326
  2. Add periodic test for tasks waiting metrics

    Co-authored-by: Torben Neufeldt <torben.neufeldt@oss.volkswagen.com>
    Co-authored-by: Syamala Umamaheswaran <shyam@oss.volkswagen.com>
    Signed-off-by: Hannes Hasselbring <hannes@oss.volkswagen.com>
    3 people committed May 5, 2020
    3e74ba7
  3. update latest.md

    Co-authored-by: Syamala Umamaheswaran <shyam@oss.volkswagen.com>
    Signed-off-by: Hannes Hasselbring <hannes@oss.volkswagen.com>
    Hannes Hasselbring and shyamz-22 committed May 5, 2020
    947c351
  4. add client test to check if metrics are gauged properly

    Co-authored-by: Hannes Hasselbring <hannes@oss.volkswagen.com>
    Signed-off-by: Syamala Umamaheswaran <shyam@oss.volkswagen.com>
    shyamz-22 and Hannes Hasselbring committed May 5, 2020
    35f6115
  5. add test to check if task waiting metric is exposed by PrometheusEmitter

    - To avoid test pollution all previously registered collectors are unregistered in
      garbage collection test
    
    Co-authored-by: Hannes Hasselbring <hannes@oss.volkswagen.com>
    Signed-off-by: Syamala Umamaheswaran <shyam@oss.volkswagen.com>
    shyamz-22 and Hannes Hasselbring committed May 5, 2020
    256dc8a
  6. test if task lock is released in RunTaskStep when waiting for worker to be available
    
    - Make metrics test more stable
    
    Co-authored-by: Hannes Hasselbring <hannes@oss.volkswagen.com>
    Signed-off-by: Syamala Umamaheswaran <shyam@oss.volkswagen.com>
    shyamz-22 and Hannes Hasselbring committed May 5, 2020
    e0ec268
  7. make test less flaky

    - introduce wait group so we know the context is cancelled before running the tests
    
    Signed-off-by: Syamala Umamaheswaran <shyam@oss.volkswagen.com>
    shyamz-22 committed May 5, 2020
    82a042d
  8. [#5082] Close ssh conn on worker when keepalive fails

    - Close sshClient Conn on keepalive -> SendRequest operation error
    - Add 5 second timeout for keepalive -> SendRequest operation
    
    Signed-off-by: Sameer Vohra <vohra.sam@gmail.com>
    xtreme-sameer-vohra committed May 5, 2020
    008ec24
  9. [#5082] Refactor keepalive into a separate func

    - pull out keepalive from tsa client
    - add test coverage for new behaviour
    
    Signed-off-by: Sameer Vohra <vohra.sam@gmail.com>
    xtreme-sameer-vohra committed May 5, 2020
    f82ed96
  10. [#5082] Increase tsa/client keepalive timeout

    - Changed the timeout from 5s to 5m as 5s might be too aggressive for
      remote workers that don't have reliable connections.
    
      This timeout is to detect the condition where we observed SendRequest
      to hang indefinitely, which can still be reliably detected with a
      timeout of 5m.
    
    Signed-off-by: Sameer Vohra <vohra.sam@gmail.com>
    xtreme-sameer-vohra committed May 5, 2020
    d70ca26
  11. [#5082] add release note

    Signed-off-by: Sameer Vohra <vohra.sam@gmail.com>
    xtreme-sameer-vohra committed May 5, 2020
    858edcf

Commits on May 6, 2020

  1. a41fb70
  2. simplify url function

    NewRelicConfig has a default value for URL and is never empty
    
    Signed-off-by: Syamala Umamaheswaran <shykart2203@gmail.com>
    shyamz-22 committed May 6, 2020
    2b635f2
  3. Merge pull request #5532 from shyamz-22/master

    add configuration for NewRelic insights url
    xtreme-sameer-vohra committed May 6, 2020
    94ed811

Commits on May 7, 2020

  1. ad94403
  2. contributing: add instructions for local k8s

    despite us specifying how one would go about configuring the environment
    variables for the k8s-topgun tests, we never mentioned how one can go
    about getting the kubernetes cluster configured.
    
    `kind` is great as it assumes very little - all you need is docker - and
    extending the default configuration is also easy - just supply a
    piece of yaml (which is well documented on their website).
    
    Signed-off-by: Ciro S. Costa <cscosta@pivotal.io>
    Co-authored-by: Taylor Silva <tsilva@pivotal.io>
    Ciro S. Costa and taylorsilva committed May 7, 2020
    31b4e9e
  3. contributing: instructions for -nodes=N

    k8s-topgun tests are much better run concurrently rather than serially
    given how slow the initialization times are compared to unit tests.
    
    Signed-off-by: Ciro S. Costa <cscosta@pivotal.io>
    Ciro S. Costa committed May 7, 2020
    206c85d
  4. df98844
  5. Merge pull request #5323 from concourse/5082-workers-stall

    Close sshClient Conn on SendRequest error
    xtreme-sameer-vohra committed May 7, 2020
    3df973f
  6. atc/db: less cryptic error message

    Noticed this error message myself and it wasn't clear to me what was
    wrong. Someone on discord also mentioned that the error message was not
    helpful with their troubleshooting.
    
    This error message should guide the user to look at the likely source of
    the error: the parent resource type's config.
    
    Signed-off-by: Taylor Silva <tsilva@pivotal.io>
    taylorsilva committed May 7, 2020
    5d30607
  7. Update atc/db/check_factory.go

    Co-authored-by: Rui Yang <ryang@pivotal.io>
    Signed-off-by: Taylor Silva <tsilva@pivotal.io>
    taylorsilva and Rui Yang committed May 7, 2020
    62dec49
  8. Update testflight/custom_resource_check_test.go

    Co-authored-by: Rui Yang <ryang@pivotal.io>
    Signed-off-by: Taylor Silva <tsilva@pivotal.io>
    taylorsilva and Rui Yang committed May 7, 2020
    7c5c115
  9. Update testflight/resource_type_test.go

    Co-authored-by: Rui Yang <ryang@pivotal.io>
    Signed-off-by: Taylor Silva <tsilva@pivotal.io>
    taylorsilva and Rui Yang committed May 7, 2020
    41c013c
  10. atc: log db transaction query

    when --log-db-queries is set to true
    
    Signed-off-by: Rui Yang <ryang@pivotal.io>
    Rui Yang committed May 7, 2020
    d961aa7
  11. Merge pull request #5518 from concourse/install-tiller-contributing

    contributing: add tiller install instructions
    Ciro S. Costa committed May 7, 2020
    41c9523
  12. Make k8s-topgun produce less output

    k8s-topgun has a lot of build output whenever any tests fail. Most of
    the output is coming from our repeated calls to `kubectl get pods` as we
    wait for pods to become ready for testing.
    
    This commit removes our usage of kubectl for this one function in favour
    of directly using k8s client-go package. Now this insane amount of json
    output will not be present.
    
    Leaving other usages of kubectl alone for now like `kubectl port-forward`.
    
    Signed-off-by: Taylor Silva <tsilva@pivotal.io>
    taylorsilva committed May 7, 2020
    3b07fdc
  13. oops: remove a focus

    Signed-off-by: Taylor Silva <tsilva@pivotal.io>
    taylorsilva committed May 7, 2020
    22c8f0a
  14. topgun: prometheus: update deps and disable worker

    the prometheus chart was recently updated to make
    `kube-state-metrics` a subchart rather than embedding it.
    
    because of that, we need to first fetch its dependencies, and then later
    on, install it.
    
    here I also got rid of the Concourse worker that was being deployed -
    this is possible because nowadays we offer the ability to do so
    (something you couldn't before), and because the worker is unnecessary
    for this case.
    
    Signed-off-by: Ciro S. Costa <cscosta@pivotal.io>
    Ciro S. Costa committed May 7, 2020
    ba65bca
  15. Merge pull request #5546 from concourse/less-cryptic-err

    atc/db: less cryptic error message
    taylorsilva committed May 7, 2020
    bc2ff0c

Commits on May 8, 2020

  1. 9ffa9e2
  2. Run go mod tidy

    Signed-off-by: Taylor Silva <tsilva@pivotal.io>
    taylorsilva committed May 8, 2020
    53ee9e6
  3. Reduce the frequency of call to kubectl logs

    We should switch this over to using the API directly too, as it can
    also make the build logs very noisy.
    
    Signed-off-by: Taylor Silva <tsilva@pivotal.io>
    taylorsilva committed May 8, 2020
    0e06ef7
  4. Move kubeClient setup to BeforeEach

    When using -nodes, everything in SynchronizedBeforeSuite is only run on
    one node, so the other nodes don't have a pointer to kubeClient: it
    was never set on those nodes!
    
    Putting it in the suite's BeforeEach() ensures each node initializes the
    kubeClient.
    
    Signed-off-by: Taylor Silva <tsilva@pivotal.io>
    taylorsilva committed May 8, 2020
    8f150bd
  5. Remove call to kubectl logs

    Replaced with directly calling the API for the logs of each pod
    
    Signed-off-by: Taylor Silva <tsilva@pivotal.io>
    taylorsilva committed May 8, 2020
    44cea28

Commits on May 9, 2020

  1. Merge pull request #5520 from concourse/log-tx-query

    Rui Yang committed May 9, 2020
    e9de452

Commits on May 11, 2020

  1. Merge pull request #5567 from concourse/make-k8s-topgun-less-noisy

    Make k8s-topgun produce less output
    taylorsilva committed May 11, 2020
    3abdcd3
  2. add config file for PullRequest bot automation

    Signed-off-by: James Thomson <jthomson@pivotal.io>
    Rui Yang authored and James Thomson committed May 11, 2020
    8e9f120
  3. Merge pull request #5568 from concourse/k8s-topgun-prometheus-fix

    k8s/topgun: update prometheus dependencies and disable worker
    Ciro S. Costa committed May 11, 2020
    313b9c6
  4. 3ac5db5

Commits on May 12, 2020

  1. atc: behaviour: emit tasks waiting prometheus metric

    Using the limit-active-tasks placement strategy gives the perfect opportunity
    to dynamically scale workers based on tasks running. This metric makes
    it easier for operators to implement such scaling.
    
    #5057
    
    Co-authored-by: Torben Neufeldt <torben.neufeldt@oss.volkswagen.com>
    Signed-off-by: Hannes Hasselbring <hannes@oss.volkswagen.com>
    2 people authored and shyamz-22 committed May 12, 2020
    5f3215b
  2. Add periodic test for tasks waiting metrics

    Co-authored-by: Torben Neufeldt <torben.neufeldt@oss.volkswagen.com>
    Co-authored-by: Syamala Umamaheswaran <shyam@oss.volkswagen.com>
    Signed-off-by: Hannes Hasselbring <hannes@oss.volkswagen.com>
    3 people committed May 12, 2020
    ce8d014
  3. update latest.md

    Co-authored-by: Syamala Umamaheswaran <shyam@oss.volkswagen.com>
    Signed-off-by: Hannes Hasselbring <hannes@oss.volkswagen.com>
    Hannes Hasselbring and shyamz-22 committed May 12, 2020
    e5b2275
  4. add client test to check if metrics are gauged properly

    Co-authored-by: Hannes Hasselbring <hannes@oss.volkswagen.com>
    Signed-off-by: Syamala Umamaheswaran <shyam@oss.volkswagen.com>
    shyamz-22 and Hannes Hasselbring committed May 12, 2020
    c4bfdcd
  5. add test to check if task waiting metric is exposed by PrometheusEmitter

    - To avoid test pollution all previously registered collectors are unregistered in
      garbage collection test
    
    Co-authored-by: Hannes Hasselbring <hannes@oss.volkswagen.com>
    Signed-off-by: Syamala Umamaheswaran <shyam@oss.volkswagen.com>
    shyamz-22 and Hannes Hasselbring committed May 12, 2020
    08a4485
  6. test if task lock is released in RunTaskStep when waiting for worker to be available
    
    - Make metrics test more stable
    
    Co-authored-by: Hannes Hasselbring <hannes@oss.volkswagen.com>
    Signed-off-by: Syamala Umamaheswaran <shyam@oss.volkswagen.com>
    shyamz-22 and Hannes Hasselbring committed May 12, 2020
    b267a1f
  7. make test less flaky

    - introduce wait group so we know the context is cancelled before running the tests
    
    Signed-off-by: Syamala Umamaheswaran <shyam@oss.volkswagen.com>
    shyamz-22 committed May 12, 2020
    07e1b4a
  8. Refactor choose task worker to make it more testable

    - Externalize worker polling and status configuration
      to have more stable tests
    
    Signed-off-by: Syamala Umamaheswaran <shyam@oss.volkswagen.com>
    shyamz-22 committed May 12, 2020
    11a0216
  9. fix merge conflicts with latest.md

    Signed-off-by: Syamala Umamaheswaran <shyam@oss.volkswagen.com>
    shyamz-22 committed May 12, 2020
    920eeb5