
Performance benchmarking for Concourse releases #3816

Open
ddadlani opened this issue May 3, 2019 · 3 comments

Comments

@ddadlani (Contributor) commented May 3, 2019

As Concourse devs, we would like to know the impact that a particular release has on the overall performance of Concourse. @jama22 pointed out that, since a lot of the runtime backlog revolves around stability, benchmarking would help us understand whether the changes made in a particular release improve stability or have unforeseen consequences for resiliency or efficiency.

Here are some of the benchmarking metrics we would like to measure against a predefined workload (a rough sampling sketch follows the lists below):

Worker:

  1. CPU load
  2. Memory usage
  3. Disk utilization
  4. System load average
  5. Min, max and average number of containers
  6. Min, max and average number of volumes
  7. Network I/O
  8. Standard deviation of the above values among all workers (esp. containers and volumes)

Web:

  1. HTTP response duration
  2. Number of DB connections
  3. Time taken to schedule a build
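
For illustration, the worker-side numbers could be sampled with standard Linux tools plus the fly CLI. This is a minimal sketch, not a settled collection approach; the fly target name `ci` and the filesystem paths are assumptions:

```sh
#!/bin/sh
# One sampling pass on a worker VM; run it periodically (e.g. via cron) to build a time series.
date -u +%FT%TZ
cat /proc/loadavg                   # system load average
free -m                             # memory usage
df -h /                             # disk utilization (point this at the worker's work-dir mount)
cat /proc/net/dev                   # network I/O counters
# Container and volume counts as seen by the web node (assumes fly prints a one-line header row):
fly -t ci containers | tail -n +2 | wc -l
fly -t ci volumes    | tail -n +2 | wc -l
```

Aggregating these samples over a benchmark run would give the min/max/average figures above, and grouping the fly output by worker would give the per-worker spread.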

This issue is to discuss and track work related to building out a benchmarking tool or process.

@ddadlani (Contributor, Author) commented Jun 11, 2019

We plan to use concourse/concourse/drills to create a performance environment. We can use stress (https://linux.die.net/man/1/stress) to simulate high load for CPU, memory, and disk I/O.

For containers, volumes, and network I/O, we need to design a pipeline with several heavy get steps.
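
For illustration, a minimal pipeline sketch along those lines; the resource, image, and stress parameters below are assumptions rather than a settled design:

```yaml
resources:
- name: large-image
  type: registry-image
  source: {repository: ubuntu}

jobs:
- name: heavy-load
  plan:
  # parallel gets to exercise containers, volumes and network I/O
  - in_parallel:
    - get: image-copy-1
      resource: large-image
      params: {format: rootfs}
    - get: image-copy-2
      resource: large-image
      params: {format: rootfs}
  # stress(1) to drive CPU, memory and disk load on the worker
  - task: stress
    config:
      platform: linux
      image_resource:
        type: registry-image
        source: {repository: progrium/stress}  # assumed image that bundles stress(1)
      run:
        path: stress
        args: ["--cpu", "4", "--vm", "2", "--vm-bytes", "256M", "--hdd", "1", "--timeout", "120s"]
```

Widening the in_parallel fan-out, or setting several copies of such a pipeline, is one way to push container and volume counts toward whatever the predefined workload calls for.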

@pivotal-jamie-klassen (Contributor) commented Jun 12, 2019

Bumping @jchesterpivotal's wise observation that stddev loses its predictive power in the absence of normally-distributed data: #2874 (comment)

@jchesterpivotal (Contributor) commented Jun 12, 2019

I'd also add Brendan Gregg's book, Systems Performance: Enterprise and the Cloud, to a reading list. It's very good at breaking down the way different components and systems can manifest performance problems.

@ddadlani added this to Backlog in Worker Resiliency via automation on Aug 8, 2019
@ddadlani removed this from Icebox in Runtime on Aug 8, 2019