@mboutet
Contributor

@mboutet mboutet commented Jul 7, 2021

NOTE: Still a work in progress. Not all tests have been updated yet, so some of them fail.

This PR addresses the performance issues introduced by this other PR when one or more of the following is true:

  • Lots of workers (> 200)
  • Lots of user classes (> 25)
  • High spawn rate (> 100)

I completely refactored the UsersDispatcher. It is almost completely different from before and a lot simpler. I also ditched the distribution.py logic, as I'm now using this great little library, which allows for an nginx-like weighted round-robin dispatch of the users. Ramp-down is also now supported (i.e. stopping users at a given rate).
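For readers unfamiliar with the term, here is a minimal sketch of what "nginx-like" (smooth) weighted round-robin means. It is an illustration only, not the code of the library used in this PR nor of the UsersDispatcher; the function name and the `UserA`/`UserB`/`UserC` class names are hypothetical.

```python
from itertools import islice
from typing import Dict, Iterator


def smooth_weighted_round_robin(weights: Dict[str, int]) -> Iterator[str]:
    """Yield keys forever, in proportion to their weights, interleaved smoothly."""
    if not weights or any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be a non-empty mapping of positive integers")
    total = sum(weights.values())
    current = {name: 0 for name in weights}
    while True:
        # Every candidate gains its own weight on each round.
        for name, weight in weights.items():
            current[name] += weight
        # Pick the highest running score (on a tie, the first inserted key wins).
        chosen = max(current, key=current.get)
        # Penalise the winner by the total weight so the others get their turn.
        current[chosen] -= total
        yield chosen


# Hypothetical user classes with weights 5, 1 and 1.
generator = smooth_weighted_round_robin({"UserA": 5, "UserB": 1, "UserC": 1})
first_seven = list(islice(generator, 7))
# first_seven == ["UserA", "UserA", "UserB", "UserA", "UserC", "UserA", "UserA"]
```

The heavier class is picked more often, but its picks are spread out rather than emitted in one burst, which is the property that makes this scheme attractive for distributing users evenly.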

I've implemented most of the tests to validate this new implementation in test_dispatch.py, but I've not yet updated the runners module.

I'm still missing the logic to ensure that all workers run the expected users prior to beginning a ramp-up/down. I'll implement it in the next few days.

I implemented a small benchmark with the following config:

  • Ramp-up from 0 to 100 000 users
  • 1000 workers
  • 50 user classes with varying weights
  • 5000 spawn rate

Each dispatch iteration takes around 130 ms to compute, which is very good: it's orders of magnitude faster than before. The performance is similar for the inverse scenario, ramping down from 100 000 to 0 users.
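To give a sense of scale for the configuration above, here is a small sketch of the ramp schedule. It assumes each dispatch iteration adds or removes at most `spawn_rate` users; that is an assumption made for illustration, not a description of the dispatcher's internals, and `ramp_targets` is a hypothetical helper.

```python
from typing import List


def ramp_targets(start: int, end: int, spawn_rate: int) -> List[int]:
    """Return the total user count reached after each dispatch iteration."""
    if spawn_rate <= 0:
        raise ValueError("spawn_rate must be positive")
    targets = []
    current = start
    while current != end:
        if current < end:
            # Ramp-up: add users, without overshooting the target.
            current = min(current + spawn_rate, end)
        else:
            # Ramp-down: stop users, without undershooting the target.
            current = max(current - spawn_rate, end)
        targets.append(current)
    return targets


ramp_up = ramp_targets(0, 100_000, 5_000)      # 20 iterations: 5 000, 10 000, ..., 100 000
ramp_down = ramp_targets(100_000, 0, 5_000)    # 20 iterations: 95 000, 90 000, ..., 0
```

Under that assumption, both benchmark scenarios amount to 20 dispatch iterations each.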

@mboutet mboutet marked this pull request as draft July 7, 2021 21:51
def remove_worker(self, worker_node_id: str) -> None:
    self._worker_nodes = [w for w in self._worker_nodes if w.id != worker_node_id]
    if len(self._worker_nodes) == 0:
        # TODO: Test this
Collaborator

There is a test for this now, right? Remove the todo :)

Contributor Author

I don't think there's a test for the zero worker case yet. I'll add one.

client.user_classes_count = {}
if self._users_dispatcher is not None:
    self._users_dispatcher.remove_worker(client.id)
    # TODO: If status is `STATE_RUNNING`, call self.start()
Collaborator

@cyberw cyberw Jul 12, 2021

Let's talk about this after the merge.

@cyberw
Collaborator

cyberw commented Jul 12, 2021

(deleted)

Edit: never mind, both the issues I saw are in 1.6 as well :)

@cyberw cyberw marked this pull request as ready for review July 12, 2021 12:16
@cyberw
Collaborator

cyberw commented Jul 12, 2021

Let's discuss my proposed changes after merging.
