
feat: add 'worker_index' to WorkerRunner #2155

Merged · 3 commits into locustio:master · Aug 12, 2022
Conversation

gdm85 (Contributor) commented Aug 8, 2022

On ACK, the master provides an ordinal identifier for each connected worker; these are deterministic (the same index is assigned to the same client_id).
This implementation is backward-compatible with older master nodes that do not include the index in the ACK message payload.

Fixes #1601

I have attempted to change the documentation to cover this field automatically; please advise if there is a better way.

Thanks!
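
A minimal sketch of the scheme described above, assuming the master keeps a dict keyed by client_id and returns the index in the ACK payload (helper names such as on_client_ready and send_to_client are hypothetical; this is not the actual Locust code):

    # Master side: assign each client_id a stable ordinal index.
    worker_indexes = {}  # client_id -> index
    next_index = 0

    def on_client_ready(client_id, send_to_client):
        global next_index
        if client_id not in worker_indexes:  # deterministic: reconnects keep their index
            worker_indexes[client_id] = next_index
            next_index += 1
        send_to_client(client_id, {"type": "ack", "data": {"worker_index": worker_indexes[client_id]}})

    # Worker side: tolerate older masters whose ACK carries no index in the payload.
    def on_ack(msg, runner):
        data = msg.get("data") or {}
        if "worker_index" in data:
            runner.worker_index = data["worker_index"]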

gdm85 (Contributor, Author) commented Aug 8, 2022

Any idea why the KeyError is happening in the CI check?

cyberw (Collaborator) commented Aug 8, 2022

I think the test is fragile: it assumes that a msg with .data is not an ack.

cyberw (Collaborator) commented Aug 8, 2022

Looks like a good start! Two questions:

Is the lock really needed? There are no I/O calls, so I don't see how another greenlet could start to run.

Do we still need the old ids, or could this replace them? A number of places would be nicer if we used ordinal IDs instead of UUIDs (e.g. the logs).
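
A minimal illustration of the greenlet point: gevent's scheduling is cooperative, so control only switches at blocking/I/O calls, and a pure in-memory update cannot be interleaved (a sketch, not Locust code):

    import gevent

    counter = 0

    def bump():
        global counter
        value = counter      # no yield point between this read...
        counter = value + 1  # ...and this write, so no lock is needed

    gevent.joinall([gevent.spawn(bump) for _ in range(100)])
    assert counter == 100  # always holds under cooperative scheduling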

cyberw (Collaborator) commented Aug 8, 2022

This needs some unit tests.

gdm85 (Contributor, Author) commented Aug 8, 2022

> I think the test is fragile: it assumes that a msg with .data is not an ack.

Ok, will try to fix it.

> Is the lock really needed? There are no I/O calls, so I don't see how another greenlet could start to run.

I assumed so, but since recv_from_client() only happens serially, I will remove it.

> Do we still need the old ids, or could this replace them? A number of places would be nicer if we used ordinal IDs instead of UUIDs (e.g. the logs).

You mean the self.client_id (which is a UUID)? I guess it could be replaced, but better in a separate PR? Also, some people downstream might rely on the hostname being there.

> This needs some unit tests.

Any suggestions? If we have one for ACKs I would modify that one, and perhaps add another for old masters (I already verified manually that new workers can connect to old masters, but a test would be best).

gdm85 force-pushed the master branch 2 times, most recently from cb22a86 to 85ac54e on August 8, 2022 at 21:46
cyberw (Collaborator) commented Aug 8, 2022

> You mean the self.client_id (which is a UUID)? I guess it could be replaced, but better in a separate PR? Also, some people downstream might rely on the hostname being there.

Yes. We could do it later on, but I kind of would like to do it now (rather than having two potentially breaking changes). Could be ok though.

> This needs some unit tests.
> Any suggestions? If we have one for ACKs I would modify that one, and perhaps add another for old masters (I already verified manually that new workers can connect to old masters, but a test would be best).

No suggestions offhand (I'm on mobile atm). No need to validate backward/forward compatibility; it looks reasonable. Just validate that the ID is available and incremented correctly. Check the existing tests for inspiration.
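
For illustration, a framework-free sketch of the property to validate (IndexAssigner is a hypothetical stand-in for the master's bookkeeping; a real test would follow the existing mocked-master tests):

    class IndexAssigner:
        def __init__(self):
            self.indexes = {}    # client_id -> assigned index
            self.next_index = 0

        def index_for(self, client_id):
            if client_id not in self.indexes:
                self.indexes[client_id] = self.next_index
                self.next_index += 1
            return self.indexes[client_id]

    def test_worker_indexes():
        a = IndexAssigner()
        assert a.index_for("worker-A") == 0  # available immediately
        assert a.index_for("worker-B") == 1  # incremented per new worker
        assert a.index_for("worker-A") == 0  # stable across reconnects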

gdm85 (Contributor, Author) commented Aug 8, 2022

> Yes. We could do it later on, but I kind of would like to do it now (rather than having two potentially breaking changes).

The changes introduced in this PR are not breaking, though.
I can make more changes so that client_id is completely replaced and becomes the ordinal int; it will become a bigger PR, though.

> No suggestions offhand (I'm on mobile atm). No need to validate backward/forward compatibility; it looks reasonable. Just validate that the ID is available and incremented correctly. Check the existing tests for inspiration.

Ok, I have added one; thanks!

cyberw (Collaborator) commented Aug 8, 2022

> Yes. We could do it later on, but I kind of would like to do it now (rather than having two potentially breaking changes).
> The changes introduced in this PR are not breaking, though.

Sorry, I was unclear: what I meant was that the second PR would be the breaking one, as it would remove the ordinal_id added here (it would no longer be needed).

I’d like to see it investigated at least. Doing it as a separate PR is fine if it becomes too complicated.

(the important thing is really just not making a proper release before merging the second one)

gdm85 (Contributor, Author) commented Aug 9, 2022

I am going to fix the tests and then add another commit which "promotes" the ordinal ID to the client_uid; note, however, that I think it is beneficial for clients to have a UUID, so that when they reconnect the same ordinal is assigned to them (at least until the master is restarted).

I will try to find the best approach to preserve this functionality.

gdm85 (Contributor, Author) commented Aug 10, 2022

I have been thinking it through; I think removing the UUIDs is a bad idea, because right now they are used for a sort of "weak" handshake: clients report their UUIDs, and the messaging/events are based on that.

I will make a commit that renames client_id to client_uuid (removing the socket hostname from it) and also introduces the new field as worker_index.

gdm85 changed the title from "feat: add 'ordinal_id' to WorkerRunner" to "feat: add 'worker_index' to WorkerRunner" on Aug 10, 2022
docs/api.rst (Outdated)

    @@ -133,7 +133,7 @@ Runner classes
         :members: register_message, send_message

     .. autoclass:: locust.runners.WorkerRunner
    -    :members: register_message, send_message
    +    :members: register_message, send_message, client_uuid, worker_index
gdm85 (Contributor, Author) commented:

Please check whether this is desirable or not.

cyberw (Collaborator) commented Aug 10, 2022

> I have been thinking it through; I think removing the UUIDs is a bad idea, because right now they are used for a sort of "weak" handshake: clients report their UUIDs, and the messaging/events are based on that.
> I will make a commit that renames client_id to client_uuid (removing the socket hostname from it) and also introduces the new field as worker_index.

I see! Given that, I have changed my mind. I think it is preferable to keep client_id exactly as it is (including the hostname), to promote backwards compatibility.

Can you also add an "integration test" (maybe in test_main.py) that checks that it is correct on that side (maybe by printing self.runner.worker_index from inside a task)?

For convenience, how about adding worker_index to LocalRunner and just making it zero?
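
A sketch of such a locustfile-based check; self.environment.runner and worker_index follow the API discussed in this PR, while the class and task names are illustrative:

    from locust import User, task

    class IndexReporter(User):
        @task
        def report_index(self):
            # Prints 0 under LocalRunner; the per-worker ordinal when distributed.
            print(f"worker_index: {self.environment.runner.worker_index}")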

cyberw (Collaborator) commented Aug 10, 2022

MasterRunner.client_ordinals should probably be renamed to worker_indexes instead (just to keep things consistent).

gdm85 force-pushed the master branch 2 times, most recently from 55965d5 to 316d1ff on August 10, 2022 at 18:26
gdm85 (Contributor, Author) commented Aug 10, 2022

> For convenience, how about adding worker_index to LocalRunner and just making it zero?

If there can never be more than 1 LocalRunner, then it is fine; I have changed it.

I have applied the other changes as well.

Do the integration tests run as part of the CI?

Edit: rebased on latest master.

On ACK, the master provides an ordinal identifier to each connected worker;
these are deterministic (same id for the same client id) and guaranteed
not to change for as long as the master is alive.
LocalRunner has its worker_index always set to 0.
cyberw (Collaborator) commented Aug 10, 2022

It looks really good now. Can you adjust test_worker_indexes so that it actually launches two workers?

gdm85 force-pushed the master branch 3 times, most recently from b835f92 to b87ec00 on August 11, 2022 at 00:50
gdm85 (Contributor, Author) commented Aug 11, 2022

@cyberw I have added another worker, but there was no other test using 2 workers, so now I am running into some weird issue. Any idea why the master is returning a non-zero exit code?

cyberw (Collaborator) commented Aug 11, 2022

No particular ideas, but do a

    self.assertIn("Shutting down (exit code 0)", stderr)

first, and you might get a clearer message.

cyberw (Collaborator) commented Aug 12, 2022

I think I fixed the test. Will merge once build clears...

cyberw (Collaborator) commented Aug 12, 2022

Meh. The build doesn't appear to care about my commit. I'll merge and fix it here (upstream) if necessary.

cyberw merged commit a6d5a1e into locustio:master on Aug 12, 2022
gdm85 (Contributor, Author) commented Aug 13, 2022

I didn't have time today; I see you found the mistake. Thank you!

haomingbest commented Aug 30, 2022

I have done this in my project, with some differences:

1. The index lives in the WorkerNode class.
2. The master keeps an index cache; when a worker node leaves, its index goes into the cache and waits to be reused by the next worker node.

    class WorkerIndexPool:
        # Editorial wrapper: the original snippet showed only the two methods,
        # which live on the master in the commenter's project.
        def __init__(self):
            self.index_sequence = 0  # next never-used index
            self.index_cache = []    # indexes released by departed workers

        def get_index(self):
            if len(self.index_cache) == 0:
                ret = self.index_sequence
                self.index_sequence += 1
            else:
                # Reuse the smallest released index first.
                self.index_cache.sort(reverse=True)
                ret = self.index_cache.pop()
            return ret

        def release_index(self, index):
            if index not in self.index_cache:
                self.index_cache.append(index)
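
Hypothetical usage of the pool above, showing the recycling behaviour:

    pool = WorkerIndexPool()
    assert [pool.get_index() for _ in range(3)] == [0, 1, 2]
    pool.release_index(1)         # worker 1 leaves
    assert pool.get_index() == 1  # its index is reused before a new one (3) is minted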
