
Refactor handshake #77

Open
jens-diewald wants to merge 12 commits into master
Conversation

jens-diewald (Contributor):

This is intended to simplify the handshake code.
It should simplify reusing much of the handshake code in order to resolve #75 in a future PR.

I also changed a few small things I came across while working on this. All should be documented within the commit messages.
I hope I did not miss any subtleties as to why async_handshake was a coroutine; I could not think of a good reason with the current code.

Feedback would be much appreciated :)

Or rather move it to the end where it is called at most once.
This resolves the TODO on sspi_handshake::state
and is intended to simplify the code to allow for further refactoring.

There is a small change in behavior:
As a server, all communication with the client is now done before manual_auth.
Before, manual_auth was called as early as possible and
possibly more data was sent through the stream after that.
What the comment stated regarding the AcceptSecurityContext documentation
holds just as much for InitializeSecurityContext.
Again this is to simplify the code.
I do not currently see a reason for the function to be a coroutine.

Also add a comment about posting self.complete on the first call.
    }
    // If this is the first call to this function, it would cause the completion handler
    // (invoked by self.complete()) to be executed on the wrong executor.
    // Ensure that doesn't happen by posting the completion handler instead of calling it directly.
jens-diewald (Contributor Author):

@laudrup, I have to admit I do not fully understand this. I wondered why the code was there and found an early commit where you added it. This comment is based on your commit message from then.
Can you elaborate on this a bit further? What happens if "the wrong executor" is used?

jens-diewald (Contributor Author), Feb 16, 2023:

I guess the executor we want is the one associated with our stream, and it seems that async_compose does not know about this executor on construction. If we are not in the first call of this function, the executor should be correctly selected by the next_layer_.async_read_some or net::async_write operations.
What I do not get:

  1. If the first call to the function uses the wrong executor, why is it ever executed in the first place?
  2. Why is self.get_executor() the correct executor that we use to post the call to self? Should that not still be the wrong executor?
  3. Is there no way to tell async_compose to use the correct executor upon construction?

jens-diewald (Contributor Author), Feb 16, 2023:

Okay, I think:

  1. The first call to the function is actually made synchronously, when async_compose is called.
  2. This is most likely wrong and should be next_layer.get_executor().
  3. I cannot find any hints towards this anywhere. I feel this would still be better than 2., though.

I will try to come up with a test case for this and possibly fix it, either in this PR or in a new one.
If something actually has to be fixed, the same applies to async_read and async_shutdown, I suppose.

klemens-morgenstern:

There has been a semantic change in Boost 1.81 (see https://cppalliance.org/boost-release/2022/11/16/KlemensBoost181.html).

Basically, you get two executors involved in every op: the one of the io_object (which you can get via self.get_io_executor()) and the one of the completion token.

If you do a composed op, the first step will get invoked from the initiating function, i.e. synchronously, i.e. before the initiating function returns.

That is, when you call async_handshake and you call self.complete directly, you have an immediate completion. For a plain callback that might be OK, but in general it isn't. That's especially true for read or write functions, e.g.

    stream.async_read([]{ socket.async_read(...); });

This recursion can now potentially be infinite if the stream is in error. Note also that it is not only a problem of stack overflow: some completion tokens cannot handle recursive completion at all, such as use_awaitable (i.e. coroutines).

So, we are in a situation where we initiate the composed op but already know in the initiation how to complete, e.g. we have an error. Then we post to avoid the recursion. Before Boost 1.81 you would post to the executor of the completion, BUT that is peculiar. Say you have a socket working on executor1 and the completion on executor2. Normally, any op on the socket would only complete if executor1 is running, except in some error scenarios. This is weird and inconsistent, so we post to executor1 (the io executor) and complete on executor2.
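The inline-vs-posted completion hazard described above can be modeled without Asio. The sketch below uses a stdlib-only event queue; all names (toy_executor, run_inline, run_posted) are invented for illustration and are not part of Asio or this library. It records the order of events when a handler completes synchronously during initiation versus when it is posted:

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>

// Toy single-threaded executor: a FIFO queue of posted work items.
struct toy_executor {
  std::deque<std::function<void()>> work;
  void post(std::function<void()> f) { work.push_back(std::move(f)); }
  void run() {
    while (!work.empty()) {
      auto f = std::move(work.front());
      work.pop_front();
      f();
    }
  }
};

// Immediate completion: the handler runs inside the initiating call,
// i.e. before initiation returns -- the re-entrancy hazard above.
std::string run_inline() {
  std::string log;
  auto handler = [&] { log += "handler;"; };
  handler();                      // complete synchronously
  log += "initiation-returned;";
  return log;                     // handler ran first
}

// Deferred completion: the handler is posted, so it cannot run until the
// executor is pumped; the initiating call always returns first.
std::string run_posted() {
  std::string log;
  toy_executor ex;
  auto handler = [&] { log += "handler;"; };
  ex.post(handler);               // complete "as if by post"
  log += "initiation-returned;";
  ex.run();                       // handler runs here, not earlier
  return log;                     // initiation returned first
}
```

A handler that re-initiates the same operation would recurse unboundedly in the inline variant, while the posted variant simply enqueues the next step.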

klemens-morgenstern:

It's important to note that this is not wrong:

    template<typename Self>
    void operator()(Self& self, boost::system::error_code ec = {}, std::size_t length = 0) {
      if (ec) {
        self.complete(ec);

because ec will never be set when initiating the op. So you'll only re-enter here with an ec set after an op, which we can safely assume completed "as if by post".

jens-diewald (Contributor Author):

@klemens-morgenstern, thanks a lot for this excellent explanation of the problem and the hint regarding the change in Boost 1.81!
I now understand the problem we are trying to work around here and what should be done.
Yet, I am still struggling with the interface of the composed_op and what executor we get from self.get_executor() or the new self.get_io_executor(). (Is there any documentation available on this?)
From what you explained above, I gather self.get_executor() will return the executor of the completion handler? But what if the completion handler is a simple callback?
Also, in our case, the io_object would be the stream, right? I do not see how the composed_op would know about the executor associated with the stream in the first step. Am I missing something, or do we have to tell the composed op explicitly what we consider to be the io object?

klemens-morgenstern:

When you call async_compose you pass in a set of objects:

    async_compose(implementation, token, objs...);

The objs at the end will be used to determine the executors; if the token has none, it will be the same as the io-executor. If you don't specify any objects, the system_executor will be used (which you'll never want).
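The fallback chain just described can be sketched as a stdlib-only toy. pick_executor is an invented name; real Asio resolves this through the token's associated executor, not a function like this. The sketch only models the selection rule: token's executor if it has one, else the io object's, else the system executor.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Toy model of the executor-selection fallback: each argument is the name
// of an executor, or empty if none is associated.
std::string pick_executor(std::optional<std::string> token_executor,
                          std::optional<std::string> io_executor) {
  if (token_executor) return *token_executor;  // token has its own executor
  if (io_executor) return *io_executor;        // else: the io object's executor
  return "system_executor";                    // no objects passed at all
}
```

For example, pick_executor(std::nullopt, "io_ex") yields "io_ex", mirroring the case where only the io object is passed to async_compose.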

jens-diewald (Contributor Author), Feb 18, 2023:

I see!
So in the wintls::stream class we should do

    auto async_handshake(xxx) { return asio::async_compose<xxx>(async_handshake{xxx}, handler, *this); }

(or equivalently next_layer_ instead of *this) instead of the current

    auto async_handshake(xxx) { return asio::async_compose<xxx>(async_handshake{xxx}, handler); }

Also, if I get everything correctly, we could post to next_layer_.get_executor() in the first call, as this should do the same thing as posting to self.get_io_executor() with Boost 1.81?

laudrup (Owner) commented Feb 14, 2023:

@jens-diewald First of all thanks a lot for this PR. I really appreciate it.

Regarding the implementation of the handshake as a coroutine I had a lot of help from the nice guys from boost::beast, namely @vinniefalco and @madmongo1.

To be honest I can't remember all the details but I would rather avoid removing the coroutine part without understanding the full implications of that. We definitely don't want the async parts of the code to be blocking which is why a coroutine was needed in the first place as far as I remember.

Maybe we can get @vinniefalco and/or @madmongo1 to help a bit with this?

I would definitely appreciate this being simplified, so I hope this can be merged in one way or another.

Thanks once again.

Change the code such that posting self is only needed once.
Also add a comment explaining why this is necessary.
jens-diewald (Contributor Author):

@laudrup Whether async_handshake is implemented as a coroutine or not should not by itself be a problem with regard to asynchronicity. (Of course I could have gotten something wrong.)
Anyhow, I thought about this some more and found that if we remove the state member and enum, we can actually make use of the function being a coroutine. I have reverted the original change and added a new commit removing the state and making use of the coroutine instead. Maybe this is how it was originally intended?

I have then made another commit re-adding the comment regarding posting self on the first call to the function, plus a small simplification so that the respective code is needed only once.
I have also thought about this some more, but I am afraid I still do not really get it. I will reply to my comment on the code to elaborate on my understanding so far.

      }
    }

    // If this is the first call to this function, it would cause the completion handler
    // (invoked by self.complete()) to be executed on the wrong executor.
    // Ensure that doesn't happen by posting the completion handler instead of calling it directly.
    if (!is_continuation()) {
      BOOST_ASIO_CORO_YIELD {
        auto e = self.get_executor();
        net::post(e, [self = std::move(self), ec, length]() mutable { self(ec, length); });

klemens-morgenstern:

Suggested change:

    - net::post(e, [self = std::move(self), ec, length]() mutable { self(ec, length); });
    + net::post(self.get_io_executor(), asio::append(std::move(self), ec, length));

This is bad because you lose the associated executor, allocator & cancellation slot. Use append instead.

jens-diewald (Contributor Author):

Thanks for the hint!
asio::append is available since Boost 1.80, and from the blog post you linked above, I gather that self.get_io_executor() is available since Boost 1.81.
Currently wintls also tests against older Boost versions, and I think it is reasonable to support older versions for now, to keep the potential testing audience larger.
Hm, so maybe we should add a distinction based on BOOST_VERSION for now? @laudrup?
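A BOOST_VERSION distinction like the one proposed could look roughly as follows. This is a hedged sketch, not the PR's actual code: BOOST_VERSION is normally defined by `<boost/version.hpp>` (1.81.0 encodes as 108100), and it is defined by hand here only so the dispatch is testable without Boost installed. post_self_strategy is an invented name, and the returned strings merely label which completion strategy each branch would use.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for <boost/version.hpp>; remove this block when Boost is present.
#ifndef BOOST_VERSION
#define BOOST_VERSION 108100
#endif

const char* post_self_strategy() {
#if BOOST_VERSION >= 108100
  // 1.81+: post to the io executor, forward the args with asio::append
  return "post(self.get_io_executor(), asio::append(self, ...))";
#elif BOOST_VERSION >= 108000
  // 1.80: asio::append exists, but self.get_io_executor() does not yet
  return "post(self.get_executor(), asio::append(self, ...))";
#else
  // pre-1.80: capture self in a lambda and forward the arguments manually
  return "post(self.get_executor(), [self = std::move(self), ...]{ ... })";
#endif
}
```

With BOOST_VERSION defined as 108100, the first branch is selected; against an older Boost the preprocessor picks the matching fallback at compile time.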

@@ -40,94 +39,54 @@ struct async_handshake : boost::asio::coroutine {
return entry_count_ > 1;


Isn't this supposed to return void?

jens-diewald (Contributor Author):

This is a small lambda that returns bool.
Although I do not know why this is a lambda and not a simple bool variable. Technically, even the entry_count_ member could be a bool, I suppose.
Then again, looking into the source of asio::detail::composed_op, it seems that it keeps track of its invocations itself and that we could access that via asio::asio_handler_is_continuation(&self). Is that how it should be done?
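The entry_count_ bookkeeping discussed here can be modeled in isolation. toy_op and is_continuation_after are invented names for this sketch; only entry_count_ and is_continuation mirror the PR's actual members. The point is that only the very first invocation (the synchronous one made from async_compose) reports "not a continuation" and therefore needs to post instead of completing inline:

```cpp
#include <cassert>

// Toy composed op tracking its own invocation count.
struct toy_op {
  int entry_count_ = 0;
  bool is_continuation() const { return entry_count_ > 1; }
  // Every invocation bumps the count and reports whether this call is a
  // continuation of an earlier async step.
  bool operator()() {
    ++entry_count_;
    return is_continuation();
  }
};

// Invoke a fresh op `calls` times; report whether the last call was a
// continuation.
bool is_continuation_after(int calls) {
  toy_op op;
  bool last = false;
  for (int i = 0; i < calls; ++i) last = op();
  return last;
}
```

Since the state is just "first call or not", a bool would indeed suffice, as the comment above suggests.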

jens-diewald (Contributor Author) commented Mar 3, 2023:

I have moved the self-posting logic into a separate file, added a case distinction on BOOST_VERSION and added some comments.
This is somewhat over the top, but I could not think of another reasonable way to document what is going on and how it should be done better in newer Boost versions.
For the case distinction I added Boost 1.80 and Boost 1.81 to the CI matrix.

@klemens-morgenstern, if you find the time to look at this and possibly confirm that it makes sense to you, I would appreciate that a lot.

@laudrup, sorry that this went somewhat off-topic from my originally intended changes. The self-posting stuff would have deserved a separate issue. From my side, the PR could be merged like this now. (Assuming the tests go through and there are no other objections.)

laudrup (Owner) commented Mar 7, 2023:

@jens-diewald So sorry for the late reply and thanks once again for your pull request.

I would definitely like this to work with Boost versions earlier than the latest, though a somewhat recent version could be required. Assuming there are tests ensuring it works on different versions, then adding them to the CI is definitely the right thing to do. Thanks for that.

@klemens-morgenstern Thanks a lot for your input here. It seems like you're a lot more experienced with the details of boost::asio than I am so if you think these changes make sense then I'll feel fairly confident merging them.

So if we get the tests to pass and @klemens-morgenstern approves then so will I, although I'll probably look through the code again a final time before merging.

Thanks a lot once again.

This failed once on the GitHub CI. Assuming the failure was due to
high load on the test server, this increases the timeout so that
hopefully it will not fail again in the future.
laudrup (Owner) commented Nov 12, 2023:

Hi @jens-diewald,

So sorry for not having looked at this for ages. I haven't really looked at this library for quite some time.

Is this still something you'd like me to merge at some point?

As you might have noticed, I've updated the pipeline to test newer Boost versions and made a new release, so it might be a good time for me to revisit this.

The main issue I have with the change is that I would very much like someone else to have a look at it as well. If you're still interested I'll see if I can ping some relevant people.

Thanks a lot.

jens-diewald (Contributor Author):

Hi @laudrup,

Yes, I think this pull request does improve the code.
I worked on this in preparation for #78, but all the changes are independent of that issue.
A second opinion regarding the post_self part would be nice.

laudrup (Owner) commented Nov 20, 2023:

@madmongo1 has said he'll have a look at this when time permits.

Hope we can get this merged.

Successfully merging this pull request may close these issues.

Stream Shutdown does not produce Error on incomplete protocol shutdown