
Remove synchronous postgres_backend #3576

Closed · wants to merge 3 commits

Conversation


@arssher arssher commented Feb 9, 2023

Remove sync postgres_backend in favor of postgres_backend_async; tidy up its split usage.

  • Use postgres_backend_async throughout safekeeper.
  • Use framed.rs in postgres_backend_async, similar to tokio_util::codec::Framed
    but with a slightly more efficient split. IO functions are now also cancellation
    safe, there is no longer an allocation on each message read, and data in
    read messages still points directly into the input buffer without copies.
  • In both safekeeper COPY streams, do read-write from the same thread/task with
    select! for easier error handling (see the sketch below).
  • Tidy up finishing CopyBoth streams in safekeeper sending and receiving WAL:
    join the split parts back, catching errors from them, before returning.
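A minimal sketch of that shape, assuming a generic tokio duplex stream; the loop bodies are hypothetical stand-ins for the actual WAL read/write logic:

```rust
use tokio::io::{AsyncRead, AsyncReadExt, AsyncWrite, AsyncWriteExt};

// Sketch only: drive both directions of a COPY stream from one task.
// select! returns as soon as either branch completes, so the first error
// (or clean EOF) ends the exchange without guessing which half caused it.
async fn copy_both<S>(stream: S) -> std::io::Result<()>
where
    S: AsyncRead + AsyncWrite + Unpin,
{
    let (mut reader, mut writer) = tokio::io::split(stream);
    let mut buf = [0u8; 8192];

    let res = tokio::select! {
        // Read side: consume incoming data until EOF or error.
        r = async {
            loop {
                if reader.read(&mut buf).await? == 0 {
                    return Ok(()); // peer closed the stream
                }
                // ... process the incoming bytes here ...
            }
        } => r,
        // Write side: stand-in for streaming outgoing messages.
        w = async {
            writer.write_all(b"...outgoing message...").await?;
            writer.flush().await
        } => w,
    };

    // Join the halves back so the owned stream is reclaimed and neither
    // half can outlive the handler ("join split parts back").
    let _stream = reader.unsplit(writer);
    res
}
```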

Initially I hoped to do that read-write without split at all, through polling
IO:
#3522
However, that turned out to be more complicated than I initially expected
due to 1) borrow checking and 2) anonymous Future types. 1) required an
Rc<RefCell<...>>, a non-Send construct, just to satisfy the checker; 2) can be
worked around with transmute. But this is so messy that I decided to keep the split.

Proxy's stream.rs is adapted with minimal changes. It also benefits from the
framed.rs improvements described above.

TODO:

  • rebase
  • fix proxy
  • fix unit tests
  • better error logging
  • probably simplify send_wal.rs


arssher commented Feb 9, 2023

There is a TODO list above, but otherwise it is ready for review; main tests pass. CC @petuhovskiy

@arssher arssher marked this pull request as ready for review February 13, 2023 12:33
@arssher arssher requested review from a team as code owners February 13, 2023 12:33
@arssher arssher requested review from petuhovskiy, funbringer and LizardWizzard and removed request for a team February 13, 2023 12:33

@petuhovskiy petuhovskiy left a comment


Looked at everything but safekeeper/src/send_wal.rs; going to take another look tomorrow and probably post more minor comments, but so far it looks good.

Found some minor doc issues which are not worth individual GitHub comments, to avoid polluting the discussion:

  • There were several typos in the comments; I think it should be possible to run some sort of grammar linter to fix them quickly (e.g. Fix typos #1818 (comment))
  • rustdoc comments (///) are missing in several places, with plain // used instead

Also:

  • write_message_noflush -> write_message and write_message -> write_message_flush can be confusing or error-prone; inverting the write_message semantics looks strange (see the sketch below)
  • Calling unwrap() and expect() in functions which return Result<> doesn't feel right IMO (except when checked with is_ok on the previous line), even if we are sure it won't panic normally
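For context on the naming question, a toy sketch of the sync-style convention being referenced (names and signatures are illustrative, not the actual backend API); the pre-existing postgres_backend_async instead had a non-flushing write_message:

```rust
// Illustrative only: in the sync convention, write_message flushes and
// write_message_noflush merely buffers.
struct Backend {
    out: Vec<u8>, // pending output buffer
}

impl Backend {
    /// Serialize the message into the output buffer; no socket IO.
    fn write_message_noflush(&mut self, msg: &[u8]) {
        self.out.extend_from_slice(msg);
    }

    /// Serialize the message, then flush the whole buffer to the socket.
    fn write_message(&mut self, msg: &[u8]) -> std::io::Result<()> {
        self.write_message_noflush(msg);
        self.flush()
    }

    fn flush(&mut self) -> std::io::Result<()> {
        // ... write self.out to the socket here ...
        self.out.clear();
        Ok(())
    }
}
```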


arssher commented Feb 14, 2023

Simplified send_wal.rs by moving to the socket-split-based approach with select!, like receive_wal.rs.


arssher commented Feb 14, 2023

@funbringer Changes to proxy are not too invasive, but what is the basic way of testing this? I replaced the guts of stream.rs and switched mgmt.rs to async.


arssher commented Feb 14, 2023

> There were several typos in the comments,

Ran codespell. Most of its suggestions were s/crate/create, but I fixed the rest :)

> rustdoc comments (///) are missed in several places

Tried to look through most of the files and added some, but point them out if you see more.

> write_message_noflush -> write_message and write_message -> write_message_flush can be confusing or error-prone,

Yeah, I already bumped into this a couple of times. However, postgres_backend_async already has the different meaning, so we break the semantics either way; and I'm much more proficient in safekeeper code than in pageserver, so I adopted what was already there.

> Calling unwrap() and expect() in functions which are returning Result<> doesn't feel right IMO

Removed most of these. A couple are left, but they are mostly independent of this patch and should be fixed separately.

```rust
// sends, so this avoids deadlocks.
let mut pgb_reader = pgb.split().context("START_WAL_PUSH split")?;
let peer_addr = *pgb.get_peer_addr();
let res = tokio::select! {
```
Member:

Maybe use futures::future::join_all here?

Contributor Author (arssher):

No. The very idea is that we stop as soon as either part stops; this makes it easy to retrieve the error (without choosing between two, guessing which was the initial cause) and avoids synchronization.

Contributor:

Looks like try_join_all could be helpful here.

Contributor Author (arssher):

It would work too, but select! here is enough.

Contributor:

I would argue that try_join_all has simpler semantics and thus is a better option here.

Contributor Author (@arssher, Mar 1, 2023):

It just has different semantics. Once either task finishes, whether with Ok or an error, we need to stop polling. So on closer look it doesn't fit here.

Contributor:

This exactly matches the try_join_all doc:

> If any future returns an error then all other futures will be canceled and an error will be returned immediately.

Am I missing something?

Contributor Author (arssher):

We need to stop even if one future returned Ok.
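To make the difference concrete, a toy sketch with stand-in futures (both are hypothetical placeholders for the split halves):

```rust
use futures::future::{self, BoxFuture};

// The "reader" finishes first with Ok (think clean EOF), while the
// "writer" would happily keep running forever.
async fn demo() -> anyhow::Result<()> {
    let reader: BoxFuture<'static, anyhow::Result<()>> =
        Box::pin(future::ready(Ok(())));
    let writer: BoxFuture<'static, anyhow::Result<()>> =
        Box::pin(future::pending());

    // futures::future::try_join_all(vec![reader, writer]).await would hang
    // here: `reader` returning Ok does not cancel `writer`.

    // select! returns as soon as *either* future completes, Ok or Err:
    tokio::select! {
        r = reader => r,
        w = writer => w,
    }
}
```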

@petuhovskiy added the run-benchmarks label on Feb 15, 2023

arssher commented Feb 16, 2023

Also reworked error handling: the double logging of closing connections on safekeepers is now gone, and generally it is more accurate.

In total, this

```
2023-02-17T06:57:50.442504Z  INFO {tid=31}:WAL acceptor{ttid=3dc111e93b5ae3149c10edb96112cefe/234af0b0c4266e22cd1ba9cbb4b6eba9}: start receiving WAL since 0/240F170
2023-02-17T06:57:50.649755Z  INFO {tid=31}: Timeline 3dc111e93b5ae3149c10edb96112cefe/234af0b0c4266e22cd1ba9cbb4b6eba9 query failed with connection error: Socket IO error: walproposer closed the connection
2023-02-17T06:57:50.649800Z  INFO {tid=31}: query handler for 'START_WAL_PUSH' failed with expected io error: walproposer closed the connection
2023-02-17T06:57:50.649912Z ERROR connection handler exited: reader taken
```

became this

```
2023-02-17T06:59:14.906835Z  INFO {tid=27}:WAL receiver{ttid=4e87840cae32c623d40685aab403f19b/8917d69ff42aa53421367a6ef59e1110}: terminated: EOF on COPY stream
```

@arssher force-pushed the asher/sk-async-pg-backend branch 2 times, most recently from 771e372 to 386a6e4 on February 16, 2023

arssher commented Feb 22, 2023

As far as I can see, I have addressed all reviews so far; please have another look.

@LizardWizzard LizardWizzard left a comment


I took another look. I'm still not on board with the changes.

Additionally, this patch contains unrelated changes (renaming functions, a new psql show command). I would recommend splitting them out, because the patch is really big and it's hard to keep track of everything during reviews.

I still think that a custom implementation of Framed that is slightly different from the one in tokio is not what we need here.

The main motivation for the patch seems to be this change in safekeeper. First of all, the PR title and description do not mention this, and the commit messages do not reflect this intention or the other changes included in this patch. As I said in one of the comments, IMO the PR would benefit from being split into independent parts, each with its own motivation. Mixing independent changes leads to extended review time because of the increased patch size, and at some point many discussions start to block code that could be merged without waiting for other conversations to resolve.

Secondly, to my mind a similar result can be achieved without requiring the split machinery (either tokio's or futures'). Both of them seem to have their own cons: allocation of a waker in futures, thread::yield_now() in tokio, and in general the concept of split contains pitfalls exactly around the use case of polling from different tasks (it is mentioned in your comments too). To me a better way to achieve the desired goal would be to change the way these two safekeeper tasks communicate with each other. See https://github.com/neondatabase/neon/pull/3576/files#r1118916294 and #3576 (comment)

Additionally, taking #3667 into account, I'm really not a fan of such a big refactoring without good unit test coverage. This refactoring is an opportunity to move to a better structure for our whole Postgres protocol handling machinery. Since we deal with parsing data that came over the network, it makes sense to add some fuzzing routines. With all the Rust tooling available, it shouldn't be a really big task.

@petuhovskiy

> Additionally, this patch contains unrelated changes (renaming functions)

If you are talking about write_message_noflush -> write_message, it actually was not renamed in this PR; it's a strange-looking diff coming from migrating from postgres_backend to postgres_backend_async. The thing is, before this PR we had a flushing write_message in the sync postgres_backend (alongside write_message_noflush), and a non-flushing write_message in postgres_backend_async.

This PR added write_message_flush to the async version and replaced postgres_backend with postgres_backend_async. That's why it looks like a rename, but in fact we already had write_message() with different meanings between the sync and async versions.

> I would recommend splitting them out because the patch is really big and it's hard to keep track of everything during reviews.

Yes, it's easier to review several small patches than one big one, but it takes more time to prepare several small independent PRs.

> The main motivation for the patch seems to be this change in safekeeper.

If you are talking about `let (msg_tx, msg_rx) = channel(MSG_QUEUE_SIZE);`, we already had similar logic for a long time:

```rust
impl ProposerPollStream {
    fn new(mut r: ReadStream) -> anyhow::Result<Self> {
        let (msg_tx, msg_rx) = channel();
        let read_thread = thread::Builder::new()
            .name("Read WAL thread".into())
            .spawn(move || -> Result<(), QueryError> {
                loop {
                    let copy_data = match FeMessage::read(&mut r)? {
                        Some(FeMessage::CopyData(bytes)) => Ok(bytes),
                        Some(msg) => Err(QueryError::Other(anyhow::anyhow!(
                            "expected `CopyData` message, found {msg:?}"
                        ))),
                        None => Err(QueryError::from(std::io::Error::new(
                            std::io::ErrorKind::ConnectionAborted,
                            "walproposer closed the connection",
                        ))),
                    }?;
                    let msg = ProposerAcceptorMessage::parse(copy_data)?;
                    msg_tx
                        .send(msg)
                        .context("Failed to send the proposer message")?;
                }
                // msg_tx will be dropped here, this will also close msg_rx
            })?;
        Ok(Self {
            msg_rx,
            read_thread: Some(read_thread),
        })
    }

    fn recv_msg(&mut self) -> Result<ProposerAcceptorMessage, QueryError> {
        self.msg_rx.recv().map_err(|_| {
            // return error from the read thread
            let res = match self.read_thread.take() {
                Some(thread) => thread.join(),
                None => return QueryError::Other(anyhow::anyhow!("read thread is gone")),
            };
            match res {
                Ok(Ok(())) => {
                    QueryError::Other(anyhow::anyhow!("unexpected result from read thread"))
                }
                Err(err) => QueryError::Other(anyhow::anyhow!("read thread panicked: {err:?}")),
                Ok(Err(err)) => err,
            }
        })
    }

    fn poll_msg(&mut self) -> Option<ProposerAcceptorMessage> {
        let res = self.msg_rx.try_recv();
        match res {
            Err(_) => None,
            Ok(msg) => Some(msg),
        }
    }
}
```

This PR mostly preserves it, but also adds a buffered channel for responses, which is now easy to do because of async tasks.

I'd say that this PR does mostly what it says, i.e. removes synchronous postgres_backend, adds framed.rs, and fixes CopyBoth errors in the logs. The diff looks large because a lot of code was moved in the process, but the logic stays the same in most places.


LizardWizzard commented Feb 28, 2023

> If you are talking about `let (msg_tx, msg_rx) = channel(MSG_QUEUE_SIZE);`, we already had similar logic for a long time.

I'm talking about the whole idea around split and the way a separate task is used, rather than this exact line.

> Yes, it's easier to review several small patches than one big one, but it takes more time to prepare several small independent PRs.

I think you're trying to optimize the wrong part here: for the person who submits the patch it's faster to prepare a bigger one, but at the end of the day merging smaller patches is faster than going through reviews for one big PR. Additionally, in bigger patches it is easier to miss bugs, just because of the amount of code and the increased number of iterations. So IMO optimizing for review speed, and thus the total amount of time it takes to deliver a feature, is a better strategy. We should optimize for better work as a team, not for individual contributor convenience.

If you are talking about write_message_noflush -> write_message,

Not only this, another example is pg_receivewal support.

Regarding the rename: it looks like more places used the sync semantics, and you suggested renaming them in your first comment:

> write_message_noflush -> write_message and write_message -> write_message_flush can be confusing or error-prone, inverting the write_message semantics looks strange

Which led to an increased diff.

Personally I don't see a reason why this needs to be done in this patch. It can be split out into a separate one that unifies naming between sync and async.

Also, I do think that write_message -> serialize_message could completely sidestep the confusion.

@petuhovskiy

> I'm talking about the whole idea around split and the way a separate task is used, rather than this exact line.

Yes, you are proposing to remove split and write slightly different code, with its own advantages and disadvantages. We can try to do that, but the current version is ported from the sync code that is known to work well.

> I think you're trying to optimize the wrong part here: for the person who submits the patch it's faster to prepare a bigger one, but at the end of the day merging smaller patches is faster than going through reviews for one big PR.

Sure, I forgot to mention that I'm also in favor of submitting smaller PRs; they're much easier to work with. But I'm not sure it's worth splitting a PR that accidentally turned out larger than expected.

> Not only this, another example is pg_receivewal support.

+1, it looks unrelated now. But for me it's a small addition that doesn't make anything worse; I'm fine with either keeping or deleting it.

> Regarding the rename: it looks like more places used the sync semantics, and you suggested renaming them in your first comment:

I don't get it. There was no rename, and I haven't suggested renaming anything. By "can be confusing or error-prone, inverting the write_message semantics looks strange" I meant that there is confusion around this place, which turned out to be very true :)

As I explained above, sync code was migrated to async, and that made it look like a rename, but postgres_backend_async::write_message has the same name before and after this PR. Moreover, if you are going to rename _async::write_message to _async::write_message_noflush, it will require renaming in pageserver code that already uses postgres_backend_async, and that leads to more diff.

@LizardWizzard

> Yes, you are proposing to remove split and write slightly different code,

I propose avoiding the necessity to write extra code. To me the custom framed implementation is completely unnecessary.

> the current version is ported from the sync code that is known to work well.

Code in main doesn't rely on tokio::io::split. I don't see why this is an argument here.

> I haven't suggested to rename anything

> inverting the write_message semantics looks strange

For code that was previously sync it is exactly a rename, and this patch inverts the semantics. Additionally, see #3576 (comment).

> it will require renaming pageserver code that is already using postgres_backend_async and it leads to more diff.

This is why the entire naming conversion thing should be a separate patch.


arssher commented Mar 1, 2023

As mentioned above, I'm going to switch to write_message_noflush/write_message in the whole project. I can extract that into a separate commit; would that help?

> Secondly, to my mind a similar result can be achieved without requiring the split machinery (either tokio's or futures'). Both of them seem to have their own cons: allocation of a waker in futures, thread::yield_now() in tokio, and in general the concept of split contains pitfalls exactly around the use case of polling from different tasks

  • It probably technically can, but it is not obvious at all that the result would be simpler/better.
  • Exactly the same problems arise in the safekeeper walsender; solving this at the pgbackend level avoids doing it twice.
  • The issues you mention about split don't exist if polling happens in the same task, and that's the case here. In fact, there is no real problem to solve in splitting a socket for reading/writing from the same task; all these bilocks/thread sleeps are consequences of fighting with Rust strictness.
  • As everyone here noted, the patch is already big, and currently it at least doesn't change much conceptually how the sk walreceiver/walsender works. Doing that in another way (not with split) is very loosely related work, and as @petuhovskiy said, such proposals come "with its own advantages and disadvantages. We can try to do that, but the current version is ported from the sync code that is known to work well.", which I completely second. Personally I'm fine with how it is done currently.

> Code in main doesn't rely on tokio::io::split.

Of course, because it is sync. It instead impressively relies on our own crate specifically designed for splitting potentially TLS-enabled sync streams.

> Additionally, taking #3667 into account, I'm really not a fan of such a big refactoring without good unit test coverage. This refactoring is an opportunity to move to a better structure for our whole Postgres protocol handling machinery.

I don't really mind adding more tests. However, I don't view that as the highest priority here either; our postgres_backend usage is internal and really limited. It is mostly about the handshake and the simple query protocol, both covered in simple_select.rs. Plus there is the copy stream, but it is simple and gets tested in integration tests. Currently this patch blocks other work, e.g. Arthur's traffic-measuring PR, which is really bad, and it is already large. So if someone wants to go for fuzzing, that's okay, but not in this PR.

@arssher force-pushed the asher/sk-async-pg-backend branch 4 times, most recently from 470a11e to 48bef87 on March 1, 2023

arssher commented Mar 1, 2023

> I'm going to switch to write_message_noflush/write_message in the whole project. I can extract that into a separate commit; would that help?

Did that; it doesn't help much with the diff...
And again, all review comments are addressed so far.

…nc.rs

To make it uniform across the project; proxy stream.rs and the older
postgres_backend use write_message_noflush.

arssher commented Mar 1, 2023

Rebased again.

@LizardWizzard

> Did that; it doesn't help much with the diff...

It helped a bit. There is a separate commit with this change now, so the second commit became a little smaller. If you split out more changes, I believe it will shrink a bit more.

> It probably technically can, but it is not obvious at all that the result would be simpler/better.

I don't understand this as an argument.

> Exactly the same problems arise in the safekeeper walsender; solving this at the pgbackend level avoids doing it twice.

Could you elaborate? I may lack some context.

> consequences of fighting with Rust strictness.

This is only partially true. Rust can be annoying, and sometimes the suitable approach significantly differs from what you imagine ahead of time. As I previously said in a comment, is there a reason why you cannot hold the owned stream without using split and dispatch messages to it?

Rust strictness is about guarantees. If you cannot guarantee that it will be the same task, then you should consider the worst case. Can someone misuse the split abstraction and write code that starts polling it from multiple tasks? Sure. Can we do better? I think so, and a refactoring IMO is the perfect place to explore the opportunity.
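For concreteness, a minimal sketch of the no-split shape suggested here (framing and names are hypothetical): one task owns the stream and multiplexes socket reads with an mpsc channel of outgoing messages.

```rust
use tokio::io::{AsyncRead, AsyncReadExt, AsyncWrite, AsyncWriteExt};
use tokio::sync::mpsc;

// One task owns the whole stream; other tasks queue outgoing messages
// through the channel instead of writing to a split half themselves.
async fn drive<S>(mut stream: S, mut outbound: mpsc::Receiver<Vec<u8>>) -> std::io::Result<()>
where
    S: AsyncRead + AsyncWrite + Unpin,
{
    enum Event {
        Incoming(usize),
        Outgoing(Option<Vec<u8>>),
    }

    let mut buf = [0u8; 8192];
    loop {
        // Race a socket read against the next queued outgoing message;
        // both of these select! branches are cancellation safe.
        let event = tokio::select! {
            n = stream.read(&mut buf) => Event::Incoming(n?),
            msg = outbound.recv() => Event::Outgoing(msg),
        };
        match event {
            Event::Incoming(0) => return Ok(()), // peer closed the connection
            Event::Incoming(_n) => { /* ... parse and dispatch the first _n bytes of buf ... */ }
            Event::Outgoing(Some(msg)) => stream.write_all(&msg).await?,
            Event::Outgoing(None) => return Ok(()), // all senders dropped
        }
    }
}
```

One trade-off of this shape, relevant to the discussion above: reads pause while an outgoing write is in flight, since a single task drives both directions.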

> However, I don't view that as the highest priority here;

OK, but when can this be a priority, if not in a refactoring PR that tries to reduce tech debt?

> but it is simple and gets tested in integration tests

Integration tests have their weaknesses. In the motivation paragraph of #3667 I explain why having unit tests increases developer velocity. This is especially true for small independent components like the pg backend. Is testing Copy so hard that we need to add a follow-up issue for it?

Additionally, being "simple" and "internal" cannot be a reason why this shouldn't be covered.

You changed a whole lot of code here. And as I already mentioned, this PR tries to be a refactoring.

> Doing that in another way (not with split) is very loosely related work

I disagree. The refactoring you did tries hard to mirror the logic from the previous sync version, and a significant part of the code is dedicated to this job. So this is exactly about the code added in this patch, and thus perfectly relevant.

> Of course, because it is sync. It instead impressively relies on our own crate specifically designed for splitting potentially TLS-enabled sync streams.

So we fully rewrite this code only for the logic to remain the same. This refactoring is an opportunity to make it clearer and not have to return to this code. Imagine we merged the PR: the tech debt associated with this is still there, and someone needs to get back to it and rewrite it again. Why do we need to rewrite the same thing twice if we already know which problems exist in the newer version?

> So if someone wants to go for fuzzing, that's okay, but not in this PR.

Sure, I'm fine with it. Please create a corresponding issue.

> Currently this patch blocks other work, e.g. Arthur's traffic-measuring PR, which is really bad

IMO this PR in its current form moves forward with uniting everything under the async umbrella, but it still keeps the tech debt that was there, just in a slightly different form, despite being a significant rewrite of the component. So if your proposal is to leave it here, merge the PR, and be done, I'm not on board with the approach. Sorry.

Maybe you need someone else to cut the knot.


kelvich commented Mar 2, 2023

Some thoughts after reading this thread.

  • Avoiding allocation in the hot code path is IMO a good reason not to use tokio's Framed. In the case of the safekeeper, where most network messages end up being written to disk, it is not that important (as fsync will be orders of magnitude slower anyway). But this parsing code is used in read-only requests (e.g. in pageserver and proxy) as well, so there it matters. I agree that there are good reasons to stick to the standard functions, but here I'm leaning towards the approach proposed in the PR.
  • The current way of socket reading and disk writing in the safekeeper emerged after a big series of tests that took quite a long time, improved safekeeper performance about 5x, and finally surpassed the speed of Postgres replication. A lot of this speed came from emergent phenomena (how many messages pile up in the queue, the average message size, the fsync latency on our hardware, etc.) that contribute to avoiding stalls in the walsender->walreceiver->walwriter pipeline. I think this is context that Arseny and Arthur had and the reviewers did not, hence some amount of misunderstanding (though Arseny explained this in the comments). So +1 to the way it is done in the PR and -1 to the suggested changes until they are extensively perf-tested.
  • I get Arseny's explanation of why try_send will not work without significantly complicating the state machine.
  • I see that the current approach is labeled as tech debt, but I don't see any arguments as to why. Maybe it is, but just by looking at the comments it seems like a matter of personal preference.

So to sum it up, I'm +1 on proceeding with the current PR. I'm also not against trying to avoid sock_split if it simplifies things, but better as a separate PR, to check for any perf impact.


arssher commented Mar 3, 2023

Just a minor correction:

> avoiding allocation in the hot code path is IMO a good reason not to use tokio Framed.

There are two 'allocation during each message processing' issues. The first is in bilock.rs, which is used when you split tokio's Framed. In non-split usage it doesn't exist, so pageserver and proxy would be OK with that. The other is in our message reading code, which affects everything. Switching to framed, whether custom or tokio's, removes it.
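To illustrate the zero-copy point, a minimal sketch of the framed-read idea (the u32 length-prefixed format is a hypothetical stand-in, not the actual Postgres wire protocol or framed.rs):

```rust
use bytes::{Buf, Bytes, BytesMut};
use tokio::io::{AsyncRead, AsyncReadExt};

// All reads land in one reusable BytesMut; each complete message is handed
// out as a cheap Bytes view into that buffer, with no per-message
// allocation or copy.
async fn read_message<R: AsyncRead + Unpin>(
    stream: &mut R,
    buf: &mut BytesMut,
) -> std::io::Result<Option<Bytes>> {
    loop {
        if buf.len() >= 4 {
            let len = u32::from_be_bytes(buf[..4].try_into().unwrap()) as usize;
            if buf.len() >= 4 + len {
                buf.advance(4); // drop the length prefix
                // split_to + freeze reference the same allocation: zero copy.
                return Ok(Some(buf.split_to(len).freeze()));
            }
        }
        // Not enough buffered bytes: read more from the socket.
        if stream.read_buf(buf).await? == 0 {
            return Ok(None); // EOF between messages (mid-message EOF handling elided)
        }
    }
}
```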

@LizardWizzard

@kelvich

> avoiding allocation in the hot code path is IMO a good reason not to use tokio Framed.

As Arseny pointed out, this is not relevant to tokio's Framed. Instead, with the current usage of split there is a "spin lock" with thread::yield_now() on an AtomicBool. See https://github.com/tokio-rs/tokio/blob/3d33610ed2420111982e5a42c764761c9060e6ab/tokio/src/io/split.rs#L147. Do we know for sure that this does not have any negative performance impact?

> allocation in the hot code path is IMO

Do we know for sure that this allocation has an impact on performance? I agree that it is not good if we can easily avoid it. But if it doesn't have a perf impact, then having simpler code, or less of it, looks like a better option to me.

> I think that it is some context that Arseny and Arthur had and reviewers did not, hence some amount of misunderstanding (however Arseny explained this in the comments). So +1 to the way it is done in the PR and -1 to the suggested changes until they are extensively perf-tested.

  1. If the context is only in people's heads, that is also a problem. Important bits should be reflected in documentation, with clear references given.
  2. The suggested approach shouldn't be different from what Arseny described as the intended solution. Regarding the state machine complexity, we (myself and @funbringer) provided a combinator that is abstracted over AsyncRead + AsyncWrite and a protocol implementation in the form of tokio_util::codec::{Encoder, Decoder}; the general shape is sketched below. A standalone example can be found here: https://github.com/LizardWizzard/duplex/blob/master/src/main.rs#L43. I posted it with a description in our Slack channel where we discussed the PR.
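A hedged sketch of that tokio_util shape (the u32 length-prefixed format is a stand-in, not the actual protocol):

```rust
use bytes::{Buf, Bytes, BytesMut};
use tokio_util::codec::Decoder;

// Implement Decoder once; tokio_util::codec::Framed then drives any
// AsyncRead + AsyncWrite, yielding a Stream of decoded frames (and,
// with an Encoder impl, a Sink for outgoing ones).
struct LengthPrefixed;

impl Decoder for LengthPrefixed {
    type Item = Bytes;
    type Error = std::io::Error;

    fn decode(&mut self, src: &mut BytesMut) -> Result<Option<Bytes>, Self::Error> {
        if src.len() < 4 {
            return Ok(None); // need more bytes; Framed will read again
        }
        let len = u32::from_be_bytes(src[..4].try_into().unwrap()) as usize;
        if src.len() < 4 + len {
            src.reserve(4 + len - src.len());
            return Ok(None);
        }
        src.advance(4);
        Ok(Some(src.split_to(len).freeze()))
    }
}

// Usage: tokio_util::codec::Framed::new(stream, LengthPrefixed)
```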

> until they are extensively perf-tested

Do we have a test case in our test suite that can be used for validation? Did we check our usual perf numbers to make sure this patch doesn't cause any degradation?

> I see that the current approach is labeled as a tech debt, but don't see any arguments on why.

IMO the arguments are clear:

  1. The well-known problem with split: TLS streams are not guaranteed to be "split" (full-duplex) safe (tokio-rs/tls#40). There is even a comment in our code that mentions this issue. The answer to it was that we don't do it in multiple tasks, but I don't want to be the person who accidentally spawns a task, wastes time on debugging, and ends up either working around the spawn or with another refactoring PR that removes the dependency on split. If you think I'm too paranoid about that, let's ask somebody else; that may indeed be the case.
  2. IMO reimplementing something that does precisely what tokio's Framed and tokio_util::codec abstractions were designed for is not a good practice.

If these still are not arguments for you, OK. Please note that this PR doesn't have an approval from a storage team member (the pageserver side of things). Codeowner-wise, IMO it is important that each involved team approves a patch that touches core parts of the code base that the team uses to build the components it is responsible for. So IMO it's reasonable to ask someone else; I would suggest Joonas or Christian.

And I haven't seen an approval from @funbringer either.


Anyway, thanks @arssher for addressing all the comments.

@LizardWizzard

Can we close this?


arssher commented Mar 21, 2023

Superseded by #3759

@arssher arssher closed this Mar 21, 2023