Skip to content

Implement the PostgreSQL Streaming Replication Protocol#3281

Merged
Brar merged 56 commits intonpgsql:mainfrom
Brar:ReplicationEvolution2
Nov 8, 2020
Merged

Implement the PostgreSQL Streaming Replication Protocol#3281
Brar merged 56 commits intonpgsql:mainfrom
Brar:ReplicationEvolution2

Conversation

@Brar
Copy link
Copy Markdown
Member

@Brar Brar commented Oct 29, 2020

Closes #1520

Brar added 3 commits October 28, 2020 18:01
This type was added in PostgreSQL 9.4 and we didn't handle it yet.
While a type handler and a specific type isnt't strictly necessary for
the implementation of replication in Npgsql it will probably be useful
once people start using the replication feature and want to query their
database regading replication related information like
pg_replication_slots.
There was a very old ToDo-comment in the code because TimestampHandler
and TimestampTzHandler were passing values through NpgsqlDateTime in
order to convert them from/to the backend representation.
For Replication we need to convert timestamps in a completely separate
part of the codebase so we change it now and benefit from it multiple
times.
@Brar Brar requested review from YohDeadfall and roji as code owners October 29, 2020 01:31
@Brar
Copy link
Copy Markdown
Member Author

Brar commented Oct 29, 2020

@ Reviewers: Please don't bother investigating failing tests for now. I know about their flakiness and I expect to remove/disable most of them in the course of the review process.
For now they serve as documentation on how things are supposed to work and as regression tests to avoid accidentally breaking something when I integrate requested changes from your reviews.

Also I'll add one additional commit to stop the timers when cancelling (it was on my todo-list but somehow I forgot it last night).

Copy link
Copy Markdown
Member

@roji roji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a first partial round - I've gone through about half the files (I think), and it looks great. I've made lots of comments, but marked certain ones as IMPORTANT - anything without that can probably be ignored for now (up to you).

You should probably add yourself to CODEOWNERS for the appropriate paths 🤣

@roji roji mentioned this pull request Nov 8, 2020
4 tasks
@roji
Copy link
Copy Markdown
Member

roji commented Nov 8, 2020

From my side, we're very close to merging this for 5.0. Some notes:

Copy link
Copy Markdown
Member

@roji roji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really, really great to finally see this go in - and I'm very happy with the end results. Thanks for all the great work @Brar.

@Brar Brar merged commit 5a3b020 into npgsql:main Nov 8, 2020
@jberzy
Copy link
Copy Markdown

jberzy commented Nov 8, 2020

This is fantastic! Thanks @Brar and @roji

Comment on lines +592 to +605
async ValueTask<string> ReadLong(string s)
{
var builder = new StringBuilder(s);
bool complete;
do
{
await ReadMore(async, cancellationToken);
complete = ReadFromBuffer(out s);
builder.Append(s);
}
while (!complete);

return builder.ToString();
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a bit late, but I think this code will fail if the split is in the middle of a unicode character that uses more than 1 byte to represent. Better Ensure the max length of a string is already present in the buffer (as it was previously), or create a temporary byte buffer with all the data which you then convert to a string in one go using the encoding.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're technically right... However, note that this is for reading null-terminated strings, which AFAIK are only used for certain short ASCII control strings, rather than actual user data - so I doubt there's a real bug here. But you can open a new issue to at least have it tracked.

BTW if we do want to preemptively fix this, then rather than entirely buffering the strings them in a temporary buffer, we can use System.Text.Decoder to incrementally load more chunks a decode. Note that we haven't even done this optimization for actual user strings (which are probably much more important, see TextHandler.ReadLong).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least in the replication protocol they are used for user data as well, for example table names, which can contain non-ascii characters.

I don't see why System.Text.Decoder would be anything better. It gives you some chars on every iteration. These chars are temporary anyway and must then be made into a string, after which the previously buffered chars are discarded. But generally a char[] takes up more memory than the corresponding byte[] in UTF8, so I think it's better to just buffer everything as bytes. The encoder can then convert these to a string in-place, with no extra temporary space.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right... IIRC that's why I didn't do it in TextHandler either.

Open an issue for us to fix this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support for physical and logical replication

5 participants