GSoC 2023: Native Backup #120
Replies: 24 comments 120 replies
-
@jesperpedersen I just updated my doc, please let me know your thoughts. I'm not sure if I should move the checklist to the issue #123 page, as @MariamFahmy98 (hi!) did last year. Also, she created multiple PRs and issues, but I'm not sure if I should do the same. I have a feeling that I only need to create one PR addressing #123 and push commits to it, and in the end they could be squashed into a single commit before being merged. Let me know what you think :)
-
Hi @jesperpedersen, I read from the Frontend/Backend Protocol doc that you need to send a startup packet/message to the server before sending the actual query. I then looked this up and verified it in the Postgres source code; there's a function called PQconnectPoll that sets up the TCP connection and sends the startup packet. However, in our query execute functionality this seems to be omitted. Can you tell me if this is intentional? Should I add this? I also find it quite hard and time-consuming to dive into the source code to verify such details of the protocol. I think you mentioned pgprtdbg as a tool to generate trace files of a protocol; do you think it's a good fit for tedious tasks like this? Besides, I'm trying to prioritize things here: before sending the base_backup command, pg_basebackup also does things like generateRecoveryConf or runIdentifySystem. I haven't really dived into their implementation and purpose; do you think they are necessary? I want to get a minimum version that works first, so I think it's necessary for me to be able to tell what's important and what's not (the Postgres source code really is complicated!).
-
@jesperpedersen I just updated the progress above. Let me know what you think~ BTW, do I need to tag you every time I update it?
-
@jesperpedersen I think I need to make some changes to the authentication functions to extract the server version info, because that piece of info arrives together with 'AuthenticationOk'. I put some details above; let me know if you have a better idea for approaching this. Thank you.
-
It is a matter of dealing with the core protocol - https://www.postgresql.org/docs/devel/protocol-message-formats.html - in both cases. They were developed at different times; if you are interested you can find the discussions in the https://www.postgresql.org/list/pgsql-hackers/ archives.
Read about the
Yeah, we will have to - or at least very likely will - look into a solution for that. We could define a setting for maximum memory usage...
-
Hi @jesperpedersen , I'm now moving on to work on writing the data received to disk, I plan to implement a streamer struct to do the job. I have some questions though, about archive and compression in pgmoneta.
And it would be nice if you could take a look at my PR and let me know if it's good to go before I start the next phase's coding work. That'll save me some rebasing work :)
-
Hi @jesperpedersen, currently we only have one tablespace, the default one, and its content is directly stored under the |
-
Hi @jesperpedersen , I think there are existing libraries out there that can untar a file with a function call. Do you have any idea why Postgres chooses to parse and extract the tar file itself as it receives the file stream, instead of, e.g. receive |
-
Hi @jesperpedersen , I'm having a tiny problem here. In order to use libarchive I need to include their Currently |
-
Yeah, I have tried that. Sadly it didn't work. We probably need to rename ours then. Do you have any suggestions?
That is great! I'll adjust my implementation according to it.
I'm not following. Do you mean the ones under |
-
@jesperpedersen I found some memory leaks in both my implementation and our current base backup implementation (the command line one). I was able to locate and fix the current one, but in my implementation the leak is much harder to find. I will submit a fix to #130 once I find it, so please hold off on merging. I'll also submit a separate PR addressing the memory leak in our current implementation shortly.
-
@jesperpedersen Changes you requested for #130 have been submitted. I'm thinking about our next steps. Which one do you want me to prioritize: receiving the WAL, or replacing our archive functionality with libarchive?
-
#130 has been merged - so you can replace |
-
Hi @jesperpedersen , I find the integration not as easy as I expected. Here are some problems I found:
Also If I don't set this |
-
Hi @jesperpedersen, what is the correct way to start postgres server on a restored backup? Is it something like |
-
@jesperpedersen I recreated a similar error after removing the WAL segments under
Could you check yours? There are 3 additional problems I found about the restored server:
-
@jesperpedersen Changes to support tablespaces have been submitted to PR #137. I hope you don't mind some design changes, especially the tar directory structure. Please let me know what you think. I'm curious though: our previous implementation works well, so why do we want to replace it with libarchive?
-
Hi @jesperpedersen , I spent the weekend studying Postgres's WAL replication process. It looks clear enough, but there are some details I'm not sure about. I'm gonna list them below, along with some questions; please correct me if you find any misconception:
What's the difference between
-
Hi @jesperpedersen , I hope you don't mind me borrowing some code snippet in |
-
Hi @jesperpedersen, how did you notice the CPU usage problem? I'm looking into the cause here. |
-
@jesperpedersen The CPU usage problem got fixed by sleeping when the socket read returns -1. But we have another problem: basebackup now takes a tremendously longer time, nearly 300 secs. I'm looking into the reason right now.
-
The progress log in the first section is becoming too long. I reversed the order so that the newest log would be on top. I hope this helps. Though I don't think many people would read this 😅 |
-
Hey @jesperpedersen , should we worry that the WAL process might try to read `config->running` in the shared memory after it's already been unmapped? I don't know whether that could happen or not.
-
Intro
Hi,
My name is Haoran Zhang. I have been selected as a GSoC contributor for pgmoneta this year (2023). I will work on replacing our current backup implementation.
Currently pgmoneta generates backups by invoking the `pg_basebackup` command against the Postgres database server. We want to get rid of this dependency on Postgres binaries by implementing ourselves the process this command triggers. This mainly involves handling the client side of Postgres's streaming replication protocol. And since WAL replication shares the same infrastructure and protocol, we will try to implement that part as well.
You can find the most up-to-date details in the progress log below (I'm a bit chatty). Or check out my proposal, though I should warn you that I wrote the proposal with limited knowledge of this project, so reading this forum will give you more accurate details.
I'm open to suggestions and ideas, feel free to discuss them here or email me at
andrewzhr9911@gmail.com
May the goddess Moneta be with us this summer :)
List of the work done and to do:
Note that more details about the milestones may be added to this list as I start to work on them (M for milestone)
Related issues:
Progress Log
08/04/2023
Finally, #140 is merged into the main branch. We got our native WAL infrastructure! This marks the end of my GSoC project; huge thanks to Jesper for his patient, responsive and inspiring guidance. I'll remain one of the contributors to the pgmoneta community and work towards more advanced backup strategies. For future developers, I sincerely hope the logs I kept below can be of help to you. Good luck!
08/03/2023
Jesper found that our WAL implementation is running at 100% CPU usage. I looked into it and realized that the socket is non-blocking by pgmoneta's default, therefore creating a busy while loop. We fixed that by adding `sleep(1000000L)` in between. Now CPU usage is stable at about 3%.
I also found that backup takes a much longer time now. I initially suspected it's because the child WAL process and the parent backup process share the same socket, because they have the same socket number. But it turns out this is normal, because the two processes have different file descriptor tables (https://stackoverflow.com/questions/27746417/multiple-local-processes-have-the-same-socket). Forgive my rusty network programming knowledge.
The real reason is simply that my database has gotten bigger because of too many pgbench runs. I guess I have to do some cleanup after pgbench. Also, we could look into a faster way to do our backup.
07/31/2023
The termination problem got fixed by `config->running`. I also enhanced our handling of receiving `CopyDone` from the server upon reaching the end of the timeline.
07/30/2023
I spent the weekend working on our receivewal implementation and got an initial but pretty nice version working. The server starts on the received log normally, and the program exits on server shutdown. One thing worth mentioning is that the server sometimes sends data across the WAL segment boundary, so I had to save those bytes and write them to the next WAL file later. Also, it's not stopping on `ctrl+c` and cannot deal with multiple timelines. Other than that, it's a really nice implementation.
07/29/2023
Turns out receiving WAL is a never-ending process, running in an infinite while loop, unless we are requesting WAL from an older timeline. So normally we should not expect to see `CommandComplete` or `CopyDone`.
I also spent some time figuring out the WAL naming format, and here is what I found.
As we know, each row of log in the WAL has an LSN, and multiple log records form one WAL segment, which also corresponds to one WAL file. So assume the segment size is 16MB, which is the Postgres default; then the address, or the offset of each log record within its segment, is 24 bits, and the total address length for an LSN is 64 bits, so we can roughly separate the address into
As for the segment address, the first 32 bits are what I call a segment group ID, or segment group address; call it X. The last 8 bits are what I call a segment in-group offset; call it Y.
Then the WAL file name format can be denoted as
Each part in this name has 4 bytes (32 bits), and the address, ID and offset are all in hex format, so each character in the name stands for 4 bits, and the remaining bits are padded with 0.
So now comes the interesting question we care about: if we know the LSN of a record, which basically is a 64-bit integer, how do we construct the xlog file name? Note that the segment size is likely not the default 16 MB, but the principle is the same. We first need to remove the in-log offset part, so we divide the LSN by the segment size, and we get what I call the segment number (segno); you can also call it segment ID or segment address. This part corresponds to the 40 higher bits we talked about earlier.
Then we must know how many segments there are within each segment group, which is equivalent to the total number of addresses a segment in-group offset can represent. Bear in mind that the segment group ID is always 32 bits, and the total length of the address is 64 bits, so the segment in-group offset together with the in-log offset also takes 32 bits. That means the total number of addresses available to a segment in-group offset is 2^32 / segment size; this number is what we want: the number of segments within each segment group.
To make this clearer, let's take the default 16MB segment size as an example: the in-log offset is 24 bits, and the segment in-group offset has 32 - 24 = 8 bits, so it can represent 2^8 addresses, which can also be derived from 2^32 / 2^24. You can also think about it this way: only when the in-group offset wraps past 0xFF does the higher 32-bit address increment by 1. This is why we call it the number of segments per segment group.
Now that we know the number of segments per group, the rest is easy. The segment group ID is just segno / segments-per-group, and the segment in-group offset is just segno % segments-per-group.
07/25/2023
The naming format of the WAL file is weird. I found a good post explaining it: https://www.crunchydata.com/blog/postgres-wal-files-and-sequuence-numbers
I did some simple coding to verify the process: basically I send the request and then just blindly receive messages and log their types. It is the same trick I used when writing the native backup, but it's not working for the WAL receiver. Firstly, it is much slower to get the `CopyBothResponse` or the log data; I don't know if that's normal. And it seems that the server stops sending me anything after just one `CopyData` message, and sometimes it stops after sending me just the `CopyBothResponse`. I'm still looking into why; I think investigating the sender-side code in Postgres is a good start.
07/23/2023
I'm starting to worry that time isn't on my side now. I'm studying the details of the wal replication process by reading the postgres implementation but I worry I'm not fast enough. Anyway I found a great blog explaining Postgres's timeline https://www.highgo.ca/2021/11/01/the-postgresql-timeline-concept/ and point-in-time recovery https://www.highgo.ca/2021/10/01/postgresql-14-continuous-archiving-and-point-in-time-recovery/
07/19/2023
#137 has been merged. I think it's a good time that we start working on our wal solution. I'll probably take the next few days off to catch a breath and work on some personal issues. But I'll still do some initial investigation and designing on this.
07/18/2023
The bug indeed lies within the WAL, but we have little control right now; it requires understanding the WAL handling in pgmoneta. You can find more details in the discussion below. We replaced symlink() with symlinkat() when restoring a backup. This way we can move the archive around without breaking the symlinks. It is magical.
07/16/2023
Jesper reported a bug on his side with our native backup last Friday (07/14), but I couldn't recreate the same error. I strongly suspect the problem lies within the WAL part. Since I have little control for now, I'll have to wait until Monday to discuss this further with Jesper. But I don't think this is a big problem.
I spent quite some time over the weekend working on our libarchive replacement and it finally worked, thanks to this reference wiki: https://github.com/libarchive/libarchive/wiki/Examples#a-basic-write-example. The PR has been submitted: #137. I couldn't decide what to do with tablespaces; it seems that user-level tablespaces are not restored. I wonder if this is a bug...
I also wonder why Jesper wants to replace his archive implementation with libarchive. His version works perfectly and I'm impressed that it's done from scratch.
07/12/2023
Thanks to Jesper, I think I found out why the server complains about a missing WAL segment. I suspect the problem is that I often just use `ctrl+c` to stop pgmoneta, leaving a partially received WAL segment. When pgmoneta is started again and tries to fetch this segment from the server, it may already have been removed on the server side, because I didn't use a replication slot to preserve it.
#134 is merged into the main branch, hooray!
07/11/2023
#130 has been merged. I'll be working on archive.c for the next few days.
The integration PR #134 is submitted. This integration is not perfect, especially when it comes to receiving WAL; we have too little control over it. Just executing a pg_receivewal command doesn't really go well with our native backup solution, and sometimes crashing bugs can occur.
There are some bugs I can fix though. For example, later I can work on skipping compression of symlinks.
07/06/2023
With the help of valgrind, I located several memory leak issues in both main branch and my own working branch. For the problem in our main branch, I created issue #131 and PR #132. Fix for my own implementation has been submitted to #130
07/05/2023
Ok, my mistake: Azure didn't accept my credit card. After changing the payment method I was able to launch a Fedora 37 instance, so problem solved.
Support for server version > 15 has been submitted to PR #130
07/04/2023
Jesper wants this PR to also work for servers whose version is > 15. I spent today working on the basic structure. I can't believe that the biggest challenge I faced today was finding a Fedora 37 AMI on AWS, because Postgres 15 is only available for Fedora > 36. I also tried Azure for a while, but they wouldn't let me rent a VM for some reason. I was able to find an AMI on AWS. It charges a little more for the software, but it does the job, sort of... I was able to use it after it first launched, but it cannot boot after I stop it. It's driving me crazy, because now I have to relaunch and re-install everything if I stop the instance. Anyway, I was able to verify my basic structure and the copy-out process. The rest of the implementation should be easy enough.
I'm actually quite curious about other developers: how do you set up your environment? Are you using VMs in the cloud, a dual-boot system, or anything else?
07/03/2023
This is just ridiculous: Postgres sends the default tablespace LAST! So I cannot update the symbolic link until I receive everything else. Ugh!
The new PR has been submitted, see #130. I will pause and take a break since Jesper's away and unable to provide feedback. I'm ahead of my schedule, so waiting a little doesn't really hurt :D This is probably a good time to look into how exactly receiving the xlog stream works.
06/30/2023
Great news! I got the tar file receiving and extraction to work! Now all I have to do is to do that again with other tablespaces and change the symbolic link under basedir.
06/26/2023
Ok, here are some details I found about the streamer process. First, regarding the manifest: it's stored directly as `backup_manifest.tmp` in the backup base directory and renamed to `backup_manifest` at the end; the renaming is the sign that streaming has completed. As for the backup data, one streamer called `bbstreamer_tar_parser` will first parse the data according to the tar format; the data will be labeled as `BBSTREAMER_MEMBER_HEADER`, `BBSTREAMER_MEMBER_CONTENTS`, `BBSTREAMER_MEMBER_TRAILER` or `BBSTREAMER_ARCHIVE_TRAILER`. If the data chunk for one particular label is incomplete, I think it will pause to receive more; I need to verify this. Anyway, the labeled chunk of data will then be sent to the next streamer: `bbstreamer_tar_header`. It acts differently according to the label. For example, I think it creates and opens the corresponding file on receiving header-labeled data, and writes subsequent content-labeled data to this file. The server seems to omit the terminating trailing chunks of zeros of the tar file when sending it; if the client doesn't untar the file, one streamer will append these chunks of zeros to it, but when parsing the tar file this seems unnecessary. I need to look into why. So I think I need to first understand the tar file format, and see if there's anything I can reuse in `wf_archive`, then do a cleaner implementation; the Postgres one is too complicated for us.
06/24/2023
I don't think we really need a "streamer". Postgres uses this object-oriented streamer design because it needs to support multiple compression, decompression and extraction ops, but we only need to extract the tar file, so one functionality should be enough.
06/23/2023
Got the third PR merged into the main branch. This one has the same functionality as `pgmoneta_query_execute`. I had to write it again because the data is now in a stream buffer. I also made some changes to `pgmoneta_consume_stream_buffer` so that I can make use of existing APIs. See #128
06/21/2023
The second PR of this project has been merged. I'm looking into what the compression workflow is actually doing, and it seems that it just compresses every single file in the data directory. So I still need to untar everything I receive. The next step is figuring out the relative directory structure and the names of all the files the server sends me, so that I know which directory to save those files to. This should be fairly simple.
06/17/2023
Turns out it was just one stupid mistake. Now it works perfectly. I already submitted PR, see #127
06/16/2023
Ok, I'm pretty sure something's wrong with my implementation, because the server keeps complaining about a broken pipe and losing connection with the client (which, unfortunately, is me). But if I just use pgmoneta_query_execute, the issue doesn't exist. I don't understand why, because I'm basically doing the same thing, except reusing the buffer. But this is a good direction to look into. I'm sorry this is taking so long. Fingers crossed I can get this nasty bug sorted by the end of this week.
06/14/2023
Ok, turns out `SHA256` needs to be enclosed in quotation marks (`'SHA256'`). Now the server accepts the command. Weirdly, `wal.c` is complaining, even though I did nothing to it. I had to disable `wal.c` for now and come back to deal with it later, because even though the server gets the command, it cannot send data to my side. I need to look into this first. This is the error message on the server side:
06/13/2023
I implemented functionality to read data into a stream buffer and consume it one message at a time. It should work, except it doesn't: read() keeps returning -1. I fixed some small issues with my BASE_BACKUP command, but that didn't fix the problem. So it's debugging time!
I also found that in replication mode (with the replication flag set to true when connecting to the server), the server doesn't recognize simple queries such as asking for the server version. So I have to connect to the server with no replication flag set, ask for the version, disconnect, and reconnect with the replication flag set. Let me know if this causes any problems or if you have better ways.
So of all the things I tried, I forgot the easiest one: just checking the primary server log. And it just says 'syntax error'...
The current command is like this:
`BASE_BACKUP LABEL 'pgmoneta_base_backup_20230614034921' FAST MANIFEST 'yes' MANIFEST_CHECKSUMS SHA256`
Let me know if there's anything off.
06/06/2023
So after 2 days of enlightening (and painful) source code reading, I think I finally get most details clear. Let's start with the beginning.
Before sending the `BASE_BACKUP` command to the server, the client does 2 things first. `runIdentifySystem` gets the timeline ID; this is used for WAL backup I think, so I'm not going to implement it in milestone 2 (and we probably don't need it in M3 either, since for server version > 9.3 the timeline ID is also sent as row data when the replication starts). And `generateRecoveryConf` is needed for server version < 12; the generated `recovery.conf` will later be injected into the main tablespace of the backup data. But it seems we don't need this either, since in the `pg_basebackup` command we currently use, the `-R` option is not specified (what a relief lol).
After receiving the `BASE_BACKUP` command, the server sends the backup data (along with a manifest, if we ask for it, and we do) and WAL (if we ask for it using `-X stream`, and we do) over two different connections. In Postgres, receiving the WAL stream is handled in a child process; this part mainly uses a function called `ReceiveXlogStream`, which will be our main focus in M3, and since it's also used in `wal.c`, we can add it to the common functions.
As for our current main focus, receiving backup data, the protocol is actually not very complicated in terms of logic. The server will first send two ordinary result sets (in row data format). The first one tells you the starting position and timeline ID of the WAL; we can ignore this in M2. The second one gives you info like the tablespace name, one row for each tablespace. Then it starts to send out the actual backup data, one tar file for each tablespace, along with a manifest at the end. The data is sent like this: first the server sends a `CopyOutResponse` message, which denotes the starting point of the copy stream, then a bunch of `CopyData` messages, always one row of archive data per message, and finally a `CopyDone` message, marking the end of the copy stream.
For server version < 15, the server starts multiple copy streams, meaning for each tablespace and the manifest it sends one `CopyOutResponse`, followed by the `CopyData` messages and a `CopyDone`. For version >= 15, everything is sent in one copy stream. The copy-out data now has a format to distinguish everything, which I believe is not yet updated in the doc. It's like this: `|--TypeByte--|--Data--|`
There are 4 types of copy data:
- `'n'`: the starting point of a new archive; the payload is two strings, the archive file name and its location/address.
- `'d'`: the actual data, which could be archive or manifest data.
- `'p'`: a progress report; I think we can ignore this for now.
- `'m'`: the starting point of the manifest data; there is no payload for this message.

It is worth mentioning that for the archive name the server sends, if it's `NULL` or `'\0'`, it means this is the main data directory and the file should be named `base.tar`.
As we receive copy-out data, we want to write it to disk. Postgres does this by assigning a streamer to each archive; it has a FILE descriptor and writes each received chunk of data to the file. It also does special things like compression or untarring. Judging from our current `pg_basebackup` command options, I believe we want plain files, so we need to untar the archive at some point. I think it's easier to untar the archive once it's been received completely, but Postgres parses the tar file as it's being received; I need to discuss this with Jesper.
Another problem Jesper and I both noticed is that our current read-data/query-execution functionality is not sufficient to handle a large data stream. For one thing, the `CopyData` message has a different format from the `DataRow` message, so we need to parse it differently. Another, more serious problem is that our buffer memory is ever-growing, i.e. it never shrinks or reuses the memory of messages already consumed. Postgres handles this by maintaining cursors of bytes received and bytes consumed in the buffer. Every time before it reads new data, it left-shifts the unconsumed data using `memmove`, overwriting consumed bytes and making room for new messages. The buffer still grows, but almost only when the available space is less than 8192 B; this is to prevent reading a partial packet, which is usually shorter than 8192 B. This is probably a good fix for our problem, but it needs more discussion.
06/02/2023
I was not very productive this week, because I accumulated so many questions about the details of the replication protocol just by reading the doc (honestly I don't even know what to ask). I had no choice but to dive into the Postgres source code today. The workflow became so much clearer, but the workload is large. I got a lot of the questions answered, but more questions emerged. And I need to mind the server version as well.
So my plan for the next few days is to dig into the source code and figure out every detail I can, then decide what functionalities I should implement to get the protocol working. My plan is to leave out WAL streaming (functionalities in this part can be shared with our `wal.c`) for the time being and just focus on receiving base backup data. My worry right now is that since everything sent over TCP arrives stuck together, and our current query execution implementation receives it all at once, we may run out of memory, because we are not writing backup data to disk until the very end. And there could be trouble handling the logic, since `CopyOutResponse` or `CopyData` are not ordinary query responses (they carry raw data, not just rows and columns).
Several functions that are worth digging into:
- `ReceiveXlogStream`: this is for receiving WAL, so not very urgent for now.
- `PQgetResult`: does Postgres have tricks for handling sticky packets and the `CopyOutResponse` message? Does it prevent itself from reading backup data (because it needs to be received later by special functions)?
- `ReceiveArchiveStream`: receives the backup tar stream all at once, for version >= 15. How does it receive everything without eating up all the memory? Why is it tar if the command allows plain data? How does it separate everything (data and manifest for each file) apart?
- `ReceiveTarFile`: receives the tar data for each file, for version < 15. Again, why only tar? And how does it separate everything apart?
- `ReceiveBackupManifest`: for receiving the manifest data for each file.
- `fsync_pgdata`: makes the data persistent on disk.

Also some conceptual questions, like what's an archive, and what's the difference between archive and data?
05/30/2023
So I struggled a lot with the details of the protocol and the command format, and I think I may be overthinking this. There's no need to stick to everything the protocol says; it's easier to just query the version again. Also, we don't need to use all the options of the BASE_BACKUP command. Only a few are useful to us and a lot can be hard-coded for our purpose. That saves me a lot of trouble here. We already got this part of the functionality merged into the main branch, see #125
05/26/2023
Regarding the version, I did find a function called `pgmoneta_read_version`, but that only reads from a txt file called `/PG_VERSION`, which is generated by `pg_basebackup` AFTER the backup is created. So I need to create some function to read the server version here. According to the doc, after the frontend's password is authenticated, the backend sends the parameters it finds interesting before sending the `ReadyForQuery` message, and the server version is included among them. I just need to receive that message after authentication.
05/24/2023
I'm slowly starting the project; the pace is expected at the beginning. So far I have created two branches: 0001_common and 0002_native_backup. On the native backup branch I just finished sending the startup message. I thought I needed to create and send the startup message myself, but it turns out it's already taken care of in `pgmoneta_server_authenticate`, so I'm now focusing on creating the backup message on the common branch.
I was able to reuse some code from other message creation functions, but BASE_BACKUP still has some differences. For one thing, it has gone through some changes in terms of format and options. Here are the formats for BASE_BACKUP from v10 to v15:
BASE_BACKUP [ LABEL 'label' ] [ PROGRESS ] [ FAST ] [ WAL ] [ NOWAIT ] [ MAX_RATE rate ] [ TABLESPACE_MAP ]
BASE_BACKUP [ LABEL 'label' ] [ PROGRESS ] [ FAST ] [ WAL ] [ NOWAIT ] [ MAX_RATE rate ] [ TABLESPACE_MAP ] [ NOVERIFY_CHECKSUMS ]
BASE_BACKUP [ LABEL 'label' ] [ PROGRESS ] [ FAST ] [ WAL ] [ NOWAIT ] [ MAX_RATE rate ] [ TABLESPACE_MAP ] [ NOVERIFY_CHECKSUMS ]
BASE_BACKUP [ LABEL 'label' ] [ PROGRESS ] [ FAST ] [ WAL ] [ NOWAIT ] [ MAX_RATE rate ] [ TABLESPACE_MAP ] [ NOVERIFY_CHECKSUMS ] [ MANIFEST manifest_option ] [ MANIFEST_CHECKSUMS checksum_algorithm ]
BASE_BACKUP [ LABEL 'label' ] [ PROGRESS ] [ FAST ] [ WAL ] [ NOWAIT ] [ MAX_RATE rate ] [ TABLESPACE_MAP ] [ NOVERIFY_CHECKSUMS ] [ MANIFEST manifest_option ] [ MANIFEST_CHECKSUMS checksum_algorithm ]
BASE_BACKUP [ ( option [, ...] ) ]
Guess they finally realized how ridiculously long this command has become :| (in this version there are 14 options). Plus, in this new version the options are separated by ',' instead of spaces.
So naturally I have to accommodate these changes. I think we probably have some functionality to check the server version; if not, I'll just have to implement one. I can use some ideas here~
Another thing that's really bothersome is that, as you can see, there are many options, and all of them are optional. So I'm thinking it might be worthwhile to create a struct `base_backup_options` to wrap these parameters, otherwise the function definition will be too long.
05/17/2023
We got Jesper's patch merged into main branch. This will be the foundation of this project. This patch includes:
- The `pgmoneta_create_XXX_message` functions, which are used to create a given command to be sent to the Postgres server. I can refer to these when I create the BASE_BACKUP command message.
- `pgmoneta_query_execute`, which executes the query, i.e. sends the command message to the server and receives the response. I can probably use that to send the BASE_BACKUP command directly, or at least refer to it. But I'm still not sure what the response to a BASE_BACKUP command is; I'll look into the documentation for this.

The functionalities above follow the Frontend/Backend Protocol; the message format can be found here. Once BASE_BACKUP is received by the server, the Streaming Replication Protocol takes over. This will be the second and major part of this project. These documentations are very important and I'll make sure I understand the details in the next few days.
We also created issue #123 for this project.