feat: Database subset #44
…rce/postgres_stdin.rs
…rce/postgres_stdin.rs
I know it's a large PR but if you want to take a look at it @benny-n @fabriceclementz it's all good for me :)
Looks great, very nice feature to have!
Good job @evoxmusic! This is my favorite feature so far! I noticed some errors during the backup restore command.
It seems some rows are inserted twice, so the primary key cannot be added.
Yes, I got the same issue and I will provide a fix.
I'm facing a small challenge with row deduplication. Since RepliByte needs to keep memory consumption low, it's almost impossible to deduplicate rows in memory. I am looking for a workaround.
I think it would require having all the rows in memory to deduplicate them?
It's a good idea, but my concern is the IO performance. I presume it will not be very performant. For the database subset I am already using the local disk because there is no other choice. We can't process data on the assumption that users have a lot of memory available. However, we can assume that they have some disk space, and that is a requirement for the database subset. I am working on a function to dedup specific lines from a file. I will push my code tomorrow (I made a Rust conf tonight - I am super tired)
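To illustrate the trade-off discussed above, here is a minimal sketch of disk-backed deduplication for selected lines. It is not the code from this PR: the function names (`line_exists`, `append_if_absent`, `dedup_lines`) and the `should_dedup` predicate are hypothetical, and the approach (rescanning the output file before each append) is just one simple way to keep memory bounded at the cost of extra IO.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, BufRead, BufReader, Write};
use std::path::Path;

/// Returns true if `line` already appears in the file at `path`.
/// The file is scanned line by line, so memory use is bounded by the
/// longest line rather than by the size of the file.
fn line_exists(path: &Path, line: &str) -> io::Result<bool> {
    let file = match File::open(path) {
        Ok(f) => f,
        // A missing file simply means nothing has been written yet.
        Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(false),
        Err(e) => return Err(e),
    };
    for existing in BufReader::new(file).lines() {
        if existing? == line {
            return Ok(true);
        }
    }
    Ok(false)
}

/// Appends `line` to the file unless an identical line is already present.
/// Returns true if the line was written, false if it was a duplicate.
fn append_if_absent(path: &Path, line: &str) -> io::Result<bool> {
    if line_exists(path, line)? {
        return Ok(false);
    }
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    writeln!(file, "{}", line)?;
    Ok(true)
}

/// Copies lines from `input` to `output`. Lines selected by `should_dedup`
/// are written at most once; all other lines are copied unchanged.
/// Returns the number of duplicate lines that were skipped.
fn dedup_lines<R: BufRead>(
    input: R,
    output: &Path,
    should_dedup: impl Fn(&str) -> bool,
) -> io::Result<usize> {
    let mut skipped = 0;
    for line in input.lines() {
        let line = line?;
        if should_dedup(&line) {
            if !append_if_absent(output, &line)? {
                skipped += 1;
            }
        } else {
            let mut file = OpenOptions::new().create(true).append(true).open(output)?;
            writeln!(file, "{}", line)?;
        }
    }
    Ok(skipped)
}
```

Each deduplicated line triggers a full scan of the output file, so the cost grows quadratically with the number of rows. That is the IO concern raised above; an external sort or hash-partitioned temporary files would trade more disk space for fewer scans.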
@fabriceclementz @benny-n finally done!! 😄 I am going to merge, but I will also need to explain that there are some requirements to use the database subset feature, since it needs some disk space to do the processing. We can improve that part incrementally.
Nice 👍 I will try this soon!
See this issue