improvement: run the batch ingestion process in parallel #14
Comments
One idea: why not run fn_process_batch entirely in the database, using a trigger on the t_replica_batch table?
Hi, the version 2 I'm currently working on will split the read and replay into two separate subprocesses. I'll try to release it by the end of the year. It will also add the init replica in parallel, with a less invasive flush process.
Btw, I'm getting curious about your migration. Any chance you'll be able to write a case study? :-)
well - in principle we have a single MySQL instance/process running 3 schemas, so speed-wise it is not something "special". The only thing is that I found a lot of things that, to me, look strange, and as we want to have all the data in PostgreSQL we have to make sure there is no data loss. Hence all those questions with regard to logging and mismatches... Besides that I am totally happy with the tool itself - thanks for providing it!
thanks for sharing :) I'll try to speed up the development of version 2 to provide a better experience :) |
@martinsperl-kognitiv I've just pushed an improvement for the replay function. My tests showed faster execution, with reduced CPU load and I/O wait, if you wanna give it a try. :) The upgrade procedure will add an extra table used by the function. Be sure to stop all the replica processes before upgrading the schema, and take a backup of sch_chameleon.
Version 1.7 will have the threaded option for running the read and replay in parallel.
As an improvement:
run the batch process (executing fn_process_batch) in parallel (thread/fork) with the parsing of the binary logs when using
start_replica
That would help the replica keep up with the source database.
Note: I am not asking for parallel table processing when running
init_replica
, but that would also help ;)
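The request above is essentially a producer/consumer split: one worker keeps parsing binlog events while another replays completed batches, so reading never blocks on the replay. A minimal sketch of that idea, assuming illustrative names (read_events, replay_batches are not pg_chameleon's actual API; the consumer is a stand-in for where fn_process_batch would be called):

```python
import queue
import threading

def read_events(events, batch_size, batch_queue):
    """Producer: group parsed binlog events into batches and hand them off."""
    batch = []
    for event in events:
        batch.append(event)
        if len(batch) >= batch_size:
            batch_queue.put(batch)
            batch = []
    if batch:
        batch_queue.put(batch)
    batch_queue.put(None)  # sentinel: no more batches

def replay_batches(batch_queue, replayed):
    """Consumer: stand-in for invoking fn_process_batch on each batch."""
    while True:
        batch = batch_queue.get()
        if batch is None:
            break
        replayed.append(list(batch))  # real code would call fn_process_batch here

events = ["ev%d" % i for i in range(10)]
batch_queue = queue.Queue(maxsize=4)  # bounded, so the reader can't run unbounded
replayed = []
reader = threading.Thread(target=read_events, args=(events, 3, batch_queue))
replayer = threading.Thread(target=replay_batches, args=(batch_queue, replayed))
reader.start()
replayer.start()
reader.join()
replayer.join()
print(len(replayed))  # 10 events in batches of 3 -> 4 batches
```

The bounded queue is the key design choice: it gives back-pressure, so a slow replay throttles the reader instead of letting parsed batches pile up in memory.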