parallelize kubycat? #10

Open
schlichtanders opened this issue Oct 16, 2023 · 4 comments
Labels
enhancement New feature or request

Comments

@schlichtanders

If many files change at the same time, kubycat seems to push the changes sequentially, one after the other.

That is very slow. It would be great if this could be parallelized.

@schlichtanders
Author

Another example is on startup, if sync on start is enabled.

@sheldonjuncker
Owner

Kubycat has to push files synchronously to maintain the order of the operations that were performed. If a delete and a create happened one after the other, you would have to ensure that they were processed in that order, which makes parallelization at the file event level challenging.

I think where we could make it parallel is pushing files to different pods. For each file change, we could push it to each of the pods being synced to at the same time. If your slowdown is coming from having many pods, that ought to help.
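A minimal sketch of the per-pod idea described above, assuming a thread pool is acceptable for the fan-out. The names `push_to_pod`, `push_to_all_pods`, and `sync_events` are hypothetical and are not taken from Kubycat's code; `push_to_pod` is only a stand-in for the real per-pod copy.

```python
from concurrent.futures import ThreadPoolExecutor


def push_to_pod(pod: str, path: str) -> str:
    # Hypothetical stand-in for the real per-pod copy operation.
    return f"{path} -> {pod}"


def push_to_all_pods(path, pods, push=push_to_pod):
    """Push one file change to every pod concurrently; results are in pod order."""
    if not pods:
        return []
    with ThreadPoolExecutor(max_workers=len(pods)) as pool:
        # map() preserves input order and re-raises a worker's exception
        # when its result is consumed.
        return list(pool.map(lambda pod: push(pod, path), pods))


def sync_events(paths, pods, push=push_to_pod):
    # File events are still handled one at a time, so the order of
    # operations per file is unchanged; only the pod fan-out is parallel.
    return [push_to_all_pods(path, pods, push) for path in paths]
```

With this shape, the time per file event would be roughly that of the slowest pod rather than the sum over all pods, but a burst of many file events would still be processed one event at a time.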

That's the next feature I'll work on implementing, though at the moment I'm a bit pressed for time so PRs would be welcome.

@sheldonjuncker sheldonjuncker added the enhancement New feature or request label Oct 18, 2023
@schlichtanders
Author

The slowdown is unfortunately coming from mass updates of an entire folder.

Examples:

  • the result of a build step (e.g. compiling the components of a website with parcel)
  • on startup sync

I guess the startup sync does not have these ordering issues, so it could also be parallelized easily? That would already help a lot, I think.

Ideally of course it would be great if kubycat could face the challenge and parallelize per file.

@sheldonjuncker
Owner

Yeah, that makes sense. I think a couple of things could be done to address this. The first would be to change from individual item processing to batch processing, scanning each batch for possible conflicts (creates/deletes to the same files, for example).

The offending items could be moved to the next batch such that all items in any given batch are safe to be processed in parallel.
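A rough sketch of that batching idea, assuming each event is an `(operation, path)` tuple and that a "conflict" means two events touching exactly the same path. The function name `split_into_batches` is hypothetical. Conflicts between a folder and the files inside it are not handled here and would need extra checks.

```python
def split_into_batches(events):
    """Split ordered (operation, path) events into conflict-free batches.

    Within a batch each path appears at most once, so the batch could be
    processed in parallel. Later events for an already-seen path are moved
    to the next batch, which keeps the per-path order intact.
    """
    batches = []
    pending = list(events)
    while pending:
        seen = set()
        batch, deferred = [], []
        for operation, path in pending:
            if path in seen:
                # An earlier event for this path is already in the batch.
                deferred.append((operation, path))
            else:
                seen.add(path)
                batch.append((operation, path))
        batches.append(batch)
        pending = deferred
    return batches


events = [("create", "a.js"), ("delete", "a.js"), ("create", "b.js")]
batches = split_into_batches(events)
# batches[0] holds the create of a.js and b.js; batches[1] holds the delete of a.js.
```

Batches would still be run one after another; only the events inside a single batch are candidates for parallel pushes.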

I'll have to think about the issue with build folders. For folders like that, which receive mass updates in short periods of time, I wonder if it would make sense to sync them less frequently and only at the entire-folder level. Basically, it would wait until there had been no changes for some period X, then sync the entire folder instead of each individually changed file.
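The "wait until quiet, then sync the whole folder" idea is essentially a debounce. A minimal sketch, assuming the sync loop periodically polls for folders that are due; the class `FolderDebouncer` and its methods are hypothetical names, and the quiet period is the "X" mentioned above.

```python
import time


class FolderDebouncer:
    """Track folder changes and report folders that have been quiet long enough."""

    def __init__(self, quiet_seconds, clock=time.monotonic):
        self.quiet_seconds = quiet_seconds
        self.clock = clock  # injectable so the logic can be tested without sleeping
        self.last_change = {}  # folder -> time of its most recent change

    def record_change(self, folder):
        # Every new change restarts the quiet period for that folder.
        self.last_change[folder] = self.clock()

    def due_folders(self):
        """Return folders with no changes for quiet_seconds and stop tracking them."""
        now = self.clock()
        due = [f for f, t in self.last_change.items()
               if now - t >= self.quiet_seconds]
        for folder in due:
            del self.last_change[folder]
        return due
```

A caller would invoke `record_change` for each file event under a build folder and sync whatever `due_folders` returns, so a burst of hundreds of changes results in a single folder-level sync.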

Sync on startup already syncs only the top-level item, which, in the case of a folder, also syncs all of its contents.


2 participants