Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[cache] Cache unchanged files on --dry-run #3834

Closed

Conversation

dorrogeray
Copy link
Contributor

PR for issue rectorphp/rector#7932

The idea is to cache unchanged files on --dry-run, so subsequent --dry-runs can be faster, while always producing same results.

packages/Parallel/WorkerRunner.php Outdated Show resolved Hide resolved
packages/Parallel/WorkerRunner.php Outdated Show resolved Hide resolved
@dorrogeray
Copy link
Contributor Author

@yguedidi I have updated the implementation as you have suggested.

There is a End to end test - e2e/timeout-file-not-cached failing, but I am not sure if this is a problem in this particular case, maybe its the test that should be changed somehow. The test case is about parallel process timing out, which happens on the first run, but thanks to the caching change in this PR does not happen on the second run (with cache), because the parallel process actually manages to save the file to cache after the timeout somehow. And the following process run with cache, thanks to this, is able to finish within that 1 second timer, producing a different output (since no timeout happens).

What do you think about this?

Also, there was one other failing test RectorOrderTest, but it seemed completely unrelated to the changes I made, so I just fixed the fixture - not sure about this one though, there might be something I am missing.

@samsonasik
Copy link
Member

The order test expectation fixed in latest main branch, you need to rebase.

The current e2e test should keep working with expected crash as currently shown, as they demo the double error keep showing on multiple run.

@dorrogeray
Copy link
Contributor Author

The current e2e test should keep working with expected crash as currently shown, as they demo the double error keep showing on multiple run.

That means that the timeout has to happen even when the cache is in effect, which we could perhaps make somehow happen in the e2e test scenario, by making the timeout 0.01s or by adding larger volume so even with cache, the timeout happens, but that does not seem right. In real-world scenario, we have no control over volume / cpu speed / timeout duration, etc, so it is impossible to guarrantee that the Timeout happens with cache just as it does without cache. To do so would be to artificially force with cache runs to be just as slow as those without the cache. Does that argument make sense?

I propose excluding timeout exception from the "same output with or without cache" rule based on the above.

@samsonasik
Copy link
Member

The e2e demonstrate crash on timeout, it configured with 1 second on purpose, that need to be shown again on run multiple times.

$rectorConfig->parallel(1);

@dorrogeray
Copy link
Contributor Author

@samsonasik with cache it easily finishes under 1 second. I don't think that is wrong, which is probably what we don't agree on.

The "same output with or without cache" makes sense probably for all cases but the ones where the only difference is processing time.

@yguedidi
Copy link
Contributor

@dorrogeray thanks for the work on this!

that test is using a big ruleset, so shouldn't take less than 1 seconds. and in that case the file shouldn't be marked as cachable and so not cached, so the second run fails too.
maybe the issue is that it's marked as cachable even when crash because of timeout? or the ruleset is not enough to exceed the 1 second limit?

@dorrogeray
Copy link
Contributor Author

maybe the issue is that it's marked as cachable even when crash because of timeout?

Yes, I think this is +- what happens:

  1. The master process signals the child worker process to quit because of the timeout (afaik this is initiated by the master process)

  2. The child worker has at this point either already cached the file (but has not terminated yet on its own) or simply continues and finishes the file it is working on currently (including caching it) before it terminates (maybe termination does not happen instantly?)

I assume that the child worker is who decides about marking the file as cacheable? If the child worker runs a bit longer after termination signal, it might not have a reason to not cache it..

Or something along those lines..

@yguedidi
Copy link
Contributor

@dorrogeray if I remember correctly, it's the job worker that fails on the memory limit, as the memory limit is passed to the command.
Will try to investigate that when I'll have some time

@dorrogeray
Copy link
Contributor Author

@yguedidi @samsonasik

I have done some further investigation, and have updated the PR with a change which would make the main process clear the files cached by the worker process if that process times out.

Here is PR which would be necessary in symplify/easy-parallel so that the master process has the required context (files requested for processing) when it is handling the timeout exception.

The findings:

  • The worker process does not know / has no option to take action when it is being terminated by the main process due to timeout exception, which is fired in the main process
    • Therefore, worker process cannot be responsible for clearing the batch of files it was processing during the request when the timeout has occurred
    • Therefore, without introducing this knowledge to the worker, the main process has to handle the removal of the batch of files from cache which were part of the last request to that worker during which the timeout occurred

I still think that this is an edge case, and the easiest solution would be just to allow different outputs if caching of unchanged files makes the second run finish faster, but I would be repeating my previous argument.

Also, I do not know what is the proper way of sending PR's to symplify/easy-parallel now that the symplify monorepo has been discontinued for development.

@@ -97,7 +97,7 @@ public function run(Encoder $encoder, Decoder $decoder, Configuration $configura

if ($errorAndFileDiffs[Bridge::SYSTEM_ERRORS] !== []) {
$this->invalidateFile($file);
} elseif (! $configuration->isDryRun()) {
} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this require e2e test for:

  • after multiple --dry-run keep show the diff, run apply changes to still apply the change.

For note: I think if you want current e2e timeout be deleted, that need to be replaced with "real crash use case", eg: can create a custom rule in e2e that must be crash, eg: echo dump($If_->else) node which the else still null so that will always crash.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@samsonasik I have opened a new PR Cache unchanged files on dry run v2, it seems that all the tests are passing now. It looks like that this rework of the timeout test fixed the issue. Additionaly, I have updated the e2e_consecutive_changes.yaml to include two dry runs, covering the test case you mentioned. Please let me know if this is sufficient for the PR to get merged, thanks!

// This sleep has to be here, because event though we have called $this->processPool->quitAll(),
// it takes some time for the child processes to actually die, and if we would delete the offending cache
// files right away, they could still write them "back" before they die
sleep(1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am interesting with this sleep(1); part, I can found a usecase in the past which CTRL+C sometime still keep running other process in the background, and the file keep changing, especially when running from IDE.

Could you cherry pick this part into separate PR? Thank you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cherrypicked into #4280

@dorrogeray
Copy link
Contributor Author

Fixed in #4281

@dorrogeray dorrogeray closed this Jun 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants