Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fully populate the cache when running vendor/bin/rector process --dry-run #7932

Closed
dorrogeray opened this issue May 12, 2023 · 15 comments
Closed
Labels

Comments

@dorrogeray
Copy link

Feature Request

When running rector in vendor/bin/rector process --dry-run, the cache should be fully populated, just like when running rector without the --dry-run flag. The use case is that in the CI, you would ideally want to run rector with --dry-run, since that will return non-zero exit code when there are any changes made by applying rector rules. But you would also like to be able to cache the rector cache, so following runs can be executed faster / consume less resources. Currently however, it seems that using --dry-run writes only very little to the cache (almost nothing) when compared to running without. Therefore, the caching in the CI does not work as expected..

Here is a workaround I came up with:

  1. Restore cache
  2. Run vendor/bin/rector process which fully populates the rector cache (if it is not already there from the cache restore)
  3. Save cache
  4. Clear any changes with git reset --hard
  5. Run vendor/bin/rector process --dry-run which will use the cache from step 2, so it will be fast, but it will also exit with non-zero code if any changes are detected, which is what will make the pipeline fail (which is what we want)

Diff

  rector:
    runs-on: ubuntu-latest-16-cores-64-gb-ram
    strategy:
      fail-fast: false
      matrix:
        php-version: [ '7.4' ]

    steps:
      - name: Checkout
        uses: actions/checkout@v2

      - name: Setup PHP
        uses: shivammathur/setup-php@v2
        with:
          ini-values: zend.assertions=1
          php-version: ${{ matrix.php-version }}
          extensions: mbstring, sockets, imap, gmp, memcached

      - name: Cache composer dependencies
        uses: actions/cache@v2
        with:
          path: vendor
          key: ${{ runner.os }}-composer-${{ hashFiles('**/composer.lock') }}
          restore-keys: ${{ runner.os }}-composer-

      - name: Install dependencies
        run: |
          composer config github-oauth.github.com ${{ secrets.COMPOSER_TOKEN }}
          composer install --no-progress --prefer-dist --optimize-autoloader --ignore-platform-reqs

      - name: Copy phpstan.neon.example to phpstan.neon
        run: |
          cp phpstan.neon.example phpstan.neon

      - name: Restore rector cache
        id: cache-rector-restore
        uses: actions/cache/restore@v3
        with:
          path: ./temp/rector
          key: ${{ runner.os }}-rector-${{ hashFiles('**/composer.lock') }}-${{ github.ref }}
          restore-keys: |
            ${{ runner.os }}-rector-${{ hashFiles('**/composer.lock') }}-
            ${{ runner.os }}-rector-

      # Only real run will generate cache, which the --dry-run can then re-use
      - name: Rector generate cache
        run: vendor/bin/rector process

      - name: Clear changes generated while generating rector cache
        run: git reset --hard

      # Explicitly save rector cache
      - name: Cache rector save
        id: cache-rector-save
        uses: actions/cache/save@v3
        with:
          path: ./temp/rector
          # Adding github.run_id to force cache update https://github.com/actions/cache/blob/main/tips-and-workarounds.md#update-a-cache
          key: ${{ runner.os }}-rector-${{ hashFiles('**/composer.lock') }}-${{ github.ref }}-${{ github.run_id }}

      # Only --dry-run will fail when changes are detected
      - name: Rector
        run: vendor/bin/rector process --dry-run
@samsonasik
Copy link
Member

On CI, cache is default to MemoryCacheStorage, see:

https://github.com/rectorphp/rector-src/blob/0d51657b8f4c157d248d50f534f4247ad12ec3d6/config/config.php#L122-L123C35

You may need to define the Rector cache class, see:

https://getrector.com/documentation/cache-in-ci

@dorrogeray
Copy link
Author

@samsonasik I do have Rector\Caching\ValueObject\Storage\FileCacheStorage configured:

    $rectorConfig->cacheClass(FileCacheStorage::class);
    $rectorConfig->cacheDirectory('./temp/rector');

My setup working correctly in terms of where it is writing the cache, the problem is that the content of the cache is different when running:

vendor/bin/rector process --dry-run - which produces cache with a single tiny file, which does not work in terms that it would speed up following rector execution (since there is no "meat" in the cache)
image

vs:

vendor/bin/rector process - which produces cache with many files, and using this cache works - it does make rector finish almost instantly
image
...
image

@samsonasik
Copy link
Member

I see, that's seems expected.

--dry-run is expected to run multiple times and got the exact same result, so the cache it not applied for that

} elseif (!$configuration->isDryRun()) {
$this->changedFilesDetector->cacheFileWithDependencies($file->getFilePath());
}

see these e2e test for multiple run --dry-run

https://github.com/rectorphp/rector-src/blob/52295356ab5e31c2226f6ff96984bb08b7529937/.github/workflows/e2e_with_cache.yaml#L48-L54

@dorrogeray
Copy link
Author

Not sure if I understand this correctly, so are you saying that if the first --dry-run would generate the cache, that would that cause following --dry-run which would use that cache to end with different result?

@samsonasik
Copy link
Member

No, multiple --dry-run should show same result. That's the e2e test show.

The cache is applied, just not insert "changed diff" into cache.

There is also test for multiple refactor to keep change the file with invalidate changed files so it can apply the change.

https://github.com/rectorphp/rector-src/blob/55410904edd98b4a243c12b140cfda9ce171f919/.github/workflows/e2e_consecutive_changes.yaml#L46-L52

@yguedidi
Copy link

@dorrogeray for some context, the recent changes to the way cache works were made in rectorphp/rector-src#3614 to solve #7770
Before files were cached before processing and so in case of crash or manual stop they were not taken into account in a following run.
Now we cache files after processing, which is way more reliable and actually fix the issue.
But with --dry-run as the the the files are not changes in any way by definition, there is nothing to cache, so that as @samsonasik mentioned a following run gives the exact same result.

Important note:
a single run of Rector may not apply all possible changes! And it's the desired behavior by maintainers (see my attempt of full processing in rectorphp/rector-src#3724 for more info)
so in order to preserve both the existing behaviors regarding dry run and need for consecutive changing runs, the files that are cached are the ones that are not changed by Rector.

@samsonasik by writing the above I realize, as full processing got rejected, and so we cache only unchanged files, maybe we can also cache those files in dry run mode?
what do you think?

@dorrogeray
Copy link
Author

@samsonasik by writing the above I realize, as full processing got rejected, and so we cache only unchanged files, maybe we can also cache those files in dry run mode? what do you think?

I think that this would be the right solution. From the use-case perspective I have described, if I have empty cache, and run rector with --dry-run twice, and no files require any changes, the second --dry-run should "essentially do nothing" because it will have everything cached from the first run. Currently, both dry runs run from scratch.

@samsonasik
Copy link
Member

That will confusing if first run show the diff and second run not show anything, that was the behaviour that fixed in the past.

Cache improvement is fine as far as it doesn't make bad DX.

@yguedidi
Copy link

@samsonasik a file showing a diff is a changed file, and so it will not be cached. so in the second run it should appear again

@samsonasik
Copy link
Member

That's what I meant, first and second run should show the same result. That's the current e2e show.

@samsonasik
Copy link
Member

Here how cache works:

  • multiple run with --dry-run should show same result, which current implementation works (improvement is fine, as far as first and second run show same result).

  • multiple run without --dry-run apply change file, consecutively run to invalidate the file that last changed, so it can be changed again, which current implementation works (improvement is fine, as far as second run can continue what changed in the last run).

@dorrogeray
Copy link
Author

Thanks for the feedback, I will try to prepare a PR.

@yguedidi
Copy link

I let you take it @dorrogeray then! Feel free to ping me when ready

@dorrogeray
Copy link
Author

@yguedidi I have prepared a PR here: rectorphp/rector-src#3834

@dorrogeray
Copy link
Author

Fixed in rectorphp/rector-src#4281

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants