Determine modified files for pull requests and pushes #24

Merged
merged 15 commits into from
Dec 17, 2021


@adangel adangel commented Dec 10, 2021

Determine modified files for pull requests ✔️
That's the first step. The modified files are determined via pulls/{pull_number}/files.
This API call returns at most 3000 files. We use a page size of 30 and collect at most
10 pages, so at most 300 files are used for the analysis.
For big changes, this means that not all modified files might be analyzed.
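
The page cap described above can be sketched roughly as follows. This is a minimal illustration, not the action's actual code: `fetch_page` is a hypothetical callback standing in for the real API request, and `make_fake_fetcher` only simulates a pull request with a given number of files.

```python
from typing import Callable, List

PER_PAGE = 30   # page size mentioned in the description
MAX_PAGES = 10  # page cap mentioned in the description


def collect_modified_files(
    fetch_page: Callable[[int, int], List[str]],
    per_page: int = PER_PAGE,
    max_pages: int = MAX_PAGES,
) -> List[str]:
    """Collect file names page by page, stopping at the page cap.

    `fetch_page(page, per_page)` is a hypothetical stand-in for the
    real API request; it returns the file names on that page.
    """
    files: List[str] = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(page, per_page)
        files.extend(batch)
        if len(batch) < per_page:
            break  # a short page means there is nothing left to fetch
    return files


def make_fake_fetcher(total: int) -> Callable[[int, int], List[str]]:
    """Simulate a pull request that touches `total` files."""
    names = [f"src/File{i}.java" for i in range(total)]

    def fetch_page(page: int, per_page: int) -> List[str]:
        start = (page - 1) * per_page
        return names[start:start + per_page]

    return fetch_page
```

With these assumptions, a simulated pull request of 45 files yields all 45 names after two pages, while one with 1000 files is cut off at 10 x 30 = 300 names.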

Determine modified files for pushes ✔️
Tested and working, especially the case where two commits modify the same file: GitHub already combines these in the "files" property.

Handle cases when modified files can't be determined ✔️
For events other than push/pull_request (e.g. workflow_dispatch), no modified files are determined. The
action then behaves as if the input parameter "analyzeModifiedFilesOnly" were set to "false".
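
As a rough sketch of this fallback (the function and constant names are hypothetical and not taken from the action's source), the decision could look like this:

```python
# Events for which modified files can be determined, per the description.
SUPPORTED_EVENTS = {"push", "pull_request"}


def effective_modified_only(event_name: str, analyze_modified_files_only: bool) -> bool:
    """Return True only if modified-files mode is requested AND possible.

    For any other event (e.g. workflow_dispatch) this returns False, which
    is equivalent to running with analyzeModifiedFilesOnly set to "false".
    """
    return analyze_modified_files_only and event_name in SUPPORTED_EVENTS
```

For example, `effective_modified_only("workflow_dispatch", True)` is `False`, so all files would be analyzed.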

Handle too many modified files ✔️
A warning is now issued when there are too many modified files. That's probably enough for now, since the number of annotations that can be created per workflow run is limited as well (see https://github.com/actions/toolkit/blob/main/docs/problem-matchers.md). The more files are modified, the more problems can be detected, which makes it more likely to hit the annotation limit.

Log output for this mode ✔️
This is done.
The build log now states which mode is active (analyzing all files or only modified files).
The output is e.g. "Running PMD 6.41.0 on 2 modified files..." vs. "Running PMD 6.41.0 on all files in path src/main/java...".
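
A minimal sketch of how such a message might be built, assuming a hypothetical helper `describe_run` that receives `None` when all files are analyzed:

```python
from typing import Optional, Sequence


def describe_run(
    pmd_version: str,
    source_path: str,
    modified_files: Optional[Sequence[str]],
) -> str:
    """Build the log line for the active mode.

    `modified_files` is None when all files under `source_path` are
    analyzed; otherwise it holds the modified files to analyze.
    """
    if modified_files is None:
        return f"Running PMD {pmd_version} on all files in path {source_path}..."
    return f"Running PMD {pmd_version} on {len(modified_files)} modified files..."
```

The two message formats are copied from the examples quoted above; everything else is an assumption made for illustration.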

@adangel adangel added this to the next milestone Dec 10, 2021
Also log which mode is active.
This also sets the output parameter to zero.

Usually debug logging is enabled by adding a repository secret
"ACTIONS_STEP_DEBUG" with value "true". However, this
secret is not available if a PR runs from a forked repository...

Additionally run the unit tests on macOS and Windows.

The modified files are determined via the compare API call. Similar
to pull requests, we collect at most 10 pages. However, the
total number of changed files can't be calculated in a straightforward way,
since the pagination is based on commits rather than files.
In any case, for big pushes this means that not all files might
be analyzed.

This always logs a warning if we fetched page MAX_PAGE,
regardless of whether there would be more pages to fetch
or not. No effort is made to determine
whether the list of modified files is complete.

Also the modified files are now collected in
a Set to avoid duplicate entries.
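
The behavior described in this commit message could be sketched as below. This is only an illustration under stated assumptions: `fetch_page` is a hypothetical callback returning the file names found on one page, stopping on an empty page is an assumption, and the warning text is made up.

```python
from typing import Callable, List, Set, Tuple

MAX_PAGE = 10  # page cap; the constant name is taken from the commit message


def collect_unique_files(
    fetch_page: Callable[[int], List[str]],
    max_page: int = MAX_PAGE,
) -> Tuple[Set[str], List[str]]:
    """Collect file names into a set and warn once MAX_PAGE was fetched."""
    files: Set[str] = set()
    warnings: List[str] = []
    for page in range(1, max_page + 1):
        batch = fetch_page(page)
        if page == max_page:
            # Warn unconditionally: we do not check whether more pages exist.
            warnings.append(
                f"Fetched page {max_page}; the list of modified files may be incomplete."
            )
        if not batch:
            break  # assumption: an empty page ends the listing
        files.update(batch)  # the set silently drops duplicate names
    return files, warnings
```

A file that shows up on several pages is stored only once, and the warning appears exactly when the last allowed page was requested.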
@adangel adangel marked this pull request as ready for review December 16, 2021 10:58
@adangel adangel merged commit 0601cf7 into pmd:main Dec 17, 2021
@adangel adangel deleted the analyze-modified-files-only branch December 17, 2021 09:27
Successfully merging this pull request may close these issues.

Support analyzing only modified files