Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow defining specific files to check during overcommit --run #677

Closed
HarlemSquirrel opened this issue Oct 10, 2019 · 5 comments
Closed

Comments

@HarlemSquirrel
Copy link
Contributor

I know we can run pre-commit hooks on all files with https://github.com/sds/overcommit#continuous-integration

It would be great to be able to do this just on modified files and lines based on the commits ahead of the target branch in a pull request.

I'm working with some legacy code that was never linted and so this would help with incrementally improving the code base without needing to use a service like CodeFactor.

@sds
Copy link
Owner

sds commented Oct 11, 2019

Open to a pull request that adds support to the --run flag to allow specifying a list of files to scope the check to. You could then write your own code to determine which files to pass through in your CI run.

For example, if you want to compare the branch to origin/master:

git --no-pager diff --name-only origin/master | xargs overcommit --run

Allowing the user to pass in the set of files makes this solution more flexible for different workflows/filtering methodologies.

@sds sds changed the title Continuous integration run on changes files/lines Allow defining specific files to check during overcommit --run Oct 11, 2019
@cihati
Copy link

cihati commented Jun 29, 2020

My situation is exactly what @HarlemSquirrel is experiencing -- @sds your suggestion would be super useful. I have 0 Ruby experience (and am quite swamped with early stage startup life =), otherwise I'd love to create a PR.

@HarlemSquirrel
Copy link
Contributor Author

I've been using this bash script with pretty good success.

bundle exec rubocop --fail-level error | bash bin/diff_filter.sh

I had another thought about how this could work since overcommit already knows how to filter modified lines of files. A new --run-for-diff flag could take one argument that is a branch, tag, or commit hash to determine the modified files and lines as it does for pre-commit hooks.

bundle exec overcommit --run-for-diff master

@sds sds closed this as not planned Won't fix, can't repro, duplicate, stale Aug 20, 2022
@johnnymo87
Copy link

Yes, I really wish bundle exec overcommit --run-for-diff master existed! Overcommit can't seriously claim that it's useful on CI if it doesn't have the ability to scope its linting exclusively to the diff of the PR.

Fortunately, the scripts/diff_filter.sh shared above works quite nicely.

One note on the script -- it has a small bug where if app/models/recipe.rb:3 is in the PR diff, the script will match rubocop errors starting with app/models/recipe.rb:3 but also rubocop errors starting with e.g. app/models/recipe.rb:30. The solution I came up with is to match rubocop output more specifically using the colon that separates the line number from the column number that follows it (e.g. app/models/recipe.rb:3:2 and app/models/recipe.rb:30:4).

commit f747358719b718df8719749944dfc9744e9ab560
Author: Jonathan Mohrbacher <johnnymo87@gmail.com>
Date:   Thu Aug 31 11:30:11 2023 -0400

    Don't match lines that start with same line number

diff --git a/scripts/diff_filter.sh b/scripts/diff_filter.sh
index 861dcba1..9f9428b6 100755
--- a/scripts/diff_filter.sh
+++ b/scripts/diff_filter.sh
@@ -24,7 +24,7 @@ done
 err_count=0
 while read -r ln; do
   for m in ${diff_list[@]}; do
-    if [[ ${ln} =~ ^$m ]]; then
+    if [[ ${ln} =~ ^$m: ]]; then
       echo $ln
       err_count=$((err_count+1))
       break

@johnnymo87
Copy link

johnnymo87 commented Sep 1, 2023

Ah shoot, there's actually a problem with this script.

The script looks at git blame on the files in the diff, and extract the lines from the git blame output that match the commits in the PR. The problem is that if you make a change in one commit and then reverse it then next, the script will still consider this line, and you'll get linting errors about it even though it no longer appears in the PR diff.

As a solution, I threw out the git blame part of the approach and replaced it with the scripts/git-diff-changed-lines.sh and scripts/git-difftool-changed-lines.sh files from this gist (with one minor tweak).

diff --git c/scripts/diff_filter.sh i/scripts/diff_filter.sh
index 9f9428b6..30750175 100755
--- c/scripts/diff_filter.sh
+++ i/scripts/diff_filter.sh
@@ -10,16 +10,11 @@
 BASE_REMOTE=origin
 BASE_BRANCH=master
 
-diff_list=()
-commit_list=$(git --no-pager log --no-merges $BASE_REMOTE/$BASE_BRANCH...HEAD | grep -e '^commit' | sed -e "s/^commit \(.\{8\}\).*/\1/")
-for f in $(git --no-pager diff $BASE_REMOTE/$BASE_BRANCH...HEAD --name-only --diff-filter=AM); do
-  for c in $commit_list; do
-    diffs=$(git --no-pager blame --show-name -s $f | grep $c | sed -e "s/^[^ ]* *\([^ ]*\) *\([0-9]*\)*).*$/\1:\2/")
-    for ln in $diffs; do
-      diff_list+=( $ln )
-    done
-  done
-done
+# Get the directory of the current script
+DIR="$(dirname "$0")"
+
+# Directly assign the output to the array
+IFS=$'\n' read -r -d '' -a diff_list < <("$DIR/git-diff-changed-lines.sh" $BASE_REMOTE/$BASE_BRANCH...HEAD && printf '\0')
 
 err_count=0
 while read -r ln; do

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants