New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add the CLI option to fetch only recently changed files for formating. #1416
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stremlenye
Thank you for your contribution! It would be great to add this option.
I left some comments on your PR, could you check them out?
@@ -134,13 +135,18 @@ object CliArgParser { | |||
.text( | |||
"""migrate .scalafmt CLI style configuration to hocon style configuration in .scalafmt.conf""" | |||
) | |||
opt[Unit]("diff") | |||
.action((_, c) => c.copy(diff = Some("master"))) | |||
.text("If set, only format edited files in git diff against master.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you keep --diff
option available (as an alias of --mode diff
) to avoid breaking changes?
In that case, the description of --diff
would be sufficient with something like Alias to "--mode diff"
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Makes sense.
opt[String]("diff-branch") | ||
.action((branch, c) => c.copy(diff = Some(branch))) | ||
.text( | ||
"If set, only format edited files in git diff against provided branch." | ||
"If set, only format edited files in git diff against provided branch. Has no effect if mode set to `changed`." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
|
||
private def getFileFromGitStatusLine(s: String): Option[File] = | ||
s.substring(statusCodesSubstringLength) | ||
.split(moveStatusArrowDelimiter) match { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my environment (details below), this method sometimes failed to extract the file correctly.
$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.2 LTS"
$ git version
git version 2.21.0
Because, in some cases, the status line shows something like following (status codes occupy only one column), and, as a result, s.substring(3)
returns calafmt-core/shared/src/main/scala/org/scalafmt/util/GitOps.scala
, instead of scalafmt-core/shared/src/main/scala/org/scalafmt/util/GitOps.scala
.
M scalafmt-core/shared/src/main/scala/org/scalafmt/util/GitOps.scala
I'm not sure why this is happening ... (sry)
What do you think about extracting the file names by this way instead?
s.trim.split(" ").tail.mkString
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That inconsistency is quite unfortunate.
Using the split
to parse the path out was my initial take on the problem but it fails if the path has white spaces in it.
If you don't mind that would help a lot if you could run the script below and share the output of gitlog
file from the status_test
directory.
#!/usr/bin/env bash
function commit() {
commit_message="$1"
git add .
echo "$commit_message" >> gitlog
echo "-------git status-------" >> gitlog
echo "$(git status)" >> gitlog
echo "-------git status --porcelain-------" >> gitlog
echo "$(git status --porcelain)" >> gitlog
echo "-------git status --porcelain=1-------" >> gitlog
echo "$(git status --porcelain=1)" >> gitlog
echo "-------git status --porcelain=2-------" >> gitlog
echo "$(git status --porcelain=2)" >> gitlog
echo "-------git status -z-------" >> gitlog
echo "$(git status -z)" >> gitlog
echo "-------git status -z --porcelain=1-------" >> gitlog
echo "$(git status -z --porcelain=1)" >> gitlog
echo "-------git status -z --porcelain=2-------" >> gitlog
echo "$(git status -z --porcelain=2)" >> gitlog
echo "--------------" >> gitlog
echo "--------------" >> gitlog
git commit -am ${commit_message}
}
test_dir="status_test"
mkdir ${test_dir}
cd ${test_dir}
git init .
touch gitlog
commit "Init"
mkdir dir1
mkdir dir2
mkdir "dir 1"
echo "blah blah" > dir1/file1.txt
echo "blah blah" > dir2/file2.txt
echo "blah blah" > dir\ 1/file3.txt
commit "c1"
echo "blah blah" >> dir1/file1.txt
mv dir2/file2.txt dir\ 1/file2.txt
rm dir\ 1/file3.txt
commit "c2"
import org.scalatest._ | ||
import org.scalactic.source.Position | ||
|
||
import scala.reflect.io.Directory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not used?
@tanishiking |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for addressing the issues :) This worked fine even on my environment 👍
It's almost LGTM but I left some nitpick comments.
By the way, I ran the script you wrote and I got the following result. It's odd, it looks fine when I run git status --porcelain
directly, but it reproduces only when I run it through Scala. (This might be the problem of my working environment ...) 🤔
cat status_test/gitlog
Init
-------git status-------
On branch master
No commits yet
Changes to be committed:
(use "git rm --cached <file>..." to unstage)
new file: gitlog
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: gitlog
-------git status --porcelain-------
AM gitlog
-------git status --porcelain=1-------
AM gitlog
-------git status --porcelain=2-------
1 AM N... 000000 100644 100644 0000000000000000000000000000000000000000 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 gitlog
-------git status -z-------
AM gitlog
-------git status -z --porcelain=1-------
AM gitlog
-------git status -z --porcelain=2-------
1 AM N... 000000 100644 100644 0000000000000000000000000000000000000000 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 gitlog
--------------
--------------
c1
-------git status-------
On branch master
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
new file: dir 1/file3.txt
new file: dir1/file1.txt
new file: dir2/file2.txt
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: gitlog
-------git status --porcelain-------
A "dir 1/file3.txt"
A dir1/file1.txt
A dir2/file2.txt
M gitlog
-------git status --porcelain=1-------
A "dir 1/file3.txt"
A dir1/file1.txt
A dir2/file2.txt
M gitlog
-------git status --porcelain=2-------
1 A. N... 000000 100644 100644 0000000000000000000000000000000000000000 30f21dd6e547530a30543f0c40089f102b8afe0d dir 1/file3.txt
1 A. N... 000000 100644 100644 0000000000000000000000000000000000000000 30f21dd6e547530a30543f0c40089f102b8afe0d dir1/file1.txt
1 A. N... 000000 100644 100644 0000000000000000000000000000000000000000 30f21dd6e547530a30543f0c40089f102b8afe0d dir2/file2.txt
1 .M N... 100644 100644 100644 641879f5fb028620aed44fad54345120a9500fae 641879f5fb028620aed44fad54345120a9500fae gitlog
-------git status -z-------
A dir 1/file3.txtA dir1/file1.txtA dir2/file2.txt M gitlog
-------git status -z --porcelain=1-------
A dir 1/file3.txtA dir1/file1.txtA dir2/file2.txt M gitlog
-------git status -z --porcelain=2-------
1 A. N... 000000 100644 100644 0000000000000000000000000000000000000000 30f21dd6e547530a30543f0c40089f102b8afe0d dir 1/file3.txt1 A. N... 000000 100644 100644 0000000000000000000000000000000000000000 30f21dd6e547530a30543f0c40089f102b8afe0d dir1/file1.txt1 A. N... 000000 100644 100644 0000000000000000000000000000000000000000 30f21dd6e547530a30543f0c40089f102b8afe0d dir2/file2.txt1 .M N... 100644 100644 100644 641879f5fb028620aed44fad54345120a9500fae 641879f5fb028620aed44fad54345120a9500fae gitlog
--------------
--------------
c2
-------git status-------
On branch master
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
renamed: dir2/file2.txt -> dir 1/file2.txt
deleted: dir 1/file3.txt
modified: dir1/file1.txt
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: gitlog
-------git status --porcelain-------
R dir2/file2.txt -> "dir 1/file2.txt"
D "dir 1/file3.txt"
M dir1/file1.txt
M gitlog
-------git status --porcelain=1-------
R dir2/file2.txt -> "dir 1/file2.txt"
D "dir 1/file3.txt"
M dir1/file1.txt
M gitlog
-------git status --porcelain=2-------
2 R. N... 100644 100644 100644 30f21dd6e547530a30543f0c40089f102b8afe0d 30f21dd6e547530a30543f0c40089f102b8afe0d R100 dir 1/file2.txt dir2/file2.txt
1 D. N... 100644 000000 000000 30f21dd6e547530a30543f0c40089f102b8afe0d 0000000000000000000000000000000000000000 dir 1/file3.txt
1 M. N... 100644 100644 100644 30f21dd6e547530a30543f0c40089f102b8afe0d 34408573c0d09495ebf32aca7c7fd50fd958a3ec dir1/file1.txt
1 .M N... 100644 100644 100644 a47dfb0eaf9706be2d1d8fbae10e022d36c0defb a47dfb0eaf9706be2d1d8fbae10e022d36c0defb gitlog
-------git status -z-------
R dir 1/file2.txtdir2/file2.txtD dir 1/file3.txtM dir1/file1.txt M gitlog
-------git status -z --porcelain=1-------
R dir 1/file2.txtdir2/file2.txtD dir 1/file3.txtM dir1/file1.txt M gitlog
-------git status -z --porcelain=2-------
2 R. N... 100644 100644 100644 30f21dd6e547530a30543f0c40089f102b8afe0d 30f21dd6e547530a30543f0c40089f102b8afe0d R100 dir 1/file2.txtdir2/file2.txt1 D. N... 100644 000000 000000 30f21dd6e547530a30543f0c40089f102b8afe0d 0000000000000000000000000000000000000000 dir 1/file3.txt1 M. N... 100644 100644 100644 30f21dd6e547530a30543f0c40089f102b8afe0d 34408573c0d09495ebf32aca7c7fd50fd958a3ec dir1/file1.txt1 .M N... 100644 100644 100644 a47dfb0eaf9706be2d1d8fbae10e022d36c0defb a47dfb0eaf9706be2d1d8fbae10e022d36c0defb gitlog
--------------
--------------
private final val renameStatusCode = "R" | ||
private final val renameStatusArrowDelimiter = "-> " | ||
|
||
private def extractPathPart(s: String): Try[String] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add some comments that explain the behavior (and include a reference to https://git-scm.com/docs/git-status#_short_format)?
Now it's a bit complicated and worth explaining.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure 👍
pathRaw <- extractPathPart(s) | ||
path = trimQuotes(pathRaw) | ||
file <- Try(new File(path)) | ||
} yield file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove the Try
block from Try(new File(path))
and change it to following? I think we don't have to catch the exceptions for new File
because File
constructor throws exception only when path
is null
// ...
path = trimQuotes(pathRaw)
} yield new File(path)
I honestly have zero ideas why different environments act different in there ways of treating parsed lines but could be something related to the way the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!! Thank you so much (and sry for my late reply)
No worries and thanks for the review ✋ |
The premise is the desire to reduce testing/reformatting time by limiting the workload surface.
While working as part of the team on the same repo the master's head could go quite significantly farther potentially enlarging the
diff
which causes the formatting slowdown. The impact is quite dramatic if thescalafmt --diff --test
is used as part of pre-commit hook.Current PR revisits the
--diff
option turning it into--mode [diff/changed]
to switch between two modes.While
--mode diff
preserves the behaviour ofscalafmt --diff
the--mode changed
utilises thegit status
to fetch the files for formatting limiting the surface down to the very minimum of the recent changes.