Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the CLI option to fetch only recently changed files for formating. #1416

Merged
merged 6 commits into from May 14, 2019
Merged

Add the CLI option to fetch only recently changed files for formating. #1416

merged 6 commits into from May 14, 2019

Conversation

stremlenye
Copy link
Contributor

The premise is the desire to reduce testing/reformatting time by limiting the workload surface.
While working as part of the team on the same repo the master's head could go quite significantly farther potentially enlarging the diff which causes the formatting slowdown. The impact is quite dramatic if the scalafmt --diff --test is used as part of pre-commit hook.

Current PR revisits the --diff option turning it into --mode [diff/changed] to switch between two modes.
While --mode diff preserves the behaviour of scalafmt --diff the --mode changed utilises the git status to fetch the files for formatting limiting the surface down to the very minimum of the recent changes.

@tanishiking tanishiking self-requested a review May 4, 2019 07:54
Copy link
Member

@tanishiking tanishiking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stremlenye
Thank you for your contribution! It would be great to add this option.
I left some comments on your PR, could you check them out?

@@ -134,13 +135,18 @@ object CliArgParser {
.text(
"""migrate .scalafmt CLI style configuration to hocon style configuration in .scalafmt.conf"""
)
opt[Unit]("diff")
.action((_, c) => c.copy(diff = Some("master")))
.text("If set, only format edited files in git diff against master.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you keep --diff option available (as an alias of --mode diff) to avoid breaking changes?
In that case, the description of --diff would be sufficient with something like Alias to "--mode diff".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Makes sense.

opt[String]("diff-branch")
.action((branch, c) => c.copy(diff = Some(branch)))
.text(
"If set, only format edited files in git diff against provided branch."
"If set, only format edited files in git diff against provided branch. Has no effect if mode set to `changed`."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍


private def getFileFromGitStatusLine(s: String): Option[File] =
s.substring(statusCodesSubstringLength)
.split(moveStatusArrowDelimiter) match {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my environment (details below), this method sometimes failed to extract the file correctly.

$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.2 LTS"

$ git version
git version 2.21.0

Because, in some cases, the status line shows something like following (status codes occupy only one column), and, as a result, s.substring(3) returns calafmt-core/shared/src/main/scala/org/scalafmt/util/GitOps.scala, instead of scalafmt-core/shared/src/main/scala/org/scalafmt/util/GitOps.scala.

M scalafmt-core/shared/src/main/scala/org/scalafmt/util/GitOps.scala

I'm not sure why this is happening ... (sry)


What do you think about extracting the file names by this way instead?

s.trim.split(" ").tail.mkString

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That inconsistency is quite unfortunate.
Using the split to parse the path out was my initial take on the problem but it fails if the path has white spaces in it.

If you don't mind that would help a lot if you could run the script below and share the output of gitlog file from the status_test directory.

#!/usr/bin/env bash

function commit() {
  commit_message="$1"
  git add .
  echo "$commit_message" >> gitlog
  echo "-------git status-------" >> gitlog
  echo "$(git status)" >> gitlog
  echo "-------git status --porcelain-------" >> gitlog
  echo "$(git status --porcelain)" >> gitlog
  echo "-------git status --porcelain=1-------" >> gitlog
  echo "$(git status --porcelain=1)" >> gitlog
  echo "-------git status --porcelain=2-------" >> gitlog
  echo "$(git status --porcelain=2)" >> gitlog
  echo "-------git status -z-------" >> gitlog
  echo "$(git status -z)" >> gitlog
  echo "-------git status -z --porcelain=1-------" >> gitlog
  echo "$(git status -z --porcelain=1)" >> gitlog
  echo "-------git status -z --porcelain=2-------" >> gitlog
  echo "$(git status -z --porcelain=2)" >> gitlog
  echo "--------------" >> gitlog
  echo "--------------" >> gitlog
  git commit -am ${commit_message}
}

test_dir="status_test"

mkdir ${test_dir}
cd ${test_dir}
git init .
touch gitlog
commit "Init"
mkdir dir1
mkdir dir2
mkdir "dir 1"
echo "blah blah" > dir1/file1.txt
echo "blah blah" > dir2/file2.txt
echo "blah blah" > dir\ 1/file3.txt
commit "c1"

echo "blah blah" >> dir1/file1.txt
mv dir2/file2.txt  dir\ 1/file2.txt
rm dir\ 1/file3.txt
commit "c2"

import org.scalatest._
import org.scalactic.source.Position

import scala.reflect.io.Directory
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not used?

@stremlenye
Copy link
Contributor Author

@tanishiking
I've made few changes which suppose to address the issue you had with path parsing as well as the backwards compatibility. Could you please validate them on your system? I was not able to reproduce the case neither on my machine nor on my peer's Ubuntu one.

Copy link
Member

@tanishiking tanishiking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for addressing the issues :) This worked fine even on my environment 👍
It's almost LGTM but I left some nitpick comments.

By the way, I ran the script you wrote and I got the following result. It's odd, it looks fine when I run git status --porcelain directly, but it reproduces only when I run it through Scala. (This might be the problem of my working environment ...) 🤔

cat status_test/gitlog
Init
-------git status-------
On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

        new file:   gitlog

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   gitlog
-------git status --porcelain-------
AM gitlog
-------git status --porcelain=1-------
AM gitlog
-------git status --porcelain=2-------
1 AM N... 000000 100644 100644 0000000000000000000000000000000000000000 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 gitlog
-------git status -z-------
AM gitlog
-------git status -z --porcelain=1-------
AM gitlog
-------git status -z --porcelain=2-------
1 AM N... 000000 100644 100644 0000000000000000000000000000000000000000 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 gitlog
--------------
--------------
c1
-------git status-------
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        new file:   dir 1/file3.txt
        new file:   dir1/file1.txt
        new file:   dir2/file2.txt

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   gitlog
-------git status --porcelain-------
A  "dir 1/file3.txt"
A  dir1/file1.txt
A  dir2/file2.txt
 M gitlog
-------git status --porcelain=1-------
A  "dir 1/file3.txt"
A  dir1/file1.txt
A  dir2/file2.txt
 M gitlog
-------git status --porcelain=2-------
1 A. N... 000000 100644 100644 0000000000000000000000000000000000000000 30f21dd6e547530a30543f0c40089f102b8afe0d dir 1/file3.txt
1 A. N... 000000 100644 100644 0000000000000000000000000000000000000000 30f21dd6e547530a30543f0c40089f102b8afe0d dir1/file1.txt
1 A. N... 000000 100644 100644 0000000000000000000000000000000000000000 30f21dd6e547530a30543f0c40089f102b8afe0d dir2/file2.txt
1 .M N... 100644 100644 100644 641879f5fb028620aed44fad54345120a9500fae 641879f5fb028620aed44fad54345120a9500fae gitlog
-------git status -z-------
A  dir 1/file3.txtA  dir1/file1.txtA  dir2/file2.txt M gitlog
-------git status -z --porcelain=1-------
A  dir 1/file3.txtA  dir1/file1.txtA  dir2/file2.txt M gitlog
-------git status -z --porcelain=2-------
1 A. N... 000000 100644 100644 0000000000000000000000000000000000000000 30f21dd6e547530a30543f0c40089f102b8afe0d dir 1/file3.txt1 A. N... 000000 100644 100644 0000000000000000000000000000000000000000 30f21dd6e547530a30543f0c40089f102b8afe0d dir1/file1.txt1 A. N... 000000 100644 100644 0000000000000000000000000000000000000000 30f21dd6e547530a30543f0c40089f102b8afe0d dir2/file2.txt1 .M N... 100644 100644 100644 641879f5fb028620aed44fad54345120a9500fae 641879f5fb028620aed44fad54345120a9500fae gitlog
--------------
--------------
c2
-------git status-------
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        renamed:    dir2/file2.txt -> dir 1/file2.txt
        deleted:    dir 1/file3.txt
        modified:   dir1/file1.txt

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   gitlog
-------git status --porcelain-------
R  dir2/file2.txt -> "dir 1/file2.txt"
D  "dir 1/file3.txt"
M  dir1/file1.txt
 M gitlog
-------git status --porcelain=1-------
R  dir2/file2.txt -> "dir 1/file2.txt"
D  "dir 1/file3.txt"
M  dir1/file1.txt
 M gitlog
-------git status --porcelain=2-------
2 R. N... 100644 100644 100644 30f21dd6e547530a30543f0c40089f102b8afe0d 30f21dd6e547530a30543f0c40089f102b8afe0d R100 dir 1/file2.txt    dir2/file2.txt
1 D. N... 100644 000000 000000 30f21dd6e547530a30543f0c40089f102b8afe0d 0000000000000000000000000000000000000000 dir 1/file3.txt
1 M. N... 100644 100644 100644 30f21dd6e547530a30543f0c40089f102b8afe0d 34408573c0d09495ebf32aca7c7fd50fd958a3ec dir1/file1.txt
1 .M N... 100644 100644 100644 a47dfb0eaf9706be2d1d8fbae10e022d36c0defb a47dfb0eaf9706be2d1d8fbae10e022d36c0defb gitlog
-------git status -z-------
R  dir 1/file2.txtdir2/file2.txtD  dir 1/file3.txtM  dir1/file1.txt M gitlog
-------git status -z --porcelain=1-------
R  dir 1/file2.txtdir2/file2.txtD  dir 1/file3.txtM  dir1/file1.txt M gitlog
-------git status -z --porcelain=2-------
2 R. N... 100644 100644 100644 30f21dd6e547530a30543f0c40089f102b8afe0d 30f21dd6e547530a30543f0c40089f102b8afe0d R100 dir 1/file2.txtdir2/file2.txt1 D. N... 100644 000000 000000 30f21dd6e547530a30543f0c40089f102b8afe0d 0000000000000000000000000000000000000000 dir 1/file3.txt1 M. N... 100644 100644 100644 30f21dd6e547530a30543f0c40089f102b8afe0d 34408573c0d09495ebf32aca7c7fd50fd958a3ec dir1/file1.txt1 .M N... 100644 100644 100644 a47dfb0eaf9706be2d1d8fbae10e022d36c0defb a47dfb0eaf9706be2d1d8fbae10e022d36c0defb gitlog
--------------
--------------

private final val renameStatusCode = "R"
private final val renameStatusArrowDelimiter = "-> "

private def extractPathPart(s: String): Try[String] =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add some comments that explain the behavior (and include a reference to https://git-scm.com/docs/git-status#_short_format)?
Now it's a bit complicated and worth explaining.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure 👍

pathRaw <- extractPathPart(s)
path = trimQuotes(pathRaw)
file <- Try(new File(path))
} yield file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove the Try block from Try(new File(path)) and change it to following? I think we don't have to catch the exceptions for new File because File constructor throws exception only when path is null

  // ...
  path = trimQuotes(pathRaw)
} yield new File(path)

@stremlenye
Copy link
Contributor Author

I honestly have zero ideas why different environments act different in there ways of treating parsed lines but could be something related to the way the Process treats the output. The one could be merely trimming the line like M path/to/file.scala ending up with M path/to/file.scala.
Anyway whatever – if it works, it works 😄

Copy link
Member

@tanishiking tanishiking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!! Thank you so much (and sry for my late reply)

@tanishiking tanishiking merged commit 0e1b579 into scalameta:master May 14, 2019
@stremlenye
Copy link
Contributor Author

No worries and thanks for the review ✋

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants