Improve isWorkingDirClean() check for projects with generated code #688

jbrunton · 2020-07-25T12:50:34Z

Fix for #687

I tried to add some additional test coverage, but for some reason the behavior which I can reliably reproduce in a node terminal (see notes in #687) was only inconsistently reproducible in the tests I tried. I'm not yet clear what it is about the testing environment that causes this, but the existing tests show that the change still works for existing cases, so perhaps that's good enough?

jbrunton · 2020-07-25T12:56:03Z

Just noticed the package.json file was updated for some reason. Let me fix that.

jbrunton · 2020-07-25T13:01:21Z

Fixed. I made an error rebasing the first time around.

webpro · 2020-07-26T11:15:01Z

Thanks @jbrunton

I was hoping we wouldn't need a write action (update index), even though it's only Git internals.

What about the options from https://gist.github.com/sindresorhus/3898739, such as:

git status --porcelain
git status -suno

I'm just challenging here; if I have the time, I'll try to actually set up some scenarios to test.

jbrunton · 2020-07-26T11:47:34Z

@webpro: you're right to challenge; Git is a complex beast!

git status -suno should be semantically equivalent – happy to rewrite using that.

jbrunton · 2020-07-26T12:00:01Z

I'll take some time to read up on those other options too. I'm now wondering if git diff --quiet would do the job too.

jbrunton · 2020-07-27T12:47:05Z

I did some further investigation, and I think git diff --quiet HEAD makes the most sense. I'm going to do a little more benchmarking and will report back.

jbrunton · 2020-07-27T12:58:09Z

@webpro oops, I forgot that gist you linked to already did the benchmarking. (I did some of my own, which probably wasn't necessary given the results in the gist, but the results appeared to be consistent.)

The commands I looked at were:

git diff --quiet HEAD
git update-index -q --refresh && git diff-index --quiet HEAD --
git status --porcelain --untracked-files=no
git status --short --untracked-files=no

As with the results in the gist, git diff --quiet HEAD was consistently the best performing. It's also the simplest to use, so that's what I went with.

Finally, to triple check the scenarios we want to solve for:

If there are any changes to tracked files (whether staged or unstaged) then the repo should be considered dirty.
If there are no changes to any tracked files, the repo should be considered clean.
Untracked files should be ignored.
If a file is touched but its content has not changed, then it should be considered unchanged.

The above commands all meet these criteria.
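
For illustration, the check described above could be sketched roughly as follows. This is not the code from lib/plugin/git/Git.js: the helper names (gitExitCode, exitCodeOf) and the use of Node's child_process in place of release-it's own exec wrapper are assumptions made only for this example.

```javascript
const { spawnSync } = require('child_process');

// Hypothetical helper: run git with the given arguments and return its exit code.
// spawnSync reports a status of null if git could not be started or was killed by a signal.
function gitExitCode(args, cwd) {
  const result = spawnSync('git', args, { cwd, stdio: 'ignore' });
  return result.status;
}

// `git diff --quiet HEAD` exits with 0 when tracked files match HEAD and with 1
// when there are staged or unstaged changes. Untracked files are not considered.
// Any other outcome (for example an error because cwd is not a repository) is
// treated as "not clean" in this sketch.
// `exitCodeOf` is injectable so the logic can be exercised without a real repository.
function isWorkingDirClean(cwd = process.cwd(), exitCodeOf = gitExitCode) {
  return exitCodeOf(['diff', '--quiet', 'HEAD'], cwd) === 0;
}
```

Calling isWorkingDirClean() with no arguments checks the current working directory. Treating every non-zero result as dirty is a conservative choice for a release tool, where a false "clean" is worse than a false "dirty".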

webpro · 2020-07-29T14:56:40Z

lib/plugin/git/Git.js

-    return this.exec('git diff-index --quiet HEAD --', { options }).then(
+    // Note: the update-index is required here for cases where files may be touched/recreated
+    // during a build/test hook. Git will mark those as potentially changed, but won't diff them
+    // until the index is updated. See also https://github.com/release-it/release-it/issues/687


I guess the comment can be removed again. Thanks for all the work! Will test this asap.

jbrunton (review reply, Jul 29, 2020): Whoops! Removed.

jbrunton · 2020-07-29T18:50:32Z

For reference, I also created some scripts for testing the suitability of different commands (to sense check I was reading the docs correctly!) and for benchmarking: https://github.com/jbrunton/test-git-dirty-checks

jbrunton · 2020-08-07T12:15:40Z

@webpro: just to confirm, I've addressed review comments. No hurry from my point of view, but let me know if you'd like a hand with testing (e.g. generating test cases).

webpro · 2020-08-08T09:01:53Z

Apologies for not getting back to you earlier, and thanks for the elaborate help! Really appreciated.

I'm trying to make a test case that fails with the current setup, and passes in this PR.

Existing passing test:

test.serial('should throw if working dir is not clean', async t => {
  const gitClient = factory(Git, { options: { git } });
  sh.exec('rm file');
  const expected = { instanceOf: GitCleanWorkingDirError, message: /Working dir must be clean/ };
  await t.throwsAsync(gitClient.init(), expected);
});

The test I had in mind on how I understand the issue:

sh.exec('touch file');

or maybe

sh.exec('rm file');
sh.exec('echo line > file');

However, these tests also pass with git diff-index --quiet HEAD -- (i.e. this command returns no diff output).

Maybe there's an issue with the result of the sh.exec commands. Am I missing something, are there any other git operations happening in the build command?

jbrunton · 2020-08-08T20:54:31Z

It's certainly a fun one – I've been finding the behavior quite inconsistent in some very consistent ways :) I haven't been able to figure out why, though.

If I run commands in the node terminal, I always see the previous method fail, and the new method work.

For example, running inside the node terminal in the release-it repo (or any node package):

node
Welcome to Node.js v12.18.0.
Type ".help" for more information.
> sh = require('shelljs')
> sh.exec('git diff-index --quiet HEAD').code
0
> sh.touch('package.json')
> sh.exec('git diff-index --quiet HEAD').code
1
> sh.exec('git diff-index --quiet HEAD').code
1
> sh.exec('git diff-index --quiet HEAD').code
1
> sh.exec('git diff --quiet HEAD').code
0
> sh.exec('git diff-index --quiet HEAD').code
0

However, if I run the same git commands in my terminal (i.e. invoking git directly), then I never see a failure:

> touch package.json
> git diff-index --quiet HEAD
> echo $?
0

When I run the test-commands.js script in the test-git-dirty-checks repo I created, I see the expected failures a majority of the time. (The script runs checks after touching a file 10 times on each run as the results are nondeterministic for diff-index, but on my laptop the diff-index check usually produces an exit code of 1 on about 7 or 8 of those runs.)

Starting test run for command: git update-index -q && git diff-index --quiet HEAD --
┌─────────┬───────────────────┬───────────────┬────────────────────────────┬─────────────────┬──────────────────────┐
│ (index) │     scenario      │  stdoutEmpty  │            code            │ expectedFailure │       failure        │
├─────────┼───────────────────┼───────────────┼────────────────────────────┼─────────────────┼──────────────────────┤
│    0    │   'clean repo'    │     true      │             0              │      false      │        false         │
│    1    │ 'untracked file'  │     true      │             0              │      false      │        false         │
│    2    │  'staged change'  │     true      │             1              │      true       │         true         │
│    3    │  'commited file'  │     true      │             0              │      false      │        false         │
│    4    │   'touch file'    │ 'always true' │ '<various> (3 x 0, 7 x 1)' │      false      │ '<nondeterministic>' │
│    5    │ 'unstaged change' │     true      │             1              │      true       │         true         │
└─────────┴───────────────────┴───────────────┴────────────────────────────┴─────────────────┴──────────────────────┘
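
A repeated touch-and-check run like the one summarized in the 'touch file' row could be approximated along these lines. This is a hypothetical sketch, not the test-commands.js script from the test-git-dirty-checks repo; touchFile, shellExitCode and tallyExitCodes are names invented for this example.

```javascript
const { spawnSync } = require('child_process');
const fs = require('fs');

// Update a file's access and modification times without changing its content,
// similar to what `touch` does for an existing file.
function touchFile(path) {
  const now = new Date();
  fs.utimesSync(path, now, now);
}

// Run a command line through the shell and return its exit code.
function shellExitCode(command) {
  return spawnSync(command, { shell: true, stdio: 'ignore' }).status;
}

// Call `touch` and then `check` for the given number of runs, and summarize how
// often each exit code was seen, e.g. "3 x 0, 7 x 1".
function tallyExitCodes(runs, touch, check) {
  const counts = new Map();
  for (let i = 0; i < runs; i++) {
    touch();
    const code = check();
    counts.set(code, (counts.get(code) || 0) + 1);
  }
  return [...counts.entries()]
    .sort(([a], [b]) => a - b)
    .map(([code, n]) => `${n} x ${code}`)
    .join(', ');
}

// Example usage inside a git repository that tracks package.json:
// console.log(tallyExitCodes(
//   10,
//   () => touchFile('package.json'),
//   () => shellExitCode('git diff-index --quiet HEAD --')
// ));
```

Swapping the command string for git diff --quiet HEAD gives a like-for-like comparison of the two checks on the same machine.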

Finally (scenario 4), when I tried to write a unit test for this, I occasionally saw the expected failure using git diff-index, but only in a minority of cases (about 1 in 10, I would estimate).

The other scenario I've tested (scenario 5) was what originally led me to this: when using release-it in a repo with a step that touches files as part of some preflight checks (another use case I recently encountered was a node package that runs a format check prior to release), I've pretty consistently seen failures using diff-index.

I've not been able to figure out why I see such different results depending on the environment (especially given that four of the five scenarios use node + shelljs but seem to produce different results). My own conclusion was that writing a unit test may not be feasible, but I could put together some small sample projects to demonstrate the differences through manual testing?

Also: it sounds like you've been looking into scenario 4 (unit tests). Out of curiosity, have you tried any of the other cases? EDIT: no particular need for you to test all the above, but I'd be curious what you see if you use a node terminal, and I can also put together a small example for scenario 5 (running release-it in a repo).

jbrunton · 2020-08-08T23:09:11Z

Small test case for scenario 5 (repo running release-it where the git checks fail): https://github.com/jbrunton/test-release-it-diff-check

EDIT: I realized that the auth check may fail if you try to clone and test this repo? I can add you as a collaborator if that simplifies testing.

webpro · 2020-08-15T06:09:59Z

Thank you for the extended research and tests, @jbrunton! I've tried a few things with the command directly and with your test case repo, and I'm convinced this PR is definitely an improvement.

jbrunton · 2020-08-15T09:29:32Z

That's great. Thanks for merging and releasing, @webpro!

jbrunton changed the title ~~Improve clean repo checks for projects with generated code~~ Improve clean repo check for projects with generated code Jul 25, 2020

jbrunton changed the title ~~Improve clean repo check for projects with generated code~~ Improve isWorkingDirClean() check for projects with generated code Jul 25, 2020

jbrunton force-pushed the improve-clean-repo-check branch from f7ed557 to 255caa0 Compare July 25, 2020 13:00

use git diff instead of diff-index to check for clean dir

0230f6f

jbrunton force-pushed the improve-clean-repo-check branch from 255caa0 to 0230f6f Compare July 27, 2020 12:43

webpro reviewed Jul 29, 2020

remove redundant comment

f33db9d

webpro merged commit 8cbac81 into release-it:master Aug 15, 2020
