Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

git comparator reporting incorrect deltas #52

Open
alexkli opened this issue Dec 3, 2020 · 11 comments
Open

git comparator reporting incorrect deltas #52

alexkli opened this issue Dec 3, 2020 · 11 comments
Labels
bug Something isn't working
Milestone

Comments

@alexkli
Copy link
Contributor

alexkli commented Dec 3, 2020

In these conditions the git comparator did only report a 0.1% increase with the after size:

  • circleci
  • lots of files added

Reported:

+0.1% (93 MB => 93.1 MB)

but it should have been something like this:

+150% (37 MB => 93.1 MB)
@alexkli alexkli added the bug Something isn't working label Dec 3, 2020
@alexkli alexkli added this to the 1.3.0 milestone Dec 23, 2020
@alexkli alexkli changed the title git comparator not properly working git comparator not Dec 29, 2020
@alexkli alexkli changed the title git comparator not git comparator reporting incorrect deltas Dec 29, 2020
@alexkli
Copy link
Contributor Author

alexkli commented Apr 29, 2022

Possible example: https://github.com/adobe/node-fetch-retry/runs/6223100946?check_suite_focus=true

Has this unknown revision error, but still reports git size difference in the end.

Calculating size changes for
- git
fatal: ambiguous argument 'master..a5d5ec9c2eed8751a4e95700006dbb649ff1baad': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
- node_modules
...

@alexkli
Copy link
Contributor Author

alexkli commented Apr 29, 2022

Unfortunately it seems this has been exaggerated with the changes of #72 - moving to a pure local cloning for the after folder. Previously we had local cloning in some cases (some CIs) I think.

I believe the local cloning that git does optimizes storage by using hard links etc., so it might always measure smaller.

For a "clean slate" checkout comparison – which the git comparator is supposed to report on – we'd have to do a clean checkout without optimizations. However, that is not trivial with the optimized checkout that CIs do... (see #72 and #63).

@charIeszhao
Copy link

Hey, I tried this promising tool and had the exact same issue. :-)
Will give it another try once you fix it. Thanks and please keep up the good work.

@alexkli
Copy link
Contributor Author

alexkli commented May 9, 2022

@demonzoo Unfortunately it seems not trivial to solve, given the fixes for #72. Most CIs do nasty tricks to pull/checkout only the actual current revision and no other branches (to make it fast), and that's the context that sizewatcher typically runs inside. With #72 we build on top of that optimization and thus make sizewatcher itself faster. A complete clean git checkout - two times with "before" and "after" state as needed by sizewatcher - can actually be very slow for larger repos...

Right now I'd be inclined to disable the git comparator, until this is solved, because it will always be wrong.

As for the solution, we might have to keep the git comparator opt-in then, and have it do its own clean checkout procedure (which has to be figured out). So that for cases where users don't care about git sizes, but others, that sizewatcher is not unnecessarily slow.

Also I believe that due to the optimizations that git itself does in its .git storage, the definition of a "clean checkout" at a given branch + revision is not very stable... making it hard to compare two versions. Maybe the only thing to look at is the delta - e.g. see if someone added many large files - and report on that, but not reporting a total size, as that might be slightly different every time anyway (even with the same repo state).

@omerXfaruq
Copy link

Hello @alexkli, I have a similar situation on this PR. We are trying to add sizewatcher to our repo's CI checks, is there a solution that we can apply?

@alexkli
Copy link
Contributor Author

alexkli commented May 25, 2022

@farukozderim The implementation of the git comparison is faulty at the moment. The only thing you can do right now is disable the git comparator through configuration, which will at least not confuse you with the incorrect reports (or block if you make it report as status check on PRs):

comparators:
  git: false

I hope to disable it by default in the next version as a quick fix, and then need to investigate further on how to measure an accurate git checkout delta, as it's actually not a trivial problem (see my above comment). Which might take an unknown amount of time, unfortunately.

@omerXfaruq
Copy link

@alexkli thanks for the reply, would it be enough if I just add these lines to the configuration file?

comparators:
  # set a comparator "false" to disable it
  git: false

@alexkli
Copy link
Contributor Author

alexkli commented May 26, 2022

Yes, that should work.

@omerXfaruq
Copy link

omerXfaruq commented May 26, 2022

@alexkli that fixed the error but it does not warn even if there is a file >1MB committed in the PR, any idea about this?


Run npx @adobe/sizewatcher
npx: installed 49 in 4.649s
Checking out git branches...
Comparing changes from 'main' to 'test-big-file-diff'

Calculating size changes for

Sizewatcher measured the following changes:

  'main' => 'test-big-file-diff' (sha aa0f7198311773ee26ac59703fd540bfb5f37e0f)


Done. Cleaning up...

This is my sizewatcher config:


report:
  # to report a github commit status (will block PR if it fails)
  # default: false
  githubStatus: true
  # to report a comment on the github PR
  # default: true
  githubComment: false

# global thresholds when to warn or fail a build
# note that one failing or warning comparator is enough to fail or warn
# can be either
# - percentage: "50%" ("-10%" for size decrease)
# - absolute limit, as byte string: "10 MB", "5 KB"
#   see https://www.npmjs.com/package/xbytes
# - absolute limit, as byte number: 1000000
limits:
  # when to fail - default: 100%
  fail: 10 MB
  # when to warn - default: 30%
  warn: 1 MB
  # below the ok limit you will get a cheers for making it notably smaller
  # default: -10%
  ok: -1 MB

comparators:
  # set a comparator "false" to disable it
  git: false

@alexkli
Copy link
Contributor Author

alexkli commented May 26, 2022

That's expected, if you turn off the git comparator it will not look at git changes (commits) at all. Sorry :-/

sizewatcher isn't limited to git changes, it also has other comparators including for nodejs etc. Hopefully more in the future.

@alexkli
Copy link
Contributor Author

alexkli commented May 26, 2022

I added a warning in the readme to avoid confusion: d6019de

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants