Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add feature to only changes that were already modified (git) #1324

Open
vinipsmaker opened this issue Feb 19, 2017 · 11 comments
Open

Add feature to only changes that were already modified (git) #1324

vinipsmaker opened this issue Feb 19, 2017 · 11 comments

Comments

@vinipsmaker
Copy link

vinipsmaker commented Feb 19, 2017

Hi, could you add a feature to not produce noisy git diffs? Thanks.

[edited for tone, by nrc]

@nrc nrc added the fun! :) label Feb 19, 2017
@nrc
Copy link
Member

nrc commented Feb 19, 2017

If somebody wanted to work on this, then I'd be happy to take a PR. But it would be an interesting (big) project. It would mean adding knowledge of Git (and presumably Hg and other VCSs) to Rustfmt and would mean making the Rustfmt work on only sections of code (which it can't do today). It would also be imperfect because at a minimum we have to operate on a whole AST node, not literally just a chunk of text.

@vinipsmaker
Copy link
Author

It would also be imperfect because at a minimum we have to operate on a whole AST node, not literally just a chunk of text.

Why would it be imperfect? I don't get. You still should work on the AST. Just don't change the tokens/text that match the "sacred" lines.

@nrc
Copy link
Member

nrc commented Feb 19, 2017

Why would it be imperfect?

Because Rustfmt might change the number of lines or the content of lines outside the diff. In the former case, this makes it hard (perhaps not impossible, but hard - consider that git diff is trying to solve the same problem (matching changes to lines) and often gets it quite wrong. It is easier for Rustfmt since it is only a single change, but still very hard). The second change means it might not be possible to do this. Consider the following:

foo_bar(arg1, arg2,
        arg3, arg4);  // Only this line is in the diff.

Rustfmt has to work on the node whole node, and makes:

foo_bar(
    arg1,
    arg2,
    arg3,
    arg4,
};

If we restrict to the changed lines we get:

foo_bar(arg1, arg2,
    arg3,

Which doesn't compile. One can create examples that do compile but have different behaviour. You might suppose we could track the tokens we actually change, but this is not done and would be a huge change (and since we can move tokens as well as change the whitespace around them, is much harder than it appears). Finally, even if we managed, some how, to achieve all this, then hopefully the changing indents in the above example shows how by formatting only some lines in a node, we might actually make formatting worse instead of better.

@garyyu
Copy link

garyyu commented Sep 18, 2018

I love rustfmt, it's great 👍 But indeed, rustfmt's code changing on untouched code of a git commit is an evil for code review (for this git commit), heavy headache on this.

Hope somebody can take this issue and let rustfmt have this feature!

@scampi
Copy link
Contributor

scampi commented Apr 11, 2019

For the git we could leverage https://crates.io/crates/git2. In addition, this feature could rely on the file_lines option to offload most of the work:

  • get the diff from git
  • turn the diff into the format understood by file_lines

@zroug
Copy link

zroug commented Apr 11, 2019

I use a script that is very similar to this approach for myself. It's far from perfect, both because the script itself isn't perfect and because file_lines doesn't always work perfectly. But sometimes I find it very useful as a temporary workaround. I haven't added error handling or anything.

https://gist.github.com/zroug/28605f45a662b483fb4a7c3545627f66

bors added a commit to rust-lang/rust that referenced this issue Jun 11, 2019
Add deny(unused_lifetimes) to all the crates that have deny(internal).

@Zoxc brought up, regarding #61722, that we don't force the removal of unused lifetimes.
Turns out that it's not that bad to enable for compiler crates (I wonder why it's not `warn` by default?).

I would've liked to enable `single_use_lifetimes` as well, but #53738 makes it unusable for now.

For the `rustfmt` commit, I used rust-lang/rustfmt#1324 (comment), and manually filtered out some noise.

r? @oli-obk cc @rust-lang/compiler
@eddyb
Copy link
Member

eddyb commented Jun 11, 2019

@zroug That works great, I just had to add "--edition=2018" to the args list.

bors added a commit to rust-lang/rust that referenced this issue Jun 12, 2019
rustc: replace `TyCtxt<'a, 'gcx, 'tcx>` with `TyCtxt<'gcx, 'tcx>`.

This first lifetime parameter of `TyCtxt` has been phantom for a while, thanks to @Zoxc, but was never removed, and I'm doing this now in preparation for removing the `'gcx`/`'tcx` split.

I wasn't going to do this as a separate step, and instead start converting uses of `TyCtxt` to a single-lifetime alias of it (e.g. `type TyCx<'tcx> = TyCtxt<'tcx, 'tcx, 'tcx>;`) but it turns out (as @Zoxc rightly predicted) that there is far more fallout from not needing a lifetime for the first parameter of `TyCtxt`.

That is, going from `TyCtxt<'a, 'gcx, 'tcx>` to `TyCtxt<'tcx, 'gcx, 'tcx>` (the first commit in this PR) has the largest amount of fallout out of all the changes we might make (because it can require removing the `'a` parameter of `struct`s containing `tcx: TyCtxt<'a, ...>`), and is the hardest to automate (because `'a` is used everywhere, not just with `TyCtxt`, unlike, say `'gcx, 'tcx` -> `'tcx`).

So I'm submitting this now to get it out of the way and reduce further friction in the future.

**EDIT**: for the `rustfmt` commit, I used rust-lang/rustfmt#1324 (comment), and manually filtered out some noise, like in #61735, but unlike that PR, there was also a weird bug to work around.
It should be reviewed separately, and dropped if unwanted.

cc @rust-lang/compiler r? @nikomatsakis
@eddyb
Copy link
Member

eddyb commented Jun 13, 2019

Just ran into this:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 7, kind: Other, message: "Argument list too long" }', src/libcore/result.rs:999:5

So I modified your script to do 100 files at a time (pretty arbitrary but it worked in that case): https://gist.github.com/eddyb/23490b0a707d7ea87cba3be3ca93f0d7

bors added a commit to rust-lang/rust that referenced this issue Jun 14, 2019
Unify all uses of 'gcx and 'tcx.

This is made possible by @Zoxc landing #57214 (see #57214 (comment) for the decision).

A bit of context for the approach: just like #61722, this is *not* how I originally intended to go about this, but @Zoxc and my own experimentation independently resulted in the same conclusion:
The interim alias `type TyCx<'tcx> = TyCtxt<'tcx, 'tcx>;` attempt required more work (adding `use`s), even only for handling the `TyCtxt<'tcx, 'tcx>` case and not the general `TyCtxt<'gcx, 'tcx>` one.

What this PR is based on is the realization that `'gcx` is a special-enough name that it can be replaced, without caring for context, with `'tcx`, and then repetitions of the name `'tcx` be compacted away.
After that, only a small number of error categories remained, each category easily dealt with with either more mass replacements (e.g. `TyCtxt<'tcx, '_>` -> `TyCtxt<'tcx>`) or by hand.

For the `rustfmt` commit, I used rust-lang/rustfmt#1324 (comment), and manually filtered out some noise, like in #61735 and #61722, and like the latter, there was also a weird bug to work around.
It should be reviewed separately, and dropped if unwanted (in this PR it's pretty significant).

cc @rust-lang/compiler r? @nikomatsakis
@zroug
Copy link

zroug commented Jun 16, 2019

Great changes @eddyb, thanks!

@cormacrelf
Copy link

If your problem is noisy diffs, AND you've already run cargo fmt on your unstaged changes, then a line-based-rustfmt solution isn't enough. That's super common with format-on-save plugins or muscle-memory keybindings, if you've forgotten, or if you've had formatting disabled for a while (I recently used a stage1 build of rustc for a while, so no rustfmt). A way to salvage this is my tool git-index-exec: it can run things like cargo fmt on a copy of your index, and write any changes back to your index, never touching your working directory. Some scenarios:

  • You've accidentally run cargo fmt and now you can't find your work in the diff noise. Run git index-exec 'cargo fmt', and git diff now shows what it did before. The lines that did change have also been formatted. You can now commit the formatting by itself, and continue working.
  • You forgot to format your last commit. Simply cargo-fmt both the working directory and the index, and commit the formatted index with --amend.
  • You want a pre-commit hook cargo fmt --check that doesn't require unstaged changes to be formatted correctly. Add git index-exec 'cargo fmt --check --quiet' || exit $? or similar to your .git/hooks/pre-commit file.
  • You want to know if what you're about to commit compiles, but you don't know if you have staged all of the relevant changes to make it compile on its own. (For example, a rename operation that touched some other file.) git index-exec 'cargo check' does the job. You can also add this as a pre-commit hook, which will similarly be much more correct than using cargo-check on the working directory.

Under the hood, it's just using a git worktree in /tmp, which stays alive between usages so that subsequent check/clippy runs are fast. There's not much to it, but once you have it, everything looks like a hammer. The biggest upside is that it means the solution to your formatting problems always involves more formatting, not trying to restrict the application. It just lets you point the unstoppable train that is rustfmt at more useful targets.

@tgross35
Copy link
Contributor

tgross35 commented Feb 3, 2023

After having worked with a codebase that does incremental formatting... it's difficult. There are all kinds of annoying edge cases (is this the right indentation level?), style might not be coherent with surrounding content, and enforcing on CI can be tricky.

A better solution is to perform a one-time formatting of your entire codebase, then add the commit hash to a .git-blame-ignore-revs file. This is supported by both GitHub and GitLab and will hide that commit from the blame viewer, so you don't wind up with untraceable history.

After that you can just use rustfmt as-is without worrying about incremental formatting, and your blames won't have the noise of the big formatting commit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants