Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PR: Enforce LF line endings in Spyder code #18691

Merged
merged 3 commits into from
Jul 28, 2022

Conversation

impact27
Copy link
Contributor

@impact27 impact27 commented Jul 16, 2022

Description of Changes

  • Wrote at least one-line docstrings (for any new functions)
  • Added unit test(s) covering the changes (if testable)
  • Included a screenshot or animation (if affecting the UI, see Licecap)

Transform CRLF to LF and add a test

Issue(s) Resolved

Fixes #18690

Affirmation

By submitting this Pull Request or typing my (user)name below,
I affirm the Developer Certificate of Origin
with respect to all commits and content included in this PR,
and understand I am releasing the same under Spyder's MIT (Expat) license.

I certify the above statement is true and correct:

@impact27
Copy link
Contributor Author

I recommand the "Hide whitespace" option for reviewing

@dalthviz dalthviz added this to the v6.0alpha1 milestone Jul 21, 2022
@dalthviz dalthviz removed this from the v6.0alpha1 milestone Jul 22, 2022
@dalthviz
Copy link
Member

Should this one be closed @impact27 ?

@impact27
Copy link
Contributor Author

Should this one be closed @impact27 ?

Why? Don’t you think that having mixed line return in the spyder code base is strange @dalthviz?

@ccordoba12
Copy link
Member

ccordoba12 commented Jul 26, 2022

@impact27, I agree with you because it's becoming an issue with more and more people contributing to the project.

Furthermore, it's something that we've thought to do for a long time: PR #2370 describes the steps to follow to not break git blame, which is really important.

The problem is that enforcing LF as the only line ending in our codebase would break practically all open PRs, so I don't know what to do about it.

@impact27
Copy link
Contributor Author

We could provide instructions to solve the conflicts. Something along the lines of:

  1. Merge the last commit on master before this PR, solve conflicts as usual
  2. Merge this PR, resolve all conflicts in favour of your PR
  3. Run a script to transform all CRLF to LF

@impact27
Copy link
Contributor Author

impact27 commented Jul 26, 2022

So something along the lines of:

git checkout mybranch
git fetch upstream
git merge hash_before_18691
# resolve any conflicts as usual
git merge -X ours hash_18691
git rm --cached -r .
git commit -am "Remove CRLF"
git merge master
# resolve any conflicts as usual
git push

@impact27
Copy link
Contributor Author

I am not sure what to do about git blame. You can use the -w option to ignore white spaces. I don’t think I understand how to work around that without changing history

@dalthviz
Copy link
Member

dalthviz commented Jul 26, 2022

Why? Don’t you think that having mixed line return in the spyder code base is strange @dalthviz?

I though it was an outdated/orphan PR left before the closing the codeeditor.py issue at #18690 but now looking more closely the changes here this solves the line return for all Spyder so I guess we need instead to figure out how to apply the changes without causing to much trouble to already open PRs and in general future merges.

Maybe as described at #2370 we could do this with the new check/test added here for the CI (maybe adding some logic to only check the files changed in the current PR?) to then reinforce this measure (and prevent introducing more mismatches in terms of line endings) and also doing it in multiple PRs? Maybe even checking the current open PRs and applying the line endings correction in the files that each PR touches?

Also maybe I'm not totally sure how the diff between line endings work with version control but shouldn't the correction be done over 5.x too?

@impact27
Copy link
Contributor Author

Maybe as described at #2370 we could do this with the new check/test added here for the CI (maybe adding some logic to only check the files changed in the current PR?) to then reinforce this measure (and prevent introducing more mismatches in terms of line endings)

I have added a test checking for CRLF in python files

and also doing it in multiple PRs? Maybe even checking the current open PRs and applying the line endings correction in the files that each PR touches?

I am not sure about multiple PR, would this not make the transition more painful?

This PR essentially does two things:

  1. add a line to .gitattributes and a Ci test
  2. commit the result of this change

Also maybe I'm not totally sure how the diff between line endings work with version control but shouldn't the correction be done over 5.x too?

This is an excellent point, I will open this PR on 5.x

@impact27 impact27 changed the base branch from master to 5.x July 26, 2022 20:17
@impact27
Copy link
Contributor Author

impact27 commented Jul 26, 2022

Ok new strategy for fixing open PR, git merge has a ignore-cr-at-eol:

git merge -X ignore-cr-at-eol master
git rm --cached -r spyder/**/*.py
git add -A
git commit --amend

@impact27 impact27 force-pushed the fix_return branch 2 times, most recently from d4f1e0e to c3748b8 Compare July 26, 2022 20:57
@ccordoba12 ccordoba12 added this to the v5.3.3 milestone Jul 26, 2022
@stevetracvc
Copy link
Contributor

The problem is that enforcing LF as the only line ending in our codebase would break practically all open PRs, so I don't know what to do about it.

Hah! I just submitted a tiny PR (one line), and for the life of me couldn't figure out why git diff showed a CR. Then after figuring out that the whole file was like that, I decided I might as well just leave it in rather than editing the entire file.

Copy link
Member

@CAM-Gerlach CAM-Gerlach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some high-level comments:

  • Right now, this conforms even files that should be CRLF, like .bat files, to LF (including those in our vendored external-deps directory, which is particularly undesirable)
  • Furthermore, the text=auto option will rely on autodetection to determine the file to convert, which may not always be strictly reliable, as opposed to relying on known file extensions whenever possible.
  • Finally, this may still check out files as CRLF, which is inconsistent across users and platforms, and may produce a extra Git warning on checkin

Instead, I suggest using the well-proven, standardized .gitattributes that has been used on many of our repos now for years (as well as many others), with QtPy being among those using the latest version. This avoids all the problems mentioned above, along with bringing other benefits like better diffs and not exporting irrelevant files in Git tarballs.

This will also conform all files automatically on commit, so there is no need for a hacky and expensive test in the test suite (and why the non-standard file name capitalization?). If we do want to be absolutely sure that no LF files are ever committed again or merged again, adding the relevant built-in pre-commit hook would be a better solution (though again, probably not worth it on its own since Git already makes sure of it).

As for reducing the impact on the git blame/git history, you can use .git-blame-ignore-revs as of Git 2.23 and later which GitHub finally now supports (beta) and make sure to squash this down to one atomic commit first once its ready.

Beyond that, the problem of breaking existing PRs is a significant concern. Our strategy before when eliminating trailing whitespace was using this script I wrote which conformed all files not touched by existing PRs. That approach could be used, perhaps combined with ignoring PRs that are stale by more than a month or so, but it would blunt much of the impact of the changes, so I'm not sure about that and whether we should just go ahead and do it.

@ccordoba12
Copy link
Member

ccordoba12 commented Jul 27, 2022

Ok, let's try to merge this PR before any other one so that @impact27 doesn't have to constantly rebase it.

About preserving git blame (which is really important for me when checking which PRs changed what), I found this:

https://black.readthedocs.io/en/latest/guides/introducing_black_to_your_project.html
https://docs.github.com/en/repositories/working-with-files/using-files/viewing-a-file#ignore-commits-in-the-blame-view

So, if I understood things correctly, you need to leave a single commit with the line ending changes and then add its hash to a .git-blame-ignore-revs file placed in the root of the repo.

@ccordoba12 ccordoba12 changed the title PR: Enforce no CRLF returns in spyder code PR: Enforce LF line endings in Spyder code Jul 27, 2022
@CAM-Gerlach
Copy link
Member

CAM-Gerlach commented Jul 27, 2022

The major remaining action items are:

  • Rebase and redo the conform using the linked .gitattributes for the reasons mentioned
  • Add the .git-blame-ignore-revs with the commit hash
  • Eliminate the superflous test or at the very least, name it properly

@impact27
Copy link
Contributor Author

Apparently the tests don't like .git-blame-ignore-revs:

 lists of files in version control and sdist do not match!
missing from sdist:
  .git-blame-ignore-revs

@ccordoba12
Copy link
Member

Apparently the tests don't like .git-blame-ignore-revs

To fix that please add .git-blame-ignore-revs to setup.cfg, under the ignore subsection.

@impact27
Copy link
Contributor Author

impact27 commented Jul 27, 2022

I did some more testing, and the best way to fix current PR seems to be:
1 - Cherry-pick the commit that changes .gitattributes

git cherry-pick 03bc8d549f6e7b30516be8678403fc8c7714685d 

2 - Reset line ending for all py files inside the spyder directory

git rm --cached -r spyder/**/*.py
git add -A
git commit -m "CRLF to LF"

3 - merge 5.x (or master) ignoring CR:

git merge -X ignore-cr-at-eol 5.x -m "merge 5.x"    

After this the PR can be merged into 5.x / master as usual

@ccordoba12
Copy link
Member

I did some more testing, and the best way to fix current PR seems to be

Thanks for checking that, but I have one question: is it possible to avoid this step?

git commit -m "CRLF to LF"

The thing is that would introduce a commit per PR that would need to be added to .git-blame-ignore-revs as well to not break git blame. But I was hoping for a single commit (i.e. the one on this PR) instead.

@impact27
Copy link
Contributor Author

impact27 commented Jul 27, 2022

I am not sure how git blame works, but wouldn't it just check the merge commit which doesn't change line endings? Maybe rebase is better

@impact27
Copy link
Contributor Author

impact27 commented Jul 27, 2022

The nuclear option would be to do all these steps, then:

git reset --soft 5.x
git commit -m "All my work in a single commit"
git push -f

All the commits would be lost, but the result would be as expected

@ccordoba12
Copy link
Member

but wouldn't it just check the merge commit which doesn't change line endings?

I think git blame simply tells what's the last commit that changed a given line in the code, regardless if it was done in a merge or not. So the idea would be to avoid doing more line-ending changes outside this PR.

@CAM-Gerlach, what do you think about this?

@impact27
Copy link
Contributor Author

Ideally you would rebase PR on top of this PR. You should be able to do it with:
git rebase -X ignore-cr-at-eol 5.x
But I get an:
error: add_cacheinfo failed to refresh for path 'spyder/plugins/editor/widgets/editor.py'; merge aborting.

@impact27
Copy link
Contributor Author

I found the magic git command:
git rebase 5.x -s recursive -X renormalize
With git version 2.37.1

@ccordoba12
Copy link
Member

That's great! Thanks @impact27 for looking into that.

Please remove your last commit here (the one about updating to Mac 11 in our CIs) because it's part of PR #18837 now. Then this should be ready.

Copy link
Member

@CAM-Gerlach CAM-Gerlach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like all the things I mentioned were addressed, thanks. However, instead of just dropping 0040028 and rebasing on the base branch with it merged, it seems not one but two commits were made undoing the original commits. Since this cannot be squash-merged (or .git-blame-ignore-revs will of course not work, since the resulting commit hash cannot be known a priori), these need to all be dealt with properly.

@impact27
Copy link
Contributor Author

.git-blame-ignore-revs does not completely ignore the "remove CR" commit, and still shows up on empty lines, see the screenshot below:
Screenshot 2022-07-28 at 10 56 50

@ccordoba12
Copy link
Member

ccordoba12 commented Jul 28, 2022

@impact27, it seems Github only shows the commit that changed line endings on blank lines; the other lines preserve the last commit that changed them, which is what we want (otherwise we'd see the entire file as changed by yourself).

Could you attend today's developers meeting to talk about this and see if we can merge it? @CAM-Gerlach is going to be there too, so it'd be a great opportunity to go through this in person.

Copy link
Member

@CAM-Gerlach CAM-Gerlach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this is LGTM at least from my side, but we can talk about it at the meeting today since this does have some potential for disruption on existing PRs so we need to make sure we've carefully considered the implications and everyone's on the same page

@ccordoba12
Copy link
Member

ccordoba12 commented Jul 28, 2022

Ok, let's merge this one. Thanks @impact27 for your work on it and @CAM-Gerlach for the review.

Note: The failures with the Windows installer are unrelated to this.

@ccordoba12 ccordoba12 merged commit 5958941 into spyder-ide:5.x Jul 28, 2022
@impact27
Copy link
Contributor Author

@ccordoba12 for master you can git merge 5.x -s recursive -X renormalize

ccordoba12 added a commit that referenced this pull request Jul 28, 2022
@ccordoba12
Copy link
Member

Thanks for the tip, I just did it.

@ccordoba12
Copy link
Member

ccordoba12 commented Jul 28, 2022

One additional note: to ignore the commits added to .git-blame-ignore-revs locally, you need to run

git config blame.ignoreRevsFile .git-blame-ignore-revs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants