Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change write_bytes to write_text to preserve newlines #157

Merged
merged 3 commits into from
Dec 2, 2023

Conversation

nardi
Copy link

@nardi nardi commented Nov 20, 2023

Hello, I had a problem using this plugin on Windows (Python 3.11). I noticed that after running a command that uses version substitution, pyproject.toml and other files were back to their original contents, but the newlines had been changed (from CRLF to LF), which causes Git to see it as modified. Running git add on them detects that they are otherwise unchanged and doesn't stage them, but it is still a bit confusing.

I was able to fix this by changing calls to write_bytes to instead use write_text, which writes newlines using a platform specific convention ("universal newlines"), which I think is usually the preferred behavior. It could be that the calls to write_bytes were made for a reason that I'm not aware of, but if this is an acceptable change it would improve the user experience on Windows.

Thanks!

@mtkennerly
Copy link
Owner

Hi! Thanks for your PR. Someone else reported the same issue before (#105), but in that case, changing from write_text to write_bytes was the solution 😅 (also see v0.21.2)

My understanding is that write_text normalizes the line endings based on the OS, which may or may not be what you want depending on your Git core.autocrlf and core.eol settings. write_bytes is supposed to preserve whichever line endings the file originally had, so it's not clear to me why they would be changing here.

For what it's worth, I'm on Windows, and I used to see this issue with write_text, but I haven't been seeing it with write_bytes. I'm not sure what environment/config difference might account for that.

@nardi
Copy link
Author

nardi commented Nov 20, 2023

Ah, thanks for your reply! I guess the behavior makes sense, actually. Python does this newline normalization, which means it reads both LF and CRLF as LF. Then, it writes LF as CRLF or LF depending on the platform. So when reading and then writing text, LF will be preserved on Linux and CRLF on Windows, but CRLF will be converted to LF on Linux, and LF to CRLF on Windows. This is what was happening in issue #105.

Currently, the usage is asymmetric: read_text and write_bytes are used. This means they are normalized to LF on read and then written as LF no matter the platform. Instead, if we really want to preserve the original newlines, we have two options:

  1. Use both read_bytes and write_bytes
  2. Use open(..., newline="") for reading and write_text(..., newline="") for writing

I would say using the newline="" argument is the option that conveys intent better, but unfortunately there is no newline argument to read_text. Also, the one for write_text only exists since Python 3.10. So I think the first would be the best option, replacing calls to read_text with read_bytes(...).decode("utf8"). Do you agree? :)

@mtkennerly
Copy link
Owner

Ah, that's a great point about read_text. I agree that read_bytes would be a good solution :)

@mtkennerly mtkennerly merged commit f0db558 into mtkennerly:master Dec 2, 2023
25 checks passed
@mtkennerly mtkennerly added this to the v1.2.0 milestone Dec 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants