Xed corrupts files by placing a linebreak at the end of file #195
Comments
I just figured out that 'cat' (cat (GNU coreutils) 8.27) doesn't have issues parsing files without ending line breaks. |
In the zealots' good ol' Emacs it's a configurable option (guess we should do the same): Require Final Newline |
It's not a bug. Xed is a text editor. Text requires there to be a trailing newline at the end of the file, or many parsers will fail. Most text editors are happy to oblige and ensure there's a newline at the end. If you want to edit binaries, a hex editor would be better suited. |
@haarp: I strongly disagree. All line breaks, other than the one forcibly and invisibly added at the end, are editable in gedit/xed. And certainly not 'most' text editors add this extra line break. None of the text editors I've used on Windows do this (Notepad, Notepad++, PNotepad), and even on Linux not all of them do; in fact I used some of these to remove these line breaks. None of the major IDEs do this either (MS Visual Studio, NetBeans, etc). It is not 'normal' behavior at all. It is ridiculous to suggest one has to resort to hex editing to remove this line break every time one saves a text file. There aren't even any good hex editors for Linux (wxHexEditor is reasonable though, I personally use HexWorkshop running under Wine). If an 'expert' user is so inclined to require the ending line break in his files, he can trivially add this himself: it is just one freaking keystroke!! But as long as the line break is forced on the user, one has to resort to far more time-consuming efforts to remove it (hex editing, installing a second text editor). I've written a few text editors and complex text parsers myself, and it is absolutely trivial to handle EOF cases. I would not trust any tools that struggle with this. And as dodona2 commented, it appears the issue with 'cat' has been fixed by now anyway, so this lame workaround is no longer needed. This issue should still be regarded as a bug; it is undocumented behavior, non-intuitive, invisible to the user, non-optional, and can corrupt user data without anyone noticing. You just have to google for the tons of issues that users have due to this 'feature'; most of them have no idea their files even have ending line breaks, because it is intentionally hidden from them. It personally took me weeks to find out that I could not trust gedit to display all characters stored in the file; I was looking for bugs in my own and other software that didn't exist.
By the way, here's the workaround for gedit: https://stackoverflow.com/questions/3056740/gedit-adds-line-at-end-of-file |
There is a reason for the trailing newline. It's literally the POSIX standard. http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_206 So it should come as no surprise that many parsers have problems when the trailing newline is missing. A line not ending in a newline is not considered a line at all! It is also part of the ANSI C 1989 standard (section 2.1.1.2) and ISO C 1999 (section 5.1.1.2). I wasn't suggesting using a hex editor to remove the newline. I was suggesting to use them on binary files, those being the only files where a trailing newline is undesirable. The matter is not as simple as "one freaking keystroke". You will tend to forget it, often leaving broken text files in your wake. The mcedit text editor for example has the option of warning the user when he saves a file without a trailing newline. I am not opposed to making it a setting, but it should not be the default. |
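For anyone who wants to check their own files against the definition linked above, here is a minimal Python sketch. The function names are made up for this example, and it assumes LF is the newline character (Unix-style files); it only inspects the raw bytes and says nothing about what any particular editor does.

```python
def ends_with_newline(data: bytes) -> bool:
    """True if the data is empty or its last byte is LF.

    An empty file counts as fine here, since it contains zero lines.
    """
    return len(data) == 0 or data.endswith(b"\n")


def incomplete_last_line(data: bytes) -> bytes:
    """Return whatever follows the final LF (empty if the file ends with LF)."""
    # rfind returns -1 when there is no LF at all, so the slice starts at 0
    # and the whole content is reported as one unterminated line.
    return data[data.rfind(b"\n") + 1:]


if __name__ == "__main__":
    for sample in (b"one\ntwo\n", b"one\ntwo"):
        print(sample, ends_with_newline(sample), incomplete_last_line(sample))
```

To check a real file, read it in binary mode (`open(path, "rb").read()`) so that no newline translation happens before the test.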
Ok, that sounds fair, and probably explains why this behavior is expected by some tools. Binary files definitely aren't the only files in which trailing line-breaks aren't needed, in our cross platform project we have thousands of plain text files without them (typically originating from Windows users), and editing by some Linux (Xed/gedit) users adds the new-line, causing confusion with version control. I concur that it is easy to forget adding the new-line, so a warning would be useful. Making the feature optional IMO is best. I'm willing to live with current behavior as default as compromise, but I do think the new line should always be visible in the editor view, as a reminder to the user (perhaps even rendered with a visible glyph). |
That would in fact be incorrect behavior. The final newline is the terminator of the last line, not an actual line break. Granted, many tools get confused by this distinction, and the problem isn't new either. See this comment from 2005 for example: |
Then explain the choice of the CR+LF characters? LF means line feed: on old typewriters, to slide the paper up, literally displaying a new line. Carriage return meant sliding the carriage back, with the caret pointing at the start of that fresh new empty line, ready for you to begin typing. But I guess this begins to expose the inadequacy of the choices of the POSIX standard a bit now... There's absolutely no distinction between a random line break anywhere in the file and the last one forced on us. Therefore, the editor should show one empty line at the bottom of the text, same as the others, but one which cannot be deleted. I have used some editors on Windows in the past that did this (I think MSVC5 or VB4/5); it felt 'wrong' as UI behavior (especially because the newline wasn't written to file) but could actually be a reasonable, intuitive solution for this case. Even better, if the forced line break is displayed visibly (with this symbol: ↵), it is absolutely clear to the user whether the behavior is enabled, and why there's a trailing line break added to the file. If it is disabled, the editor could display a warning in its place. |
I can already hear the lament of users as they try to figure out why there's a symbol they can't delete :P |
Meh. Just display a tooltip with explanation when you hover the mouse pointer over the ↵ symbol. |
Just a heads-up: this major bug is still present in Xed 1.8.3. |
xed 1.8.3 still does not show the last newline of a text file. I assume that POSIX does not require hiding a character from display, and hope that this is resolved soon. I got bitten writing a PHP script which had an extra newline not shown in xed; my local default PHP configuration was tolerant, but the production server had a different configuration. This is obscure as hell, and I will use another editor until this bug is resolved. It would be best to just save the file as is. People know the ENTER key and what it does. |
+1 for more evidence: I've also run into PHP issues several times; this bug can cause the well-known "headers already sent" PHP error because of this invisible whitespace character, e.g. in a 'config.php' file after the '?>' tag. Even if you save it without making changes. I'm sure I'm not the only one who has taken down an entire website for hours due to simply saving a file in gedit/xed. |
Geany correctly shows the end of file; Pluma/xed do not: #61 (comment) |
geany incorrectly shows a double end-of-line character, and if you try deleting the so-called empty line at the end, it still saves the file with its (correct) single newline terminator for the last line of text. Then when you re-open the file in geany, it still has that mysterious "final empty line" which you're positive you just deleted. This is unintuitive, broken behavior which violates the POSIX specification (and a few others) for what the actual definition of a file, or a line within a file, is. It goes contrary to how proper text editors work, and I'm having a hard time understanding what the rationale for imitating geany's lies might be. Can you please clarify what the actual use case is for this? |
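The disagreement in this part of the thread is whether LF is a line terminator or a line separator. A short Python illustration using only built-in string methods (the variable names are just for this example): splitting on LF as a separator yields a trailing empty "line", while treating LF as a terminator does not.

```python
text = "first\nsecond\n"

# LF as a separator: whatever follows the last LF counts as another line,
# so a file that ends with LF appears to have an empty line at the bottom.
as_separator = text.split("\n")

# LF as a terminator: each LF closes the line before it, nothing follows.
as_terminator = text.splitlines()

print(as_separator)   # ['first', 'second', '']
print(as_terminator)  # ['first', 'second']
```

Editors that draw an empty last line are in effect using the first interpretation; editors that do not are using the second.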
When you edit a document with an LF or with a CRLF at the end of each line:
When you add a new line in the middle of a CRLF file with mcedit, the CR is missing at the end of this line, and mcedit saves the file like this:
When you edit a document with a CR at the end of each line:
So, none of these editors works in the same manner. They all have faults. But I prefer Geany because this editor shows symbols for the CR and LF characters at the end of the lines, and what it shows matches what is saved. In Geany, the cursor is positioned where new typing will go. So we know what we are doing. That's not the case with Pluma/xed. With mcedit, it's impossible to add a CR. |
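Since the editors compared here display line endings differently, one way to see what is actually stored is to count the byte sequences directly. This is a rough Python sketch; the function name is invented for illustration and it simply counts CRLF pairs, lone LFs and lone CRs in the raw bytes.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF pairs, lone LF bytes and lone CR bytes in raw file content."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        # Every CRLF pair contains one LF and one CR, so subtract the pairs
        # to get the bytes that stand alone.
        "LF": data.count(b"\n") - crlf,
        "CR": data.count(b"\r") - crlf,
    }


if __name__ == "__main__":
    # A DOS-format file where one line was saved with a bare LF.
    print(count_line_endings(b"one\r\ntwo\nthree\r\n"))
```

More than one non-zero count means the file has mixed line endings, which is the situation described for mcedit above.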
I have stated that xed is doing this correctly, since there is no such thing as an empty line at the end of that file. Saying that no, really, it is incorrect, may help you feel vindicated in your belief, but it doesn't actually answer my question, "what is the use case for wanting the text editor to lie"?
This indicates that mcedit has opened a file formatted in the "DOS" line endings format, using the unix open mode, and is incorrectly injecting Unix line endings into a DOS-formatted file. If I were to open the file in, say, vim, it would not display confusing ^M characters that don't mean anything -- it would continue to display simply a series of lines of readable text (possibly prose sentences), but the footer, right next to the name of the file, would display the status message:
None of that is what I'd truly call "correct". If you want your text editor to display symbols, then the non-printable (formatting-only!) ASCII bytes
In xed or pluma or vim or Notepad++ or any other text editor, the cursor is positioned where new typing will go. By default that is the beginning of the first line of the file, but if I move the cursor to the end of the last line, I can append to the end of the last line. If I move the cursor to the end of the last line, then type the ENTER key, future typing will be appended to a new line (which did not exist in the file when I opened the file). I still do not understand why any of this is a problem. But then, neither do I understand why you are anxious to add ASCII byte
|
Sorry, but that is factually incorrect: CR = Carriage Return means effectively 'end of line', or more correctly, return to beginning of the (same!) line. LF = Line Feed, literally means 'end of current line AND start of new line'. If no characters follow, it is by definition an empty line. Xed does not show this, but does write it to the file when saved. It even adds it to unchanged files when saved.
Please don't do that sort of thing here.
When the user is in control of his/her computer. Xed takes control away from the user by adding unwanted hidden linebreaks, and 'lies' about it, by hiding it in the application.
Then that is a problem of that particular piece of software. Let's not add bugs to text editors to work around that problem. And as another user already commented in this thread, those issues were fixed decades ago.
Most word processors or code editors have a feature to show hidden characters (including line breaks): https://i.stack.imgur.com/tPKls.png Can we please conclude that people have different preferences and opinions on this, and that Xed needs, at least, a configurable option to disable this behavior? |
Right, if you have a single LF end-of-line terminator followed by an additional blank line terminated by a second end-of-line terminator, that might be a problem. xed doesn't do that, though? xed is compliant with the following policy: PHP coding standards mandate:
As long as you stick strictly to end-of-line terminators, without adding additional byte sequences consisting of a blank line, terminated by an LF, php won't try to print blank lines as part of your application before you're done printing the headers.
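To make the distinction above concrete: PHP swallows a single newline that immediately follows the closing `?>` tag, so only bytes beyond that one newline are sent as output. The Python sketch below approximates which bytes would be emitted after the last closing tag. It is a simplification with a made-up function name: it assumes the last `?>` really is a closing tag (not part of a string or comment) and that no PHP block reopens after it.

```python
def output_after_close_tag(source: bytes) -> bytes:
    """Approximate the bytes a PHP file emits after its last '?>' tag."""
    idx = source.rfind(b"?>")
    if idx == -1:
        # No closing tag found: nothing follows one (simplified assumption).
        return b""
    tail = source[idx + 2:]
    # PHP treats one newline directly after '?>' as part of the tag.
    for newline in (b"\r\n", b"\n"):
        if tail.startswith(newline):
            return tail[len(newline):]
    return tail


if __name__ == "__main__":
    print(output_after_close_tag(b"<?php $a = 1; ?>\n"))    # single terminator
    print(output_after_close_tag(b"<?php $a = 1; ?>\n\n"))  # extra blank line
    print(output_after_close_tag(b"<?php $a = 1; ?> \n"))   # space before LF
```

Under these assumptions, a file ending in `?>` plus one newline produces no stray output, while an extra blank line or a space after the tag does.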
WHAT. Examples???
I'm pretty sure every such file I've ever come across has either used line breaks or not cared about them. I'm also pretty sure I've never seen any password system ever which permits an LF as a character of the password itself, if only due to the fact that the Enter key is used to submit the password, not add more bytes before submission. Hashes are not permitted to contain newlines, since they are hex-encoded strings and thus the pool of permitted characters when representing a hash is
A keyfile is usually treated as a binary file; for example, you can use a png as a keyfile. I don't recommend editing binary files in xed for any reason.
Consistently using terminating LF in all your files ensures that version control correctly accounts for your files. git will produce messages indicating unnecessary changes, via the message "No newline at end of file", indicating the missing LF is bad.
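The version-control effect can be illustrated without git. This Python sketch uses the standard `difflib` module as a stand-in for a line-based diff (it is not git's algorithm, and the function name is made up): when the old file lacks a final LF, appending a line also changes the previous last line, because its bytes differ once the terminator is added.

```python
import difflib


def removed_line_count(old: str, new: str) -> int:
    """Count lines of `old` that a line-based diff reports as removed or replaced."""
    a = old.splitlines(keepends=True)
    b = new.splitlines(keepends=True)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return sum(
        i2 - i1
        for tag, i1, i2, _, _ in matcher.get_opcodes()
        if tag in ("replace", "delete")
    )


if __name__ == "__main__":
    # Old file without a final LF: the line "b" is rewritten as "b\n".
    print(removed_line_count("a\nb", "a\nb\nc\n"))
    # Old file with a final LF: appending "c" touches no existing line.
    print(removed_line_count("a\nb\n", "a\nb\nc\n"))
```

The same mechanism explains the confusion reported earlier in the thread, where an editor adding the final LF to a file that lacked one shows up as a modified last line.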
Do not change digitally archived files for any reason. Open them in readonly mode, and do not click save.
The nonrecoverable error is "this file does not match the POSIX definition of a line of text", but in fact most C compilers today do in fact include clumsy workarounds that complicate the parser, as a hack job in order to work around bugs in Windows text editors. It is a generally problematic issue for any file format or workflow in which:
and
You may also try using
Then consider showing your support for #225
Set the gsettings configuration key It will certainly not be made mandatory... |
Xed's behavior seems to be consistent with other editors, though there does seem to be quite a bit of variation. After trying a number of different editors, they all defaulted to adding a new line, though some did have the option to disable it. Many of them don't show the extra line either, and some of them are inconsistent about it (Kate, for example, shows the extra line at first, allows you to delete it, and then adds the newline on save but doesn't show it until you reopen the file). I think it's hard to argue that xed behaves 'incorrectly' since it doesn't do anything that isn't common among other editors. However, I do agree that there should be an option if possible so I tagged this as a feature request rather than a bug. Please note that I'm not sure this is even something we control currently, and may be handled by the upstream libraries we rely on (looking through the code, I couldn't find any instance of where we add newline characters so my first guess is that it's added by some code in GtkSourceView or even Gtk itself). That doesn't mean we can't change the behavior, but it may pose additional challenges in implementing this option. |
xed/data/org.x.editor.gschema.xml.in Line 164 in 4f43977
Which maps to Line 380 in 6e36dc4
It seems plausible to expose this as a preferences checkbox rather than require dconf-editor. gedit doesn't seem to have a preferences checkbox, incidentally. The checkbox would then be rather like using |
@eli-schwartz ah, thanks! I guess I was looking in the wrong place... :) If we already have a key for it, then it should be a simple matter to implement it. It won't make it into a release for 6 months or so, but there's always dconf-editor for now. I wonder why it's not already in settings though... |
I guess it is probably inherited design from gedit? |
Many thanks! |
Xed corrupts every saved file by adding line break characters (CR/LF) at the end of the file. This line break is not displayed within the Xed editor. Even when saving a file without changes, these characters are silently added. This is counterintuitive behavior and causes many problems, especially when editing configuration files, or files that otherwise require strict formatting. It also causes all sorts of issues with files under version control.
I fully understand the history and absurd reasoning of this 'feature', and that Xed has merely inherited this behavior from gedit (which also has this same intentional bug). I really do think it is time to fix the default behavior. If 'cat' and other command line tools have issues parsing files without ending line breaks, then that's an issue to be solved by the respective program. No need to break other applications - and worse; user data - as workaround!
Please note that this issue is not a duplicate of #61, which merely deals with displaying this line break (and the confusion it evidently causes). This ticket addresses the cause of those issues.
Question: Is there also a workaround for Xed? For gedit one could change some gnome-settings-something, somewhere to override this behavior.
FYI: Xed 1.4.6, Mint Cinnamon 18.2 (fresh install)