
save: Save files atomically #3273

Open

wants to merge 19 commits into base: master
Conversation

@JoeKar JoeKar (Collaborator) commented Apr 28, 2024

The target behavior: micro checks whether the file to be saved already exists; if it does, micro first writes to a temporary file and only then replaces the target, by renaming the temporary file to the target file.
Possible symlinks pointing to the target file are resolved before the save takes place. This guarantees that such a symlink isn't replaced by the new approach.

TODOs:

Fixes #1916
Fixes #3148
Fixes #3196

@JoeKar JoeKar requested a review from dmaluka April 28, 2024 15:10
internal/buffer/save.go (Outdated)
@JoeKar JoeKar force-pushed the fix/save-atomically branch 2 times, most recently from 7d663d8 to afdc7a1 on April 29, 2024 21:46
@@ -150,6 +150,32 @@ func (b *Buffer) saveToFile(filename string, withSudo bool, autoSave bool) error
// Removes any tilde and replaces with the absolute path to home
absFilename, _ := util.ReplaceHome(filename)

// Resolve possible symlinks to their absolute path
for {
// Unfortunately only Stat() checks for recursion
Collaborator

Not only Stat() "checks for recursion". Any syscall that needs to do anything with the file itself, not with the symlink to it, returns this ELOOP errno if there is a symlink loop, because in this case there is no actual file for it to do anything with, there are only symlinks pointing to each other, not to an actual file.

In particular, lstat(), unlike stat(), does not return an error in this case, since it deals with the symlink itself, not with the file it points to.

So a correct comment here would be like: "A trick to let the OS do symlink loop detection for us, since we are lazy to do it ourselves"

// Unfortunately only Stat() checks for recursion
_, err := os.Stat(absFilename)
if err != nil {
return err
Collaborator

Now try to save a new file (not existing yet).

}
}
break
}
Collaborator

Wait, you do this symlink resolving in saveToFile(), not in overwriteFile()? That doesn't look nice: the code responsible for atomic saving is spread between different functions. (The only reason why we do this symlink resolving is to ensure correctly working rename, right? So I think it should be together with the code that does that rename.)

Also it has undesired side effects: the mkparents code below now deals with the resolved path, not the original one. That is not something the user would expect, right? The user expects mkparents to only create missing directories in the original path specified by the user, not try to "fix a broken symlink" by creating missing directories in a resolved path.

Another point: resolving symlink in saveToFile() means that it only works when saving the buffer, not in other cases when micro writes a file.

In this regard, a general note about this PR: what about settings.json and bindings.json? We should write them atomically as well. (So we should not just move this symlink resolving to overwriteFile() but also, for example, replace ioutil.WriteFile() usage with overwriteFile() usage.)

Collaborator Author

@JoeKar JoeKar Apr 30, 2024

In this regard, a general note about this PR: what about settings.json and bindings.json? We should write them atomically as well. (So we should not just move this symlink resolving to overwriteFile() but also, for example, replace ioutil.WriteFile() usage with overwriteFile() usage.)

Hm, good point...definitely.
But shouldn't we go the way of a temporary buffer (NewBufferFromString()), or would that be too much? From my perspective, a lot of additional checks are then already done by the buffer layer.

Edit:
Hm, buffer will introduce cyclic includes, so at least overwriteFile() needs to be moved from the buffer to e.g. the util package, to solve it for settings.json.

Collaborator

Yeah, and that makes it non-trivial: we need to partially decouple the overwrite functionality from buffer, to make it more generic to allow using it not just for buffers but also for other files, like settings.json.

And that decoupled functionality would include both the rename (with resolving symlinks etc) and the overwrite fallback.

And since it would include the overwrite fallback, it means we'd want to be able to restore a broken settings.json from the saved backup copy, which means that the backup functionality would also need to be partially decoupled from buffer?

I'd like to keep it simpler, but how?

Collaborator Author

@JoeKar JoeKar May 1, 2024

Currently I can only imagine doing this with function literals as callbacks, either performed in the caller or returned as a result. Might be naive. 🤔

@@ -54,7 +55,7 @@ func overwriteFile(name string, enc encoding.Encoding, fn func(io.Writer) error,
screen.TempStart(screenb)
return err
}
} else if writeCloser, err = os.OpenFile(name, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0666); err != nil {
} else if writeCloser, err = os.OpenFile(tmpName, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0666); err != nil {
Collaborator

What if there is already a file with the same .tmp name? It will be silently truncated, which is bad.

BTW, do you know one case where this file may already exist? When micro itself created it during the last save attempt, but os.Rename() failed.

Collaborator Author

Should be solved with the backup approach.

screen.TempStart(screenb)
if err != nil {
return err
}
} else {
if err == nil {
err = os.Rename(tmpName, name)
Collaborator

@dmaluka dmaluka Apr 30, 2024

There are legitimate cases when rename fails (or some previous steps fail, e.g. creating the .tmp file) but it is actually perfectly possible to save the file, so we should still try our best to save it, rather than just return an error to the user. For example:

  • Micro has no write permissions to the directory where the file is, so it cannot create other files in this directory and cannot change the name of the file. But it has write permissions to the file itself, so it can overwrite the file contents.
  • The file itself is a mountpoint (e.g. a bind mountpoint), so technically it is on a different filesystem than other files in the same directory, so rename fails.

So, what to do? If the buffer has been successfully written to the temporary file and then os.Rename() failed, it means we already have a backup copy, so we can now overwrite the user's file as well. If overwrite successfully completes (including fsync!), we can remove the temporary file. If overwrite fails, we should keep the temporary file. And we should probably rename the temporary file to the backup file in ~/.config/micro/backups then? (And if this rename, in turn, fails (e.g. if ~/.config/micro/backups is on a different filesystem), we should try to copy & remove then? hmmm...)

But if it was not os.Rename() but some previous step that failed, e.g. if we failed to create .tmp file due to lack of permission, what then? Then we should probably be able to create a temporary file in ~/.config/micro/backups instead. Actually, such a temporary file should be already there, - it is the backup file.

All this makes me think: as you noted yourself in your TODO, we should try to reuse the backup file if possible; and given the above, it seems we'd better just use the backup, no other temporary file. So the algorithm can be simple and clean:

  1. If the backup option is on, synchronize the backup file with the buffer.
  2. If the backup option is off (no backup file yet), create the backup file.
  3. Rename the backup file to the user's file.
  4. If rename failed, overwrite the user's file with the buffer.
  5. If overwrite failed, don't remove the backup.

The .tmp approach might have some advantages over the above simplified approach (a reduced likelihood of rename failure, e.g. if the user works a lot with files on a different filesystem than ~/.config/micro), but they seem to be outweighed by the increased likelihood of rename failure in other cases (e.g. lack of permission to the directory), the increased complexity and mess, and the fact that we may leave nasty .tmp files around in the user's directories.

Collaborator

BTW looks like vim does more or less the same: https://github.com/vim/vim/blob/master/src/bufwrite.c#L1179

Moreover, if the file is a symlink, it doesn't seem to try to resolve it; it immediately falls back to overwriting the file from the backup instead of renaming. I guess we can do the same, for simplicity (after all, a symlink is just a corner case, and we need to support the overwrite fallback anyway).

OTOH, I guess it doesn't hurt if we do resolve the symlink. E.g. we can move that to a separate resolveSymlink() function, so the code will not become much more complicated.

P.S. And nano, for that matter, doesn't seem to use rename at all: https://git.savannah.gnu.org/cgit/nano.git/tree/src/files.c#n1748
It just always overwrites the file from the temporary file, even though it is slower and less convenient.

Collaborator

P.S. I hope we don't need to do all those scrupulous proactive checks like st.st_dev != st_old.st_dev that vim does, we can just try os.Rename() and fall back to overwrite if it fails. With the exception for symlinks, which we need to check or resolve beforehand, to prevent replacing them with regular files.

BTW... what about FIFOs, sockets, device files, ...?

Collaborator Author

@JoeKar JoeKar Apr 30, 2024

BTW... what about FIFOs, sockets, device files, ...?

micro seems to have serious trouble with such files anyway: it crashed with a FIFO and hung with a file descriptor...
vim "opens" them and reports that these are not regular files.

Edit:
I suggest handling only files that micro actually supports. That means we should prevent loading (and storing) files like ModeDevice + ModeCharDevice, ModeIrregular, etc., because a crash or hang is unacceptable.

Collaborator Author

If the backup option is off (no backup file yet), create the backup file.

Since the backupThread is async (with possible delay) we can create the backup anyway and don't need to consider the option. For the removal of the backup the option is important then, right?

Collaborator

Our goal is only to ensure that the original file is never left in a damaged state (or in the worst case, it is left in a damaged state, but we have a backup so we can restore it), right? If /home is full so we cannot do anything, it is not our problem.

Collaborator

...A more tricky problem: when renaming, we should preserve file permissions (chmod) of the original file, and probably also its owner & group (chown). And even worse, probably also its extended attributes (setxattr).

Vim seems to do all that: here, here and here.

And if anything of that fails, we should probably fall back to overwrite, before even trying to rename. (We don't want to silently change user's file attributes, right?)

Collaborator Author

@JoeKar JoeKar May 1, 2024

Ok, that's a fair position. But even then we can only proceed once the backup was successful; otherwise we would use a broken backup as the source to overwrite a healthy target.

Our goal is only to ensure that the original file is never left in a damaged state (or in the worst case, it is left in a damaged state, but we have a backup so we can restore it), right?

Do we really have one then? If we ignore /home running out of space and accept a damaged backup, we have no "plan B" the moment something goes wrong with the target.

Edit:

And if anything of that fails, we should probably fall back to overwrite, before even trying to rename. (We don't want to silently change user's file attributes, right?)

Yes, then we're safer with a direct overwrite from a different source instead of moving the source.

Collaborator

We should make sure to avoid this situation. If the backup was unsuccessful, we should not overwrite the target file with it, we should just tell the user that we cannot save the file. Also I think we should remove the broken backup.

Collaborator Author

Ok, I can live with that. It's the simpler approach and we don't need to use temporary files.

@JoeKar JoeKar marked this pull request as draft April 30, 2024 18:46
@JoeKar JoeKar marked this pull request as ready for review May 12, 2024 19:49
@@ -64,6 +64,8 @@ var (
// BTStdout is a buffer that only writes to stdout
// when closed
BTStdout = BufType{6, false, true, true}
// BTDefaultNoSyntax is a default buffer with disabled syntax highlighting
BTDefaultNoSyntax = BufType{0, false, false, false}
Collaborator Author

I could have used BTScratch too, but that would change the current approach of not storing this kind of buffer, so I decided to create a new one to explicitly highlight what it is used for.
Currently I have no feeling for whether defining its Kind to be of the same type as BTDefault was a wise choice...

Collaborator

I think the whole approach is wrong. Why not decouple the backup functionality from the Buffer (and move it outside the buffer package) instead?

What is a "buffer"? It is a file that is edited or viewed by the user, not a file like settings.json that is being used by micro internally, right?

Collaborator

...Also there is an issue: if e.g. settings.json is also opened by the user, the same file in ~/.config/micro/backups/ is used for saving it both as a buffer and internally by micro when saving settings.

On second thought: actually maybe it's ok (maybe even useful) that they use same backup file, as long as we make sure that they don't conflict? (Do we?)

...And also, independently of your PR, looks like we also have at least the following issues with backups:

  1. util.EscapePath(b.AbsPath) does not uniquely encode the file path, so the same backup file may be used for different files (e.g. if a file has the % character in its name). (And in Windows it's even worse, since both slashes and colons are encoded with %...)
  2. b.Backup() executed asynchronously (from backupThread) is accessing the line array without locking. (Yeah, same again...)
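The non-uniqueness in point 1 is easy to see with a simplified version of the escaping (the real util.EscapePath differs in details; this is only an illustration):

```go
package main

import (
	"fmt"
	"strings"
)

// escapePath mimics the basic idea of util.EscapePath: encode path
// separators as '%' to get a flat backup file name.
func escapePath(path string) string {
	return strings.ReplaceAll(path, "/", "%")
}

func main() {
	// Two distinct paths collapse to the same backup name.
	fmt.Println(escapePath("/tmp/a/b"))
	fmt.Println(escapePath("/tmp/a%b"))
}
```

Both calls produce the same flat name, so the backups of `/tmp/a/b` and `/tmp/a%b` would overwrite each other.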

Collaborator Author

I think the whole approach is wrong. Why not decouple the backup functionality from the Buffer (and move it outside the buffer package) instead?

That's also a possibility. Currently I have no overview of whether it's worth the effort. At least we wouldn't need the Buffer as an intermediate step, and thus no new type.

On second thought: actually maybe it's ok (maybe even useful) that they use same backup file, as long as we make sure that they don't conflict? (Do we?)

At least I didn't focus on that yet. So most probably we don't.

[...], looks like we also have at least the following issues with backups: [...]

I will add it to the TODOs.

Collaborator

Currently I've no overview if it's worth the effort.

I think it is. Assuming that we want to implement this feature (generalized backups) at all, let's implement it cleanly, not via ugly design shortcuts.

...Though the higher-level question worth thinking about: is implementing this feature (i.e. "safe save" not just for buffers but also for internal files) worth the effort in the first place? Seems like it is, occasionally wiped settings.json is very annoying.

Partially this is a consequence of the (IMHO unfortunate) historical decision to let micro write its config files automatically (e.g. vim and nano don't write their vimrc and nanorc, so they have no dilemma how to write them safely)... OTOH even regardless of settings.json and buffers.json there are also those serialized files in ~/.config/micro/buffers/ which would need to be written by micro anyway...

@@ -24,6 +24,35 @@ import (
// because hashing is too slow
const LargeFileThreshold = 50000

func resolvePath(filename string) (string, error) {
Collaborator

AFAICS your PR doesn't implement the rename case (i.e. the actual atomic save) yet. Which is fine, but it means that resolving symlinks is not really needed yet?

Collaborator Author

@JoeKar JoeKar May 16, 2024

There is one reason why I left that behind:
With the rename we will lose the original creation timestamp of the target file and take over the one from the file that we renamed. The user may need these timestamps for file filtering, which I don't want to break.

Edit:
And yes, in that moment we could throw away the whole symlink resolving.

Edit2:
Correct typo.

Collaborator

With the rename we will lose the original creation time stamp of the target file and will take over the one from the file which we renamed. The user could need these timestamps for further file filtering, which I don't like to remove.

I don't understand this at all. What does it mean "to take over the timestamp"? What "filtering"?

Collaborator Author

Let me explain it with this snippet:

$ ls -lth --time=birth
-rw-rw-r-- 1 joeran joeran    0 17. Mai 07:13 new.txt
-rw-rw-r-- 1 joeran joeran    0 17. Mai 07:12 old.txt
$ ls -lth --time=mtime
-rw-rw-r-- 1 joeran joeran    0 17. Mai 07:13 new.txt
-rw-rw-r-- 1 joeran joeran    0 17. Mai 07:12 old.txt
$ echo "test" > new.txt 
$ ls -lth --time=birth
-rw-rw-r-- 1 joeran joeran    5 17. Mai 07:13 new.txt
-rw-rw-r-- 1 joeran joeran    0 17. Mai 07:12 old.txt
$ ls -lth --time=mtime
-rw-rw-r-- 1 joeran joeran    5 17. Mai 07:14 new.txt
-rw-rw-r-- 1 joeran joeran    0 17. Mai 07:12 old.txt
$ mv new.txt old.txt 
$ ls -lth --time=birth
-rw-rw-r-- 1 joeran joeran    5 17. Mai 07:13 old.txt
$ ls -lth --time=mtime
-rw-rw-r-- 1 joeran joeran    5 17. Mai 07:14 old.txt

By "filtering" I was referring to the use case where a user wants to search/filter for files created at a certain time, as opposed to modified. We would overwrite the file's birth date the moment we rename the file, which we shouldn't do.

Collaborator

Thanks, I wasn't even aware that file creation timestamps are now a thing in Linux.

So, if we implement rename, we need not just to copy the file's owner, mode and xattrs (as I mentioned earlier) but also at least its creation time...

Aaaa... got it. These creation timestamps are immutable in Linux, we cannot just set them.

Ok, I got your point: you don't want to implement rename at all, for this reason. Yeah, it makes sense (and there may be even more reasons, which we don't realize yet).

So, we can just remove symlink resolving now, right?

Collaborator Author

So, we can just remove symlink resolving now, right?

Yep, since we no longer need the full absolute path (no renaming), it's superfluous.
We can simply work with the given paths (and symlinks too).

@@ -57,9 +57,8 @@ func resolvePath(filename string) (string, error) {
// overwriteFile opens the given file for writing, truncating if one exists, and then calls
// the supplied function with the file as io.Writer object, also making sure the file is
// closed afterwards.
func overwriteFile(name string, enc encoding.Encoding, fn func(io.Writer) error, withSudo bool) (err error) {
func overwriteFile(name string, enc encoding.Encoding, fn func(io.Writer) error, withSudo bool, lock func(), unlock func()) (err error) {
Collaborator

I suppose you agree that this is extremely ugly?

And this screen stop/start stuff is really only needed for the withSudo stuff, which is in turn only needed for the buffer save case, it is not used in any other use cases of overwriteFile().

So instead of adding more such specific arguments to overwriteFile() and making the interface completely confusing, it seems we should do quite the opposite: make it more abstract and remove the notion of sudo from the interface. Instead of passing withSudo, can we pass 2 additional optional callbacks: the "pre-write" callback (which also optionally returns the desired io.WriteCloser, instead of the default one opened normally via os.OpenFile()) and the "post-write" callback?

The implementations of those callbacks (needed for the sudo case only) would be nicely encapsulated in the buffer package.

While we're at it, enc is also buffer-specific and not used in other cases, so it would be nice to remove it from the interface to the implementation of those buffer-specific callbacks as well. (That seems more tricky though, we can talk about it later.)

@@ -57,7 +57,7 @@ func resolvePath(filename string) (string, error) {
// overwriteFile opens the given file for writing, truncating if one exists, and then calls
// the supplied function with the file as io.Writer object, also making sure the file is
// closed afterwards.
func overwriteFile(name string, enc encoding.Encoding, fn func(io.Writer) error, withSudo bool, lock func(), unlock func()) (err error) {
func overwriteFile(name string, enc encoding.Encoding, fn func(io.Writer) error, withSudo bool, sucmd string, lock func(), unlock func()) (err error) {
Collaborator

Ditto...

// OverwriteFile opens the given file for writing, truncating if one exists, and then calls
// the supplied function with the file as io.Writer object, also making sure the file is
// closed afterwards.
func OverwriteFile(name string, enc encoding.Encoding, fn func(io.Writer) error, withSudo bool, sucmd string, lock func(), unlock func()) (err error) {
Collaborator

not exactly nit: WriteFile() might be better?

As long as it doesn't implement the rename case, OverwriteFile() is fine too, and maybe even better (after all that's what it does). But if we implement the rename, it would be not just overwrite, so we should rename it to WriteFile() then?

Collaborator

Update: per my other comment, it now seems all this should be reworked, to simply call OverwriteFile() first for the backup and then for the target file. If so, it makes sense to keep it named OverwriteFile(), and add a new WriteFile() function, which would call OverwriteFile() first for the backup and then for the target?

Collaborator Author

WriteFile() calling OverwriteFile() doesn't sound wrong, and it feels a bit cleaner as an interface for the rest of the packages.
I currently have just one problem with the overall backup abstraction:
we need to obtain the backup location from the config again, or find a nice solution to pass it into this new abstraction.

At least this was the reason why I decided to stick to the buffer -> backup path for now.

@@ -106,6 +106,10 @@ func (b *Buffer) saveToFile(filename string, withSudo bool, autoSave bool) error
return err
}

b.Path = filename
absPath, _ := filepath.Abs(filename)
b.AbsPath = absPath
Collaborator

Why does this need to be moved? Not obvious to me.

Collaborator Author

It was necessary for the following b.Backup(true) call, which needs b.Path to be set, since it operates on the same Buffer.
Ok, the next two lines aren't necessary, but I decided to keep them together in one place for the moment.


if _, e = io.Copy(file, readCloser); e != nil {
readCloser.Close()
b.Close()
return
Collaborator

Do you actually apply the saved backup afterwards? For recovering broken bindings.json (which is what it is for, right?)?

Collaborator Author

No, I didn't test the apply path. I trusted the existing functionality. :(


buf, err := buffer.NewBufferFromFileAtLoc(files[i], btype, flagStartPos)
file := files[i]
fileInfo, _ := os.Stat(file)
if !fileInfo.Mode().IsRegular() {
Collaborator

I believe this should be done inside NewBufferFromFileAtLoc(), after the fileInfo.IsDir() check.

Try opening /dev/zero in micro (i.e. after micro has already started).

Also:

  1. You ignore the error returned by os.Stat(), so micro crashes if the file doesn't exist.
  2. This doesn't respect the parsecursor option, i.e. micro foo.txt:100 doesn't work (and again, crashes).

Collaborator Author

I tried /dev/zero, but only as an argument.
Unfortunately I don't use the *:x:y syntax and therefore didn't notice that different code path at the time.

Thank you for spotting this and for the suggestion.
I will correct it.

if errors.Is(err, fs.ErrNotExist) {
return filename, nil
}
if err != nil {
return filename, err
}
fileInfo, err := os.Lstat(filename)
if !fileInfo.Mode().IsRegular() {
Collaborator

Maybe it's cleaner to do this outside resolvePath() (after it)? To make the code clear: first we resolve the symlink, then we check if the file is a regular file.

Yeah, by doing it in resolvePath() we avoid doing the same os.Stat() syscall twice. But this is not really a hot path, right?

Wait... actually you do os.Stat() at every iteration of the loop, which is even more redundant, right? So it seems both cleaner and more efficient to do it this way:

  1. first call os.Stat() before the loop (for symlink loop detection)
  2. then do the os.Lstat() loop
  3. then (maybe outside resolvePath()) do os.Stat() for the resolved path (to check if it's a regular file).

...Also, if this function will only do symlink resolving and nothing more, maybe resolveSymlinks() would be a better (clearer) name?

unlock()
return err
}
} else if writeCloser, err = os.OpenFile(name, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0666); err != nil {
Collaborator

I've tried to add some keybinding via the bind command (with your PR applied)... I shouldn't have. It said Backup file doesn't exist and wiped my bindings.json.

I have no idea why it failed to find the backup (and I can't reproduce it again). As to why it truncated the file while it should have left it untouched, I guess, well, at least this os.O_TRUNC is the answer...

So, doing all the backup manipulations inside the fn() callback seems wrong. It seems we should instead just OverwriteFile() the backup file first, then OverwriteFile() the target file?

Collaborator Author

at least this os.O_TRUNC is the answer

Damn, yes... I didn't pay attention to these flags so far. Do we really need to open it that way? According to the docs it should truncate on write anyway, or am I wrong?

Anyway, decoupling seems to be the better approach then, and most probably easier to follow.

Collaborator

It either truncates on open() or doesn't truncate at all. It doesn't truncate on write(). That's the common Unix semantics. Which docs say otherwise?

Collaborator Author

os.WriteFile explains it as follows:

[...]; otherwise WriteFile truncates it before writing, [...]

But I'm unsure whether different Go interfaces treat it differently and whether I misunderstood it.

Collaborator

os.WriteFile() aka ioutil.WriteFile() is a helper which does all that at once: it opens the file (truncating it), writes the entire content to it, and closes it.

Why can't we just use this os.WriteFile() then? First, because, you know, we have this sudo stuff and this encoding stuff (though that is relevant to buffer files only), and second, because we want to sync the file to the disk (and that is relevant to both buffer and non-buffer files), which we do via f.Sync(). With os.WriteFile() we can't sync the file, since we don't have its descriptor (so we could only sync everything, not just this file, which is very inefficient).

// write lines
if fileSize, e = file.Write(b.lines[0].data); e != nil {
var readCloser io.ReadCloser
readCloser, e = b.GetBackupFile()
Collaborator

Do we need this hassle? Can't we just write the buffer to the file (the same buffer that we've just written to the backup)? We still have this buffer in memory, right?

Collaborator Author

Sure, we still have it available.
I was stuck on this backup idea and directly used it for the copy action.
At least it was shorter, code-wise, on this path.

But if you insist on using the backup just as a fallback file, then we can do it the other way around.

@JoeKar
Collaborator Author

JoeKar commented May 16, 2024

Still a lot of rework left, but thank you for your time reviewing this! 👍
