Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New unified favourites can disappear after game exit #4023

Closed
Fixer-007 opened this issue Apr 22, 2016 · 32 comments
Closed

New unified favourites can disappear after game exit #4023

Fixer-007 opened this issue Apr 22, 2016 · 32 comments
Labels
Bug Issues that were confirmed to be a bug @ Client / Audiovisuals Windows

Comments

@Fixer-007
Copy link
Contributor

Fixer-007 commented Apr 22, 2016

Minetest Version: c350cfb and 21079cc

Bug:
Say you have 4 favourite servers, you join one, walk a bit, exit back to server list and decides to exit game completely (maybe even before server list is loaded). There is a high chance that you will lose your favourite servers, file disappears (deleted). I lost my favourites two or three times in one day.
You can lose it after logining to a server.
Try to join already only favourite servers to reproduce.

I don't know why file disappears but it connected to game exit. Please investigate as it can be confusing for noobs.

@est31
Copy link
Contributor

est31 commented Apr 23, 2016

cc @kilbith

@est31 est31 added the Bug Issues that were confirmed to be a bug label Apr 23, 2016
@kilbith
Copy link
Contributor

kilbith commented Apr 23, 2016

@est31 Why are you pinging me ? I haven't touched the deleting mechanism.

And even then, the mainmenu code (through core.delete_favorite) can only delete an entry, not the file. This function is called only when you press the button [Del. Favorite], nothing has been changed here recently.

@rubenwardy
Copy link
Member

Can this be reproduced before kilbith's change?

@rubenwardy
Copy link
Member

I asked whether it can be. You modified it last which is why you're being pinged. I'm away from my computer so I can't check myself

@Fixer-007
Copy link
Contributor Author

Can this be reproduced before kilbith's change?

Don't know, I didn't use that feature before kilbith change :)

@SmallJoker
Copy link
Member

SmallJoker commented Apr 23, 2016

If this is really Minetest's problem it must be a new one because when I tested it the first time I deleted my own favoriteservers.txt manually and it had over 180 entries of visited servers, which wouldn't be the case if the list was deleted every 4th time.
But searching in the deletion function could be misleading. The file could be also be emptied to begin writing the updated favorite list in it.

@Zeno-
Copy link
Contributor

Zeno- commented Apr 23, 2016

@kilbith nobody is pointing at you... they are pointing at a commit that might be a possible cause (there was nothing personal that I can see). It's a normal procedure for tracking down bugs (look at the latest commits first)

It could also be related to 2177f30 (@Ekdohibs)

@Fixer-007
Copy link
Contributor Author

Fixer-007 commented Apr 23, 2016

It could also be related to 2177f30 (@Ekdohibs)

This bug on earlier version (day before), there is a slight possibility that it can fix the bug, I will try to compile with it.

@kilbith
Copy link
Contributor

kilbith commented Apr 23, 2016

Suspecting that my changes caused that is stupid. It has nothing to do with the mainmenu code and moreover with my changes. As I explained above, this piece of code has no capability to delete a file.

I'm not yet sure that @est31 reproduced himself the bug before labelling as so.

The file could be also be emptied to begin writing the updated favorite list in it.

The maincode does not modify the file in any way for displaying the list, except for deleting an entry. It uses its datas in the formspec but that's it, its integrity is always kept. Also note Fixer indicates a file deletion, not an emptying.

@Zeno- Zeno- added Unconfirmed bug Bug report that has not been confirmed to exist/be reproducible and removed Bug Issues that were confirmed to be a bug labels Apr 23, 2016
@Zeno-
Copy link
Contributor

Zeno- commented Apr 23, 2016

@kilbith let people rule out your commit and then there is no question

@Fixer-007
Copy link
Contributor Author

Fixer-007 commented Apr 23, 2016

Still having this problem on 21079cc, lost favourites (entire file was deleted) after connecting to Xanadu server, not sure why and how. :/ It looks random, can lose it after connecting to server, or on game exit from serverbrowser.

@Fixer-007
Copy link
Contributor Author

Fixer-007 commented Apr 24, 2016

Trying to reproduce it on older build before kilbith PR. Can't reproduce yet.
Important observation: there is more activity with file after kilbith PR, before it was just same file that won't change after connection to server, or after server exit or game exit. With kilbith change there are more file activity, things are moving up and down after connection to server, or after server exit or game exit, file size can increase or decrease.

@paramat paramat added this to the 0.4.14 milestone Apr 25, 2016
@paramat
Copy link
Contributor

paramat commented Apr 25, 2016

Added to milestones as is a possible blocker.

@gregorycu
Copy link
Contributor

What platforms was this observed on? I cannot reproduce this on Windows x64 build.

@Fixer-007
Copy link
Contributor Author

Windows 7 sp1 64bit

@paramat
Copy link
Contributor

paramat commented Apr 25, 2016

Still unreproducable apparently.

@Fixer-007
Copy link
Contributor Author

Fixer-007 commented Apr 26, 2016

More data now since I think I caught it in action.
Logs from Sysinternals process monitor (in csv format): log.zip
Version: 26a9a85
Process: started MT, don't remember any favourites (file was probably absent on first run, since I've compiled newer version), went to change my Texture pack, went back and joined Extreme Survival Server, exited, joined Liberty Land, while on server noticed absence of favourites file, exit to server browser, no file, total exit from MT, favs file with Liberty Land.

Debug.txt, suspected time of file deletion is 2016-04-26 21:21:34

-------------
  Separator
-------------

2016-04-26 21:16:15: WARNING[Main]: Irrlicht: PNG warning: Incorrect bKGD chunk length
2016-04-26 21:16:15: WARNING[Main]: Irrlicht: PNG warning: Incorrect bKGD chunk length
2016-04-26 21:16:15: WARNING[Main]: Irrlicht: PNG warning: Incorrect bKGD chunk length
2016-04-26 21:16:18: WARNING[Main]: Irrlicht: PNG warning: Unknown iTXt compression type or method
2016-04-26 21:16:18: WARNING[Main]: Irrlicht: PNG warning: Incorrect bKGD chunk length
2016-04-26 21:16:18: WARNING[Main]: Irrlicht: PNG warning: Incorrect bKGD chunk length
2016-04-26 21:16:18: WARNING[Main]: Irrlicht: PNG warning: Incorrect bKGD chunk length
2016-04-26 21:16:18: WARNING[Main]: Irrlicht: PNG warning: Incorrect bKGD chunk length
2016-04-26 21:16:19: WARNING[Main]: Irrlicht: PNG warning: Interlace handling should be turned on when using png_read_image
2016-04-26 21:19:22: WARNING[Main]: Irrlicht: Unknown data object in animation of .x file: AnimTicksPerSecond
2016-04-26 21:19:23: WARNING[Main]: Irrlicht: Unknown data object in animation of .x file: AnimTicksPerSecond
2016-04-26 21:21:34: WARNING[Main]: Irrlicht: PNG warning: Interlace handling should be turned on when using png_read_image
2016-04-26 21:21:34: WARNING[Main]: Irrlicht: PNG warning: Interlace handling should be turned on when using png_read_image
2016-04-26 21:21:34: WARNING[Main]: Irrlicht: PNG warning: Interlace handling should be turned on when using png_read_image
2016-04-26 21:21:35: WARNING[Main]: Irrlicht: PNG warning: Interlace handling should be turned on when using png_read_image
2016-04-26 21:21:37: WARNING[Main]: Irrlicht: PNG warning: Interlace handling should be turned on when using png_read_image
2016-04-26 21:21:41: WARNING[Main]: Irrlicht: PNG warning: Unknown iTXt compression type or method

@gregorycu
Copy link
Contributor

I've taken a look at the log files. It's actually pretty funny.

The intended logic is, create temp file, delete real file, rename temp file to real file.

What actually happens is.

Create temp file
Open real file
Flag that the file should be deleted when it is closed
Close real file HOWEVER something else has the file open
Attempt to rename temp to real file, which fails cause the real file still "exists"
Something else closes the real file and it is deleted because of the flag on the file

If you remove the application filter in ProcMon, you should see other applications use the file. Maybe a virus scanner? Windows file indexer? You don't need to reproduce the bug, you just need to see something else opens the file and you've proven this bug is possible. To prove this actually caused your specific issue you could check to make sure the actions happen in the order when you reproduce the bug. But I am, myself, satisfied.

If this is the case, we probably need to rework the windows code. To be honest, I'm not sure what a good solution would involve.

@Zeno-
Copy link
Contributor

Zeno- commented Apr 27, 2016

Well, as a start

#ifdef _WIN32
    // On POSIX compliant systems rename() is specified to be able to swap the
    // file in place of the destination file, making this a truly error-proof
    // transaction.
    // However, on Windows, the target file has to be removed first.
    remove(path.c_str());
#endif

if remove() does not return 0 then perhaps something should be logged (the errno at least)? Also if it fails (doesn't return 0) then it shouldn't really continue to try and rename the file which is going to fail because remove() failed[1], so some extra logic somehow needs to be added here.

[1] Or does remove() on Windows actually just set a flag and indicate success? Is that what you're referring to @gregorycu? If that's what happens I wonder what checking if the file exists would return after the call to remove()... if the check does indicate that the file still exists perhaps everything could bail out at that point. Hypothetically, you'd then lose the temp file but not the existing file which is more important.

Edit 2: Actually, perhaps something as simple as below is sufficient for something that seems to rarely happen anyway. IF it does start to occur more often we will also have a log entry to make finding this easier in future.

#ifdef _WIN32
    // On POSIX compliant systems rename() is specified to be able to swap the
    // file in place of the destination file, making this a truly error-proof
    // transaction.
    // However, on Windows, the target file has to be removed first.
    if (remove(path.c_str()) == -1) {
        // a) Some code here to log an error probably.
        // b) Remove the temp file (will lose it... oh well)
        // c) Bail out and return false
    }
#endif

@gregorycu
Copy link
Contributor

The issue isn't that remove fails, the issue is that the filesystem defers deletion of the file until all open handles are closed (including handles due to the search indexer, virus scanners etc.). In this exact case, remove would have returned 0, yet we would have still suffered the problem. Maybe we need to use ReplaceFile.

@Zeno-
Copy link
Contributor

Zeno- commented Apr 27, 2016

That was my question, really. Is there any way to tell that Windows deferred the deletion? The Windows documentation, btw, says that remove() should return -1 if the file is already open

Otherwise, it returns -1 and sets errno either to EACCES to indicate that the path specifies a read-only file or the file is open [...]

https://msdn.microsoft.com/en-us/library/2da4hk1d.aspx

@gregorycu
Copy link
Contributor

gregorycu commented Apr 27, 2016

I've taken a look at the inner workings of remove, and it calls DeleteFile.

https://msdn.microsoft.com/en-au/library/windows/desktop/aa363915%28v=vs.85%29.aspx

The DeleteFile function fails if an application attempts to delete a file that has other handles open for normal I/O or as a memory-mapped file (FILE_SHARE_DELETE must have been specified when other handles were opened).
The DeleteFile function marks a file for deletion on close. Therefore, the file deletion does not occur until the last handle to the file is closed. Subsequent calls to CreateFile to open the file fail with ERROR_ACCESS_DENIED.

Emphasis mine. So, the documentation for remove is less than precise.

@Zeno-
Copy link
Contributor

Zeno- commented Apr 27, 2016

I don't trust the documentation. A check for -1 would not hurt in any case (it actually should be there in the first place even if the bug remains).

@gregorycu
Copy link
Contributor

The check wouldn't really change anything. Just means we can warn when we fail to update the favourites in some circumstances (like someobody opened the file in notepad). Won't prevent favourites from being wiped in this case

The documentation of DeleteFile appears to be consistent with Fixer's data from Process Monitor.

@gregorycu
Copy link
Contributor

The thing we really need to solve is wiping out the main file. We should fix the other cases too, but the first is a priority as there is nothing the user can do to avoid it.

@Zeno-
Copy link
Contributor

Zeno- commented Apr 27, 2016

I guess what I'm wondering (and I cannot tell this from the documentation because as usual it's incomplete) is whether or not the return value of remove() indicates success even if the delete is deferred. Common sense says that yes it would, but whether or not that actually happens is another story. Even if it does still indicate success, surely Windows has a function to tell whether or not the remove/delete was deferred or not. Then again, maybe not.

@Zeno- Zeno- added the Windows label Apr 27, 2016
@gregorycu
Copy link
Contributor

gregorycu commented Apr 27, 2016

DeleteFile and thus remove will return 0, even in the case of deferred delete. From the perspective of the caller, flagging the file as "to be deleted" is mission accomplished. It has no idea when that will occur. In most cases, this will happen when the caller gives up its handle to the file. Even if it were to check the number of open handles on the file, that can change by the time we get to closing the file. We may be able to loop until the file actually gets deleted, but a simplier solution would be to use ReplaceFile.

@Zeno-
Copy link
Contributor

Zeno- commented Apr 27, 2016

shrug use ReplaceFile then :)

@Zeno-
Copy link
Contributor

Zeno- commented Apr 27, 2016

It's hard to believe that the Windows API behaves like that. It must cause bugs in millions of applications, but I will remove the "unconfirmed bug" in any case because I can see that the Windows-only code at least has issues (like not checking return values of functions)

@Zeno- Zeno- removed the Unconfirmed bug Bug report that has not been confirmed to exist/be reproducible label Apr 27, 2016
@gregorycu
Copy link
Contributor

It's documented behaviour that would explain what we observed. We can ask Fixer to reproduce, though anyone on Windows with a virus scanner / search indexer should be able to. It's possible to write an app that will act just like a virus scanner and hold the file open for a while if you want to test the theory.

@Zeno- Zeno- added the Bug Issues that were confirmed to be a bug label Apr 27, 2016
@Fixer-007
Copy link
Contributor Author

Fixer-007 commented Apr 27, 2016

Looks like the possible cause of the bug is conflict with foobar2000, which was monitoring minetest folder for music ^_^ (including favourites file).
Proof: https://i.imgur.com/54mzjNT.png

@Zeno-
Copy link
Contributor

Zeno- commented May 1, 2016

Should be fixed by c957346

@Zeno- Zeno- closed this as completed May 1, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Issues that were confirmed to be a bug @ Client / Audiovisuals Windows
Projects
None yet
Development

No branches or pull requests

8 participants