-
Notifications
You must be signed in to change notification settings - Fork 661
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MS Word 2007 & 2010 can not save file normally in mirror.exe #249
Comments
It's strange to me that you at some point get the following output: ###GetFileInfo 1584 GetFileInformationByHandle success, file size = 13063 result = 0 |
The system seems to make a request to GetFileInfo without calling @superhq , can you try to add |
@Liryna I try to use CreateFile with share mode FILE_SHARE_READ|FILE_SHARE_WRITE|FILE_SHARE_DELETE in MirrorGetFileInformation function, but result dose not change. |
Could this be related to the fact that it isn't possible to use the "FILE_FLAG_DELETE_ON_CLOSE" attribute with Dokany (or is this only true with FUSE)? I believe MS Office is using this flag to remove the related temporary file after closing the program. |
Hi @jetwhiz, This is only true with FUSE. |
The issue here is that after So we open a new temporary handle in On my side, if I replace all this code https://github.com/dokan-dev/dokany/blob/master/samples/dokan_mirror/mirror.c#L595-L604 I am not sure on this fix because it is not offen that the issue happen on my side. So if someone that really had the issue can make a test and see if it fix the behaviour or at least improve, it would be great :) |
I've tried many times to reproduce this bug. I made it happen right away the first time. Since then, I couldn't make it happen except one time when I wasn't set up for debugging it. I think return STATUS_INVALID_PARAMETER is probably right. I will put that change in my fs and see how that goes. |
I got it to happen again with return STATUS_INVALID_PARAMETER. Unfortunately, I wasn't debugging. |
We get to MirrorMoveFile() from DokanSetRenameInformation(). I think it should be getting there from NtSetInformationFile() with rename information. I think in that case, Windows would change the name of the file, but the open handle (which I read needs to be opened DELETE for this to work, and I see the last time the Word tmp file is opened it is for DELETE and also other things) would still be valid. So the name of the file changes, but the handle stays valid. I think MirrorMoveFile() should use SetFileInformationByHandle() and set the file rename information. It is available in Vista and later. I think changing MirrorMoveFile() to do that will fix the bug. And I think MirrorGetFileInformation() should return STATUS_INVALID_PARAMETER if there isn't a handle. Here is the code I tried, and it seems to work
|
hi @bailey27, That's a way to do the rename ! Changes has been pushed 5e95252 It does improve the issue on my side but I can still make it happen. In the log, I no longer have
Or CreateFile that fail because the file does not exist (that seems to be normal) I can only say, that I can only reproduce this behaviour if explorer is open in the same folder as the file saved by word. |
@Liryna , Thanks for the tip about needing to have explorer open in the folder. I can reproduce it easily now. I still don't know what's happening now. But I did notice just now that in the code I submitted, I used FILE_RENAME_INFORMATION instead of FILE_RENAME_INFO. They are the same except for a BOOL instead of BOOLEAN and a DWORD instead of a ULONG. I think they are the same functionally. But FILE_RENAME_INFO is defined if you include windows.h, but FILE_RENAME_INFORMATION looks like it's defined in a ddk header that some people who would want to compile mirror.c in your release might not have. I think it would be better to change it to FILE_RENAME_INFO. |
Struct has been changed 8d58560 👍 FILE_RENAME_INFORMATION used was the one defined in In my
About the issue, I think it is that explorer try to get acces to word file created during saving (there is something like 3-4 files created). I guess at this moment we open a handle in one of the file when word want to move/delete or get exclusive access. I even looked to see an acces denied during createfile on such files but I found nothing for now 😢 |
@Liryna , Weird. on MSDN they say there's a BOOL there. I was thinking just a few minutes ago that it could have been bad before. I might have left heap garbage in the upper bytes of the BOOL if there were really a BOOL there. But I guess I was lucky, and it really didn't matter. Thanks for changing it. It looks to me like when word saves the file, it creates a tmp file for saving the new version of the document. Then it moves the old docx to another tmp file. Then it writes the new doc to the first tmp file. Then it moves the first temp file to the original filename. Then it deletes the 2nd tmp file (with the old version of the doc in it). And when it fails, then you have the 2nd tmp file (the one with the old version of the doc) still there. I had some debug info I wanted to paste here but I lost it. But what it said was it could not delete the tmp file because it was not found (in the delete upon cleanup). Now after word failed, I cleared the hidden attribute of an undeleted temp file. On my mirror drive, I cannot read the file. But in my real directory, I can.
I don't see any reads happening in mirror. I can still get file info on it on the mirror drive.
Could it be that dokan1.sys is what is failing? That it returns an error when word tries to read the file and nothing about it gets to mirror? I haven't been looking at the driver. I will look there next. |
I tried using the driver I built from (almost) current git source. And I couldn't reproduce the bug. I switched to the driver built from the RC3 source zip, and I got word to fail. And I unhid the .tmp file and I couldn't read it. In the driver, I see it is getting here when I try to read the file. It thinks the file is a directory.
Then I tried to read it from the root dir of the fs (c:\tmp\mirror) and I could read it ok. |
😲 Thats very wiered.do you think being able to see from where it is set ? (Good job to find out 👍 ) |
This is what I've been able to find out so far. I close word, close the explorer window, run process explorer and search for ~WRL0001.tmp and don't find any open instance of it. Then I run my program to open the file and read 16 bytes from it. Then I break on my breakpoint in DokanDispatchCreate() in the driver and watch the filename to see when it's interesting. When it calls DokanGetFCB() to get the fcb for this file, it finds an fcb for this file in the linked list of active fcbs. This fcb has flags = 1, which is for DOKAN_FILE_DIRECTORY. It looks like there are only two fcbs active (open?). One if for \ and the other is for this file. So how did that file's flags get set to 1, and why is there still and fcb for it if no process had it open before m program opened it? |
The behavior can be explained if SL_OPEN_TARGET_DIRECTORY is set during the create. This should be handled in the driver properly, which is in my opinion actually not the case. The driver creates fcb for c:\myfile.txt instead of just \ if the SL_OPEN_TARGET_DIRECTORY is set. The driver should consider this flag and extract the proper path and behave like it's a normal open of a directory. Actually the full complexity is forwarded up to the user mode application. Please check if I'm correct with my assumption before doing any changes. 😄 |
I'm not really sure. When I try to reproduce this bug with word, I do see some creates with SL_OPEN_TARGET_DIRECTORY. I changed the DbgPrint about SL_OPEN_TARGET_DIRECTORY in DispatchCreate() in the dll to print the fileName (before the file name part gets stripped), and I see in dbgview that this option is being used only to open with filename \ and not the tmp file. I think if the temp file were being specified h an SL_OPEN_TARGET_DIRECTORY then that could explain it. But it looks like the temp file is never opened with SL_OPEN_TARGET_DIRECTORY. I have not been able to reproduce the actual bug at all today. Until just now, because I said that :-) But what I did see is that if I try to repro the bug but can't repro it, then I exit word and close the explorer window, do a dir /a in the directory and see there is no ~WRL0001.tmp file anymore. Then I set a breakpoint in DokanGetFCB() and do something to trigger and look at the fcbs in the list and there is one for ~WRL0001.tmp sill there. But it doesn't have the directory flag set. I think somehow the driver is leaking the fcb of a deleted file. Or maybe the kernel still thinks there's something to be done with that file and the fcb should still exist? But what? I tried to reproduce this by just renaming a file (which mirror now does with SetFileInformationByHandle()) and I didn't have a stale fcb in the list. |
makes no sense, why should windows open \ with SL_OPEN_TARGET_DIRECTORY? You don't need to debug, the information is part of the driver log file so you can post the log file here if you can reproduce it. The behavior in the driver is wrong, so it should be fixed before we go on with the investigations. My assumption is that the driver says open the file ~WRL0001.tmp with SL_OPEN_TARGET_DIRECTORY set. Driver creates FCB for the filename ~WRL0001.tmp instead of filename of the parent directory. The next time windows makes CreateFile request for ~WRL0001.tmp without the flag SL_OPEN_TARGET_DIRECTORY. The driver takes the available fcb instead of creating a new one. |
Yes, you right! But, it didn't look like that from the DLL. I put this code in the DLL to see what file was being opened with SL_OPEN_TARGET_DIRECTORY.
It prints out this every time (that the filename is \ )
But when I break on the debug print about SL_OPEN_TARGET_DIRECTORY in DokanDispatchCreate() in the driver
And step through the code, I can see that the file is the temp file, that an fcb with that filename is created, and I can watch that filename get marshaled into the eventContext, and I even du'ed around in the eventContext until I found it. But somewhere it's getting changed. I thought it was maybe CheckFileName() in the dll, but it wasn't. So, I think you are right. And I think there is some race condition as to whether this temp file gets an fcb created for it as a directory or not. And when in does get created as a directory first, which doesn't happen all that often, then word can't use the file, so it has the error. I agree that the driver should handle SL_OPEN_TARGET_DIRECTORY as you say. I think that will fix this bug. |
I changed my driver (based on current git source) to deal with SL_OPEN_TARGET_DIRECTORY. It removes the file name from the path, ors in FILE_DIRECTORY_FILE in the create options, makes sure FILE_NON_DIRECTORY_FILE is clear, and makes sure that the DLL doesn't get the SL_OPEN_TARGET_DIRECTORY in the flags (by masking it it out from the EventContext->Flags in SetCommonEventContex(). So the dll should process it like a normal open directory request. I don't see the DLL getting any SL_OPEN_TARGET_DIRECTORY after this. I haven't tried that hard to reproduce the actual word bug, but it hasn't happened yet. BTW, I moved my word file into a subdir. So now it's in \blah\1.docx. When I run my new driver (based on the current git source), and then open and save the file in word, exit word and close the explorer window, then do my type foo.txt in the root dir and watch what happens in DokanGetFCB(), I see there is an fcb in the list for \blah~WRL0001.tmp, and it has Flags = (16) DOKAN_DELETE_ON_CLOSE. I think this could be bad. I tried it with the plain git source (no SL_OPEN_TARGET_DIRECTORY change) and I found it still having Flags = (16) DOKAN_DELETE_ON_CLOSE. I didn't notice this delete on close flag before, I think because I was using the RC3 driver. Then I tried the RC3 source with and without my changes, and there is fcb there for \blah~WRL0001.tmp but no DOKAN_DELETE_ON_CLOSE (and no directory bit either this time). While I was testing it, I notice that when I do "type foo.txt" in my M:, then somebody starts trying to open various files like \1.docx, \2.docx, \blah~WRL0001.tmp, etc. So maybe it's normal for these files to be there because somebody's trying to open them. But I think the DELETE_ON_CLOSE flag being there when I run the current git source is weird and maybe bad. |
Hi @bailey27 , That's a good debug you did 👍 About DELETE_ON_CLOSE, your are right, there is a change since RC3: 59d08ff I also think that a request on (For information, It would be really great to have a good implementation of it ! ❤️ |
I'm not really sure now that SL_OPEN_TARGET_DIRECTORY should be handled only in the driver. It think what MS wants to do with SL_OPEN_TARGET_DIRECTORY is "Really open a handle to the directory that the file is in, but also check if the file exists, and if it doesn't, then return STATUS_SUCCESS, but set in the info FILE_DOES_NOT_EXIST" I think the real problem now is that the driver ends up getting an fcb for the wrong file. I think maybe the only change needed about this is that the driver should, if SL_OPEN_TARGET_DIRECTORY is set, then get an fcb for the parent dir instead? |
I backed out the changes from 59d08ff, and now I don't see DOKAN_DELETE_ON_CLOSE in the fcb for the tmp file that exists after word exits and explorer is closed. |
SL_OPEN_TARGET_DIRECTORY - If this flag is set, the file's parent directory should be opened. This is what Microsoft says. If you say file does not exist, would this mean that the folder does not exist or the file does not exist? Please send me a link of your repository so I can review your changes. Can you explain why the driver should get the wrong fcb? The driver should check the flag as first and change the filename before creating the fcb. Mirror should ignore the flag and the job is done. I would just remove the code from mirror to be sure that this special case is not handled by mirror. Driver should also not pass such information to mirror. |
I don't have any changes ready to commit. I did exactly what you said in the last paragraph (except I removed the flag right before dispatching the event to dokan1.dll) And there was no wrong fcb then. Except for this other problem with DOKAN_DELETE_ON_CLOSE which seems to be in the current source and not in RC3. If I back out the changes in the commit that @Liryna mentioned (59d08ff), then that problem goes away. "If you say file does not exist, would this mean that the folder does not exist or the file does not exist?" I'm saying that I think that if SL_OPEN_TARGET_DIRECTORY is specified, and STATUS_SUCCESS is returned, but info has FILE_DOES_NOT_EXIST, then it means the parent directory was opened, but the file does not exist. I'm basing that on what I read in this tread on OSR https://www.osronline.com/showthread.cfm?link=261045 Also, there is this code in the dokan DLL. https://github.com/dokan-dev/dokany/blob/e49cf5b/dokan/create.c#L216-228 But I think that isn't right, because the status it is checking is about if the directory was opened, not the file. I know what you say is what MS says, but their documentation isn't always complete. And if it weren't like this, then what would be the purpose of SL_OPEN_TARGET_DIRECTORY? Why couldn't the caller just open the directory? If it is as I think, then the purpose of SL_OPEN_TARGET_DIRECTORY would be so somebody can open the parent directory of a file and also check if the file exists at the same time. |
Does the os set SL_OPEN_TARGET_DIRECTORY and DELETE_ON_CLOSE on the same time resp in one request? |
DELETE_ON_CLOSE on Fcb should probably be clean when the ccb cleanup is made (and no other are currently requesting delete also) ? |
@bailey27 I think you are generally right with your statements and dokan is handling the SL_OPEN_TARGET_DIRECTORY nearly correctly without any changes. You said that you have created a new document and you saved it. Then you exited the word. After exiting the word is there still a \blah If all this statements are correct so I should be able to reproduce the issue. I'm not sure how the driver should behave but I think when the flag SL_OPEN_TARGET_DIRECTORY is set, this should be considered during DokanGetFCB so that dokan retrieves the correct one and also destroys the correct one. It's the Fcb of the parent object which should be used, if one exists. I know that word makes many CreateFile for \blah @Liryna Here I think the problem is more why the fcb \blah~WRL0001.tmp is still there even if the files does not exist anymore. |
I had a try at making it do what I think it should do about SL_OPEN_TARGET_DIRECTORY in the driver and in the dll @marinkobabic , The problem was it's fcb had the directory bit set, and word couldn't use it like a normal file. I think this happened because, as you suggested, the dir bit was getting set because of the mis-handling of SL_OPEN_TARGET_DIRECTORY. I think there are two two other problems. I think that there are stale fcb's. After I exit word and close explorer, if I do "M:\ type foo.txt" in the root of my mirror drive, and break in DokanGetFCB, I see fcbs there for \1.docx and \blah I think \1.docx comes from some "recent files" thing in explorer trying to open it and failing, but the fcb gets left around. The other problem is that this fcb for \blah~WRL0001.tmp has the delete on close flag (but the one for \1.docx does not). If I back out the changes in 59d08ff, then the fcb for \blah~WRL0001.tmp doesn't have the delete on close flag set. |
In your actual changes why are you not passing the original name up to usermode? Without the original name the usermode can't check if the file exists. |
I think I am passing the original name. If it's SL_OPEN_TARGET_DIRECTORY, then parentDir is not null, and it passes fileName. In this case, fileName doesn't belong to the fcb, parentDir does. @Liryna , "I also think that a request on \ should be denied (STATUS_ACCESS_DENIED) when CreateDisposition = FILE_CREATE FILE_OVERWRITE FILE_OVERWRITE_IF FILE_SUPERSEDE" I made it so it returns access denied if there isn't a parent dir. I Don't understand about the CreateDisposition. Maybe you could check for that case? |
@bailey27 |
@bailey27 |
I forgot to free fileName if DokanGetParentDir() returns an error. I put your debug print in my driver. To reproduce the problem, I browse to m:\blah in explorer, then open the file 1.docx in Word 2010 and modify and it and save it and close it I was wondering if maybe it could leak the fcb because in these two places it does __leave without freeing the fcb. here and. here I saw some of those debug prints about oplock status, so I changed them to also print fcb->FileName (not fileName). Then I saw this
Then when I save the file, \blah\1.docx gets renamed to \blah~WRL0001.tmp. The filename in the fcb should get changed to \blah~WRL0001.tmp when it gets renamed. So maybe it's leaking the fcb when the FsRtlOplockFsctrl() returns an error? |
Because of this reason I also add the FileCount to the debug print. Normally during rename the FileCount is 1. The rename is the main reason why the file create happens with SL_OPEN_TARGET_DIRECTORY. The question is if the file close after the rename destroys the fcb properly or not? If you could post the log in pastebin so also other contributors may read it. About the oplock, no idea. But let's see the log. At the end if the log we will see that you try to open foo.txt and DokanGetFCB will print the WRL file which is still there. Your actual state is still that the file is not saved after modifying it? Is the fcb still there after the first save? |
No, the file is saved. I haven't seen it not be saved since I put the changes about SL_OPEN_TARGET_DIRECTORY in the code. I have been debugging the fcb leak since then. I'm sorry about not pasting logs. It looked to me like the reason the file wasn't being saved before I changed the SL_OPEN_TARGET_DIRECTORY behavior was that the fcb for that file picked up the DOKAN_FILE_DIRECTORY bit because of the wrong SL_OPEN_TARGET_DIRECTORY behavior before. Then when Word tried to read from the file, the read returned an error because everybody thought it was a directory now. And I think the fcb leak caused it to always be a directory after that happened. But the erroneous directory bit isn't happening now, so word always saves the file. I think what happens now about the fcb leak is Word tries to open \blah\1.docx The FsRtlOplockFsctrl() returns an error, causing an fcb to be leaked for \blah\1.docx. It leaks because the fcb isn't freed then, and DokanDispatchCreate() returns STATUS_INVALID_PARAMETER (0xc000000d). So there is now an fcb for \blah\1.docx with FileCount 1, and no corresponding close will happen to decrement it because of this create returned an error. Word seems to shrug off this failure, and it opens \blah\1.docx again (I think several times actually), and then word renames it to \blah~WRL0001.tmp. When the rename happens, the fcb (which has +1 too many FileCount because of the leak) has its FileName changed to \blah~WRL0001.tmp. And the +1 too many count is still there. When I exit everything and type foo.txt to make the driver walk through the list of active fcbs, there is one for \blah~WRL0001.tmp still there. But I think it was leaked when it was called \blah\1.docx, because of the oplock failure. |
Here is the log. The oplock failures happens even before I open the file in Word. So it must be explorer that is trying to open \blah\1.docx and getting the oplock failure. It also tries to open and gets an oplock failure on \1.docx (this file hasn't been there in the fs for 2 days). And I see the fcb for \1.docx leaked in the fcb list as well. You can see in the log that when the rename happens, the FileCount is 2.
|
If I free the allocated things (fcb, ccb, eventContext) if the oplock things are causing DokanDispatchCreate() to return an error, then the fcb leak doesn't happen. Here is the log I was typing foo.txt after closing everything, and it wasn't seeing \blah~WRL0001.tmp in the fcb list. |
@bailey27 If so far all is fine I would say that you should do a pull request. It was a pleasure to talk to you. Good job! |
Rename scenario could be: Open handle to file1 |
I thought you wanted me to make sure that the fcb was being renamed. I stepped through it in the debugger, and it appeared to be so. I already did the pull request before I saw your last comment. |
Then is fine :-) Thanks |
Thank @bailey27 for all your work ! Thats really amazing a great and @marinkobabic for your expertise as always ! ! ❤️ 🏆 🎉 Well found for the OpLock leak ! I also updated CreateFile support status #149 |
Environment
Check List
Description
Do the fallowing steps in mirror sample can reproduce the problem:
Logs
log.txt
The text was updated successfully, but these errors were encountered: