Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite user-mode threading to work with overlapped IO/IOCP #210

Open
Corillian opened this issue Apr 4, 2016 · 43 comments
Open

Rewrite user-mode threading to work with overlapped IO/IOCP #210

Corillian opened this issue Apr 4, 2016 · 43 comments

Comments

@Corillian
Copy link

The current threading model used by Dokan is a major performance bottleneck in any user-mode driver that can make asynchronous IO calls - for example a driver that provides access to an FTP server. The OS can potentially make thousands of IO calls a second but with a limit of 15 threads and no async callback mechanism those threads will spend most of their time blocking on IO operations (in the FTP example they will block on socket read/writes).

To solve this I propose the following:

  1. DeviceIoControl() in dokan.c needs to use an IO completion port for dispatching IO events to the user-mode driver. IO events will automatically be handled on the system thread pool and neither Dokan nor the user-mode driver are required to manage their own threads nor worry about the number of threads that are allocated.
  2. A flag should be added to Dokan context information provided to the event handler that allows it to specify whether or not the event will be handled asynchronously. This is because the end of an asynchronous call requires special handling - it must explicitly tell Dokan to send a response to the kernel.
  3. The Dokan user-mode Dispatch*() functions will need to be split in two: one half (BeginDispatch*()) for event setup and the call into the driver event handler and one half (EndDispatch*()) for validation, cleanup, and sending the result to the kernel. Finalization for an asynchronous event (EndDispatch*()) can be initiated by a DokanFinishAsyncEvent() function called from the driver that takes a pointer to the context information as an argument.
  4. Currently information is passed to the user-mode driver and back to the kernel via function arguments. This would need to be changed to a pointer to a structure that's internally held by the Dokan event context info so that DokanFinishAsyncEvent() knows where to find the information it needs to pass back to the kernel.
  5. Dokan context objects should be pooled and push/popped from a global stack to prevent the driver from constantly hitting the memory allocator. A robust implementation should also include time stamps so that a number of context objects in the global pool beyond some threshold can be cleaned up after some amount of time passes to account for large temporary spikes in activity.

Unfortunately this both a non-trivial undertaking and would require breaking API changes however my driver is already running into problems with throughput due to restrictions imposed by the current Dokan threading model.

@Liryna
Copy link
Member

Liryna commented Apr 4, 2016

Great propsition @Corillian ! I totally agree with you ❤️

Removing the threads limitation would enhance a lot the dokan speed also !

If I understood well, most of the changes are going to be in the Dokan Library and we are keeping the DeviceIoControl for communication Library/Kernel, are you we gonna break API compatibility in this communication ?

About point 5, is it not related to #148 ?

It seems that you have a lot investigate around how to make it, do that mean it is something that you are already planning to do ?

@Corillian
Copy link
Author

About point 5, is it not related to #148 ?

Yes point 5 is identical to #148 except I don't believe the lookaside list API is available in user mode.

If I understood well, most of the changes are going to be in the Dokan Library and we are keeping the DeviceIoControl for communication Library/Kernel, are you we gonna break API compatibility in this communication ?

All of the changes would be in the user mode dokan.dll. It will keep DeviceIoControl() and it will involve breaking API changes as all of the event handler functions will need to have their arguments packed into (pooled) dynamically allocated structures.

I have no problem doing this change myself however I don't have a lot of time at the moment. There's a good chance I wouldn't be able to get to it until sometime in June at the earliest but that could change (for better or worse).

@Liryna
Copy link
Member

Liryna commented Apr 4, 2016

Oups yes lookaside does not exist for user mode ofc 😄 sorry

Ok I see the breaking changes you are talking about. It would be nice if we can just add a proxy fonction for each event that would convert the dynamically allocated structures into current function parameters and the other way back at the user function return.
If we really cannot avoid it, we will deal with it. It is just that I really would not like to force people to update all their implementation every version :)

Don't worry about time, we also currently have couple of things that we need to do to push for pushing this 1.0.0 out 😞 (some stability issue)

@JohnOberschelp
Copy link

I have also been happily and successfully using Dokany from the beginning and have also run into the bottleneck of waiting on IO.

I think a much easier solution that does not break our API, would be for the driver to handle the return value STATUS_PENDING from (at least) user mode function dokanOperations->ReadFile.

All existing Dokany projects would continue to operate, and developers that want concurrent IO could do their own that worked by returning STATUS_PENDING in conjunction with IO handler thread(s).

@marinkobabic
Copy link
Contributor

Driver can only communicate to the user mode if the user mode tells the driver "here you have a worker". How would that happen using the approach suggested by @Corillian? Need to understand the full flow.

The approach suggested by @JohnOberschelp seems to me much simpler. But in this case one thread will still be blocked. If driver gets STATUS_PENDING it will make the same call again using maybe even the same thread, until it gets another STATUS. Here we are limited in the amount of threads resp. workers provided to the driver. Possible solution here is to create a new additional thread resp. worker in user mode, if user mode returns STATUS_PENDING. STATUS_PENDING means this thread is still working resp. thread is blocked. We need to keep the old thread running with setting a special async flag in the context like suggested by @Corillian. If once this special threads have been handled properly those can be terminated. The thread amount specified in dokan options is just the minimum amount of threads which will be handled in parallel.

In generally we can increase the amount of threads in the dokan options to 64. This can be done already yet.

@JohnOberschelp
Copy link

You are right, @marinkobabic, one thread will be blocked, but not the rest of the threads.

Just by implementing handling of STATUS_PENDING on reads, and implementing lazy writes in userland, all Dokany threads would be able to go as far as they can till they are I/O bound. For read in particular, all the driver may need to do (as @marinkobabic said) when it receives a STATUS_PENDING, is yield to other threads, and try again; the read request is still completely valid. @Corillian, wouldn't this be all you need to get your FTP requests running in parallel?

@Corillian
Copy link
Author

DokanMain() will create an IO completion port, bind it to a thread pool (https://msdn.microsoft.com/en-us/library/windows/desktop/ms686760(v=vs.85).aspx), open the device with overlapped IO enabled, bind the device to the IO completion port, queue an async DeviceIoControl(), and then return (the current implementation blocks) while passing a state object back to the user. When a completion packet is available it will be dequeued on the thread pool bound to the completion port and a callback provided by dokan.dll will be called. This callback will then call the appropriate BeginDispatch*() for the packet which will remove a dispatch context object from the object pool (or allocate a new one if the pool is empty), fill in its arguments according to packet type, and then call into the user mode driver via the callback specified during Dokan setup (i.e. dokanOperations->ReadFile).

Inside the user mode driver callback a pointer to a structure with the relevant arguments for that packet will be provided. It should be done this way because we've then gone through the trouble of creating a proper structure to marshal packet arguments so that driver writers don't have to constantly reinvent the wheel for their own async operations. They just pass around a single pointer and reference it as needed. If the user mode driver callback will be asynchronous it will return STATUS_PENDING which tells Dokan to immediately exit the completion packet callback so that the thread can begin servicing other completion packets. If it's a synchronous callback it'll return its normal error code and Dokan will then automatically call EndDispatch*() to release the dispatch context object back to the object pool as well as return the result to the kernel.

Outstanding asynchronous operations will also need to be tracked by Dokan so that during shutdown all pending IO operations will be verified complete before tearing down the IO completion port and the thread pool. Once a user-mode asynchronous operation completes it will need to call DokanFinishAsyncEvent() which will cause Dokan to call the appropriate EndDispatch*(). We should also make our thread pool and IO completion port available to user mode drivers since a thread can only be bound to a single IO completion port and there's no reason to force driver writers to create another thread pool.

@marinkobabic
Copy link
Contributor

@JohnOberschelp
If you write so many files as you have threads, then all of them will be blocked.

@Corillian
This will be an amazing improvement. I totally agree. 😃

@suy
Copy link

suy commented Apr 8, 2016

I agree that the current API is weird, and doesn't make much sense for some cases (my filesystem queries the network asynchronously too, so I have to workaround it by using promises and futures, and blocking the callback in DokanOperations). But note that is pretty easy to implement a filesystem in Dokan+FUSE this way.

Also, I'm not very familiar with the Windows API, and even if I were, I will have to use both Windows' and Unix's if I had to write a cross-platform filesystem. My current approach for that is using a cross platform library that isolates me from that as much as possible.

An alternative would be to document the protocol to talk to the kernel device, am I right? FUSE is considering it and there is already a protocol sketch.

That would be helpful for people who want to use Dokan with any language.

@Maxhy
Copy link
Member

Maxhy commented Apr 8, 2016

Interesting discussion you started @Corillian. Current threading model was always a bother to me so I'm glad to see suggestions. My opinion is that it currently works and only result to performance issue so even if I totally agree, I will personally do on it after few others issues / bug fixes. Or maybe you already planned to work on it on your side as you already contributed several times? Better to know it to avoid work duplicate as happened with WiX installer 😄.

@suy For now, most current language Dokan libraries (C#, Java, ...) are wrapping the C resulting dynamic library dokan1.dll. I currently prefer this way to avoid user-mode handling work duplicate (libraries would have to through native win32 call anyway to talk to the driver). But documenting the protocol to talk to kernel device should be done for the sake of documentation indeed!

@Liryna Liryna added this to the 1.1.0 milestone Jun 28, 2016
@Liryna
Copy link
Member

Liryna commented Jun 28, 2016

Hi @Corillian ,

Since it is june, I just wanted to know if you got any time on this ?
(No need to hurry, 1.0.0 is even not released yet 😃 )

@Corillian
Copy link
Author

No unfortunately the stuff I needed to get done before I could address this ended up taking a bit longer than I had anticipated. I noticed that 1.0.0 isn't released yet so I figured the delay on my end wouldn't be a huge issue ;). I'm still planning on implementing this, hopefully soon as the lack of throughput is a near constant complaint from users of my driver.

@Corillian
Copy link
Author

Alright I've started working on this. I won't be working on it every day so I don't know exactly how long it will take but it's currently under way =).

@Liryna
Copy link
Member

Liryna commented Jul 11, 2016

@Corillian 👍 don't hesitate if you need help!

@Corillian
Copy link
Author

@Liryna Why is ZwCreateFile() called twice for SL_OPEN_TARGET_DIRECTORY?

@Liryna
Copy link
Member

Liryna commented Jul 12, 2016

@Corillian I don't have my environment to test with me but I would say that this come from the last pull requests of @bailey27.

If I remember, there should be a first createfile to know if the file exist and a second for opening the directory.

Or do you mean you have two time the same createfile?

@bailey27
Copy link
Contributor

bailey27 commented Jul 13, 2016

@Corillian ,

I changed the behavior of SL_OPEN_TARGET_DIRECTORY to fix #249 (MS Word 2007 & 2010 can not save file normally in mirror.exe).

If SL_OPEN_TARGET_DIRECTORY is specified, then the caller expects to open the parent dir of the file. But it also expects to be told in create Information if the child did not exist.

@Corillian
Copy link
Author

Going through directory.c it seems to me that since Dokan caches the previous directory listing and provides its own pattern search maintaining FindFilesWithPattern() as a Dokan operation is a bad idea. @Liryna any opposition to removing it?

Also is it correct to assume that the kernel synchronizes all file operations on a file handle? If that's not true then there are quite a few synchronization bugs in the user-mode driver.

Lastly shouldn't directory list caching be done in the kernel and not the user-mode driver? I don't understand why we have to endure 2 user/kernel transitions when SL_RESTART_SCAN isn't specified just because the list gets cached in user space instead of the kernel.

@marinkobabic
Copy link
Contributor

marinkobabic commented Jul 19, 2016

Independent where the caching is done, how is the following scenario covered here:

  1.   FindFiles returns File1, File2
    
  2.   The mirrored File1 is removed.
    
  3.   Only the application knows this (maybe), which uses the dokan usermode library
    
  4.   UserMode and Kernel have no idea, that the file has been removed 
    

Actually it seems that the cache is not updated.

To make the whole system stable and faster:

  • the caching should be moved to kernel, like Keith said

  • notification mechanism should be implemented, so that the cache can be updated(update attribute, name, size, removed, etc.)

  • during the file open, if usermode reports that the files does not exists also update the cache

  • even if just some attributes of the file have change, rename, size changed, this must be reported to kernel

  • consider flag FO_NO_INTERMEDIATE_BUFFERING, FO_TEMPORARY_FILE, FO_DELETE_ON_CLOSE during the cache

  • enable/disable cache

    The cache can be enabled/disabled

    Only changes of files which did not happen through virtual drive must be reported.

This way the file and directory information can retrieve the information from cache. If the findfiles should be cached, then every external change on the mirrored filesystem needs to be reported to the kernel. FindFilesWithPattern is in generally not a bad idea, it simplifies the implementation. The developer don’t needs to check the mask for *

Thanks

Marinko

@Corillian
Copy link
Author

Ok after looking through FastFat it looks like it's up to the driver to do its own synchronization - which is what I had assumed would be the case. FastFat obviously doesn't need to do any caching because it has direct access to the drive but it does maintain the state of the previous search. It looks like FindFilesWithPattern() is worth keeping around however I do think the cache should be maintained in the kernel and not in the user-mode driver.

@Corillian
Copy link
Author

Here's the FastFat code for searching:

//
        //  Determine where to start the scan.  Highest priority is given
        //  to the file index.  Lower priority is the restart flag.  If
        //  neither of these is specified, then the Vbo offset field in the
        //  Ccb is used.
        //

        if (IndexSpecified) {

            CurrentVbo = FileIndex + sizeof( DIRENT );

        } else if (RestartScan) {

            CurrentVbo = 0;
            Ccb->OffsetToStartSearchFrom = 0;

        } else {

            CurrentVbo = Ccb->OffsetToStartSearchFrom;
        }

As @marinkobabic mentioned there is currently a problem with refreshing the cache. As far as I'm aware Dokan does not currently provide a driver with a way to notify the file system that the file system has changed?

@Liryna
Copy link
Member

Liryna commented Jul 19, 2016

I agree that the cache should probably review/rewrite and move to the kernel.
Notification of file changes are made during CreateFile and SetInformation when I file is created, delete, ....
I would say that this part works because for exemple we see explorer being directly aware of a new file and show it

DokanNotifyReportChange(fcb, FILE_NOTIFY_CHANGE_DIR_NAME,

DokanNotifyReportChange(fcb, FILE_NOTIFY_CHANGE_SIZE,

@Corillian
Copy link
Author

Yes but say your driver is a remote FTP server - when a new file gets created on that server there's no way to notify Dokan and Windows Explorer won't be aware of the new file until the next time it happens to issue a request for a directory listing.

@Corillian
Copy link
Author

Alright sure thing, thanks!

@Corillian
Copy link
Author

@Liryna looking at FastFat this behavior does not seem to be accounted for which makes me think that Dokan is doing something wrong and that's why the OS is sending read/write's after close/cleanup.

@Corillian
Copy link
Author

Ok I think I found the problem. From the documentation for IRP_MJ_CLEANUP:

Receipt of the IRP_MJ_CLEANUP request indicates that the handle reference count on a file object has reached zero. (In other words, all handles to the file object have been closed.) Often it is sent when a user-mode application has called the Microsoft Win32 CloseHandle function (or when a kernel-mode driver has called ZwClose) on the last outstanding handle to a file object.

It is important to note that when all handles to a file object have been closed, this does not necessarily mean that the file object is no longer being used. System components, such as the Cache Manager and the Memory Manager, might hold outstanding references to the file object. These components can still read to or write from a file, even after an IRP_MJ_CLEANUP request is received.

MirrorCleanup() is closing the file handle when it should in fact be cleaned up in MirrorCloseFile(). I'm seeing this in the IRP's:

  1. IRP_MJ_CLEANUP
  2. IRP_MJ_READ
  3. IRP_MJ_CLOSE

Which is exactly what the documentation for IRP_MJ_CLEANUP says can happen.

@Liryna
Copy link
Member

Liryna commented Aug 2, 2016

@Corillian Thank you for pointing the part of documentation showing it!
Still has not been able to reach my computer to make the research :(

So does everything is now always for you on the dokan behavior?

Corillian pushed a commit to Corillian/dokany that referenced this issue Aug 3, 2016
More or less rewrote all of dokan.dll to support async IO.
Rewrote the Mirror example to support the new Dokan API. Still need to add async IO support.
User-mode drivers can reeturn STATUS_PENDING to indicate they are performing an async operation at which point they are responsible for calling EndDispatch*() to notify the driver that the operation has completed.
Still some known issues that need to be fixed.
Corillian pushed a commit to Corillian/dokany that referenced this issue Aug 3, 2016
Close/Cleanup no longer gets called if Create fails and no user context is supplied
Renamed EndDispatch*() functions to DokanEndDispatch*() since they're public
The DOKAN_VECTOR API is now exported from dokan.dll
Corillian pushed a commit to Corillian/dokany that referenced this issue Aug 3, 2016
Made critical sections a bit easier to identify
Added missing exports
Corillian pushed a commit to Corillian/dokany that referenced this issue Aug 3, 2016
Added DokanInit() and DokanShutdown()
FIxed a bunch of memory leaks caused by the Close handler
@Corillian
Copy link
Author

So does everything is now always for you on the dokan behavior?

I have no idea what you just asked lol. I've submitted the initial PR for review: #307

It's going to need a lot of testing and review.

@Liryna
Copy link
Member

Liryna commented Aug 3, 2016

@Corillian Thank you corillian! I will take a look as soon as I can. Probably this weekend :)

Hahaha yes that was a strange sentence. I just wanted to know if you still think there is an issue on dokan behavior with read and write after cleanup.

@Corillian
Copy link
Author

no I believe I fixed it, so far I haven't encountered any additional problems.

Corillian pushed a commit to Corillian/dokany that referenced this issue Aug 9, 2016
Access tokens now get passed to the user mode driver for SetFileSecurity and CreateFile
Fixed an issue in Mirror with readonly files not returning access denied in CanDelete
Corillian pushed a commit to Corillian/dokany that referenced this issue Aug 9, 2016
Added unit tests for file streams
Corillian pushed a commit to Corillian/dokany that referenced this issue Aug 9, 2016
Corillian pushed a commit to Corillian/dokany that referenced this issue Aug 9, 2016
Corillian pushed a commit to Corillian/dokany that referenced this issue Aug 10, 2016
Corillian pushed a commit to Corillian/dokany that referenced this issue Aug 30, 2016
Corillian pushed a commit to Corillian/dokany that referenced this issue Aug 30, 2016
Fixes for DOKAN_OPTION_FORCE_SINGLE_THREADED
Mirror now closes a file handle in Cleanup again
Added more unit tests
Some unit tests are still failing due to issues with setting file security
@andreas-eriksson
Copy link

Is this still awaiting a review? Can I do anything to help test it?

@Liryna
Copy link
Member

Liryna commented Jan 18, 2017

Hi @andreas-eriksson ,

There is a security file issue #307 (comment)
that need to be fixed.
@Corillian is very busy often so this can take time or someone could probably contribute to his PR the fix.

After this step, we need people to test the mirror for being sure that everything is still working properly.
And finally I will help @Corillian for updating the document before merge 👍

@Rondom
Copy link
Contributor

Rondom commented Apr 23, 2017

I think a way forward would be for someone to resolve the merge-conflicts with the master-branch.
Then we can fix the file-security issue and then do extensive testing.

I would be willing to modify the API to be able to use both the old API and the new API. This would simplify testing and would allow people to test the new architecture without having to modify their filesystem. We can mark the new API this PR introduced as an experimental API until it is stable.

I tried to merge with master but gave up because I was not really sure how to resolve some of the conflicts :-( I did not try for too long, so maybe it was just me so I don't want to discourage anyone from giving it a try.

@Liryna
Copy link
Member

Liryna commented Jan 3, 2022

This is now partially available in 2.0.0 thanks to @Corillian PR
The only part I see that could be useful to merge is the async possibility for the user FS by returning status pending and call the dokan API later to complete when possible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants