Deadlock in filesystem #7

Closed
ghost opened this issue Aug 17, 2011 · 10 comments

@ghost ghost commented Aug 17, 2011

I've found what appears to be a deadlock in the kernel side of osxfuse. If another thread attempts to open() or __unlink() a file during a moveItemAtPath:toPath:error: or createFileAtPath:attributes:userData:error: call, and that call waits for the result, the fuse thread times out before the other thread can continue.

The second thread is always blocked in either open() or __unlink(), depending on the operation. Once the osxfuse thread is killed after 60 seconds, the second thread can continue.

I am trying to put together a simple example, but the deadlock does not happen consistently.

@ghost ghost commented Aug 17, 2011

I've managed to reproduce the issue by expanding the HelloFS example. Is there anywhere suitable to upload the project?

@bfleischer bfleischer commented Aug 18, 2011

The deadlock you are seeing might be somewhat related to the EncFS deadlock issue.

You could put your extended HelloFS project in a gist here on GitHub and post a link to it. Being able to reproduce this bug would be very helpful.

@ghost ghost commented Aug 18, 2011

Hi Benjamin,

I've uploaded the project to http://www.digital-dog.co.uk/Deadlock.zip

Before you run it, you will need to create a folder called /temp to hold the files it creates.

There is a PHP script in the folder called test.php. Once the HelloFS project is running, run the script in a Terminal window with 'php -f test.php'.

What it does:

The script runs in a loop, creating 1000 files, then renaming them all, and then deleting them all (crude, I know, but it reproduces the issue).
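For readers without access to the original test.php, the create/rename/delete loop described here could look roughly like the following Python sketch. This is not the original PHP script: the function name `stress_cycle`, the file naming scheme, and the `root` parameter are assumptions made for illustration, and a temporary directory stands in for the mounted volume.

```python
import os
import tempfile


def stress_cycle(root, count=1000):
    """Create `count` files under `root`, rename them all, then delete them all."""
    # Hypothetical naming scheme; the original script's names are not known.
    created = [os.path.join(root, f"file_{i}.txt") for i in range(count)]
    renamed = [os.path.join(root, f"renamed_{i}.txt") for i in range(count)]

    # Phase 1: create every file.
    for path in created:
        with open(path, "w") as handle:
            handle.write("test")

    # Phase 2: rename every file.
    for old, new in zip(created, renamed):
        os.rename(old, new)

    # Phase 3: delete every file.
    for path in renamed:
        os.remove(path)

    return count


if __name__ == "__main__":
    # A temporary directory stands in for the mounted file system here.
    with tempfile.TemporaryDirectory() as root:
        for iteration in range(3):
            stress_cycle(root, count=100)
            print(f"cycle {iteration} done")
```

To exercise a real mount, `root` would be pointed at a directory on the mounted volume instead of a temporary directory.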

The project takes each request, runs it on a separate thread, and waits for the operation to complete, just as it would wait for the database in my main project. The rename actually deletes and recreates the file; this simulates the conditions in the database when the issue occurs in my project. I hope this is of some use in debugging.
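The dispatch-and-wait pattern described here can be sketched in Python as follows. The real project is an extended HelloFS (Objective-C) and is not reproduced; `run_on_worker` and `rename_by_recreate` are hypothetical names, and copying the file contents during the delete-and-recreate step is an assumption.

```python
import os
import tempfile
import threading


def run_on_worker(operation, timeout=None):
    """Run operation() on a separate thread and block until it completes.

    Returns (finished, result); finished is False if the worker was still
    running after `timeout` seconds.
    """
    outcome = {}

    def target():
        outcome["result"] = operation()

    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    worker.join(timeout)  # the calling (callback) thread waits here
    finished = not worker.is_alive()
    return finished, outcome.get("result")


def rename_by_recreate(old_path, new_path):
    """Simulate a rename by deleting the old file and recreating it."""
    # Assumption: the contents are carried over to the new file.
    with open(old_path, "rb") as source:
        data = source.read()
    os.remove(old_path)
    with open(new_path, "wb") as destination:
        destination.write(data)
    return new_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        old = os.path.join(root, "a.txt")
        new = os.path.join(root, "b.txt")
        with open(old, "w") as handle:
            handle.write("hello")
        print(run_on_worker(lambda: rename_by_recreate(old, new), timeout=5))
```

The point of the sketch is that the calling thread blocks in `join` while the worker performs file operations. If those operations end up on the same mounted volume, both threads depend on each other, which matches the situation reported in this thread.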

I have had it deadlock after 250 files and sometimes after 10,000 files, but it always deadlocks eventually.

I've been speaking to another developer who is able to debug the kext, and he says it appears to be related to 64-bit vnode locking. I thought I would pass that on in case it is of any use.

Please let me know if you need any more information or if there is anything I can help with.

Thanks,
Dan


@ghost ghost assigned bfleischer Aug 19, 2011

@bfleischer bfleischer commented Aug 21, 2011

Thanks for your upload. I'm able to recreate the issue.

On a side note:
I modified your script "test.php" to just create 20,000 files without renaming or removing them afterwards. This seems to produce the issue, too.

@ghost ghost commented Aug 21, 2011

Thanks for letting me know. I had a feeling that it would happen without the renaming and deleting but I was trying to replicate the conditions of the main project.

Please let me know if you would like me to test any updates.

Thanks,
Dan

@joscha joscha commented Aug 23, 2011

You might also have a look at https://trac.macports.org/ticket/30129, as it might be a related problem.

@bfleischer bfleischer commented Aug 23, 2011

This issue can be considered fixed as of yesterday. The fix is based on Anatol's work. We just need to make sure that there are no side effects causing other issues. I will upload the patch as soon as I am convinced it is safe.

@ghost ghost commented Aug 24, 2011

That's great news. If you need anyone to test before you release, please let me know.

@ghost ghost closed this Aug 24, 2011
@ghost ghost reopened this Aug 24, 2011
bfleischer added a commit that referenced this issue Aug 28, 2011
Release biglock before calling "int msleep(...)" and reacquire it after wakeup.
The kernel extension calls msleep in function "int fticket_wait_answer(...)"
when waiting for a response from the user space daemon. This can lead to a
deadlock if the user space daemon inadvertently causes an operation on the
FUSE file system.

Closes issue #7: Deadlock in filesystem
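
The idea in this commit message can be modelled in user space with a short Python sketch. It is not the kext code: `biglock` is an ordinary `threading.Lock`, `threading.Event.wait` stands in for msleep, and `Ticket`, `wait_answer`, `daemon`, and `demo` are hypothetical names chosen for illustration.

```python
import threading

# Stands in for the kernel extension's big lock (an assumption of this model).
biglock = threading.Lock()


class Ticket:
    """Minimal stand-in for a request awaiting an answer from the daemon."""

    def __init__(self):
        self.answered = threading.Event()
        self.answer = None


def wait_answer(ticket, timeout=5.0):
    """Must be called with biglock held.

    Drops biglock while sleeping and retakes it after wakeup, so that other
    operations needing biglock can proceed in the meantime.
    """
    biglock.release()
    try:
        got_answer = ticket.answered.wait(timeout)  # models the sleep
    finally:
        biglock.acquire()
    return ticket.answer if got_answer else None


def daemon(ticket):
    """Models a daemon that triggers a file system operation before answering."""
    with biglock:  # the re-entrant operation needs biglock
        pass
    ticket.answer = "ok"
    ticket.answered.set()


def demo():
    ticket = Ticket()
    worker = threading.Thread(target=daemon, args=(ticket,))
    with biglock:  # a file system call enters holding biglock
        worker.start()
        result = wait_answer(ticket)
    worker.join()
    return result


if __name__ == "__main__":
    print(demo())
```

If `wait_answer` kept `biglock` held while waiting, `daemon` could never get past its `with biglock` block, so the answer would never arrive and the wait would only end at the timeout. That is the user-space analogue of the deadlock this issue describes.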

@bfleischer bfleischer commented Aug 28, 2011

Should be fixed in OSXFUSE 2.3.2. Dan, please reopen this issue if you still encounter this problem.

@bfleischer bfleischer closed this Aug 28, 2011

@ghost ghost commented Sep 6, 2011

Been running for a few days now without issue so seems fixed. Thanks.

bfleischer added a commit that referenced this issue Nov 4, 2011
Release biglock before calling "int msleep(...)" and reacquire it after wakeup.
The kernel extension calls msleep in function "int fticket_wait_answer(...)"
when waiting for a response from the user space daemon. This can lead to a
deadlock if the user space daemon inadvertently causes an operation on the
FUSE file system.

Closes issue #7: Deadlock in filesystem