Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xdg-document-portal: Close files as needed #1159

Closed
wants to merge 1 commit into from

Conversation

hfiguiere
Copy link
Collaborator

@hfiguiere
Copy link
Collaborator Author

This needs much more testing. (and maybe a few more bits of cleanup)

// G_UNLOCK (physical_inodes);
goto retry_atomic_decrement1;
}
inode->fd = fd;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't it leak inode->fd here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could add a check that it's -1 and if not close it. but this code path it is supposed to be closed.

@@ -167,12 +167,15 @@ G_LOCK_DEFINE (domain_inodes);

typedef struct {
gint ref_count; /* atomic */
gint use_count; /* atomic */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How's the use count any different from the refcount? Can you please clarify why is it necessary to track them separatedly?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because it gets kept alive and the fd gets "revived". the _use and _release calls are not matching the _ref and _unref.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see https://github.com/flatpak/xdg-desktop-portal/pull/1159/files#diff-2f05bdfc1f7bed1308fa7848357777aad2cadd1d6133d92678296ef2697c4e49R2122

(I have an updated version of that patch with more consistent formatting but the same semantics)

@hfiguiere hfiguiere marked this pull request as ready for review October 21, 2023 01:42
@hfiguiere
Copy link
Collaborator Author

My biggest fear with this PR is to have missed something and create conditions that would cause data loss.

@alexlarsson I'm sorry to ping you, but it feel like you would be the best person to help reviewing this change. Or even guidance to thoroughly test the corner cases.

Comment on lines +168 to +193
/* The reference count and use count are different. The latter still
* allow keeping track of the file without keeping it open, which otherwise
* keep the files around indefinitely.
* See https://github.com/flatpak/xdg-desktop-portal/issues/689
*/

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not make a weak reference where you'd use the use_count? Would that solve anything? Ideally it wouldn't keep the file open, so it can still be freed up when nothing else references it.

Issue flatpak#689
flatpak#689

Signed-off-by: Hubert Figuière <hub@figuiere.net>
@alexlarsson
Copy link
Member

This change seems very wrong to me. I mean, what is even the point of having a XdgPhysicalInode in memory if the fd is closed, as that is all it tracks?

Anyway, this seems non-threadsafe, as there is no guarantee that after a call to ensure_docdir_inode() that physical->fd is open. The atomics guarantee that the memory lives, sure, but the fd can be closed at any time by some other thread, so you can't use the PhysicalInode for anything.

In addition, the entire reason we're keeping the O_PATH fd open in the XdgPhysicalInode is to avoid issues if an external party modifies the backing filesystem (i.e. not via the fuse fs). If we close the fd when the kernel drops the inode from the cache, then the entire point of this feature (as documented in the comment at the top of the file) is lost.

As for the underlying issue, the portal fuse implementation should never keep around PhysicalInode references to something that is not in use. Like for instance, a XdpPhysicalInode is only being held open by an XdpInode, and those should only be kept open if the kernel is referencing it via the fuse upcall APIs.

So, given that we regularly flush the dcaches for the fuse mount, we should only keep a inode open if there is some long-term kernel reference keeping an inode in memory, such as a process keeping a file open or having it as current working directoy. As that seems to not be the case here, we must have some kind of leak, and the real fix is to find the cause of the leak and fix that. What we need to do is make a minimal reproducer and track what reference is keeping the inode alive.

@alexlarsson
Copy link
Member

alexlarsson commented Nov 7, 2023

I tried the reproducer in celluloid-player/celluloid#778, and even when the movie player has exited, xdg-document-portal keeps the video open, as seen by lsof.

However, when I flush the kernel inode cache (sudo sh -c 'echo 2 > /proc/sys/vm/drop_caches') it is released, because the kernel sends a bunch of FORGET events for the inodes.

So, it seems like the ownership tracking in the portal fuse impl is working, but maybe our workaround that is supposed to keep the kernel from caching these inodes forever is not working?

@alexlarsson
Copy link
Member

This seems like a better fix: #1190

@alexlarsson alexlarsson closed this Nov 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Triaged
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

4 participants