Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xdg-document-portal keeps files open indefinitely, preventing unmounting of removable media #689

Closed
nurupo opened this issue Jan 13, 2022 · 24 comments · Fixed by #1190
Closed
Assignees
Labels
bug needs diagnosis Root cause of the issue needs to be diagnosed portal: documents Issues with the documents portal
Milestone

Comments

@nurupo
Copy link

nurupo commented Jan 13, 2022

I'm trying out Chromium flatpak. I downloaded a few files using Chromium to a removable drive and then attempted to unmount it, but the system kept telling me that it can't do that, that the device is busy. I lsof the mount point to see what is using the fs and lo-and-behold:

$ sudo lsof /media/[REDACTED]
lsof: WARNING: can't stat() fuse.portal file system /run/user/1000/doc
      Output information may be incomplete.
COMMAND       PID   USER   FD   TYPE DEVICE  SIZE/OFF    NODE NAME
xdg-docum 1564418 nurupo   19u   REG  254,1    151987 8912913 /media/[REDACTED] (deleted)
xdg-docum 1564418 nurupo   20u   REG  254,1    826201 8912914 /media/[REDACTED] (deleted)
xdg-docum 1564418 nurupo   21u   REG  254,1   1873749 8912915 /media/[REDACTED] (deleted)
xdg-docum 1564418 nurupo   22u   REG  254,1 170303175 8912916 /media/[REDACTED] (deleted)
xdg-docum 1564418 nurupo   23u   REG  254,1   2276729 8912917 /media/[REDACTED] (deleted)
xdg-docum 1564418 nurupo   24u   REG  254,1 170130747 8912918 /media/[REDACTED] (deleted)
xdg-docum 1564418 nurupo   27u   REG  254,1    791113 8912921 /media/[REDACTED] (deleted)
xdg-docum 1564418 nurupo   28u   REG  254,1    308858 8912922 /media/[REDACTED] (deleted)
xdg-docum 1564418 nurupo   29u   REG  254,1   9789825 8912925 /media/[REDACTED] (deleted)
xdg-docum 1564418 nurupo   31u   REG  254,1    324365 8912924 /media/[REDACTED] (deleted)

The files were downloaded hours ago, deleted hours ago too (notice the "deleted" after each file), and Chromium was closed long time ago, but xdg-document-portal still has them open and is preventing me from unmounting the drive, forcing me to kill it. What gives? I don't want having to fish its PID and kill it every time I want to unmount my drive. That's 0/10 usability.

The expected behavior is that when I download something via Chromium on my removable drive, once it's done downloading, I can unmount the drive without xdg-document-portal getting in my way, with Chromium still running, just like it would be the case with a non-Flatpak Chromium.

Running Debian Bullseye with KDE Plasma.

@rowbawts
Copy link

rowbawts commented May 1, 2022

Any progress on this?

@GeorgesStavracas
Copy link
Member

No activity means no progress

@Fraetor
Copy link

Fraetor commented May 22, 2022

This also means that when files are deleted the space they occupied isn't reclaimed. I was playing some large video files with Flatpak Celluloid and my drive filled up.

@rowbawts
Copy link

This also means that when files are deleted the space they occupied isn't reclaimed. I was playing some large video files with Flatpak Celluloid and my drive filled up.

Yeah this is how I discovered the bug.

@raqbit
Copy link

raqbit commented Jun 18, 2022

Ran into this while simply mounting a USB drive, opening an MP4 file on it in Gnome Videos via Nautilus and then not being able to unmount the USB drive again.

~ lsof +f -- /media/<user>/<drive>
COMMAND    PID  USER   FD   TYPE DEVICE   SIZE/OFF NODE NAME
xdg-docum 6104 <user>   14u   REG    8,2 8280498469   74 /media/<user>/<drive>/<file>.mp4

@0rzech
Copy link

0rzech commented Dec 3, 2022

@Fraetor @rowbawts Is the space being reclaimed once xdg-document-portal is closed, or perhaps you mean that it borks the filesystem and makes the space permanently occupied?

@GeorgesStavracas I had the same today with xdg-desktop-portal process. lsof /run/media/[REDACTED] showed multiple /.Trash-[REDACTED]/files/[REDACTED] (deleted with Meld flatpak from Fedora registry) still being open by the portal.

It kept these files open even after I've deleted them from trash using Gnome Files. lsof /run/media/[REDACTED] showed then the same entries, but with /.Trash-[REDACTED]/expunged/[REDACTED] (deleted).

/etc/mtab says that the filesystem is vfat. I closed xdg-desktop-portal process with kill -s SIGTERM. After doing so I could finally unmount my device. Was that the correct way to do it, or perhaps it could leave the filesystem in wrong state?

@GeorgesStavracas
Copy link
Member

The correct way to do it is not do it. xdg-desktop-portal should not open files indefinitely. I don't know how to fix it, and I haven't looked into this problem. Someone needs to do that, and I don't think I can provide much guidance, since I have no idea what's the appropriate way to fix this issue. Someone with deeper knowledge on this problem domain needs to chime in.

@0rzech
Copy link

0rzech commented Dec 3, 2022

Thanks for replying. I understand that it's a bug and should not happen in the first place. My question was whether it was a correct workaround on my part (the bug already hit me after all) to do it that way in order to avoid filesystem inconsistency or not. I understand you may not know the answer to that question, so do you maybe know where or who to ask about it?

@xioren
Copy link

xioren commented May 22, 2023

Hi @alexlarsson, can you provide some insight please? From the git history of xdg-document-portal, you seem to have a deeper understanding of that code than most. This bug is quite egregious as #689 (comment) points out. Currently no one working on this project seems to know how to remedy the situation.

@bubolex
Copy link

bubolex commented Aug 17, 2023

Workaround for Ubuntu 23.04:

Log out then log in. You should be able to unmount the removable media.

@quazar-omega
Copy link

Doesn't that equate to just terminating the service?

@jadahl
Copy link
Collaborator

jadahl commented Aug 17, 2023

Is this only a problem with vfat external drives or does it happen with other filesystems as well?

A potentially better work around, anyhow, is systemctl --user restart xdg-document-portal.service, but it might affect other apps' access to files.

@bubolex
Copy link

bubolex commented Aug 17, 2023

Doesn't that equate to just terminating the service?

I have no idea. I have not tried this.

I closed all applications but files on my NFS share were still being accessed by xdg-document-portal.

A potentially better work around, anyhow, is systemctl --user restart xdg-document-portal.service, but it might affect other apps' access to files.

I guess restaring it is also a good idea after closing all applications.

[UPDATE] systemctl --user restart xdg-document-portal.service also worked

@xioren

This comment was marked as off-topic.

@berenddeschouwer
Copy link

Is this only a problem with vfat external drives or does it happen with other filesystems as well?

btrfs and ext4 external drives over here, same issue.

@Fraetor
Copy link

Fraetor commented Sep 17, 2023

Mine was an NFS mount.

@nurupo
Copy link
Author

nurupo commented Sep 17, 2023

Guys/gals, no point in spamming the issue with the filesystem you use.

The issue is not unique to a specific filesystem and is not just external drives that have this issue - it happens on all filesystems and all types of drives, you are just more likely to notice it using an external drive as you might want to unmount it, which would surface the issue.

The actual issue seems to be xdg-document-portal not closing the files when an application using xdg-document-portal closes them or when the application exits.

@orowith2os
Copy link

I can confirm there's a file descriptor leak somewhere, I'm just not sure on how to debug it.

xdg-docum 2719  oro   13r      REG              179,1  4986878     208 /run/media/oro/F138-9724/『呪術廻戦』第2期「渋谷事変」OP 「SPECIALZ」8bitアレンジ【ゲームボーイ】 [t_GXBSYNh0s].webm
xdg-docum 2719  oro   14u      REG              179,1  4986878     208 /run/media/oro/F138-9724/『呪術廻戦』第2期「渋谷事変」OP 「SPECIALZ」8bitアレンジ【ゲームボーイ】 [t_GXBSYNh0s].webm
xdg-docum 2719  oro   14u      REG              179,1  4986878     208 /run/media/oro/F138-9724/『呪術廻戦』第2期「渋谷事変」OP 「SPECIALZ」8bitアレンジ【ゲームボーイ】 [t_GXBSYNh0s].webm

If I knew the places and functions necessary, I could go out and printf-debug them.

@xioren
Copy link

xioren commented Sep 30, 2023

@orowith2os I tried pinpointing the issue a while back. From what I remember the issue was in one of these files (I believe) (1, 2)

My takeaway was that there is no straightforward solution, which is why it is being ignored. Either the portal system needs a fundamental rewrite or API changes are needed. Obviously feel free to take a look, the more eyes on the problem the better. This has to be addressed sooner or later.

@nurupo
Copy link
Author

nurupo commented Sep 30, 2023

Perhaps @alexlarsson could shed some light on this issue, they seem to be familiar with the code, going by git blame document-portal-fuse.c.

Also, this comment caught my eye, sounds like it might be describing this issue:

* One problem with this approach is that the kernel tends to keep
* inodes alive for a very long time even if they are *only*
* references by the dcache (event though we will not really use the
* dcache info due to the 0 valid time). This is unfortunate, because
* it means we will keep a lot of file descriptor open. But, we
* can't know if the kernel needs the inode for some non-dcache use
* so we can't close the file descriptors.
*
* To work around this we regularly emit entry invalidation calls
* to the kernel, which will make it forget the inodes that are
* only pinned by the dcache.

@hfiguiere hfiguiere added portal: documents Issues with the documents portal needs diagnosis Root cause of the issue needs to be diagnosed labels Oct 4, 2023
@orowith2os
Copy link

orowith2os commented Oct 13, 2023

The documents store appears to not be able to run under valgrind, so trying to find the leak will take longer, unless someone is able to help resolve that.

@hfiguiere
Copy link
Collaborator

I don't think valgrind will show you this.

In document-portal-fuse.c

The XdpPhysicalInode is held in the global GHashTable physical_inodes. It only closes the file description when its refcount is at zero, but this never happen.

The fix is to figure out when we can unref the corresponding XdpPhysicalInode (removing it from the hashtable)

Note: I am working on this.

hfiguiere added a commit to hfiguiere/xdg-desktop-portal that referenced this issue Oct 18, 2023
hfiguiere added a commit to hfiguiere/xdg-desktop-portal that referenced this issue Oct 21, 2023
hfiguiere added a commit to hfiguiere/xdg-desktop-portal that referenced this issue Nov 2, 2023
Issue flatpak#689
flatpak#689

Signed-off-by: Hubert Figuière <hub@figuiere.net>
alexlarsson added a commit to alexlarsson/xdg-desktop-portal that referenced this issue Nov 7, 2023
This fixes flatpak#689

The mechanism we're using to flush the dcache and thus avoiding
long-term caching of physical inodes which keep a file descriptor open
doesn't seem to work quite right.

A trivial test of cat /run/user/1000/doc/$docid/$filename properly
leads to the PhysicalFile being freed. However, an application that
keeps the file open for a longer time (longer than the delay we have
before the current dcache flush workaround is run), ends up with the
dcache entry for the inode to be kept in cache. This means that we
when the last other kernel reference (like an open fd) to the inode is
gone we don't get a FORGET event from fuse, and thus we never free the
PhysicalInode.

This is clearly visible, because if you end up with the file
descriptor leak like in the issue, then flushing the dcache (`sudo sh
-c 'echo 2 > /proc/sys/vm/drop_caches'`) fixes the leak.

It seems the current workaround of just invalidating the toplevel
directory entry doesn't work, so we replace it with tracking each
individual lookup that ends up with a physical inode and flush all of
them. This seems to fix the issue for me.
alexlarsson added a commit to alexlarsson/xdg-desktop-portal that referenced this issue Nov 7, 2023
This fixes flatpak#689

The mechanism we're using to flush the dcache and thus avoiding
long-term caching of physical inodes which keep a file descriptor open
doesn't seem to work quite right.

A trivial test of cat /run/user/1000/doc/$docid/$filename properly
leads to the PhysicalFile being freed. However, an application that
keeps the file open for a longer time (longer than the delay we have
before the current dcache flush workaround is run), ends up with the
dcache entry for the inode to be kept in cache. This means that we
when the last other kernel reference (like an open fd) to the inode is
gone we don't get a FORGET event from fuse, and thus we never free the
PhysicalInode.

This is clearly visible, because if you end up with the file
descriptor leak like in the issue, then flushing the dcache (`sudo sh
-c 'echo 2 > /proc/sys/vm/drop_caches'`) fixes the leak.

It seems the current workaround of just invalidating the toplevel
directory entry doesn't work, so we replace it with tracking each
individual lookup that ends up with a physical inode and flush all of
them. This seems to fix the issue for me.
alexlarsson added a commit to alexlarsson/xdg-desktop-portal that referenced this issue Nov 8, 2023
This fixes flatpak#689

The mechanism we're using to flush the dcache and thus avoiding
long-term caching of physical inodes which keep a file descriptor open
doesn't seem to work quite right.

A trivial test of cat /run/user/1000/doc/$docid/$filename properly
leads to the PhysicalFile being freed. However, an application that
keeps the file open for a longer time (longer than the delay we have
before the current dcache flush workaround is run), ends up with the
dcache entry for the inode to be kept in cache. This means that we
when the last other kernel reference (like an open fd) to the inode is
gone we don't get a FORGET event from fuse, and thus we never free the
PhysicalInode.

This is clearly visible, because if you end up with the file
descriptor leak like in the issue, then flushing the dcache (`sudo sh
-c 'echo 2 > /proc/sys/vm/drop_caches'`) fixes the leak.

It seems the current workaround of just invalidating the toplevel
directory entry doesn't work, so we replace it with tracking each
individual lookup that ends up with a physical inode and flush all of
them. This seems to fix the issue for me.
alexlarsson added a commit to alexlarsson/xdg-desktop-portal that referenced this issue Nov 8, 2023
This fixes flatpak#689

The mechanism we're using to flush the dcache and thus avoiding
long-term caching of physical inodes which keep a file descriptor open
doesn't seem to work quite right.

A trivial test of cat /run/user/1000/doc/$docid/$filename properly
leads to the PhysicalFile being freed. However, an application that
keeps the file open for a longer time (longer than the delay we have
before the current dcache flush workaround is run), ends up with the
dcache entry for the inode to be kept in cache. This means that we
when the last other kernel reference (like an open fd) to the inode is
gone we don't get a FORGET event from fuse, and thus we never free the
PhysicalInode.

This is clearly visible, because if you end up with the file
descriptor leak like in the issue, then flushing the dcache (`sudo sh
-c 'echo 2 > /proc/sys/vm/drop_caches'`) fixes the leak.

It seems the current workaround of just invalidating the toplevel
directory entry doesn't work, so we replace it with tracking each
individual lookup that ends up with a physical inode and flush all of
them. This seems to fix the issue for me.
alexlarsson added a commit to alexlarsson/xdg-desktop-portal that referenced this issue Nov 8, 2023
This fixes flatpak#689

The mechanism we're using to flush the dcache and thus avoiding
long-term caching of physical inodes which keep a file descriptor open
doesn't seem to work quite right.

A trivial test of cat /run/user/1000/doc/$docid/$filename properly
leads to the PhysicalFile being freed. However, an application that
keeps the file open for a longer time (longer than the delay we have
before the current dcache flush workaround is run), ends up with the
dcache entry for the inode to be kept in cache. This means that we
when the last other kernel reference (like an open fd) to the inode is
gone we don't get a FORGET event from fuse, and thus we never free the
PhysicalInode.

This is clearly visible, because if you end up with the file
descriptor leak like in the issue, then flushing the dcache (`sudo sh
-c 'echo 2 > /proc/sys/vm/drop_caches'`) fixes the leak.

It seems the current workaround of just invalidating the toplevel
directory entry doesn't work, so we replace it with tracking each
individual lookup that ends up with a physical inode and flush all of
them. This seems to fix the issue for me.
alexlarsson added a commit to alexlarsson/xdg-desktop-portal that referenced this issue Nov 8, 2023
This fixes flatpak#689

The mechanism we're using to flush the dcache and thus avoiding
long-term caching of physical inodes which keep a file descriptor open
doesn't seem to work quite right.

A trivial test of cat /run/user/1000/doc/$docid/$filename properly
leads to the PhysicalFile being freed. However, an application that
keeps the file open for a longer time (longer than the delay we have
before the current dcache flush workaround is run), ends up with the
dcache entry for the inode to be kept in cache. This means that we
when the last other kernel reference (like an open fd) to the inode is
gone we don't get a FORGET event from fuse, and thus we never free the
PhysicalInode.

This is clearly visible, because if you end up with the file
descriptor leak like in the issue, then flushing the dcache (`sudo sh
-c 'echo 2 > /proc/sys/vm/drop_caches'`) fixes the leak.

It seems the current workaround of just invalidating the toplevel
directory entry doesn't work, so we replace it with tracking each
individual lookup that ends up with a physical inode and flush all of
them. This seems to fix the issue for me.
GeorgesStavracas pushed a commit to alexlarsson/xdg-desktop-portal that referenced this issue Nov 11, 2023
This fixes flatpak#689

The mechanism we're using to flush the dcache and thus avoiding
long-term caching of physical inodes which keep a file descriptor open
doesn't seem to work quite right.

A trivial test of cat /run/user/1000/doc/$docid/$filename properly
leads to the PhysicalFile being freed. However, an application that
keeps the file open for a longer time (longer than the delay we have
before the current dcache flush workaround is run), ends up with the
dcache entry for the inode to be kept in cache. This means that we
when the last other kernel reference (like an open fd) to the inode is
gone we don't get a FORGET event from fuse, and thus we never free the
PhysicalInode.

This is clearly visible, because if you end up with the file
descriptor leak like in the issue, then flushing the dcache (`sudo sh
-c 'echo 2 > /proc/sys/vm/drop_caches'`) fixes the leak.

It seems the current workaround of just invalidating the toplevel
directory entry doesn't work, so we replace it with tracking each
individual lookup that ends up with a physical inode and flush all of
them. This seems to fix the issue for me.
github-merge-queue bot pushed a commit that referenced this issue Nov 11, 2023
This fixes #689

The mechanism we're using to flush the dcache and thus avoiding
long-term caching of physical inodes which keep a file descriptor open
doesn't seem to work quite right.

A trivial test of cat /run/user/1000/doc/$docid/$filename properly
leads to the PhysicalFile being freed. However, an application that
keeps the file open for a longer time (longer than the delay we have
before the current dcache flush workaround is run), ends up with the
dcache entry for the inode to be kept in cache. This means that we
when the last other kernel reference (like an open fd) to the inode is
gone we don't get a FORGET event from fuse, and thus we never free the
PhysicalInode.

This is clearly visible, because if you end up with the file
descriptor leak like in the issue, then flushing the dcache (`sudo sh
-c 'echo 2 > /proc/sys/vm/drop_caches'`) fixes the leak.

It seems the current workaround of just invalidating the toplevel
directory entry doesn't work, so we replace it with tracking each
individual lookup that ends up with a physical inode and flush all of
them. This seems to fix the issue for me.
hfiguiere pushed a commit to hfiguiere/xdg-desktop-portal that referenced this issue Nov 22, 2023
This fixes flatpak#689

The mechanism we're using to flush the dcache and thus avoiding
long-term caching of physical inodes which keep a file descriptor open
doesn't seem to work quite right.

A trivial test of cat /run/user/1000/doc/$docid/$filename properly
leads to the PhysicalFile being freed. However, an application that
keeps the file open for a longer time (longer than the delay we have
before the current dcache flush workaround is run), ends up with the
dcache entry for the inode to be kept in cache. This means that we
when the last other kernel reference (like an open fd) to the inode is
gone we don't get a FORGET event from fuse, and thus we never free the
PhysicalInode.

This is clearly visible, because if you end up with the file
descriptor leak like in the issue, then flushing the dcache (`sudo sh
-c 'echo 2 > /proc/sys/vm/drop_caches'`) fixes the leak.

It seems the current workaround of just invalidating the toplevel
directory entry doesn't work, so we replace it with tracking each
individual lookup that ends up with a physical inode and flush all of
them. This seems to fix the issue for me.
GeorgesStavracas pushed a commit that referenced this issue Nov 22, 2023
This fixes #689

The mechanism we're using to flush the dcache and thus avoiding
long-term caching of physical inodes which keep a file descriptor open
doesn't seem to work quite right.

A trivial test of cat /run/user/1000/doc/$docid/$filename properly
leads to the PhysicalFile being freed. However, an application that
keeps the file open for a longer time (longer than the delay we have
before the current dcache flush workaround is run), ends up with the
dcache entry for the inode to be kept in cache. This means that we
when the last other kernel reference (like an open fd) to the inode is
gone we don't get a FORGET event from fuse, and thus we never free the
PhysicalInode.

This is clearly visible, because if you end up with the file
descriptor leak like in the issue, then flushing the dcache (`sudo sh
-c 'echo 2 > /proc/sys/vm/drop_caches'`) fixes the leak.

It seems the current workaround of just invalidating the toplevel
directory entry doesn't work, so we replace it with tracking each
individual lookup that ends up with a physical inode and flush all of
them. This seems to fix the issue for me.
@hfiguiere hfiguiere added this to the 1.18 milestone Nov 23, 2023
@ninja-
Copy link

ninja- commented Feb 22, 2024

I've just hit this in Ubuntu 23.10. Hopefully the fix lands into Ubuntu at some point

xdg-desktop-portal is already the newest version (1.18.0-1ubuntu1).
xdg-desktop-portal-gtk is already the newest version (1.14.1-1).
xdg-desktop-portal-gnome is already the newest version (45.0-1).

@ninja-
Copy link

ninja- commented Feb 22, 2024

As a workaround you can restart the portals using

systemctl --user restart xdg-desktop-portal-gtk xdg-desktop-portal-gnome xdg-desktop-portal xdg-document-portal

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug needs diagnosis Root cause of the issue needs to be diagnosed portal: documents Issues with the documents portal
Projects
Status: Triaged
Development

Successfully merging a pull request may close this issue.