Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RT hangs when files in the file browser tab change or get deleted, or when batch processing #3289

Closed
da-phil opened this issue May 16, 2016 · 41 comments
Labels
type: bug Something is not doing what it's supposed to be doing

Comments

@da-phil
Copy link

da-phil commented May 16, 2016

I'm using the latest commit on master

Branch: master
Version: 4.2.851
Changeset: 80f86c7189c4b88ee51fddbef5874e312aedf648

Recently I noticed that RT hangs while files in the directory which is shown in the file browser tab change (when image is saved after editing) or get deleted. sometimes RT hangs for several seconds until the file browser keeps up with the changes.

RT also becomes completely unresponsive when batch processing pictures, I always have to wait until all files are processed before I can do anything with the GUI again.

@da-phil da-phil changed the title RT hangs when files in the file browser tab change or get deleted RT hangs when files in the file browser tab change or get deleted, or when batch processing May 16, 2016
@Floessie
Copy link
Collaborator

Cannot reproduce with debug and release builds of the latest master on Debian Stretch AMD64 (multi-core machine).

Could you please provide more information and have a look at "How to write useful bug reports"? For example you completely omitted your OS or features (found in your PP3s) you have activated for your RAWs. Does it always happen or only with certain RAWs? Have you checked that your storage medium (HD/SSD/network) still performs as expected when reading/writing files? Are you sure there is no other process hammering it?

Question upon question ... ;)

@Beep6581
Copy link
Owner

@da-phil ping.

@da-phil
Copy link
Author

da-phil commented May 23, 2016

Sorry for the late reply!
My OS: Ubuntu 14.04.4 LTS amd64
My computing hardware:

AMD Phenom(tm) II X6 1090T Processor (6 cores)
16GB RAM

And this is my RT config (~/config/RawTherapee/options)
options.txt

This behavior is completely independent of RAW developing features, it also happens when I don't use any features, not even colour profiles, as I said it also happens when the folder which is opened in the RT file browser changes (files were added or deleted).

@Beep6581
Copy link
Owner

Wild shot: what if you delete (or rename) your options file, then run RT, this will create a new options file, then create an empty folder /home/phil/emptyfolder , then edit this new options file and set

DarkFramesPath=/home/phil/emptyfolder
FlatFieldsPath=/home/phil/emptyfolder
ClutsDirectory=/home/phil/emptyfolder

Did that make any difference?

@da-phil
Copy link
Author

da-phil commented May 23, 2016

Unfortunately it didn't, thanks for the idea though!

@Floessie
Copy link
Collaborator

And this only happens in RT, or also in other GTK based programs? I only ask, because in the past I had some problems (hangs) with gvfs-backends installed in other GTK applications.

Silly question: Your RT was compiled with OpenMP enabled?

@da-phil
Copy link
Author

da-phil commented May 24, 2016

I killed all gvfs services and tried again, but unfortunately it also wasn't a gvfs backend :(
Yes my build is with OpenMP enabled!
My current version:

Branch: master
Version: 4.2.865
Changeset: d0cef55aada8e51321d44982a2a8ca258eaa6139
Compiler: cc 5.3.0
Processor: x86_64
System: Linux
Bit depth: 64 bits
Gtkmm: V2.24.4
Build type: release
Build flags: -std=c++11 -std=gnu++11  -Werror=unused-label -fopenmp -Werror=unknown-pragmas -O3 -DNDEBUG
Link flags:    
OpenMP support: ON
MMAP support: ON

I also tried it with a debug build, but had the same problem.

@heckflosse
Copy link
Collaborator

@da-phil When batch processing, is the source folder the same as destination folder or is it a different folder?

@Floessie
Copy link
Collaborator

@da-phil Maybe you could use strace (with -fr) to look where it hangs when files are added or removed?

@da-phil
Copy link
Author

da-phil commented Jun 4, 2016

@heckflosse jup, it's the same folder.
@Floessie good idea, will try that!

@heckflosse
Copy link
Collaborator

heckflosse commented Jun 4, 2016

@da-phil

If you batch process, is the destination folder same as source folder or different?
@heckflosse jup, it's the same folder.

Does rt hang also when destination is different to source?

@da-phil
Copy link
Author

da-phil commented Jun 4, 2016

@heckflosse when I save to another folder RT doesn't hang.
I guess the strace output looks really interesting and might lead to the problem, but I'm afraid to post it here because it contains all paths of my image folder which I wouldn't like to put in the public domain :/
I've to find a regexp to search and replace them all ^^

@heckflosse
Copy link
Collaborator

@da-phil Saving to source folder is a known issue for rt. rt periodically checks the source folder for new files afaik. This can cause the behaviour you detected (hangs or freezes). It's an Issue we (rt) have to solve. Meanwhile you can work around it saving to a different folder.

@da-phil
Copy link
Author

da-phil commented Jun 5, 2016

ah, I see, I didn't know it was a known problem, I was pretty sure some months ago I didn't have that problem, RT was responsive at all times.

the amount of syscalls happening on my machine is incredible, I checked other software too, even in idle mode there are so many syscalls appearing.
when I start strace with the options -f and -r RT wouldn't even come up, it seems to block the whole x-server session. I had to use a startup terminal to kill strace until the xserver became responsive again. I saw some fishy FUTEX syscalls before RT was killed, see log file:
2016-06-05-raw_therapee_strace_1_short.txt
unfortunately I can't make sense out of most sycalls and decide whether they are necessary or not.

@da-phil
Copy link
Author

da-phil commented Jun 9, 2016

I just realized that this problem was not noticeable when I worked in a fairly small folder with maybe 30 RAW and JPG files, however when I work in a folder with hundreds of RAW and JPG files I notice the lag while RT scans for new files and creates new thumbnails.

looks like this implementation might need a tweak or a different method. from working with events and fileIO blocking calls (select(), poll() and the likes) myself I imagine having a blocking call in a separate thread which waits for incoming events in the filesystem and hence triggers updates in the RT file browser without even blocking the GUI. but before talking bullsh*t, I probably should have a look at the code first myself ;)

@heckflosse
Copy link
Collaborator

@da-phil 'I probably should have a look at the code first myself ;)' That is a great idea!!!!!

@da-phil
Copy link
Author

da-phil commented Jun 9, 2016

@heckflosse I saw that there is stuff like inotify or dnotify for Linux, however there even seems to be an OS agnostic way in GTK with the GFileMonitor functions. I assume GTK has an underlying OS specific implementation for each supported OS.

@heckflosse
Copy link
Collaborator

@da-phil I would really appreciate if you can take a look at this issue. I'm a specialist for speedups but not for the other stuff!!! I can at least help to test a solution on Win64 and Linux64

Ingo

@da-phil
Copy link
Author

da-phil commented Jun 10, 2016

I'm so busy recently I might not be able to do it, I should rather process a lot of pictures with RT instead of hacking it :D
but I'll keep it in mind!

@da-phil
Copy link
Author

da-phil commented Jul 5, 2016

@heckflosse
After I had this issue again while processing wedding pictures within a big folder, I spend a few minutes looking into the code and I spotted a possible suspect, this callback function:
void FileCatalog::on_dir_changed (const Glib::RefPtr<Gio::File>& file, const Glib::RefPtr<Gio::File>& other_file, Gio::FileMonitorEvent event_type, bool internal)
from here:
https://github.com/Beep6581/RawTherapee/blob/master/rtgui/filecatalog.cc#L1764
which is set in function
void FileCatalog::dirSelected (const Glib::ustring& dirname, const Glib::ustring& openfile)
from here:
https://github.com/Beep6581/RawTherapee/blob/master/rtgui/filecatalog.cc#L585

So it seems that the FileMonitor callback gets called after an event (file created / deleted / changed) and calls reparseDirectory () within the GUI thread and blocks the whole thread / event loop while browsing the watched folder again.

Anybody an idea how to move reparseDirectory () into a seperate thread which synchronizes with the GUI thread to update the watched folder in the file browser?

@heckflosse
Copy link
Collaborator

heckflosse commented Sep 5, 2016

@da-phil I just dug out this Issue. Is it still valid? @Floessie Do you have an idea how to solve it or is it already solved by a commit from you which I remember vaguely?

@Floessie
Copy link
Collaborator

Floessie commented Sep 6, 2016

@heckflosse I think, you remembered #3318, but that is unfortunately not related to this very problem. I'm still no GUI expert at all, but I'll have a look at the code. I suppose, it's not that critical in terms of time...

@da-phil
Copy link
Author

da-phil commented Sep 6, 2016

@heckflosse @Floessie this "flaw" is still valid, at least in master branch, I've not yet tested gtk3 branch. I called it a flaw because it's just annoying when working with image folders with above 500 entries, but not critical.
I already pointed the issue out above, the crux is that reparseDirectory() is blocking the GUI event loop, it should run in a separate thread which belongs to FileCatalog class.
I'd love to do it myself but I don't have much GUI programming experience, especially not with glib/gtk, only Qt.

@heckflosse
Copy link
Collaborator

@da-phil Thanks for the additional hints. It's not likely that I'll solve the Issue (because I'm not very experienced in GUI stuff). Maybe @Floessie can fix it.

@Floessie
Copy link
Collaborator

Floessie commented Sep 7, 2016

@da-phil @heckflosse Okay, here is my first take (patch against current master):

diff --git a/rtgui/filecatalog.cc b/rtgui/filecatalog.cc
index 3ed7608..98df5d6 100644
--- a/rtgui/filecatalog.cc
+++ b/rtgui/filecatalog.cc
@@ -51,7 +51,9 @@ FileCatalog::FileCatalog (CoarsePanel* cp, ToolBar* tb, FilePanel* filepanel) :
     previewsToLoad(0),
     previewsLoaded(0),
     coarsePanel(cp),
-    toolBar(tb)
+    toolBar(tb),
+    do_reparse_directory_pending(false),
+    do_reparse_directory_expected_counter(0)
 {

     inTabMode = false;
@@ -1697,7 +1699,6 @@ void FileCatalog::filterChanged ()

 void FileCatalog::reparseDirectory ()
 {
-
     if (selectedDirectory.empty()) {
         return;
     }
@@ -1707,44 +1708,9 @@ void FileCatalog::reparseDirectory ()
         return;
     }

-    std::vector<Glib::ustring> nfileNameList = getFileList ();
-
-    // check if a thumbnailed file has been deleted
-    const std::vector<ThumbBrowserEntryBase*>& t = fileBrowser->getEntries ();
-    std::vector<Glib::ustring> fileNamesToDel;
-
-    for (size_t i = 0; i < t.size(); i++)
-        if (!Glib::file_test (t[i]->filename, Glib::FILE_TEST_EXISTS)) {
-            fileNamesToDel.push_back (t[i]->filename);
-        }
-
-    for (size_t i = 0; i < fileNamesToDel.size(); i++) {
-        delete fileBrowser->delEntry (fileNamesToDel[i]);
-        cacheMgr->deleteEntry (fileNamesToDel[i]);
-        previewsLoaded--;
-    }
-
-    if (!fileNamesToDel.empty ()) {
-        _refreshProgressBar();
-    }
-
-    // check if a new file has been added
-    for (size_t i = 0; i < nfileNameList.size(); i++) {
-        bool found = false;
-
-        for (size_t j = 0; j < fileNameList.size(); j++)
-            if (nfileNameList[i] == fileNameList[j]) {
-                found = true;
-                break;
-            }
-
-        if (!found) {
-            checkAndAddFile (Gio::File::create_for_parse_name (nfileNameList[i]));
-            _refreshProgressBar ();
-        }
+    if (!do_reparse_directory_pending.exchange(true)) {
+        Glib::signal_timeout().connect_once(sigc::bind(sigc::mem_fun(*this, &FileCatalog::doReparseDirectory), ++do_reparse_directory_expected_counter), 500);
     }
-
-    fileNameList = nfileNameList;
 }

 #ifdef WIN32
@@ -1939,6 +1905,7 @@ void FileCatalog::setFilterPanel (FilterPanel* fpanel)
     filterPanel->set_sensitive (false);
     filterPanel->setFilterPanelListener (this);
 }
+
 void FileCatalog::trashChanged ()
 {
     if (trashIsEmpty()) {
@@ -1948,6 +1915,56 @@ void FileCatalog::trashChanged ()
     }
 }

+void FileCatalog::doReparseDirectory(unsigned long expected_counter)
+{
+    if (expected_counter != do_reparse_directory_expected_counter) {
+        return;
+    }
+
+    do_reparse_directory_pending = false;
+
+    std::vector<Glib::ustring> nfileNameList = getFileList ();
+
+    // check if a thumbnailed file has been deleted
+    const std::vector<ThumbBrowserEntryBase*>& t = fileBrowser->getEntries ();
+    std::vector<Glib::ustring> fileNamesToDel;
+
+    for (size_t i = 0; i < t.size(); i++)
+        if (!Glib::file_test (t[i]->filename, Glib::FILE_TEST_EXISTS)) {
+            fileNamesToDel.push_back (t[i]->filename);
+        }
+
+    for (size_t i = 0; i < fileNamesToDel.size(); i++) {
+        delete fileBrowser->delEntry (fileNamesToDel[i]);
+        cacheMgr->deleteEntry (fileNamesToDel[i]);
+        previewsLoaded--;
+        //Gtk::Main::iteration(false);
+    }
+
+    if (!fileNamesToDel.empty ()) {
+        _refreshProgressBar();
+    }
+
+    // check if a new file has been added
+    for (size_t i = 0; i < nfileNameList.size(); i++) {
+        bool found = false;
+
+        for (size_t j = 0; j < fileNameList.size(); j++)
+            if (nfileNameList[i] == fileNameList[j]) {
+                found = true;
+                break;
+            }
+
+        if (!found) {
+            checkAndAddFile (Gio::File::create_for_parse_name (nfileNameList[i]));
+            _refreshProgressBar ();
+            //Gtk::Main::iteration(false);
+        }
+    }
+
+    fileNameList = nfileNameList;
+}
+
 // Called within GTK UI thread
 void FileCatalog::buttonQueryClearPressed ()
 {
diff --git a/rtgui/filecatalog.h b/rtgui/filecatalog.h
index b7ea180..ea190ed 100644
--- a/rtgui/filecatalog.h
+++ b/rtgui/filecatalog.h
@@ -19,6 +19,9 @@
 #ifndef _FILECATALOG_
 #define _FILECATALOG_

+#include <set>
+#include <atomic>
+
 #ifdef WIN32
 #include "windirmonitor.h"
 #endif
@@ -26,7 +29,6 @@
 #include "exiffiltersettings.h"
 #include <giomm.h>
 #include "fileselectionlistener.h"
-#include <set>
 #include "fileselectionchangelistener.h"
 #include "coarsepanel.h"
 #include "toolbar.h"
@@ -148,11 +150,13 @@ private:
     int previewsToLoad;
     int previewsLoaded;

-
     std::vector<Glib::ustring> fileNameList;
     std::set<Glib::ustring> editedFiles;
     guint modifierKey; // any modifiers held when rank button was pressed

+    std::atomic<bool> do_reparse_directory_pending;
+    std::atomic<unsigned long> do_reparse_directory_expected_counter;
+
 #ifndef _WIN32
     Glib::RefPtr<Gio::FileMonitor> dirMonitor;
 #else
@@ -164,6 +168,7 @@ private:
     std::vector<Glib::ustring> getFileList ();
     BrowserFilter getFilter ();
     void trashChanged ();
+    void doReparseDirectory(unsigned long expected_counter);

 public:
     // thumbnail browsers
@@ -254,7 +259,7 @@ public:
     void runFilterDialog ();

     void on_realize();
-    void reparseDirectory ();
+    void reparseDirectory();
     void _openImage (std::vector<Thumbnail*> tmb);

     void zoomIn ();

The idea is, to accumulate calls to reparseDir(), as it is called many many times via on_dir_changed() when I'm watching a dir and then create many images in it. I've not (yet) moved the work in an own thread, as this is complicated and the preview creation is already queued. Please have a look, if this patch is sufficient for you.

@da-phil
Copy link
Author

da-phil commented Sep 7, 2016

Thank you @Floessie! I still wonder whether FileCatalog::doReparseDirectory() will be still run within the GTK UI thread or in a seperate thread?
I'll test it later!

@Floessie
Copy link
Collaborator

Floessie commented Sep 7, 2016

@da-phil FileCatalog::doReparseDirectory() is still run within the main loop thread. The new thing is, that you can now make mass changes to the directory you're in, and doReparseDirectory() will be called only once. Maybe this won't fix you very problem, but for me it made RT usable again when copying files to or deleting from the directory. Before, I had to wait for minutes. I guess, this will also help with your batch processing problem. Still, a single update will block the UI for a few seconds. If this still hits you, I could have a deeper look...

@da-phil
Copy link
Author

da-phil commented Sep 7, 2016

Ah, I didn't fully realise that this was another issue with doReparseDirectory().
To solve this problem I had imagined spawing an extra thread with the functionality of doReparseDirectory() but with a blocking call on a condition variable which will be triggered in reparseDirectory().

Something like that:

class FileCatalog {
....
public:
    void stopReparseThread();

private:
    std::atomic<bool> stop_reparse;
    std::mutex mutex_blk;
    std::condition_variable cond_reparse;
...
};


FileCatalog::FileCatalog(...)
{
....
    stop_reparse = false;
....
}

void FileCatalog::reparseDirectory()
{
....
    cond_reparse.notify_all();
}

void FileCatalog::doReparseDirectory()
{
    std::lock_guard<std::mutex> lock(mutex_blk);
    while(!stop_reparse)
    {
        cond_reparse.wait(lock); // block here
        // do the reparsing here
    }
}

void FileCatalog::stopReparseThread()
{
    stop_reparse = true;
    cond_reparse.notify_all();
}

int main(...)
{
...
    std::thread reparseDirectory_thread(&FileCatalog::doReparseDirectory, &filecatalog, &received_signal);
...
   // shutting down thread
   filecatalog.stopReparseThread();
   reparseDirectory_thread.join();
...
}

It's really just putting a thought together in a quick & dirty hack, I don't have any idea how to do it in the glib/gtk framework.
Maybe we can also run doReparseDirectory() in a detached thread?

@Floessie
Copy link
Collaborator

Floessie commented Sep 7, 2016

@da-phil In principle that's feasible. But as always: The devil is in the details. ;) If you come up with an implementation, we will all rejoice and the honor will be yours. Be careful when interacting with the UI thread. But first measure, if you are on the right track. I'd start the thread joinable in FileCatalog::FileCatalog() and destroy it in FileCatalog::~FileCatalog().

@da-phil
Copy link
Author

da-phil commented Sep 7, 2016

@Floessie Unfortunately your patch didn't fix the issue, but it might have made things faster.
Thanks for the hint, now that I had a brief look at the code again I understand how to implement the extra thread. I try to keep everything to FileCatalog class then.

@Floessie
Copy link
Collaborator

Floessie commented Sep 8, 2016

@da-phil Great! 👍 We can use your help and contribution!

@Floessie
Copy link
Collaborator

Floessie commented Sep 8, 2016

@da-phil Did some research to help you with your implementation:

  • Using std::thread and friends seems to be fine in GTK. The first "GTK thread" example I found used it as well.
  • fileBrowser->getEntries() is not thread-safe, but there is a chance if you extend ThumbBrowserBase with a function std::vector<Glib::ustring> getFilenames() const and use the entryRW rw-mutex with MYREADERLOCK(l, entryRW) like in e.g. ThumbBrowserBase::enableTabMode().
  • cacheMgr and _refreshProgressBar() seem thread-safe.
  • The check, if a new file was added, could be compacted using <algorithm>, I guess. Making copies of the lists as std::set<> and using std::set_* on them would also yield fileNamesToDel.
  • previewsLoaded and previewsToLoad must be made std::atomic<>. Still there is a race potential with dirSelected() while you process the list. Be prepared... ;)
  • fileNameList must be protected by a mutex. This will only affect closeDir() and dirSelected(). Maybe use this mutex also for previewsLoaded and previewsToLoad (see above).

All told, this really seems to be doable, and you seem to have the skills to do so. 👍 If you get stuck, we will help, of course. Patches are okay, but if you expect several review iterations you might be better off forking here on GitHub so that we can easily test your branch. This is also better on the long run, if you like to contribute every now and then.

HTH and good luck
Flössie

@da-phil
Copy link
Author

da-phil commented Sep 8, 2016

@Floessie Wow, thanks for your insightful help, with those hints it will be much easier for sure! It would have taken me ages to find out about the thread-safeness of all those components! :o
I hope I can start soon, right now I've to just work with RT, not working on RT :D

@Floessie
Copy link
Collaborator

Floessie commented Sep 9, 2016

@da-phil You're welcome. Same with me: I'm also working more with than on RT. :)

@Beep6581
Copy link
Owner

@da-phil can you still reproduce this?

@Beep6581 Beep6581 added the type: bug Something is not doing what it's supposed to be doing label Jan 22, 2018
@heckflosse
Copy link
Collaborator

Still valid? Or can we close?

@Thanatomanic
Copy link
Contributor

Just for reference: it might have been fixed by #4576 or is related to the open #4732.

@Beep6581
Copy link
Owner

Beep6581 commented Oct 28, 2018

I just saved a HDR DNG from HDRMerge into a folder which I had open in RT's File Browser, and RT crashed.
This was using branch colortoning-labregions commit 50c6238.

@da-phil
Copy link
Author

da-phil commented Oct 30, 2018

I couldn't test the new code on a folder with thousands of pictures yet, but what I noticed now is, that newly saved files don't appear in the file-browser, only after you hit the 'refresh' button. In my opinion this is also not a desirable behaviour, although the hanging GUI seems not to be a problem anymore.

@Beep6581
Copy link
Owner

@da-phil can you reproduce the freeze in 5.5(rc*)?

@da-phil
Copy link
Author

da-phil commented Dec 15, 2018

@Beep6581 no, now I can't reproduce the error anymore, this is great, thanks to everybody who was involved in fixing it! However, now I have to hit the refresh button in the filebrowser for the newly exported file(s) to appear in the filebrowser.
I tested it by browsing a folder with ~1.3k pics while a pic was exported and saved in the background

I've been testing with the latest version from dev.

Version: nightly-1179-g1e45112d4
Branch: dev
Commit: 1e45112d4
Commit date: 2018-12-15
Compiler: cc 7.3.0
Processor: x86_64
System: Linux
Bit depth: 64 bits
Gtkmm: V3.22.2
Lensfun: V0.3.2.0
Build type: release
Build flags:  -std=c++11 -march=native -Werror=unused-label -Wall -Wuninitialized -Wno-deprecated-declarations -Wno-unused-result -fopenmp -Werror=unknown-pragmas -O3 -DNDEBUG -ftree-vectorize
Link flags:  -march=native
OpenMP support: ON
MMAP support: ON

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bug Something is not doing what it's supposed to be doing
Projects
None yet
Development

No branches or pull requests

5 participants