Use a semaphore to avoid fetching the complete library into memory #410
Conversation
- this happens if the processing of items is slower than the fetching of new ones
- if a big library is synced, the old behavior could lead to extensive memory use
- the semaphore acts like a buffer that only allows fetching new items from the library once old ones are processed
- the current size of the 'buffer' is hard-coded to 2 * [max. item fetch limit] * [number of download threads]
- previously, all threads that fetched items from the server kept their results in memory until the sync was finished
I can't say I'm familiar with semaphores (heard of them, never had to use them). Is there any particular reason not to use a fixed-size Queue for this task?
Basically, the semaphore does exactly that: as before, all thread jobs are created and started by the pool. During execution, a thread checks whether enough items have been processed (the fixed number). If so, execution continues; if not, the thread waits until the semaphore is released, which happens after an item is processed. Implementing this behaviour with a semaphore was the first thing that came to my mind, but there are certainly other ways to achieve it. I just had a quick look at the source of the Emby Kodi plugin: it seems they no longer use threading at this point, so items get fetched when they are needed. So is a thread pool really needed here?
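The mechanism described above can be sketched roughly like this. Sizes and item counts are made up for the example; this is not the plugin's actual code, just an illustration of the acquire-on-fetch / release-on-process idea:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical numbers; the plugin derives these from its settings.
DTHREADS = 2                     # download threads
LIMIT = 3                        # items per fetch request
BUFFER = 2 * LIMIT * DTHREADS    # max. items held in memory at once

semaphore = threading.Semaphore(BUFFER)
processed = []

def fetch(batch):
    # Each fetched item occupies one buffer slot; acquire() blocks when
    # the buffer is full, pausing this fetching thread until the
    # consumer catches up.
    for item in batch:
        semaphore.acquire()
    return batch

batches = [list(range(i, i + LIMIT)) for i in range(0, 30, LIMIT)]
with ThreadPoolExecutor(max_workers=DTHREADS) as pool:
    for result in pool.map(fetch, batches):
        for item in result:
            processed.append(item)  # stand-in for "add item to Kodi library"
            semaphore.release()     # free the item's buffer slot

print(len(processed))  # 30
```

Note that the buffer must be at least LIMIT, otherwise a single batch could never finish acquiring its slots and the sync would deadlock.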
This was implemented in #202 in order to speed up syncs, I believe. If I were to implement this kind of parallel processing myself, I'd most likely use
I am not opposed to removing threading here if the performance impact of that is minimal (trading some performance for actually working on more systems is generally fine; 10x slower, maybe not).
Ok, but then processing of the items should also be parallelized, because after fetching the items, that is the bottleneck.
Would be no problem for me.
On lower-powered devices the multi-threading could be disabled. The simplest method is to set the queue size to 1, assuming your implementation suggestion. This setting could be linked to the 'download threads' setting. Also, the default value (currently 3) could be set programmatically according to the available CPU cores. Anyway, #350 (comment) shows that my "fix" does not work for everyone.
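Deriving the default from the core count could look something like this sketch (the function name and the cap at the current default of 3 are assumptions, not the plugin's actual behaviour):

```python
import os

# Illustrative only: pick a default download-thread count from the
# available CPU cores instead of hard-coding 3.
def default_download_threads():
    cores = os.cpu_count() or 1   # os.cpu_count() can return None
    return max(1, min(cores, 3))  # keep the current default as an upper bound

print(default_download_threads())
```

On a single-core device this yields 1, effectively disabling the multi-threading as suggested above.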
After some additional testing I have some observations to share and a few questions that I hope are not too noobish ;)
Now two related questions:
I did a local database reset followed by a full scan after removing
There is no such thing as a too-noobish question when debugging issues like these – rubber duck debugging is a thing, after all.
That sounds really strange to me, I don't see any reason for that happening, except possibly the settings in the UI not matching what's read in this file.
I'd consider this just random variation (I'm assuming you tested once and didn't average 100 runs of each 😉), so it looks like these parameters don't do anything for sync speed in your particular case. Could you share some details on your library composition? A thought that occurred to me earlier today: could this be an issue related to rapidly writing to the database on slow (SD-card) storage?
@macearl let me try to answer your questions:
The "2" came from the first commit of my PR: jellyfin-kodi/jellyfin_kodi/downloader.py, line 284 in 09b0bdb
My problem was that I didn't know how big to size the semaphore. At first it was
My intention with this PR was to save memory, so I removed the
Basically, the size of the semaphore should not be too big, because then more memory is needed. But it also should not be too low, because on a slow network/internet connection there should be some items in memory for faster processing of the items.
First point
jellyfin-kodi/jellyfin_kodi/downloader.py, line 282 in e510097
DTHREADS. That means there are DTHREADS threads that can execute jobs. If all threads have a job, the next job will not be started until another job finishes. For further information, have a look at the documentation: https://docs.python.org/3/library/concurrent.futures.html
Second point: answered above.
Third point
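A minimal illustration of that pool behaviour, per the linked docs (DTHREADS and the job body are placeholders, not the plugin's code):

```python
import time
from concurrent.futures import ThreadPoolExecutor

DTHREADS = 2  # mirrors the plugin's download-thread setting (illustrative)

def job(n):
    time.sleep(0.05)  # stand-in for an HTTP fetch
    return n * n

# At most DTHREADS jobs run at once; further jobs wait in the pool's
# internal queue until a worker thread becomes free.
with ThreadPoolExecutor(max_workers=DTHREADS) as pool:
    results = list(pool.map(job, range(5)))

print(results)  # [0, 1, 4, 9, 16]
```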
That's not right. The semaphore gets released after the results of one job are processed, i.e. after jellyfin-kodi/jellyfin_kodi/downloader.py, line 326 in e510097
This release call is within the for-loop that processes the results of a single job.
Answers to your questions
I answered this above. Do you have an idea of how to size the semaphore correctly?
Sorry, I've got no idea why this fails... The behavior you describe is very strange.
Comment on your measurements
@oddstr13 How should we proceed with this PR?
I wonder if we're overthinking this and it's more an order-of-operations issue than a data query/threading issue. For example, this function: jellyfin-kodi/jellyfin_kodi/full_sync.py, lines 272 to 299 in 1709a61
It's processing each item, but it's not clearing its memory each iteration, so the memory usage would constantly be climbing (if I'm reading it right). Also, it's opening the database lock multiple times, which I'm not entirely sure is the most efficient approach. It does highlight why parallelizing database writes is an issue, though: SQLite is only meant to be accessed from one place, so I think if we try to multithread that part we'll run into conflicts sooner or later. I'd love to be wrong about that, though.
I was thinking we could try a single fetching thread with a queue size somewhere in the range of 10 to LIMIT*2, as that would probably eliminate the HTTP request latency (keeping data ready for processing). I don't think there's a need for multiple threads to process the data, and I don't think there's much benefit to multiple HTTP requests at the same time either; if the server is slow, it is slow. We should be using HTTP keep-alive anyway (are we?), so TCP handshakes shouldn't cause additional latency. What would be interesting to compare is the full sync time single-threaded (fetch, then process, repeat) vs. one, two, and three HTTP fetching threads. My current thinking is to either remove threading here or use just a single thread, depending on the performance on something low-end (I think a Pi 3 is a reasonable benchmark device here?).
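The single-fetcher idea could be sketched like this, assuming a bounded `queue.Queue` as the buffer (the item counts, queue size, and fetch loop are placeholders for the paged HTTP requests):

```python
import queue
import threading

LIMIT = 5                        # items per request (hypothetical)
q = queue.Queue(maxsize=LIMIT * 2)  # bounded buffer: caps memory use
SENTINEL = object()              # signals the end of the fetch stream

def fetcher():
    # Stand-in for paged HTTP requests against the server; put() blocks
    # while the queue is full, so at most LIMIT*2 items sit in memory.
    for item in range(20):
        q.put(item)
    q.put(SENTINEL)

threading.Thread(target=fetcher, daemon=True).start()

processed = []
while True:
    item = q.get()
    if item is SENTINEL:
        break
    processed.append(item)       # stand-in for the database insertion

print(len(processed))  # 20
```

The bounded queue replaces both the thread pool and the semaphore: the fetcher stays one page ahead of the consumer and back-pressure falls out of `put()` blocking.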
I can't confirm that. In my search for the memory leak I used the Pydev debugger. While investigating the function you mentioned, I noticed that my memory filled up in the background. So this function can't be the culprit, because its execution was interrupted by the debugger.
What about https://sqlite.org/threadsafe.html? Can this be used to add items to the database in parallel, or update them?
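For reference, Python exposes the compiled threading mode of the linked page through `sqlite3.threadsafety` (this is only an inspection snippet, not a claim that parallel writes would work):

```python
import sqlite3

# sqlite3.threadsafety follows PEP 249: 0 means threads may not share
# the module, 1 means they may share the module, 3 means connections
# and cursors may be shared too. Since Python 3.11 this value reflects
# the threading mode the underlying SQLite library was compiled with.
print(sqlite3.threadsafety)
```

Even in SQLite's "serialized" mode, writes are protected by a database-wide lock, so concurrent INSERTs are serialized internally rather than truly parallel.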
I think on multi-core systems that would make sense. Normally this should result in better performance on those systems.
That's quite a good point... Except the Jellyfin server also uses multi-threading for processing GET requests. Does it? If so, multiple parallel GETs could be faster than one GET after another. But this is just a guess...
Had a poke around over the weekend and I think I found the problem: because the variables
Yeah, I would also put it down to normal variation. If the connection speed to the server is fast enough, it probably doesn't make a difference. The library is about 1500 movies and 300 episodes, no music. The scan was done on an RPi 4 1GB; on my desktop machine the scan finishes in about 2 minutes.
Ah, didn't check the previous commits, makes sense
Yeah, I probably got confused at this point or missed the indentation.
Not really, I would probably go the "just set it to DTHREADS" route, as that is what I personally would expect the option to mean, but then again it might just be me and someone else might expect something completely different. Whether a fixed-size queue would be better than a semaphore, I have no idea, but I also don't really see anything wrong with the current implementation. I was able to fully sync on both the RPi 4 and an RPi 3+ with the number of semaphores set to just
With the original limit of
I'm glad that you found a working combination. @oddstr13 @mcarlton00:
Whatever solution fixes the problem while keeping or improving maintainability is fine with me. 😃
After spending my afternoon reading SQLite and Python documentation about multi-threading, I must concede that implementing my approach is not as easy as I thought. So @mcarlton00 is right here and I was overthinking this. The easiest will be as @oddstr13 said:
Just remove the multithreading for fetching items from the server. Or do you have another idea for better syncing performance? If not, I think we can close this PR. Thanks for your assistance :)
@mcarlton00 I've added this change and rebased everything onto the current master. Feel free to merge if it's OK for you.
1.5 confirmations that it works, good enough for me. If there are other issues, I'm sure we'll find them over time.
Hello,
after my #350 (comment) I had some time to investigate this bug. I tracked it down to these lines:
jellyfin-kodi/jellyfin_kodi/downloader.py
Lines 282 to 286 in 65f400b
In the first line a thread pool is opened. For each bunch of items that should be fetched from the Jellyfin server, a thread is created (second line). When iteration over the thread pool starts (the for loop), all threads are executed. This leads to the following problem:
If the processing of single items (e.g. adding them to the Kodi library) takes longer than the fetching process of the threads does, the fetched items get stored in memory until they get processed.
For a big library that means a lot of memory is needed.
My approach to solving this problem is the use of a simple semaphore that acts like a buffer. There is now a fixed number of items that may be pre-fetched from the Jellyfin server; only after items are processed by the Kodi plugin can more items be fetched from the server.
The current fixed number is calculated as:
2 * [max. item fetch limit] * [number of download threads]
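Plugging in example numbers (LIMIT is hypothetical here; the actual fetch limit and thread count come from the plugin's code and settings, with 3 being the current download-thread default):

```python
# Hypothetical values for illustration only.
LIMIT = 15       # max. item fetch limit
DTHREADS = 3     # number of download threads (current default)

# The hard-coded buffer size from the formula above:
buffer_size = 2 * LIMIT * DTHREADS
print(buffer_size)  # 90
```

So with these values, at most 90 items would sit in memory awaiting processing, regardless of library size.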
Please let me know your thoughts on this.