Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Syncing Series is SLOW #2260

Closed
stevezau opened this issue Oct 4, 2023 · 22 comments
Closed

Syncing Series is SLOW #2260

stevezau opened this issue Oct 4, 2023 · 22 comments

Comments

@stevezau
Copy link
Contributor

stevezau commented Oct 4, 2023

Describe the bug
I've noticed syncing series takes several hours. It used to be quick..

To Reproduce

  1. Start the sync series job

Expected behavior
Faster than 4-5hrs.

I have 9000 movies, that syncs fast. I have 3000 series, it takes several hours.

Software (please complete the following information):
Bazarr Version 1.3.0
Package Version v1.3.0-ls218 by linuxserver.io
Sonarr Version 4.0.0.688
Radarr Version 5.0.3.8107
Operating System Linux-6.2.0-33-generic-x86_64-with
Python Version 3.11.5

Additional context
What's the best way for me to t-shoot? I can turn on debug but what should i look for?

@halali
Copy link
Collaborator

halali commented Oct 4, 2023

we dont support sync all series at all.

but you have to imagine every series have couple episodes... so 3000 series can be like 50000 episodes or so

@stevezau
Copy link
Contributor Author

stevezau commented Oct 4, 2023

I mean the "Sync with Sonarr" job.

It used to take a while on the first run but it was much faster after. But it is still pretty slow.

I would have thought it would be faster on the following run as the files have no changed?

Maybe ill turn on debug and look at the logs..

@morpheus65535
Copy link
Owner

Should be improved in upcoming beta. I don't have a library that large but let me know how it goes.

@stevezau
Copy link
Contributor Author

stevezau commented Oct 8, 2023

Sure, ill test when the new docker image is out. thanks!

@morpheus65535
Copy link
Owner

Can you confirm the improvement? I'm about to merge nightly to main branch and release 1.3.1.

@stevezau
Copy link
Contributor Author

I'm using linuxserver/bazarr:latest atm.

if i switch to the development tag, does that have the changes in ther for me to test?

@morpheus65535
Copy link
Owner

if i switch to the development tag, does that have the changes in ther for me to test?

Yes

@stevezau
Copy link
Contributor Author

stevezau commented Oct 12, 2023

Updated to dev.

I immediately noticed Movies syncing was MUCH MUCH faster.

for context, it synced 8000 movies from radarr in 60 seconds, sonarr sync is still only up to the 6th item in the series list

However, Series sync still seems slow? How do you want me to t-shoot?

@morpheus65535
Copy link
Owner

Before 1.3.0, we were syncing series and episodes separately so we were getting all episodes metadata from all series, deleting the one that were not present anymore, updating the modified episodes and adding the new one.

Since 1.3.0, series and episodes sync aren't separate process anymore. Now, during the series sync, for each series, we make a call to Sonarr API to list episodes and delete/update/add them to the database. We aren't doing more call to Sonarr, it's all the same. The difference is that we update database after each series instead of one time at the end.

If nothing as changed, it should take long to sync it, maybe a couple of minutes.

I've improved database call to delete/update/insert in batch for a series instead of doing this for each episodes. It should have improve significantly. Didn't you see any difference?

@stevezau
Copy link
Contributor Author

Hmm ok. well it synced 8000 movies in 60 seconds. It has been running for over an hr and a half now and it is only up to series 1029.

Sonarr is minimal CPU. disks at 3% usage.

So, it does not feel faster to me.

@stevezau
Copy link
Contributor Author

Is there someway i can see where the bottlenecks are?

@morpheus65535
Copy link
Owner

morpheus65535 commented Oct 12, 2023

How long did it take before 1.3.0? Anything that I could use as a baseline? I'm going through the code to see if I can optimize something else.

@stevezau
Copy link
Contributor Author

I am unsue how long it took before but it feels like it is going at the same speed as before. But, as you pointed out it is doing more then before so maybe it is going faster but I'm preceiving it as the same.

Hmm ok.. I could turn on debugging to see where it is spending most of it's time?

@morpheus65535
Copy link
Owner

Hmm ok.. I could turn on debugging to see where it is spending most of it's time?

You can but, TBH, I wish you luck trying to analyze the sync log for 3000 series! lol

@stevezau
Copy link
Contributor Author

stevezau commented Oct 12, 2023

13/10/2023 08:58:32|INFO    |socketio.server                 |emitting event "data" to all [/]|
13/10/2023 08:58:32|INFO    |engineio.server                 |<REDACTED>: Sending packet MESSAGE data 2["data",{"type":"progress","action":"update","payload":{"id":"series_progress","header":"Syncing series...","name":"Full Bloom","value":1151,"count":3939}}]|
13/10/2023 08:58:32|INFO    |socketio.server                 |emitting event "data" to all [/]|
13/10/2023 08:58:32|INFO    |engineio.server                 |<REDACTED>: Sending packet MESSAGE data 2["data",{"type":"series","action":"update","payload":1889}]|
13/10/2023 08:58:32|DEBUG   |root                            |BAZARR Starting episodes sync from Sonarr.|
13/10/2023 08:58:32|DEBUG   |urllib3.connectionpool          |Starting new HTTP connection (1): ***.***.***.***:8989|
13/10/2023 08:58:32|DEBUG   |urllib3.connectionpool          |http://***.***.***.***:8989 "GET /api/v3/episode?seriesId=1889&apikey=(removed) HTTP/1.1" 200 None|
13/10/2023 08:58:32|DEBUG   |urllib3.connectionpool          |Starting new HTTP connection (1): ***.***.***.***:8989|
13/10/2023 08:58:32|DEBUG   |urllib3.connectionpool          |http://***.***.***.***:8989 "GET /api/v3/episodeFile?seriesId=1889&apikey=(removed) HTTP/1.1" 200 None|


13/10/2023 08:58:34|DEBUG   |root                            |BAZARR All episodes from series ID 1889 synced from Sonarr into database.|
13/10/2023 08:58:34|INFO    |socketio.server                 |emitting event "data" to all [/]|
13/10/2023 08:58:34|INFO    |engineio.server                 |<REDACTED>: Sending packet MESSAGE data 2["data",{"type":"progress","action":"update","payload":{"id":"series_progress","header":"Syncing series...","name":"The Crimes That Changed Us","value":1152,"count":3939}}]|
13/10/2023 08:58:34|INFO    |socketio.server                 |emitting event "data" to all [/]|
13/10/2023 08:58:34|INFO    |engineio.server                 |<REDACTED>: Sending packet MESSAGE data 2["data",{"type":"series","action":"update","payload":1892}]|
13/10/2023 08:58:34|DEBUG   |root                            |BAZARR Starting episodes sync from Sonarr.|
13/10/2023 08:58:34|DEBUG   |urllib3.connectionpool          |Starting new HTTP connection (1): ***.***.***.***:8989|
13/10/2023 08:58:34|DEBUG   |urllib3.connectionpool          |http://***.***.***.***:8989 "GET /api/v3/episode?seriesId=1892&apikey=(removed) HTTP/1.1" 200 None|
13/10/2023 08:58:34|DEBUG   |urllib3.connectionpool          |Starting new HTTP connection (1): ***.***.***.***:8989|
13/10/2023 08:58:34|DEBUG   |urllib3.connectionpool          |http://***.***.***.***:8989 "GET /api/v3/episodeFile?seriesId=1892&apikey=(removed) HTTP/1.1" 200 None|
13/10/2023 08:58:35|DEBUG   |root                            |BAZARR All episodes from series ID 1892 synced from Sonarr into database.|
13/10/2023 08:58:35|INFO    |socketio.server                 |emitting event "data" to all [/]|
13/10/2023 08:58:35|INFO    |engineio.server                 |<REDACTED>: Sending packet MESSAGE data 2["data",{"type":"progress","action":"update","payload":{"id":"series_progress","header":"Syncing series...","name":"Moon Knight","value":1153,"count":3939}}]|
13/10/2023 08:58:35|INFO    |socketio.server                 |emitting event "data" to all [/]|
13/10/2023 08:58:35|INFO    |engineio.server                 |<REDACTED>: Sending packet MESSAGE data 2["data",{"type":"series","action":"update","payload":1893}]|
13/10/2023 08:58:35|DEBUG   |root                            |BAZARR Starting episodes sync from Sonarr.|
13/10/2023 08:58:35|DEBUG   |urllib3.connectionpool          |Starting new HTTP connection (1): ***.***.***.***:8989|
13/10/2023 08:58:35|DEBUG   |urllib3.connectionpool          |http://***.***.***.***:8989 "GET /api/v3/episode?seriesId=1893&apikey=(removed) HTTP/1.1" 200 None|
13/10/2023 08:58:35|DEBUG   |urllib3.connectionpool          |Starting new HTTP connection (1): ***.***.***.***:8989|
13/10/2023 08:58:36|DEBUG   |urllib3.connectionpool          |http://***.***.***.***:8989 "GET /api/v3/episodeFile?seriesId=1893&apikey=(removed) HTTP/1.1" 200 None|
13/10/2023 08:58:36|DEBUG   |root                            |BAZARR All episodes from series ID 1893 synced from Sonarr into database.|
13/10/2023 08:58:36|INFO    |socketio.server                 |emitting event "data" to all [/]|
13/10/2023 08:58:36|INFO    |engineio.server                 |<REDACTED>: Sending packet MESSAGE data 2["data",{"type":"progress","action":"update","payload":{"id":"series_progress","header":"Syncing series...","name":"Hawkeye (2021)","value":1154,"count":3939}}]|
13/10/2023 08:58:36|INFO    |socketio.server                 |emitting event "data" to all [/]|
13/10/2023 08:58:36|INFO    |engineio.server                 |<REDACTED>: Sending packet MESSAGE data 2["data",{"type":"series","action":"update","payload":1894}]|
13/10/2023 08:58:36|DEBUG   |root                            |BAZARR Starting episodes sync from Sonarr.|
13/10/2023 08:58:36|DEBUG   |urllib3.connectionpool          |Starting new HTTP connection (1): ***.***.***.***:8989|
13/10/2023 08:58:36|DEBUG   |urllib3.connectionpool          |http://***.***.***.***:8989 "GET /api/v3/episode?seriesId=1894&apikey=(removed) HTTP/1.1" 200 None|
13/10/2023 08:58:36|DEBUG   |urllib3.connectionpool          |Starting new HTTP connection (1): ***.***.***.***:8989|
13/10/2023 08:58:36|DEBUG   |urllib3.connectionpool          |http://***.***.***.***:8989 "GET /api/v3/episodeFile?seriesId=1894&apikey=(removed) HTTP/1.1" 200 None|
13/10/2023 08:58:37|DEBUG   |root                            |BAZARR All episodes from series ID 1894 synced from Sonarr into database.|
13/10/2023 08:58:37|INFO    |socketio.server                 |emitting event "data" to all [/]|
13/10/2023 08:58:37|INFO    |engineio.server                 |<REDACTED>: Sending packet MESSAGE data 2["data",{"type":"progress","action":"update","payload":{"id":"series_progress","header":"Syncing series...","name":"Walker","value":1155,"count":3939}}]|
13/10/2023 08:58:37|INFO    |socketio.server                 |emitting event "data" to all [/]|
13/10/2023 08:58:37|INFO    |engineio.server                 |<REDACTED>: Sending packet MESSAGE data 2["data",{"type":"series","action":"update","payload":1895}]|
13/10/2023 08:58:37|DEBUG   |root                            |BAZARR Starting episodes sync from Sonarr.|
13/10/2023 08:58:37|DEBUG   |urllib3.connectionpool          |Starting new HTTP connection (1): ***.***.***.***:8989|
13/10/2023 08:58:37|DEBUG   |urllib3.connectionpool          |http://***.***.***.***:8989 "GET /api/v3/episode?seriesId=1895&apikey=(removed) HTTP/1.1" 200 None|
13/10/2023 08:58:37|DEBUG   |urllib3.connectionpool          |Starting new HTTP connection (1): ***.***.***.***:8989|
13/10/2023 08:58:37|DEBUG   |urllib3.connectionpool          |http://***.***.***.***:8989 "GET /api/v3/episodeFile?seriesId=1895&apikey=(removed) HTTP/1.1" 200 None|



13/10/2023 08:58:44|DEBUG   |root                            |BAZARR All episodes from series ID 1895 synced from Sonarr into database.|
13/10/2023 08:58:44|INFO    |socketio.server                 |emitting event "data" to all [/]|
13/10/2023 08:58:44|INFO    |engineio.server                 |<REDACTED>: Sending packet MESSAGE data 2["data",{"type":"progress","action":"update","payload":{"id":"series_progress","header":"Syncing series...","name":"The Old Man","value":1156,"count":3939}}]|
13/10/2023 08:58:44|INFO    |socketio.server                 |emitting event "data" to all [/]|
13/10/2023 08:58:44|INFO    |engineio.server                 |<REDACTED>: Sending packet MESSAGE data 2["data",{"type":"series","action":"update","payload":1896}]|
13/10/2023 08:58:44|DEBUG   |root                            |BAZARR Starting episodes sync from Sonarr.|
13/10/2023 08:58:44|DEBUG   |urllib3.connectionpool          |Starting new HTTP connection (1): ***.***.***.***:8989|
13/10/2023 08:58:44|DEBUG   |urllib3.connectionpool          |http://***.***.***.***:8989 "GET /api/v3/episode?seriesId=1896&apikey=(removed) HTTP/1.1" 200 None|
13/10/2023 08:58:44|DEBUG   |urllib3.connectionpool          |Starting new HTTP connection (1): ***.***.***.***:8989|
13/10/2023 08:58:44|DEBUG   |urllib3.connectionpool          |http://***.***.***.***:8989 "GET /api/v3/episodeFile?seriesId=1896&apikey=(removed) HTTP/1.1" 200 None|
13/10/2023 08:58:45|DEBUG   |root                            |BAZARR All episodes from series ID 1896 synced from Sonarr into database.|
13/10/2023 08:58:45|INFO    |socketio.server                 |emitting event "data" to all [/]|
13/10/2023 08:58:45|INFO    |engineio.server                 |<REDACTED>: Sending packet MESSAGE data 2["data",{"type":"progress","action":"update","payload":{"id":"series_progress","header":"Syncing series...","name":"StartUp","value":1157,"count":3939}}]|
13/10/2023 08:58:45|INFO    |socketio.server                 |emitting event "data" to all [/]|
13/10/2023 08:58:45|INFO    |engineio.server                 |<REDACTED>: Sending packet MESSAGE data 2["data",{"type":"series","action":"update","payload":1897}]|
13/10/2023 08:58:45|DEBUG   |root                            |BAZARR Starting episodes sync from Sonarr.|
13/10/2023 08:58:45|DEBUG   |urllib3.connectionpool          |Starting new HTTP connection (1): ***.***.***.***:8989|
13/10/2023 08:58:45|DEBUG   |urllib3.connectionpool          |http://***.***.***.***:8989 "GET /api/v3/episode?seriesId=1897&apikey=(removed) HTTP/1.1" 200 None|
13/10/2023 08:58:45|DEBUG   |urllib3.connectionpool          |Starting new HTTP connection (1): ***.***.***.***:8989|
13/10/2023 08:58:45|DEBUG   |urllib3.connectionpool          |http://***.***.***.***:8989 "GET /api/v3/episodeFile?seriesId=1897&apikey=(removed) HTTP/1.1" 200 None|```

@stevezau
Copy link
Contributor Author

Hmm, when i watch the logs there is a delay from this line

13/10/2023 12:39:45|DEBUG   |urllib3.connectionpool          |http://***.***.***.***:8989 "GET /api/v3/episodeFile?seriesId=5&apikey=(removed) HTTP/1.1" 200 None|```

to the next line.. 

```13/10/2023 12:39:50|DEBUG   |root                            |BAZARR All episodes from series ID 5 synced from Sonarr into database.|```

sometimes it is from 5-30 seconds.

Do you think that's the Sonarr API taking a while to respond? It is not clear if this is an IO wait or if bazarr gets the response and is spending time doing something else. 

@morpheus65535
Copy link
Owner

I'm trying different approches to improve time spent by Bazarr to parse the episodes returned by Sonarr but everything I try end up making it worst.

How many episodes does Sonarr have for the series with the longer delay?

@morpheus65535
Copy link
Owner

Determining if values returned by Sonarr API are different from the one stored in database are different take time. I don't see any way to improve this right now.

I've put 4 hours today alone to improve this and haven't found. Let me know if you figure something that I could have miss.

Until then, I guess I'll release 1.3.1 as it is.

Thanks for your involvement!

@stevezau
Copy link
Contributor Author

@morpheus65535 can you point me to the area of code that handles this? i might see if i can live debug what's the bottleneck on my side.

@morpheus65535
Copy link
Owner

@stevezau
Copy link
Contributor Author

@morpheus65535 thanks, im yet to setup a dev env.

I had a quick look at the code tho. does it parse each file, on every run?

If it does, i wonder how movies are so fast, does it not also parse the movie files?

@morpheus65535
Copy link
Owner

morpheus65535 commented Oct 16, 2023

I had a quick look at the code tho. does it parse each file, on every run?

Do you mean parsing the video files? Unless you've disable the usage of cache in Settings-->Scheduler, it will use the cached ffprobe/mediainfo result to prevent analyzing all the files over and over again.

If it does, i wonder how movies are so fast, does it not also parse the movie files?

What take time is calling API and parsing the returned JSON. Writing changes to database is also time consuming.

For movies, only one call to Radarr API return all the movies, we parse the JSON once, call the database to compare existing values with parsed JSON returned and do a bulk delete/update/insert to database. For each movies, we index external subtitles.

For series, we do one call to get all the series, then, for each series, we store the series to database, call Sonarr again to get the episodes, call it one last time to get the episodes files, call the database to compare existing values with JSON returned and then we bulk delete/update/insert episodes to database. For each episode, we index the external subtitles.

For 3000 series, it means we must call Sonarr 6001 times where it took only one call to Radarr. It also means that we must do 12003 database calls where it took only 4 calls for movies.

There's no other way to do it. This is the way Sonarr devs have designed their API and they didn't had any interest in a global episodes API last time I spoke with them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants