-
-
Notifications
You must be signed in to change notification settings - Fork 807
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue with artist image handling #2130
Comments
This is a chicken and egg issue: As a artist image can come from an external source (Spotify), I can only verify if the image is available after calling it. I cannot call Spotify for all artists beforehand or I could easily hit their rate limit. The previous approach was to never return the artist's Now we have local images. The approach I took was to resolve any images at query time (not scan time), because of the external images limitation explained above. I understand it is not ideal, but This allows clients, specially those that request info on demand (ex: Airsonic Refix), to show artist images in their Artist list view. I'll have to think of another approach for this, and suggestions are welcome. |
What kind of headers do you return for those images (for the cache, max age, etag, ... support 304 code?) In all cases having different urls might not be an issue if you do not return that generic image so that client can choose what missing image they want to display. Current solution just can't be detected on the client side. IMO the normal way, would be to return 404 when no image and let the client decide what to do. As a side note with your current change you might actually generate a tons more calls to the spotify API since before the client needed to call the proper endpoint or visit the artist page, now any client that display a grid of artist image will start multiple dozens of artist image request at the same time. |
Are we talking about the
I agree, but that's not what Subsonic/Airsonic does:
Yes, but some clients were already doing this for "pre-caching". But, initiating these calls from Navidrome, for all artists, would hit the rate limit way faster. I'm actually trying to pre-cache the images, but this is unreliable. |
Since you now return artistImageUrl that's what I use so urls like Previously when those where not present I fallback to coverArt to generate the getCoverArt. But now you always give a value so it's always used.
There's many thing they do not do correctly too ;)
I do but with manners and delays. For the images you give me urls that are supposed to be valid and present so there's no throttling or anything, it will send multiple dozens of queries at the same time. |
Yeah, a third way to return artist images. So messy....
Well, arguably, if we are implementing Subsonic compatible servers/clients, we should consider Subsonic the "golden standard".
Good point. I just tried returning 404 in the
It was not a criticism to Synfonium, just a general observation. Subtracks also does that and I think Substreamer as well. Actually, any player that shows a grid of artist images will try to request a lot of artist images in one go.
But this is a one time thing that will happen when upgrading. No way around it. |
We already have quite a few discussions about that, just returning the type in the answer is breaking the Golden standard. For the rest, it's really more about your API keys, I just explain how Symfonium and probably other apps may trigger a lot more queries to the images than before and could be a problem. If it's not then all good. But as a large project you might have issues. (I don't know the Spotify API rules not using that one for the artist images). Anyway, returning 404 would be nice and a quick working fix for this thanks. Just be aware that most apps will probably not cache the 404 code and will retry downloading the image at a possible frequent rate, so be sure to not trigger a Spotify API query each time. |
@deluan works for new artists never loaded, but for those that previously returned the default image it still return the default. What should I clear, and can it be done automatically? |
The image cache needs to be cleared, but there's no API call for that. I'll change the artist cache key format and this should resolve it automatically. |
Perfect thanks. |
Closed by #2134 |
@deluan So after a full artistInfo2 to all the artists the error 404 each time takes quite some time to be sent. |
|
That would be nice I have a 404 error cache to not retry indefinitely but only in memory, so dropped at each app start. I assume other apps don't even have such and would query each time. |
…if data is already cached locally. Relates to #2130 (comment)
Hey @Tolriq , can you try with the latest dev build? I also fixed some bugs that could affect the image cache and serve broken images. |
@deluan Do you have a link to the built files? I use the windows exe for all my tests . |
@deluan ok so it works now, and seems fast to return the 404 after restart / clear cache. Don't know if it's normal but after tons of scrolling now Navidrome shows tons of
|
This message shows when it takes more than 5 seconds to get info from external sources. Maybe a slow connection? |
2Gb fiber ;) Probably more a deadlock, or too much at the same time and triggering a limit on last.fm? I do not keep the logs on that test instance, is there a parameter on windows to to a file?
|
Not a deadlock, and a rate limit would show a different message. Not sure why you are getting so many messages like that. This is with the latest binary or it was like that before?
No, logs are always sent to STDOUT/ERR. If you run it on Windows as a service, you can check the logs: https://www.navidrome.org/docs/faq/#where-are-the-logs. If you run as a command, just try to do something like |
With the latest I do not think I saw that before, but did not really looked. And there's a ton more of those. Just pasted a small extract. For this test instance I use a open window to see the logs as they arrive when debugging. Anyway not a big deal but I could have seen the original query time. |
@deluan Ok so there's something wrong here :) This is navidrome only killing a 13900K .... on fastscrolling artists. You need to put some throttling in place or something :) It seems it tries again to rescrape all the artists. |
There should be throttling in place. Default is 100/minute, maximum NumCPUs threads in parallel. Which client are you fast scrolling on? Synfonium? I thought it pre-cached the images.... |
The 100/min does not work for sure. And don't know what the thread do but even when compiling complex things I can't max out my CPU like this. Even if the 100 min is fixed, this is probably still a little too CPU aggressive. Yes it's Symfonium, and no I do not pre cache the images without the option enabled. Since you now returns urls for all artists when scrolling it does triggers tons of requests, that what I tried to explain earlier, the new system will trigger more stuff. |
Maybe your limits are only on the getcoverart and not the /share stuff. |
This means that ND will start refusing calls (429) if the number of queued requests gets up to 100 in the last minute.
Good catch! I'll fix that. |
Beware the limits should be on the sub calls to externals services not for returning the already cached images. |
It is not 100 requests per minute. It means it enqueues requests up to 100 in a minute. If your hardware can handle more than that, it won't queue anything. See here for details: https://github.com/go-chi/chi/blob/master/middleware/throttle.go#L35 Anyway, these values are customizable, and the defaults work pretty well for my library of 1500 artists/75K songs, on an bellow average machine. |
Still not working with last build I still kill my CPU and lag. It takes like 2 or 3 seconds to go full CPU, still process thousands of queries (Most calls are cancelled by Symfonium, but I guess the call is not cancelled internally by Navidrome after). I got a few
Then after a while I got hundreds of
Then after a long time it settles down. And I've checked the metadata from the tested build says commit bcab3cc |
Can you try customizing the throttler? Options are:
Try reducing the MaxRequests to 2 (default now is numcores/3, in tour case 8)
Calls canceled by the client should not use much resources before being discarded. But I just pushed a change to be sure it is discarded immediately without further processing. |
Quickly tried to change the value this does not seem to change anything. (Not with the last one as the binaries where not up to date).
While it lags most images are still working and there's some rare error 429. But CPU seems really maxed trying to do stuff with lastFM. Still gives hundreds of
The second they stop, the CPU goes down to 0. Rescrolling again directly restart the full CPU and the errors. Seems the 24h cache is not working for the failures maybe? That's a good temp check :) |
When I just scroll a little on new part CPU goes to 100% and block all also but before it start to use all CPU I got some
So now my educated guess would be that the issue is not about the fetching of the data and returning the image but what Navidrome does when it receive the data back. All that when the client just request an image. Edit: When just scrolling a little to trigger enough last.fm scrape to go 100% but not scrolling a lot I do not get any |
Yeah, that's it. Reason why you should not work too late at night :P. But anyway, it is currently hard to decouple the similar By the way, I just checked and if EDIT: Top Songs is not done at the same time. EDIT 2: If DevArtworkMaxRequests = 2, you should not max out your cores, it should only use 2.... I'll put more logs in place to be sure the params are being read correctly. |
Can't you just delegate the actual job of handling the update from the external data to another job with queuing? You still get the data at the same time, but the handling of the data is handled in two parts. I think go have the same concept as coroutines so should be relatively easy no. Edit for your edit2. The art may be limited to 2 cores, but the data processing of the similar artist might support many threads. |
It does not, for sure.
Yes, there's a lot of goroutines and concurrency happening in Navidrome, ex: when pre-caching new added albums/artists, the scanner is parallelized, searchs for artist/album/mediafiles run in parallel, and so on.
Sure I can, but that requires a bunch of refactoring. I was thinking of having each piece of external metadata to have its own cache expiration (currently it's only one for all), so they could be queried independently. But that also requires a lot of refactoring.... As I said, I'll figure it out EDIT:
But I wouldn't be able to return the Similar Songs if I don't do the matching first. The main problem seams the matching that is really intensive, with lots of LIKE queries. |
Is there a parameter to increase the 24h cache for last FM for artist? |
It is hardcoded for now, but I can make it a config option. |
I added two new config options:
I also added logs around the similar artists mapping function, so it should show now how long it takes for each call. Look for debug messages |
Ok so :
Then it follows will all the others:
So nearly 1 minutes at full CPU before the first mapping then the others do follow, this really looks like some deadlock with all of them racing to do something. This test instance have 27K songs / 4516 artists |
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
Description
Just updated to 0.49 and the new artist image handling is not "ideal"
First issue is that artists without image returns a default image, this is not really wanted as prevent clients to display a better default image that more match the theme of the application. On dark theme for example a large white icon does not look good at all.
Second issue is related so if first is fixed this one will be too, but now even the default image urls are different per artist, forcing the client to download and cache a possibly large amount of the same image for no purpose.
Expected Behaviour
Identical image should report the same path for performance reason.
Artist without image should return no image url or eventually return a well known ID so that client can display something else that better match the interface.
Steps to reproduce
Use the subsonic API search3 endpoint for artists.
Platform information
The text was updated successfully, but these errors were encountered: