Duplicate title entries after library update #547
Comments
A full wipe cache & scan would do without moving files, wouldn't it? |
I just tried this - now all my favorites are gone. :( And yes, it fixes duplicate entries, but it usually takes longer (2.5 h) than two rescans (7 minutes per rescan for 215,000 titles): just the database deletion takes as long as one rescan. And it would seem weird to me if multiple library entries exist (within the same library, of course) that refer to one and the same file. Maybe a library consistency check that takes care of duplicate entries for the same file would be helpful. |
I would love to see this one fixed, it's been a long standing problem I noticed too. I use the same workaround (renaming the directory) - and a full rescan is not really a practical option for those of us with large libraries. Feels like there ought to be an easy solution in the scanner - as @schnillerman suggests, a consistency check or some such |
Could one of you please outline how that easy consistency check would work? |
Hi Michael,
totally understand your question :)
As a very inexperienced programmer (if one at all), I would probably check for duplicate entries in the table where the full path and filename are stored.
If one and the same file is listed more than once, that's a good indicator that it's registered as a duplicate.
Cheers, Till
|
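A minimal sketch of the duplicate-path check described above. It assumes a `tracks` table with a `url` column holding the full path; these names, and the toy data, are assumptions for illustration and have not been checked against the real library.db schema. It runs against an in-memory SQLite database:

```python
import sqlite3

def find_duplicate_paths(conn):
    """Return (url, count) rows for every path that is stored more than once."""
    return conn.execute(
        "SELECT url, COUNT(*) FROM tracks "
        "GROUP BY url HAVING COUNT(*) > 1 ORDER BY url"
    ).fetchall()

# Toy data using a hypothetical schema, not the real library.db layout.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (id INTEGER PRIMARY KEY, url TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO tracks (url, title) VALUES (?, ?)",
    [
        ("file:///music/a/01.mp3", "One"),
        ("file:///music/a/02.mp3", "Two"),
        ("file:///music/a/02.mp3", "Two"),  # same file registered twice
    ],
)
duplicates = find_duplicate_paths(conn)
print(duplicates)
```

A check like this only catches byte-identical paths; entries that differ in upper/lower case would need a normalised comparison.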
What seems to be happening is this: when you change a file within an album somehow, or add a new file to the album, and then do a new/changed scan, the scan finds the new file and creates it within a new album, so you end up with two duplicated albums:
1 - the original album with the unchanged files but not the changed/new
2 - the new album with just the changed/new file and none of the unchanged files
So the logic needs to be something like
1. new file is found
2. read album tag
3. read the folder path
4. does an album with the same name exist with the same folder path?
5. if yes then add the file to the existing album
6. if no - carry on as before and create a new album
|
You're right, I remember now that a new album is mostly created in this case.
|
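The lookup steps proposed above could be sketched roughly as follows. This is only an illustration of the idea, not the scanner's code: the `assign_album` helper, the dictionary standing in for the albums table, and the id allocation are all hypothetical.

```python
import os

def assign_album(albums, track_path, album_tag):
    """Return the album id for a new track.

    Reuses an album with the same name and the same folder if one exists,
    otherwise creates a new one (steps 4 to 6 of the proposal).
    """
    key = (album_tag, os.path.dirname(track_path))
    if key not in albums:
        albums[key] = len(albums) + 1  # hypothetical id allocation
    return albums[key]

albums = {}  # (album name, folder path) -> album id
first = assign_album(albums, "/music/Artist/Album/01.flac", "Album")
second = assign_album(albums, "/music/Artist/Album/02.flac", "Album")
other = assign_album(albums, "/music/Artist/Other/01.flac", "Album")
print(first, second, other)
```

Keying on the folder as well as the name keeps two different albums that happen to share a title apart, at the cost of splitting one album that is spread over several folders.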
Now here's the problem: the reason why a regular scan is so much faster than a full wipe & rescan is that the former only deals with changed items and doesn't do these kinds of optimisations and checks. Any additional check will slow it down. In order to keep things as fast as possible, we have to be sure what we're talking about. The issue subject line says "Duplicate title entries". The description says "duplicates (in the same album)". And the latest suggestion is about duplicated albums. Maybe both are valid. And I'm pretty sure complaints about genres have been heard, too... I fear that fixing all of this would take more time than I currently have. |
Oh, and artists: #704 |
Thanks Michael - appreciate that it's probably a lot of effort. If I get the chance I might set up a test rig and do some proper documentation of the issues/scenarios. I don't know Perl so I couldn't do anything with the scanner, but I could at least work out the SQL queries that ID the culprits. As for the scan time - I had the exact same thought. It would really need to be a separate scan option for occasional use: a "tidy/remove duplicates scan" or something. In actual fact I'm more than happy with all the LMS functionality these days and the only thing left which bugs me is the way the new/changed scan can make a mess of DB integrity. I'd actually really love a UI that allowed me to query and tidy up my music DB without the inconvenience of a full drop and rescan, but I know that's dreamland :) |
Could both of you please describe what tag you'd change (artist, album, title...), and what the outcome would be? I think I've identified one issue if you changed some tracks' artist names without getting rid of the original artist name (eg. different artists of the same name, you rename only one of them). This could likely cause empty albums in the original artist's collection (see #704). |
Would #705 be a duplicate of this issue? |
Those affected by the file renaming issue: what OS are you using? |
Happens when I change attributes like
If the file name upper/lower case is changed, it happens as well. I'm running LMS on a Linux Debian. |
I think I've identified the cause of the duplication in case of a file name case change. See #705 (comment). There's some background information, and how you might be able to work around / fix this until I have a fix in LMS. |
Could you please give the 8.3 nightly a try (https://downloads.slimdevices.com/nightly/?ver=8.3)? I applied a few changes to the scanner. I'm no longer able to get invalid records after
|
I just installed 8.3 over 8.2 and will have a look! Do I need to perform a complete re-scan? |
I loaded the nightly and did the same tests - works OK for me too (on Raspbian 10 Buster/Max2Play). The duplicate albums still get created though if you fundamentally change a filename (other than a case change) - or add new files to the album folder - then run a new/changed rescan. Does that need to be raised as a separate issue to keep things clear? |
Thank you so much for mentioning this behavior - I forgot that this happens to me a lot, too, because I've been working around this by temp_renaming the updated folder, scanning, re-naming again, re-scanning! |
Just did a test: added some new files to an existing album folder. I realise that adding this integrity step to a new/changed files rescan will slow things down, but maybe not too much? After all, it only needs to be run against the new files discovered. If you put the cover art in each album folder, then the SQL to ID the existing duplicates is quite simple, because although it allocates a new album id to the new files, the value for cover (which is essentially the path to the cover.jpg) is the same for both the new and existing albums.
Not sure how this works for people who use embedded cover art though |
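The query from this comment did not survive in the thread, so the following is only a sketch of the idea it describes, not the original SQL. It assumes a `tracks` table with an `album` id column and a `cover` column holding the path to the folder's cover.jpg; both the schema and the data are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data, not the real library.db layout.
conn.execute(
    "CREATE TABLE tracks (id INTEGER PRIMARY KEY, url TEXT, album INTEGER, cover TEXT)"
)
conn.executemany(
    "INSERT INTO tracks (url, album, cover) VALUES (?, ?, ?)",
    [
        ("file:///music/X/01.mp3", 1, "/music/X/cover.jpg"),
        ("file:///music/X/02.mp3", 1, "/music/X/cover.jpg"),
        ("file:///music/X/03.mp3", 7, "/music/X/cover.jpg"),  # new file, new album id
        ("file:///music/Y/01.mp3", 2, "/music/Y/cover.jpg"),
    ],
)

# Cover paths that are shared by tracks belonging to more than one album id.
split_folders = conn.execute(
    "SELECT cover, COUNT(DISTINCT album) FROM tracks "
    "WHERE cover IS NOT NULL "
    "GROUP BY cover HAVING COUNT(DISTINCT album) > 1"
).fetchall()
print(split_folders)
```

As the comment says, this relies on folder-based cover art; with embedded artwork the `cover` value would presumably not be a shared path, so the grouping would not work the same way.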
OK - just been digging some more and that SQL is not ideal, as it also finds occurrences where you have files in the same album folder but with different album tags. That's just bad tagging/mistakes, so handy for IDing where your library is messed up, but not a definitive ID of where the new/changed scan problem has happened.
I also worked out this SQL query on the albums table:
select A.title, A.id, C.name, B.artwork
from albums A, contributors C
join albums B
on A.title = B.title
and A.contributor = C.id
and A.contributor = B.contributor
and A.year = B.year
and A.artwork <> B.artwork
group by A.title, A.artwork
This IDs where a duplicate album name has the same artist and year BUT a different coverart hash - which also pulls out records where the new/changed scan problem has happened.
BUT it also IDs other issues, like where you have moved a file to a different album but not changed the album tag - so again bad tagging.
Neither of these queries takes bad tagging into account, so they are only useful for manually interrogating libraries for bad integrity - not the new/changed scan problem. |
OT - is your nick from Bob's Burgers? 😂
|
haha - no, but I did laugh when I saw that episode |
The duplicate albums still get created though if you fundamentally
change a filename (other than a case change) - or add new files to the
album folder - then run a new/changed rescan. Does that need to be
raised as a separate issue to keep things clear?
Could you please provide step-by-step instructions on what I need to do to reproduce the problem?
|
What I usually did in order to produce the duplicate DB entries (but with 8.3, the behavior seems to be different):
As I mentioned above, this behavior seems to be different with LMS 8.3:
It seems in LMS 8.3 now it works as expected. But what about same albums with different years? (They sometimes exist, e.g. re-releases, and the release info is only present in comment tag)? |
Just done the same test as above with v8.3 and confirm the same result. So the problem is now just with the album tag. If I add new tracks into an existing album folder - even if the album tag is identical to the existing album tags in the same folder - it still creates a new duplicate album in the db for the new tracks. To test:
The behaviour is sort of understandable, as the existing tracks aren't new or changed, but the folder contents have changed. I don't know how to fix it - maybe the scan needs to look for new/changed subfolders (date modified on the folder) and rescan the whole folder? Not sure if either of these are viable. |
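The folder date-modified idea could look something like this minimal sketch. The `folders_changed_since` helper and the demo folder names are hypothetical, and nothing here reflects how the LMS scanner actually walks the library:

```python
import os
import tempfile

def folders_changed_since(root, last_scan):
    """Yield directories under root whose modification time is newer than last_scan."""
    for dirpath, _dirnames, _filenames in os.walk(root):
        if os.path.getmtime(dirpath) > last_scan:
            yield dirpath

# Demo with two fake album folders and hand-set timestamps.
with tempfile.TemporaryDirectory() as root:
    old_album = os.path.join(root, "old_album")
    new_album = os.path.join(root, "new_album")
    os.mkdir(old_album)
    os.mkdir(new_album)
    os.utime(old_album, (1000, 1000))  # untouched since before the last scan
    os.utime(new_album, (3000, 3000))  # something was added after the last scan
    os.utime(root, (1000, 1000))
    changed = sorted(
        os.path.basename(p) for p in folders_changed_since(root, last_scan=2000)
    )
print(changed)
```

One caveat: on typical filesystems a directory's modification time changes when entries are added, removed or renamed, but not when an existing file is rewritten in place, so this would only complement the per-file check rather than replace it.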
This is working as expected here. Are you 100% certain album and artist information are absolutely identical? No upper/lower case issues? No whitespace? Would you mind sharing the library.db with such an issue with me? |
Why would you want to install the previous version? It's fixed in 8.3, not 8.2. But to answer your question: yes, you can go back and forth as you like. |
Sorry, Michael - the problem of duplicate albums by adding files to a folder or capitalization changes does not seem to be an issue in 8.3 anymore - at least from what I tried. |
Would you mind sharing your library.db (with the above duplicate albums in it!)? https://www.dropbox.com/request/T3RctyzGgNg0oFDubq6a Without the database it's hard to tell what's going on there. |
Will do - have tidied up the duplicates from yesterday so I will create a new test example and document for you, then upload my library.db and screenshots etc |
Hmm - I now seem to have corrupted my library and it's triggered a full rescan - not ideal! On the positive side, I think I've narrowed down the exact circumstances in which the issue now occurs. Most of the error modes from older versions seem to have been fixed - which is great. While I wait to get my library back - can you try this:
Run a new/changed scan. What happens? For me I get 2 new albums created (one with the existing tracks and one with just the new track). |
Thanks @bobbydriver! I received your files and will investigate. Can you confirm you're using the latest LMS 8.3? |
Oh, I think I know what's going on: new tracks are scanned before the updated tracks. The new tracks therefore create a new album, because their album doesn't exist yet. Only once that's done the modified tracks would be updated. And as they already exist, the album referenced in the track would be updated, rather than the track linked to a new album. This causes the previous album to become a duplicate of the new one. That might become tricky to fix. |
Ah OK - that makes sense, not sure how you fix that. I guess if it scanned updated files before new files, that would bring its own problems? And yes - I am on the latest 8.3 nightly (if it matters). |
Yes, changing the processing order is the most obvious approach I'll investigate first. |
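To illustrate why the processing order matters, here is a toy model of the behaviour described above. It is a deliberate simplification with made-up data: new tracks look up an album by title and create one if none exists, while changed tracks rename the album they already reference. None of the names correspond to real scanner code.

```python
def run_scan(new_first):
    albums = {1: "Old Title"}            # album id -> title
    tracks = {"a.flac": 1, "b.flac": 1}  # track path -> album id

    def add_new(path, title):
        # New track: reuse an album with this title, else create one.
        for album_id, name in albums.items():
            if name == title:
                tracks[path] = album_id
                return
        new_id = max(albums) + 1
        albums[new_id] = title
        tracks[path] = new_id

    def update_changed(path, title):
        # Changed track: the album it already references is renamed in place.
        albums[tracks[path]] = title

    new = [("c.flac", "New Title")]
    changed = [("a.flac", "New Title"), ("b.flac", "New Title")]
    steps = [(add_new, new), (update_changed, changed)]
    if not new_first:
        steps.reverse()
    for handler, items in steps:
        for path, title in items:
            handler(path, title)
    return albums, tracks

albums_new_first, _ = run_scan(new_first=True)
albums_changed_first, _ = run_scan(new_first=False)
print(albums_new_first)      # two albums with the same title
print(albums_changed_first)  # a single album
```

In this model, handling the changed tracks first means the renamed album already exists by the time the new track looks for it, so no second album is created.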
Please let me know should you encounter any new side-effects. Thanks for your help identifying this long-standing issue, @bobbydriver! |
@mherger, GREAT that you fixed this, too! This has been an evergreen for me as well (it always happened when there was a new bonus edition of an album and I added the new bonus tracks to it - and that happens often nowadays! :) ) |
Thanks Michael! Testing it tonight. will let you know |
Looking good to me - the problem is gone. I didn't think this would ever get fixed so THANK YOU so much! |
Good to know! Sometimes it needs a fresh mind to look into these old issues 😉. |
Did you change the artist name in tag, folder and file name all at the same time? I haven't tried all three at once yet. Did you completely delete library.db (not just wipe its content) in the past week? Some of the new behaviour requires some table schema to be updated/re-created from scratch. Yes, I'd be interested in your library.db in its broken state: https://www.dropbox.com/request/T3RctyzGgNg0oFDubq6a |
I did not delete library.db, however did a full re-scan before I changed the files as described above. Just dropped the library.db. Now re-scanning with all files named library.* renamed and LMS restarted (triggered a re-scan). I did the following:
|
Thanks for the uploaded file. As you confirmed it's not using the latest schema. It would still do case sensitive comparisons under certain circumstances. |
I'll keep you updated as duplicates occur. For now, as the others already said: Huge thank you for dealing with this issue. |
This has been working fine with everything I've thrown at it in terms of folder/file changes so far, but today found a test case that still gives duplication
Can you recreate this? I'm on today's latest nightly |
Please whenever you encounter such issues put aside a copy of your server.log, scanner.log and library.db and send me a copy. This would greatly help me to better understand what's going on. |
Ok, I've been able to reproduce this... Argh... |
…ust rename its album, but have to merge it with the other tracks of the album, before (potentially) deleting the old album entry.
You wouldn't need to touch any of the other tracks. "fixing" one track's album name would be enough to trigger this issue: on the initial scan two albums would be created for those tracks. When we fix the album title, LMS would simply rename the second album's title, but not try to merge all tracks of that album. The latest commit attempts to fix this situation: it would re-assign the fixed track to the existing correct album, then remove the old, incorrect album (if no track was left on it). Please give the next build another try. And thanks for your reports! |
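The merge behaviour described here could be sketched as follows. This is an illustration under assumed data structures (plain dictionaries standing in for the albums and tracks tables, with made-up titles), not the code from the commit:

```python
def fix_track_album(albums, tracks, path, new_title):
    """Move one track to the album matching its corrected title.

    albums: album id -> title, tracks: track path -> album id.
    """
    old_id = tracks[path]
    # Look for another album that already carries the corrected title.
    target = next(
        (i for i, title in albums.items() if title == new_title and i != old_id),
        None,
    )
    if target is None:
        # No existing album to merge into: create one for this track.
        target = max(albums) + 1
        albums[target] = new_title
    tracks[path] = target
    # Remove the old album only if no track is left on it.
    if old_id not in tracks.values():
        del albums[old_id]

# The initial scan created a second album because of one mistyped album tag.
albums = {1: "Abbey Road", 2: "Abey Road"}
tracks = {"01.flac": 1, "02.flac": 1, "03.flac": 2}
fix_track_album(albums, tracks, "03.flac", "Abbey Road")
print(albums, tracks)
```

Simply renaming album 2 would have left two albums called "Abbey Road"; re-assigning the track and then dropping the emptied album avoids that.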
Tested at my end and that seems to have fixed it - thank you! Will pipe up again if I find any more test fail scenarios, but fingers crossed that this is done now |
Whenever I update files that are registered in the library, a few of them are registered as duplicates (in the same album) in the library after a re-scan.
The nature of the file update can be just (mp3-) tag updates, but also file renaming (directory name remains the same).
The only solution I have found to this is to rename or move the album's directory, re-scan, rename/move it back to its original state and do a second rescan.
Version: 8.2.0 - 1614990095 @ Sat Mar 6 01:43:25 CET 2021
This happens in earlier 8.x and 7.x versions as well.