A few BookFinder improvements (including a fix for #2238) #2400

Merged
7 commits merged on Dec 10, 2023

Conversation

Contributor

mikiher commented Dec 8, 2023

The following improvements were implemented:

  • Handle "abridged/unabridged" in titleCandidate generation.
  • Clean up "et al[.]" in author normalization.
  • Sort Audible match results by ascending duration difference.
    • This required an API change, since until now /search/books had no context (it only accepted title, author, and provider). The client now passes the libraryItem id; the server retrieves the item from the database and passes it to BookFinder.search().
    • This change is fully backwards-compatible.
    • It is also good preparation for future context-based matching improvements (e.g. preferring results with the same narrator).
    • I believe this resolves #2238 (Sort Book Match Results by closest to length).
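
For readers who want a feel for the duration sort, here is a minimal sketch. It is not the code from this PR: the function name `sortByDurationDifference` and the `duration` field on each result are assumptions made for illustration, and both durations are assumed to be in the same unit. When no library item duration is available (e.g. an older client that does not send the libraryItem id), the provider's original order is kept.

```javascript
// Hypothetical sketch, not the actual BookFinder implementation.
// Assumes each result may carry a numeric `duration` in the same unit
// as the library item's duration.
function sortByDurationDifference(results, itemDuration) {
  // No usable context: return a copy in the provider's original order.
  if (!Number.isFinite(itemDuration)) return results.slice()

  // Results without a usable duration are treated as infinitely far away.
  const diff = (r) =>
    Number.isFinite(r.duration) ? Math.abs(r.duration - itemDuration) : Infinity

  // Array.prototype.sort is stable, so ties keep the provider's order.
  return results.slice().sort((a, b) => {
    const da = diff(a)
    const db = diff(b)
    if (da === db) return 0 // also avoids Infinity - Infinity (NaN)
    return da - db
  })
}
```

Sorting a copy rather than the input array keeps the provider's original ordering available to callers that still want it.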

mikiher marked this pull request as ready for review December 8, 2023 22:53
Owner

advplyr commented Dec 10, 2023

Works great, thanks!

advplyr merged commit 6abc081 into advplyr:master Dec 10, 2023
1 check passed
mikiher deleted the bookfinder-improvements branch December 10, 2023 21:57
Owner

advplyr commented Jan 16, 2024

In #2517 there is a case I didn't think of: Audible returns a bunch of results that are bad matches, but if one of those matches has a closer duration it shows up at the top.
I'm not sure what the best resolution for this is. Maybe we set a max Levenshtein distance or something.
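
To make the "max Levenshtein distance" idea concrete, here is a rough sketch of filtering out poor title matches before any duration-based sorting. The helper name `filterByTitleDistance`, the `title` field, and the `maxDistance` parameter are all hypothetical, and no particular threshold value is implied.

```javascript
// Levenshtein distance via the standard dynamic-programming recurrence,
// keeping only two rows in memory.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j)
  for (let i = 1; i <= a.length; i++) {
    const curr = [i]
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1
      curr[j] = Math.min(
        prev[j] + 1, // deletion
        curr[j - 1] + 1, // insertion
        prev[j - 1] + cost // substitution (or match)
      )
    }
    prev = curr
  }
  return prev[b.length]
}

// Hypothetical helper: keep only results whose title is within
// `maxDistance` edits of the searched title (case-insensitive).
function filterByTitleDistance(results, title, maxDistance) {
  const norm = (s) => (s || '').toLowerCase().trim()
  const target = norm(title)
  return results.filter((r) => levenshtein(norm(r.title), target) <= maxDistance)
}
```

An absolute edit distance penalizes long titles more than short ones, so a distance relative to title length may be a better fit; that choice is left open here.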

Contributor Author

mikiher commented Jan 17, 2024

So, there's a filterSearchResults function that tries to do this (but only for OpenLib results).
I tried to enable it for other providers at some point, but it had some issues and I ditched the idea at the time (I don't remember the exact details).

I'll look into it once more, and perhaps change the current implementation so it works well for all providers.

Contributor Author

mikiher commented Jan 17, 2024

But, just to be clear, even if it is better implemented, I think filterSearchResults is just a patch.

I think that for the purpose of running Quick Match on a large scale, what we really need is a confidence score for the top result. If it is above some threshold, we match automatically; if not, we mark it and let the user review. I know I want this feature badly... it's the only way I'm going to be able to get some order into my library :)
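
A minimal sketch of what such a threshold decision could look like follows. Everything in it is an assumption for illustration: the names `quickMatchDecision` and `durationScore`, the 0.9 default threshold, and the idea of deriving confidence from relative duration difference alone. A real score would likely combine several signals (title, author, narrator, duration).

```javascript
// Hypothetical confidence signal: 1 when durations are equal, falling
// linearly to 0 as the relative difference reaches 100%.
function durationScore(resultDuration, itemDuration) {
  if (!(resultDuration > 0) || !(itemDuration > 0)) return 0
  const relDiff = Math.abs(resultDuration - itemDuration) / itemDuration
  return Math.max(0, 1 - relDiff)
}

// Hypothetical decision step: auto-match the top result only if its
// confidence clears the threshold, otherwise flag it for user review.
// `confidenceOf` is a caller-supplied scoring function returning 0..1.
function quickMatchDecision(results, confidenceOf, threshold = 0.9) {
  if (!results.length) return { action: 'review', match: null, confidence: 0 }
  const top = results[0]
  const confidence = confidenceOf(top)
  const action = confidence >= threshold ? 'auto-match' : 'review'
  return { action, match: top, confidence }
}
```

Passing the scoring function in as a parameter keeps the decision logic independent of how confidence is computed, so the formula can change without touching the threshold check.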

Successfully merging this pull request may close these issues.

Sort Book Match Results by closest to length
2 participants