Unable to import specific edition from OpenLibrary #446

Closed
renatolond opened this issue Dec 31, 2020 · 6 comments

@renatolond
Contributor

Is your feature request related to a problem? Please describe.
As a Portuguese speaker, I often find that the "default" edition coming from OpenLibrary does not correspond to the edition I read. I would like to be able to import an exact edition from OpenLibrary.

Describe the solution you'd like
If I type a specific OpenLibrary Id or ISBN and it exists on OpenLibrary, I would like to use that edition instead of the default one.

Describe alternatives you've considered
It seems the current code to import the work from OpenLibrary works if I give it an edition link instead of a work, so I think it's an adaptation of the search code to be able to return an edition instead of work.

Additional context
A way to reproduce it: when looking for "OL31834925M", which corresponds to "A Oxford de Lyra", I get back the work and import an English edition of the book instead.

@mouse-reeve
Member

this is interesting because in theory this ought to work -- it should be grabbing all the editions (up to 50) from openlibrary, and while the default edition is set to an English language edition after import, all the editions should be available in the edition page (https://bookwyrm.social/book/22082/editions). I tried importing again and got an import that successfully loaded the Portuguese edition (https://bookwyrm.social/book/38595), but it didn't deduplicate the work correctly and now there are two works :( (related to #413 I bet)
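
To illustrate the default-edition behaviour described above, here is a rough sketch of choosing a default from a list of edition records by language. It assumes each record carries a `languages` list of `{"key": "/languages/<code>"}` entries, which is my understanding of OpenLibrary's edition JSON; `pick_default_edition` and `preferred_languages` are hypothetical names, not bookwyrm code.

```python
def pick_default_edition(editions, preferred_languages=("eng",)):
    """Return the first edition in a preferred language, else the first edition.

    `editions` is a list of dicts shaped like OpenLibrary edition records
    (assumption: a "languages" list of {"key": "/languages/<code>"} entries).
    """

    def language_codes(edition):
        # "/languages/por" -> "por"
        return {
            lang["key"].rsplit("/", 1)[-1]
            for lang in edition.get("languages", [])
        }

    for code in preferred_languages:
        for edition in editions:
            if code in language_codes(edition):
                return edition
    # No edition matched a preferred language: fall back to the first one
    return editions[0] if editions else None


# Illustrative records only; the titles and keys come from this thread,
# the language tags are assumed for the example
editions = [
    {"key": "/books/OL9661645M", "title": "Lyra's Oxford",
     "languages": [{"key": "/languages/eng"}]},
    {"key": "/books/OL31834925M", "title": "A Oxford de Lyra",
     "languages": [{"key": "/languages/por"}]},
]
default_en = pick_default_edition(editions)
default_pt = pick_default_edition(editions, preferred_languages=("por", "eng"))
```

With a per-user language preference, the same list of editions would yield the Portuguese edition as the default instead of the English one.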

It's frustrating that OpenLibrary's search shows the English language edition as the search result, but I'm not sure how to fix that for the time being. I definitely can play with how bookwyrm search works to try to get it to show the correct edition in this search: https://bookwyrm.social/search/?q=a+oxford+de+lyra -- I think I know why it's choosing the default edition rather than the specific work title here, based on how search result deduplication works.

so the problems I can address seem to be:

  1. for an unknown reason, the Portuguese edition failed to import initially
  2. re-importing created a duplicate work and editions rather than deduplicating
  3. bookwyrm search ranking conflated the closer match on the title "A Oxford de Lyra" with the less close (but still within the certainty margin) "Lyra's Oxford" and merged them into one search result, which hides the correct edition

does that sound right? given that the correct edition does exist now, are there ways (in addition to tweaking search) that accessing it could be easier?

@mouse-reeve
Member

okay, #452 and #450 should fix the problems with the data being imported incorrectly. tweaking search is still a to-do

@mouse-reeve mouse-reeve self-assigned this Jan 1, 2021
@renatolond
Contributor Author

Hi!

Thanks for looking into this so quickly!
Sorry, you were right, I had misdiagnosed the issue. I did some tests by recreating my local instance, and on the latest main these seem to be the correct steps to reproduce it:

  1. Import the attached Goodreads export CSV (it only contains Oxford de Lyra)
  2. The import works, but only one edition is imported (and it's not the Portuguese one, it's https://openlibrary.org/books/OL9661645M/Lyra's_Oxford)

I think the import task does not call "load_more_data" for the editions it imports

Regarding the search, yeah, the results come in an odd order :S
Maybe there could be a "look by id" option in the search? There are similar options in trakt.tv, where instead of a regular search, you can search by IMDb id or TVDB id. Maybe in this case there could be an option to look up by ISBN or by an edition id, and then it could use the books API instead: https://openlibrary.org/dev/docs/api/books
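
A minimal sketch of what such a "look by id" path could look like: classify the query first, and only fall back to text search when it is not an identifier. The helper names (`classify_query`, `lookup_url`) are made up for illustration, and the URL shapes are my reading of OpenLibrary's public JSON endpoints rather than anything bookwyrm does today.

```python
import re

OPENLIBRARY = "https://openlibrary.org"


def classify_query(query):
    """Return (kind, value), where kind is 'edition', 'work', 'isbn' or 'text'."""
    q = query.strip()
    # OpenLibrary keys look like OL31834925M (edition) or OL<digits>W (work)
    match = re.fullmatch(r"OL\d+([MW])", q, flags=re.IGNORECASE)
    if match:
        kind = "edition" if match.group(1).upper() == "M" else "work"
        return kind, q.upper()
    # ISBNs are 10 or 13 characters once hyphens and spaces are removed;
    # this only checks the shape, not the check digit
    compact = re.sub(r"[-\s]", "", q).upper()
    if re.fullmatch(r"\d{9}[\dX]|\d{13}", compact):
        return "isbn", compact
    return "text", q


def lookup_url(query):
    """Return a direct-lookup URL for identifier queries, or None for free text."""
    kind, value = classify_query(query)
    if kind == "edition":
        return f"{OPENLIBRARY}/books/{value}.json"
    if kind == "work":
        return f"{OPENLIBRARY}/works/{value}.json"
    if kind == "isbn":
        return f"{OPENLIBRARY}/isbn/{value}.json"
    return None  # caller falls back to the regular title/author search
```

For example, `lookup_url("OL31834925M")` points at the specific edition record, while `lookup_url("a oxford de lyra")` returns `None` and the regular search takes over.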

I think I mixed a lot of issues here, I can break this down into multiple issues to make this easier to track :)

@mouse-reeve
Member

oh you're right about load more data not being called, I keep refactoring things around and that got lost in the shuffle. it'll be so nice to finish this bit: #461

I didn't realize that it was possible to search isbn directly in openlibrary 😂 that will make search and import lookup a lot tidier. In bookwyrm, you should be able to search by openlibrary key or isbn, but per #454, the isbn data isn't getting populated as well as it should for some reason

btw I really appreciate working through this with you! it's so helpful in finding bugs and weirdnesses in how managing the book data works

@mouse-reeve
Member

#466 should (among some other improvements) prefer the best matching edition of a work, rather than the default edition of a work. it also makes searches on unique identifiers like isbn behave as direct lookups instead of trigram vectors.
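
As a rough illustration of "best matching edition" (not the code in #466), the sketch below scores each edition title against the query with a simple trigram overlap and keeps the highest-scoring edition per work. The similarity function is a pure-Python stand-in rather than the database's trigram matching, and `best_editions`, the `work`/`title` fields, and the `threshold` value are all assumptions made for the example.

```python
def trigrams(text):
    """Set of 3-character windows over the lowercased, space-padded text."""
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a, b):
    """Jaccard overlap of the two trigram sets, between 0.0 and 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


def best_editions(query, editions, threshold=0.1):
    """Return one edition per work: the one whose title best matches the query.

    `editions` is a list of dicts with hypothetical "work" and "title" keys.
    Results are ordered from best to worst match.
    """
    best = {}
    for edition in editions:
        score = similarity(query, edition["title"])
        if score < threshold:
            continue  # outside the certainty margin
        current = best.get(edition["work"])
        if current is None or score > current[0]:
            best[edition["work"]] = (score, edition)
    ranked = sorted(best.values(), key=lambda pair: pair[0], reverse=True)
    return [edition for _, edition in ranked]


editions = [
    {"work": "W1", "title": "Lyra's Oxford"},
    {"work": "W1", "title": "A Oxford de Lyra"},
]
results = best_editions("a oxford de lyra", editions)
```

Both editions belong to the same work, so only one result comes back, and it is the edition whose title matches the query most closely instead of whichever edition happens to be the default.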

@mouse-reeve
Member

I created #467 for the openlibrary api, and with that I think everything here is either addressed in a pr or in another issue. and thank you again!
