Web based Add Book creates malformed entries if there's no author #2689

Open
tfmorris opened this issue Dec 3, 2019 · 2 comments
Labels
Affects: Data Issues that affect book/author metadata or user/account data. [managed] Affects: UI Issues with the web site's user interface. [managed] Lead: @scottbarnes Issues overseen by Scott (Community Imports) Needs: Investigation This issue/PR needs a root-cause analysis to determine a solution. [managed] Priority: 3 Issues that we can consider at our leisure. [managed] Type: Bug Something isn't working. [managed]

Comments

@tfmorris
Contributor

tfmorris commented Dec 3, 2019

Related: #2116, internetarchive/openlibrary-client#126

When a book is entered through the web UI and the user doesn't provide an author, the author entry is incorrectly formatted as described in the two related issues.

Evidence

See the increasing counts and list of newly created works at internetarchive/openlibrary-client#126

Steps to Reproduce

I didn't attempt to reproduce, but the empirical evidence is that an additional 4,000 of these entries were created this year.

  • Actual: the record has an author object with a type but no value.
  • Expected: authors should just be an empty list instead.
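
The difference between the actual and expected records can be sketched as follows. This is only an illustration: the exact JSON shape of the malformed entry (`{"type": {"key": "/type/author_role"}}` with no `author` field) is an assumption based on the description above, and the helper names and the placeholder work key are hypothetical, not part of the Open Library codebase.

```python
from typing import Any


def has_author_value(entry: Any) -> bool:
    """Return True if an author-role entry actually points at an author."""
    if not isinstance(entry, dict):
        return False
    author = entry.get("author")
    # Accept {"author": {"key": "/authors/..."}} as well as a bare key string.
    if isinstance(author, dict):
        return bool(author.get("key"))
    return bool(author)


def drop_empty_authors(work: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the work whose authors list keeps only usable entries."""
    cleaned = dict(work)
    cleaned["authors"] = [a for a in work.get("authors", []) if has_author_value(a)]
    return cleaned


# Assumed shape of a malformed record: the entry has a type but no author value.
malformed = {
    "key": "/works/OL0W",  # placeholder key, not a real record
    "title": "Untitled",
    "authors": [{"type": {"key": "/type/author_role"}}],
}

print(drop_empty_authors(malformed)["authors"])  # prints []
```

A check like `has_author_value` could in principle run before the record is saved, so that a book added without an author ends up with `"authors": []` rather than a typed but empty entry.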
@tfmorris tfmorris added Type: Bug Something isn't working. [managed] Affects: UI Issues with the web site's user interface. [managed] Affects: Data Issues that affect book/author metadata or user/account data. [managed] labels Dec 3, 2019
@tfmorris
Contributor Author

tfmorris commented Dec 4, 2019

Since the original dataset ended in Aug 2019, I realized that there was a small chance the bug had been fixed in the intervening months, so I reran the analysis with the Nov 2019 dump. Here are some recent bad entries:

2019-11-30T18:01:33.177279	/works/OL20503723W
2019-11-29T05:00:48.115709	/works/OL20501278W
2019-11-29T02:37:49.419152	/works/OL20501231W
2019-11-27T21:49:25.668036	/works/OL20499616W
2019-11-27T19:31:31.716786	/works/OL20498934W
2019-11-27T19:31:24.291327	/works/OL20498931W
2019-11-27T19:31:22.097877	/works/OL20498930W
2019-11-27T19:31:21.283416	/works/OL20498929W
2019-11-27T19:27:06.373220	/works/OL20498902W
2019-11-27T19:27:04.043480	/works/OL20498901W
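
A list like the one above could be produced by a scan along these lines. This is a rough sketch, not the analysis that was actually run: it assumes each dump line has five tab-separated columns (type, key, revision, last modified, JSON), and that a bad entry is an `authors` item with no `author` field. The function name and the sample rows are hypothetical.

```python
import json
from typing import Iterable, Iterator


def find_authorless_works(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield (last_modified, key) for works with an author entry lacking a value."""
    for line in lines:
        fields = line.rstrip("\n").split("\t", 4)
        if len(fields) != 5:
            continue  # skip lines that do not match the assumed layout
        record_type, key, _revision, last_modified, raw_json = fields
        if record_type != "/type/work":
            continue
        record = json.loads(raw_json)
        authors = record.get("authors", [])
        if any(isinstance(a, dict) and "author" not in a for a in authors):
            yield last_modified, key


def make_row(key: str, timestamp: str, authors: list) -> str:
    """Build one made-up dump row in the assumed five-column layout."""
    body = json.dumps({"key": key, "authors": authors})
    return "\t".join(["/type/work", key, "1", timestamp, body])


# Placeholder rows for illustration only; keys and timestamps are invented.
role = {"key": "/type/author_role"}
sample = [
    make_row("/works/OL1W", "2019-01-01T00:00:00.000000", [{"type": role}]),
    make_row("/works/OL2W", "2019-01-02T00:00:00.000000",
             [{"type": role, "author": {"key": "/authors/OL1A"}}]),
]

# Newest first, matching the ordering of the list above.
for last_modified, key in sorted(find_authorless_works(sample), reverse=True):
    print(f"{last_modified}\t{key}")
```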

The most recent work has no associated editions and was created by @JeffKaplan, so perhaps he can provide more insight into what screen/mechanism was used.

@LeadSongDog

LeadSongDog commented Dec 4, 2019

@tfmorris
The first of those was left childless after https://openlibrary.org/books/OL27737159M/Reasoning_through_Romans_Part_1?b=2&a=1&_compare=Compare&m=diff
merged its sole edition into another work record. Patrons have no ready way to redirect these childless works to the merged work. It should just happen automagically.

The second was created by a new user whose username matches the stated publisher. The title seems to be a mangled version of Ưu điểm của két sắt khách sạn ("Advantages of hotel safes" in Vietnamese). It gives no author, no identifier (except OL numbers), and no other details. Pretty clearly spam/vandalism.

The third was another new user adding a CreateSpace edition. It gives no author, no identifier (except OL numbers), and no other details. Pretty clearly spam again.

The real thing to address here is why the edit interface allows creation of works without authors or, for that matter, without any other identifiers.

@xayhewalo xayhewalo added the Needs: Triage This issue needs triage. The team needs to decide who should own it, what to do, by when. [managed] label Dec 5, 2019
@cdrini cdrini added Needs: Lead Lead: @cdrini Issues overseen by Drini (Staff: Team Lead & Solr, Library Explorer, i18n) [managed] Needs: Investigation This issue/PR needs a root-cause analysis to determine a solution. [managed] and removed Needs: Lead Needs: Triage This issue needs triage. The team needs to decide who should own it, what to do, by when. [managed] labels Apr 20, 2020
@mekarpeles mekarpeles added the Needs: Triage This issue needs triage. The team needs to decide who should own it, what to do, by when. [managed] label Sep 15, 2023
@mekarpeles mekarpeles added Lead: @scottbarnes Issues overseen by Scott (Community Imports) Priority: 3 Issues that we can consider at our leisure. [managed] and removed Lead: @cdrini Issues overseen by Drini (Staff: Team Lead & Solr, Library Explorer, i18n) [managed] Needs: Triage This issue needs triage. The team needs to decide who should own it, what to do, by when. [managed] labels May 13, 2024