When scraping, comics from Imprints having primary Publisher recorded in PUBLISHER table. #205

Closed
bareheiny opened this issue Mar 28, 2020 · 4 comments
Labels
bug Something isn't working critical Critical issue metadata Issues related to comic metadata

@bareheiny

bareheiny commented Mar 28, 2020

Publishers and imprints are causing primary key violations when scraping.

Depending on the order of scraping, Wildstorm is being recorded as DC Comics and Icon Comics is being recorded as Marvel.

This causes primary key violations when attempting to scrape either a true DC or Marvel comic.
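To illustrate the failure mode, here is a minimal Python sketch using an in-memory SQLite table. The `publisher` table, its columns, and the ID values are hypothetical and are not ComiXed's actual schema; the sketch only shows how storing an imprint under its parent's name can collide with the parent's own row.

```python
import sqlite3

# Hypothetical schema for illustration; NOT ComiXed's real PUBLISHER table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE publisher (name TEXT PRIMARY KEY, source_id INTEGER)")


def record_publisher(name, source_id):
    """Insert a publisher row; return True on success, False on a key collision."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO publisher (name, source_id) VALUES (?, ?)",
                (name, source_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# An imprint (Wildstorm) is scraped first but stored under the parent's name.
first = record_publisher("DC Comics", 1)   # made-up source id
# A true DC comic scraped afterwards collides on the primary key.
second = record_publisher("DC Comics", 2)  # made-up source id
```

Under these assumptions the first insert succeeds and the second is rejected, which matches the order-dependent behaviour described above.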

@bareheiny bareheiny changed the title Publisher - Wildstorm being recorded as DC, causing primary key violation. When scraping, comics from Imprints having primary Publisher recorded in PUBLISHER table. Mar 28, 2020
@mcpierce mcpierce added bug Something isn't working metadata Issues related to comic metadata labels Mar 28, 2020
@mcpierce mcpierce added this to the 0.6 milestone Mar 28, 2020
@mcpierce mcpierce added the critical Critical issue label Mar 28, 2020
@mcpierce mcpierce self-assigned this Mar 28, 2020
@mcpierce
Contributor

@bareheiny Yep, identified the issue. When the CV scraper sees that the publisher is one of the known imprints, it replaces the name with the parent publisher's name, which is causing the error.

Fixing it now.
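One possible way to avoid the substitution is to keep the scraped name and record the parent separately. The Python sketch below is only an illustration: `IMPRINT_PARENTS`, `PublisherInfo`, and `resolve_publisher` are hypothetical names, and this is not the code from the actual fix.

```python
from typing import NamedTuple, Optional

# Hypothetical imprint-to-parent map; the two entries come from this issue.
IMPRINT_PARENTS = {
    "Wildstorm": "DC Comics",
    "Icon Comics": "Marvel",
}


class PublisherInfo(NamedTuple):
    name: str                    # the name exactly as scraped
    parent_name: Optional[str]   # parent publisher, if the name is a known imprint


def resolve_publisher(scraped_name: str) -> PublisherInfo:
    """Keep the scraped name as-is; attach the parent only as extra information."""
    return PublisherInfo(scraped_name, IMPRINT_PARENTS.get(scraped_name))
```

With this shape, `resolve_publisher("Wildstorm")` keeps `Wildstorm` as the name and carries `DC Comics` alongside it, so an imprint never takes over the parent's row.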

mcpierce added a commit to mcpierce/comixed that referenced this issue Mar 28, 2020
@mcpierce
Contributor

@bareheiny I've pushed a PR that should fix this issue. When the prerelease is created, please test it and if it works you can close this ticket.

@bareheiny
Author

Looks good to me.

It does raise a question about correcting ComicVine data once a comic has been scraped - but for now, I'm just going to modify the record via SQL.

@mcpierce
Contributor

@bareheiny If there's a change in the CV data, rescraping the comic replaces/overwrites the old data.
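As a rough illustration of overwrite-on-rescrape, the Python sketch below replaces a stored row when the same comic is saved again. The `comic` table, its columns, and the values are hypothetical; this is not ComiXed's persistence code.

```python
import sqlite3

# Hypothetical table for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comic (comic_id INTEGER PRIMARY KEY, publisher TEXT)")


def save_scraped(comic_id, publisher):
    """Store scraped metadata, replacing any existing row for the same comic."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO comic (comic_id, publisher) VALUES (?, ?)",
            (comic_id, publisher),
        )


def stored_publisher(comic_id):
    """Return the stored publisher for a comic, or None if it is not stored."""
    row = conn.execute(
        "SELECT publisher FROM comic WHERE comic_id = ?", (comic_id,)
    ).fetchone()
    return row[0] if row else None


save_scraped(1, "DC Comics")   # first scrape, with the wrong publisher
save_scraped(1, "Wildstorm")   # rescrape overwrites the old value
```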
