Feature: remove suspect publish_date values from AMZ/BWB #9009

@scottbarnes scottbarnes commented Apr 2, 2024

Closes #8969

Certain import sources, such as Amazon and BWB, have `publish_date` values that are known to be suspect, such as `1900` or `January 1, 1900`.

This commit removes the `publish_date` value from those import records prior to import.

Specifically, the `publish_date` value is removed when both of the following conditions hold:

  1. The import record has a `source_record` value starting with `amazon`, `bwb`, or `promise`; and
  2. the `publish_date` is either `1900` or `January 1, 1900`.
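
For reviewers, the two conditions above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function name, the constant names, and the assumption that the record stores its sources as a list under `source_records` are all hypothetical.

```python
from typing import Any

# Hypothetical names for illustration only; not taken from the PR diff.
SUSPECT_PUBLISH_DATES = {"1900", "January 1, 1900"}
SUSPECT_SOURCE_PREFIXES = ("amazon", "bwb", "promise")


def remove_suspect_publish_date(rec: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `rec` without `publish_date` if both conditions hold.

    Assumes (hypothetically) that sources are a list of strings stored
    under the `source_records` key, e.g. ["amazon:0123456789"].
    """
    source_records = rec.get("source_records", [])

    # Condition 1: at least one source record starts with a suspect prefix.
    from_suspect_source = any(
        source.startswith(SUSPECT_SOURCE_PREFIXES) for source in source_records
    )
    # Condition 2: the publish date is one of the known placeholder values.
    has_suspect_date = rec.get("publish_date") in SUSPECT_PUBLISH_DATES

    if from_suspect_source and has_suspect_date:
        # Build a new dict so the caller's record is left untouched.
        return {key: value for key, value in rec.items() if key != "publish_date"}
    return dict(rec)
```

In this sketch a record from another source keeps its `1900` date, and a record from Amazon or BWB with any other date is left unchanged; only the combination of both conditions drops the field.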

Questions

I want to verify that the above conditions for removal are correct.

Testing

See the unit tests.

Stakeholders

@mekarpeles
@judec
@hornc

@mekarpeles mekarpeles self-assigned this Apr 5, 2024
@mekarpeles mekarpeles merged commit 1d5cb58 into internetarchive:master Apr 5, 2024
3 checks passed
@scottbarnes scottbarnes deleted the feature/8969/stop-importing-from-aws-and-bwb-if-publish-date-is-1900 branch April 5, 2024 16:50
Successfully merging this pull request may close these issues.

Don't import from Amazon or BWB promise items if publish_date is 1900