Periodically update WoS records #1594

Open
peetucket opened this issue Apr 7, 2023 · 0 comments


peetucket commented Apr 7, 2023

See also #87 and #179, which are related.

Web of Science periodically fixes data problems such as typos and errors in identifiers (for example, DOIs). Since we never update our source data once it has been harvested, those typos and bad DOIs remain in our records indefinitely. One result is broken DOI links, of which we have ~1400 as of April 2023.

Because we reuse previously harvested publication data where possible when adding the same publication to a new author Profile, the publication keeps the old cached data even when a new author harvests it.

This task would involve periodically pulling updated WoS data for all our publications and refreshing our cached data (source records and pub-hash), or at least some portion of it (such as just the identifiers).
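As a rough illustration of the "just the identifiers" option, here is a minimal sketch in Python. It is not the application's actual code: the function name `refresh_identifiers`, the `identifiers` key, and the `doi`/`pmid` field names are all assumptions made for the example, and the real source record and pub-hash layouts may differ. It copies changed identifier values from a freshly pulled record into the cached one and reports what changed.

```python
from typing import Dict, Optional, Tuple


def refresh_identifiers(
    cached: dict,
    fresh: dict,
    keys: Tuple[str, ...] = ("doi", "pmid"),  # hypothetical identifier field names
) -> Dict[str, Tuple[Optional[str], str]]:
    """Update identifier fields in `cached` from `fresh`.

    Returns a mapping of identifier name to (old_value, new_value) for every
    identifier that changed. A value missing from the fresh record never
    overwrites an existing cached value.
    """
    changes: Dict[str, Tuple[Optional[str], str]] = {}
    cached_ids = cached.setdefault("identifiers", {})
    fresh_ids = fresh.get("identifiers", {})
    for key in keys:
        new_value = fresh_ids.get(key)
        old_value = cached_ids.get(key)
        if new_value and new_value != old_value:
            changes[key] = (old_value, new_value)
            cached_ids[key] = new_value
    return changes
```

Returning the set of changes, rather than silently overwriting, would make it possible to log or review what a refresh run actually modified.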

Note that this could have side effects, since we would be changing data for publications that have already been harvested. It could:

  1. change how publications appear on users' profiles, even after they have been approved
  2. cause larger-than-expected nightly change updates when the Profiles API updates publications via our API
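One possible way to keep the second side effect bounded, which this issue does not itself propose, would be to refresh only a fixed-size batch of the least recently refreshed publications on each run. The sketch below assumes each publication can be represented as an `(id, last_refreshed_at)` pair; the function name `select_refresh_batch` and that representation are hypothetical.

```python
from datetime import datetime
from typing import Iterable, List, Tuple


def select_refresh_batch(
    records: Iterable[Tuple[str, datetime]],  # hypothetical (id, last_refreshed_at) pairs
    batch_size: int,
) -> List[str]:
    """Return the ids of the `batch_size` least recently refreshed records."""
    oldest_first = sorted(records, key=lambda record: record[1])
    return [record_id for record_id, _ in oldest_first[:batch_size]]
```

With a cap like this, the number of publications that can change in a single night is at most `batch_size`, at the cost of taking longer to cycle through the whole collection.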

Note that some data problems are likely never fixed in Web of Science source records, and this work would thus have no impact on records with persistent bad metadata.
