Wikidata importer from ID #13

Open
LPorcaro opened this issue Mar 29, 2021 · 1 comment

@LPorcaro
Contributor

There seems to be an inconsistency among the methods for importing data from Wikidata.
In
https://github.com/trompamusic/ce-data-import/blob/master/ceimport/loader.py#L294
the loader uses the Wikidata URL to get the person info,

while in
https://github.com/trompamusic/ce-data-import/blob/master/ceimport/loader.py#L460
it uses the Wikidata ID.

I assume that every method should take the Wikidata URL as input.
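
To illustrate the mismatch, here is a minimal sketch of a helper that accepts either form and returns both. The name `normalize_wikidata_ref` and the canonical URL prefix are assumptions for illustration only; nothing like this is known to exist in `ceimport`.

```python
import re

# Assumed canonical form for entity page URLs (illustrative choice).
WIKIDATA_ENTITY_PREFIX = "https://www.wikidata.org/wiki/"
_QID_RE = re.compile(r"Q[1-9]\d*")


def normalize_wikidata_ref(value):
    """Hypothetical helper: accept a bare Wikidata ID or an entity URL.

    Returns a (qid, url) tuple so callers can use whichever form they need.
    """
    # Take the last path segment; a bare ID has no slash and is kept as-is.
    candidate = value.strip().rstrip("/").rsplit("/", 1)[-1]
    if not _QID_RE.fullmatch(candidate):
        raise ValueError(f"not a Wikidata ID or entity URL: {value!r}")
    return candidate, WIKIDATA_ENTITY_PREFIX + candidate
```

With something like this at the entry point of each loader method, a caller could pass `"Q1339"` or `"https://www.wikidata.org/wiki/Q1339"` interchangeably.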

@alastair
Member

It looks like I may have made some mistake when merging the two repositories. I'll have a look at it again.

As an overview, there are two ways of loading data, but that's because we get data in different ways from different sites.

  • the main source of data should be wikidata. We get a description from the first paragraph of a related wikipedia article
  • imslp and cpdl have links to wikipedia
  • musicbrainz has a link to wikidata
  • when we have a wikipedia link, we convert it to a wikidata link
  • then we load the data from wikidata
  • then we load data from wikipedia, using the wikidata id and language code to find the article, not the original wikipedia url

what these two loaders should do is:

  1. given a wikipedia url, look up its wikidata id
  2. import the person from wikidata
  3. import the person from wikipedia (using the wikidata id + language code to find the article)
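
The three steps above could be sketched roughly as follows. This is not the code in `loader.py`: all function names are hypothetical, and only the pure URL and JSON handling is shown, with the actual HTTP requests left out. It assumes the MediaWiki `action=query` API with `prop=pageprops` (which exposes the `wikibase_item` page property) and the `sitelinks` map found in Wikidata entity JSON.

```python
from urllib.parse import unquote, urlencode, urlparse


def parse_wikipedia_url(url):
    """Split a Wikipedia article URL into (language code, page title)."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if not host.endswith(".wikipedia.org") or not parsed.path.startswith("/wiki/"):
        raise ValueError(f"not a Wikipedia article URL: {url!r}")
    language = host.split(".")[0]
    title = unquote(parsed.path[len("/wiki/"):])
    return language, title


def wikidata_lookup_url(wikipedia_url):
    """Step 1: build the API request that maps an article to its Wikidata ID."""
    language, title = parse_wikipedia_url(wikipedia_url)
    params = {
        "action": "query",
        "prop": "pageprops",
        "ppprop": "wikibase_item",
        "redirects": 1,
        "titles": title,
        "format": "json",
    }
    return f"https://{language}.wikipedia.org/w/api.php?" + urlencode(params)


def wikidata_id_from_response(response):
    """Step 1 (continued): pull the Wikidata ID out of the decoded JSON reply."""
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        qid = page.get("pageprops", {}).get("wikibase_item")
        if qid:
            return qid
    return None


def article_title_for(entity, language):
    """Step 3: find the article via the entity's sitelinks, not the original URL."""
    sitelink = entity.get("sitelinks", {}).get(f"{language}wiki")
    return sitelink["title"] if sitelink else None
```

Step 2 (importing the person from wikidata) would sit between the last two helpers and use the ID returned by `wikidata_id_from_response`. Looking the article up through `sitelinks` means the same person resolves to the right article in any language, even when the site we started from linked to a different language edition.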

but it seems that they're not doing the right thing, because they're just calling the same method. I'll update it.
