Adapt config for a wikibase different to wikidata? #154

Open
dlindem opened this issue Mar 8, 2021 · 2 comments

@dlindem
dlindem commented Mar 8, 2021

Hello, I would like to use this tool on a wikibase instance different to wikidata. Is there an easy way to adapt config to that other wikibase (base url, author literal prop, author item prop, etc. etc.)? Many thanks in advance,
David (http://data.lexbib.org)

@arthurpsmith
Owner

Hi @dlindem - great question. Unfortunately as I've added features to the code I think it's become more and more dependent on the wikimedia ecosystem and details of wikidata... For your use case it definitely would be good to abstract those things better. Here are some of the places it would need to be done:

  • Many places (in "lib/*.php") where the http://www.wikidata.org/entity or /entity/statement is filtered out to get an item ID or claim ID.
  • Various places (in "lib", also "magnustools") where the wikidata api is specified - https://www.wikidata.org/w/api.php
  • The following properties are used directly as strings, would need to be replaced by configuration settings:
    P31 (instance of), P50 (author), P279 (subclass of), P1932 (stated as), P2093 (author name string), P1476 (title), P1545 (series ordinal), P1433 (published in), P356 (DOI), P698 (PMID), P921 (main subject), P577 (publication date), P2860 (cites work), P496 (ORCID), P213 (ISNI), P1053 (ResearcherID), P214 (VIAF ID), P2038 (ResearchGate Profile ID), P108 (employer), P1416 (affiliation), P69 (educated at)
  • Also it expects human to be Q5 in several places
  • The OAuth configuration in lib/wikidata_oauth.php (and associated oauth.ini file which is not part of the git repo) would have to be altered for however you are handling authentication...
  • There is also a database connection file on Toolforge - replica.my.cnf - which is also not part of the repo; this is used for batch handling, author lists, and query rate throttling on the non-OAuth pieces. Setting up and configuring this database is a step I don't currently have listed in the setup instructions, which are woefully lacking right now!
  • I've probably missed some things - having a test case would help!
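
To make the list above a bit more concrete, here is a rough sketch of what pulling those settings into one configuration object could look like. It is written in Python purely for illustration (the tool itself is PHP), and every name in it (`WikibaseConfig`, `prop`, `local_id`, the role keys) is hypothetical and not taken from the repo. The Wikidata defaults are the ones listed above; the second instance uses placeholder values for an imaginary wikibase at example.org.

```python
from dataclasses import dataclass, field

# Wikidata defaults from the list above (only a few roles shown).
# The role names on the left are hypothetical keys, not names from the tool.
WIKIDATA_PROPERTIES = {
    "instance_of": "P31",
    "author": "P50",
    "stated_as": "P1932",
    "author_name_string": "P2093",
    "series_ordinal": "P1545",
}


@dataclass
class WikibaseConfig:
    """Everything that is currently hard-coded for Wikidata, in one place."""
    entity_base: str = "http://www.wikidata.org/entity/"
    api_url: str = "https://www.wikidata.org/w/api.php"
    human_qid: str = "Q5"
    properties: dict = field(default_factory=lambda: dict(WIKIDATA_PROPERTIES))

    def prop(self, role: str) -> str:
        """Look up a property ID by role instead of using a literal string."""
        return self.properties[role]

    def local_id(self, uri: str) -> str:
        """Strip the entity (or entity/statement) prefix from a URI."""
        statement_base = self.entity_base + "statement/"
        # Check the longer statement prefix first, since it also starts
        # with the plain entity prefix.
        for prefix in (statement_base, self.entity_base):
            if uri.startswith(prefix):
                return uri[len(prefix):]
        raise ValueError(f"URI does not belong to this wikibase: {uri}")


# Default instance: behaves like the current Wikidata-only code.
wikidata = WikibaseConfig()

# Placeholder instance for some other wikibase; all values are made up.
other = WikibaseConfig(
    entity_base="https://wikibase.example.org/entity/",
    api_url="https://wikibase.example.org/w/api.php",
    properties={**WIKIDATA_PROPERTIES, "author": "P7", "author_name_string": "P8"},
)

print(wikidata.prop("author"), wikidata.local_id("http://www.wikidata.org/entity/Q42"))
print(other.prop("author"), other.local_id("https://wikibase.example.org/entity/Q12"))
```

The point is only that the code would ask for a property by role (`prop("author")`) and strip prefixes through the config object, so that pointing the tool at another wikibase becomes a matter of supplying different values. OAuth and the database connection are left out of this sketch.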

If you are interested in working on this yourself, let me know. Pull requests to address some of the above are welcome! I can also see this being something I might want to look into as well, having contemplated setting up a separate wikibase myself. Is your wikibase available to the public, so I could test with it?

@dlindem
Author

dlindem commented Mar 9, 2021

Hi @arthurpsmith, many thanks for your quick answer. Yes, my wikibase is available, look at the SPARQL query examples for getting an idea of the content, or e.g. this for a bibitem. I am doing bibdata migration from Zotero to my wikibase (and ultimately also to wikidata, but for curation I will keep it in mine).
I am about to replace a batch of author/editor literals (P39/P42) with item statements (I used OpenRefine to cluster name variants into around 4,000 unique authors, which are just now being created as new Q5 items (yes, humans are Q5 in mine too)). After that, your tool would be the best option when including more bibitems, 95% of which will have known authors/editors.
If you want to use my use case for tests, perfect. I understand your hints, and will now create all the necessary properties you list. As soon as I find more time, I will try to make the necessary edits to the authordisambiguator code, but I'd also have to find out how to install the tool (locally? on the lexbib.org server? elsewhere? I have very little experience with such tasks).
