Integrate paper metadata handling with Wikidata #126
This post is inspired by the BibTeX from Wikidata functionality described in https://larsgw.blogspot.de/2016/09/citationjs-on-command-line.html.
Some thoughts on how to integrate ContentMine's paper metadata handling with Wikidata:
These are definitely interesting ideas.
The first idea appears to me to be the most readily actionable at the moment. What sort of workflow do you envisage for this? I hope this doesn't seem like a barrage of questions, but I just want to check that I've understood this all correctly.
Should we build a bot and seek approval from the community to add these items? In the call we discussed principally interacting via the primary-sources tool although we mostly talked about 'facts' not paper metadata. Perhaps given that this data is already 'curated' by either the NCBI or CrossRef this isn't such a problem.
Which metadata should we consider adding? If we are looking at all the new publications on a given day, should these all be added to Wikidata? My impression from WikiCite was that this kind of blanket adding, where there is neither a structural need nor real notability for a given publication, should be avoided (and perhaps such items should go into Librarybase instead?).
One approach I started with for Librarybase was importing only works that were referenced on enwiki (though we could choose any or all wikis).
I think having Wikidata entries for all works cited from enwiki is reasonable, and expansion to all works cited by any Wikimedia project (at least from their content namespaces) should come as soon as possible thereafter.
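As a rough illustration of how one might check whether a cited work already has a Wikidata item, here is a minimal sketch that builds a DOI lookup against the Wikidata Query Service. It assumes the cited work has a DOI and that matching on the upper-cased DOI (property P356) is sufficient; the function names are hypothetical and not part of any ContentMine or Librarybase tooling. It only constructs the query and URL and does not send a request.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def doi_lookup_query(doi: str) -> str:
    """Return a SPARQL query for items whose DOI (P356) equals `doi`.

    Assumption: DOIs are compared in upper case, following the usual
    convention for DOI statements on Wikidata.
    """
    normalized = doi.strip().upper()
    # Refuse characters that would terminate or escape the string literal.
    if '"' in normalized or "\\" in normalized:
        raise ValueError("DOI contains characters that would break the query")
    return f'SELECT ?item WHERE {{ ?item wdt:P356 "{normalized}" . }} LIMIT 5'


def doi_lookup_url(doi: str) -> str:
    """Return a GET URL that asks the endpoint for JSON results."""
    params = {"query": doi_lookup_query(doi), "format": "json"}
    return WDQS_ENDPOINT + "?" + urlencode(params)
```

An empty result for a DOI would suggest that the work has no item yet; works without a DOI would need a different matching strategy (for example, by PMID or title), which this sketch does not cover.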
Blanket addition beyond that may cause problems but still makes sense in the long run, as that would contribute to the goal of turning Wikidata into an open citation graph. So anything cited anywhere from "within scope" would eventually get a Wikidata item, and after some time, I could well imagine encouraging people to upload their BibTeX files to some tool on Wikimedia labs that would then check these files against the Wikidata corpus and add info / flag inconsistencies as needed.
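The BibTeX-checking tool imagined above could work roughly as follows. This is only a sketch under stated assumptions: the BibTeX file is taken to be already parsed into plain dictionaries, the "Wikidata corpus" is stood in for by a local dictionary keyed by upper-cased DOI, and `check_entry` and its return format are hypothetical.

```python
def normalize(value: str) -> str:
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(value.lower().split())


def check_entry(entry: dict, corpus: dict) -> dict:
    """Compare one parsed BibTeX entry against a corpus keyed by DOI.

    Returns a report with:
      status    -- "missing" (DOI not in corpus), "conflict", or "ok"
      conflicts -- fields whose values differ: {field: (uploaded, corpus)}
      additions -- fields present in the upload but absent from the corpus
    """
    doi = entry.get("doi", "").strip().upper()
    known = corpus.get(doi)
    if known is None:
        return {"status": "missing", "doi": doi, "conflicts": {}, "additions": {}}

    conflicts, additions = {}, {}
    for field, value in entry.items():
        if field == "doi":
            continue
        if field not in known:
            additions[field] = value  # candidate info to add
        elif normalize(str(known[field])) != normalize(str(value)):
            conflicts[field] = (value, known[field])  # flag inconsistency

    status = "conflict" if conflicts else "ok"
    return {"status": status, "doi": doi, "conflicts": conflicts, "additions": additions}


# Made-up example data, for illustration only.
corpus = {"10.1234/ABC": {"title": "An Example Paper", "year": "2016"}}
upload = {"doi": "10.1234/abc", "title": "An example  paper", "year": "2015", "journal": "Example Journal"}
report = check_entry(upload, corpus)
```

Here `report` flags the differing `year` as a conflict and offers `journal` as a possible addition, while the title matches after normalization. Whether additions are written automatically or routed through human review is a policy question this sketch leaves open.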
Perhaps we can start by sharing, in a standard fashion, the publications that CM has read on a given day, along with things mined from them. We could then go over that feed, hopefully become more specific about the respective workflows for Wikidata and Librarybase, and demo things with Zika.
I think these are all in scope for the WikiFactMine project - it will
On Mon, Oct 10, 2016 at 4:46 AM, Daniel Mietchen firstname.lastname@example.org
Yes - Lars has done a super job.