Export to the Wikidata Mismatch Finder #5607
Labels
Type: Feature Request
Identifies requests for new features or enhancements. These involve proposing new improvements.
wikibase
Related to wikidata/wikibase integration
The Wikidata Mismatch Finder is a relatively new tool, developed by WMDE, which acts as a staging area for data uploads to Wikidata. Once new or conflicting data is uploaded to the Mismatch Finder, editors can review each proposed statement individually and decide whether to add it to Wikidata.
@lydiapintscher asked me whether OpenRefine could be used to populate the suggestions offered by this tool. Doing so involves producing a CSV file in a specific format, which is then uploaded via the tool's API.
Proposed solution
A new exporter could be added to the Wikibase extension, making it possible to upload candidate edits to the Mismatch Finder (instead of uploading the data directly or using QuickStatements).
For the user, the data would be prepared just as for direct edits (reconciliation, schema building, issue fixing and preview), but the Mismatch Finder could be chosen as the upload method at the end. This would generate a CSV in the expected format. The file could either be downloaded from OpenRefine by the user and then uploaded to the Mismatch Finder manually, or OpenRefine could handle the upload itself via the Mismatch Finder's API. The second option requires asking the user to generate a token on the tool's side and add it to OpenRefine, to be used for API authentication.
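To make the CSV generation step concrete, here is a minimal sketch in Python (the real exporter would live in the Java Wikibase extension). The column names are an assumption based on the import format described in the Mismatch Finder documentation and should be checked against the current version of the tool. The `candidate_edits_to_csv` helper, the placeholder statement GUID and the example values are purely illustrative.

```python
import csv
import io

# Assumed column layout of the Mismatch Finder import file.
# Verify against the tool's documentation before relying on it.
COLUMNS = [
    "item_id", "statement_guid", "property_id", "wikidata_value",
    "meta_wikidata_value", "external_value", "external_url", "type",
]


def candidate_edits_to_csv(edits):
    """Serialize candidate edits (a list of dicts) to a CSV string.

    Keys missing from an edit are written as empty cells.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for edit in edits:
        writer.writerow({col: edit.get(col, "") for col in COLUMNS})
    return buffer.getvalue()


# Illustrative row: a date of birth that differs between Wikidata and
# an external source. The GUID is a placeholder, not a real statement id.
example_edits = [{
    "item_id": "Q42",
    "statement_guid": "Q42$00000000-0000-0000-0000-000000000000",
    "property_id": "P569",
    "wikidata_value": "1952-03-12",
    "external_value": "1952-03-11",
    "external_url": "https://example.org/record/42",
}]

csv_text = candidate_edits_to_csv(example_edits)
```

The resulting `csv_text` could then be offered as a download or passed to whatever component performs the authenticated upload.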
Alternatives considered
Someone could write a tutorial explaining how to use existing OpenRefine functionality to produce a CSV in the format expected by the Mismatch Finder.
Additional context
Whether the tool can be run on other Wikibase instances is unclear at this stage.
To offer the functionality only for Wikibases which do support the tool, it might be necessary to add a relevant field in the Wikibase manifest, especially for an integration which would upload the data directly to the tool via its API.
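As a rough illustration of how such a manifest field might gate the feature, the sketch below checks a manifest (parsed into a Python dict) for a Mismatch Finder section. The `mismatch_finder` key and its `api_endpoint` field are hypothetical: they are not part of the current Wikibase manifest format, and the URLs are placeholders.

```python
def mismatch_finder_endpoint(manifest):
    """Return the Mismatch Finder API endpoint declared in a manifest,
    or None if the Wikibase instance does not declare one.

    The "mismatch_finder" section is a hypothetical manifest extension.
    """
    section = manifest.get("mismatch_finder")
    if not isinstance(section, dict):
        return None
    return section.get("api_endpoint") or None


# Manifest that declares the (hypothetical) section.
with_tool = {
    "mediawiki": {"name": "Example Wikibase"},
    "mismatch_finder": {"api_endpoint": "https://mismatch-finder.example.org/api"},
}

# Manifest without the section: the exporter would stay hidden.
without_tool = {"mediawiki": {"name": "Another Wikibase"}}

endpoint_present = mismatch_finder_endpoint(with_tool)
endpoint_absent = mismatch_finder_endpoint(without_tool)
```

With a check of this kind, the export option would only be shown when the function returns an endpoint.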
The data model supported by the CSV format looks relatively limited, so it is unclear how rich statements (with qualifiers, novalue/somevalue claims, ranks, complex references…) could be represented in this export format.
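One possible way to handle this, sketched below, is to detect the statements that do not fit before exporting and report them as issues. The sketch assumes, without confirmation, that only plain value statements with default rank, no qualifiers and at most one reference can be expressed in the CSV. The dict layout of a statement and the `unsupported_features` helper are hypothetical.

```python
def unsupported_features(statement):
    """List the features of a statement that the CSV export is assumed
    not to be able to represent. An empty list means it can be exported.
    """
    problems = []
    if statement.get("snaktype", "value") != "value":
        problems.append("novalue/somevalue claim")
    if statement.get("qualifiers"):
        problems.append("qualifiers")
    if statement.get("rank", "normal") != "normal":
        problems.append("non-default rank")
    if len(statement.get("references", [])) > 1:
        problems.append("multiple references")
    return problems


# A plain statement: assumed to be exportable as-is.
simple = {"property_id": "P569", "value": "1952-03-11"}

# A rich statement: flagged rather than silently flattened.
rich = {
    "property_id": "P39",
    "snaktype": "somevalue",
    "rank": "preferred",
    "qualifiers": {"P580": ["2001-01-01"]},
}

simple_problems = unsupported_features(simple)
rich_problems = unsupported_features(rich)
```

Flagging such statements in the existing issue-checking step would let users decide whether to simplify them or upload them by another route.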