Ability to update/replace statements #3383
Comments
Yes, this is a very legitimate request. The tricky part is making this feature work for a wide range of use cases. Therefore we need to make multiple things configurable:
I am keen to work on this but it'll have to wait a few months still since I am busy with a migration to a new architecture (https://github.com/OpenRefine/OpenRefine/projects/7)
Asynchronous distributed updates are a difficult problem in general, and I think they may be impossible to do reliably without at least some minimal machinery on the backend. Freebase had an entire framework in place for bulk updates (which also included things like sampled quality reviews, etc.). I think that, at a minimum, you'll need some type of "commit if conflict free" primitive that could capture the state of the world that you reconciled against and tell whether it has changed since then. It feels dangerous to try to innovate in this space as long as the Wikidata team is ignoring the needs of bulk updates. They really need to put the architecture in place to support clients like OpenRefine.
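To make the "commit if conflict free" idea concrete, here is a minimal in-memory sketch. It assumes each entity carries a revision counter that the client records at reconciliation time; `Entity`, `ConflictError` and `commit_if_conflict_free` are hypothetical names for illustration, not part of Wikibase or OpenRefine.

```python
from dataclasses import dataclass, field


class ConflictError(Exception):
    """Raised when the entity changed after it was reconciled."""


@dataclass
class Entity:
    # Hypothetical stand-in for a backend entity: a revision counter
    # plus a mapping from property id to a list of values.
    revision: int = 0
    statements: dict = field(default_factory=dict)


def commit_if_conflict_free(entity, base_revision, prop, values):
    """Apply the edit only if `entity` is still at `base_revision`."""
    if entity.revision != base_revision:
        # Someone else edited the entity since we reconciled against it.
        raise ConflictError(
            f"expected revision {base_revision}, found {entity.revision}"
        )
    entity.statements[prop] = list(values)
    entity.revision += 1
    return entity.revision
```

A client would remember `entity.revision` when it reconciles and pass it back as `base_revision` when uploading; on `ConflictError` it would have to re-reconcile rather than overwrite blindly.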
It is true that Wikibase-side support for this would be great. Pinging @addshore who has been thinking about this recently. That being said, in the current context of Wikibase, I would not be too worried about atomicity: the edit rate on Wikidata is still really manageable and the chances that simultaneous editing results in data races are pretty low (compared to, for instance, the problems stale reconciliation data can cause).
In the context of the Wikimedia Commons integration project, we have updated Wikidata-Toolkit to a new version, which breaks the limited deduplication of statements we had, so I am going to rebuild it in a better, more configurable form.
This should be in the forthcoming release (3.6). In the meantime, feel free to use our snapshot releases to try it out and tell us if you see ways to improve it.
I am looking for a tool to do mass edits to a Wikibase. It would be useful if the same tool could both add missing data and update existing data. Based on my understanding, OpenRefine is currently only able to add new items and new statements.
Proposed solution
Brute force solution
In Wikibase schema editor, have a checkbox to replace existing statements for the given property. Internally this would be implemented as deletion of all statements for the subject using that property. This would be sufficient for our case.
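As a rough sketch of what that checkbox could do internally, the function below plans a "remove everything for this property, then add the new values" edit. The statement dicts and their keys (`id`, `property`, `value`) and the `op` names are assumptions made up for this example; they are not OpenRefine's or Wikibase's actual data model.

```python
def plan_replace(existing, prop, new_values):
    """Plan the edits that replace all `prop` statements on one item.

    `existing` is a list of dicts with the hypothetical keys
    "id", "property" and "value" describing the item's current statements.
    """
    # Remove every existing statement that uses the property...
    removals = [
        {"op": "remove", "id": s["id"]}
        for s in existing
        if s["property"] == prop
    ]
    # ...then add the new values as fresh statements.
    additions = [
        {"op": "add", "property": prop, "value": v} for v in new_values
    ]
    return removals + additions
```

Note that this discards qualifiers and references attached to the removed statements, which is part of why it is only a brute force option.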
Ideal solution
"Reconcile" statements. Ideally the Query Service can be used to output statement identifiers that uniquely identify the statement. Otherwise a manual reconciliation is needed (but limited to the statements of the already reconciled item). Then the value (in another column) could be updated in OpenRefine and it would automatically update the existing statement instead of creating a new one.
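A minimal sketch of how reconciled statements could drive the upload, assuming each row carries an optional statement identifier in its own column. The function name, the tuple layout and the operation labels are hypothetical and only illustrate the idea.

```python
def plan_statement_updates(existing, rows):
    """Decide, per row, whether to add, update or skip a statement.

    `existing` maps statement id -> current value (e.g. from a query).
    `rows` is a list of (statement_id_or_None, property, value) tuples.
    """
    ops = []
    for statement_id, prop, value in rows:
        if statement_id is None:
            # Row was not reconciled to a statement: create a new one.
            ops.append(("add", prop, value))
        elif statement_id not in existing:
            # The identifier no longer exists; flag it instead of guessing.
            ops.append(("stale", statement_id, value))
        elif existing[statement_id] != value:
            # Reconciled statement whose value changed: update in place.
            ops.append(("update", statement_id, value))
        # Reconciled and unchanged: nothing to do.
    return ops
```

Flagging stale identifiers separately keeps the tool from silently creating duplicates when the reconciliation data is out of date.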
Alternatives considered
I think I saw a suggestion to export to QuickStatements and duplicate lines so that they are preceded by deletions. But this is rather difficult to do manually.
I think I could write my own tool that processes the QuickStatements v1 format, but it wouldn't know which columns should be overwritten. I could also write my own tool that just takes CSV (from Query Service, then modified) and does this, but for the end user there are already many tools to learn for the data update process (Wikibase, Query Service, OpenRefine, QuickStatements, this new tool?).
Additional context
https://nimiarkisto.nikerabb.it/w/index.php?title=Item:Q5106574&diff=6759469&oldid=6759468 example of an attempted edit creating a new statement instead of updating it.
#2116 is similar to this request, but per my understanding it is about updating qualifiers rather than the value of the statement.
#2999 (comment) mentions that deleting statements is currently not possible.