Store partial reconciliation results #1231
Comments
I agree this is an important problem. But it's not entirely clear to me what the solution should look like.
As always with caches, we have to be careful about the size, invalidation strategy, time to live, and so on…
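To make the trade-offs above concrete, here is a minimal sketch of a size-bounded cache with a time to live. It is not OpenRefine code: the class name `ReconCache`, its methods, and the idea of keying on the cell value are assumptions made purely for illustration. The current time is passed in explicitly so that expiry is easy to test.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical bounded cache with a time to live; not part of OpenRefine. */
class ReconCache<K, V> {
    /** A cached value together with the time it was stored. */
    private static final class Timed<T> {
        final T value;
        final long storedAtMillis;

        Timed(T value, long storedAtMillis) {
            this.value = value;
            this.storedAtMillis = storedAtMillis;
        }
    }

    private final long ttlMillis;
    private final Map<K, Timed<V>> entries;

    ReconCache(final int maxEntries, long ttlMillis) {
        this.ttlMillis = ttlMillis;
        // Access-ordered LinkedHashMap: once the size limit is exceeded,
        // the least recently used entry is dropped (size control).
        this.entries = new LinkedHashMap<K, Timed<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Timed<V>> eldest) {
                return size() > maxEntries;
            }
        };
    }

    synchronized void put(K key, V value, long nowMillis) {
        entries.put(key, new Timed<V>(value, nowMillis));
    }

    /** Returns the cached value, or empty if it is missing or older than the TTL. */
    synchronized Optional<V> get(K key, long nowMillis) {
        Timed<V> hit = entries.get(key);
        if (hit == null) {
            return Optional.empty();
        }
        if (nowMillis - hit.storedAtMillis >= ttlMillis) {
            entries.remove(key); // invalidate stale entries lazily, on read
            return Optional.empty();
        }
        return Optional.of(hit.value);
    }
}
```

Whether a cache like this should live in memory only or be persisted with the project is exactly the open design question; the sketch only shows the size and expiry handling.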
Related issue: #580
@wetneb If I recall correctly, I asked David H. to treat Fetch URLs and Reconcile as long-running operations, so no automatic project save occurs while they run. Honestly, I cannot recall for sure. We could change the logic so that project saves are configurable and can occur more frequently than the default (e.g. every 4 minutes).
I don't think that running an autosave while the operation is running would do anything: the reconciliation results are stored in memory and are only added to the project at the very end of the operation. Even if the column were populated gradually, you would still need to write the change to the history and recover from an interrupted run (that could be done by faceting, but it would be quite manual).
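As a rough illustration of what recovering from an interrupted run could involve, the sketch below appends each result to a file as soon as it is obtained, and reloads the already-processed rows on restart. This is not how OpenRefine stores reconciliation results: the class `PartialResultLog`, the tab-separated file format, and the use of row indices as keys are all assumptions for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical append-only log of partial results; not part of OpenRefine. */
class PartialResultLog {
    private final Path file;

    PartialResultLog(Path file) {
        this.file = file;
    }

    /** Appends one result immediately, so it survives a later crash. */
    void append(int rowIndex, String matchId) throws IOException {
        String line = rowIndex + "\t" + matchId + System.lineSeparator();
        Files.write(file, line.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    /** Reloads the rows that were already processed before an interruption. */
    Map<Integer, String> load() throws IOException {
        Map<Integer, String> done = new LinkedHashMap<>();
        if (!Files.exists(file)) {
            return done; // nothing was saved yet: start from scratch
        }
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            int tab = line.indexOf('\t');
            if (tab < 0) {
                continue; // skip a line with no separator (e.g. cut off mid-write)
            }
            done.put(Integer.parseInt(line.substring(0, tab)), line.substring(tab + 1));
        }
        return done;
    }
}
```

A resumed operation would call `load()` first and skip the rows it returns. The sketch does not detect a final line that was cut off after the separator, and it leaves out the harder part raised above, namely recording the partial change in the project history.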
@wetneb you're probably right. Like I said, I cannot recall, and I am not very intimate anymore with our code in a lot of areas. But I can give lots of pointers to folks who have more time, so they can get intimate with our code :)
Maybe some tweak around this line: https://github.com/OpenRefine/OpenRefine/blob/master/main/src/com/google/refine/process/ProcessManager.java#L152
This will help to address #1235.
This is implemented in the 4.0 branch. See the videos and discussion there: https://forum.openrefine.org/t/partial-results-of-long-running-operations/482
Reconciliation (e.g. with Wikidata) is sometimes a long process. At the slightest connection problem, or if OpenRefine crashes, you have to start all over again. Would it be possible to preserve the reconciliations already made, or to store them in some kind of cache? Unless this is already the case and I am not aware of it? (I have never noticed a failed reconciliation being faster the second time.)