Load evaluted values in parallel #610
Replies: 5 comments
-
We understand your pain point, but this would be difficult to change, as we assume that after setting some values we (re)compute all the affected cells. Actually, in our benchmarks we saw that the most time-consuming part is the construction of the graph of dependencies between cells, in particular when you have big ranges / whole columns, not the evaluation. Do you remember that you can use batch to add all your data at once? This will ensure that you evaluate all cells exactly once. In general, HF does not parallelize well, but if you need to speed up the first loading phase and later you can wait for the result of evaluation, you can first add the (raw) data and later add only the cells with formulas. This will simply run all the computations later. To sum up, I wouldn't recommend changing the behavior of HF. I hope that some of the above hints will already help you speed HF up. |
Beta Was this translation helpful? Give feedback.
-
Excel stores values as cache and IMO the cache management is one layer higher than the calculation engine. It depends on the use case so it's up to the developers to implement it according to their needs / tech stack. I wouldn't change that in HyperFormula, as this is already possible to do with an abstraction layer over the calculation engine. Note: Pseudo-code, just to show the idea. Incomplete. Server payload should have both, cached value and the formula assigned to the cell: {
"status": 200,
"error": null,
"data": [
[ { "value": 3, "formula": "=SUM(B1:B2)" }, { "value": 1, "formula": 1 } ],
[ { "value": null, "formula": null }, { "value": 2, "formula": 2 } ]
],
"namedExpressions": [ ]
} App initialization will load the data and create an instance of our cached formula engine: const response = await fetchDataFromServer();
if (reposne.error) {
throw new Error(response.error);
}
const data = response.data;
const namedExpressions = response.namedExpressions;
const formulas = new MyFormulaEngine({ data, namedExpressions });
const value = formulas.getValue('A1');
// returns value "3" instantly, from server cache Then we need an engine to manage HyperFormula, and if it's not ready, get the value from server cache: class MyFormulaEngine {
private hyperformulaReady = false;
private dataCache;
constructor({ data }) {
this.dataCache = data.map(row => row.map(cell => cell.value));
const hfData = data.map(row => row.map(cell => cell.formula));
// JS is single threaded so we will need a Worker or Node.js process,
// this highly depends on the tech stack used by your company
this.hyperformula = new MyHyperFormulaWorker();
this.hyperformula.addEventListener('hyperformulaReady', () => {
this.hyperformulaReady = true;
this.dataCache = null; // release the memory
});
this.hyperformula.loadData(hfData);
}
/**
* Method returns the value for cell from HyperFormula OR fallbacks to the cache
* until the HyperFormula is ready.
*/
getValue(cellAddress) {
const simpleCellAddress = this.hyperformula.toSimpleCellAddress(cellAddress);
if (this.hyperformulaReady) {
return this.hyperformula.getCellValue(simpleCellAddress);
}
return this.dataCache[simpleCellAddress.row][simpleCellAddress.column];
}
/**
* Disable editing until the calculation engine is ready. Show loader or whatever you like.
*/
get readOnly() {
return !this.hyperformulaReady;
}
} |
Beta Was this translation helpful? Give feedback.
-
Thanks for your answer. But I am not really sure if the generation of the cell dependency graph could be faster. When I handover our quite complex Excel file to another machine it is also loaded in one second without any caching. As far as I understand to get all dependencies of a Cell one have to evaluate the precedents of all cells. Is this what takes so much time? I think there should be a way to build the dependecy graph faster? |
Beta Was this translation helpful? Give feedback.
-
Excel file IS the cache. We're parsing the XLSX files for https://handsontable.com/spreadsheet-viewer/ so we got to know their structure a little bit. Inside the file they keep a lot of meta data for each cell:
Notice the Most viewers work like that. Excel by default opens the file for viewing purposes only and has "Edit this file" button. We even do this! The Spreadsheet Viewer (https://handsontable.com/spreadsheet-viewer/) don't have a graph parser. My idea from #609 (comment) is based on the Excel file. You store the calculated value on the database (file) and send it to the app together with the formula. Perceived performance is an app loaded instantly while the graph is build in the background. All the animations, fade ins, finding the cell you want to edit, opening the editor, validation and saving the data takes time. At this point the engine should be initialized and the user won't even notice that the calculation engine was not ready from the beginning. Edit: Moreover, you can use the SSR technique to send the view with cached values as RAW HTML with the first page load and "hydrate" the data with interactive calculation engine afterwards. Should work with any front-end framework that has SSR support. This way you could deliver the perfect perceived performance to the user. |
Beta Was this translation helpful? Give feedback.
-
Thanks for your explanation @wojciechczerniak ! I know the spreadsheet viewer and also looked into xlsx format. I didn't know that excel does the graph building in background after I opened the file. Than I was "tricked" by that because of the already shown values. I thought at least the graph is build when I open the file (not in background). I already knew that only the values are shown but to edit of course the graph has to be built before. |
Beta Was this translation helpful? Give feedback.
-
We have the problem that the loading of our data takes a long time because all cell formulas are evaluated when the data is set.
Excel saves the evaluated cell value in parallel and cells are only evaluated on changes. Because of that the first data loading is much faster (<1 second instead of 1 minute).
I don't know how complicated is that to implement. If you don't have the time maybe you could give a hint what I could do?
Beta Was this translation helpful? Give feedback.
All reactions