Large memory consumption even after running Garbage Collection #173

JoaoPere · 2022-05-11T10:08:08Z

The use case at hand is a Spring Boot Java service that returns an ODS file, base64-encoded and streamed in a JSON response alongside other values. The generated files have fairly large payloads, so performance is a major concern in both time and space. Performance tests showed that memory consumption grows quickly and that the heap remains fairly large even after running the Garbage Collector.

Unfortunately, the source code that creates the file is too extensive to share, but the problem seems to derive from the way table rows and table cells are populated. For time-performance reasons, we use a base row and cell as a reference:

OdfTableRow currentRow = sheet.getRowByIndex(0);
OdfTableCell currentCell = currentRow.getCellByIndex(colIndex);
TableTableRowElement baseTableTableRowElement = currentRow.getOdfElement();
Node baseNode = currentCell.getOdfElement().getFirstChild();

Subsequent iterations clone from these:

TableTableRowElement copyTableTableRowElement = (TableTableRowElement) baseTableTableRowElement.cloneNode(true);
NodeList childNodeList = copyTableTableRowElement.getChildNodes();
Element element = (Element) childNodeList.item(index);
element.appendChild(baseNode.cloneNode(false));
baseTableTableRowElement.getParentNode().appendChild(copyTableTableRowElement);
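
For readers who want to reproduce the pattern in isolation, here is a minimal, self-contained sketch of the same clone-and-append approach. It uses only the JDK's `org.w3c.dom` API rather than ODF Toolkit classes, so the element names (`table`, `row`, `cell`, `p`) and the `RowCloneSketch` class are hypothetical stand-ins, not the actual service code.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class RowCloneSketch {

    /** Builds a table with one template row of `columns` cells, then appends `copies` filled clones. */
    public static Element buildTable(int columns, int copies) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
        Element table = doc.createElement("table");
        doc.appendChild(table);

        // Template row with empty cells (plays the role of baseTableTableRowElement).
        Element baseRow = doc.createElement("row");
        for (int c = 0; c < columns; c++) {
            baseRow.appendChild(doc.createElement("cell"));
        }
        table.appendChild(baseRow);

        // Template for the cell content (plays the role of baseNode).
        Element baseParagraph = doc.createElement("p");

        for (int r = 0; r < copies; r++) {
            // Deep clone: copies the row together with all of its cells.
            Element copyRow = (Element) baseRow.cloneNode(true);
            for (int c = 0; c < columns; c++) {
                Element cell = (Element) copyRow.getChildNodes().item(c);
                // Shallow clone: copies only the paragraph element itself.
                Node paragraph = baseParagraph.cloneNode(false);
                paragraph.setTextContent("r" + r + "c" + c);
                cell.appendChild(paragraph);
            }
            table.appendChild(copyRow);
        }
        return table;
    }

    public static void main(String[] args) {
        Element table = buildTable(3, 5);
        System.out.println("rows: " + table.getChildNodes().getLength());
    }
}
```

Every clone stays reachable through the document for as long as the document itself is referenced, so in this sketch memory is only reclaimable once nothing holds on to `doc` or `table`.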

All streams are correctly closed and the endpoint has successfully returned.

Since OdfToolkit (0.9.0) uses a DOM-based approach, high memory consumption is expected while the file is being created. The most concerning aspect, however, is that even when the Garbage Collector is triggered manually, only part of the heap is released and a fair chunk remains in memory. For instance, for a file generated with roughly 5k rows in each of 6 worksheets (30k rows in total), memory consumption goes as high as 1.8GB and drops to 400MB after running the GC. Subsequent runs with the same payload raise this to 800MB, suggesting an O(n) space complexity.
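
One way to make the "still in memory after GC" observation repeatable is to record the used heap after a requested collection, before and after each run. The sketch below uses only the JDK's `MemoryMXBean`; the `RetainedHeapProbe` class and the placeholder workload are assumptions for illustration, and the real ODS generation call would go in place of the workload. Note that `gc()` is only a request to the JVM, so the numbers are indicative rather than exact.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class RetainedHeapProbe {

    /** Requests a collection and returns the heap bytes still in use afterwards. */
    public static long usedHeapAfterGc() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        memory.gc(); // same effect as System.gc(); the JVM may treat it as a hint
        return memory.getHeapMemoryUsage().getUsed();
    }

    /**
     * Runs the workload `runs` times and returns the used heap after GC:
     * index 0 is the baseline, index i is the value after the i-th run.
     */
    public static long[] measure(Runnable workload, int runs) {
        long[] retained = new long[runs + 1];
        retained[0] = usedHeapAfterGc();
        for (int i = 1; i <= runs; i++) {
            workload.run();
            retained[i] = usedHeapAfterGc();
        }
        return retained;
    }

    public static void main(String[] args) {
        // Placeholder workload: allocates 16 MB and drops it immediately.
        Runnable workload = () -> {
            byte[] scratch = new byte[16 * 1024 * 1024];
            scratch[0] = 1;
        };
        long[] retained = measure(workload, 3);
        for (int i = 0; i < retained.length; i++) {
            System.out.println("after run " + i + ": " + retained[i] / (1024 * 1024) + " MB");
        }
    }
}
```

A post-GC value that climbs by a similar amount on every run would be consistent with objects being retained across requests; a heap dump would still be needed to see what is holding them.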

It is also worth noting that this implementation was recently migrated, since it was originally based on the now deprecated, soon-to-be-dropped Simple API.

Happy to provide more details if need be.

JoaoPere · 2022-07-13T10:26:29Z

bump

svanteschubert · 2022-12-19T16:02:33Z

@JoaoPere We have uploaded a patch. Could you please pull, build, and test your scenario again?
Thanks in advance!
Svante

svanteschubert mentioned this issue Aug 1, 2022

OutOfMemory occurs on many OdfTable creations. #176

Closed

svanteschubert added this to the 0.11.0 milestone Dec 12, 2022

svanteschubert assigned mistmist Dec 12, 2022

svanteschubert linked a pull request Dec 19, 2022 that will close this issue

fix issue #176 #197

Merged

svanteschubert closed this as completed in #197 Dec 19, 2022
