Large memory consumption even after running Garbage Collection #173

Closed
JoaoPere opened this issue May 11, 2022 · 2 comments · Fixed by #197

JoaoPere commented May 11, 2022

The use case at hand is a Spring Boot Java service that produces an ODS file and returns it base64-encoded inside a JSON response, alongside other values. The generated files can be fairly large, so performance is a major concern, in terms of both time and memory. Performance tests show that memory consumption grows quickly and that the heap remains fairly large even after running the garbage collector.
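
For readers unfamiliar with this kind of endpoint, here is a minimal sketch of writing a file as a base64 string inside a JSON object without first buffering the whole encoded payload. It is not the service's actual code: the class name, the `fileName` and `content` field names, and the hand-written JSON are assumptions made for illustration, and it assumes Java 9+ for `InputStream.transferTo`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class Base64JsonSketch {

    /** Passes writes through but only flushes on close, so the target stays open. */
    static final class NonClosing extends FilterOutputStream {
        NonClosing(OutputStream out) { super(out); }
        @Override public void write(byte[] b, int off, int len) throws IOException {
            out.write(b, off, len);
        }
        @Override public void close() throws IOException { flush(); }
    }

    /** Writes {"fileName":"...","content":"<base64>"} to out, encoding chunk by chunk. */
    public static void writeJson(String fileName, InputStream file, OutputStream out)
            throws IOException {
        // Assumption: fileName needs no JSON escaping in this sketch.
        String head = "{\"fileName\":\"" + fileName + "\",\"content\":\"";
        out.write(head.getBytes(StandardCharsets.UTF_8));
        // Closing the encoder writes the final padding; NonClosing keeps `out` usable.
        OutputStream encoder = Base64.getEncoder().wrap(new NonClosing(out));
        file.transferTo(encoder);
        encoder.close();
        out.write("\"}".getBytes(StandardCharsets.UTF_8));
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        InputStream file = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        writeJson("report.ods", file, out);
        System.out.println(out.toString(StandardCharsets.UTF_8));
    }
}
```

Streaming the encoded bytes this way only reduces the memory held by the response itself; the DOM of the document being generated is still held in memory until it is released.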

Unfortunately, the source code responsible for creating the file is too extensive to share, but the problem seems to stem from the way table rows and table cells are populated. For time performance reasons, there is a base row and cell we use as a reference:

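// Template row and cell, taken from the first row of the sheet
OdfTableRow currentRow = sheet.getRowByIndex(0);
OdfTableCell currentCell = currentRow.getCellByIndex(colIndex);
TableTableRowElement baseTableTableRowElement = currentRow.getOdfElement();
// First child of the template cell, reused as the prototype for cell content
Node baseNode = currentCell.getOdfElement().getFirstChild();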

from which subsequent iterations are cloned:

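// Deep-clone the template row, including all of its cells
TableTableRowElement copyTableTableRowElement = (TableTableRowElement) baseTableTableRowElement.cloneNode(true);
NodeList childNodeList = copyTableTableRowElement.getChildNodes();
Element element = (Element) childNodeList.item(index);
// Shallow-clone the content prototype into the target cell
element.appendChild(baseNode.cloneNode(false));
// Append the new row as the last child of the template row's parent
baseTableTableRowElement.getParentNode().appendChild(copyTableTableRowElement);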
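
Since the full source cannot be shared, the following is a minimal, self-contained sketch of the same clone-from-template pattern using only the JDK's `org.w3c.dom` API. It does not use ODFDOM, and the element names (`table`, `row`, `cell`, `p`) and the `RowCloneSketch` class are stand-ins invented for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

final class RowCloneSketch {

    /** Builds a table with one template row and appends `copies` deep clones of it. */
    public static Element buildTable(int copies, int columns)
            throws ParserConfigurationException {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element table = doc.createElement("table");
        doc.appendChild(table);

        // Template row with empty cells (stand-in for row 0 of the sheet).
        Element baseRow = doc.createElement("row");
        for (int c = 0; c < columns; c++) {
            baseRow.appendChild(doc.createElement("cell"));
        }
        table.appendChild(baseRow);

        // Content prototype (stand-in for baseNode in the snippet above).
        Element baseContent = doc.createElement("p");

        for (int i = 0; i < copies; i++) {
            // cloneNode(true) copies the row together with all of its cells.
            Element copy = (Element) baseRow.cloneNode(true);
            for (Node cell = copy.getFirstChild(); cell != null; cell = cell.getNextSibling()) {
                // cloneNode(false) copies only the element itself, not its children.
                Element content = (Element) baseContent.cloneNode(false);
                content.setTextContent("row " + (i + 1));
                cell.appendChild(content);
            }
            table.appendChild(copy);
        }
        return table;
    }

    public static void main(String[] args) throws Exception {
        Element table = buildTable(3, 2);
        System.out.println("rows in table: " + table.getChildNodes().getLength());
    }
}
```

Every cloned row stays reachable for as long as anything references the owning document, so in this pattern the memory can only be reclaimed once the document itself becomes unreachable.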

All streams are correctly closed and the endpoint has returned successfully.

Since OdfToolkit (0.9.0) uses a DOM-based approach, high memory consumption is expected while the file is being created. The most concerning aspect, however, is that even when the garbage collector is triggered manually, only part of the heap is released and a fair chunk remains in memory. For instance, for a file generated with roughly 5k rows in each of 6 worksheets (30k rows in total), memory consumption goes as high as 1.8 GB and drops to 400 MB after running the GC. Subsequent runs with the same payload raise this to 800 MB, suggesting O(n) growth in retained memory across runs.
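
One way to reproduce this kind of measurement outside a profiler is sketched below: run the workload several times and record the used heap after requesting a collection each time. The `HeapProbe` class and the placeholder workload are hypothetical. Note that `MemoryMXBean.gc()` (like `System.gc()`) is only a request to the JVM, so the numbers are indicative rather than exact.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

final class HeapProbe {

    /** Requests a GC, then returns the used heap in bytes. */
    public static long usedHeapAfterGc() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        memory.gc(); // effectively equivalent to System.gc()
        return memory.getHeapMemoryUsage().getUsed();
    }

    /** Runs the workload `runs` times and records the used heap after each run. */
    public static long[] retainedAfterEachRun(Runnable workload, int runs) {
        long[] retained = new long[runs];
        for (int i = 0; i < runs; i++) {
            workload.run();
            retained[i] = usedHeapAfterGc();
        }
        return retained;
    }

    public static void main(String[] args) {
        // Hypothetical workload: replace with the code that generates one ODS file.
        Runnable workload = () -> {
            byte[] scratch = new byte[8 * 1024 * 1024];
            scratch[0] = 1;
        };
        long[] retained = retainedAfterEachRun(workload, 5);
        for (int i = 0; i < retained.length; i++) {
            System.out.printf("run %d: %d MB used after GC%n", i + 1, retained[i] / (1024 * 1024));
        }
    }
}
```

If the value after each run keeps climbing by a similar amount, something is still holding references to data from earlier runs; a heap dump taken at that point would show which objects are retained.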

It is also worth noting that this implementation was recently migrated, since it was originally based on the now-deprecated, soon-to-be-dropped Simple API.

Happy to provide more details if need be.

JoaoPere (Author) commented

bump

svanteschubert (Contributor) commented

@JoaoPere We have uploaded a patch. Could you please pull, build, and test your scenario again?
Thanks in advance!
Svante