Excessive memory usage when saving big XLWorkbook #264
Comments
@igitur the problem is the same as #86 because it's the cell repository that became huge - the suggestion about the cell styles as reference may apply to other properties as well. I used ClrProfiler and confirmed. |
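I copy my comment at #273 (comment) about the challenges of using a style repository: For this issue, yes, a central style repository would be one solution, as people have pointed out many times. However, this would require a significant change to the ClosedXML API, which I must thoroughly investigate. Take this sample code:
using (var wb = new XLWorkbook())
using (var ws = wb.Worksheets.Add("Sheet1"))
{
    var c1 = ws.Cell("C1");
    var c2 = ws.Cell("C2");
    c1.Style.Fill.BackgroundColor = XLColor.Red;
    c2.Style.Fill.BackgroundColor = XLColor.Blue;
}
With the current codebase, cell C1 would indeed be red and C2 would be blue, as is expected.
Now let's imagine we implement a style repository. That means c1.Style points to some object in a shared repository. c2.Style also points to an object in the repository. When you open a new spreadsheet, c1 and c2 would both have the default styles (i.e. no background colour). Both cells would have the same Style instance, or in other words Object.ReferenceEquals(c1.Style, c2.Style) would evaluate to true. That means, in the above code sample, when you set c2's background colour to XLColor.Blue, you would be changing this single instance's colour. Because c1 also uses that style, both C1 and C2 would have a blue background. This is unintuitive and would break a lot of existing codebases. I doubt I would allow that.
If we use a style repository, we would need some hack to do an implicit/invisible c2.Style = new Style() first when one of the Style instance's child properties is changed. I've been looking for solutions, but haven't found a pattern that solves this yet. If someone has a practical application (i.e. real code) that can solve this, please help. |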
It's not that hard.
https://msdn.microsoft.com/en-us/library/1yef90x0(v=vs.110).aspx
…On Thu, Apr 27, 2017, 9:14 AM Francois Botha ***@***.***> wrote:
I copy my comment at #273 (comment)
about the challenges of using a style repository:
For this issue, yes, a central style repository would be one solution, as
people have pointed out many times. However, this would require a
significant change to the ClosedXML API, which I must thoroughly
investigate.
Take this sample code:
using (var wb = new XLWorkbook())
using (var ws = wb.Worksheets.Add("Sheet1"))
{
    var c1 = ws.Cell("C1");
    var c2 = ws.Cell("C2");
    c1.Style.Fill.BackgroundColor = XLColor.Red;
    c2.Style.Fill.BackgroundColor = XLColor.Blue;
}
With the current codebase, cell C1 would indeed be red and C2 would be
blue, as is expected.
Now let's imagine we implement a style repository. That means c1.Style
points to some object in a shared repository. c2.Style also points to an
object in the repository. When you open a new spreadsheet, c1 and c2
would both have the default styles (i.e. no background colour). Both cells
would have the same Style instance, or in other words Object.ReferenceEquals(c1.Style,
c2.Style) would evaluate to true. That means, in the above code sample,
when you set c2's background colour to XLColor.Blue, you would be
changing this single instance's colour. Because c1 also uses that style,
both C1 and C2 would have a blue background. This is unintuitive and
would break a lot of existing codebases. I doubt I would allow that.
If we use a style repository, we would need some hack to do an
implicit/invisible c2.Style = new Style() first when one of the Style
instance's child properties is changed. I've been looking for solutions,
but haven't found a pattern that solves this yet. If someone has a
practical application (i.e. real code) that can solve this, please help.
|
I don't see how that link addresses the issue I mentioned above. |
Did you read it? Did you not see the part about cell style inheritance and
how they work through the logic of exactly what we want to do here?
|
Yes, I did read it. It still doesn't address my issue: a style repository doesn't work with the existing API. If you change the subproperty of a shared style, you'll change all cells that use that shared style. If you disagree, post a proof of concept piece of code. |
That is where you would specify that this is the expected behavior. The grid
article describes an event that is fired when that happens.
Notice that the grid has a default style and an actual style property, and
uses the inherited cell style. Let the user determine which style they want.
If they set Defaultstyle.backcolor, then apply that to all cells using that
style object.
The style inheritance is the key.
|
We would have workbook, worksheet, row, column, and cell layers.
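A minimal sketch of how such layered inheritance might resolve a cell's effective style. `StyleOverride`, `ResolvedStyle`, and `StyleResolver` are hypothetical names invented for illustration, not ClosedXML or DataGridView types, and the precedence order (workbook, worksheet, row, column, cell) is an assumption taken from the layer list above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration only; none of these exist in ClosedXML.
public sealed class StyleOverride
{
    public string BackColor { get; set; }   // null means "inherit from the layer above"
    public bool? Bold { get; set; }         // null means "inherit from the layer above"
}

public sealed class ResolvedStyle
{
    public string BackColor { get; set; } = "None";
    public bool Bold { get; set; }
}

public static class StyleResolver
{
    // Layers are ordered from most general (workbook) to most specific (cell).
    // A later layer overwrites only the properties it explicitly sets.
    public static ResolvedStyle Resolve(IEnumerable<StyleOverride> layers)
    {
        var result = new ResolvedStyle();
        foreach (var layer in layers)
        {
            if (layer == null) continue;    // this layer defines no overrides
            if (layer.BackColor != null) result.BackColor = layer.BackColor;
            if (layer.Bold.HasValue) result.Bold = layer.Bold.Value;
        }
        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        var workbook = new StyleOverride { BackColor = "White", Bold = false };
        var column = new StyleOverride { Bold = true };
        var cell = new StyleOverride { BackColor = "Red" };

        // Worksheet and row define nothing, so they are passed as null.
        var style = StyleResolver.Resolve(new[] { workbook, null, null, column, cell });
        Console.WriteLine($"{style.BackColor}, bold={style.Bold}");
    }
}
```

In this arrangement most cells would carry no override object at all, so the per-cell cost is paid only where a cell actually differs from its row, column, or sheet.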
|
@vbjay @igitur that was an example that we discussed, but I understand the implementation could be much simpler with just a cell default and a cell style. Both are picked from a style repository (with the cell default pre-populated). The cell object just holds a reference into the repository. When a style is set, the repository is updated or a new style is added (using a hash as the repository key). |
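The repository described above could be sketched roughly as follows. `StyleValue` and `StyleRepository` are hypothetical names and this is not how ClosedXML is implemented. The sketch assumes styles are immutable values, so that equality and the hash code can serve as the repository key.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical immutable style value; equality and hash use only read-only state.
public sealed class StyleValue : IEquatable<StyleValue>
{
    public string BackColor { get; }
    public bool Bold { get; }

    public StyleValue(string backColor, bool bold)
    {
        BackColor = backColor;
        Bold = bold;
    }

    public bool Equals(StyleValue other) =>
        other != null && BackColor == other.BackColor && Bold == other.Bold;

    public override bool Equals(object obj) => Equals(obj as StyleValue);

    public override int GetHashCode() =>
        unchecked(((BackColor?.GetHashCode() ?? 0) * 397) ^ Bold.GetHashCode());
}

// Hypothetical repository: one shared instance per distinct style value.
public sealed class StyleRepository
{
    private readonly Dictionary<StyleValue, StyleValue> _styles =
        new Dictionary<StyleValue, StyleValue>();

    public int Count => _styles.Count;

    // Returns the shared instance equal to 'candidate', adding it if it is new.
    public StyleValue GetOrAdd(StyleValue candidate)
    {
        if (_styles.TryGetValue(candidate, out var existing)) return existing;
        _styles.Add(candidate, candidate);
        return candidate;
    }
}

public static class Program
{
    public static void Main()
    {
        var repository = new StyleRepository();
        var defaultStyle = repository.GetOrAdd(new StyleValue("None", false));

        // 1000 cells, each holding only a reference into the repository.
        var cells = new StyleValue[1000];
        for (var i = 0; i < cells.Length; i++)
            cells[i] = repository.GetOrAdd(new StyleValue("None", false));

        // "Setting" a style on one cell swaps its reference; other cells are unaffected.
        cells[1] = repository.GetOrAdd(new StyleValue("Blue", false));

        Console.WriteLine(repository.Count);                        // 2
        Console.WriteLine(ReferenceEquals(cells[0], defaultStyle)); // True
        Console.WriteLine(cells[0].BackColor);                      // None
    }
}
```

The temporary `StyleValue` candidates are short-lived garbage; only the distinct styles stay alive, which is where the memory saving would come from.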
I know this would be a big change (maybe V2), but you could use a factory method approach with immutable styles. So in your example above, something like
Also I note that your implementations of GetHashCode() on XLStyle etc are suspect because they reference non-readonly properties. |
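The snippet originally posted with this comment is not preserved here. As a separate, hedged illustration, immutable styles might be combined with a thin facade so that the existing `cell.Style.X = value` syntax keeps working. `ImmutableStyle`, `StyleFacade`, and `Cell` are hypothetical names, not ClosedXML types, and the sketch leaves out pooling of equal styles for brevity.

```csharp
using System;

// Hypothetical immutable style: "modifying" it yields a different instance.
public sealed class ImmutableStyle
{
    public static readonly ImmutableStyle Default = new ImmutableStyle("None", false);

    public string BackColor { get; }
    public bool Bold { get; }

    public ImmutableStyle(string backColor, bool bold)
    {
        BackColor = backColor;
        Bold = bold;
    }

    // Factory-style methods: the receiver is never changed.
    public ImmutableStyle WithBackColor(string backColor) => new ImmutableStyle(backColor, Bold);
    public ImmutableStyle WithBold(bool bold) => new ImmutableStyle(BackColor, bold);
}

public sealed class Cell
{
    // The only per-cell style state is a reference to a shared immutable value.
    internal ImmutableStyle StyleValue { get; set; } = ImmutableStyle.Default;

    // Each access returns a thin facade bound to this cell.
    public StyleFacade Style => new StyleFacade(this);
}

// Mutable-looking view over the owner's immutable style.
public sealed class StyleFacade
{
    private readonly Cell _owner;

    internal StyleFacade(Cell owner) { _owner = owner; }

    public string BackColor
    {
        get => _owner.StyleValue.BackColor;
        // The setter swaps the owner's reference; the shared instance stays untouched.
        set => _owner.StyleValue = _owner.StyleValue.WithBackColor(value);
    }
}

public static class Program
{
    public static void Main()
    {
        var c1 = new Cell();
        var c2 = new Cell();

        c1.Style.BackColor = "Red";
        c2.Style.BackColor = "Blue";

        Console.WriteLine(c1.Style.BackColor); // Red
        Console.WriteLine(c2.Style.BackColor); // Blue
    }
}
```

Because `ImmutableStyle` has only read-only properties, a `GetHashCode()` computed from them would stay stable for the object's lifetime, which avoids the concern about hashing non-readonly properties. One trade-off in this sketch is that `Object.ReferenceEquals(c1.Style, c2.Style)` is always false, since every access creates a new facade.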
Anything I can test or examples I can provide for this issue? The workbooks are approximately 8 MB in size when saved as xlsx files and contain 50,000 rows with 53 columns. To add the data to the sheet I am using:
This causes the memory to spike up to 600 MB before finishing, and then:
causes an additional spike that keeps increasing to 2 GB+ until I receive an OutOfMemoryException. |
@ghronkrepps Have you tried disabling workbook events? |
@igitur Yes I have that option set when I create the workbook. I have not seen a difference in memory consumption with or without it. |
I noticed that I don't receive the out of memory exception until the line executed after saving. So I was able to force a garbage collection
And was able to continue with no problems. |
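The exact call used is not shown in the thread. A common way to force a collection in .NET, sketched here with a large byte array standing in for the workbook (an assumption made so the example is self-contained), is:

```csharp
using System;

public static class Program
{
    public static void Main()
    {
        // Stand-in for the workbook: a large allocation that is no longer needed after saving.
        var big = new byte[200 * 1024 * 1024];
        big[0] = 1;
        Console.WriteLine($"Before: {GC.GetTotalMemory(false) / (1024 * 1024)} MB");

        // Drop the last reference (with a real workbook, leave its using block first).
        big = null;

        // Collect, let finalizers run, then collect what the finalizers released.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        Console.WriteLine($"After:  {GC.GetTotalMemory(false) / (1024 * 1024)} MB");
    }
}
```

Forcing a collection only helps if nothing still references the workbook, and it works around the symptom rather than reducing how much memory is needed while saving.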
Have you looked at the Flyweight Pattern? |
I think I found a way to improve the situation with style handling. I managed to implement a POC where XLFont instances are interned. I tested it against the file attached to #607, and had the following results:
The underlying idea is the following:
So, currently, we have 1000 XLCell-s, and they reference 1000 XLStyle-s, 1000 XLBorder-s, 1000 XLFont-s, 1000 XLAlignment-s, 1000 XLFill-s and so on. You can look at the code here: Pankraty@374d105. It is not for merging, of course, but if you guys approve the approach I can proceed with the implementation. |
https://www.nuget.org/packages/ClosedXML/0.93.0-beta2 has been released and should address this issue. Closing this issue. Please address other concerns in new issues. |
Continued from #86
For large workbooks, ClosedXML consumes a large amount of memory, and sometimes this leads to an OutOfMemoryException.
This is a known issue. For every cell that is used or addressed, a full object graph containing the cell styles, formatting, etc. is created. This is quite expensive.