TcSymbolUseData[] accounting for 19.1 MB on the LOH #6084
The way things are set up, we have to save some information after checking a file, and the information we save is a dump of all the symbol resolutions. For a medium-sized file I'd imagine this can easily be 400K. The information is not particularly long-lived: the next check of the file will make it irrelevant, and it will then be collected.

I think this architecture is OK. If we were trying to reduce the amount of re-computed information, we would do that by making all of checking more incremental, from parsing through to type-checking. So it's not a priori wrong to allocate this information, even in one linear array (note that pre-indexing the information or attempting to compress it is a waste of time, since it is very often discarded).

It does, however, feel odd if the CLR is somehow assuming it is long-lived. Is the CLR applying the (in this case false) heuristic that all LOH objects are long-lived? If so, I suppose we could artificially chunk the array, though it feels, well, artificial. What's the minimum size for an object to land on the LOH?

[Aside: TBH much of this memory work would feel easier if the CLR just had an intrinsic allocation method that said "this object is big but transient", or magically worked that out. From a GC perspective there should be no real problem with allocating short-lived big objects, apart from the data-copying costs involved, as the data should just evaporate immediately on the next collection.]
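A minimal sketch, assuming a 64-bit .NET runtime, of how to check this with GC.GetGeneration (the 85,000-byte threshold and the generation behavior are documented CLR facts; the specific array sizes are just illustrative):

```fsharp
// Arrays at or above the 85,000-byte threshold are allocated on the LOH,
// and GC.GetGeneration reports them as generation 2 from birth - so the GC
// does in effect treat them as long-lived.
let small = Array.zeroCreate<byte> 80_000   // under the threshold: normal heap
let large = Array.zeroCreate<byte> 90_000   // over the threshold: LOH

printfn "small: gen %d" (System.GC.GetGeneration(small))   // prints 0
printfn "large: gen %d" (System.GC.GetGeneration(large))   // prints 2
```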
BTW this is the definition of
Looking at this
For the others:
To answer my question:
So we should chunk this thing, I guess. Ugh, how artificial. Fighting the memory manager is a PITA.
If someone wants to look at this, it should be fairly easy to chunk the array allocated here based on
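A hypothetical sketch of what that chunking could look like. Array.chunkBySize is from FSharp.Core; the per-element size would need to come from measuring the real TcSymbolUseData layout, so the 100 bytes in the usage comment is an illustrative assumption:

```fsharp
// Split one big array into chunks that each stay under the 85,000-byte LOH
// threshold, so every chunk is allocated on the normal generational heap.
let chunkBelowLoh (elementBytes: int) (items: 'T[]) : 'T[][] =
    let lohThresholdBytes = 85_000
    let chunkLen = max 1 ((lohThresholdBytes - 1) / elementBytes)
    items |> Array.chunkBySize chunkLen

// e.g. with ~100-byte elements, chunks of 849 items (~84,900 bytes) stay off the LOH:
// let chunks = chunkBelowLoh 100 allSymbolUses
```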
(I must add: it is great to see this LOH-analysis work beginning to home in on the actual data we need to save, rather than data that should never have been allocated/copied in the first place :) )
I might be off-base here, but the
@baronfel Yes, that's correct (except that CapturedNameResolution is not a large struct, so there will be less on the LOH).
Ah, I see the caveat there now. Because CapturedNameResolution contains less data, we can have more array elements before we hit the 85,000-byte limit, so we may not hit the LOH for that structure at all. Thanks for clarifying.
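For concreteness, the arithmetic behind that caveat (the per-element sizes here are illustrative assumptions, not measured layouts):

```fsharp
// Elements an array can hold before its total size reaches the 85,000-byte
// LOH threshold (ignoring the small array header for simplicity).
let maxElementsBeforeLoh elementBytes = 85_000 / elementBytes

printfn "%d" (maxElementsBeforeLoh 100)  // a hypothetical 100-byte struct: 850 elements
printfn "%d" (maxElementsBeforeLoh 8)    // 8-byte references on 64-bit: 10,625 elements
```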
FYI there is excellent information in our docs about this: https://docs.microsoft.com/en-us/dotnet/standard/garbage-collection/large-object-heap |
This feels like a bit of a design smell to me. If we need to allocate ~390k of data very often (in this case, every 3-4 seconds), then it implies something needs to be cached. I was only working with two files and typing in one, and only adding a single function. I don't see why that requires
But relatively speaking, this is far less pressing than other issues filed.
Yes. The smell is really just a whiff of the much bigger stench of re-checking entire files very frequently, even on small changes. We could in theory pool the arrays to avoid the reallocation. However, that must be done with real care, as the arrays contain references. And, putting aside the LOH, I'm certain that the costs of actually doing the re-checking are much higher than the cost of allocating and filling this specific data structure.
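A minimal sketch of what pooling could look like, assuming System.Buffers.ArrayPool is acceptable in this code path. The crucial detail is clearing on return; a pooled array still full of references would keep the referenced objects alive:

```fsharp
open System.Buffers

// Rent a reusable array instead of allocating a fresh one per check.
// The rented array may be longer than requested, so callers must track
// the number of elements they actually filled.
let withPooledBuffer (size: int) (action: obj[] -> 'a) : 'a =
    let pool = ArrayPool<obj>.Shared
    let buffer = pool.Rent size
    try
        action buffer
    finally
        // clearArray = true wipes the references so the pool does not root them
        pool.Return(buffer, clearArray = true)
```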
Thanks, yes, that's great up-to-date info |
This reverts commit 7584974.
This was after opening VS (dev16.0, with the built-in VSIX) for about three minutes, slowly implementing a function in FSharp.Editor that uses FCS and Roslyn types. I say slowly because I code slowly 🙂
This is the TcSymbolUses ctor getting called 49 times, which is ~390k on the LOH each time. It is called by TypeCheckTask in the incremental builder: https://github.com/Microsoft/visualfsharp/blob/631401a1cc7c487a6b27486242ac7df3e77bd495/src/fsharp/service/IncrementalBuild.fs#L1378
Since in my scenario the large majority of these symbols are the same (in fact, they would probably be mostly the same in nearly every scenario), I wonder if there's an opportunity to cache them.
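A hypothetical sketch of that caching idea, not the compiler's actual design; the choice of key, the equality it relies on, and how stale entries get evicted between re-checks are all open questions:

```fsharp
open System.Collections.Concurrent

// Interning table: a repeated key returns the previously built value
// instead of allocating a fresh one on every re-check.
type Interner<'Key, 'Value when 'Key : equality>() =
    let table = ConcurrentDictionary<'Key, 'Value>(HashIdentity.Structural)
    member _.GetOrCreate(key: 'Key, make: 'Key -> 'Value) : 'Value =
        table.GetOrAdd(key, fun k -> make k)

// e.g. (names are hypothetical):
// let symbolUses = interner.GetOrCreate(itemKey, buildSymbolUseData)
```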