WPF TreeView, virtualization, scroll, freeze #1962

Closed
vsfeedback opened this issue Sep 26, 2019 · 38 comments
Labels
area-VirtualizingStackPanel Bug Product bug (most likely) .NET Framework netfx-servicing-approved Netfx Approved for Servicing

Comments

@vsfeedback

This issue has been moved from a ticket on Developer Community.


Using VirtualizingStackPanel can freeze the execution.
Steps to reproduce:

  1. Build a WPF application (Release configuration) with a TreeView that uses VirtualizingStackPanel and has its nodes expanded,
  2. start the exe,
  3. scroll to the end of the tree,
  4. click on the scroll bar at the top and keep the left mouse button pressed.
    The application freezes.

To reproduce, try the following demo:

' XAML:

<Window x:Class="MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
        xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
        xmlns:local="clr-namespace:WpfApp1"
        mc:Ignorable="d"
        Title="MainWindow" Height="450" Width="800">
    <Window.DataContext>
        <local:ViewModel/>
    </Window.DataContext>
    <Grid>
        <TreeView ItemsSource="{Binding View}"
                  VirtualizingPanel.IsVirtualizing="True"
                  VirtualizingPanel.VirtualizationMode="Recycling"
                  VirtualizingPanel.ScrollUnit="Item">
            <TreeView.Resources>
                <HierarchicalDataTemplate DataType="{x:Type local:Data}" ItemsSource="{Binding Childs}">
                    <Label Content="{Binding Info}"/>
                </HierarchicalDataTemplate>
            </TreeView.Resources>
            <TreeView.ItemContainerStyle>
                <Style TargetType="{x:Type TreeViewItem}">
                    <Setter Property="IsExpanded" Value="True"/>
                </Style>
            </TreeView.ItemContainerStyle>
        </TreeView>
    </Grid>
</Window>

' ViewModel:

Imports System.Collections.ObjectModel
Imports System.ComponentModel

Public Class ViewModel

    Public Sub New()
        For i = 1 To 100
            Dim d As New Data With {.Info = $"Node {i:000}"}
            For k = 1 To 10
                Dim c As New Data With {.Info = $"Child {i:000} {k:000}"}
                For l = 1 To 10
                    c.Childs.Add(New Data With {.Info = $"Grandchild {i:000} {k:000} {l:000}"})
                Next
                d.Childs.Add(c)
            Next
            col.Add(d)
        Next
        cvs.Source = col
    End Sub

    Private col As New ObservableCollection(Of Data)
    Private cvs As New CollectionViewSource

    Public ReadOnly Property View As ICollectionView
        Get
            Return cvs.View
        End Get
    End Property

End Class

' Data class

Imports System.Collections.ObjectModel

Public Class Data
    Public Property Info As String
    Public Property Childs As New ObservableCollection(Of Data)
End Class


Original Comments

(no comments)


Original Solutions

(no solutions)

@grubioe grubioe self-assigned this Oct 1, 2019
@grubioe grubioe added Bug Product bug (most likely) 📭 waiting-author-feedback To request more information from author. labels Oct 4, 2019
@grubioe
Contributor

grubioe commented Oct 4, 2019

Can you provide what version of Windows and .NET you are running? Thanks

@AndriyGlu

Reproduced on Windows 10 and .NET 4.7.2.

@AndriyGlu

Sample application on C# with this bug:

https://drive.google.com/drive/folders/1-LFKy9wnAEpVB1X4Gsz2vVA5DqyDeAq4

@grubioe grubioe added area-netfx and removed 📭 waiting-author-feedback To request more information from author. labels Oct 8, 2019
@iihnat

iihnat commented Oct 11, 2019

Andriy,

We have a similar problem.
Try VirtualizingPanel.VirtualizationMode="Standard". Scrolling becomes a bit slower this way, but at least the application does not freeze in our case.

@AndriyGlu

AndriyGlu commented Oct 11, 2019

iihnat,

Yes, I tried this but it didn't work. I also tried ScrollUnit="Pixel", but that didn't work either.

@grubioe grubioe assigned grubioe and unassigned grubioe Oct 16, 2019
@grubioe
Contributor

grubioe commented Oct 18, 2019

@AndriyGlu to help with prioritization, can you provide more details on the business scenario where this issue surfaces? What is the application, how many users does it have, and what is the business impact of this issue? Thank you

@AndriyGlu

@grubioe Our program is a special encyclopedia. It has about 300 thousand users. Users report application freezes while scrolling a virtualized ListView.

@grubioe
Contributor

grubioe commented Oct 21, 2019

Thanks @AndriyGlu - what is the name of the encyclopedia product? Can you email me more specific information, the email is available from my GitHub profile.

@grubioe grubioe added this to the 5.0 milestone Oct 24, 2019
@grubioe grubioe assigned SamBent and unassigned grubioe Oct 24, 2019
@edtheprogrammerguy

Also becoming an issue with our customers. Please provide an update when it might be fixed.
Thanks!

@Asser82

Asser82 commented Oct 31, 2019

I have this issue also. I just wanted to use a virtualized tree view to display big hierarchies in our application, only to stumble upon freezes. Initially I could not pin down the exact reproduction steps. With the steps above it is always reproducible. We are a company that provides hardware and software for industry automation.

In our prototype the freezes occur if we use UseLayoutRounding="True" on the tree; otherwise the application does not freeze, but the node icons are blurry. Setting UseLayoutRounding to true in the TreeViewItem style, instead of applying it to the tree, immediately kills the ability to scroll down.

@SamBent
Contributor

SamBent commented Oct 31, 2019

About the process: We are investigating - assuming we find a fix, it would appear in .NETCore 5.0. For those of you using .NET Framework, backporting a fix to .NetFx (as a servicing update) doesn't happen by default. It would be more likely to happen if (a) someone opens a servicing request through MS Customer Support, and/or (b) there's evidence this is a recent regression, e.g. it works in .NET 4.7.2 but fails in .NET 4.8.

About the bug itself: We have fixed many hangs since .NET 4.5. Unfortunately they all look the same to the usual scrutiny - callstacks, memory dumps, ETW traces. And they tend to be highly dependent on the exact history of scrolling and virtualization, the exact sizes of the UI elements, and factors that influence all that (theme, templates, styles, UseLayoutRounding, DPI, etc.). This makes them hard to diagnose, but more importantly it makes it impossible to assert that a fix for one hang will fix other hangs; they may arise for completely different reasons.

The best way to help my investigation is to send me self-contained projects (as @AndriyGlu has done). Make them as small as possible - it's usually possible to distill it down to the offending TreeView (or ListBox/DataGrid/...) with the supporting templates and styles, and some fake data that has the same size and shape as your real data.

The second-best way to help is to enable the telemetry that's built into .NET 4.6+, as follows:

  1. Find the name of the offending control: <TreeView x:Name="myTreeView">
  2. Edit the app-config file: MyApp.exe.config (located next to MyApp.exe)
    a. add an appSettings entry: <add key="ScrollingTraceTarget" value="myTreeView"/>
    b. add surrounding tags, if necessary: <configuration> <appSettings> <add.../> </appSettings> </configuration>
  3. Run your app, get it to hang, break into the debugger, create a full-memory dump.
  4. The telemetry creates one or more files in your app's directory, with names like ScrollTrace1.stf.
  5. Send me the STF files (STF = Scroll Trace File), and the matching process dump. Also send me a description of what you did to produce the hang, as detailed as possible. What scrolling you did, how you did it (e.g. arrow keys vs. drag scroll thumb vs. click LineDown button vs. click in scroll gutter...), what the app did - anything that might be relevant.
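
As a rough sketch of steps 1 and 2, assuming the control is named myTreeView and the application is MyApp.exe as in the examples above, the complete app-config file might look like this:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- MyApp.exe.config, located next to MyApp.exe -->
<configuration>
  <appSettings>
    <!-- value must match the x:Name of the control to trace -->
    <add key="ScrollingTraceTarget" value="myTreeView"/>
  </appSettings>
</configuration>
```

If the file already contains configuration and appSettings elements, only the add entry needs to be placed inside the existing appSettings element.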

[Privacy disclaimer: The STF files contain the "exact history" in a binary format. Lots of numbers - sizes and counts of UI elements, viewport offsets, virtualization decisions - but nothing from your actual data items, so no PII. We use the dump file under the standard privacy policy: used only for diagnosing the bug then deleted, no extracting data for any other purpose.]

I encourage @edtheprogrammerguy, @Asser82, and anyone else whose app is hanging to follow up as above, unless you are 100% sure your hang arises for the same reason as the one from @AndriyGlu.

@gpeter

gpeter commented Nov 1, 2019

AndriyGlu: thanks so much for creating this demo. I have had this problem with TreeView and virtualization for years. It's very frustrating that it can't ever seem to get fixed. Fortunately it doesn't happen often, but when it does I hate to tell clients that I can't fix it (I do blame it on Microsoft).

SamBent: thanks for your post, and I'm glad you are working on it. However, why put the onus on "someone" to ensure this gets fixed in the .NET Framework? I tried AndriyGlu's demo built for 4.5.1, 4.7.1, 4.7.2, and 4.8 and the problem clearly exists in all of them. It needs to be fixed there too.

@Asser82

Asser82 commented Nov 1, 2019

I have switched my prototyping efforts to a third party tree view from Telerik, because I cannot wait for months for something to happen. The virtualization seems to work there. They also need their time to clean up virtualized containers on collapsing two million tree nodes at once, but this will be the case with every implementation, I think.

@SamBent
Contributor

SamBent commented Nov 1, 2019

@gpeter: Sorry you've been frustrated for years. Did you ever report the problem to MS? It never got to me - I first heard about it a month ago, when AndriyGlu's issue arrived.

The onus is not about fixing the issue, but about convincing MS that it's worth shipping the fix in a servicing update. That's a big deal, only a few select fixes get that treatment, and only if the benefit outweighs the cost. So it helps if customers go through CSS, provide business justification, document the benefit, etc. The servicing people won't take a fix for a bug that has been around since 4.5.1 without anyone reporting it. (BTW, it sounds like you merely rebuilt AndriyGlu's demo for 4.5.1. It matters more which version of .NET is installed on the machine where you ran it. Was 4.5.1 installed?)

Requesting servicing through the official channel also helps in getting my attention. The main reason I haven't yet diagnosed AndriyGlu's demo is that I've been working on "official" bugs, which take priority over almost everything else. There are 1000s of bugs, and only 1 of me :).

@gpeter

gpeter commented Nov 2, 2019 via email

@Asser82

Asser82 commented Nov 2, 2019

@gpeter I hope you did not see this problem in lists and combo boxes.

I think the problem with the TreeView is not reported that much because the control is delivered with a non-virtualizing panel by default. In our company, when we started with WPF, we evaluated the TreeView and found it unusable performance-wise. We did not know that the virtualizing stack panel could be inserted. So we created a tree control based on ListBox, where the tree logic was implemented in the view model. But this approach does not scale well with a large number of nodes, because of the many linear searches for insert/remove positions.

What we basically want is a toolkit that allows us to build a 'Solution Explorer' like the one in Visual Studio. We burn many resources to create a fast tree that can deal with many thousands of nodes, can be filtered on a worker thread, supports collapse all, and lets selected items easily be scrolled into view once the filter is removed. I think this is what 80% of applications with something like a project view need. Everybody has to reinvent the wheel there.

@gpeter

gpeter commented Nov 4, 2019

@Asser82 Indeed I have only ever seen the problem with TreeView. We do use virtualization with the ListView control (based on ListBox) where we have displayed up to maybe 100,000 items, but have never observed this issue there.

@SamBent
Contributor

SamBent commented Nov 4, 2019

@gpeter No offense taken - I'm only frustrated that the problem has existed so long without my knowledge. It's definitely the kind of thing we would have fixed had we known, and I'll certainly lobby for it - the "official channel" support just increases the odds.

Yes, you do need to uninstall later versions and re-install earlier ones to test fairly. When you run an app that targets version N on a machine where N+1 is installed, you're running the version N+1 .NET code. There are a small number of places where we preserve the earlier code for compat reasons (to protect you from known breaking changes), but virtualization/scrolling isn't one of them. You are getting the benefit of bugfixes in version N+1. My concern is whether a bugfix actually made things worse in your scenario; I doubt that happened, but I have to ask.

What do you mean by "screen scaling"? Is it what we call "high DPI", where your monitor renders at 120% (or 150%, 200%, ...) of normal size? Or what we call "Per-Monitor Aware DPI (PMA)", where the app tells the OS how to deal with multiple monitors each with a different DPI? Either one could contribute to the problem, especially if you have UseLayoutRounding or SnapsToDevicePixels set.

Historically the problems that have arisen in TreeView also affect ListBox/ListView/DataGrid, but only when they have set IsVirtualizingWhenGrouping and use grouping in the underlying CollectionView. A TreeView displaying 4 levels of hierarchical data and a ListBox displaying data with 4 levels of grouping use the same codepaths in VirtualizingStackPanel.
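
For readers unsure whether their ListBox falls into this category, here is a minimal sketch of the kind of setup being described. The GroupedView binding is a hypothetical name standing in for a CollectionView that has GroupDescriptions defined; the header template is only illustrative.

```xml
<!-- A ListBox that virtualizes while grouping, and therefore shares
     the VirtualizingStackPanel codepaths used by a virtualized TreeView -->
<ListBox ItemsSource="{Binding GroupedView}"
         VirtualizingPanel.IsVirtualizing="True"
         VirtualizingPanel.IsVirtualizingWhenGrouping="True">
    <ListBox.GroupStyle>
        <GroupStyle>
            <GroupStyle.HeaderTemplate>
                <DataTemplate>
                    <!-- Name is the group's key, exposed by CollectionViewGroup -->
                    <TextBlock Text="{Binding Name}"/>
                </DataTemplate>
            </GroupStyle.HeaderTemplate>
        </GroupStyle>
    </ListBox.GroupStyle>
</ListBox>
```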

@edtheprogrammerguy

edtheprogrammerguy commented Nov 4, 2019

Hi,
I used the "Expand Test" app posted by @AndriyGlu above and created the .stf and .dmp files that @SamBent asked for. Very easy to duplicate. Steps:

-start app
-click on "Fill"
-click on "Expand All"
-grab scroll "pill"
-pull it to the bottom
-click and hold with mouse in "page up" area of scroll bar
-app locks up solid within a few seconds.

Thanks for your help!!

@SamBent
Contributor

SamBent commented Nov 7, 2019

I've been experimenting with ExpandTest. I found a completely deterministic repro - no randomization, no dragging the scroll thumb, no holding down the mouse for "a few seconds". It builds a tree with 101343 nodes in four levels (57 top-level nodes). Right-click the scrollbar and pick "Bottom" to scroll to the end. Then click (not hold) in the "page up" gutter 7 times - the last one freezes. It correctly shows node 56.9 (9th level-2 child of 56th top-level node) at the top of the viewport, but gets stuck in measure trying to fill in the cache.

  • works with the 56th subtree (a 3-level tree), despite the scrolling being exactly the same
  • works when no cache (VirtualizingPanel.CacheLength="0")
  • works on Win7
  • works when run under VS 2019. More precisely, when "Tools/Options/Debugging/General/Enable UI Debugging Tools for XAML/Show runtime tools in application" is enabled
  • no use of UseLayoutRounding, SnapsToDevicePixels
  • DPI is standard

All this tells me that it's relevant that you're scrolling backwards to a level-2 node, there are 4 levels, caching is active, Aero2 theme is active, and no one is injecting extra UI. And that UseLayoutRounding, and DPI are not relevant. The exact size and shape of the tree may also be relevant, perhaps especially the shape of the 56.8 subtree, which is where the cache would be getting filled from.

No fix yet, but all the above helps direct the investigation. And it possibly helps explain why this has escaped our notice - those are narrow circumstances.
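
For anyone who wants to check whether the cache is a factor in their own hang, the "no cache" observation above can be tried with a one-attribute change. This is only a diagnostic sketch based on the TreeView from the original report; it is not confirmed to avoid hangs in other applications, and turning the cache off may make scrolling less smooth.

```xml
<!-- Same TreeView settings as the original repro, with the cache disabled -->
<TreeView ItemsSource="{Binding View}"
          VirtualizingPanel.IsVirtualizing="True"
          VirtualizingPanel.VirtualizationMode="Recycling"
          VirtualizingPanel.ScrollUnit="Item"
          VirtualizingPanel.CacheLength="0">
    <!-- templates and ItemContainerStyle unchanged from the original demo -->
</TreeView>
```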

@rockerinthelocker

Thanks @SamBent for trying to finally fix something that should have been fixed 10+ years ago. At least, that is how long developers beg Microsoft to fix it. Apparently, handling more than just a few hundred expanded nodes is something the TreeView control was never designed for. Even a couple of thousand TreeViewItems already cause slow page up/down scrolling; and beyond 15k+, chances are that the application stops responding when moving the scroll thumb up/down. Anyway, thanks for sharing your findings and good luck!

@SamBent
Contributor

SamBent commented Nov 15, 2019

Using ExpandTest, I've found two independent problems that both lead to hangs:

  1. After scrolling back to a mid-level node (like 56.9 in my previous post), we calculate its offset and compare it to the expected offset. The first calculation can be off by one due to floating-point catastrophic cancellation; if so the mismatch triggers a re-measure.

  2. We do some bookkeeping after measuring each node to revise/refine the estimates for how large its children are. One step is happening when it shouldn't - when measuring to fill the "before" cache. Usually that's not a problem as we measure the normal way and replace the bad data before it causes harm. But rarely it survives long enough to cause another mismatch as in (1).

(1) seems to be the more likely - it's the first one I found, and is also the villain in the trace posted by @edtheprogrammerguy. It needs a big tree, which is why it didn't happen with the 3-level tree (which is two orders of magnitude smaller than the 4-level tree). And it needs floating-point calculations to produce just the wrong kind of error, so it looks "random" to us humans.

(2) only showed up for my de-randomized version of ExpandTest after holding down PageUp for a couple of minutes, scrolling nearly halfway up the list. It had a different symptom as well: when it hung the treeview displayed only two or three nodes - the rest was blank.

I have fixes for both problems, and I don't see any others after hammering on ExpandTest pretty hard. We have test cases that do the same kind of hammering, but I guess they simply didn't use a large enough data set, or didn't run long enough to catch these problems.

I'll be putting the fixes into .NETCore 5.0. For .NET Framework, it would help if folks could send me links to the complaints about TreeView (@gpeter "I have googled this issue extensively over the years and have seen many others have the same problem", @rockerinthelocker "should have been fixed 10+ years ago. At least, that is how long developers beg Microsoft to fix it"). As I said earlier, there are no such reports in our internal bug databases (other than the ones we fixed in 4.5.1 - 4.7.2), and I can't find any on the internet - maybe I'm searching for the wrong terms.

I did find complaints about TreeView perf, but they all boiled down to virtualization being off, as @Asser82 mentioned earlier. (@rockerinthelocker - I wonder if that explains your experience?) That's for compat; .NET 4.0 only supported TreeView virtualization when ScrollUnit=Item, so when support for virtualized pixel-scrolling came along in .NET 4.5 we had to preserve the older behavior as the default. It's easy to turn it on: VirtualizingPanel.IsVirtualizing="true".
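
A minimal sketch of what turning it on looks like in XAML; the Items binding is a hypothetical placeholder for your own hierarchical data source.

```xml
<!-- Virtualization is off by default for TreeView; this opts in.
     VirtualizationMode is optional and defaults to Standard. -->
<TreeView ItemsSource="{Binding Items}"
          VirtualizingPanel.IsVirtualizing="True"/>
```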

@SamBent
Contributor

SamBent commented Nov 15, 2019

None of those mention hangs/freezes. And none of them seem to be really about WPF TreeView bugs:

  1. resolved as user's bad design - rebuilding UI on every keystroke
  2. enable virtualization
  3. enable virtualization
  4. talks about TreeListView (not TreeView). That's a 3rd-party control.
  5. Silverlight (not WPF)

Sorry, but these won't help the cause.

@gpeter

gpeter commented Nov 18, 2019 via email

@SamBent
Contributor

SamBent commented Nov 18, 2019

@gpeter Thanks for the links. Here's what I know about them:

https://weblog.west-wind.com/posts/2019/Feb/14/WPF-Hanging-in-Infinite-Rendering-Loop
This is a hang in layout of Grid (while allocating space to *-columns), completely independent of the current TreeView/virtualization hang. It's just a coincidence that it also involves floating-point precision, and that my name shows up. It's fixed in 4.8 (and in updates to 4.7.x).

KirillOsenkov/MSBuildStructuredLog#21
The original problem cited was fixed in 2013. The problem that "kikootwo" added earlier this year might be the current TreeView hang, but he hasn't sent any data back that would confirm that.

https://social.msdn.microsoft.com/Forums/vstudio/en-US/31ea27c0-32bf-4fae-a806-204f06c198b8/treeview-and-isvirtualizing-gt-scrolling-bug-?forum=wpf
This refers to 3.5sp1. The scrolling/virtualization code was completely reworked in 4.5, so this report is probably not relevant. It seems to have involved selection, which the current hang doesn't. None of the links work any more, so I can't follow up.

You have a valid point about MS sites that don't work any more. They don't work for me either.
That's partly why I'm asking the public for help.

Have you been able to get a ScrollTrace.stf for your problem? I don't think we've confirmed that it's the same as ExpandTest, so even if my fix gets into 4.8 it might not solve your problem.

@gpeter

gpeter commented Nov 18, 2019

@SamBent I've been trying to reproduce this error again and find I can no longer reproduce it in our program. I got the latest bug report from a client on 1 Nov and I was able to reproduce it then, but now I cannot. Our client is continually adding nodes and I think this bug occurs with specific tree "shapes", and it seems the shape that triggered the bug no longer exists. When I found AndriyGlu's post I immediately tried his program and I noted that I could see very similar stack traces to our program (after the freeze, randomly pausing execution under the debugger and observing the stack), which convinced me this was the same problem. If I see this problem again I will run the trace you have asked for. Sorry I didn't get it when I had the chance.

@SamBent
Contributor

SamBent commented Nov 18, 2019

It's worth repeating: The stack traces for TreeView virtualization/scrolling hang bugs are all the same, but the root causes might be different. For example, the stack traces for the "two independent problems" (see my post of 14 Nov) were identical, although the root causes had little to do with each other.
For everyone who's following this, if your stack traces look like ExpandTest I can't be sure that my fixes will help you, without looking much deeper. Send me your .stf files.

@SamBent
Contributor

SamBent commented Nov 20, 2019

Good news: .NET servicing agreed to take the fix. It will take a while to assemble, test, approve, and sign the packages for Windows Update, but an update for .NET 4.6 - 4.8 should appear early next year. I'll link the announcement here when it happens. I'll also get the fix into .NETCore.

@gpeter

gpeter commented Nov 20, 2019

Great! Thanks so much for solving this vexing problem.

@edtheprogrammerguy

Thank you. Thank you. Thank you!

@SamBent SamBent added the netfx-servicing-approved Netfx Approved for Servicing label Nov 21, 2019
@ghost ghost removed this from the 5.0 milestone Nov 25, 2019
@ghost

ghost commented Nov 25, 2019

@dotnet/wpf-developers, It's time to give an update to the community.

@ju2pom

ju2pom commented Jan 7, 2020

Hi @SamBent,
First thank you for investing time and energy in this community issue.
We have the same symptom in one of our internal applications (>1000 users), this time using a ListView with virtualization and grouping (groups can be expanded/collapsed). I have created an stf file and a memory dump (>2 GB). How can I send them to you?
Would it be possible to test a "beta" or unofficial build of .NET Framework to check whether your recent fix is effective in our scenario?

@SamBent
Contributor

SamBent commented Jan 7, 2020

@ju2pom The best thing to do is open a case with Microsoft Customer Support. That will create a workspace in which we can trade large files securely and privately. If it's our bug (as this surely is), you don't get charged. Include the following information to get it routed in the right direction:
Product: .NET Framework 4.x
Category: Class Library Namespaces\System.Windows (WPF)

For the preview fix, you'll want to prepare a test machine that you can afford to rebuild; the private installation might leave your machine in a non-serviceable state (where Windows Update can't install new patches). And I'll need to know which version of .NET is installed, and the bitness (x86 v. x64).

As you probably already knew, scrolling in ListView + grouping (with expand/collapse) is handled by the same code as TreeView. So my fix will resolve your problem, if it's one of the "two independent issues" described earlier.

@ghost

ghost commented Jan 27, 2020

@dotnet/wpf-developers, It's time to give an update to the community.

@SamBent
Contributor

SamBent commented Jan 27, 2020

For .NET 4.6+, the fix has been released for many OS versions (others to follow this week). For details see the announcement.
For .NET Core 3.1.2, the fix is PR #2271.

If you are still seeing hangs after installing this fix, please open a new issue (and link to this issue, to aid discovery).

@yingDev

yingDev commented Jan 30, 2021

In my case, it is related to UseLayoutRounding="True". Changing it to "False" at some level in the tree made the bug disappear.

<Style TargetType="VirtualizingStackPanel">
    <Setter Property="UseLayoutRounding" Value="False"/>
</Style>

dotnet --version: 5.0.200-preview.20614.14

@dotnet dotnet locked as resolved and limited conversation to collaborators Apr 14, 2022