This repository has been archived by the owner on Dec 14, 2018. It is now read-only.

Very large views take forever to compile #3196

Closed
poke opened this issue Sep 24, 2015 · 13 comments

@poke
Contributor

poke commented Sep 24, 2015

Compiling large views, that is, views that contain a large number of HTML tags, is really slow. In my example, the view has 10,000 paragraph tags (the kind of tag is actually irrelevant, as is the content of the tags) and that takes almost two minutes to compile, during which there is no feedback at all about what is going on. Subsequent requests are instant, so it’s not the size of the file that is causing the issue.

The issue only happens for static content, so if you have a for loop generating millions of tags, then the compiler has no issues with it (the browser probably will though ;P).

I know this is a somewhat exotic edge case, and usually you wouldn’t have static views this large. But in my case, I am prototyping with a lot of static data (which at some point later will be delivered dynamically, of course), so I’d rather put it all in there. I could possibly just use a static file for this, but I would like to run this inside of the application’s existing layout.

Anyway, I don’t know why the compiler takes that long to process the file and in the end does nothing with it (since there are no Razor directives etc.). I understand that it needs to look at all attributes to check for tag helpers etc. (although in my test project, I have no tag helpers registered) to make sure that it doesn’t miss anything. But even so, parsing such a file should only take a moment, and processing 10k tags also shouldn’t take this long.

@m0sa
Contributor

m0sa commented Sep 24, 2015

We've had the same issues with Roslyn in our Precompilation project. To us it looked like Roslyn had a hard time compiling SourceText created from (big) strings. For us, replacing CSharpSyntaxTree.ParseText(string, CompilationOptions) with SourceText.From(Stream, Encoding) (here) did the trick of speeding things up tremendously. But we didn't dig deep enough to find this minimal (big strings) repro case. If making the mentioned code change in the offending places speeds things up, this should definitely be forwarded to the Roslyn team.

@m0sa
Contributor

m0sa commented Sep 24, 2015

So, Roslyn uses StringText for its tests (including perf tests). But the perf test source file only has 3k lines, which is not even a third of your 10k paragraphs, which would also add a bunch of additional #line directives - that might be the perf-killer combination, as our benchmarks showed the slow part was actually the emit (which generates PDBs that have to take those into account).

@m0sa
Contributor

m0sa commented Sep 24, 2015

Ok, never mind the above; my hunch was incorrect. I just proved myself wrong.

@pranavkm
Contributor

Thanks for the investigations @m0sa!

@poke, do the HTML tags have associated tag helpers? This would help to narrow down the issue.

@rynowak
Member

rynowak commented Sep 24, 2015

/cc @pharring

@poke
Contributor Author

poke commented Sep 24, 2015

@pranavkm In my test case, I started with an empty ASP.NET 5 project, registering only the necessary stuff. So no, there are no tag helpers registered at all (not even the ones in Microsoft.AspNet.Mvc.TagHelpers).

My original problem started with lots of data in tables (so many tr/td tags) but my test project was using paragraph tags only. If you like, I can make the test project public, so you can use it.

@rynowak
Member

rynowak commented Sep 24, 2015

That would be really helpful; even a gist with just the cshtml would likely demonstrate the issue.

@poke
Contributor Author

poke commented Sep 24, 2015

Well, the cshtml consists just of 10k <p>Paragraph</p> and literally nothing else. I’ve put the project here.
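A view like the one described can be regenerated with a short script (a sketch; the output filename is arbitrary, not from the original project):

```python
# Generate the repro view described above: 10,000 identical static
# paragraph tags and literally nothing else.
with open("Index.cshtml", "w", encoding="utf-8") as f:
    for _ in range(10_000):
        f.write("<p>Paragraph</p>\n")
```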

@pranavkm
Contributor

Thanks @poke

@Eilon
Member

Eilon commented Sep 24, 2015

@pranavkm can you profile this to see what's up?

@rynowak
Member

rynowak commented Sep 28, 2015

I did a little playing around with this - @NTaylorMullen will have to help me understand what this code is doing. It's definitely a perf issue inside Razor code generation.

[profiler screenshot]

This code is building up massive StringBuilders and lists.

@danroth27 danroth27 assigned rynowak and unassigned pranavkm Sep 29, 2015
@danroth27 danroth27 added this to the 6.0.0-rc1 milestone Sep 29, 2015
@Eilon Eilon modified the milestones: 6.0.0-rc1, 1.0.0-rc2 Oct 22, 2015
rynowak added a commit to aspnet/Razor that referenced this issue Nov 18, 2015
This change significantly reduces the amount of string and List<ISymbol>
allocations that occur during compilation by changing the way
LiteralChunks are combined.

This is a low-impact fix that addresses the performance issue; the design
issues that caused it still exist.

The problem here lies in what Razor fundamentally does - it parses HTML/C#
into tokens, and then combines them back into 'chunks', a representation
friendly to code generation. When presented with a large block of static
HTML, Razor parses it into individual HTML tokens, and then tries to join
them back into a single chunk for rendering. Due to details of Razor's
representation of chunks/tokens, the process of combining literals is too
expensive.

Mainly, what's done here is to not try to combine instances of
LiteralChunk. The process of merging them is too expensive and requires
lots of intermediate List<ISymbol> and string allocations.

Instead we produce a new chunk, ParentLiteralChunk, which avoids that
up-front computation. Various pieces of the code that deal with
LiteralChunk need to be updated to handle ParentLiteralChunk as well,
which is the bulk of the changes here.

Note that we still have the potential for LOH allocations to occur during
codegen, but they are likely to occur O(1) times per large block of HTML
instead of O(N) as in the old code.
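The before/after behavior this commit message describes can be modeled in a simplified sketch (Python; the names mirror the Razor types, but this is a toy model, not the real implementation):

```python
# Toy model of the fix: eagerly merging literal tokens rebuilds the
# accumulated symbol list on every combine (O(N^2) copying overall),
# while a parent chunk just records children and flattens once.

def eager_combine(tokens):
    """Old behavior: each merge allocates a fresh combined list."""
    symbols = []
    for t in tokens:
        symbols = symbols + [t]  # full copy per token, like re-merging LiteralChunks
    return "".join(symbols)

class ParentLiteralChunk:
    """New behavior: defer all work to a single pass at code generation."""
    def __init__(self):
        self.children = []
    def add(self, token):
        self.children.append(token)  # O(1), no copying
    def flatten(self):
        return "".join(self.children)  # one O(N) pass, one large allocation

tokens = ["<p>Paragraph</p>"] * 10_000
parent = ParentLiteralChunk()
for t in tokens:
    parent.add(t)
assert parent.flatten() == eager_combine(tokens)
```

Both approaches produce the same output; only the number of intermediate allocations differs, which is exactly what the fix targets.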
@rynowak
Member

rynowak commented Nov 18, 2015

This has been fixed by aspnet/Razor#613

I can compile this view now in about 4 seconds. This still isn't great and we have more work in this area that's going to be tracked by more specific work items. This should at least be usable now.

@poke
Contributor Author

poke commented Nov 18, 2015

Thanks a lot for the update! And four seconds is definitely a lot more manageable than the minutes before. Looking forward to further improvements :)
