
Rendering large Razor views is much slower in MVC 6 than MVC 5 #3516

Closed
DamianEdwards opened this issue Nov 10, 2015 · 24 comments

@DamianEdwards
Member

Did some experiments in the SmurfLab comparing throughput of largish views (~325 KB; to reproduce, browse http://www.msn.com and save the "view source" output as an HTML/CSHTML file).

The results show that serving large views via Razor in MVC 5 is not a huge drop from raw IIS static file serving, but MVC 6 is much slower than MVC 5 (about 6x). A quick look at the .NET memory counters shows a very high number of Gen 1 GCs taking place when serving from MVC 6, with time in GC at ~15%.

| Stack | RPS | % of IIS |
| --- | --- | --- |
| IIS static file | 3,549 | 100% |
| MVC 5 view in IIS | 2,993 | 84% |
| MVC 6 view in Kestrel | 495 | 14% |
@Eilon Eilon added this to the 6.0.0-rc2 milestone Nov 10, 2015
@rynowak
Member

rynowak commented Nov 11, 2015

Some allocation data for ~430 requests:

[allocation profiler screenshot]

The bulk of this is just coming from TagHelpers and related infrastructure. These issues are known, and are being worked on 👍

There's also a significant contribution from SocketOutput in Kestrel, which is being worked on as well.

Removing TagHelper support leads you to a repro of #3196.

@rynowak
Member

rynowak commented Jan 11, 2016

With 2eb6cd6, I can easily max out my 100 Mbit network with this scenario (no TagHelpers) using about 4% CPU. We should rerun this on the benchmarking hardware with/without TagHelpers.

@rynowak
Member

rynowak commented Jan 26, 2016

Fixes here:
434da68
aspnet/Razor@1890950

I'm also able to max out my 100 Mbit network easily with the TagHelpers version of this benchmark. We should remeasure.

@rynowak
Member

rynowak commented Feb 4, 2016

We measured this again today with these fixes:

| Stack | RPS | % of IIS |
| --- | --- | --- |
| IIS static file | 3,549 | 100% |
| MVC 5 view in IIS | 2,993 | 84% |
| MVC 6 view without UrlResolutionTagHelper in Kestrel | 2,701 | 76% |
| MVC 6 view with UrlResolutionTagHelper in Kestrel | 649 (was 495) | 18% |

Without UrlResolutionTagHelper we max out the network card at 2,701 RPS. I suspect we're doing something a little less efficient with chunking, which could explain the delta between MVC 5 and MVC 6, but that change won't come from MVC.

We're going to take further steps to mitigate the effects of UrlResolutionTagHelper, since it's part of the default experience. See aspnet/Razor#684.

@rynowak
Member

rynowak commented Mar 30, 2016

Remeasuring this:

| Stack | RPS | % of IIS |
| --- | --- | --- |
| IIS static file | 3,549 | 100% |
| MVC 5 view in IIS | 2,993 | 84% |
| MVC 6 view without TagHelpers in Kestrel | 2,739 (1.09 Gb/s) | 76% |
| MVC 6 view with default settings in Kestrel | 2,340 (0.93 Gb/s) | 66% |

Best throughput was at 32 connections in wrk. We were able to push CPU and network near max. I suspect pushing further will require taking a look at chunking in Kestrel.


Allocation data for 3,000 requests (TagHelpers disabled):
[allocation profiler screenshot]

Allocation data for 3,000 requests (TagHelpers enabled):
[allocation profiler screenshot]

@rynowak
Member

rynowak commented Mar 31, 2016

The only issue we're seeing now is large memory usage in the TagHelpers variant.

[memory usage screenshot]

At first I thought there was an unexpectedly high number of view buffers in use per page load.

Stepping through in the debugger confirmed that only 1728 view buffer 'slots' were checked out to render this page, with 218 (12%) unused. That's about 13 KB of buffer space per request across 11 pages for about 400 KB of output. This feature is working as intended.

Rather, what seems to be happening here is that a flood of requests comes in and gets 'stuck'. Because Razor buffers all of its output before doing real I/O, every request gets into the system, checks out the buffers it needs, and then waits for its chance to do I/O. This causes us to max out the pool.

When a lot of requests hit a page that hasn't been compiled yet, they queue waiting for the compilation task to complete. Once compilation completes, they all begin executing the page and check out buffers to hold their output.

If the number of requests is high, or the page takes a while to compile (like on startup, where we have to load Roslyn), then the number of concurrent requests can result in the ArrayPool handing out a much bigger buffer than necessary. You might ask for a 256-element array and get back a 1-million-element array. This results in us 'filling' the buffer pool, which leaves about 800 MB of memory allocated that won't ever be collected.
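For illustration, here is a minimal sketch of the `ArrayPool<T>` contract that makes this possible (an assumed example, not MVC's view-buffer code): `Rent` only guarantees a minimum length, so a caller can receive a far larger array and must work off the returned array's actual size.

```csharp
// Minimal sketch of the ArrayPool<T> contract, not the MVC view-buffer code:
// Rent(minimumLength) may return a much larger array than requested.
using System;
using System.Buffers;

class ArrayPoolRentSketch
{
    static void Main()
    {
        ArrayPool<char> pool = ArrayPool<char>.Shared;

        char[] buffer = pool.Rent(256);       // guaranteed >= 256; may be far larger under contention
        try
        {
            Console.WriteLine(buffer.Length); // size the work off buffer.Length, not the requested 256
        }
        finally
        {
            pool.Return(buffer);              // the (possibly oversized) array stays alive in the pool
        }
    }
}
```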

@DamianEdwards
Member Author

@rynowak could you verify that by adding a call to `@await FlushAsync()` in the page to "unstick" it, just to further validate the theory?
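For reference, a minimal sketch of what that experiment might look like (a hypothetical page, not the actual benchmark view):

```cshtml
@* Hypothetical large view, not the actual benchmark page: flushing part-way through
   releases the buffered output to the client instead of holding it until the end. *@
<!DOCTYPE html>
<html>
<body>
    @* ...first half of the large markup... *@

    @await FlushAsync()
    @* stream what has been rendered so far, then keep going *@

    @* ...second half of the large markup... *@
</body>
</html>
```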

@rynowak
Member

rynowak commented Apr 4, 2016

I'll hack this up and give it a try.

@rynowak
Member

rynowak commented Apr 4, 2016

Yeah, I don't think it's the I/O that's the issue here, it's compilation/startup.

What I noticed trying this again is that if I visit the page in a browser before hitting it with the load, then memory usage is normal.

If I hit the site with the load generator right away, then memory spikes as soon as the page is compiled, because there are lots of threads executing it.

You would hope that FlushAsync would mitigate this, and it does when you have TagHelpers disabled.

However, it doesn't really help in the default case because... TagHelpers. Each TagHelper needs to buffer all output produced while rendering its body, and because of the "after form" TagHelper and the "option/select" TagHelper there are a lot of these.
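To illustrate why (with a hypothetical tag helper, not one of the built-ins): any helper that needs to inspect or rewrite its child content forces that body to be rendered into an in-memory buffer first, so a flush inside the body never reaches the client.

```csharp
// Hypothetical tag helper used only to illustrate the buffering requirement;
// the built-in form and select/option helpers behave analogously.
using System.Threading.Tasks;
using Microsoft.AspNetCore.Razor.TagHelpers;

[HtmlTargetElement("shout")]
public class ShoutTagHelper : TagHelper
{
    public override async Task ProcessAsync(TagHelperContext context, TagHelperOutput output)
    {
        // GetChildContentAsync renders the element's entire body into a buffer
        // before this helper can decide what to emit.
        TagHelperContent body = await output.GetChildContentAsync();

        output.TagName = "div";
        output.Content.SetContent(body.GetContent().ToUpperInvariant());
    }
}
```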

@rynowak rynowak modified the milestones: 1.0.0, 1.0.0-rc2 Apr 8, 2016
@ikourfaln

Hi,
So rynowak, did you find a solution for this issue? I ask because I see that you closed it.
Thanks for your hard work 😉

@rynowak
Member

rynowak commented Apr 8, 2016

We didn't close the issue; we just haven't identified any additional work for RC2. Take a look at the data here: #3516 (comment)

The main change we had to make was to make TagHelpers smarter in some cases about when they run and when they don't. We've also put a lot of work into the backing buffer that Razor writes to, which has helped as well.
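For intuition, a minimal sketch of a paged, pooled write buffer of the general kind described here (an assumed illustration, not MVC's actual ViewBuffer implementation):

```csharp
// Assumed illustration of a paged, pooled output buffer; not MVC's actual ViewBuffer.
using System.Buffers;
using System.Collections.Generic;

public class PagedCharBuffer
{
    private const int PageSize = 1024;
    private readonly List<char[]> _pages = new List<char[]>();
    private int _currentIndex; // next free slot within the last page

    public void Append(string value)
    {
        foreach (char c in value)
        {
            if (_pages.Count == 0 || _currentIndex == _pages[_pages.Count - 1].Length)
            {
                _pages.Add(ArrayPool<char>.Shared.Rent(PageSize)); // rent a new page on demand
                _currentIndex = 0;
            }

            _pages[_pages.Count - 1][_currentIndex++] = c;
        }
    }

    public void Clear()
    {
        foreach (char[] page in _pages)
        {
            ArrayPool<char>.Shared.Return(page); // hand the pages back after the output is flushed
        }

        _pages.Clear();
        _currentIndex = 0;
    }
}
```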

@ikourfaln

Hmmmm 👍
Thanks for sharing the info, and for your hard work too.
Cheers from Morocco.

@rynowak
Member

rynowak commented Apr 14, 2016

We've done some more analysis here: the memory growth issue is an intentional behavior of the buffer pool. When a bucket is exhausted, the pool will try the next bucket, and the next, and the next, until they've all been tried.

For a scenario like this, where we're blasting the server with load while the page is being compiled, the number of concurrent/queued requests results in us 'trying' each bucket and thus populating it up to the theoretical max of ~800 MB. So Razor asks for a 32-element array and might get a 2048-element array.

We're looking at making a tweak to this policy to have the memory pool fall back to allocation mode faster and try only a few buckets.

/cc @stephentoub
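A minimal, self-contained sketch of that fallback policy (an assumed model for illustration, not the real DefaultArrayPool&lt;T&gt; source):

```csharp
// Assumed model of the bucket-fallback policy described above; not the real pool source.
using System;
using System.Collections.Generic;

class BucketFallbackSketch
{
    // Bucket sizes roughly double, ending at the pool's largest pooled size.
    static readonly int[] BucketSizes = { 16, 32, 64, 128, 256, 512, 1024, 2048 };
    static readonly Dictionary<int, Stack<char[]>> Buckets = new Dictionary<int, Stack<char[]>>();

    static char[] Rent(int minimumLength)
    {
        foreach (int size in BucketSizes)
        {
            if (size < minimumLength)
            {
                continue; // too small for this request
            }

            // Under contention the right-sized bucket is often empty, so the pool
            // keeps walking to larger and larger buckets before giving up.
            if (Buckets.TryGetValue(size, out var stack) && stack.Count > 0)
            {
                return stack.Pop(); // a 32-element request can come back as a 2048-element array
            }
        }

        return new char[minimumLength]; // every bucket tried: fall back to plain allocation
    }

    static void Main()
    {
        Console.WriteLine(Rent(32).Length);
    }
}
```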

@rynowak
Member

rynowak commented Apr 14, 2016

> Stepping through in the debugger confirmed that only 1728 view buffer 'slots' were checked out to render this page, with 218 (12%) unused. That's about 13 KB of buffer space per request across 11 pages for about 400 KB of output. This feature is working as intended.

I'll amend my previous statement: this was working as intended when there's no contention 😆

@Eilon Eilon modified the milestones: 1.1.0, 1.2.0 Nov 8, 2016
@Eilon Eilon assigned dougbu and unassigned rynowak Nov 23, 2016
@Eilon
Member

Eilon commented Nov 23, 2016

@dougbu let's re-analyze this scenario by comparing:

  1. The current 1K chars buffer
  2. Modifying it to a 4K chars write buffer
  3. Same test app, but running in MVC5

Current test app is here: https://github.com/aspnet/Performance/tree/dev/testapp/LargeStaticView

Buffer is here:

public static readonly int DefaultBufferSize = 1024; // 1KB - results in a 4KB byte array for UTF8.
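For context, a minimal sketch of how such a comparison could be wired up (assumed harness code, not the actual Performance test app); the third constructor argument is the character buffer size being varied:

```csharp
// Assumed harness code, not the actual Performance repo test app: construct the writer
// with an explicit buffer size to compare the default 1K chars against 4K chars.
using System.IO;
using System.Text;
using Microsoft.AspNetCore.WebUtilities;

class WriterBufferComparison
{
    static void Main()
    {
        foreach (int bufferSize in new[] { 1024, 4096 })
        {
            using (var stream = new MemoryStream())
            using (var writer = new HttpResponseStreamWriter(stream, Encoding.UTF8, bufferSize))
            {
                writer.Write("<html>...large view output goes here...</html>");
                writer.Flush();
            }
        }
    }
}
```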

@dougbu
Member

dougbu commented Nov 25, 2016

@Eilon this issue has covered both RPS and allocations. Is your request for reanalysis of one, the other, or both?

@rynowak were the RPS comparisons of ASP.NET Core MVC against MVC 5.x done using full .NET everywhere? And are the dotMemory tables all for .NET Core runs?

@dougbu
Member

dougbu commented Nov 25, 2016

Took a quick look at `HttpRequestStreamReader` and see it is rather inefficient when the `ArrayPool<byte>` rents a larger buffer than requested. That is, the class does not recalculate `_byteBufferSize` based on the returned array's `Length`. Worth testing a change here too?

@dougbu
Member

dougbu commented Nov 25, 2016

I should instead have mentioned `HttpResponseStreamWriter` and its `_charBufferSize`. Regardless, the reader and the writer may both use half (or less) of their buffers.
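A minimal sketch of the fix pattern being suggested here (illustrative only, not the actual reader/writer code): size the internal bookkeeping off the rented array rather than the requested length.

```csharp
// Illustrative only; not the actual HttpRequestStreamReader/HttpResponseStreamWriter code.
using System.Buffers;

static class RentedBufferSizing
{
    // Returns both the rented array and the capacity that should be used for it.
    public static (char[] Buffer, int EffectiveSize) RentCharBuffer(int requestedSize)
    {
        char[] buffer = ArrayPool<char>.Shared.Rent(requestedSize);

        // The pool may hand back a larger array; track its real Length instead of
        // requestedSize so the extra capacity is actually used before flushing.
        return (buffer, buffer.Length);
    }
}
```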

@Eilon
Member

Eilon commented Nov 26, 2016

@rynowak would have to answer regarding allocations vs. RPS.

As far as where to run ASP.NET Core: run it on .NET Core only. I'm not super interested in .NET Framework numbers for that.

@rynowak
Member

rynowak commented Nov 28, 2016

@dougbu - we want to understand whether we are better or worse than MVC 5 in this regard, by how much, and why (if we're worse).

Better includes both throughput and latency. Allocation data helps us understand why, but it isn't on its own an outcome when we're looking at this as an end-to-end scenario.

With regard to your specifics, the world is your oyster. We would consider changing anything that makes the future brighter.

dougbu added a commit to aspnet/HttpAbstractions that referenced this issue Feb 22, 2017
…e all of allocated arrays

- aspnet/Mvc#3516
- when first array (`byte[]` or `char[]`) is larger than requested, use it all
- also allocate second array (`char[]` or `byte[]`) to match larger size

nit: add bounds check of `bufferSize` in `HttpResponseStreamWriter` constructor
@dougbu
Member

dougbu commented Apr 18, 2017

Have tested on a couple of different machine sizes and across the above scenarios and more. With the Azure hardware I've been using, the sweet spot appears to be a buffer size of 16K. Will send out a PR using that size.
[RPS chart]

@dougbu
Member

dougbu commented Apr 18, 2017

For the scenarios mentioned above, throughput on Azure is normally network-limited. But the CPU numbers tell a positive story for the latest ASP.NET Core stack.

| Scenario | CPU | % of IIS |
| --- | --- | --- |
| MVC 5.2 LargeStaticView | 55.50% | 100.00% |
| LargeStaticView | 52.50% | 94.59% |
| LargeStaticView w/ tag helpers | 68.00% | 122.20% |

On larger Azure hardware, throughput improves but hits another limit. CPU utilization looks reasonable, though it is worse for .NET Core MVC than for MVC 5.2:

| Scenario | CPU | % of IIS |
| --- | --- | --- |
| MVC 5.2 LargeStaticView | 17.9% | 100.00% |
| LargeStaticFile | 15.7% | 87.71% |
| LargeStaticView | 21.5% | 120.11% |
| LargeStaticView w/ tag helpers | 23.8% | 132.96% |

(LargeStaticFile uses StaticFilesMiddleware and the same Index.cshtml renamed to Large.html. MVC 5.2 scenario ran in IIS; all other scenarios ran in Kestrel.)

We may have more room for improvement with the latest tag helper infrastructure, but tag helpers do involve more work.
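For reference, a minimal sketch of the kind of pipeline assumed for these scenarios (not the actual aspnet/Performance Startup): static-file serving for LargeStaticFile alongside MVC for the Razor-rendered views.

```csharp
// Assumed Startup for illustration; not the actual aspnet/Performance test app.
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        services.AddMvc();
    }

    public void Configure(IApplicationBuilder app)
    {
        app.UseStaticFiles();         // LargeStaticFile: serves wwwroot/Large.html directly
        app.UseMvcWithDefaultRoute(); // LargeStaticView: renders the same content via Razor
    }
}
```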

dougbu added a commit that referenced this issue Apr 20, 2017
- #3516
- fix tests that relied on otherwise-unused `HttpResponseStreamWriter.DefaultBufferSize`
dougbu added a commit that referenced this issue Apr 23, 2017
- #3516
- fix tests that relied on otherwise-unused `HttpResponseStreamWriter.DefaultBufferSize`
@dougbu
Member

dougbu commented Apr 23, 2017

141b637

@dougbu
Member

dougbu commented Apr 24, 2017

Have a few potential follow-ups on the BigViews and LargeJsonApi scenarios. Will talk a bit internally then file the appropriate issues.

dougbu added a commit to aspnet/HttpAbstractions that referenced this issue May 18, 2017
…e all of allocated arrays

- aspnet/Mvc#3516
- when first array (`byte[]` or `char[]`) is larger than requested, use it all
- also allocate second array (`char[]` or `byte[]`) to match larger size

nit: add bounds check of `bufferSize` in `HttpResponseStreamWriter` constructor