Reduce the memory footprint of HttpHeaders #62981

MihaZupan · 2021-12-18T09:51:16Z

Changed the backing header store from a dictionary to an array.
- Insertion, lookup, and remove operations perform O(n) linear scans to find the matching entry
  - Avoided multiple lookups per operation in places like TryAddWithoutValidation and AddHeaders that would previously perform 2 dictionary lookups (1 for TryGetValue and 1 for Add)
- After adding 64 headers, the underlying store is swapped to a Dictionary to avoid certain algorithmic complexity attacks (or at least reduce their impact; see the worst-case benchmarks for MaxNumberOfHeaders = 64 in #62981 (comment)).
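
To make the array-then-dictionary idea above concrete, here is a minimal sketch. The type `HybridHeaderStore`, its members, and the use of plain `string` names and values are hypothetical simplifications, not the PR's actual `HttpHeaders` code; only the threshold of 64 and the initial capacity of 4 come from this thread.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified illustration; not the actual HttpHeaders code.
public sealed class HybridHeaderStore
{
    private const int InitialCapacity = 4;  // matches the HeaderEntry[4] mentioned in this PR
    private const int ArrayThreshold = 64;  // beyond this many entries, use a dictionary

    private KeyValuePair<string, string>[]? _entries;
    private int _count;
    private Dictionary<string, string>? _dictionary;

    public bool UsesDictionary => _dictionary is not null;
    public int Count => _dictionary?.Count ?? _count;

    public void Add(string name, string value)
    {
        if (_dictionary is not null)
        {
            _dictionary[name] = _dictionary.TryGetValue(name, out string? existing)
                ? existing + ", " + value
                : value;
            return;
        }

        // A single linear scan: merge into the existing entry if the name is present.
        for (int i = 0; i < _count; i++)
        {
            if (string.Equals(_entries![i].Key, name, StringComparison.OrdinalIgnoreCase))
            {
                _entries[i] = new KeyValuePair<string, string>(
                    _entries[i].Key, _entries[i].Value + ", " + value);
                return;
            }
        }

        if (_count == ArrayThreshold)
        {
            // Too many entries for linear scans: move everything into a dictionary.
            _dictionary = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
            for (int i = 0; i < _count; i++)
            {
                _dictionary.Add(_entries![i].Key, _entries[i].Value);
            }
            _dictionary.Add(name, value);
            _entries = null;
            _count = 0;
            return;
        }

        _entries ??= new KeyValuePair<string, string>[InitialCapacity];
        if (_count == _entries.Length)
        {
            Array.Resize(ref _entries, _entries.Length * 2);
        }
        _entries[_count++] = new KeyValuePair<string, string>(name, value);
    }

    public bool TryGetValue(string name, out string? value)
    {
        if (_dictionary is not null)
        {
            return _dictionary.TryGetValue(name, out value);
        }
        for (int i = 0; i < _count; i++)
        {
            if (string.Equals(_entries![i].Key, name, StringComparison.OrdinalIgnoreCase))
            {
                value = _entries[i].Value;
                return true;
            }
        }
        value = null;
        return false;
    }
}
```

In this sketch the 65th distinct name triggers the switch, so small header sets never pay for a dictionary allocation, while large ones stop paying for linear scans.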

The change is a win for both CPU and memory.

Measuring memory for 10 simple GET requests (code):
Note: These allocation numbers include the HeaderDescriptor change (#62981 (comment)).

We go from 15 objects (1968 bytes) to 5 objects (568 bytes) per request+response. 🎉
The exact difference will vary depending on the number of headers since internal resizes occur at different times for a dictionary vs array. In the above example, we allocate 3x HeaderEntry[4] and 2x resizes to HeaderEntry[8].
In this case, that reduces the total number of bytes allocated per request+response by 42%.

In an E2E Yarp scenario, this results in a 2-3% RPS increase.

@geoffkizer @scalablecory @stephentoub PTAL

Behavioral changes

The insertion order of headers is preserved during enumeration / serialization to the wire (up to 64 entries).

ghost · 2021-12-18T09:51:26Z

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

Fixes #62846
Fixes #62847

Opening as a draft to get thoughts on the overall approach.

Changed the backing header store from a dictionary to an array.
- Insertion, lookup, and remove operations perform O(n) linear scans to find the matching entry
  - Avoided multiple lookups per operation in places like TryAddWithoutValidation and AddHeaders that would previously perform 2 dictionary lookups (1 for TryGetValue and 1 for Add)
  - I will add microbenchmarks for various ways of interacting with headers if we decide to go forward with this. From initial testing, things are generally faster unless you use an unreasonable amount of headers
- Added a hard cap of 128 headers to prevent n^2 from growing too much
  - Kestrel has a default cap of 100
Changed the HeaderDescriptor to use a single backing object.

The change is a win for both CPU and memory.

Measuring memory for 10 simple GET requests (code):

We go from 15 objects (1968 bytes) to 5 objects (568 bytes) per request+response. 🎉
The exact difference will vary depending on the number of headers since internal resizes occur at different times for a dictionary vs array.
In this case, that reduces the total number of bytes allocated per request+response by 42%.

In an E2E Yarp scenario, this results in a 2-3% RPS increase.

Side effects:

Header ordering is preserved. If multiple values are added for the same name, they are merged into the first entry like before.
- E.g.
```
Foo: a
Bar: b
Foo: c
```
  will now always end up as
```
Foo: a, c
Bar: b
```
whereas before the order of Foo vs Bar was random based on the hash seed the process started with

@geoffkizer @scalablecory @stephentoub PTAL

TODO:

Tests for header ordering
Tests for the header number limit
More benchmarks

Author:	MihaZupan
Assignees:	-
Labels:	`area-System.Net.Http`
Milestone:	7.0.0

stephentoub · 2021-12-18T11:43:13Z

Added a hard cap of 128 headers to prevent n^2 from growing too much

So if a server sends 129 headers, the request/response fails? Why is that an acceptable breaking change? What workaround does the client have if it needs to communicate with such a server it's already able to communicate with today?

Insertion, lookup, and remove operations perform O(n) linear scans to find the matching entry

What's the crossover point where cost of these operations is more expensive with the list?

From initial testing, things are generally faster unless you use an unreasonable amount of headers

What is "unreasonable"? By whose standard?

Kestrel has a default cap of 100

As part of this then are you proposing adding a knob?

Technically we could swap the store to a dictionary when we reach X number of headers, but the complexity doesn't seem worth it to support an unreasonable scenario.

I'm not sure why we get to set the threshold for what's a "reasonable" number of headers, considering there's effectively no limit today. The hybrid approach seems more sound to me.

MihaZupan · 2021-12-18T17:25:39Z

What's the crossover point where cost of these operations is more expensive with the list?

Depends on the header names. Assuming a reasonable distribution of names (string.Equals can exit early), comparing main vs this pr, main is about equal at ~128 headers and faster after ~150. (Time to add all the headers and do one NonValidated enumeration)

Because we search for the key on every insertion by comparing it to the existing keys, the worst-case scenario changes:
With a dictionary, the worst-case is O(MaxResponseHeadersLength) to hash all the inputs.
With the new approach, it's O(MaxResponseHeadersLength * MaxNumberOfHeaders) if you specifically craft long names of equal length that only change at the end. This makes a limit on the number of headers useful to put a ceiling on how much CPU malicious input could burn.
With the default MaxResponseHeadersLength = 64k and MaxNumberOfHeaders = 128, the worst-case on my CPU is about 1.6 ms.
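
A rough way to see where the extra factor comes from is to count character comparisons for such crafted names. The sketch below is only an illustration: `WorstCaseScan` and both methods are hypothetical names, and it models a simple early-exit ordinal comparison rather than the (case-insensitive, possibly vectorized) comparison the real code performs.

```csharp
using System;

// Hypothetical model of the worst case; not code from this PR.
public static class WorstCaseScan
{
    // Builds `count` names of identical length that differ only in the trailing
    // four digits. Assumes nameLength >= 4 and count <= 10000.
    public static string[] BuildAdversarialNames(int count, int nameLength)
    {
        var names = new string[count];
        for (int i = 0; i < count; i++)
        {
            string suffix = i.ToString("D4");
            names[i] = new string('a', nameLength - suffix.Length) + suffix;
        }
        return names;
    }

    // Counts the character comparisons made when every name is checked against
    // all previously inserted names, stopping at the first differing character.
    public static long CountCharComparisons(string[] names)
    {
        long comparisons = 0;
        for (int i = 0; i < names.Length; i++)
        {
            for (int j = 0; j < i; j++)
            {
                string a = names[i], b = names[j];
                if (a.Length != b.Length) continue; // length mismatch exits immediately
                for (int k = 0; k < a.Length; k++)
                {
                    comparisons++;
                    if (a[k] != b[k]) break;
                }
            }
        }
        return comparisons;
    }
}
```

With 128 names of 512 characters each (64k characters in total), this model performs about 4.1 million character comparisons, whereas hashing each name once only touches the 64k characters. That gap is the O(MaxResponseHeadersLength * MaxNumberOfHeaders) term.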

Worst-case benchmarks

ResponseHeadersLengthKb	NumberOfHeaders	Mean
16	64	236.4 us
16	128	422.2 us
16	256	849.3 us
16	512	1,832.2 us

32	64	449.9 us
32	128	886.7 us
32	256	1,603.5 us
32	512	3,299.8 us

64	64	859.6 us
64 = current default	128	1,660.0 us
64	256	3,402.1 us
64	512	6,260.3 us

128	64	1,689.1 us
128	128	3,223.9 us
128	256	6,464.7 us
128	512	13,477.5 us

If we feel that using more headers is realistic, there are ways to eliminate this worst-case as well.
For example, we could store the Hash(name) on the descriptor to make comparisons effectively O(1). At that point, we could set MaxHeaders to 1000 and be happy.
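
A minimal sketch of that idea, assuming a hypothetical `HashedName` struct (this is not the actual `HeaderDescriptor`, and the PR did not end up taking this route): the hash is computed once when the name is created, so most non-matching comparisons are rejected by a single integer check.

```csharp
using System;

// Hypothetical illustration of caching Hash(name); not the real HeaderDescriptor.
public readonly struct HashedName : IEquatable<HashedName>
{
    public string Name { get; }
    public int Hash { get; }

    public HashedName(string name)
    {
        Name = name;
        // Computed once per name instead of re-reading the characters on every comparison.
        Hash = StringComparer.OrdinalIgnoreCase.GetHashCode(name);
    }

    public bool Equals(HashedName other) =>
        Hash == other.Hash && // cheap rejection for almost all non-matching names
        string.Equals(Name, other.Name, StringComparison.OrdinalIgnoreCase);

    public override bool Equals(object? obj) => obj is HashedName other && Equals(other);

    public override int GetHashCode() => Hash;
}
```

The full string comparison still runs when the hashes match, so correctness does not depend on the hash being collision-free.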

As part of this then are you proposing adding a knob?

If we believe using more than a few hundred is reasonable, I would prefer to remove the limitation via the above approach instead.

I'm not sure why we get to set the threshold for what's a "reasonable" number of headers, considering there's effectively no limit today.

There are practical limits you will hit, e.g.

Kestrel's limit of 100 headers
Kestrel and Cloudflare's limit of 32 kB for all request headers
HttpClient's limit of 64 kB on response headers
IIS seems to have a ~16 kB limit
Internet says Tomcat has a default of 8 kB

While most are configurable, I'd argue HttpClient shouldn't care about a scenario where someone wants to send 1001 headers :)

What workaround does the client have if it needs to communicate with such a server it's already able to communicate with today?

LLHTTP. Or preferably changing their service to not rely on hundreds of headers.

antonfirsov · 2021-12-19T14:37:43Z

src/libraries/System.Net.Http/src/System/Net/Http/Headers/HttpHeaders.cs

+            private const int InitialCapacity = 4;
+            private const int MaxHeaderCount = 128;
+
+            public HeaderEntry[]? Entries;


Might be a crazy idea, but:
How common is the Count == 0 case in practice? If it is less common than Count > 0, isn't it worth defining an embedded storage to avoid the array allocation?

private HeaderEntry _e000, _e001, _e002, _e003, _e004, _e005 /*...*/ ;
public Span<HeaderEntry> Entries => MemoryMarshal.CreateSpan<HeaderEntry>(ref _e000, MaxHeaderCount);

We definitely wouldn't want to store the entire max as fields since having few headers is very common. But something like storing the first 6 headers as fields and deferring to an array when more are added is something we could potentially explore. I imagine it would come down to how much it would complicate the logic vs. saving the small allocation.

stephentoub · 2021-12-19T15:07:14Z

If we believe using more than a few hundred is reasonable

It is not our place now to say what is or is not reasonable. HttpClient has been usable with such a number of headers for its entire existence. And the RFC places no such limitation. From my perspective, breaking code that currently does this, without workaround, is what's not reasonable. And telling people to not use HttpClient (which is effectively what "use LLHTTP" says, if nothing else given the sheer complexity it adds to otherwise one-line-ish use) or to stop communicating with particular servers, are not workarounds. Without a knob, I think the proposal isn't viable, and with a knob, you need to be able to accommodate an arbitrarily large number of headers, anyway, so we should just do the hybrid approach.

MihaZupan · 2021-12-20T04:03:12Z

I changed the implementation to fall back to a dictionary store when adding more than N (currently 64) headers.
This means that ordering is preserved when adding up to N headers, otherwise, the current (effectively random) ordering is used.

Updated the description to reflect these changes. Marking the PR as ready-to-review since the header limit was the only behavioral change I was expecting pushback on.

MihaZupan · 2021-12-20T09:15:27Z

Opened #63005 regarding the 8-byte size regression of HttpHeaders-derived types (suboptimal field layout with extra padding).
It could be worked around by bringing all the methods from the HeaderStore struct onto HttpHeaders (c9eb437).

geoffkizer · 2021-12-20T16:33:00Z

TODO:

More benchmarks

Can we use the new HttpClient benchmark from @CarnaViire to measure this?

MihaZupan · 2021-12-20T16:47:21Z

Can we use the new HttpClient benchmark from @CarnaViire to measure this?

Of course. I already have, but a lot more changes have gone in since then. I'll post some nice graphs in the next few hours.

geoffkizer · 2021-12-21T16:17:50Z

This has the side effect of preserving the original insertion order (up to 64 headers). If multiple values are added for the same name, they are merged into the first entry like before.

Did you consider an approach where we don't merge multiple values into the same entry?

This seems like it has a couple advantages:
(1) Preserve header ordering even in the presence of multiple values
(2) Insertion is always cheap because we don't have to compare against existing keys and merge if found

It does make lookup even more expensive, but I suspect we could deal with that by on-demand creating an "index" on the header table when the entry count exceeds some threshold (e.g. 64 or even less). Any update would cause the index to get blown away and recreated if/when necessary.
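
A minimal sketch of what such an append-only store with an on-demand index could look like. Everything here is hypothetical (`UnmergedHeaderList`, its members, and the threshold of 64 are assumptions for illustration only), not a proposal for the real `HttpHeaders` layout.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of the "don't merge, index lazily" idea.
public sealed class UnmergedHeaderList
{
    private const int IndexThreshold = 64; // assumed cutoff, taken from the example above

    private readonly List<KeyValuePair<string, string>> _entries = new();
    private Dictionary<string, List<int>>? _index; // name -> positions in _entries

    public IReadOnlyList<KeyValuePair<string, string>> Entries => _entries;
    public bool HasIndex => _index is not null;

    // Insertion never compares against existing names, so it stays cheap.
    public void Add(string name, string value)
    {
        _entries.Add(new KeyValuePair<string, string>(name, value));
        _index = null; // any update blows the index away
    }

    public List<string> GetValues(string name)
    {
        var result = new List<string>();
        if (_entries.Count <= IndexThreshold)
        {
            // Few entries: a linear scan is cheap enough.
            foreach (KeyValuePair<string, string> entry in _entries)
            {
                if (string.Equals(entry.Key, name, StringComparison.OrdinalIgnoreCase))
                {
                    result.Add(entry.Value);
                }
            }
            return result;
        }

        _index ??= BuildIndex(); // recreated only when a lookup needs it
        if (_index.TryGetValue(name, out List<int>? positions))
        {
            foreach (int position in positions)
            {
                result.Add(_entries[position].Value);
            }
        }
        return result;
    }

    private Dictionary<string, List<int>> BuildIndex()
    {
        var index = new Dictionary<string, List<int>>(StringComparer.OrdinalIgnoreCase);
        for (int i = 0; i < _entries.Count; i++)
        {
            if (!index.TryGetValue(_entries[i].Key, out List<int>? positions))
            {
                positions = new List<int>();
                index.Add(_entries[i].Key, positions);
            }
            positions.Add(i);
        }
        return index;
    }
}
```

The trade-off is visible in the sketch: `Add` is a plain append that preserves the raw ordering, while grouping by name is deferred to lookup time.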

MihaZupan · 2021-12-21T17:14:18Z

Did you consider an approach where we don't merge multiple values into the same entry?

Yes, but the value of doing that depends on the following:

Would we want NonValidated enumeration to group the values with the same key?
- As far as I understand the team's feeling here is yes.
- Does the act of non-validated enumeration group the underlying values (affecting the question below)?
Would we want the enumeration that happens as part of serializing the headers to the wire to group the values with the same key?
- We currently do, but we don't necessarily have to
- Preserve header ordering even in the presence of multiple values
  
  I take it this means your preference is "no"?
- If the answer is yes (meaning preserving the current behavior), I would expect the approach to have similar or slightly worse performance characteristics than the current implementation.

geoffkizer · 2021-12-21T17:35:45Z

Would we want NonValidated enumeration to group the values with the same key?

As far as I understand the team's feeling here is yes.

For better or worse we have defined NonValidated to group values by header name, and we cannot change that now. (At least I don't think we can... certainly we need to group for the IDictionary implementation on NonValidated. Maybe we could get away with not grouping for the IEnumerable implementation? Seems weird...)

That said, when I think about the scenarios for using NonValidated (e.g. YARP), I think (a) these don't really care about grouping by header name, and (b) probably would actually prefer to not group by header name, since that should provide both better performance and preserve the original header ordering.

So while we can't change the existing NonValidated semantics, we could perhaps add some new API that does non-validated enumeration without grouping by header name. That's a bit ugly, especially since we just added NonValidated... but if it's a better solution, we should consider it.

Would we want the enumeration that happens as part of serializing the headers to the wire to group the values with the same key?

We currently do, but we don't necessarily have to

As with the NonValidated case above, I think we don't really care about grouping by header name here. The fact that we do is more a legacy of the current HttpHeaders design than a conscious choice. If we can improve perf by not grouping (and I suspect we can) then that alone is good reason to not group by header name.

Preserving header ordering is nice too; I think if we were starting from scratch here we'd try to design the header store to preserve ordering of raw headers, but since we don't today, and we haven't heard huge complaints about it, I don't know that it really matters.

The nice thing about this case as opposed to NonValidated above is that no new public API is required. So if this were to improve the performance of sending request headers in a non-trivial way, I think that could be justification enough for doing this even without adding new public API for the NonValidated case.

geoffkizer · 2021-12-21T17:54:34Z

Here's an alternative idea re the HeaderDescriptor changes in this PR: #63047

That's a non-trivial amount of work, so we should probably just proceed with the HeaderDescriptor changes here; but it seems worth considering as we are making improvements in this area.

geoffkizer · 2022-01-05T20:59:06Z

That data looks pretty good... based on this do you believe this is the best approach?

geoffkizer · 2022-01-15T12:53:49Z

The use of GetValueRefOrAddDefault is a nice optimization -- however it does seem to make the comparison above unfair since this optimization doesn't exist in the existing code.
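
For readers unfamiliar with the API, the difference between the two patterns looks roughly like this. `CollectionsMarshal.GetValueRefOrAddDefault` is the real .NET 6+ API; the `SingleLookupAdd` class and the string-concatenation merge are hypothetical stand-ins for what the header store does.

```csharp
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Hypothetical illustration; the real code stores HeaderDescriptor/object pairs.
public static class SingleLookupAdd
{
    // Two lookups: one for TryGetValue and one for the indexer set.
    public static void AddTwoLookups(Dictionary<string, string> store, string name, string value)
    {
        store[name] = store.TryGetValue(name, out string? existing)
            ? existing + ", " + value
            : value;
    }

    // One lookup: get a ref to the slot (created if missing) and write through it.
    public static void AddOneLookup(Dictionary<string, string> store, string name, string value)
    {
        ref string? slot = ref CollectionsMarshal.GetValueRefOrAddDefault(store, name, out bool exists);
        slot = exists ? slot + ", " + value : value;
    }
}
```

Both methods leave the dictionary in the same state; the second simply hashes and probes once. The ref must not be used after the dictionary is modified again.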

src/libraries/System.Net.Http/src/System/Net/Http/Headers/HttpHeaders.cs

geoffkizer · 2022-01-15T13:00:20Z

src/libraries/System.Net.Http/src/System/Net/Http/Headers/HttpHeaders.cs

@@ -235,35 +250,32 @@ public override string ToString()

            var vsb = new ValueStringBuilder(stackalloc char[512]);

-            if (_headerStore is Dictionary<HeaderDescriptor, object> headerStore)
+            foreach (HeaderEntry entry in GetEntries())


This (and other calls to GetEntries) will force an array allocation if we are in dictionary mode, right? Seems like we could avoid that without too much trouble.

You mean by caching the array or by implementing a custom enumerator?
I tried to optimize for the common case, even if it made the dictionary edge-case slightly more expensive.

I was thinking more like a custom enumerator. That said, I can see how this could potentially add cost to the common case. That's unfortunate. Seems like we should be able to handle both cases efficiently and without allocating, but it's not clear to me how to do this without significantly complicating the code. Hmmm.

src/libraries/System.Net.Http/src/System/Net/Http/Headers/HttpHeaders.cs

geoffkizer · 2022-01-15T13:17:07Z

src/libraries/System.Net.Http/src/System/Net/Http/Headers/HttpHeaders.cs

+            ref object? DictionaryGetValueRefOrAddDefault(HeaderDescriptor key)
+            {
+                var dictionary = (Dictionary<HeaderDescriptor, object>)_headerStore!;
+                ref object? value = ref CollectionsMarshal.GetValueRefOrAddDefault(dictionary, key, out s_dictionaryGetValueRefOrAddDefaultExistsDummy);


Why can't we just use _ here?

Some sort of escape analysis is preventing that:

CS8347: Cannot use a result of 'CollectionsMarshal.GetValueRefOrAddDefault<HeaderDescriptor, object>(Dictionary<HeaderDescriptor, object>, HeaderDescriptor, out bool)' in this context because it may expose variables referenced by parameter 'exists' outside of their declaration scope

Using a dummy field for it was the only way I found to work around it.

I have no idea what that error means.

@stephentoub Any insight here?

You get that error with out _ but not with out s_dictionaryGetValueRefOrAddDefaultExistsDummy?

That's right

@jaredpar, is this a compiler bug?
https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA+ABATARgLABQGAzAATakDCpA3oaQ6QA5QCWAbgIYAuM5OSKqQDOAfTABueoxYcefDANLAIEADYjRwKQUalpDEuRSkAsgAoAlAdo29sAGZCAMldIBeAHylHpAHLmAHYwAO5UVgA0pBAArtyawJakOnoAvjY2ioK+1AHUYFGx8Srqyta6jHQVegzAHqTcUDEwKTWGAOw+ME5ikjap+gSpQA==
This compiles, but replace the out bool s_b with a discard out _ and it fails. I'd expect a discard to accommodate any required scope.

error CS8156: An expression cannot be used in this context because it may not be passed or returned by reference
error CS8347: Cannot use a result of 'C.N(C, out bool)' in this context because it may expose variables referenced by parameter 'b' outside of their declaration scope

It's a known issue but up until this point seemed more theoretical than a practical problem. Can look into fixing this with the rest of the ref work we're doing this time around.

dotnet/roslyn#56587 (comment)

Note: the discussion on that last issue details the trade-offs in fixing this. Effectively, to fix this we need to ensure that discards have unique locals associated with them when discards mix with span safety rules (today we re-use temps). Otherwise it can lead to discards creating safety issues.

MihaZupan · 2022-01-15T14:34:55Z

Just to make sure I understand the current approach: ...

All the observations are correct.

Meaning, if you do something like this:
TryAddWithoutValidation("Header1", "X");
TryAddWithoutValidation("Header2", "Y");
TryAddWithoutValidation("Header1", "Z");
You will still see the values for Header1 grouped.
Am I understanding this all correctly?

That's correct, entries for the same name are always grouped into a single entry.
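
The grouping can be observed through the public API. This is a small sketch using `HttpRequestMessage`, `TryAddWithoutValidation`, and `NonValidated` (all existing .NET 6+ APIs); the `GroupingDemo` wrapper and the example URL are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;

// Hypothetical helper that shows how values for the same name are grouped.
public static class GroupingDemo
{
    public static string[] DescribeHeaders()
    {
        using var request = new HttpRequestMessage(HttpMethod.Get, "http://example.com/");
        request.Headers.TryAddWithoutValidation("Header1", "X");
        request.Headers.TryAddWithoutValidation("Header2", "Y");
        request.Headers.TryAddWithoutValidation("Header1", "Z");

        // NonValidated enumerates one entry per header name, with all of its values.
        return request.Headers.NonValidated
            .Select(h => $"{h.Key}: {string.Join(", ", (IEnumerable<string>)h.Value)}")
            .ToArray();
    }
}
```

This yields two entries, with X and Z both reported under Header1. With the change in this PR, I'd expect Header1 to be enumerated first since that is where it was first inserted.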

However, presumably there's some point less than 64 headers where the lookup cost is now more expensive, but not so much more expensive that it's worth incurring the cost of the added memory usage of the dictionary. Is that the right way to think about this? I see the chart above only goes up to 16 headers, and it seems like even at 16 the cost of lookup is starting to make a noticeable difference vs the previous PR.

I'll run & post numbers for more headers here. From previous tests, I can say that #62981 (comment):

Always grouping is slightly faster up to ~10 headers.
Even with more headers, the difference between the approaches never reaches 10%.

The exact cutoff (currently ~10) depends on how expensive HeaderDescriptor.Equals(HeaderDescriptor) is. I expect it may be slightly cheaper with #62981 (comment).

geoffkizer · 2022-01-15T14:49:52Z

The exact cutoff (currently ~10) depends on how expensive HeaderDescriptor.Equals(HeaderDescriptor) is. I expect it may be slightly cheaper with #62981 (comment).

Good point. Perhaps we should make that change first?

MihaZupan · 2022-01-17T16:47:21Z

Note that with the first graph, the pr-before is never ordering/searching the entries. They are simply added to the end of the list and then serialized.

SendAsync raw data

Toolchain	RequestHeaders	Mean	Error	StdDev	Median	Ratio	Allocated
main	1	2.353 us	0.0036 us	0.0174 us	2.350 us	1.00	1,728 B
pr-grouped	1	1.970 us	0.0033 us	0.0157 us	1.974 us	0.84	1,064 B
pr	1	2.055 us	0.0047 us	0.0224 us	2.047 us	0.87	1,064 B

main	2	2.420 us	0.0018 us	0.0087 us	2.421 us	1.00	1,728 B
pr-grouped	2	2.067 us	0.0020 us	0.0098 us	2.066 us	0.85	1,064 B
pr	2	2.148 us	0.0045 us	0.0219 us	2.157 us	0.89	1,064 B

main	3	2.608 us	0.0103 us	0.0498 us	2.584 us	1.00	1,728 B
pr-grouped	3	2.155 us	0.0024 us	0.0116 us	2.153 us	0.83	1,064 B
pr	3	2.194 us	0.0027 us	0.0134 us	2.192 us	0.84	1,064 B

main	4	2.710 us	0.0161 us	0.0812 us	2.679 us	1.00	2,032 B
pr-grouped	4	2.312 us	0.0137 us	0.0714 us	2.301 us	0.85	1,064 B
pr	4	2.322 us	0.0121 us	0.0622 us	2.295 us	0.86	1,064 B

main	5	2.822 us	0.0171 us	0.0844 us	2.807 us	1.00	2,032 B
pr-grouped	5	2.440 us	0.0083 us	0.0412 us	2.417 us	0.87	1,280 B
pr	5	2.491 us	0.0055 us	0.0271 us	2.493 us	0.88	1,280 B

main	6	2.941 us	0.0046 us	0.0221 us	2.933 us	1.00	2,032 B
pr-grouped	6	2.518 us	0.0086 us	0.0419 us	2.503 us	0.86	1,280 B
pr	6	2.522 us	0.0062 us	0.0298 us	2.511 us	0.86	1,280 B

main	7	3.041 us	0.0062 us	0.0302 us	3.022 us	1.00	2,032 B
pr-grouped	7	2.608 us	0.0036 us	0.0175 us	2.601 us	0.86	1,280 B
pr	7	2.670 us	0.0130 us	0.0680 us	2.667 us	0.88	1,280 B

main	8	3.275 us	0.0120 us	0.0626 us	3.293 us	1.00	2,696 B
pr-grouped	8	2.770 us	0.0158 us	0.0801 us	2.767 us	0.85	1,280 B
pr	8	2.737 us	0.0062 us	0.0305 us	2.754 us	0.84	1,280 B

main	9	3.371 us	0.0116 us	0.0576 us	3.347 us	1.00	2,696 B
pr-grouped	9	2.864 us	0.0036 us	0.0174 us	2.867 us	0.85	1,688 B
pr	9	2.895 us	0.0144 us	0.0716 us	2.869 us	0.86	1,688 B

main	10	3.602 us	0.0164 us	0.0854 us	3.594 us	1.00	2,696 B
pr-grouped	10	3.053 us	0.0031 us	0.0153 us	3.051 us	0.85	1,688 B
pr	10	3.005 us	0.0068 us	0.0330 us	3.015 us	0.84	1,688 B

main	11	3.767 us	0.0025 us	0.0122 us	3.768 us	1.00	2,696 B
pr-grouped	11	3.168 us	0.0018 us	0.0083 us	3.164 us	0.84	1,688 B
pr	11	3.139 us	0.0082 us	0.0415 us	3.122 us	0.83	1,688 B

main	12	3.953 us	0.0125 us	0.0631 us	3.927 us	1.00	2,696 B
pr-grouped	12	3.395 us	0.0118 us	0.0617 us	3.390 us	0.86	1,688 B
pr	12	3.219 us	0.0105 us	0.0545 us	3.184 us	0.81	1,688 B

main	13	4.099 us	0.0189 us	0.0964 us	4.052 us	1.00	2,696 B
pr-grouped	13	3.505 us	0.0105 us	0.0519 us	3.489 us	0.85	1,688 B
pr	13	3.353 us	0.0139 us	0.0686 us	3.342 us	0.82	1,688 B

main	14	4.177 us	0.0020 us	0.0096 us	4.176 us	1.00	2,696 B
pr-grouped	14	3.661 us	0.0077 us	0.0373 us	3.664 us	0.88	1,688 B
pr	14	3.410 us	0.0081 us	0.0388 us	3.411 us	0.82	1,688 B

main	15	4.361 us	0.0021 us	0.0101 us	4.360 us	1.00	2,696 B
pr-grouped	15	3.795 us	0.0043 us	0.0208 us	3.801 us	0.87	1,688 B
pr	15	3.579 us	0.0099 us	0.0507 us	3.569 us	0.82	1,688 B

main	16	4.671 us	0.0187 us	0.0954 us	4.642 us	1.00	2,696 B
pr-grouped	16	4.125 us	0.0213 us	0.1111 us	4.117 us	0.88	1,688 B
pr	16	3.670 us	0.0093 us	0.0458 us	3.682 us	0.79	1,688 B

main	20	5.456 us	0.0037 us	0.0178 us	5.461 us	1.00	4,080 B
pr-grouped	20	4.903 us	0.0209 us	0.1088 us	4.863 us	0.90	2,480 B
pr	20	4.247 us	0.0216 us	0.1126 us	4.222 us	0.78	2,480 B

main	24	6.298 us	0.0159 us	0.0790 us	6.307 us	1.00	4,080 B
pr-grouped	24	5.674 us	0.0043 us	0.0212 us	5.673 us	0.90	2,480 B
pr	24	4.769 us	0.0123 us	0.0607 us	4.742 us	0.76	2,480 B

main	28	7.009 us	0.0060 us	0.0290 us	7.006 us	1.00	4,080 B
pr-grouped	28	6.507 us	0.0124 us	0.0611 us	6.493 us	0.93	2,480 B
pr	28	5.310 us	0.0087 us	0.0425 us	5.320 us	0.76	2,480 B

main	32	7.692 us	0.0062 us	0.0301 us	7.676 us	1.00	4,080 B
pr-grouped	32	7.424 us	0.0165 us	0.0783 us	7.374 us	0.97	2,480 B
pr	32	5.729 us	0.0083 us	0.0396 us	5.748 us	0.74	2,480 B

main	36	8.406 us	0.0131 us	0.0644 us	8.400 us	1.00	4,080 B
pr-grouped	36	8.472 us	0.0034 us	0.0163 us	8.468 us	1.01	4,040 B
pr	36	6.459 us	0.0259 us	0.1273 us	6.423 us	0.77	4,040 B

main	40	9.582 us	0.0352 us	0.1694 us	9.541 us	1.00	7,336 B
pr-grouped	40	9.528 us	0.0413 us	0.2078 us	9.427 us	0.99	4,040 B
pr	40	6.933 us	0.0276 us	0.1372 us	6.959 us	0.72	4,040 B

main	44	10.306 us	0.0286 us	0.1384 us	10.283 us	1.00	7,336 B
pr-grouped	44	10.590 us	0.0452 us	0.2270 us	10.464 us	1.03	4,040 B
pr	44	7.438 us	0.0105 us	0.0491 us	7.419 us	0.72	4,040 B

main	48	11.058 us	0.0092 us	0.0445 us	11.052 us	1.00	7,336 B
pr-grouped	48	11.655 us	0.0305 us	0.1479 us	11.595 us	1.05	4,040 B
pr	48	8.072 us	0.0491 us	0.2560 us	8.038 us	0.73	4,040 B

main	52	11.746 us	0.0368 us	0.1831 us	11.677 us	1.00	7,336 B
pr-grouped	52	12.655 us	0.0057 us	0.0269 us	12.654 us	1.08	4,040 B
pr	52	8.448 us	0.0407 us	0.2119 us	8.427 us	0.72	4,040 B

main	56	12.453 us	0.0311 us	0.1522 us	12.412 us	1.00	7,336 B
pr-grouped	56	13.865 us	0.0235 us	0.1127 us	13.822 us	1.11	4,040 B
pr	56	8.966 us	0.0292 us	0.1423 us	8.917 us	0.72	4,040 B

main	60	13.081 us	0.0207 us	0.0975 us	13.086 us	1.00	7,336 B
pr-grouped	60	15.032 us	0.0084 us	0.0388 us	15.022 us	1.15	4,040 B
pr	60	9.412 us	0.0338 us	0.1629 us	9.345 us	0.72	4,040 B

main	64	13.787 us	0.0094 us	0.0447 us	13.774 us	1.00	7,336 B
pr-grouped	64	16.743 us	0.0787 us	0.4103 us	16.960 us	1.21	4,040 B
pr	64	9.924 us	0.0336 us	0.1653 us	9.853 us	0.72	4,040 B

main	100	21.579 us	0.0852 us	0.4425 us	21.810 us	1.00	14,480 B
pr-grouped	100	26.696 us	0.0829 us	0.4322 us	26.399 us	1.24	15,072 B
pr	100	20.317 us	0.0918 us	0.4782 us	20.450 us	0.94	15,072 B

Add without validation + enumerate without validation raw data

Toolchain	NumberOfHeaders	Mean	Error	StdDev	Median	Ratio	Allocated
main	1	122.59 ns	2.510 ns	3.757 ns	121.18 ns	1.00	376 B
pr-grouped	1	76.79 ns	1.535 ns	1.996 ns	77.10 ns	0.63	256 B
pr	1	84.04 ns	1.681 ns	1.868 ns	84.70 ns	0.69	256 B

main	2	183.06 ns	3.577 ns	3.346 ns	180.87 ns	1.00	376 B
pr-grouped	2	118.26 ns	1.673 ns	1.565 ns	119.13 ns	0.65	256 B
pr	2	122.93 ns	1.904 ns	1.590 ns	123.77 ns	0.67	256 B

main	3	270.32 ns	2.927 ns	2.444 ns	268.94 ns	1.00	376 B
pr-grouped	3	162.32 ns	3.152 ns	3.987 ns	160.26 ns	0.60	256 B
pr	3	165.98 ns	3.036 ns	2.840 ns	164.30 ns	0.61	256 B

main	4	389.57 ns	7.828 ns	9.319 ns	389.19 ns	1.00	680 B
pr-grouped	4	205.07 ns	3.002 ns	2.809 ns	206.41 ns	0.52	256 B
pr	4	205.72 ns	4.023 ns	3.951 ns	202.63 ns	0.53	256 B

main	5	461.69 ns	8.266 ns	7.732 ns	464.62 ns	1.00	680 B
pr-grouped	5	293.61 ns	5.823 ns	7.773 ns	294.10 ns	0.63	472 B
pr	5	316.54 ns	6.246 ns	8.338 ns	321.83 ns	0.68	472 B

main	6	527.32 ns	10.455 ns	16.582 ns	525.12 ns	1.00	680 B
pr-grouped	6	353.87 ns	7.081 ns	7.272 ns	356.35 ns	0.68	472 B
pr	6	370.81 ns	7.154 ns	9.792 ns	373.87 ns	0.71	472 B

main	7	616.52 ns	12.214 ns	17.122 ns	606.43 ns	1.00	680 B
pr-grouped	7	419.36 ns	8.110 ns	8.328 ns	414.64 ns	0.68	472 B
pr	7	437.81 ns	8.669 ns	8.109 ns	440.15 ns	0.71	472 B

main	8	745.69 ns	14.659 ns	13.712 ns	755.95 ns	1.00	1,344 B
pr-grouped	8	486.95 ns	7.636 ns	7.142 ns	482.39 ns	0.65	472 B
pr	8	492.31 ns	7.838 ns	6.545 ns	492.26 ns	0.66	472 B

main	9	825.77 ns	16.452 ns	16.158 ns	832.19 ns	1.00	1,344 B
pr-grouped	9	625.59 ns	12.451 ns	18.251 ns	631.28 ns	0.76	880 B
pr	9	610.42 ns	12.158 ns	16.642 ns	602.64 ns	0.74	880 B

main	10	997.87 ns	15.747 ns	14.730 ns	996.91 ns	1.00	1,344 B
pr-grouped	10	720.57 ns	14.355 ns	15.359 ns	726.92 ns	0.72	880 B
pr	10	690.36 ns	13.212 ns	13.568 ns	682.78 ns	0.69	880 B

main	11	1,061.06 ns	21.144 ns	35.904 ns	1,048.99 ns	1.00	1,344 B
pr-grouped	11	809.18 ns	15.951 ns	16.381 ns	814.43 ns	0.77	880 B
pr	11	807.85 ns	12.956 ns	12.119 ns	809.09 ns	0.77	880 B

main	12	1,209.00 ns	24.048 ns	22.495 ns	1,217.40 ns	1.00	1,344 B
pr-grouped	12	900.33 ns	13.765 ns	12.876 ns	907.99 ns	0.74	880 B
pr	12	850.53 ns	16.424 ns	16.131 ns	848.89 ns	0.70	880 B

main	13	1,328.95 ns	26.219 ns	30.194 ns	1,309.19 ns	1.00	1,344 B
pr-grouped	13	1,007.52 ns	12.576 ns	11.763 ns	1,013.41 ns	0.76	880 B
pr	13	956.47 ns	13.528 ns	12.654 ns	963.93 ns	0.72	880 B

main	14	1,455.24 ns	29.118 ns	29.902 ns	1,467.84 ns	1.00	1,344 B
pr-grouped	14	1,126.84 ns	22.493 ns	25.903 ns	1,142.96 ns	0.77	880 B
pr	14	1,042.98 ns	20.597 ns	32.067 ns	1,043.92 ns	0.72	880 B

main	15	1,530.39 ns	26.842 ns	23.795 ns	1,516.45 ns	1.00	1,344 B
pr-grouped	15	1,219.10 ns	24.151 ns	18.855 ns	1,226.85 ns	0.80	880 B
pr	15	1,153.90 ns	22.947 ns	34.346 ns	1,148.41 ns	0.74	880 B

main	16	1,656.75 ns	32.754 ns	33.636 ns	1,670.99 ns	1.00	1,344 B
pr-grouped	16	1,389.44 ns	24.917 ns	23.307 ns	1,374.75 ns	0.84	880 B
pr	16	1,263.69 ns	8.729 ns	7.289 ns	1,261.70 ns	0.76	880 B

main	17	1,773.05 ns	35.178 ns	39.100 ns	1,753.49 ns	1.00	1,344 B
pr-grouped	17	1,549.71 ns	27.231 ns	25.472 ns	1,535.39 ns	0.87	1,672 B
pr	17	1,473.61 ns	27.595 ns	24.463 ns	1,465.86 ns	0.83	1,672 B

main	18	2,022.24 ns	36.846 ns	28.767 ns	2,024.19 ns	1.00	2,728 B
pr-grouped	18	1,727.06 ns	32.736 ns	35.028 ns	1,737.08 ns	0.86	1,672 B
pr	18	1,539.29 ns	1.943 ns	1.623 ns	1,538.48 ns	0.76	1,672 B

main	19	2,195.57 ns	43.635 ns	63.959 ns	2,192.98 ns	1.00	2,728 B
pr-grouped	19	1,814.63 ns	30.953 ns	28.954 ns	1,795.46 ns	0.84	1,672 B
pr	19	1,670.20 ns	26.156 ns	24.466 ns	1,655.58 ns	0.77	1,672 B

main	20	2,255.31 ns	44.376 ns	63.643 ns	2,245.40 ns	1.00	2,728 B
pr-grouped	20	1,949.18 ns	24.671 ns	23.077 ns	1,935.58 ns	0.87	1,672 B
pr	20	1,819.34 ns	32.696 ns	30.584 ns	1,829.61 ns	0.81	1,672 B

main	22	2,594.99 ns	51.857 ns	48.507 ns	2,564.49 ns	1.00	2,728 B
pr-grouped	22	2,259.40 ns	44.367 ns	60.730 ns	2,228.79 ns	0.87	1,672 B
pr	22	2,094.97 ns	41.693 ns	73.022 ns	2,069.17 ns	0.79	1,672 B

main	24	2,868.40 ns	56.532 ns	58.055 ns	2,821.24 ns	1.00	2,728 B
pr-grouped	24	2,619.68 ns	50.704 ns	60.359 ns	2,574.80 ns	0.92	1,672 B
pr	24	2,408.42 ns	48.074 ns	51.439 ns	2,448.43 ns	0.84	1,672 B

main	26	3,131.28 ns	58.519 ns	54.738 ns	3,167.28 ns	1.00	2,728 B
pr-grouped	26	2,863.38 ns	3.760 ns	2.936 ns	2,864.17 ns	0.91	1,672 B
pr	26	2,747.12 ns	53.499 ns	50.043 ns	2,710.30 ns	0.88	1,672 B

main	28	3,336.57 ns	2.453 ns	2.295 ns	3,336.55 ns	1.00	2,728 B
pr-grouped	28	3,220.18 ns	4.427 ns	3.457 ns	3,219.11 ns	0.97	1,672 B
pr	28	3,008.62 ns	59.656 ns	71.016 ns	3,046.78 ns	0.90	1,672 B

main	30	3,770.03 ns	49.027 ns	45.860 ns	3,737.39 ns	1.00	2,728 B
pr-grouped	30	3,551.77 ns	20.115 ns	15.704 ns	3,549.86 ns	0.94	1,672 B
pr	30	3,392.65 ns	54.903 ns	51.356 ns	3,363.21 ns	0.90	1,672 B

main	32	3,973.16 ns	77.671 ns	125.424 ns	3,942.48 ns	1.00	2,728 B
pr-grouped	32	3,922.48 ns	46.984 ns	43.949 ns	3,902.50 ns	0.99	1,672 B
pr	32	3,730.02 ns	63.504 ns	59.401 ns	3,708.42 ns	0.95	1,672 B

main	34	4,268.38 ns	85.357 ns	135.385 ns	4,201.05 ns	1.00	2,728 B
pr-grouped	34	4,450.25 ns	88.186 ns	137.295 ns	4,408.10 ns	1.04	3,232 B
pr	34	4,259.82 ns	83.759 ns	117.418 ns	4,225.08 ns	1.00	3,232 B

main	36	4,571.16 ns	87.931 ns	101.262 ns	4,493.04 ns	1.00	2,728 B
pr-grouped	36	4,703.05 ns	92.352 ns	90.702 ns	4,711.15 ns	1.03	3,232 B
pr	36	4,623.75 ns	91.465 ns	136.901 ns	4,560.38 ns	1.01	3,232 B

main	38	4,997.31 ns	83.586 ns	65.258 ns	5,003.47 ns	1.00	5,984 B
pr-grouped	38	5,144.99 ns	101.197 ns	103.922 ns	5,106.03 ns	1.03	3,232 B
pr	38	4,862.88 ns	95.718 ns	143.267 ns	4,905.93 ns	0.96	3,232 B

main	40	5,269.29 ns	77.839 ns	60.771 ns	5,285.07 ns	1.00	5,984 B
pr-grouped	40	5,504.70 ns	14.606 ns	12.948 ns	5,501.95 ns	1.05	3,232 B
pr	40	5,276.89 ns	102.523 ns	113.954 ns	5,343.00 ns	1.00	3,232 B

main	42	5,476.23 ns	17.783 ns	14.850 ns	5,477.43 ns	1.00	5,984 B
pr-grouped	42	5,967.31 ns	117.755 ns	125.997 ns	5,901.33 ns	1.09	3,232 B
pr	42	5,676.34 ns	106.889 ns	104.980 ns	5,736.88 ns	1.04	3,232 B

main	44	5,779.55 ns	107.425 ns	100.486 ns	5,726.15 ns	1.00	5,984 B
pr-grouped	44	6,462.91 ns	122.669 ns	120.478 ns	6,525.91 ns	1.12	3,232 B
pr	44	6,019.01 ns	120.007 ns	128.406 ns	6,111.23 ns	1.04	3,232 B

main	46	6,095.82 ns	99.287 ns	92.873 ns	6,052.07 ns	1.00	5,984 B
pr-grouped	46	6,924.19 ns	137.678 ns	135.218 ns	6,968.37 ns	1.14	3,232 B
pr	46	6,520.93 ns	124.904 ns	116.835 ns	6,470.77 ns	1.07	3,232 B

main	48	6,504.33 ns	128.527 ns	126.231 ns	6,511.55 ns	1.00	5,984 B
pr-grouped	48	7,329.39 ns	104.296 ns	97.559 ns	7,284.50 ns	1.13	3,232 B
pr	48	6,948.54 ns	114.840 ns	107.421 ns	6,879.09 ns	1.07	3,232 B

main	52	6,816.08 ns	11.007 ns	8.593 ns	6,816.38 ns	1.00	5,984 B
pr-grouped	52	8,207.21 ns	157.495 ns	204.788 ns	8,099.89 ns	1.20	3,232 B
pr	52	7,857.39 ns	154.434 ns	144.458 ns	7,799.97 ns	1.15	3,232 B

main	56	7,309.92 ns	25.379 ns	19.814 ns	7,306.98 ns	1.00	5,984 B
pr-grouped	56	9,251.26 ns	181.800 ns	288.354 ns	9,122.51 ns	1.26	3,232 B
pr	56	8,572.55 ns	161.131 ns	150.722 ns	8,627.65 ns	1.17	3,232 B

main	60	8,075.87 ns	130.259 ns	121.845 ns	8,116.33 ns	1.00	5,984 B
pr-grouped	60	10,320.31 ns	204.922 ns	325.028 ns	10,485.12 ns	1.27	3,232 B
pr	60	9,537.24 ns	185.582 ns	213.717 ns	9,393.53 ns	1.18	3,232 B

main	64	8,318.01 ns	161.570 ns	165.921 ns	8,345.00 ns	1.00	5,984 B
pr-grouped	64	11,384.55 ns	221.299 ns	245.973 ns	11,445.30 ns	1.37	3,232 B
pr	64	10,525.79 ns	208.092 ns	213.695 ns	10,353.05 ns	1.27	3,232 B

main	80	10,495.06 ns	178.951 ns	149.432 ns	10,558.71 ns	1.00	5,984 B
pr-grouped	80	17,428.86 ns	340.630 ns	509.840 ns	17,763.80 ns	1.64	13,784 B
pr	80	10,174.41 ns	201.227 ns	223.663 ns	10,056.71 ns	0.97	13,784 B

main	96	13,642.04 ns	233.097 ns	218.039 ns	13,487.46 ns	1.00	13,128 B
pr-grouped	96	20,239.71 ns	397.674 ns	473.403 ns	20,537.87 ns	1.47	14,168 B
pr	96	12,179.03 ns	201.938 ns	188.893 ns	12,071.38 ns	0.89	14,168 B

main	112	15,849.29 ns	197.159 ns	184.423 ns	15,941.89 ns	1.00	13,128 B
pr-grouped	112	22,182.71 ns	424.558 ns	505.406 ns	22,084.05 ns	1.40	14,552 B
pr	112	14,118.42 ns	17.055 ns	14.241 ns	14,119.91 ns	0.89	14,552 B

main	128	18,360.61 ns	357.130 ns	438.588 ns	18,569.98 ns	1.00	13,128 B
pr-grouped	128	23,802.49 ns	38.615 ns	30.148 ns	23,803.64 ns	1.29	14,936 B
pr	128	16,259.46 ns	318.608 ns	549.582 ns	15,948.75 ns	0.88	14,936 B

Good point. Perhaps we should make that change first?

I can put up a PR for it in parallel to this, but I imagine the exact numbers shouldn't matter for this PR.

stephentoub · 2022-01-19T16:15:04Z

src/libraries/System.Net.Http/src/System/Net/Http/Headers/HttpHeaders.cs

@@ -159,28 +169,33 @@ internal bool TryAddWithoutValidation(HeaderDescriptor descriptor, IEnumerable<s
                throw new ArgumentNullException(nameof(values));
            }

-            using (IEnumerator<string?> enumerator = values.GetEnumerator())
+            using IEnumerator<string?> enumerator = values.GetEnumerator();


Based on usage in, say, YARP, do we have a sense for what the most common concrete type of values is? If it's typically an array or List<string>, or even an IList<string>, it might be worth special-casing those types. PGO will hopefully be able to help even without that, but I suspect for the foreseeable future even if it can avoid some of the interface dispatch it probably won't be able to avoid the enumerator allocation.

The common case is certainly TryAddWithoutValidation(string, string), but if we do end up using the IEnumerable overload, it would always be string[] (since the source is StringValues).

Opened #64049 to split off such optimizations from this PR.
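
The special-casing discussed in this thread could look roughly like the sketch below. It is not the code from this PR or from #64049. `HeaderValueCollector` and `CollectValues` are hypothetical names, and nullable annotations are left out for brevity. It only shows how a type check for `string[]` lets the common case run without going through `IEnumerator<string>`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of System.Net.Http.
public static class HeaderValueCollector
{
    // Copies the non-null values into 'destination' and returns how many were added.
    public static int CollectValues(IEnumerable<string> values, List<string> destination)
    {
        if (values is null) throw new ArgumentNullException(nameof(values));
        if (destination is null) throw new ArgumentNullException(nameof(destination));

        int added = 0;

        if (values is string[] array)
        {
            // Fast path: index the array directly, so no enumerator object
            // is allocated and no interface dispatch is needed.
            for (int i = 0; i < array.Length; i++)
            {
                if (array[i] != null)
                {
                    destination.Add(array[i]);
                    added++;
                }
            }
        }
        else
        {
            // General path: falls back to IEnumerable<string>.GetEnumerator().
            foreach (string value in values)
            {
                if (value != null)
                {
                    destination.Add(value);
                    added++;
                }
            }
        }

        return added;
    }
}
```

Both paths produce the same result; only the cost of iterating differs, which is why such a change can be split off from the main PR.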

stephentoub · 2022-01-19T16:27:32Z

src/libraries/System.Net.Http/src/System/Net/Http/Headers/HttpHeaders.cs

-            _headerStore ??= new Dictionary<HeaderDescriptor, object>();
-
-            foreach (KeyValuePair<HeaderDescriptor, object> header in sourceHeadersStore)
+            foreach (HeaderEntry entry in sourceHeaders.GetEntries())


If sourceHeaders is backed by a dictionary, this is first going to copy the data into an array, only to then enumerate that array and throw it away. Can we do better?

Separately but related, how common is it for AddHeaders to be called on an empty collection? I'm wondering if we can optimize the transfer from sourceHeaders to this collection in some common cases by cloning the whole data structure rather than by adding each header individually.

> If sourceHeaders is backed by a dictionary, this is first going to copy the data into an array, only to then enumerate that array and throw it away. Can we do better?

We could, but it would mean either duplicating some logic in callers or incurring the overhead of a custom enumerator in the common case. Related: #62981 (comment)

> how common is it for AddHeaders to be called on an empty collection?

AddHeaders is mainly used in two places:

- Copying defaultRequestHeaders to the request in HttpClient. I'm not sure how common it is to not specify any headers on the request, but this sounds like a nice optimization we should just do here if the target is empty.

- Creating a new HttpContent in DecompressionHandler and copying all the headers. In that case the destination always starts out as empty. I opened "DecompressionHandler could move the headers more efficiently" (#63632) for this case since we could avoid copying altogether.

If it turns out it's really common to have a lot of defaultRequestHeaders and few headers on the request, we could optimize for that too in the future.
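
The "clone when the target is empty" idea from this thread can be sketched as follows. This is an illustration under simplifying assumptions, not the PR's implementation: `SimpleHeaderStore` is a hypothetical type, header names and values are plain strings, and the real store holds richer value objects that may need more care when copied.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for an array-backed header store.
public sealed class SimpleHeaderStore
{
    private KeyValuePair<string, string>[] _entries = Array.Empty<KeyValuePair<string, string>>();
    private int _count;

    public int Count => _count;

    public void Add(string name, string value)
    {
        if (_count == _entries.Length)
        {
            // Grow geometrically, starting at 4 slots.
            Array.Resize(ref _entries, Math.Max(4, _entries.Length * 2));
        }
        _entries[_count++] = new KeyValuePair<string, string>(name, value);
    }

    public bool Contains(string name)
    {
        // O(n) linear scan over the used part of the array.
        for (int i = 0; i < _count; i++)
        {
            if (string.Equals(_entries[i].Key, name, StringComparison.OrdinalIgnoreCase))
            {
                return true;
            }
        }
        return false;
    }

    // Copies every header from 'source' that is not already present here.
    public void AddHeaders(SimpleHeaderStore source)
    {
        if (source is null) throw new ArgumentNullException(nameof(source));
        if (source._count == 0) return;

        if (_count == 0)
        {
            // Fast path: the target is empty, so nothing can conflict.
            // Clone the backing array in one copy instead of adding one by one.
            _entries = new KeyValuePair<string, string>[source._entries.Length];
            Array.Copy(source._entries, _entries, source._count);
            _count = source._count;
            return;
        }

        // Slow path: check each header against the existing entries.
        for (int i = 0; i < source._count; i++)
        {
            KeyValuePair<string, string> entry = source._entries[i];
            if (!Contains(entry.Key))
            {
                Add(entry.Key, entry.Value);
            }
        }
    }
}
```

The fast path skips the per-header existence check, which is the part that costs a linear scan in an array-backed store.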


stephentoub · 2022-01-19T17:03:10Z

src/libraries/System.Net.Http/src/System/Net/Http/Headers/HttpHeaders.cs

+                    {
+                        return ref entries[i].Value!;
+                    }
+                }


Do we need to open-code these comparison loops, or could we use IndexOf?

Since we're searching against entries[i].Key and not the whole type, we would need a custom IndexOf method anyway.
When I extracted that logic and replaced the 3 loops here with it, performance regressed by ~5% and the LOC count went up.

> we would need a custom IndexOf method anyway

Just a custom comparer, no?

If it's not a good tradeoff, fine. I just want us to strongly prefer using built-in functionality whenever possible.
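
For readers following this thread, an open-coded scan of the kind being discussed might look like the sketch below. It assumes .NET 5 or later for `Unsafe.NullRef` and `Unsafe.IsNullRef`. The `Entry` and `EntryLookup` names are hypothetical, and the key is a plain string rather than a `HeaderDescriptor`. The point is that the loop compares only the key and returns a `ref` to the value slot, which a plain `IndexOf` over whole entries would not do directly.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical entry type for illustration; the key is simplified to a string.
public struct Entry
{
    public string Key;
    public object Value;
}

public static class EntryLookup
{
    // Returns a reference to the value slot of the matching entry,
    // or a null reference if no entry has the given key.
    public static ref object GetValueRefOrNullRef(Entry[] entries, int count, string key)
    {
        for (int i = 0; i < count && i < entries.Length; i++)
        {
            // Only the key takes part in the comparison, not the whole entry.
            if (string.Equals(entries[i].Key, key, StringComparison.OrdinalIgnoreCase))
            {
                return ref entries[i].Value;
            }
        }

        return ref Unsafe.NullRef<object>();
    }
}
```

A caller checks the result with `Unsafe.IsNullRef` and, on a hit, can read or overwrite the value in place without a second lookup.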

stephentoub · 2022-01-19T17:05:38Z

src/libraries/System.Net.Http/src/System/Net/Http/Headers/HttpHeaders.cs

+                if ((uint)count < (uint)entries.Length)
+                {
+                    entries[count] = entry;
+                    _count++;


Would this be better as _count = count + 1;?

It's just slightly worse, as it has to store an extra value on the stack.
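
The append pattern in the hunk above can be shown in isolation. The following is a minimal sketch with hypothetical names (`EntryBuffer`, `AddWithResize`) and string entries instead of `HeaderEntry`; it mirrors the shape of the diff and is not the PR's code. `List<T>.Add` in the runtime uses the same unsigned-compare shape.

```csharp
using System;

// Hypothetical growable buffer for illustration only.
public sealed class EntryBuffer
{
    private string[] _entries = new string[4];
    private int _count;

    public int Count => _count;

    public string this[int index]
    {
        get
        {
            if ((uint)index >= (uint)_count) throw new ArgumentOutOfRangeException(nameof(index));
            return _entries[index];
        }
    }

    public void Add(string entry)
    {
        // Read the fields into locals once.
        string[] entries = _entries;
        int count = _count;

        // A single unsigned comparison covers both "count >= 0" and
        // "count < entries.Length", so the store below is known to be in range.
        if ((uint)count < (uint)entries.Length)
        {
            entries[count] = entry;
            _count++; // the review asks about writing this as "_count = count + 1"
        }
        else
        {
            AddWithResize(entry);
        }
    }

    // Kept out of line so the common path in Add stays small.
    private void AddWithResize(string entry)
    {
        int count = _count;
        Array.Resize(ref _entries, Math.Max(4, _entries.Length * 2));
        _entries[count] = entry;
        _count = count + 1;
    }
}
```

Whether `_count++` or `_count = count + 1` generates better code depends on the JIT, which is what the exchange above is about; the sketch does not settle that.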


* [RateLimiting] Dequeue items when queuing with NewestFirst (#63377) * Don't reuse registers in Debug mode (#63698) Co-authored-by: Bruce Forstall <brucefo@microsoft.com> * Add IsKnownConstant jit helper and optimize 'str == ""' with str.StartsWith('c') (#63734) Co-authored-by: Miha Zupan <mihazupan.zupan1@gmail.com> Co-authored-by: SingleAccretion <62474226+SingleAccretion@users.noreply.github.com> * - mono_wasm_new_external_root for roots on stack (#63997) - temp_malloc helper via linear buffer in js - small refactorings Co-authored-by: Katelyn Gadd <kg@luminance.org> * [Arm64] Don't use D-copies in CopyBlock (#63588) * Increase the maximum number of internal registers allowd per node in src/coreclr/jit/lsra.h * Based on discussion in https://github.com/dotnet/runtime/issues/63453 don't allocate a SIMD register pair if the JIT won't be able to use Q-copies in src/coreclr/jit/lsraarmarch.cpp * Update CodeGen to reflect that Q-copies should be used only when size >= 2 * FP_REGSIZE_BYTES and using of them makes the instruction sequence shorter in src/coreclr/jit/codegenarmarch.cpp * Update comment - we don't use D-copies after that change in src/coreclr/jit/codegenarmarch.cpp * Disable hot reload tests for AOT configurations (#64006) * Bump Explicit-layout value types with no fields to at minimum 1 byte size. 
(#63975) * Add runtime-extra-platforms pipeline to have matching runtime PR and Rolling builds (#62564) * Add runtime-extended-platforms pipeline to have matching runtime PR and Rolling builds * Fix evaluate changed paths condition for the extra pipeline * PR Feedback and fix condition * Move MacCatalyst back to staging, disable tvOS tests * Disable browser wasm windows legs * Make ILStubGenerated event log ModuleID corresponding to that on other events (#63974) * Retries for flaky WMI test (#64008) * [arm64] JIT: Redundant zero/sign extensions after ldrX/ldrsX (#62630) * JIT: fix up switch map for out-of-loop predecessor (#64014) If we have a loop where some of the non-loop predecessors are switchs, and we add pre-header to the loop, we need to update the switch map for those predecessors. Fixes #63982. * Update StructMarshalling design now that DisableRuntimeMarshallingAttribute is approved (#63765) Co-authored-by: Elinor Fung <elfung@microsoft.com> * Fix Crossgen2 bug #61104 and add regression test (#63956) The issue tracks the runtime regression failure where Crossgen2-compiled app is unable to locate a type with non-ASCII characters in its name. The failure was caused by the fact that Crossgen2 was incorrectly zero-extending the individual UTF8 characters when calculating the hash whereas runtime is sign-extending them. Thanks Tomas * Fix invalid threading of nodes in rationalization (#64012) The code in question assumes that the ASG will be reversed and thus threads "simdTree" before "location" in the linear order. That dependency, while valid, because "gtSetEvalOrder" will always reverse ASGs with locals on the LHS, is unnecessary and incorrect from the IR validity point of view. Fix this by using "InsertAfter" instead of manual node threading. 
* Check if the child object is in the heap range before get_region_plan_gen_num (#63828) * Check if the child object is in the heap range before object_gennum (#63970) * 'cmeq' and 'fcmeq' Vector64<T>.Zero/Vector128<T>.Zero ARM64 containment optimizations (#62933) * Initial work * Added a comma to display * Cleanup * Fixing build * More cleanup * Update comment * Update comment * Added CompareEqual Vector64/128 with Zero tests * Do not contain op1 for now * Wrong intrinsic id used * Removing generated tests * Removing generated tests * Added CompareEqual tests * Supporting containment for first operand * Fix test build * Passing correct register * Check IsVectorZero before not allocing a register * Update comment * Fixing test * Minor format change * Fixed formatting * Renamed test * Adding AdvSimd_Arm64 tests: * Adding support for rest of 'cmeq' and 'fcmeq' instructions * Removing github csproj * Minor test fix * Fixed tests * Fix print * Minor format change * Fixing test * Added some emitter tests * Feedback * Update emitarm64.cpp * Feedback * [Arm64] Keep unrolling InitBlock and CopyBlock up to 128 bytes (#63422) * Add INITBLK_LCL_UNROLL_LIMIT and CPBLK_LCL_UNROLL_LIMIT of 128 bytes in src/coreclr/jit/targetarm64.h * Keep unrolling InitBlock up to INITBLK_LCL_UNROLL_LIMIT bytes when dstAddr points to the stack in src/coreclr/jit/lowerarmarch.cpp * Keep unrolling CopyBlock up to CPBLK_LCL_UNROLL_LIMIT bytes when both srcAddr and dstAddr point to the stack in src/coreclr/jit/lowerarmarch.cpp * Add ProcessLinkerXmlBase to NativeAOT (#63666) Add Xml Parsing linker files as a reference source to NativeAOT Rename NativeAOT ProcessLinkerXmlBase version to ProcessXmlBase (uses XmlReader) Add ProcessLinkerXmlBase from linker and fix it so it can be used in NativeAOT (uses XPath) * Fix gc_heap::remove_ro_segment (#63473) * Fix OpenSSL version check in GetAlpnSupport The previous check failed 3.0.0 because the Minor was 0 and Build was 0. 
It could probably be rewritten to be `>= new Version(1, 0, 2)`, but that'd require more thinking. * Fix issues with verify_regions, clear_batch_mark_array_bits. (#63798) Details: - we cannot verify the tail of the region list from background GC, as it may be updated by threads allocating. - fix case in clear_batch_mark_array_bits where end is equal to the very end of a segment and we write into uncommitted memory in the mark_array. - bgc_clear_batch_mark_array_bits did some checks and then called clear_batch_mark_array_bits which repeated the exact same checks. Renamed clear_batch_mark_array_bits to bgc_batch_mark_array_bits and removed the old copy, removed the declaration for clear_batch_mark_array_bits. * [debugger][wasm] Added support for non user code attribute (#63876) * Hidden methods and step through methods behave the same way. * Perpared flow for setting JustMyCode in the future. * Tests for JustMyCode setting before debug launch. * Transformed into dynamic JustMyCode change flow. * JustMyCode disabled, first 3 cases solved. * Finished behavior for JMC disabled (with 1 difference). * JMC enabled: stepIn np bp + stepIn bp + resume bp. * Functional version (with minor deviations from expected behavior). * Refactoring. * All tests for NonUserCode work. * Fix line number after adding code above. * Fix error in merge. * Removing duplicated tests. * [wasm][debugger] Added support for stepper boundary attribute (#63991) * Hidden methods and step through methods behave the same way. * Perpared flow for setting JustMyCode in the future. * Tests for JustMyCode setting before debug launch. * Transformed into dynamic JustMyCode change flow. * JustMyCode disabled, first 3 cases solved. * Finished behavior for JMC disabled (with 1 difference). * JMC enabled: stepIn np bp + stepIn bp + resume bp. * Functional version (with minor deviations from expected behavior). * Refactoring. * All tests for NonUserCode work. * Fix line number after adding code above. 
* Stepper boundary with tests. * Save information about multiple decorators. * Fix error in merge. * Polish the PR build doc (#64036) * [wasm] WebSocket tests on NodeJS (#63441) - NPM package with WS. - Restore npm during build. - Load npm modules in test-main.js. Co-authored-by: Pavel Savara <pavel.savara@gmail.com> * Fix dependency in runtime-official.yml (#64040) After https://github.com/dotnet/runtime/pull/62564 the `hostedOs` value is included in the job name. * [API Implementation]: System.Diagnostics.CodeAnalysis.StringSyntaxAttribute (#62995) * Add StringSyntaxAttribute * Fix attribute declaration and add usage * Address PR feedback Co-authored-by: Stephen Toub <stoub@microsoft.com> * Reduce the memory footprint of HttpHeaders (#62981) * Change HttpHeaders backing store to an array * Reduce the size of HeaderDescriptor to 1 object * Update UnitTests, fix GetOrCreateHeaderInfo * Switch to a dictionary after ArrayThreshold headers * Add unit tests * Use storeValueRef naming consistently * Workaround field layout regression (#63005) * Mark _descriptor on HeaderDescriptor as nullable * Remove HeaderDescriptor.Descriptor and add HasValue, IsKnownHeader, Equals * Simplify HttpHeaderParser.Separator logic * Add comments on HasValue checks * Lazily group headers by name * Add a header ordering+grouping test * Make use of the _count field * Revert all HeaderDescriptor changes from PR * Switch back to always grouping by name * Assert that the collection is not empty in GetEnumeratorCore * Optimize AddHeaders for empty collections * Reference the Roslyn bug issue * Assert that multiValues are never empty * Don't preserve a Dictionary across Clear * Add comment about why a custom HeaderEntry type is used * Disable DirectoryLongerThanMaxLongPathWithExtendedSyntax_ThrowsException (#64044) * Add test coverage for frozen objects and GC interaction (#64030) * Test coverage for frozen objects and GC interaction * Update Preinitialization.cs * Remove Type.MakeGenericType 
dependency from source generation (#64004) * Remove Type.MakeGenericType dependency from srcgen * address feedback * add trimmer warning suppression * address feedback * Add ns2.0 support to System.Formats.Cbor (#62872) * Add ns2.0 support to System.Formats.Cbor * Add NetFrameworkMinimum to tfms * Add ReadHalf and WriteHalf to compatibility suppressions * Remove unwanted comment * Exception sets: debug checker & fixes (#63539) * Add a simple exception sets checker * Add asserts to catch missing nodes * Fix normal VN printing * Fix JTRUE VNs * Fix PHI VNs * Update VNs for "this" ARGPLACE node * Tolerate missing VNs on PHI_ARGs We do not update them after numbering the loops. (Though perhaps we should) * Tolerate unreachable blocks * Fix exception sets for VNF_PtrTo VNFuncs * Add VNUniqueWithExc * Add VNPUniqueWithExc * Fix arrays * Consistently give location nodes VNForVoid And always add exception sets for them. This will simplify the exception set propagation code for assignments. * Fix CSE * Fix GT_RETURN * Fix LCLHEAP * Fix GT_ARR_ELEM * Fix unique HWI * Fix unique SIMD * Fix GT_SWITCH * Fix CKFINITE * Fix HWI loads * Fix fgValueNumberAddExceptionSetForIndirection The method does not need to add the exception set for the base address. Additionally, the way it did add the sets, by unioning with normal value numbers, lost all exceptions not coming from the base address. This was fine for the unary loads, but broke the HWI loads that could have exceptions coming from not just the address. * Fix GT_RETFILT * Fix INIT_VAL * Fix DYN_BLK * Fix FIELD_LIST * De-pessimize CkFinite * Add a test for HWIs * Add a test for LCLHEAP * Change test to check for store block operators (#60878) * Update XUnit to 2.4.2-pre.22 (#63948) * Update to Xunit build 2.4.2-pre.13 Also pick up latest pre-release of analyzers * Disambiguate calls to Assert.Equals(double,double,int) Xunit added a new Assert overload that caused a lot of ambiguous calls. 
https://github.com/xunit/xunit/issues/2393 Workaround by casting to double. * Fix new instances of xUnit2000 diagnostic * Workaround xUnit2002 issue with implicit cast Works around https://github.com/xunit/xunit/issues/2395 * Disable xUnit2014 diagnostic This diagnostic forces the use of Assert.ThrowsAsync for any async method, however in our case we may want to test that a method will throw synchronously to avoid regressing that behavior by moving to the async portion of the method. * Use AssertExtensions to test for null ArgumentException.ParamName Workaround https://github.com/xunit/xunit/issues/2396 * Update to Xunit 2.4.2-pre.22 * Fix another ArugmentException.ParamName == null assert * Preserve OBJ/BLK on the RHS of ASG (#63268) One of my upcoming changes will need this information to accurately detect type mismatch in "fgValueNumberBlockAssignment". * Revert "Temporarily disable coredumps during library testing on macOS (#63742)" (#64057) This reverts commit 2c28e63f9360280011a3b03c1ca6dc0edce1fae4. Fixes #63761 * Performance: Fix Browser Wasm job not being found for dependent jobs (#64058) * Figure out the name that browser wasm now uses. * linux to the Browser wasm depends on name. Update the browser wasm dependson name to match the new one found in the pipeline. * Fix exception propagation over HW exception frame on macOS arm64 (#63596) * Fix exception propagation over HW exception frame on macOS arm64 There is a problem unwinding over the PAL_DispatchExceptionWrapper to the actual hardware exception location. The unwinder is unable to get distinct LR and PC in that frame and sets both of them to the same value. This is caused by the fact that the PAL_DispatchExceptionWrapper is just an injected fake frame and there was no real call. Calls always return with LR and PC set to the same value. 
The fix unifies the hardware exception frame unwinding with Linux where we had problems unwinding over signal handler trampoline, so PAL_VirtualUnwind skips the trampoline and now also the PAL_DispatchExceptionWrapper frame by copying the context of the exception as the unwound context. * Reenable DllImportGenerator.Unit.Tests * Add StringSyntax attribute to Regex.pattern field (#64063) I missed adding this one in my initial audit. It'll be exceedingly rare for a developer to manually write code that assigns a string to this protected field, but every source-generated regex does so, and thus any colorization VS provides will benefit looking at the source-generated code. * Sync shared code from aspnetcore (#64059) Co-authored-by: JamesNK <JamesNK@users.noreply.github.com> * Read the System.GC.CpuGroup settings in runtimeconfig.json (#64067) * Log message of unexpected exception in ThrowsAny (#64064) * Log message of unexpected exception in ThrowsAny * Update AssertExtensions.cs * Enable some browser legs on the extra-platforms pipeline (#64065) * Enable some browser legs on the extra-platforms pipeline * Flow platform parameter from helix queues templates * Fix another condition * Allow CreateScalarUnsafe to be directly contained by hwintrinsics that support scalar loads (#62407) * Ensure that floating-point constants can be contained by hardware intrinsics * Allow CreateScalarUnsafe to be directly contained by hwintrinsics that support scalar loads * Rename IsContainableHWIntrinsicOp to TryGetContainableHWIntrinsicOp and improve handling * Ensure that NI_AVX2_BroadcastScalarToVector128/256 are properly tracked as MaybeMemoryLoad * Applying formatting patch * Ensure a few other "maybe memory" and special memory operand size cases are handled * Applying formatting patch * Remove commented code (#63869) * Add pmi_path argument to superpmi.py script and use it in the superpmi-collect pipeline. 
(#63983) * Add -pmi_path argument to superpmi.py collect command and use it to set PMIPATH environment variable in src/coreclr/scripts/superpmi.py * Set pmi_path to $(SuperPMIDirectory)\crossgen2 * Print a warning if -pmi_path or -pmi_location is specified while --pmi is not in src/coreclr/scripts/superpmi.py * Move setting of PMIPATH environment variable under `if self.coreclr_args.pmi is True:` in src/coreclr/scripts/superpmi.py * Move pmi argument validation to setup_args() in src/coreclr/scripts/superpmi.py * Clone root_env if we are going to set PMIPATH environment variable in src/coreclr/scripts/superpmi.py * Update the macOS CoreCLR building documentation. (#63932) This updates the documentation to refer to the up-to-date location of requirements and prerequisites. * Introduce RandomAccess.SetLength (#63992) * don't Flush readonly MemoryMappedViewAccessor on disposal (#63794) * don't Flush if it's impossible to write * address code review feedback: apply same optimization to MemoryMappedViewStream * Implement System.Runtime.CompilerServices.DisabledRuntimeMarshallingAttribute on CoreCLR-family of runtimes/type systems (#63320) * Add the DisableRuntimeMarshallingAttribute to the build. * Add initial test suite * Implement support in IL stubs for the "disabled runtime marshalling" feature. * Add testing for inlining IL stubs. * Block SetLastError and LCID support when DisableRuntimeMarshallingAttribute is applied. * Bump NativeAOT-only R2R version header (missed previously) * Implement support in crossgen2 and NativeAOT * Clean up the test tree and update the tests to fail more reliably when bugs are present. Fix a bug that was uncovered when the tests were refactored. * Fix NativeAOT and clean up crossgen2 * Add a test for NoPreserveSig with DisableRuntimeMarshalling * Assign hr in SUCCEEDED macro. * PR feedback. * Block varargs in disabled marshalling mode. * Fix typo * Block types that have a field that is auto-layout somewhere in their layout. 
* Fix typo * Revert the AutoLayoutOrHasAutoLayoutFIeld check in the "marshalling enabled" case * Only set scope when it isn't null (it's null for some cases). * Fix narrowing conversion failure. * First pass simple implementation in Mono * Fix assert to still work for the built-in marshalling system * S_FALSE is a thing * Fix type load failures caused by eager type handle loading. * Get MethodILScope from the calling method when available (this covers all cases where we need it) * Add const modifier. * Try 2 to fix const modifiers * Fix compilation of NativeAOT jitinterface * Fix type lookup in Mono * Use try_get model for getting the attribute type in the case of failure. Fix mono implementation for looking up the attribute. * Handle void and generic instantiations * Update auto-layout check to check recursively in layout. * Enhance test suite with more tests for UnmanagedCallersOnly, generics, and the like. Fix AutoLayout test. * Fix IL and a few typos * Set a value in the padding for easier debugging. * Create sig->marshalling_disabled to track when marshalling is disabled, which is separate from the concept of "is this signature a P/Invoke" * Fix running test suite on Mono + Mini JIT * Fix recursive type load failure by only checking the "has auto-layout or field with auto-layout" for value types. * Fix mono windows build. * Feedback from Michal. * Fix bug in EcmaAssembly.HasAssemblyCustomAttribute * Make the runtime flavor check in the wrapper generator case-invariant * Use helper method since various different platforms/configurations throw different exceptions for these scenarios. * Fix AutoLayout test refactor and use a dummy value for the padding field in both enabled and disabled scenarios. * Add an explicit test for using enums as they're a little weird and needed some special-casing. * Fix build-time test filtering in xunit wrapper generator. * Fix some x86-specific issues * Add a nice big comment block. 
* Fix x86 * Refactor tests so we can skip one on Mono since Char->char lossy conversion is not supported. * Disable test in issues.targets until an alternative solution is reached. * Add another SkipOnMono attribute in the "Enabled" test suite. * Apply UnmangedFunctionPointerAttribute to help hint to the Mono LLVM AOT compiler to compile the managed->native thunks at aot-time * Unify on "runtime marshalling" terminology * Clean up unused usings. * Address Jan's feedback except for applying the attribute to CoreLib. * PR feedback. * Mono throws an InvalidProgramException for varargs * Fix copy-paste issue. * Make sure we use the P/Invoke's Module and not the caller's module when deciding if runtime marshalling is enabled for a varargs P/Invoke call. * Handle how LLVM AOT reports the failure to handle varargs (EEE) * Make ILLink validation steps in libs incrementally buildable (#64041) * Make ILLink validation steps in libs incrementally buildable Both the illink-oob and the illink-sharedframework targets don't define Inputs and Outputs which makes them run during no-op incremental builds. This change defines Inputs and Outputs based on what's used during the target's execution so that if the input assemblies or the illink assembly itself haven't changed, the step will be skipped. Also renaming properties and items to make them more readable and consistent. As these target files are "extensions" of the src.proj file and aren't shared anywhere, they can be treated like logic inside a project file and hence prefixing properties and items with an underscore "_" isn't necessary. * Fix broken callstacks in interpreter on MonoVM. (#60338) * Fix some broken callstacks in interpreter. * Fix build error. 
* Initial WASI support prototype (#63890) * Add StringSyntaxAttribute.Json (#64081) * [main] Update dependencies from 5 repositories (#64002) Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com> Co-authored-by: Přemek Vysoký <premek.vysoky@microsoft.com> * Fix crash when VS4Mac is debugging VS4Mac arm64 (#64085) Fix crash when VS4Mac is debugging VS4Mac arm64 Issue: https://github.com/dotnet/runtime/issues/64011 * ILVerify: Handle readonly references in ldfld (#64077) * ILVerify: Handle readonly references in ldfld Fixes #63953 * Fix test name Co-authored-by: Michal Strehovský <MichalStrehovsky@users.noreply.github.com> * Avoid additional local created for delegate invocations (#63796) Very often 'this' is already a local and we can avoid creating another local. * [wasm][debugger] Apply changes on wasm using sdb protocol. (#63705) * Apply changes on wasm using sdb protocol. * conflict * Merge conflict. * Fix merge * Fix compilation error. * Fixed IsFloatPositiveZero from returning 'true' on non-constant double operands (#64083) * Fixed IsFloatPositiveZero from returning 'true' on non-constant double operands * Update src/coreclr/jit/gentree.h Co-authored-by: Egor Bogatov <egorbo@gmail.com> Co-authored-by: Egor Bogatov <egorbo@gmail.com> * Ensure several helper intrinsics are correctly imported and handled (#63972) * Ensure several helper intrinsics are correctly imported and handled * Ensure that Sum for TYP_INT/UINT on Arm64 is correctly handled * Respond to PR feedback and ensure ExtractMostSignificantBits for Vector64<int/uint> on Arm64 also uses AddPairwise * Applying formatting patch * Ensure the clsHnd is correct * Fix the remaining musl failures * Ensure that we aren't sign-extending TYP_BYTE (System.SByte) for ExtractMostSignificantBits * Ensure an assert is correct on x64 * Ensure Vector64<int/uint>.Dot on Arm64 uses AddPairwise, not AddAcross * Apply formatting patch * RegexNode cleanup (#64074) No functional changes, just 
code cleanup: - Move node types into a RegexNodeKind enum - Rename some of the kinds to make them more descriptive - Rename node.Next to node.Parent to better describe its purpose - Add a bunch of comments about node kinds * Refactor optimizing morph for commutative operations (#63251) * Create "fgOptimizeCommutativeArithmetic" And just move code from "fgMorphSmpOp" to it. Just one diff: better comma throw propagation in an ILGEN method. * Refactor the function Split it into specialized variants for each operator, delete redundant code, fix up one case of wrong typing for a constant in the MUL -> SHIFT optimization. One CSE diff due to different VNs because of the typing change for the constant (int -> long). Many text diffs: "mov x3, 5" => "mov w3, 5". * Do not set GTF_NO_CSE for sources of block copies (#63462) It is not necessary, the compiler fully supports locals on the RHS of a struct assignment. Not marking these results in a CQ improvement, from struct (including SIMD) CSEs and global constant propagation into promoted fields. * Handle embedded assignments in copy propagation (#63447) * Clean things up a little Delete redundant conditions, use "LclVarDsc*", rename locals for clarity. * Delete a redundant condition For actual def nodes, GTF_VAR_CAST will never be set, it is only set in "optNarrowTree" for uses. For "def nodes" that are actually uses (parameters), the VNs will never match anyway. * Handle embedded assignments in copy propagation Previously, as the comments in copy propagation tell us, it did not handle "intervening", or not-top-level definitions of locals, instead opting to maintain a dedicated kill set of them. This is obviously a CQ problem, but also a TP one, as it meant there had to be a second pass over the statement's IR, where the definitions would be pushed on the stack. This change does away with that, instead pushing new definitions as they are encountered in execution order, and simultaneously propagating on uses. 
Notably, this means the code now needs to look at the real definition nodes, i. e. ASGs, not the LHS locals, as those are encountered first in canonical execution order, i. e. for a tree like: ``` ASG LCL_VAR V00 "def" ADD LCL_VAR V00 LCL_VAR V00 ``` Were we to use the "def" as the definition point, we would wrongly push it as the definition on the stack, even as the assignments itself hasn't happened yet at that point. There are nice diffs with this change, all resulting from unblocked propagations, and mostly coming from setup arguments under calls. * Simplify optIsSsaLocal * Update format script permissions so it can be called on Unix systems directly. (#64107) * Revert "Enable System.Text.Json tests on netfx (#63803)" (#64108) This reverts commit 34794bc5f2bcdbaa9057bb07b8764e2bb6a411a2. * Make ApiCompat.proj incrementally buildable (#64037) * Make ApiCompat.proj incrementally buildable In https://github.com/dotnet/runtime/pull/64000, I noticed that ApiCompat.proj never builds incrementally. Even though the RunApiCompat target has Inputs and Outputs, those aren't defined too late inside the target to have any effect. Moving them out and declare the generated response file as an output. Also simplifying some msbuild logic and renaming some properties as underscore prefixes in project files don't make sense if the property isn't reserved in any way. * Update ApiCompat.proj * Remove enable drawing on unix switch (#64084) * Remove enable drawing on unix switch * Update some tests and not run tests that need Drawing on non Windows * PR Feedback, just turn off the switch * Address-expose locals under complex local addresses in block morphing (#63100) * Handle complex local addresses in block morphing In block morphing, "addrSpill" is used when the destination or source represent indirections of "complex" addresses. Unfortunately, some trees in the form of "IND(ADDR(LCL))" fall into this category. 
If such an "ADDR(LCL)" is used as an "addrSpill", the underlying local *must* be marked as address-exposed. Block morphing was using a very simplistic test for when that needs to happen, essentially only recognizing "ADDR(LCL_VAR/FLD)". But it is possible to have a more complicated pattern as "PrepareDst/Src" uses "IsLocalAddrExpr" to recognize indirect stores to locals. Currently it appears impossible to get a mismatch here as morph transforms "IND(ADD(ADDR(LCL_VAR), OFFSET))" into "LCL_FLD" (including for TYP_STRUCT indirections), but this is a very fragile invariant. Transforming TYP_STRUCT GT_FIELDs into GT_OBJs instead of GT_INDs breaks it, for example. Fix this by address-exposing the local obtained via "IsLocalAddrExpr". * Add a TODO-CQ for LCL_FLD usage * [Group 2] Enable nullable annotations for `Microsoft.Extensions.DependencyInjection` (#63836) * Annotate src * Update ResolverBuilder.Build * Update RunOnEmptyStackCore * ILEmitResolverBuilderContext constructor * Remove setter * Add assert * Enable nullable annotations for Microsoft.Extensions.Configuration.UserSecrets (#63700) * [mono] Cleanup trailing whitespace. (#64112) * Delete `GT_DYN_BLK` (#63026) * Import GT_STORE_DYN_BLK directly * Delete GT_DYN_BLK * DynBlk -> StoreDynBlk * Add some tests * Mark tests Pri-1 * Rebase and fix build * Bring back the odd early return * Ignore conversion exceptions during dictionary construction (#63792) * Extract SuperPMI into a separate component (#64035) Allows building the runtime without SPMI. `build.cmd clr` will still build SPMI. `build.cmd clr.native` will still build SPMI. `build.cmd clr.runtime` will no longer build SPMI. This is mostly motivated by NativeAOT subset builds where SPMI contributes 10% of the native build time (the NativeAOT CoreCLR subset builds pretty quickly compared to full CoreCLR). 
* Add COMWrappers to crossgen (#63969) * pipelines: Add wasm jobs (#64109) * Fixing update issue with multivalued properties #34267 (#56696) * Add custom attribute test * Adding test demonstrating issue #34267 * Solution for issue #34267 Replacing all values in property with the new collection, instead of just appending new values, leaving old values in place. * Incorporate review feedback Changing the variable name * Relax assert in ApplyEditAndContinue (#64132) Fixes #64070 * Disable NJulianRuleTest test crashing in CI (#64142) * Updating unit tests for DirectoryServices.AccountManagement (#56670) Removing old, redundant unit tests that were actually never executed Migrating old tests to new test infrastructure with configurable LDAP/AD connections * Fix MultiByteToWideChar call in pal (#64146) * Extra tests for assembly name parser. (#64022) * Dead code in native assembly name parsing * disallow `\u` escaping in assembly names * misc cleanup * forward slash is illegal escaped or not * ignore "language" attribute in assembly name ("culture" must be used) * duplicate attributes are ok if unrecognized (just add tests) * drop support for "custom" blob attribute * drop support for publickey[token]=neutral ("null" must be used) * ignore unknown assembly name attributes in mono (compat) * disallow \0 anywhere in the assembly name * disallow \0 in assembly names on mono (compat) * only check for embedded nulls when parsing * fix mono build * make GCC happy * couple test scenarios for publickey vs. publickeytoken (CoreRT parser might trip on these) * produce errors on duplicate known attributes in mono * Dispose LdapConnections used by ValidateCredentials (#62036) Ensure that cached LdapConnection instances created by PrincipalContext.ValidateCredentials are disposed when the corresponding PrincipalContext is disposed. 
Fix #62035 * Add runtime support for `ref` fields (#63985) * Add mono and coreclr runtime support for ref fields * Update Reflection.Emit tests to validate ref fields. Add test for TypedReference as a ref field. * Spmi replay asmdiffs mac os arm64 (#64119) * Split unix-arm64 into linux-arm64 and osx-arm64 in src/coreclr/scripts/superpmi-replay.proj * Split unix-arm64 into linux-arm64 and osx-arm64 in src/coreclr/scripts/superpmi-asmdiffs.proj * Add all subdirectories of $(SuperPMIDirectory) as PMIPATH in src/coreclr/scripts/superpmi-collect.proj * Update NativeAOT codegen and Crossgen2 for CreateSpan (#63977) - Make sure FieldRVA pointers remain aligned as required by the code generator - Use the same Packing Size approach as the IL Linker will use (See jbevain/cecil#817 for details) - Compilers that generate CreateSpan will need to follow that trick to be compatible with rewriters. - Provide ECMA spec augment describing packing size detail * Add alignment to mapped field stream (#63305) * Align MappeFieldDataStream at 8 byte boundary * Add test to verify that the mapped field rva data blob is aligned to ManagedPEBuilder.MappedFieldDataAlignment * Only align when the mapped field data is of size not equal to 0 * Implement hash and HMAC stream one shots This implements hashing and HMAC statics for streams. Additionally, "LiteHmac" and "LiteHash" were introduced. The existing HMAC and hash provider functionality do some bookkeeping we don't need for resetting. Since we do not need to use these hash handles after the digest has been finalized, resetting is unnecessary work. For HMAC, that also means keeping a copy of the key around for some implementations which we don't need to do. The LiteHash and LiteHmac types are implemented as structs with a common interface. To avoid boxing, generics are used and constrained to the interface where possible. The Browser implementation just defers to the existing HashDispenser rather than do anything novel. 
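The one-shot idea in the hashing commit above (create the hash state, feed it the stream, finalize, then discard it without any reset) can be sketched as follows. This is a Python illustration only, not the .NET implementation: `hash_stream`, `hmac_stream`, the default algorithm, and the chunk size are hypothetical names and values chosen for the example.

```python
import hashlib
import hmac
import io


def hash_stream(stream, algorithm="sha256", chunk_size=4096):
    """Hash a readable binary stream in one shot (hypothetical helper)."""
    digest = hashlib.new(algorithm)
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    # The state is finalized and dropped here; nothing is reset for reuse.
    return digest.digest()


def hmac_stream(key, stream, algorithm="sha256", chunk_size=4096):
    """Compute an HMAC over a readable binary stream in one shot (hypothetical helper)."""
    mac = hmac.new(key, digestmod=algorithm)
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        mac.update(chunk)
    # No copy of the key needs to outlive this call.
    return mac.digest()


stream_hash = hash_stream(io.BytesIO(b"hello world"))
stream_mac = hmac_stream(b"secret", io.BytesIO(b"hello world"))
```

Because the state never has to be reused, a helper like this has no reason to keep reset bookkeeping or a copy of the key around, which is the saving the commit message describes.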
The HashProviderCng is somewhat specialized in its ability to reset. It did an up-front check to determine if the platform supported reusable hash providers, and further had a single implementation for HMAC and Digests. The current Lite hash design requires that they remain separate types. * Title and message resources should be enforced to exist to prevent printing empty messages (#64151) Sync ILLink.Shared folder with the latest version in dotnet/linker main branch List of changes include: - Enforce title and message resources to exist to prevent printing empty messages - All diagnostics produced by linker now have a DiagnosticId, a title and a message - Schema for xml link attributes file - Added a readme file to the ILLink.Shared project to keep track of the commit being used from dotnet/linker * Allow generating Dwarf version 5 (#63988) Contributes to https://github.com/dotnet/runtimelab/issues/1738. * Re-enable failing long path test (#64113) * Port MD4 managed implementation from mono/mono (#62074) Porting MD4 managed implementation from mono/mono (MD4.cs and MD4Managed.cs). It adds: - an internal class in System.Net.Security with a single HashData method for now; - a set of related MD4 unit tests to the System.Net.Security.Unit.Tests project. * Fix one source of perf regression in GCHeap::Alloc. This impacts the System.Collections.CtorFromCollectionNonGeneric<Int32> family of benchmarks. (#64091) These benchmarks manage to make GCHeap::Alloc into a hotspot, so the call to IsHeapPointer() at the end matters for performance. * Add blsr (#63545) * Fix FileSystemAclExtensions.Create when passing a null FileSecurity (#61297) * Make FileSecurity parameter nullable. * Add missing ArgumentException message for FileMode.Append. * Refactor tests to ensure FileSecurity is tested with all FileMode and FileSystemRights combinations. Separate special cases. * Remove exception that throws when FileSecurity is null. 
Ensure we have logic that can create a FileHandle when FileSecurity is null. Fix bug where FileShare.Inheritable causes IOException because it is being unexpectedly passed to the P/Invoke (it should just be saved in the SECURITY_ATTRIBUTES struct). Add documentation to mention this parameter as optional. Ensure all exceptions match exactly what we have in .NET Framework, with simpler logic. * Address suggestions Co-authored-by: carlossanlop <carlossanlop@users.noreply.github.com> * Tune FP CSEs live across a call better (#63903) The problem was that the comparison of a weighted refcount, which is usually on the order of tens or hundreds, with a small constant like "4" was too weak and missed some cases where we were still trying to CSE cheap floats across calls and ending up with lots of stack shuffling. Fix this by using different tuning parameters, namely the costs estimated for the uses and defs (increase them to account for the spills and reloads). * [main] Update dependencies from dotnet/arcade dotnet/icu dotnet/xharness dotnet/emsdk (#64098) Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com> * Update zip extraction to never throw any exceptions when the LastWriteTime update fails (#63912) * Use kebab-case in FB automation labels (#64048) * Onboard new Triage & PR Boards (#64198) * Exclusively use GitHub teams for Libraries area mentions (#64199) * Reduce buffer size used in XmlReader when using Async mode (#63459) The current choice of AsyncBufferSize resulted in the character buffer in the XmlTextReader being allocated on the Large Object Heap (LOH). Fixes https://github.com/dotnet/runtime/issues/61459 * Ignoring leading dot when comparing cookie domains (#64038) * ignoring leading dot when comparing cookie domain * Simplify cookie comparing logic to equality and moving it to CookieComparer to fix the build * Domain comparing optimization and more unit tests * small check optimization * Renaming method * Add missing handle function 
enter/return macros (#64061) The mono_field_static_get_value method uses a handle, but did not set up enter/exit macros properly, so this handle was leaked. Some code in Unity calls this embedding API method pretty often, which can lead to the mark stack overflowing in the GC code. * Drop support for .NET 5 SDK (#64186) We had to duplicate a lot of Microsoft.NET.ILLink.targets logic. * Implement IEquatable<T> on value types overriding Equals (and enable CA1066/1077) (#63690) * [mono] Temporarily disable two tests that fail on arm64 LLVM FullAOT. (#64180) * Delete stale reference in System.Drawing.Primitives (#64202) * Respond to feedback in GenerateMultiTargetRoslynComponentTargetsFile (#63943) * Respond to feedback in GenerateMultiTargetRoslynComponentTargetsFile Two small follow up changes from #58446 - Fix a typo that breaks incremental build. Forgot to use MSBuild property syntax - Instead of having the infrastructure hard-code removing 'Abstractions', packages can set their own Disable source gen property name. 
* PR feedback * Use the static HashData(Stream) method in more places * Add executable bit to tizen sh files (#64216) * Bump IntelliSense package version to latest from `dotnet7-transport` (#63352) * Ensure that we aren't accidentally generating instructions for unsupported ISAs (#64140) * Assert that the ISA of the set intrinsic ID is supported * Ensure gtNewSimdCmpOpAllNode and gtNewSimdCmpOpAnyNode don't generate AVX2 instructions when not supported * Ensure codegen for Vector128.Dot when SSSE3 is disabled is correct * Update src/coreclr/jit/hwintrinsiccodegenarm64.cpp Co-authored-by: Jan Kotas <jkotas@microsoft.com> * Ensure Vector256.Sum has a check for AVX2 Co-authored-by: Jan Kotas <jkotas@microsoft.com> * Don't reference .NETFramework shims in libraries product or test composition (#64193) * Don't reference .NETFramework shims Stop referencing .NETFramework shims in libraries ref or source projects as those are supplementary and shouldn't impact the product composition. * [Android][libs] Enable Internal.Console.Write in System.Private.CoreLib (#63949) * [Android][libs] Enable Internal.Console.Write in System.Private.CoreLib * [docs] Add debugging System.Private.CoreLib Internal.Console.Write * Elaborate on debugging corelib log * Address feedback * Install v8 and Prebuild wasm (#64100) * Port Mono to Raspberry Pi, ship as new linux-armv6 RID (#62594) * Initial ARMv6 arch addition. Builds mono runtime, not CoreCLR (Mono already supports the CPU arch subset used by Raspberry Pi, whilst porting CoreCLR to e.g. VFPv2 would be major work) * Build small clr subset on ARMv6, it's needed for SDK and we want to check it works * Fix remote unwind (#64220) The _OOP_find_proc_info was setting only a couple of members of the unw_dyn_info_t instance on stack. So the remaining ones had random values. The load_offset was a recently added member to the struct. This change came in when we updated libunwind. 
The load_offset was random and that broke unwinding, as this offset is subtracted from the IP when looking up unwind info. The fix clears the whole struct. I have verified that the issue we had no longer happens with the fix. * Put back FindCaseSensitivePrefix regex alternation support (#64204) * Put back FindCaseSensitivePrefix alternation support * Fix the bug from the initial version, and add more comments * Update tests to expect RemoteExecutor to check exit code (#64133) * update generation_allocation_size correctly for SIP regions (#64176) SIP regions need to update the corresponding generation's generation_allocation_size and since this can be more than 1 gen older than the region's gen, we need to make sure every generation's alloc size gets updated. * Android remove backward timezones (#64028) Fixes #63693 It was discovered that Android produces duplicate TimeZone DisplayNames among all timezone IDs in GetSystemTimeZones. These duplicate DisplayNames occur across TimeZone IDs that are aliases, where all except one are backward timezone IDs. If a name is changed, put its old spelling in the 'backward' file. From the Android TimeZone data file tzdata, it isn't obvious which TimeZone IDs are backward (I find it strange that they're included in the first place), however we discovered that on some versions of Android, there is an adjacent file tzlookup.xml that can aid us in determining which TimeZone IDs are "current" (not backward). This PR aims to utilize tzlookup.xml when it exists and post-filters the populated TimeZone IDs in the AndroidTzData instance by removing IDs and their associated information (byteoffset and length) from the AndroidTzData instance if they are not found in tzlookup.xml. This relies on the assumption that all non-backward TimeZone IDs make it to the tzlookup.xml file. 
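The post-filtering step described above can be pictured with a small sketch. It is written in Python purely for illustration and is not the runtime's code: the `filter_backward_ids` helper, the index layout (ID mapped to a byte offset and length), and the sample entries are assumptions; only the idea of keeping the IDs that also appear in tzlookup.xml comes from the commit message.

```python
def filter_backward_ids(index, current_ids):
    """Keep only the entries whose ID is listed as current (hypothetical helper).

    index: dict mapping time zone ID -> (byte_offset, length); assumed layout.
    current_ids: IDs read from tzlookup.xml, or None when the file does not exist.
    """
    if current_ids is None:
        # No tzlookup.xml on this device: leave the index untouched.
        return dict(index)
    allowed = set(current_ids)
    return {tz_id: entry for tz_id, entry in index.items() if tz_id in allowed}


# Hypothetical sample data: "Asia/Calcutta" is the backward alias of "Asia/Kolkata".
index = {
    "Asia/Kolkata": (0, 100),
    "Asia/Calcutta": (100, 100),
    "Europe/Prague": (200, 120),
}
filtered = filter_backward_ids(index, ["Asia/Kolkata", "Europe/Prague"])
```

Filtering by membership rather than by an explicit list of backward names matches the stated assumption: an ID is treated as backward exactly when tzlookup.xml does not mention it.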
This PR also adds a new TimeZoneInfo Test to check whether or not there are duplicate DisplayNames in GetSystemTimeZones * Update main branding to preview2 (#64219) * Catch UnicodeEncodeErrors (#64251) * Make XmlSerializer.Generator targets incremental (#64191) * Make XmlSerializer.Generator targets incremental Adding inputs and outputs to make XmlSerializer.Generator incremental * Make sure that shared memory object name meets the length requirements (#64099) Co-authored-by: Stephen Toub <stoub@microsoft.com> * Fix PAL_wprintf for wide characters (#64181) * [main] Update dependencies from dotnet/runtime dotnet/llvm-project (#64205) * Update dependencies from https://github.com/dotnet/runtime build 20220123.5 Microsoft.NETCore.ILAsm , Microsoft.NETCore.DotNetHostPolicy , Microsoft.NETCore.DotNetHost , Microsoft.NETCore.App.Runtime.win-x64 , System.Runtime.CompilerServices.Unsafe , runtime.native.System.IO.Ports , Microsoft.NET.Sdk.IL , System.Text.Json From Version 7.0.0-alpha.1.22066.4 -> To Version 7.0.0-alpha.1.22073.5 * Update dependencies from https://github.com/dotnet/llvm-project build 20220123.1 runtime.win-x64.Microsoft.NETCore.Runtime.ObjWriter , runtime.win-arm64.Microsoft.NETCore.Runtime.ObjWriter , runtime.osx.10.12-x64.Microsoft.NETCore.Runtime.ObjWriter , runtime.osx.11.0-arm64.Microsoft.NETCore.Runtime.ObjWriter , runtime.linux-x64.Microsoft.NETCore.Runtime.ObjWriter , runtime.linux-musl-x64.Microsoft.NETCore.Runtime.ObjWriter , runtime.linux-musl-arm64.Microsoft.NETCore.Runtime.ObjWriter , runtime.linux-arm64.Microsoft.NETCore.Runtime.ObjWriter From Version 1.0.0-alpha.1.22070.1 -> To Version 1.0.0-alpha.1.22073.1 Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com> * Delete unused ApiCompat baseline files (#64190) * Delete unused ApiCompat baseline files * Delete ApiCompatBaseline.netfx.netstandardOnly.txt * Remove manual .NETFramework baseline validation * Delete ApiCompatBaseline.netcoreapp.netfx461.ignore.txt * 
Delete ApiCompatBaseline.netcoreapp.netfx461.txt * Improve Regex handling of anchors (#64177) * Improve Regex handling of anchors - Extend search for leading anchor to support alternations. This means that an expression like `^abc|^def` will now observe the leading `^` whereas previously it didn't. - Add a FindFirstChar optimization that jumps to the right position for a pattern that matches a computeable max length and ends with an end anchor. * Address PR feedback * Add the exception set for `ObjGetType` (#64106) * Model NRE for ObjGetType * Add tests * [ILVerify] Fix casting check for arrays of generic parameters with class constraints (#64259) Fixes #63999 * Use lower call count threshold for tiering in debug builds (#60945) * Use lower call count threshold for tiering in debug builds To exercise more paths during tests, see https://github.com/dotnet/runtime/pull/60886 * Skip tests using AsyncIO in FileSystemAclExtensionsTests where it's not supported (#64212) The mono runtime does not yet support AsyncIO on Windows and there were some tests failing on CI because of it. 
Fixes #64221 * Correct JsonNode.Root doc (#64238) * Take ARMv6 out of PlatformGroup All (#64267) * Take ARMv6 out of PlatformGroup All, CoreCLR assumes this means full support Co-authored-by: Alexander Köplinger <alex.koeplinger@outlook.com> * Only send to Helix for rolling build, due to small Helix queue (#64274) * Add ref field runtime feature indication (#64167) * Add ref field runtime feature indication Co-authored-by: Stephen Toub <stoub@microsoft.com> * Faster IndexOf for substrings (#63285) * Improve "lastChar == firstChar" case, also, use IndexOf directly if value.Length == 1 * Try plain IndexOf first, to optimize cases where even first char of value is never met * add 1-byte implementation * copyrights * fix copy-paste mistake * Initial LastIndexOf impl * More efficient LastIndexOf * fix bug in Char version (we need two clear two lowest bits in the mask) & temporarily remove AdvSimd impl * use ResetLowestSetBit * Fix bug * Add two-byte LastIndexOf * Fix build * Minor optimizations * optimize cases with two-byte/two-char values * Remove gotos, fix build * fix bug in LastIndexOf * Make sure String.LastIndexOf is optimized * Use xplat simd helpers - implicit ARM support * fix arm * Delete \ * Use Vector128.IsHardwareAccelerated * Fix build * Use IsAllZero * Address feedback * Address feedback * micro-optimization, do-while is better here since mask is guaranteed to be non-zero * Address feedabc * Use clever trick I borrowed from IndexOfAny for trailing elements * give up on +1 bump for SequenceCompare * Clean up * Clean up * fix build * Add debug asserts * Clean up: give up on the unrolled trick - too little value from code bloat * Add a test * Fix build * Add byte-specific test * Fix build * Update IndexOfSequence.byte.cs * [main] Update dependencies from dotnet/arcade dotnet/xharness dotnet/icu dotnet/hotreload-utils dotnet/llvm-project (#64265) * Update dependencies from https://github.com/dotnet/arcade build 20220124.13 Microsoft.DotNet.XUnitConsoleRunner 
, Microsoft.DotNet.CodeAnalysis , Microsoft.DotNet.Build.Tasks.Workloads , Microsoft.DotNet.Build.Tasks.Templating , Microsoft.DotNet.Build.Tasks.TargetFramework.Sdk , Microsoft.DotNet.Build.Tasks.Packaging , Microsoft.DotNet.Build.Tasks.Installers , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Build.Tasks.Archives , Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.ApiCompat , Microsoft.DotNet.XUnitExtensions , Microsoft.DotNet.GenAPI , Microsoft.DotNet.VersionTools.Tasks , Microsoft.DotNet.GenFacades , Microsoft.DotNet.SharedFramework.Sdk , Microsoft.DotNet.RemoteExecutor , Microsoft.DotNet.PackageTesting , Microsoft.DotNet.Helix.Sdk From Version 2.5.1-beta.22071.6 -> To Version 2.5.1-beta.22074.13 * Update dependencies from https://github.com/dotnet/xharness build 20220124.1 Microsoft.DotNet.XHarness.CLI , Microsoft.DotNet.XHarness.TestRunners.Common , Microsoft.DotNet.XHarness.TestRunners.Xunit From Version 1.0.0-prerelease.22071.1 -> To Version 1.0.0-prerelease.22074.1 * Update dependencies from https://github.com/dotnet/icu build 20220124.5 Microsoft.NETCore.Runtime.ICU.Transport From Version 7.0.0-preview.2.22071.2 -> To Version 7.0.0-preview.2.22074.5 * Update dependencies from https://github.com/dotnet/hotreload-utils build 20220124.1 Microsoft.DotNet.HotReload.Utils.Generator.BuildTool From Version 1.0.2-alpha.0.22069.1 -> To Version 1.0.2-alpha.0.22074.1 * Update dependencies from https://github.com/dotnet/llvm-project build 20220124.2 runtime.win-x64.Microsoft.NETCore.Runtime.Mono.LLVM.Tools , runtime.win-x64.Microsoft.NETCore.Runtime.Mono.LLVM.Sdk , runtime.linux-arm64.Microsoft.NETCore.Runtime.Mono.LLVM.Sdk , runtime.linux-arm64.Microsoft.NETCore.Runtime.Mono.LLVM.Tools , runtime.linux-x64.Microsoft.NETCore.Runtime.Mono.LLVM.Sdk , runtime.linux-x64.Microsoft.NETCore.Runtime.Mono.LLVM.Tools , runtime.osx.10.12-x64.Microsoft.NETCore.Runtime.Mono.LLVM.Sdk , runtime.osx.10.12-x64.Microsoft.NETCore.Runtime.Mono.LLVM.Tools From Version 
11.1.0-alpha.1.22067.2 -> To Version 11.1.0-alpha.1.22074.2 Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com> * Add CancellationToken to TextReader.ReadXAsync (#61898) Co-authored-by: Adam Sitnik <adam.sitnik@gmail.com> Co-authored-by: Stephen Toub <stoub@microsoft.com> * Restrict parallelism in LLVM FullAOT compile, to prevent OOM (#63800) * Restrict parallelism in FullAOT compile, to prevent OOM * Reduce parallelism further, due to more OOM * Moved AssemblyName helpers to managed (#62866) * Moved ComputePublicKeyToken to managed * Managed assembly name parsing (adapted from nativeaot) * Fix for HostActivation failures. * PR feedback (RuntimeAssemblyName is back to CoreRT + other comments) * remove AssemblyNameNative::Init form the .hpp * remove AppX compat ifdef * renamed instance fields to convention used in C# * `Argument_InvalidAssemblyName` should be `InvalidAssemblyName`. Majority of use is `FileLoadException`. * remove `this.` * PR feedback (assign to fileds, bypass properties) * missed this change in the rebase * "low-hanging fruit" perf tweaks. * move one-user helpers to where they are used. * removed ActiveIssue for #45032 * remove AssemblyNameHelpers.cs form corelib * Remove the List when detecting duplicates. Support PublicKey. * whitespace * Fix managed implementation to match the new tests. * Some minor cleanup. * Do not validate culture too early * PR feedback * use SR.InvalidAssemblyName * Report the input string when throwing FileLoadException * tweaked couple comments * Disable RegexReductionTests tests on browser * Fix formatting of resource string where excess arguments are passed (#63824) * Fix formatting of resource string where excess arguments are passed. #63607 * Fix BuildCharExceptionArgs and ECCurve.Validate * Fix CA2208 * Fix CA2208. 
Remove paramName becaus it is in error message * Code review fixes * Code review fixes * Add Regex.Count string overloads (#64289) * Clarify purpose of PDB Document hashing (#64306) Fixes #63505 * Fix arm64/PInvoke so that NESTED_ENTRY/NESTED_END labels match. (#64296) This was exposed by building on arm64 with gcc-12, wherein the assembler complained about not being able to evaluate the constant expression for .size for the symbol on NESTED_END. Since the symbol on NESTED_END is not referenced anywhere else in the code base, I concluded that it was wrong, and NESTED_ENTRY was right. I have not tested this on anything but arm64 + gcc-12 * When decommitting, leaving one instead of two pages in regions case. (#64243) * Ensure that canceled Task.Delays invoke continuations asynchronously from Cancel (#64217) * Add gen folder moving gen projects from src folder (#64231) * Fix minor typos in GC documentation. (#64298) * Explicitly specify four subdirectories to use as part of the paths for -pmi_path arguments and expand the paths on a remote machine in src/coreclr/scripts/superpmi-collect.proj (#64308) * Disable RegexReductionTests on browser (#64312) * Add UnreachableException (#63922) * [mono] Recognize new names for Xamarin.iOS etc assemblies (#64278) They are being renamed in https://github.com/xamarin/xamarin-macios/pull/13847 * Remove usage of codecvt from corerun (#64157) * Remove usage of codecvt from corerun * Update src/coreclr/hosts/corerun/corerun.cpp Co-authored-by: Aaron Robinson <arobins@microsoft.com> Co-authored-by: Aaron Robinson <arobins@microsoft.com> * Refactor FileStatus.Unix. (#62721) * Refactor FileStatus.Unix. - Moves InitiallyDirectory out of FileStatus into FileSystemInfo. In FileSystemInfo it can be a readonly field making its usage clearer. And FileStatus can then directly be used to implement some FileSystem methods without allocating an intermediate FileInfo/DirectoryInfo. 
- Treat not exists/exist as initialized states to avoid wrongly assuming initialized means the file cache is valid, which isn't so when the file does not exist. - Use 0 for tracking uninitialized to make default(FileStatus) uninitialized. * Fix unique VNs for `ADDR`s (#64230) * Add the test * Fix unique VNs for ADDRs They need to keep the exception sets. * Implemented hierarchy of attributes. (#64201) * Implemented hierarchy of attributes. * Shortened. * Fixed overlooked test naming and simplified. * Partial refactor. * Update the managed type system to more gracefully fail when calling a varargs method. (#64286) * Update the managed type system to more gracefully fail when calling a varargs method. * Use ThrowHelper instead of manually throwing the exception. * Update src/coreclr/tools/Common/TypeSystem/Ecma/EcmaSignatureParser.cs Co-authored-by: Jan Kotas <jkotas@microsoft.com> * [mono] Add some missing Internal.Runtime.CompilerServices.Unsafe intrinsics. (#64314) * Remove usage of FEATURE_CORESYSTEM (#63850) * Remove usage of FEATURE_CORESYSTEM from coreclr. 
* [main] Update dependencies from dotnet/arcade dotnet/runtime-assets (#64331) * Update dependencies from https://github.com/dotnet/arcade build 20220125.6 Microsoft.DotNet.XUnitConsoleRunner , Microsoft.DotNet.CodeAnalysis , Microsoft.DotNet.Build.Tasks.Workloads , Microsoft.DotNet.Build.Tasks.Templating , Microsoft.DotNet.Build.Tasks.TargetFramework.Sdk , Microsoft.DotNet.Build.Tasks.Packaging , Microsoft.DotNet.Build.Tasks.Installers , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Build.Tasks.Archives , Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.ApiCompat , Microsoft.DotNet.XUnitExtensions , Microsoft.DotNet.GenAPI , Microsoft.DotNet.VersionTools.Tasks , Microsoft.DotNet.GenFacades , Microsoft.DotNet.SharedFramework.Sdk , Microsoft.DotNet.RemoteExecutor , Microsoft.DotNet.PackageTesting , Microsoft.DotNet.Helix.Sdk From Version 2.5.1-beta.22074.13 -> To Version 2.5.1-beta.22075.6 * Update dependencies from https://github.com/dotnet/runtime-assets build 20220125.1 Microsoft.DotNet.CilStrip.Sources , System.ComponentModel.TypeConverter.TestData , System.Drawing.Common.TestData , System.IO.Compression.TestData , System.IO.Packaging.TestData , System.Net.TestData , System.Private.Runtime.UnicodeData , System.Runtime.Numerics.TestData , System.Runtime.TimeZoneData , System.Security.Cryptography.X509Certificates.TestData , System.Text.RegularExpressions.TestData , System.Windows.Extensions.TestData From Version 7.0.0-beta.22060.1 -> To Version 7.0.0-beta.22075.1 Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com> * Fixes bad log method generation in certain cases. (#64311) In certain cases when developer by mistake places ILogger, Exception, or LogLevel in the message template, the code generator will produce the expected warning and makes sure the code will indeed compile and run correctly. 
Prior to this fix, the code generator's output would fail to compile when any of ILogger, Exception, or LogLevel was placed in the message template incorrectly. Fixes #64310 * Fix IsMutuallyAuthenticated on Linux and OSX (#63945) * WIP - prepared a failing test * Fix IsMutuallyAuthenticated on Linux * Fix failing unit tests * Minor cleanup * Port changes to OSX * Fix comment * Invoke cert selection inline, don't allocate new credentials on Linux/OSX * Fix tests on OSX * Code review feedback * Move tests to separate file * Fix build * Fix Failing tests * Support {Last}IndexOfAny with sets after {lazy} loops (#64254) When emitting backtracking loops, the loop consumes as much as possible and then backtracks through the consumed input. Rather than doing this one character at a time, we previously added use of LastIndexOf to search for the next place the literal after the loop matches. We can also augment that to use IndexOfAny to search for a small set that comes after a loop instead of a literal. Similarly when emitting backtracking lazy loops, rather than consuming one character and trying the rest of the expression and then consuming another character and trying the rest of the expression, we previously added an optimization to use IndexOf{Any} to find the next possible location of a match based on the literal that comes after the lazy loop. And we can similarly augment that to support a small set after the lazy loop. This is particularly helpful for IgnoreCase, as we're on a path to replacing literals with sets that contain all equivalent casings of that character. * Fix race conditions in SystemEvents shutdown logic (#62773) * Fix race conditions in SystemEvents shutdown logic When the application is terminated through Restart Manager the event broadcasting window will get the `WM_CLOSE` message. The message gets handled by passing it to `DefWndProc` which calls `DestroyWindow` on the window itself thus making the window handle invalid. 
The `Shutdown` method expects the window handle to be valid to post the `WM_QUIT` message to terminate the thread running the message loop but that's no longer possible under these conditions. Additionally there's a second race condition with the `s_eventThreadTerminated` event that is created during shutdown and set conditionally. A race condition between the threads could cause it to be created when the window message thread is already shutting down and thus it would never be set. Waiting for it in the `Shutdown` method would cause a deadlock. This event is also completely unnecessary since a `Join` is performed on the thread itself. The fix has several changes that act together: - `s_eventThreadTerminated` event is removed completely in favor of only relying on `Thread.Join` - `WM_DESTROY` message is detected (which happens as a result of WM_CLOSE calling `DefWndProc` which in turn calls `DestroyWindow`) and handled by shutting down the message loop thread - The message loop itself is rewritten to use a standard `GetMessageW` loop. The reasoning on why it was not used seems not to be valid anymore since AppDomain shutdowns are performed differently * Add unit test. * Add braces * Add marshaller for TypeLoad failure cases (#64317) This is the marshaller used when an incorrect marshaller configuration is applied, mostly to fields * Add additional loop table asserts (#64126) 1. Assert that top-level loops are basic block disjoint 2. Assert LPFLG_ITER related flags are legal In addition: 1. Create a `optClearLoopIterInfo` phase to clear various bits in the loop table that are known to no longer be valid, to prevent bad asserts or JitDump output on their values. 2. Move the EndPhase call in Phase::PostPhase so it happens early, not late. This causes any subsequent asserts due to post-phase checking to be marked with the correct phase, in cases where there was a nested phase executed (such as liveness re-computation). 3. Convert PHASE_INSERT_GC_POLLS to use EndPhase checking 4. 
Convert fgDetermineFirstCodeBlock to return a PhaseStatus 5. Some minor cleanup in optUpdateLoopsBeforeRemoveBlock() (this was extracted from some bigger changes) * Moved AssemblyName helpers to managed (part 2) (#63915) * implement GetAssemblyName via dynamic call to MetadataReader * A few more file-locking tests. * fix #28153 * no need for version when getting MetadataReader * rename the argument to match AssemblyName * perf tweaks * use memory-mapped file to read metadata * adjust tests for the new implementation * use "bufferSize: 1" when stream is going to be mapped. * null-conditional operator. * do Dispose before re-throwing * get rid of the platform-specific/native stuff * remove assemblyname.hpp * remove `VerifyIsAssembly()` * PR feedback * put back gStdMngIEnumerableFuncs and the others * Fix several bugs in NullabilityInfoContext. (#64143) * Fix several bugs in NullabilityInfoContext. * Reverse ASG(CLS_VAR, ...) (#63957) This helps with register allocation. Consider: ``` ***** BB01 STMT00001 ( 0x000[E-] ... ??? ) N003 ( 18, 10) [000003] -ACXG------- * ASG ref $c0 N001 ( 3, 4) [000002] ----G--N---- +--* CLS_VAR ref Hnd=0x8fec230 Fseq[hackishFieldName] N002 ( 14, 5) [000000] --CXG------- \--* CALL ref CscBench.GetMscorlibPathCore $c0 ``` The rationalizer will rewrite it to what is effectively: ``` ***** BB01 STMT00001 ( 0x000[E-] ... ??? 
) N004 ( 18, 12) [000003] -ACXG---R--- * ASG ref N003 ( 3, 6) [000002] n---G--N---- +--* IND ref N002 ( 1, 4) [000006] H----------- | \--* CLS_VAR_ADDR byref Hnd=0x8fec230 N001 ( 14, 5) [000000] --CXG------- \--* CALL ref CscBench.GetMscorlibPathCore ``` And the final LIR will look like: ``` [000006] ------------ IL_OFFSET void INLRT @ 0x000[E-] N001 ( 3, 4) [000002] ----G--N---- t2 = CLS_VAR_ADDR byref Hnd=0x8fec230 N002 ( 14, 5) [000000] --CXG------- t0 = CALL ref CscBench.GetMscorlibPathCore $c0 /--* t2 byref +--* t0 ref N003 ( 18, 10) [000003] -A-XG------- * STOREIND ref [000007] ------------ IL_OFFSET void INLRT @ 0x00A[E-] N001 ( 0, 0) [000004] ------------ RETURN void $180 ``` Since this store must use a barrier, `CLS_VAR_ADDR` won't be contained and will have to be evaludated separately. Because its value is live across a call, it'll get spilled and reloaded. Reversing the ASG fixes the problem: ``` ------------ BB01 [000..00B) (return), preds={} succs={} [000006] ------------ IL_OFFSET void INLRT @ 0x000[E-] N001 ( 14, 5) [000000] --CXG------- t0 = CALL ref CscBench.GetMscorlibPathCore $c0 N002 ( 3, 4) [000002] ----G--N---- t2 = CLS_VAR_ADDR byref Hnd=0x8fec230 /--* t2 byref +--* t0 ref N003 ( 18, 10) [000003] -A-XG------- * STOREIND ref [000007] ------------ IL_OFFSET void INLRT @ 0x00A[E-] N001 ( 0, 0) [000004] ------------ RETURN void $180 ``` * Fixes a few issues for dprintf on OSX (#64076) * [Codespaces] Make it possible to run wasm samples in the browser (#64277) With these changes, running the following in the Codespace will open the local browser to a page served from the codespace hosting the WASM sample: ```console cd src/mono/sample/wasm/browser make make run-browser ``` * Set EMSDK_PATH in .devcontainer.json We provision Emscripten as part of the devcontainer prebuild. 
Set EMSDK_PATH to allow rebuilding the wasm runtime to work without any additional ceremony * Install dotnet-serve into .dotnet-tools-global * [wasm] Don't try to open browser if running in Codespaces * .devcontainer: add global tools dir to PATH * .devcontainer: forward port 8000 This enables running the mono wasm samples in the local browser: * [wasm] samples: also check for dotnet-serve on the path On Codespaces we install dotnet-se…

MihaZupan added the area-System.Net.Http label Dec 18, 2021

MihaZupan added this to the 7.0.0 milestone Dec 18, 2021

MihaZupan requested a review from a team December 18, 2021 09:51

ghost assigned MihaZupan Dec 18, 2021

antonfirsov reviewed Dec 19, 2021

MihaZupan marked this pull request as ready for review December 20, 2021 04:03

MihaZupan mentioned this pull request Dec 20, 2021

Suboptimal field layout when an object is wrapped in a struct #63005

Open

marek-safar reviewed Dec 20, 2021

geoffkizer reviewed Dec 20, 2021

src/libraries/System.Net.Http/src/System/Net/Http/Headers/HeaderDescriptor.cs Outdated

geoffkizer reviewed Dec 20, 2021

src/libraries/System.Net.Http/src/System/Net/Http/Headers/HeaderDescriptor.cs Outdated

geoffkizer reviewed Dec 20, 2021

src/libraries/System.Net.Http/src/System/Net/Http/Headers/HeaderDescriptor.cs Outdated

geoffkizer reviewed Dec 20, 2021

src/libraries/System.Net.Http/src/System/Net/Http/Headers/HeaderStringValues.cs Outdated

geoffkizer reviewed Dec 20, 2021

src/libraries/System.Net.Http/src/System/Net/Http/Headers/KnownHeader.cs Outdated

geoffkizer reviewed Dec 20, 2021

src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/HttpConnection.cs Outdated

geoffkizer reviewed Dec 20, 2021

src/libraries/System.Net.Http/tests/UnitTests/Headers/HeaderEncodingTest.cs Outdated

geoffkizer reviewed Jan 15, 2022

src/libraries/System.Net.Http/src/System/Net/Http/Headers/HttpHeaders.cs

geoffkizer reviewed Jan 15, 2022

src/libraries/System.Net.Http/src/System/Net/Http/Headers/HttpHeaders.cs

geoffkizer reviewed Jan 15, 2022

src/libraries/System.Net.Http/src/System/Net/Http/Headers/HttpHeaders.cs

geoffkizer reviewed Jan 15, 2022

geoffkizer mentioned this pull request Jan 15, 2022

SocketsHttpHandler does not preserve request header format #63808

Open

stephentoub reviewed Jan 19, 2022

src/libraries/System.Net.Http/src/System/Net/Http/Headers/HttpHeaders.cs Outdated

stephentoub reviewed Jan 19, 2022

src/libraries/System.Net.Http/src/System/Net/Http/Headers/HttpHeaders.cs Outdated

stephentoub reviewed Jan 19, 2022

src/libraries/System.Net.Http/src/System/Net/Http/Headers/HttpHeaders.cs Outdated

stephentoub reviewed Jan 19, 2022

src/libraries/System.Net.Http/src/System/Net/Http/Headers/HttpHeaders.cs Outdated

stephentoub approved these changes Jan 19, 2022

MihaZupan added 6 commits January 19, 2022 19:50

Assert that the collection is not empty in GetEnumeratorCore

1864797

Optimize AddHeaders for empty collections

076b219

Reference the Roslyn bug issue

5d681d6

Assert that multiValues are never empty

5babbda

Don't preserve a Dictionary across Clear

468cf3d

Add comment about why a custom HeaderEntry type is used

b3aee48

MihaZupan force-pushed the httpheaders-dict branch from 75ee4a6 to b3aee48 Compare January 20, 2022 14:21

MihaZupan merged commit bc359a8 into dotnet:main Jan 20, 2022

MihaZupan mentioned this pull request Feb 14, 2022

HttpClient: Header allocations #31235

Closed

ghost locked as resolved and limited conversation to collaborators Feb 19, 2022
