Reduce the memory footprint of HttpHeaders #62981

Merged (22 commits) on Jan 20, 2022

MihaZupan
Member

@MihaZupan MihaZupan commented Dec 18, 2021

Fixes #62846

  • Changed the backing header store from a dictionary to an array.
    • Insertion, lookup, and remove operations perform O(n) linear scans to find the matching entry
      • Avoided multiple lookups per operation in places like TryAddWithoutValidation and AddHeaders that would previously perform 2 dictionary lookups (1 for TryGetValue and 1 for Add)
    • After adding 64 headers, the underlying store is swapped to a Dictionary to avoid certain algorithmic complexity attacks (or at least reduce their impact; see the worst-case benchmarks for MaxNumberOfHeaders = 64 in #62981 (comment)).
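
To make the shape of the change concrete, here is a minimal sketch of an array-backed store that does one linear scan per insertion and switches to a dictionary once it already holds 64 entries. It is an illustration under stated assumptions, not the code from this PR: the type `SmallHeaderStore`, its members, and the use of plain strings joined with ", " for merged values are all hypothetical simplifications, and removal is omitted.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical, simplified store; names and layout are illustrative only.
public sealed class SmallHeaderStore
{
    private const int InitialCapacity = 4;  // mirrors the HeaderEntry[4] mentioned above
    private const int ArrayThreshold = 64;  // past this many entries, use a dictionary

    private KeyValuePair<string, string>[]? _entries;
    private int _count;
    private Dictionary<string, string>? _dictionary;

    public int Count => _dictionary?.Count ?? _count;
    public bool UsesDictionary => _dictionary is not null;

    public void Add(string name, string value)
    {
        if (_dictionary is not null)
        {
            _dictionary[name] = _dictionary.TryGetValue(name, out string? existing)
                ? existing + ", " + value
                : value;
            return;
        }

        // One linear scan: merge into the first matching entry if there is one.
        for (int i = 0; i < _count; i++)
        {
            if (string.Equals(_entries![i].Key, name, StringComparison.OrdinalIgnoreCase))
            {
                _entries[i] = new KeyValuePair<string, string>(
                    _entries[i].Key, _entries[i].Value + ", " + value);
                return;
            }
        }

        if (_count == ArrayThreshold)
        {
            // Too many entries for linear scans: migrate everything to a dictionary.
            _dictionary = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
            for (int i = 0; i < _count; i++)
            {
                _dictionary.Add(_entries![i].Key, _entries[i].Value);
            }
            _dictionary.Add(name, value);
            _entries = null;
            _count = 0;
            return;
        }

        if (_entries is null)
        {
            _entries = new KeyValuePair<string, string>[InitialCapacity];
        }
        else if (_count == _entries.Length)
        {
            Array.Resize(ref _entries, _entries.Length * 2);
        }

        _entries[_count++] = new KeyValuePair<string, string>(name, value);
    }

    public bool TryGetValue(string name, out string? value)
    {
        if (_dictionary is not null)
        {
            return _dictionary.TryGetValue(name, out value);
        }

        for (int i = 0; i < _count; i++)
        {
            if (string.Equals(_entries![i].Key, name, StringComparison.OrdinalIgnoreCase))
            {
                value = _entries[i].Value;
                return true;
            }
        }

        value = null;
        return false;
    }
}
```

In this sketch the scan that looks for an existing name is also the scan that decides whether to append, which is the "one lookup per operation" property described in the list above.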

The change is a win for both CPU and memory.

Measuring memory for 10 simple GET requests (code):
Note: These allocation numbers include the HeaderDescriptor change (#62981 (comment)).
[Images: main-marked, pr-marked]

We go from 15 objects (1968 bytes) to 5 objects (568 bytes) per request+response. 🎉
The exact difference will vary depending on the number of headers, since internal resizes occur at different times for a dictionary vs. an array. In the above example, we allocate 3x HeaderEntry[4] and 2x resizes to HeaderEntry[8].
In this case, that's a 42% reduction in the total number of bytes allocated per request+response.

In an E2E Yarp scenario, this results in a 2-3% RPS increase.

@geoffkizer @scalablecory @stephentoub PTAL

Behavioral changes

The insertion order of headers is preserved during enumeration / serialization to the wire (up to 64 entries).

@MihaZupan MihaZupan added this to the 7.0.0 milestone Dec 18, 2021
@MihaZupan MihaZupan requested a review from a team December 18, 2021 09:51
@ghost ghost assigned MihaZupan Dec 18, 2021

ghost commented Dec 18, 2021

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

Fixes #62846
Fixes #62847

Opening as a draft to get thoughts on the overall approach.

  • Changed the backing header store from a dictionary to an array.
    • Insertion, lookup, and remove operations perform O(n) linear scans to find the matching entry
      • Avoided multiple lookups per operation in places like TryAddWithoutValidation and AddHeaders that would previously perform 2 dictionary lookups (1 for TryGetValue and 1 for Add)
      • I will add microbenchmarks for various ways of interacting with headers if we decide to go forward with this. From initial testing, things are generally faster unless you use an unreasonable amount of headers
    • Added a hard cap of 128 headers to prevent n^2 from growing too much
      • Kestrel has a default cap of 100
  • Changed the HeaderDescriptor to use a single backing object.

The change is a win for both CPU and memory.

Measuring memory for 10 simple GET requests (code):
[Images: main-marked, pr-marked]

We go from 15 objects (1968 bytes) to 5 objects (568 bytes) per request+response. 🎉
The exact difference will vary depending on the number of headers, since internal resizes occur at different times for a dictionary vs. an array.
In this case, that's a 42% reduction in the total number of bytes allocated per request+response.

In an E2E Yarp scenario, this results in a 2-3% RPS increase.

Side effects:

  • Header ordering is preserved. If multiple values are added for the same name, they are merged into the first entry like before.
    • E.g.
      Foo: a
      Bar: b
      Foo: c
      
      will now always end up as
      Foo: a, c
      Bar: b
      
    whereas before the order of Foo vs Bar was random based on the hash seed the process started with

@geoffkizer @scalablecory @stephentoub PTAL

TODO:

  • Tests for header ordering
  • Tests for the header number limit
  • More benchmarks
Author: MihaZupan
Assignees: -
Labels:

area-System.Net.Http

Milestone: 7.0.0

@stephentoub
Member

> Added a hard cap of 128 headers to prevent n^2 from growing too much

So if a server sends 129 headers, the request/response fails? Why is that an acceptable breaking change? What workaround does the client have if it needs to communicate with such a server it's already able to communicate with today?

> Insertion, lookup, and remove operations perform O(n) linear scans to find the matching entry

What's the crossover point where cost of these operations is more expensive with the list?

> From initial testing, things are generally faster unless you use an unreasonable amount of headers

What is "unreasonable"? By whose standard?

> Kestrel has a default cap of 100

As part of this then are you proposing adding a knob?

> Technically we could swap the store to a dictionary when we reach X number of headers, but the complexity doesn't seem worth it to support an unreasonable scenario.

I'm not sure why we get to set the threshold for what's a "reasonable" number of headers, considering there's effectively no limit today. The hybrid approach seems more sound to me.

@MihaZupan
Member Author

MihaZupan commented Dec 18, 2021

> What's the crossover point where cost of these operations is more expensive with the list?

Depends on the header names. Assuming a reasonable distribution of names (string.Equals can exit early), comparing main vs this PR, main is about equal at ~128 headers and faster after ~150 (measured as the time to add all the headers and do one NonValidated enumeration).

Because we search for the key on every insertion by comparing it to the existing keys, the worst-case scenario changes:
With a dictionary, the worst case is O(MaxResponseHeadersLength) to hash all the inputs.
With the new approach, it's O(MaxResponseHeadersLength * MaxNumberOfHeaders) if you specifically craft long names of equal length that differ only at the end. This makes a limit on the number of headers useful as a ceiling on how much CPU malicious input could burn.
With the default MaxResponseHeadersLength = 64k and MaxNumberOfHeaders = 128, the worst case on my CPU is about 1.6 ms.

Worst-case benchmarks
ResponseHeadersLengthKb   NumberOfHeaders   Mean
16                        64                236.4 us
16                        128               422.2 us
16                        256               849.3 us
16                        512               1,832.2 us
32                        64                449.9 us
32                        128               886.7 us
32                        256               1,603.5 us
32                        512               3,299.8 us
64                        64                859.6 us
64 (current default)      128               1,660.0 us
64                        256               3,402.1 us
64                        512               6,260.3 us
128                       64                1,689.1 us
128                       128               3,223.9 us
128                       256               6,464.7 us
128                       512               13,477.5 us
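
As a rough check on the O(MaxResponseHeadersLength * MaxNumberOfHeaders) claim, the following sketch builds names of equal length that differ only in their last four characters and counts the character comparisons a naive linear-scan insert would make. It assumes the names consume the whole header budget and that equality is checked one character at a time; the real comparison is case-insensitive and likely faster per character, so only the growth rate is meaningful here. `WorstCaseEstimate` and its members are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical estimator; it models the cost, it does not measure HttpHeaders.
public static class WorstCaseEstimate
{
    // Character comparisons made by an equality check that stops at the first mismatch.
    private static long CompareCost(string a, string b)
    {
        if (a.Length != b.Length)
        {
            return 0;
        }

        long cost = 0;
        for (int i = 0; i < a.Length; i++)
        {
            cost++;
            if (a[i] != b[i])
            {
                break;
            }
        }
        return cost;
    }

    // Inserts headerCount crafted names, scanning all existing names on each insert.
    public static long CountComparisons(int totalNameChars, int headerCount)
    {
        int nameLength = totalNameChars / headerCount;
        string prefix = new string('a', nameLength - 4);
        var names = new List<string>(headerCount);
        long total = 0;

        for (int i = 0; i < headerCount; i++)
        {
            string name = prefix + i.ToString("D4"); // differs only in the last 4 chars
            foreach (string existing in names)
            {
                total += CompareCost(existing, name);
            }
            names.Add(name);
        }
        return total;
    }
}
```

With 64 * 1024 characters spread over 128 names, each name is 512 characters long and there are 128 * 127 / 2 = 8128 pairwise comparisons, each of which reads at least the 508 shared characters. That is roughly 4.1 million character comparisons. Doubling either parameter roughly doubles the count, which matches the trend in the table above.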

If we feel that using more headers is realistic, there are ways to eliminate this worst-case as well.
For example, we could store the Hash(name) on the descriptor to make comparisons effectively O(1). At that point, we could set MaxHeaders to 1000 and be happy.
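
One way to picture the Hash(name) idea is a descriptor that computes a case-insensitive hash once and compares hashes before comparing characters. This is only a sketch of the suggestion, not what the PR implements; `HashedName` is a hypothetical stand-in for HeaderDescriptor.

```csharp
#nullable enable
using System;

// Hypothetical descriptor that caches a hash of the header name.
public readonly struct HashedName : IEquatable<HashedName>
{
    private readonly int _hash;

    public string Name { get; }

    public HashedName(string name)
    {
        Name = name;
        // Computed once per name: O(name length).
        _hash = StringComparer.OrdinalIgnoreCase.GetHashCode(name);
    }

    // Names with different hashes are rejected with a single int comparison;
    // the full string comparison only runs when the hashes match.
    public bool Equals(HashedName other) =>
        _hash == other._hash &&
        string.Equals(Name, other.Name, StringComparison.OrdinalIgnoreCase);

    public override bool Equals(object? obj) => obj is HashedName other && Equals(other);

    public override int GetHashCode() => _hash;
}
```

Under this scheme the crafted "same prefix, different suffix" names from the worst case above would almost always differ in hash, so a linear scan would cost one integer comparison per existing entry instead of a long string comparison.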

> As part of this then are you proposing adding a knob?

If we believe using more than a few hundred is reasonable, I would prefer to remove the limitation via the above approach instead.

> I'm not sure why we get to set the threshold for what's a "reasonable" number of headers, considering there's effectively no limit today.

There are practical limits you will hit, e.g.

  • Kestrel's limit of 100 headers
  • Kestrel and Cloudflare's limit of 32 kB for all request headers
  • HttpClient's limit of 64 kB on response headers
  • IIS seems to have a ~16 kB limit
  • Internet says Tomcat has a default of 8 kB

While most are configurable, I'd argue HttpClient shouldn't care about a scenario where someone wants to send 1001 headers :)

> What workaround does the client have if it needs to communicate with such a server it's already able to communicate with today?

LLHTTP. Or preferably changing their service to not rely on hundreds of headers.

private const int InitialCapacity = 4;
private const int MaxHeaderCount = 128;

public HeaderEntry[]? Entries;
Member

@antonfirsov antonfirsov Dec 19, 2021

Might be a crazy idea, but:
How common is the Count == 0 case in practice? If it's less common than Count > 0, isn't it worth defining an embedded storage to avoid the array allocation?

private HeaderEntry _e000, _e001, _e002, _e003, _e004, _e005 /*...*/ ;
public Span<HeaderEntry> Entries => MemoryMarshal.CreateSpan<HeaderEntry>(ref _e000, MaxHeaderCount);

Member Author

We definitely wouldn't want to store the entire max as fields since having few headers is very common. But something like storing the first 6 headers as fields and deferring to an array when more are added is something we could potentially explore. I imagine it would come down to how much it would complicate the logic vs. saving the small allocation.

@stephentoub
Member

stephentoub commented Dec 19, 2021

> If we believe using more than a few hundred is reasonable

It is not our place now to say what is or is not reasonable. HttpClient has been usable with such a number of headers for its entire existence. And the RFC places no such limitation. From my perspective, breaking code that currently does this, without workaround, is what's not reasonable. And telling people to not use HttpClient (which is effectively what "use LLHTTP" says, if nothing else given the sheer complexity it adds to otherwise one-line-ish use) or to stop communicating with particular servers, are not workarounds. Without a knob, I think the proposal isn't viable, and with a knob, you need to be able to accommodate an arbitrarily large number of headers, anyway, so we should just do the hybrid approach.

@MihaZupan
Member Author

I changed the implementation to fall back to a dictionary store when adding more than N (currently 64) headers.
This means that ordering is preserved when adding up to N headers, otherwise, the current (effectively random) ordering is used.

Updated the description to reflect these changes. Marking the PR as ready-to-review since the header limit was the only behavioral change I was expecting pushback on.

@MihaZupan
Member Author

MihaZupan commented Dec 20, 2021

Opened #63005 regarding the 8-byte size regression of HttpHeaders-derived types (suboptimal field layout with extra padding).
It could be worked around by bringing all the methods from the HeaderStore struct onto HttpHeaders (c9eb437).

@geoffkizer
Contributor

> TODO:
>
>   • More benchmarks

Can we use the new HttpClient benchmark from @CarnaViire to measure this?

@MihaZupan
Member Author

> Can we use the new HttpClient benchmark from @CarnaViire to measure this?

Of course. I already have before, but a lot more changes went in since then. I'll post some nice graphs in the next few hours.

@MihaZupan

This comment has been minimized.

@geoffkizer
Contributor

> This has the side effect of preserving the original insertion order (up to 64 headers). If multiple values are added for the same name, they are merged into the first entry like before.

Did you consider an approach where we don't merge multiple values into the same entry?

This seems like it has a couple advantages:
(1) Preserve header ordering even in the presence of multiple values
(2) Insertion is always cheap because we don't have to compare against existing keys and merge if found

It does make lookup even more expensive, but I suspect we could deal with that by creating an "index" on the header table on demand when the entry count exceeds some threshold (e.g. 64 or even less). Any update would cause the index to get blown away and recreated if/when necessary.
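
The append-only idea with an on-demand index could look roughly like the sketch below. Everything in it is an assumption made for illustration: the type `UngroupedHeaderList`, the `IndexThreshold` of 64 (taken from the "e.g. 64" above), and the use of plain string values. It is not code from this PR.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical sketch: entries are never merged, and lookups build an index lazily.
public sealed class UngroupedHeaderList
{
    private const int IndexThreshold = 64; // assumed cutoff

    private readonly List<KeyValuePair<string, string>> _entries = new();
    private Dictionary<string, List<int>>? _index; // name -> positions in _entries

    public IReadOnlyList<KeyValuePair<string, string>> Entries => _entries;
    public bool HasIndex => _index is not null;

    // Insertion never compares against existing names; it only drops the index.
    public void Add(string name, string value)
    {
        _entries.Add(new KeyValuePair<string, string>(name, value));
        _index = null;
    }

    public List<string> GetValues(string name)
    {
        var values = new List<string>();

        if (_entries.Count <= IndexThreshold)
        {
            // Small table: a linear scan is cheap enough.
            foreach (KeyValuePair<string, string> entry in _entries)
            {
                if (string.Equals(entry.Key, name, StringComparison.OrdinalIgnoreCase))
                {
                    values.Add(entry.Value);
                }
            }
            return values;
        }

        // Large table: build the index on first use after any update.
        _index ??= BuildIndex();
        if (_index.TryGetValue(name, out List<int>? positions))
        {
            foreach (int position in positions)
            {
                values.Add(_entries[position].Value);
            }
        }
        return values;
    }

    private Dictionary<string, List<int>> BuildIndex()
    {
        var index = new Dictionary<string, List<int>>(StringComparer.OrdinalIgnoreCase);
        for (int i = 0; i < _entries.Count; i++)
        {
            string name = _entries[i].Key;
            if (!index.TryGetValue(name, out List<int>? positions))
            {
                positions = new List<int>();
                index.Add(name, positions);
            }
            positions.Add(i);
        }
        return index;
    }
}
```

The trade-off in this sketch is that interleaved adds and lookups on a large table rebuild the index repeatedly, so it favors the "add everything, then read" pattern.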

@MihaZupan
Member Author

> Did you consider an approach where we don't merge multiple values into the same entry?

Yes, but the value of doing that depends on the following:

  • Would we want NonValidated enumeration to group the values with the same key?
    • As far as I understand the team's feeling here is yes.
    • Does the act of non-validated enumeration group the underlying values (affecting the below question)
  • Would we want the enumeration that happens as part of serializing the headers to the wire to group the values with the same key?
    • We currently do, but we don't necessarily have to

    • > Preserve header ordering even in the presence of multiple values

      I take it this means your preference is "no"?

    • If the answer is yes (meaning preserving the current behavior), I would expect the approach to have similar or slightly worse performance characteristics than the current implementation.

@geoffkizer
Contributor

>   • Would we want NonValidated enumeration to group the values with the same key?
>     • As far as I understand the team's feeling here is yes.

For better or worse we have defined NonValidated to group values by header name, and we cannot change that now. (At least I don't think we can... certainly we need to group for the IDictionary implementation on NonValidated. Maybe we could get away with not grouping for the IEnumerable implementation? Seems weird...)

That said, when I think about the scenarios for using NonValidated (e.g. YARP), I think (a) these don't really care about grouping by header name, and (b) probably would actually prefer to not group by header name, since that should provide both better performance and preserve the original header ordering.

So while we can't change the existing NonValidated semantics, we could perhaps add some new API that does non-validated enumeration without grouping by header name. That's a bit ugly, especially since we just added NonValidated... but if it's a better solution, we should consider it.

>   • Would we want the enumeration that happens as part of serializing the headers to the wire to group the values with the same key?
>     • We currently do, but we don't necessarily have to

As with the NonValidated case above, I think we don't really care about grouping by header name here. The fact that we do is more a legacy of the current HttpHeaders design than a conscious choice. If we can improve perf by not grouping (and I suspect we can) then that alone is good reason to not group by header name.

Preserving header ordering is nice too; I think if we were starting from scratch here we'd try to design the header store to preserve ordering of raw headers, but since we don't today, and we haven't heard huge complaints about it, I don't know that it really matters.

The nice thing about this case as opposed to NonValidated above is that no new public API is required. So if this were to improve the performance of sending request headers in a non-trivial way, I think that could be justification enough for doing this even without adding new public API for the NonValidated case.

@geoffkizer
Contributor

Here's an alternative idea re the HeaderDescriptor changes in this PR: #63047

That's a non-trivial amount of work, so we should probably just proceed with the HeaderDescriptor changes here; but it seems worth considering as we are making improvements in this area.

@MihaZupan

This comment has been minimized.

@geoffkizer
Contributor

That data looks pretty good... based on this do you believe this is the best approach?

@geoffkizer
Contributor

The use of GetValueRefOrAddDefault is a nice optimization -- however it does seem to make the comparison above unfair since this optimization doesn't exist in the existing code.

@@ -235,35 +250,32 @@ public override string ToString()

  var vsb = new ValueStringBuilder(stackalloc char[512]);

- if (_headerStore is Dictionary<HeaderDescriptor, object> headerStore)
+ foreach (HeaderEntry entry in GetEntries())
Contributor

This (and other calls to GetEntries) will force an array allocation if we are in dictionary mode, right? Seems like we could avoid that without too much trouble.

Member Author

You mean by caching the array or by implementing a custom enumerator?
I tried to optimize for the common case, even if it made the dictionary edge-case slightly more expensive.

Contributor

I was thinking more like a custom enumerator. That said, I can see how this could potentially add cost to the common case. That's unfortunate. Seems like we should be able to handle both cases efficiently and without allocating, but it's not clear to me how to do this without significantly complicating the code. Hmmm.

ref object? DictionaryGetValueRefOrAddDefault(HeaderDescriptor key)
{
var dictionary = (Dictionary<HeaderDescriptor, object>)_headerStore!;
ref object? value = ref CollectionsMarshal.GetValueRefOrAddDefault(dictionary, key, out s_dictionaryGetValueRefOrAddDefaultExistsDummy);
Contributor

Why can't we just use _ here?

Member Author

Some sort of escape analysis is preventing that:

CS8347: Cannot use a result of 'CollectionsMarshal.GetValueRefOrAddDefault<HeaderDescriptor, object>(Dictionary<HeaderDescriptor, object>, HeaderDescriptor, out bool)' in this context because it may expose variables referenced by parameter 'exists' outside of their declaration scope

Using a dummy field for it was the only way I found to work around it.
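
For readers who have not used CollectionsMarshal.GetValueRefOrAddDefault, here is a small self-contained sketch of the pattern, including the dummy static field for the `exists` out parameter described above. The wrapper type `RefDictionaryExample`, its members, and the string-based values are hypothetical; only the CollectionsMarshal call is a real API (available since .NET 6).

```csharp
#nullable enable
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Hypothetical wrapper used only to illustrate the pattern.
public sealed class RefDictionaryExample
{
    // Dummy target for the 'exists' out parameter, mirroring the workaround above.
    private static bool s_existsDummy;

    private readonly Dictionary<string, object> _store = new();

    // Returns a reference to the slot for 'key', adding an empty (null) slot if missing.
    public ref object? GetValueRefOrAddDefault(string key)
    {
        ref object? value = ref CollectionsMarshal.GetValueRefOrAddDefault(
            _store, key, out s_existsDummy);
        return ref value;
    }

    // One hash lookup per call: the same slot is read and then written through the ref.
    public void AddOrAppend(string key, string item)
    {
        ref object? slot = ref GetValueRefOrAddDefault(key);
        slot = slot is null ? item : $"{slot}, {item}";
    }

    public object? Get(string key) =>
        _store.TryGetValue(key, out object? value) ? value : null;
}
```

The point of the ref return is that "look up, then insert or update" touches the dictionary once, where a TryGetValue followed by Add would hash the key twice. The returned reference must not be used after the dictionary is modified again.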

Contributor

I have no idea what that error means.

@stephentoub Any insight here?

Member

You get that error with out _ but not with out s_dictionaryGetValueRefOrAddDefaultExistsDummy?

Member Author

That's right

Member

@stephentoub stephentoub Jan 15, 2022

@jaredpar, is this a compiler bug?
https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA+ABATARgLABQGAzAATakDCpA3oaQ6QA5QCWAbgIYAuM5OSKqQDOAfTABueoxYcefDANLAIEADYjRwKQUalpDEuRSkAsgAoAlAdo29sAGZCAMldIBeAHylHpAHLmAHYwAO5UVgA0pBAArtyawJakOnoAvjY2ioK+1AHUYFGx8Srqyta6jHQVegzAHqTcUDEwKTWGAOw+ME5ikjap+gSpQA==
This compiles, but replace the out bool s_b with a discard out _ and it fails. I'd expect a discard to accommodate any required scope.

error CS8156: An expression cannot be used in this context because it may not be passed or returned by reference
error CS8347: Cannot use a result of 'C.N(C, out bool)' in this context because it may expose variables referenced by parameter 'b' outside of their declaration scope

Member

It's a known issue but up until this point seemed more theoretical than a practical problem. Can look into fixing this with the rest of the ref work we're doing this time around.

dotnet/roslyn#56587 (comment)

Note: the discussion on that last issue details the trade-offs in fixing this. Effectively, to fix this we need to ensure that discards have unique locals associated with them when discards mix with span safety rules (today we re-use temps). Otherwise it can lead to discards creating safety issues.

@MihaZupan
Member Author

> Just to make sure I understand the current approach: ...

All the observations are correct.

> Meaning, if you do something like this:
> TryAddWithoutValidation("Header1", "X");
> TryAddWithoutValidation("Header2", "Y");
> TryAddWithoutValidation("Header1", "Z");
> You will still see the values for Header1 grouped.
> Am I understanding this all correctly?

That's correct, entries for the same name are always grouped into a single entry.
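
The grouping can be observed through the public API with a short sketch like the one below. `GroupingDemo` and the `http://localhost/` URI are placeholders chosen for illustration; `TryAddWithoutValidation` and `NonValidated` are existing HttpHeaders members (`NonValidated` requires .NET 6 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;

// Hypothetical demo type; only the HttpRequestMessage/HttpHeaders calls are real APIs.
public static class GroupingDemo
{
    public static List<string> Describe()
    {
        using var request = new HttpRequestMessage(HttpMethod.Get, "http://localhost/");
        request.Headers.TryAddWithoutValidation("Header1", "X");
        request.Headers.TryAddWithoutValidation("Header2", "Y");
        request.Headers.TryAddWithoutValidation("Header1", "Z");

        var lines = new List<string>();
        foreach (var header in request.Headers.NonValidated)
        {
            // Collect every value stored under this header name.
            var values = new List<string>();
            foreach (string value in header.Value)
            {
                values.Add(value);
            }
            lines.Add($"{header.Key}: {string.Join(", ", values)}");
        }
        return lines;
    }
}
```

The result has two entries, with X and Z both listed under Header1. With the array-backed store from this PR, Header1 is also expected to come before Header2, since the merged entry keeps the position of the first insertion.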

> However, presumably there's some point less than 64 headers where the lookup cost is now more expensive, but not so much more expensive that it's worth incurring the cost of the added memory usage of the dictionary. Is that the right way to think about this? I see the chart above only goes up to 16 headers, and it seems like even at 16 the cost of lookup is starting to make a noticeable difference vs the previous PR.

I'll run & post numbers for more headers here. From previous tests, I can say that #62981 (comment):

> Always grouping is slightly faster up to ~10 headers.
> Even with more headers, the difference between the approaches never reaches 10%.

The exact cutoff (currently ~10) depends on how expensive HeaderDescriptor.Equals(HeaderDescriptor) is. I expect it may be slightly cheaper with #62981 (comment).

@geoffkizer
Contributor

> The exact cutoff (currently ~10) depends on how expensive HeaderDescriptor.Equals(HeaderDescriptor) is. I expect it may be slightly cheaper with #62981 (comment).

Good point. Perhaps we should make that change first?

@MihaZupan
Member Author

Note that in the first graph, pr-before never orders or searches the entries. They are simply added to the end of the list and then serialized.

[Images: two benchmark graphs]

SendAsync raw data
Toolchain RequestHeaders Mean Error StdDev Median Ratio Allocated
main 1 2.353 us 0.0036 us 0.0174 us 2.350 us 1.00 1,728 B
pr-grouped 1 1.970 us 0.0033 us 0.0157 us 1.974 us 0.84 1,064 B
pr 1 2.055 us 0.0047 us 0.0224 us 2.047 us 0.87 1,064 B
main 2 2.420 us 0.0018 us 0.0087 us 2.421 us 1.00 1,728 B
pr-grouped 2 2.067 us 0.0020 us 0.0098 us 2.066 us 0.85 1,064 B
pr 2 2.148 us 0.0045 us 0.0219 us 2.157 us 0.89 1,064 B
main 3 2.608 us 0.0103 us 0.0498 us 2.584 us 1.00 1,728 B
pr-grouped 3 2.155 us 0.0024 us 0.0116 us 2.153 us 0.83 1,064 B
pr 3 2.194 us 0.0027 us 0.0134 us 2.192 us 0.84 1,064 B
main 4 2.710 us 0.0161 us 0.0812 us 2.679 us 1.00 2,032 B
pr-grouped 4 2.312 us 0.0137 us 0.0714 us 2.301 us 0.85 1,064 B
pr 4 2.322 us 0.0121 us 0.0622 us 2.295 us 0.86 1,064 B
main 5 2.822 us 0.0171 us 0.0844 us 2.807 us 1.00 2,032 B
pr-grouped 5 2.440 us 0.0083 us 0.0412 us 2.417 us 0.87 1,280 B
pr 5 2.491 us 0.0055 us 0.0271 us 2.493 us 0.88 1,280 B
main 6 2.941 us 0.0046 us 0.0221 us 2.933 us 1.00 2,032 B
pr-grouped 6 2.518 us 0.0086 us 0.0419 us 2.503 us 0.86 1,280 B
pr 6 2.522 us 0.0062 us 0.0298 us 2.511 us 0.86 1,280 B
main 7 3.041 us 0.0062 us 0.0302 us 3.022 us 1.00 2,032 B
pr-grouped 7 2.608 us 0.0036 us 0.0175 us 2.601 us 0.86 1,280 B
pr 7 2.670 us 0.0130 us 0.0680 us 2.667 us 0.88 1,280 B
main 8 3.275 us 0.0120 us 0.0626 us 3.293 us 1.00 2,696 B
pr-grouped 8 2.770 us 0.0158 us 0.0801 us 2.767 us 0.85 1,280 B
pr 8 2.737 us 0.0062 us 0.0305 us 2.754 us 0.84 1,280 B
main 9 3.371 us 0.0116 us 0.0576 us 3.347 us 1.00 2,696 B
pr-grouped 9 2.864 us 0.0036 us 0.0174 us 2.867 us 0.85 1,688 B
pr 9 2.895 us 0.0144 us 0.0716 us 2.869 us 0.86 1,688 B
main 10 3.602 us 0.0164 us 0.0854 us 3.594 us 1.00 2,696 B
pr-grouped 10 3.053 us 0.0031 us 0.0153 us 3.051 us 0.85 1,688 B
pr 10 3.005 us 0.0068 us 0.0330 us 3.015 us 0.84 1,688 B
main 11 3.767 us 0.0025 us 0.0122 us 3.768 us 1.00 2,696 B
pr-grouped 11 3.168 us 0.0018 us 0.0083 us 3.164 us 0.84 1,688 B
pr 11 3.139 us 0.0082 us 0.0415 us 3.122 us 0.83 1,688 B
main 12 3.953 us 0.0125 us 0.0631 us 3.927 us 1.00 2,696 B
pr-grouped 12 3.395 us 0.0118 us 0.0617 us 3.390 us 0.86 1,688 B
pr 12 3.219 us 0.0105 us 0.0545 us 3.184 us 0.81 1,688 B
main 13 4.099 us 0.0189 us 0.0964 us 4.052 us 1.00 2,696 B
pr-grouped 13 3.505 us 0.0105 us 0.0519 us 3.489 us 0.85 1,688 B
pr 13 3.353 us 0.0139 us 0.0686 us 3.342 us 0.82 1,688 B
main 14 4.177 us 0.0020 us 0.0096 us 4.176 us 1.00 2,696 B
pr-grouped 14 3.661 us 0.0077 us 0.0373 us 3.664 us 0.88 1,688 B
pr 14 3.410 us 0.0081 us 0.0388 us 3.411 us 0.82 1,688 B
main 15 4.361 us 0.0021 us 0.0101 us 4.360 us 1.00 2,696 B
pr-grouped 15 3.795 us 0.0043 us 0.0208 us 3.801 us 0.87 1,688 B
pr 15 3.579 us 0.0099 us 0.0507 us 3.569 us 0.82 1,688 B
main 16 4.671 us 0.0187 us 0.0954 us 4.642 us 1.00 2,696 B
pr-grouped 16 4.125 us 0.0213 us 0.1111 us 4.117 us 0.88 1,688 B
pr 16 3.670 us 0.0093 us 0.0458 us 3.682 us 0.79 1,688 B
main 20 5.456 us 0.0037 us 0.0178 us 5.461 us 1.00 4,080 B
pr-grouped 20 4.903 us 0.0209 us 0.1088 us 4.863 us 0.90 2,480 B
pr 20 4.247 us 0.0216 us 0.1126 us 4.222 us 0.78 2,480 B
main 24 6.298 us 0.0159 us 0.0790 us 6.307 us 1.00 4,080 B
pr-grouped 24 5.674 us 0.0043 us 0.0212 us 5.673 us 0.90 2,480 B
pr 24 4.769 us 0.0123 us 0.0607 us 4.742 us 0.76 2,480 B
main 28 7.009 us 0.0060 us 0.0290 us 7.006 us 1.00 4,080 B
pr-grouped 28 6.507 us 0.0124 us 0.0611 us 6.493 us 0.93 2,480 B
pr 28 5.310 us 0.0087 us 0.0425 us 5.320 us 0.76 2,480 B
main 32 7.692 us 0.0062 us 0.0301 us 7.676 us 1.00 4,080 B
pr-grouped 32 7.424 us 0.0165 us 0.0783 us 7.374 us 0.97 2,480 B
pr 32 5.729 us 0.0083 us 0.0396 us 5.748 us 0.74 2,480 B
main 36 8.406 us 0.0131 us 0.0644 us 8.400 us 1.00 4,080 B
pr-grouped 36 8.472 us 0.0034 us 0.0163 us 8.468 us 1.01 4,040 B
pr 36 6.459 us 0.0259 us 0.1273 us 6.423 us 0.77 4,040 B
main 40 9.582 us 0.0352 us 0.1694 us 9.541 us 1.00 7,336 B
pr-grouped 40 9.528 us 0.0413 us 0.2078 us 9.427 us 0.99 4,040 B
pr 40 6.933 us 0.0276 us 0.1372 us 6.959 us 0.72 4,040 B
main 44 10.306 us 0.0286 us 0.1384 us 10.283 us 1.00 7,336 B
pr-grouped 44 10.590 us 0.0452 us 0.2270 us 10.464 us 1.03 4,040 B
pr 44 7.438 us 0.0105 us 0.0491 us 7.419 us 0.72 4,040 B
main 48 11.058 us 0.0092 us 0.0445 us 11.052 us 1.00 7,336 B
pr-grouped 48 11.655 us 0.0305 us 0.1479 us 11.595 us 1.05 4,040 B
pr 48 8.072 us 0.0491 us 0.2560 us 8.038 us 0.73 4,040 B
main 52 11.746 us 0.0368 us 0.1831 us 11.677 us 1.00 7,336 B
pr-grouped 52 12.655 us 0.0057 us 0.0269 us 12.654 us 1.08 4,040 B
pr 52 8.448 us 0.0407 us 0.2119 us 8.427 us 0.72 4,040 B
main 56 12.453 us 0.0311 us 0.1522 us 12.412 us 1.00 7,336 B
pr-grouped 56 13.865 us 0.0235 us 0.1127 us 13.822 us 1.11 4,040 B
pr 56 8.966 us 0.0292 us 0.1423 us 8.917 us 0.72 4,040 B
main 60 13.081 us 0.0207 us 0.0975 us 13.086 us 1.00 7,336 B
pr-grouped 60 15.032 us 0.0084 us 0.0388 us 15.022 us 1.15 4,040 B
pr 60 9.412 us 0.0338 us 0.1629 us 9.345 us 0.72 4,040 B
main 64 13.787 us 0.0094 us 0.0447 us 13.774 us 1.00 7,336 B
pr-grouped 64 16.743 us 0.0787 us 0.4103 us 16.960 us 1.21 4,040 B
pr 64 9.924 us 0.0336 us 0.1653 us 9.853 us 0.72 4,040 B
main 100 21.579 us 0.0852 us 0.4425 us 21.810 us 1.00 14,480 B
pr-grouped 100 26.696 us 0.0829 us 0.4322 us 26.399 us 1.24 15,072 B
pr 100 20.317 us 0.0918 us 0.4782 us 20.450 us 0.94 15,072 B
Add without validation + enumerate without validation raw data
Toolchain NumberOfHeaders Mean Error StdDev Median Ratio Allocated
main 1 122.59 ns 2.510 ns 3.757 ns 121.18 ns 1.00 376 B
pr-grouped 1 76.79 ns 1.535 ns 1.996 ns 77.10 ns 0.63 256 B
pr 1 84.04 ns 1.681 ns 1.868 ns 84.70 ns 0.69 256 B
main 2 183.06 ns 3.577 ns 3.346 ns 180.87 ns 1.00 376 B
pr-grouped 2 118.26 ns 1.673 ns 1.565 ns 119.13 ns 0.65 256 B
pr 2 122.93 ns 1.904 ns 1.590 ns 123.77 ns 0.67 256 B
main 3 270.32 ns 2.927 ns 2.444 ns 268.94 ns 1.00 376 B
pr-grouped 3 162.32 ns 3.152 ns 3.987 ns 160.26 ns 0.60 256 B
pr 3 165.98 ns 3.036 ns 2.840 ns 164.30 ns 0.61 256 B
main 4 389.57 ns 7.828 ns 9.319 ns 389.19 ns 1.00 680 B
pr-grouped 4 205.07 ns 3.002 ns 2.809 ns 206.41 ns 0.52 256 B
pr 4 205.72 ns 4.023 ns 3.951 ns 202.63 ns 0.53 256 B
main 5 461.69 ns 8.266 ns 7.732 ns 464.62 ns 1.00 680 B
pr-grouped 5 293.61 ns 5.823 ns 7.773 ns 294.10 ns 0.63 472 B
pr 5 316.54 ns 6.246 ns 8.338 ns 321.83 ns 0.68 472 B
main 6 527.32 ns 10.455 ns 16.582 ns 525.12 ns 1.00 680 B
pr-grouped 6 353.87 ns 7.081 ns 7.272 ns 356.35 ns 0.68 472 B
pr 6 370.81 ns 7.154 ns 9.792 ns 373.87 ns 0.71 472 B
main 7 616.52 ns 12.214 ns 17.122 ns 606.43 ns 1.00 680 B
pr-grouped 7 419.36 ns 8.110 ns 8.328 ns 414.64 ns 0.68 472 B
pr 7 437.81 ns 8.669 ns 8.109 ns 440.15 ns 0.71 472 B
main 8 745.69 ns 14.659 ns 13.712 ns 755.95 ns 1.00 1,344 B
pr-grouped 8 486.95 ns 7.636 ns 7.142 ns 482.39 ns 0.65 472 B
pr 8 492.31 ns 7.838 ns 6.545 ns 492.26 ns 0.66 472 B
main 9 825.77 ns 16.452 ns 16.158 ns 832.19 ns 1.00 1,344 B
pr-grouped 9 625.59 ns 12.451 ns 18.251 ns 631.28 ns 0.76 880 B
pr 9 610.42 ns 12.158 ns 16.642 ns 602.64 ns 0.74 880 B
main 10 997.87 ns 15.747 ns 14.730 ns 996.91 ns 1.00 1,344 B
pr-grouped 10 720.57 ns 14.355 ns 15.359 ns 726.92 ns 0.72 880 B
pr 10 690.36 ns 13.212 ns 13.568 ns 682.78 ns 0.69 880 B
main 11 1,061.06 ns 21.144 ns 35.904 ns 1,048.99 ns 1.00 1,344 B
pr-grouped 11 809.18 ns 15.951 ns 16.381 ns 814.43 ns 0.77 880 B
pr 11 807.85 ns 12.956 ns 12.119 ns 809.09 ns 0.77 880 B
main 12 1,209.00 ns 24.048 ns 22.495 ns 1,217.40 ns 1.00 1,344 B
pr-grouped 12 900.33 ns 13.765 ns 12.876 ns 907.99 ns 0.74 880 B
pr 12 850.53 ns 16.424 ns 16.131 ns 848.89 ns 0.70 880 B
main 13 1,328.95 ns 26.219 ns 30.194 ns 1,309.19 ns 1.00 1,344 B
pr-grouped 13 1,007.52 ns 12.576 ns 11.763 ns 1,013.41 ns 0.76 880 B
pr 13 956.47 ns 13.528 ns 12.654 ns 963.93 ns 0.72 880 B
main 14 1,455.24 ns 29.118 ns 29.902 ns 1,467.84 ns 1.00 1,344 B
pr-grouped 14 1,126.84 ns 22.493 ns 25.903 ns 1,142.96 ns 0.77 880 B
pr 14 1,042.98 ns 20.597 ns 32.067 ns 1,043.92 ns 0.72 880 B
main 15 1,530.39 ns 26.842 ns 23.795 ns 1,516.45 ns 1.00 1,344 B
pr-grouped 15 1,219.10 ns 24.151 ns 18.855 ns 1,226.85 ns 0.80 880 B
pr 15 1,153.90 ns 22.947 ns 34.346 ns 1,148.41 ns 0.74 880 B
main 16 1,656.75 ns 32.754 ns 33.636 ns 1,670.99 ns 1.00 1,344 B
pr-grouped 16 1,389.44 ns 24.917 ns 23.307 ns 1,374.75 ns 0.84 880 B
pr 16 1,263.69 ns 8.729 ns 7.289 ns 1,261.70 ns 0.76 880 B
main 17 1,773.05 ns 35.178 ns 39.100 ns 1,753.49 ns 1.00 1,344 B
pr-grouped 17 1,549.71 ns 27.231 ns 25.472 ns 1,535.39 ns 0.87 1,672 B
pr 17 1,473.61 ns 27.595 ns 24.463 ns 1,465.86 ns 0.83 1,672 B
main 18 2,022.24 ns 36.846 ns 28.767 ns 2,024.19 ns 1.00 2,728 B
pr-grouped 18 1,727.06 ns 32.736 ns 35.028 ns 1,737.08 ns 0.86 1,672 B
pr 18 1,539.29 ns 1.943 ns 1.623 ns 1,538.48 ns 0.76 1,672 B
main 19 2,195.57 ns 43.635 ns 63.959 ns 2,192.98 ns 1.00 2,728 B
pr-grouped 19 1,814.63 ns 30.953 ns 28.954 ns 1,795.46 ns 0.84 1,672 B
pr 19 1,670.20 ns 26.156 ns 24.466 ns 1,655.58 ns 0.77 1,672 B
main 20 2,255.31 ns 44.376 ns 63.643 ns 2,245.40 ns 1.00 2,728 B
pr-grouped 20 1,949.18 ns 24.671 ns 23.077 ns 1,935.58 ns 0.87 1,672 B
pr 20 1,819.34 ns 32.696 ns 30.584 ns 1,829.61 ns 0.81 1,672 B
main 22 2,594.99 ns 51.857 ns 48.507 ns 2,564.49 ns 1.00 2,728 B
pr-grouped 22 2,259.40 ns 44.367 ns 60.730 ns 2,228.79 ns 0.87 1,672 B
pr 22 2,094.97 ns 41.693 ns 73.022 ns 2,069.17 ns 0.79 1,672 B
main 24 2,868.40 ns 56.532 ns 58.055 ns 2,821.24 ns 1.00 2,728 B
pr-grouped 24 2,619.68 ns 50.704 ns 60.359 ns 2,574.80 ns 0.92 1,672 B
pr 24 2,408.42 ns 48.074 ns 51.439 ns 2,448.43 ns 0.84 1,672 B
main 26 3,131.28 ns 58.519 ns 54.738 ns 3,167.28 ns 1.00 2,728 B
pr-grouped 26 2,863.38 ns 3.760 ns 2.936 ns 2,864.17 ns 0.91 1,672 B
pr 26 2,747.12 ns 53.499 ns 50.043 ns 2,710.30 ns 0.88 1,672 B
main 28 3,336.57 ns 2.453 ns 2.295 ns 3,336.55 ns 1.00 2,728 B
pr-grouped 28 3,220.18 ns 4.427 ns 3.457 ns 3,219.11 ns 0.97 1,672 B
pr 28 3,008.62 ns 59.656 ns 71.016 ns 3,046.78 ns 0.90 1,672 B
main 30 3,770.03 ns 49.027 ns 45.860 ns 3,737.39 ns 1.00 2,728 B
pr-grouped 30 3,551.77 ns 20.115 ns 15.704 ns 3,549.86 ns 0.94 1,672 B
pr 30 3,392.65 ns 54.903 ns 51.356 ns 3,363.21 ns 0.90 1,672 B
main 32 3,973.16 ns 77.671 ns 125.424 ns 3,942.48 ns 1.00 2,728 B
pr-grouped 32 3,922.48 ns 46.984 ns 43.949 ns 3,902.50 ns 0.99 1,672 B
pr 32 3,730.02 ns 63.504 ns 59.401 ns 3,708.42 ns 0.95 1,672 B
main 34 4,268.38 ns 85.357 ns 135.385 ns 4,201.05 ns 1.00 2,728 B
pr-grouped 34 4,450.25 ns 88.186 ns 137.295 ns 4,408.10 ns 1.04 3,232 B
pr 34 4,259.82 ns 83.759 ns 117.418 ns 4,225.08 ns 1.00 3,232 B
main 36 4,571.16 ns 87.931 ns 101.262 ns 4,493.04 ns 1.00 2,728 B
pr-grouped 36 4,703.05 ns 92.352 ns 90.702 ns 4,711.15 ns 1.03 3,232 B
pr 36 4,623.75 ns 91.465 ns 136.901 ns 4,560.38 ns 1.01 3,232 B
main 38 4,997.31 ns 83.586 ns 65.258 ns 5,003.47 ns 1.00 5,984 B
pr-grouped 38 5,144.99 ns 101.197 ns 103.922 ns 5,106.03 ns 1.03 3,232 B
pr 38 4,862.88 ns 95.718 ns 143.267 ns 4,905.93 ns 0.96 3,232 B
main 40 5,269.29 ns 77.839 ns 60.771 ns 5,285.07 ns 1.00 5,984 B
pr-grouped 40 5,504.70 ns 14.606 ns 12.948 ns 5,501.95 ns 1.05 3,232 B
pr 40 5,276.89 ns 102.523 ns 113.954 ns 5,343.00 ns 1.00 3,232 B
main 42 5,476.23 ns 17.783 ns 14.850 ns 5,477.43 ns 1.00 5,984 B
pr-grouped 42 5,967.31 ns 117.755 ns 125.997 ns 5,901.33 ns 1.09 3,232 B
pr 42 5,676.34 ns 106.889 ns 104.980 ns 5,736.88 ns 1.04 3,232 B
main 44 5,779.55 ns 107.425 ns 100.486 ns 5,726.15 ns 1.00 5,984 B
pr-grouped 44 6,462.91 ns 122.669 ns 120.478 ns 6,525.91 ns 1.12 3,232 B
pr 44 6,019.01 ns 120.007 ns 128.406 ns 6,111.23 ns 1.04 3,232 B
main 46 6,095.82 ns 99.287 ns 92.873 ns 6,052.07 ns 1.00 5,984 B
pr-grouped 46 6,924.19 ns 137.678 ns 135.218 ns 6,968.37 ns 1.14 3,232 B
pr 46 6,520.93 ns 124.904 ns 116.835 ns 6,470.77 ns 1.07 3,232 B
main 48 6,504.33 ns 128.527 ns 126.231 ns 6,511.55 ns 1.00 5,984 B
pr-grouped 48 7,329.39 ns 104.296 ns 97.559 ns 7,284.50 ns 1.13 3,232 B
pr 48 6,948.54 ns 114.840 ns 107.421 ns 6,879.09 ns 1.07 3,232 B
main 52 6,816.08 ns 11.007 ns 8.593 ns 6,816.38 ns 1.00 5,984 B
pr-grouped 52 8,207.21 ns 157.495 ns 204.788 ns 8,099.89 ns 1.20 3,232 B
pr 52 7,857.39 ns 154.434 ns 144.458 ns 7,799.97 ns 1.15 3,232 B
main 56 7,309.92 ns 25.379 ns 19.814 ns 7,306.98 ns 1.00 5,984 B
pr-grouped 56 9,251.26 ns 181.800 ns 288.354 ns 9,122.51 ns 1.26 3,232 B
pr 56 8,572.55 ns 161.131 ns 150.722 ns 8,627.65 ns 1.17 3,232 B
main 60 8,075.87 ns 130.259 ns 121.845 ns 8,116.33 ns 1.00 5,984 B
pr-grouped 60 10,320.31 ns 204.922 ns 325.028 ns 10,485.12 ns 1.27 3,232 B
pr 60 9,537.24 ns 185.582 ns 213.717 ns 9,393.53 ns 1.18 3,232 B
main 64 8,318.01 ns 161.570 ns 165.921 ns 8,345.00 ns 1.00 5,984 B
pr-grouped 64 11,384.55 ns 221.299 ns 245.973 ns 11,445.30 ns 1.37 3,232 B
pr 64 10,525.79 ns 208.092 ns 213.695 ns 10,353.05 ns 1.27 3,232 B
main 80 10,495.06 ns 178.951 ns 149.432 ns 10,558.71 ns 1.00 5,984 B
pr-grouped 80 17,428.86 ns 340.630 ns 509.840 ns 17,763.80 ns 1.64 13,784 B
pr 80 10,174.41 ns 201.227 ns 223.663 ns 10,056.71 ns 0.97 13,784 B
main 96 13,642.04 ns 233.097 ns 218.039 ns 13,487.46 ns 1.00 13,128 B
pr-grouped 96 20,239.71 ns 397.674 ns 473.403 ns 20,537.87 ns 1.47 14,168 B
pr 96 12,179.03 ns 201.938 ns 188.893 ns 12,071.38 ns 0.89 14,168 B
main 112 15,849.29 ns 197.159 ns 184.423 ns 15,941.89 ns 1.00 13,128 B
pr-grouped 112 22,182.71 ns 424.558 ns 505.406 ns 22,084.05 ns 1.40 14,552 B
pr 112 14,118.42 ns 17.055 ns 14.241 ns 14,119.91 ns 0.89 14,552 B
main 128 18,360.61 ns 357.130 ns 438.588 ns 18,569.98 ns 1.00 13,128 B
pr-grouped 128 23,802.49 ns 38.615 ns 30.148 ns 23,803.64 ns 1.29 14,936 B
pr 128 16,259.46 ns 318.608 ns 549.582 ns 15,948.75 ns 0.88 14,936 B
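
In the rows above, the allocations for `pr` jump between 64 and 80 headers (3,232 B to 13,784 B), which is consistent with the PR description's switch from an array to a dictionary once more than 64 headers are stored. The following is a minimal sketch of that idea, not the PR's implementation: `SmallHeaderStore`, its members, string keys, and the case-insensitive comparison are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: SmallHeaderStore and its members are hypothetical names,
// not the types used in the PR.
public sealed class SmallHeaderStore
{
    // Threshold taken from the PR description (64 headers).
    public const int ArrayThreshold = 64;

    private KeyValuePair<string, string>[] _array = new KeyValuePair<string, string>[4];
    private Dictionary<string, string> _dictionary; // null while the array is in use
    private int _count;

    public bool UsesDictionary => _dictionary != null;

    public int Count => _dictionary != null ? _dictionary.Count : _count;

    public void Set(string name, string value)
    {
        if (_dictionary != null)
        {
            _dictionary[name] = value;
            return;
        }

        // One linear scan serves as both the lookup and the duplicate check.
        for (int i = 0; i < _count; i++)
        {
            if (string.Equals(_array[i].Key, name, StringComparison.OrdinalIgnoreCase))
            {
                _array[i] = new KeyValuePair<string, string>(_array[i].Key, value);
                return;
            }
        }

        if (_count == ArrayThreshold)
        {
            // The array is full at the threshold: move everything to a dictionary
            // so that later operations no longer scan linearly.
            var dictionary = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
            for (int i = 0; i < _count; i++)
            {
                dictionary.Add(_array[i].Key, _array[i].Value);
            }
            dictionary.Add(name, value);
            _dictionary = dictionary;
            _array = null;
            _count = 0;
            return;
        }

        if (_count == _array.Length)
        {
            Array.Resize(ref _array, Math.Min(_count * 2, ArrayThreshold));
        }
        _array[_count++] = new KeyValuePair<string, string>(name, value);
    }

    public bool TryGetValue(string name, out string value)
    {
        if (_dictionary != null)
        {
            return _dictionary.TryGetValue(name, out value);
        }

        for (int i = 0; i < _count; i++)
        {
            if (string.Equals(_array[i].Key, name, StringComparison.OrdinalIgnoreCase))
            {
                value = _array[i].Value;
                return true;
            }
        }

        value = null;
        return false;
    }
}
```

The sketch keeps the array path free of hashing, so small collections pay only for a short scan, while large collections fall back to hashed lookups.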

Good point. Perhaps we should make that change first?

I can put up a PR for it in parallel to this, but I imagine the exact numbers shouldn't matter for this PR.

@@ -159,28 +169,33 @@ internal bool TryAddWithoutValidation(HeaderDescriptor descriptor, IEnumerable<s
throw new ArgumentNullException(nameof(values));
}

-using (IEnumerator<string?> enumerator = values.GetEnumerator())
+using IEnumerator<string?> enumerator = values.GetEnumerator();
Member

Based on usage in, say, YARP, do we have a sense for what the most common concrete type of values is? If it's typically an array or List<string>, or even an IList<string>, it might be worth special-casing those types. PGO will hopefully be able to help even without that, but I suspect for the foreseeable future even if it can avoid some of the interface dispatch it probably won't be able to avoid the enumerator allocation.

Member Author

The common case is certainly TryAddWithoutValidation(string, string), but if we do end up using the IEnumerable overload, it would always be string[] (since the source is StringValues).

Member Author

Opened #64049 to split off such optimizations from this PR.
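
The special-casing discussed in this thread can be sketched as follows. This is a hedged illustration rather than the code from the PR or from #64049: `HeaderValues.AppendValues` is a hypothetical helper that stands in for whatever per-value work the real method does. The point is only that a concrete `string[]` can be indexed directly, while the general `IEnumerable<string>` path typically allocates an enumerator and makes interface calls per element.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper used only to illustrate the fast path.
public static class HeaderValues
{
    // Appends every non-null value to destination and returns how many were added.
    public static int AppendValues(List<string> destination, IEnumerable<string> values)
    {
        if (destination == null) throw new ArgumentNullException(nameof(destination));
        if (values == null) throw new ArgumentNullException(nameof(values));

        int added = 0;

        if (values is string[] array)
        {
            // Fast path: index the array directly, so no enumerator object is
            // needed and there are no interface calls per element.
            for (int i = 0; i < array.Length; i++)
            {
                if (array[i] != null)
                {
                    destination.Add(array[i]);
                    added++;
                }
            }
            return added;
        }

        // General path: works for any IEnumerable<string>.
        foreach (string value in values)
        {
            if (value != null)
            {
                destination.Add(value);
                added++;
            }
        }
        return added;
    }
}
```

Both paths produce the same result; the type check only changes how the values are walked.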

_headerStore ??= new Dictionary<HeaderDescriptor, object>();

-foreach (KeyValuePair<HeaderDescriptor, object> header in sourceHeadersStore)
+foreach (HeaderEntry entry in sourceHeaders.GetEntries())
Member

If sourceHeaders is backed by a dictionary, this is first going to copy the data into an array, only to then enumerate that array and throw it away. Can we do better?

Separately but related, how common is it for AddHeaders to be called on an empty collection? I'm wondering if we can optimize the transfer from sourceHeaders to this collection in some common cases by cloning the whole data structure rather than by adding each header individually.

Member Author

> If sourceHeaders is backed by a dictionary, this is first going to copy the data into an array, only to then enumerate that array and throw it away. Can we do better?

We could, but it would mean either duplicating some logic in callers or incurring the overhead of a custom enumerator in the common case. Related: #62981 (comment)

> how common is it for AddHeaders to be called on an empty collection?

AddHeaders is mainly used in two places:

  1. Copying defaultRequestHeaders to the request in HttpClient. I'm not sure how common it is to not specify any headers on the request, but this sounds like a nice optimization we should just do here if the target is empty.
  2. Creating a new HttpContent in DecompressionHandler and copying all the headers. In that case the destination always starts out empty. I opened "DecompressionHandler could move the headers more efficiently" (#63632) for this case, since we could avoid copying altogether.

If it turns out it's really common to have a lot of defaultRequestHeaders and few headers on the request, we could optimize for that too in the future.
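
A minimal sketch of the empty-target optimization discussed above, under stated assumptions: `HeaderBag` and its members are hypothetical, values are plain strings (the real store holds objects that may need cloning), and entries already present in the target are assumed to be left untouched when copying.

```csharp
using System;

// Hypothetical array-backed store used only to illustrate the idea.
public sealed class HeaderBag
{
    private struct Entry
    {
        public string Name;
        public string Value;
    }

    private Entry[] _entries = Array.Empty<Entry>();
    private int _count;

    public int Count => _count;

    public bool TryGetValue(string name, out string value)
    {
        for (int i = 0; i < _count; i++)
        {
            if (string.Equals(_entries[i].Name, name, StringComparison.OrdinalIgnoreCase))
            {
                value = _entries[i].Value;
                return true;
            }
        }
        value = null;
        return false;
    }

    public void Add(string name, string value)
    {
        if (_count == _entries.Length)
        {
            Array.Resize(ref _entries, _count == 0 ? 4 : _count * 2);
        }
        _entries[_count].Name = name;
        _entries[_count].Value = value;
        _count++;
    }

    public void AddHeaders(HeaderBag source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (source._count == 0) return;

        if (_count == 0)
        {
            // Empty target: copy the used part of the source array in one call
            // instead of searching and adding header by header.
            Entry[] copy = new Entry[source._entries.Length];
            Array.Copy(source._entries, copy, source._count);
            _entries = copy;
            _count = source._count;
            return;
        }

        // Non-empty target: add only the headers that are not already present.
        for (int i = 0; i < source._count; i++)
        {
            if (!TryGetValue(source._entries[i].Name, out _))
            {
                Add(source._entries[i].Name, source._entries[i].Value);
            }
        }
    }
}
```

The fast path skips one linear search per copied header, which is where the per-header cost of the general path comes from.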

{
return ref entries[i].Value!;
}
}
Member

Do we need to open-code these comparison loops, or could we use IndexOf?

Member Author

Since we're searching against entries[i].Key and not the whole type, we would need a custom IndexOf method anyway.
When I extracted that logic and replaced the 3 loops here with it, performance regressed by ~5% and LOC increased.

Member

> we would need a custom IndexOf method anyway

Just a custom comparer, no?

If it's not a good tradeoff, fine. I just want us to strongly prefer using built-in functionality whenever possible.
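
To make the tradeoff in this thread concrete, here is a hedged sketch that contrasts an open-coded key search with a built-in one. It is not the PR's code: `HeaderEntry` here is a simplified stand-in with a string key, and `Array.FindIndex` is used as one readily available built-in search, which is a substitute for the comparer-based `IndexOf` the reviewer has in mind.

```csharp
using System;

// Simplified stand-in for the PR's entry type; the string key is an assumption.
public struct HeaderEntry
{
    public string Key;
    public object Value;
}

public static class HeaderSearch
{
    // Open-coded loop: compares only the Key field of each used entry.
    public static int IndexOfKeyLoop(HeaderEntry[] entries, int count, string key)
    {
        for (int i = 0; i < count; i++)
        {
            if (string.Equals(entries[i].Key, key, StringComparison.OrdinalIgnoreCase))
            {
                return i;
            }
        }
        return -1;
    }

    // Built-in search: shorter to write, but the lambda captures 'key' and
    // each entry is passed to the predicate by value.
    public static int IndexOfKeyBuiltIn(HeaderEntry[] entries, int count, string key)
    {
        return Array.FindIndex(entries, 0, count,
            entry => string.Equals(entry.Key, key, StringComparison.OrdinalIgnoreCase));
    }
}
```

Both methods return the same index for the same input, so the choice between them comes down to measured cost versus reuse of library code, which is the point being debated.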

if ((uint)count < (uint)entries.Length)
{
entries[count] = entry;
_count++;
Member

Would this be better as _count = count + 1;?

Member Author

It's just slightly worse as it has to store an extra value on the stack
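
For context, the add path quoted in this thread follows a common pattern: copy the field values into locals, do one unsigned comparison, and fall back to a resize path otherwise. The sketch below illustrates that pattern with a hypothetical `EntryBuffer` that stores strings; the names and the growth policy are assumptions, not the PR's code.

```csharp
using System;

// Hypothetical buffer used only to illustrate the add pattern.
public sealed class EntryBuffer
{
    private string[] _entries = new string[4];
    private int _count;

    public int Count => _count;

    public string this[int index]
    {
        get
        {
            if ((uint)index >= (uint)_count)
            {
                throw new ArgumentOutOfRangeException(nameof(index));
            }
            return _entries[index];
        }
    }

    public void Add(string entry)
    {
        string[] entries = _entries;
        int count = _count;

        // One unsigned comparison covers both 0 <= count and count < entries.Length,
        // because a negative count would become a very large uint.
        if ((uint)count < (uint)entries.Length)
        {
            entries[count] = entry;
            _count++; // "_count = count + 1" has the same effect here
        }
        else
        {
            AddWithResize(entry);
        }
    }

    // Slow path kept separate so that the common path stays small.
    private void AddWithResize(string entry)
    {
        int count = _count;
        Array.Resize(ref _entries, Math.Max(4, _entries.Length * 2));
        _entries[count] = entry;
        _count = count + 1;
    }
}
```

The two ways of incrementing the count are behaviorally identical in this single-threaded sketch; the thread is only about which one produces slightly better machine code.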

@MihaZupan merged commit bc359a8 into dotnet:main on Jan 20, 2022
deeprobin added a commit to deeprobin/runtime that referenced this pull request Feb 1, 2022
* [RateLimiting] Dequeue items when queuing with NewestFirst (#63377)

* Don't reuse registers in Debug mode (#63698)

Co-authored-by: Bruce Forstall <brucefo@microsoft.com>

* Add IsKnownConstant jit helper and optimize 'str == ""' with str.StartsWith('c') (#63734)

Co-authored-by: Miha Zupan <mihazupan.zupan1@gmail.com>
Co-authored-by: SingleAccretion <62474226+SingleAccretion@users.noreply.github.com>

* - mono_wasm_new_external_root for roots on stack (#63997)

- temp_malloc helper via linear buffer in js
- small refactorings
Co-authored-by: Katelyn Gadd <kg@luminance.org>

* [Arm64] Don't use D-copies in CopyBlock (#63588)

* Increase the maximum number of internal registers allowed per node in src/coreclr/jit/lsra.h

* Based on discussion in https://github.com/dotnet/runtime/issues/63453 don't allocate a SIMD register pair if the JIT won't be able to use Q-copies in src/coreclr/jit/lsraarmarch.cpp

* Update CodeGen to reflect that Q-copies should be used only when size >= 2 * FP_REGSIZE_BYTES and using of them makes the instruction sequence shorter in src/coreclr/jit/codegenarmarch.cpp

* Update comment - we don't use D-copies after that change in src/coreclr/jit/codegenarmarch.cpp

* Disable hot reload tests for AOT configurations (#64006)

* Bump Explicit-layout value types with no fields to at minimum 1 byte size. (#63975)

* Add runtime-extra-platforms pipeline to have matching runtime PR and Rolling builds (#62564)

* Add runtime-extended-platforms pipeline to have matching runtime PR and Rolling builds

* Fix evaluate changed paths condition for the extra pipeline

* PR Feedback and fix condition

* Move MacCatalyst back to staging, disable tvOS tests

* Disable browser wasm windows legs

* Make ILStubGenerated event log ModuleID corresponding to that on other events (#63974)

* Retries for flaky WMI test (#64008)

* [arm64] JIT: Redundant zero/sign extensions after ldrX/ldrsX (#62630)

* JIT: fix up switch map for out-of-loop predecessor (#64014)

If we have a loop where some of the non-loop predecessors are switches, and
we add a pre-header to the loop, we need to update the switch map for those
predecessors.

Fixes #63982.

* Update StructMarshalling design now that DisableRuntimeMarshallingAttribute is approved (#63765)

Co-authored-by: Elinor Fung <elfung@microsoft.com>

* Fix Crossgen2 bug #61104 and add regression test (#63956)

The issue tracks the runtime regression failure where
Crossgen2-compiled app is unable to locate a type with non-ASCII
characters in its name. The failure was caused by the fact that
Crossgen2 was incorrectly zero-extending the individual UTF8 characters
when calculating the hash whereas runtime is sign-extending them.

Thanks

Tomas

* Fix invalid threading of nodes in rationalization (#64012)

The code in question assumes that the ASG will be
reversed and thus threads "simdTree" before "location"
in the linear order. That dependency, while valid,
because "gtSetEvalOrder" will always reverse ASGs
with locals on the LHS, is unnecessary and incorrect
from the IR validity point of view.

Fix this by using "InsertAfter" instead of manual
node threading.

* Check if the child object is in the heap range before get_region_plan_gen_num (#63828)

* Check if the child object is in the heap range before object_gennum (#63970)

* 'cmeq' and 'fcmeq' Vector64<T>.Zero/Vector128<T>.Zero ARM64 containment optimizations (#62933)

* Initial work

* Added a comma to display

* Cleanup

* Fixing build

* More cleanup

* Update comment

* Update comment

* Added CompareEqual Vector64/128 with Zero tests

* Do not contain op1 for now

* Wrong intrinsic id used

* Removing generated tests

* Removing generated tests

* Added CompareEqual tests

* Supporting containment for first operand

* Fix test build

* Passing correct register

* Check IsVectorZero before not allocing a register

* Update comment

* Fixing test

* Minor format change

* Fixed formatting

* Renamed test

* Adding AdvSimd_Arm64 tests:

* Adding support for rest of 'cmeq' and 'fcmeq' instructions

* Removing github csproj

* Minor test fix

* Fixed tests

* Fix print

* Minor format change

* Fixing test

* Added some emitter tests

* Feedback

* Update emitarm64.cpp

* Feedback

* [Arm64] Keep unrolling InitBlock and CopyBlock up to 128 bytes (#63422)

* Add INITBLK_LCL_UNROLL_LIMIT and CPBLK_LCL_UNROLL_LIMIT of 128 bytes in src/coreclr/jit/targetarm64.h

* Keep unrolling InitBlock up to INITBLK_LCL_UNROLL_LIMIT bytes when dstAddr points to the stack in src/coreclr/jit/lowerarmarch.cpp

* Keep unrolling CopyBlock up to CPBLK_LCL_UNROLL_LIMIT bytes when both srcAddr and dstAddr point to the stack in src/coreclr/jit/lowerarmarch.cpp

* Add ProcessLinkerXmlBase to NativeAOT (#63666)

Add Xml Parsing linker files as a reference source to NativeAOT
Rename NativeAOT ProcessLinkerXmlBase version to ProcessXmlBase (uses XmlReader) 
Add ProcessLinkerXmlBase from linker and fix it so it can be used in NativeAOT (uses XPath)

* Fix gc_heap::remove_ro_segment (#63473)

* Fix OpenSSL version check in GetAlpnSupport

The previous check failed 3.0.0 because the Minor was 0 and Build was 0.

It could probably be rewritten to be `>= new Version(1, 0, 2)`, but that'd require more thinking.

* Fix issues with verify_regions, clear_batch_mark_array_bits. (#63798)

Details:
- we cannot verify the tail of the region list from background GC, as it may be updated by threads allocating.
- fix case in clear_batch_mark_array_bits where end is equal to the very end of a segment and we write into uncommitted memory in the mark_array.
-  bgc_clear_batch_mark_array_bits did some checks and then called clear_batch_mark_array_bits which repeated the exact same checks. Renamed clear_batch_mark_array_bits to bgc_batch_mark_array_bits and removed the old copy, removed the declaration for clear_batch_mark_array_bits.

* [debugger][wasm] Added support for non user code attribute (#63876)

* Hidden methods and step through methods behave the same way.

* Prepared flow for setting JustMyCode in the future.

* Tests for JustMyCode setting before debug launch.

* Transformed into dynamic JustMyCode change flow.

* JustMyCode disabled, first 3 cases solved.

* Finished behavior for JMC disabled (with 1 difference).

* JMC enabled: stepIn np bp + stepIn bp + resume bp.

* Functional version (with minor deviations from expected behavior).

* Refactoring.

* All tests for NonUserCode work.

* Fix line number after adding code above.

* Fix error in merge.

* Removing duplicated tests.

* [wasm][debugger] Added support for stepper boundary attribute (#63991)

* Hidden methods and step through methods behave the same way.

* Prepared flow for setting JustMyCode in the future.

* Tests for JustMyCode setting before debug launch.

* Transformed into dynamic JustMyCode change flow.

* JustMyCode disabled, first 3 cases solved.

* Finished behavior for JMC disabled (with 1 difference).

* JMC enabled: stepIn np bp + stepIn bp + resume bp.

* Functional version (with minor deviations from expected behavior).

* Refactoring.

* All tests for NonUserCode work.

* Fix line number after adding code above.

* Stepper boundary with tests.

* Save information about multiple decorators.

* Fix error in merge.

* Polish the PR build doc (#64036)

* [wasm] WebSocket tests on NodeJS (#63441)

- NPM package with WS.
- Restore npm during build.
- Load npm modules in test-main.js.

Co-authored-by: Pavel Savara <pavel.savara@gmail.com>

* Fix dependency in runtime-official.yml (#64040)

After https://github.com/dotnet/runtime/pull/62564 the `hostedOs` value is included in the job name.

* [API Implementation]: System.Diagnostics.CodeAnalysis.StringSyntaxAttribute (#62995)

* Add StringSyntaxAttribute

* Fix attribute declaration and add usage

* Address PR feedback

Co-authored-by: Stephen Toub <stoub@microsoft.com>

* Reduce the memory footprint of HttpHeaders (#62981)

* Change HttpHeaders backing store to an array

* Reduce the size of HeaderDescriptor to 1 object

* Update UnitTests, fix GetOrCreateHeaderInfo

* Switch to a dictionary after ArrayThreshold headers

* Add unit tests

* Use storeValueRef naming consistently

* Workaround field layout regression (#63005)

* Mark _descriptor on HeaderDescriptor as nullable

* Remove HeaderDescriptor.Descriptor and add HasValue, IsKnownHeader, Equals

* Simplify HttpHeaderParser.Separator logic

* Add comments on HasValue checks

* Lazily group headers by name

* Add a header ordering+grouping test

* Make use of the _count field

* Revert all HeaderDescriptor changes from PR

* Switch back to always grouping by name

* Assert that the collection is not empty in GetEnumeratorCore

* Optimize AddHeaders for empty collections

* Reference the Roslyn bug issue

* Assert that multiValues are never empty

* Don't preserve a Dictionary across Clear

* Add comment about why a custom HeaderEntry type is used

* Disable DirectoryLongerThanMaxLongPathWithExtendedSyntax_ThrowsException (#64044)

* Add test coverage for frozen objects and GC interaction (#64030)

* Test coverage for frozen objects and GC interaction

* Update Preinitialization.cs

* Remove Type.MakeGenericType dependency from source generation (#64004)

* Remove Type.MakeGenericType dependency from srcgen

* address feedback

* add trimmer warning suppression

* address feedback

* Add ns2.0 support to System.Formats.Cbor (#62872)

* Add ns2.0 support to System.Formats.Cbor

* Add NetFrameworkMinimum to tfms

* Add ReadHalf and WriteHalf to compatibility suppressions

* Remove unwanted comment

* Exception sets: debug checker & fixes (#63539)

* Add a simple exception sets checker

* Add asserts to catch missing nodes

* Fix normal VN printing

* Fix JTRUE VNs

* Fix PHI VNs

* Update VNs for "this" ARGPLACE node

* Tolerate missing VNs on PHI_ARGs

We do not update them after numbering the loops.

(Though perhaps we should)

* Tolerate unreachable blocks

* Fix exception sets for VNF_PtrTo VNFuncs

* Add VNUniqueWithExc

* Add VNPUniqueWithExc

* Fix arrays

* Consistently give location nodes VNForVoid

And always add exception sets for them.
This will simplify the exception set
propagation code for assignments.

* Fix CSE

* Fix GT_RETURN

* Fix LCLHEAP

* Fix GT_ARR_ELEM

* Fix unique HWI

* Fix unique SIMD

* Fix GT_SWITCH

* Fix CKFINITE

* Fix HWI loads

* Fix fgValueNumberAddExceptionSetForIndirection

The method does not need to add the exception set for
the base address. Additionally, the way it did add the
sets, by unioning with normal value numbers, lost all
exceptions not coming from the base address.

This was fine for the unary loads, but broke the HWI loads
that could have exceptions coming from not just the address.

* Fix GT_RETFILT

* Fix INIT_VAL

* Fix DYN_BLK

* Fix FIELD_LIST

* De-pessimize CkFinite

* Add a test for HWIs

* Add a test for LCLHEAP

* Change test to check for store block operators (#60878)

* Update XUnit to 2.4.2-pre.22 (#63948)

* Update to Xunit build 2.4.2-pre.13

Also pick up latest pre-release of analyzers

* Disambiguate calls to Assert.Equals(double,double,int)

Xunit added a new Assert overload that caused a lot of ambiguous calls.
https://github.com/xunit/xunit/issues/2393

Workaround by casting to double.

* Fix new instances of xUnit2000 diagnostic

* Workaround xUnit2002 issue with implicit cast

Works around https://github.com/xunit/xunit/issues/2395

* Disable xUnit2014 diagnostic

This diagnostic forces the use of Assert.ThrowsAsync for any async method,
however in our case we may want to test that a method will throw
synchronously to avoid regressing that behavior by moving to the async
portion of the method.

* Use AssertExtensions to test for null ArgumentException.ParamName

Workaround https://github.com/xunit/xunit/issues/2396

* Update to Xunit 2.4.2-pre.22

* Fix another ArgumentException.ParamName == null assert

* Preserve OBJ/BLK on the RHS of ASG (#63268)

One of my upcoming changes will need this information to
accurately detect type mismatch in "fgValueNumberBlockAssignment".

* Revert "Temporarily disable coredumps during library testing on macOS (#63742)" (#64057)

This reverts commit 2c28e63f9360280011a3b03c1ca6dc0edce1fae4.

Fixes #63761

* Performance: Fix Browser Wasm job not being found for dependent jobs (#64058)

* Figure out the name that browser wasm now uses.

*  linux to the Browser wasm depends on name.

Update the browser wasm dependson name to match the new one found in the pipeline.

* Fix exception propagation over HW exception frame on macOS arm64 (#63596)

* Fix exception propagation over HW exception frame on macOS arm64

There is a problem unwinding over the PAL_DispatchExceptionWrapper
to the actual hardware exception location. The unwinder is unable
to get distinct LR and PC in that frame and sets both of them to
the same value. This is caused by the fact that the
PAL_DispatchExceptionWrapper is just an injected fake frame and
there was no real call. Calls always return with LR and PC set
to the same value.

The fix unifies the hardware exception frame unwinding with Linux
where we had problems unwinding over signal handler trampoline, so
PAL_VirtualUnwind skips the trampoline and now also the
PAL_DispatchExceptionWrapper frame by copying the context of
the exception as the unwound context.

* Reenable DllImportGenerator.Unit.Tests

* Add StringSyntax attribute to Regex.pattern field (#64063)

I missed adding this one in my initial audit.  It'll be exceedingly rare for a developer to manually write code that assigns a string to this protected field, but every source-generated regex does so, and thus any colorization VS provides will benefit looking at the source-generated code.

* Sync shared code from aspnetcore (#64059)

Co-authored-by: JamesNK <JamesNK@users.noreply.github.com>

* Read the System.GC.CpuGroup settings in runtimeconfig.json (#64067)

* Log message of unexpected exception in ThrowsAny (#64064)

* Log message of unexpected exception in ThrowsAny

* Update AssertExtensions.cs

* Enable some browser legs on the extra-platforms pipeline (#64065)

* Enable some browser legs on the extra-platforms pipeline

* Flow platform parameter from helix queues templates

* Fix another condition

* Allow CreateScalarUnsafe to be directly contained by hwintrinsics that support scalar loads (#62407)

* Ensure that floating-point constants can be contained by hardware intrinsics

* Allow CreateScalarUnsafe to be directly contained by hwintrinsics that support scalar loads

* Rename IsContainableHWIntrinsicOp to TryGetContainableHWIntrinsicOp and improve handling

* Ensure that NI_AVX2_BroadcastScalarToVector128/256 are properly tracked as MaybeMemoryLoad

* Applying formatting patch

* Ensure a few other "maybe memory" and special memory operand size cases are handled

* Applying formatting patch

* Remove commented code (#63869)

* Add pmi_path argument to superpmi.py script and use it in the superpmi-collect pipeline. (#63983)

* Add -pmi_path argument to superpmi.py collect command and use it to set PMIPATH environment variable in src/coreclr/scripts/superpmi.py

* Set pmi_path to $(SuperPMIDirectory)\crossgen2

* Print a warning if -pmi_path or -pmi_location is specified while --pmi is not in src/coreclr/scripts/superpmi.py

* Move setting of PMIPATH environment variable under `if self.coreclr_args.pmi is True:` in src/coreclr/scripts/superpmi.py

* Move pmi argument validation to setup_args() in src/coreclr/scripts/superpmi.py

* Clone root_env if we are going to set PMIPATH environment variable in src/coreclr/scripts/superpmi.py

* Update the macOS CoreCLR building documentation. (#63932)

This updates the documentation to refer to the up-to-date location of
requirements and prerequisites.

* Introduce RandomAccess.SetLength (#63992)

* don't Flush readonly MemoryMappedViewAccessor on disposal (#63794)

* don't Flush if it's impossible to write

* address code review feedback: apply same optimization to MemoryMappedViewStream

* Implement System.Runtime.CompilerServices.DisabledRuntimeMarshallingAttribute on CoreCLR-family of runtimes/type systems (#63320)

* Add the DisableRuntimeMarshallingAttribute to the build.

* Add initial test suite

* Implement support in IL stubs for the "disabled runtime marshalling" feature.

* Add testing for inlining IL stubs.

* Block SetLastError and LCID support when DisableRuntimeMarshallingAttribute is applied.

* Bump NativeAOT-only R2R version header (missed previously)

* Implement support in crossgen2 and NativeAOT

* Clean up the test tree and update the tests to fail more reliably when bugs are present.

Fix a bug that was uncovered when the tests were refactored.

* Fix NativeAOT and clean up crossgen2

* Add a test for NoPreserveSig with DisableRuntimeMarshalling

* Assign hr in SUCCEEDED macro.

* PR feedback.

* Block varargs in disabled marshalling mode.

* Fix typo

* Block types that have a field that is auto-layout somewhere in their layout.

* Fix typo

* Revert the AutoLayoutOrHasAutoLayoutFIeld check in the "marshalling enabled" case

* Only set scope when it isn't null (it's null for some cases).

* Fix narrowing conversion failure.

* First pass simple implementation in Mono

* Fix assert to still work for the built-in marshalling system

* S_FALSE is a thing

* Fix type load failures caused by eager type handle loading.

* Get MethodILScope from the calling method when available (this covers all cases where we need it)

* Add const modifier.

* Try 2 to fix const modifiers

* Fix compilation of NativeAOT jitinterface

* Fix type lookup in Mono

* Use try_get model for getting the attribute type in the case of failure. Fix mono implementation for looking up the attribute.

* Handle void and generic instantiations

* Update auto-layout check to check recursively in layout.

* Enhance test suite with more tests for UnmanagedCallersOnly, generics, and the like. Fix AutoLayout test.

* Fix IL and a few typos

* Set a value in the padding for easier debugging.

* Create sig->marshalling_disabled to track when marshalling is disabled, which is separate from the concept of "is this signature a P/Invoke"

* Fix running test suite on Mono + Mini JIT

* Fix recursive type load failure by only checking the "has auto-layout or field with auto-layout" for value types.

* Fix mono windows build.

* Feedback from Michal.

* Fix bug in EcmaAssembly.HasAssemblyCustomAttribute

* Make the runtime flavor check in the wrapper generator case-invariant

* Use helper method since various different platforms/configurations throw different exceptions for these scenarios.

* Fix AutoLayout test refactor and use a dummy value for the padding field in both enabled and disabled scenarios.

* Add an explicit test for using enums as they're a little weird and needed some special-casing.

* Fix build-time test filtering in xunit wrapper generator.

* Fix some x86-specific issues

* Add a nice big comment block.

* Fix x86

* Refactor tests so we can skip one on Mono since Char->char lossy conversion is not supported.

* Disable test in issues.targets until an alternative solution is reached.

* Add another SkipOnMono attribute in the "Enabled" test suite.

* Apply UnmanagedFunctionPointerAttribute to help hint to the Mono LLVM AOT compiler to compile the managed->native thunks at aot-time

* Unify on "runtime marshalling" terminology

* Clean up unused usings.

* Address Jan's feedback except for applying the attribute to CoreLib.

* PR feedback.

* Mono throws an InvalidProgramException for varargs

* Fix copy-paste issue.

* Make sure we use the P/Invoke's Module and not the caller's module when deciding if runtime marshalling is enabled for a varargs P/Invoke call.

* Handle how LLVM AOT reports the failure to handle varargs (EEE)

* Make ILLink validation steps in libs incrementally buildable (#64041)

* Make ILLink validation steps in libs incrementally buildable

Both the illink-oob and the illink-sharedframework targets don't define Inputs and Outputs which makes them run during no-op incremental builds. This change defines Inputs and Outputs based on what's used during the target's execution so that if the input assemblies or the illink assembly itself haven't changed, the step will be skipped.

Also renaming properties and items to make them more readable and consistent. As these target files are "extensions" of the src.proj file and aren't shared anywhere, they can be treated like logic inside a project file and hence prefixing properties and items with an underscore "_" isn't necessary.

* Fix broken callstacks in interpreter on MonoVM. (#60338)

* Fix some broken callstacks in interpreter.

* Fix build error.

* Initial WASI support prototype (#63890)

* Add StringSyntaxAttribute.Json (#64081)

* [main] Update dependencies from 5 repositories (#64002)

Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>
Co-authored-by: Přemek Vysoký <premek.vysoky@microsoft.com>

* Fix crash when VS4Mac is debugging VS4Mac arm64 (#64085)

Fix crash when VS4Mac is debugging VS4Mac arm64

Issue: https://github.com/dotnet/runtime/issues/64011

* ILVerify: Handle readonly references in ldfld (#64077)

* ILVerify: Handle readonly references in ldfld

Fixes #63953

* Fix test name

Co-authored-by: Michal Strehovský <MichalStrehovsky@users.noreply.github.com>

* Avoid additional local created for delegate invocations (#63796)

Very often 'this' is already a local and we can avoid creating another
local.

* [wasm][debugger] Apply changes on wasm using sdb protocol. (#63705)

* Apply changes on wasm using sdb protocol.

* conflict

* Merge conflict.

* Fix merge

* Fix compilation error.

* Fixed IsFloatPositiveZero from returning 'true' on non-constant double operands (#64083)

* Fixed IsFloatPositiveZero from returning 'true' on non-constant double operands

* Update src/coreclr/jit/gentree.h

Co-authored-by: Egor Bogatov <egorbo@gmail.com>

Co-authored-by: Egor Bogatov <egorbo@gmail.com>

* Ensure several helper intrinsics are correctly imported and handled (#63972)

* Ensure several helper intrinsics are correctly imported and handled

* Ensure that Sum for TYP_INT/UINT on Arm64 is correctly handled

* Respond to PR feedback and ensure ExtractMostSignificantBits for Vector64<int/uint> on Arm64 also uses AddPairwise

* Applying formatting patch

* Ensure the clsHnd is correct

* Fix the remaining musl failures

* Ensure that we aren't sign-extending TYP_BYTE (System.SByte) for ExtractMostSignificantBits

* Ensure an assert is correct on x64

* Ensure Vector64<int/uint>.Dot on Arm64 uses AddPairwise, not AddAcross

* Apply formatting patch

* RegexNode cleanup (#64074)

No functional changes, just code cleanup:
- Move node types into a RegexNodeKind enum
- Rename some of the kinds to make them more descriptive
- Rename node.Next to node.Parent to better describe its purpose
- Add a bunch of comments about node kinds

* Refactor optimizing morph for commutative operations (#63251)

* Create "fgOptimizeCommutativeArithmetic"

And just move code from "fgMorphSmpOp" to it.

Just one diff: better comma throw propagation in an ILGEN method.

* Refactor the function

Split it into specialized variants for each operator,
delete redundant code, fix up one case of wrong typing
for a constant in the MUL -> SHIFT optimization.

One CSE diff due to different VNs because of the typing
change for the constant (int -> long).

Many text diffs: "mov x3, 5" => "mov w3, 5".

* Do not set GTF_NO_CSE for sources of block copies (#63462)

It is not necessary, the compiler fully supports locals
on the RHS of a struct assignment. Not marking these results
in a CQ improvement, from struct (including SIMD) CSEs and
global constant propagation into promoted fields.

* Handle embedded assignments in copy propagation (#63447)

* Clean things up a little

Delete redundant conditions, use "LclVarDsc*", rename locals for clarity.

* Delete a redundant condition

For actual def nodes, GTF_VAR_CAST will never be set, it is
only set in "optNarrowTree" for uses.

For "def nodes" that are actually uses (parameters), the VNs
will never match anyway.

* Handle embedded assignments in copy propagation

Previously, as the comments in copy propagation tell us, it
did not handle "intervening", or not-top-level definitions of
locals, instead opting to maintain a dedicated kill set of them.

This is obviously a CQ problem, but also a TP one, as it meant
there had to be a second pass over the statement's IR, where
the definitions would be pushed on the stack.

This change does away with that, instead pushing new definitions
as they are encountered in execution order, and simultaneously
propagating on uses. Notably, this means the code now needs to
look at the real definition nodes, i. e. ASGs, not the LHS locals,
as those are encountered first in canonical execution order, i. e.
for a tree like:

```
  ASG
    LCL_VAR V00 "def"
    ADD
      LCL_VAR V00
      LCL_VAR V00
```

Were we to use the "def" as the definition point, we would wrongly
push it as the definition on the stack, even as the assignment
itself hasn't happened yet at that point.

There are nice diffs with this change, all resulting from unblocked
propagations, and mostly coming from setup arguments under calls.

* Simplify optIsSsaLocal

* Update format script permissions so it can be called on Unix systems directly. (#64107)

* Revert "Enable System.Text.Json tests on netfx (#63803)" (#64108)

This reverts commit 34794bc5f2bcdbaa9057bb07b8764e2bb6a411a2.

* Make ApiCompat.proj incrementally buildable (#64037)

* Make ApiCompat.proj incrementally buildable

In https://github.com/dotnet/runtime/pull/64000, I noticed that ApiCompat.proj never builds incrementally. Even though the RunApiCompat target has Inputs and Outputs, those are defined too late inside the target to have any effect. Move them out and declare the generated response file as an output.

Also simplify some MSBuild logic and rename some properties, as underscore prefixes in project files don't make sense if the property isn't reserved in any way.

* Update ApiCompat.proj

* Remove enable drawing on unix switch (#64084)

* Remove enable drawing on unix switch

* Update some tests and not run tests that need Drawing on non Windows

* PR Feedback, just turn off the switch

* Address-expose locals under complex local addresses in block morphing (#63100)

* Handle complex local addresses in block morphing

In block morphing, "addrSpill" is used when the destination or source
represent indirections of "complex" addresses. Unfortunately, some trees
in the form of "IND(ADDR(LCL))" fall into this category.

If such an "ADDR(LCL)" is used as an "addrSpill", the underlying local
*must* be marked as address-exposed. Block morphing was using a very
simplistic test for when that needs to happen, essentially only recognizing
"ADDR(LCL_VAR/FLD)". But it is possible to have a more complicated pattern
as "PrepareDst/Src" uses "IsLocalAddrExpr" to recognize indirect stores
to locals.

Currently it appears impossible to get a mismatch here as morph transforms
"IND(ADD(ADDR(LCL_VAR), OFFSET))" into "LCL_FLD" (including for TYP_STRUCT
indirections), but this is a very fragile invariant. Transforming TYP_STRUCT
GT_FIELDs into GT_OBJs instead of GT_INDs breaks it, for example.

Fix this by address-exposing the local obtained via "IsLocalAddrExpr".

* Add a TODO-CQ for LCL_FLD usage

* [Group 2] Enable nullable annotations for `Microsoft.Extensions.DependencyInjection` (#63836)

* Annotate src

* Update ResolverBuilder.Build

* Update RunOnEmptyStackCore

* ILEmitResolverBuilderContext constructor

* Remove setter

* Add assert

* Enable nullable annotations for Microsoft.Extensions.Configuration.UserSecrets (#63700)

* [mono] Cleanup trailing whitespace. (#64112)

* Delete `GT_DYN_BLK` (#63026)

* Import GT_STORE_DYN_BLK directly

* Delete GT_DYN_BLK

* DynBlk -> StoreDynBlk

* Add some tests

* Mark tests Pri-1

* Rebase and fix build

* Bring back the odd early return

* Ignore conversion exceptions during dictionary construction (#63792)

* Extract SuperPMI into a separate component (#64035)

Allows building the runtime without SPMI.

`build.cmd clr` will still build SPMI.
`build.cmd clr.native` will still build SPMI.
`build.cmd clr.runtime` will no longer build SPMI.

This is mostly motivated by NativeAOT subset builds, where SPMI contributes 10% of the native build time (the NativeAOT CoreCLR subset builds pretty quickly compared to full CoreCLR).

* Add COMWrappers to crossgen (#63969)

* pipelines: Add wasm jobs (#64109)

* Fixing update issue with multivalued properties #34267 (#56696)

* Add custom attribute test

* Adding test demonstrating issue #34267

* Solution for issue #34267

Replacing all values in property with the new collection, instead of just
appending new values, leaving old values in place.

* Incorporate review feedback

Changing the variable name
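
The difference between the old and new update behavior described above can be sketched in a few lines. This is only an illustration in Python, not the DirectoryServices code; the function names `update_appending` and `update_replacing` are hypothetical.

```python
def update_appending(stored, new_values):
    """Behavior described as the bug: new values are appended, old ones stay."""
    return list(stored) + list(new_values)


def update_replacing(stored, new_values):
    """Behavior described as the fix: the new collection replaces all old values."""
    return list(new_values)
```

With a stored value list of `["a", "b"]` and an update to `["c"]`, the first variant keeps the stale entries while the second leaves only `["c"]`.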

* Relax assert in ApplyEditAndContinue (#64132)

Fixes #64070

* Disable NJulianRuleTest test crashing in CI (#64142)

* Updating unit tests for DirectoryServices.AccountManagement (#56670)

Removing old, redundant unit tests that were actually never executed

Migrating old tests to new test infrastructure with configurable LDAP/AD
connections

* Fix MultiByteToWideChar call in pal (#64146)

* Extra tests for assembly name parser. (#64022)

* Dead code in native assembly name parsing

* disallow `\u` escaping in assembly names

* misc cleanup

* forward slash is illegal escaped or not

* ignore "language" attribute in assembly name ("culture" must be used)

* duplicate attributes are ok if unrecognized (just add tests)

* drop support for "custom" blob attribute

* drop support for publickey[token]=neutral ("null" must be used)

* ignore unknown assembly name attributes in mono (compat)

* disallow \0 anywhere in the assembly name

* disallow \0 in assembly names on mono (compat)

* only check for embedded nulls when parsing

* fix mono build

* make GCC happy

* couple test scenarios for publickey vs. publickeytoken (CoreRT parser might trip on these)

* produce errors on duplicate known attributes in mono

* Dispose LdapConnections used by ValidateCredentials (#62036)

Ensure that cached LdapConnection instances created by
PrincipalContext.ValidateCredentials are disposed when
the corresponding PrincipalContext is disposed.

Fix #62035

* Add runtime support for `ref` fields (#63985)

* Add mono and coreclr runtime support for ref fields

* Update Reflection.Emit tests to validate ref fields.

Add test for TypedReference as a ref field.

* Spmi replay asmdiffs mac os arm64 (#64119)

* Split unix-arm64 into linux-arm64 and osx-arm64 in src/coreclr/scripts/superpmi-replay.proj

* Split unix-arm64 into linux-arm64 and osx-arm64 in src/coreclr/scripts/superpmi-asmdiffs.proj

* Add all subdirectories of $(SuperPMIDirectory) as PMIPATH in src/coreclr/scripts/superpmi-collect.proj

* Update NativeAOT codegen and Crossgen2 for CreateSpan (#63977)

- Make sure FieldRVA pointers remain aligned as required by the code generator
  - Use the same Packing Size approach as the IL Linker will use (See jbevain/cecil#817 for details)
  - Compilers that generate CreateSpan will need to follow that trick to be compatible with rewriters.
- Provide ECMA spec augment describing packing size detail

* Add alignment to mapped field stream (#63305)

* Align MappedFieldDataStream at 8 byte boundary

* Add test to verify that the mapped field rva data blob is aligned to ManagedPEBuilder.MappedFieldDataAlignment

* Only align when the mapped field data is of size not equal to 0
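
A minimal sketch of the alignment rule these commits describe, assuming a power-of-two alignment of 8. It is written in Python for illustration only; `align_up` and `mapped_field_data_start` are hypothetical names, not members of ManagedPEBuilder.

```python
ALIGNMENT = 8  # assumed to mirror the 8-byte boundary named in the commit title


def align_up(offset, alignment=ALIGNMENT):
    """Round offset up to the next multiple of a power-of-two alignment."""
    return (offset + alignment - 1) & ~(alignment - 1)


def mapped_field_data_start(current_offset, data_size):
    """Pad only when there is mapped field data to place; empty data adds no padding."""
    if data_size == 0:
        return current_offset
    return align_up(current_offset)
```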

* Implement hash and HMAC stream one shots 

This implements hashing and HMAC statics for streams. Additionally,
"LiteHmac" and "LiteHash" were introduced. The existing HMAC and hash
provider functionality do some bookkeeping we don't need for resetting.
Since we do not need to use these hash handles after the digest has
been finalized, resetting is unnecessary work. For HMAC, that also means
keeping a copy of the key around for some implementations which we don't
need to do.

The LiteHash and LiteHmac types are implemented as structs with a common
interface. To avoid boxing, generics are used and constrained to the interface
where possible.

The Browser implementation just defers to the existing HashDispenser rather
than do anything novel.

The HashProviderCng is somewhat specialized in its ability to reset. It did an
up-front check to determine if the platform supported reusable hash providers,
and further had a single implementation for HMAC and Digests. The current
Lite hash design requires that they remain separate types.
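
To illustrate the idea of a one-shot stream hash that never needs to be reset, here is a rough sketch using Python's standard `hashlib` and `hmac` modules. It is not the .NET implementation; `hash_stream`, `hmac_stream`, and the chunk size are assumptions made for the example.

```python
import hashlib
import hmac
import io


def hash_stream(stream, algorithm="sha256", chunk_size=4096):
    """Hash a stream in chunks; the hash object is used once and then discarded."""
    digest = hashlib.new(algorithm)
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.digest()


def hmac_stream(key, stream, algorithm="sha256", chunk_size=4096):
    """Same pattern for HMAC; no copy of the key is kept for a later reset."""
    mac = hmac.new(key, digestmod=algorithm)
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        mac.update(chunk)
    return mac.digest()
```

Because the objects are finalized exactly once, none of the bookkeeping that a reusable provider needs is involved.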

* Title and message resources should be enforced to exist to prevent printing empty messages (#64151)

Sync ILLink.Shared folder with the latest version in dotnet/linker main branch

List of changes include:
- Enforce title and message resources to exist to prevent printing empty messages
- All diagnostics produced by linker now have a DiagnosticId, a title and a message
- Schema for xml link attributes file

- Added a readme file to the ILLink.Shared project to keep track of which commit is being used from dotnet/linker

* Allow generating Dwarf version 5 (#63988)

Contributes to https://github.com/dotnet/runtimelab/issues/1738.

* Re-enable failing long path test (#64113)

* Port MD4 managed implementation from mono/mono (#62074)

Porting MD4 managed implementation from mono/mono (MD4.cs and MD4Managed.cs). 
  
It adds:  
- an internal class in the System.Net.Security with a single HashData method for now;  
- a set of related MD4 unit tests to System.Net.Security.Unit.Tests project.

* Fix one source of perf regression in GCHeap::Alloc. This impacts the System.Collections.CtorFromCollectionNonGeneric<Int32> family of benchmarks. (#64091)

These benchmarks manage to make GCHeap::Alloc into a hotspot, so the call to IsHeapPointer() at the end matters for performance.

* Add blsr (#63545)

* Fix FileSystemAclExtensions.Create when passing a null FileSecurity (#61297)

* Make FileSecurity parameter nullable.

* Add missing ArgumentException message for FileMode.Append.

* Refactor tests to ensure FileSecurity is tested with all FileMode and FileSystemRights combinations. Separate special cases.

* Remove exception that throws when FileSecurity is null.
Ensure we have logic that can create a FileHandle when FileSecurity is null.
Fix bug where FileShare.Inheritable causes IOException because it is being unexpectedly passed to the P/Invoke (it should just be saved in the SECURITY_ATTRIBUTES struct).
Add documentation to mention this parameter as optional.
Ensure all exceptions match exactly what we have in .NET Framework, with simpler logic.

* Address suggestions

Co-authored-by: carlossanlop <carlossanlop@users.noreply.github.com>

* Tune FP CSEs live across a call better (#63903)

The problem was that the comparison of a weighted refcount,
which usually has the order of hundreds or tens, with a small
constant like "4" was too weak and missed some cases where we
were still trying to CSE cheap floats across calls and ending
up with lots of stack shuffling.

Fix this by using different tuning parameters, namely the costs
estimated for the uses and defs (increase them to account for
the spills and reloads).

* [main] Update dependencies from dotnet/arcade dotnet/icu dotnet/xharness dotnet/emsdk (#64098)

Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>

* Update zip extraction to never throw any exceptions when the LastWriteTime update fails (#63912)

* Use kebab-case in FB automation labels (#64048)

* Onboard new Triage & PR Boards (#64198)

* Exclusively use GitHub teams for Libraries area mentions (#64199)

* Reduce buffer size used in XmlReader when using Async mode (#63459)

The current choice of AsyncBufferSize resulted in the character buffer in the XmlTextReader being allocated on the Large Object Heap (LOH)

Fixes https://github.com/dotnet/runtime/issues/61459
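
The arithmetic behind this kind of fix can be sketched as follows. The 85,000-byte LOH threshold and the 2-byte `char` are standard .NET facts; the two buffer sizes below are illustrative assumptions, not the values taken from this change, and the array header is ignored for simplicity.

```python
LOH_THRESHOLD_BYTES = 85_000  # objects at or above this size go to the LOH
BYTES_PER_CHAR = 2            # a .NET char is a UTF-16 code unit


def lands_on_loh(char_count):
    """Approximate check that ignores the small array header overhead."""
    return char_count * BYTES_PER_CHAR >= LOH_THRESHOLD_BYTES


# Hypothetical sizes chosen only to show both sides of the threshold.
large_buffer_chars = 64 * 1024  # 131,072 bytes
small_buffer_chars = 32 * 1024  # 65,536 bytes
```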

* Ignoring leading dot when comparing cookie domains (#64038)

* ignoring leading dot when comparing cookie domain

* Simplify cookie comparing logic to equality and moving it to CookieComparer to fix the build

* Domain comparing optimization and more unit tests

* small check optimization

* Renaming method
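
A minimal sketch of the comparison rule these commits describe: two cookie domains are treated as equal when they differ only by a single leading dot. It is written in Python for illustration; `strip_leading_dot` and `domains_equal` are hypothetical names, and the case-insensitive comparison is an assumption rather than something stated in the commit messages.

```python
def strip_leading_dot(domain):
    """Drop one leading dot, if present."""
    return domain[1:] if domain.startswith(".") else domain


def domains_equal(a, b):
    """Compare two cookie domains, ignoring a leading dot and (assumed) letter case."""
    return strip_leading_dot(a).lower() == strip_leading_dot(b).lower()
```

Under this rule `.example.com` and `example.com` compare equal, while `example.com` and `sub.example.com` do not.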

* Add missing handle function enter/return macros (#64061)

The mono_field_static_get_value method uses a handle, but did not set up
enter/exit macros properly, so this handle was leaked.

Some code in Unity calls this embedding API method pretty often, which
can lead to the mark stack overflowing in the GC code.

* Drop support for .NET 5 SDK (#64186)

We had to duplicate a lot of Microsoft.NET.ILLink.targets logic.

* Implement IEquatable<T> on value types overriding Equals (and enable CA1066/1077) (#63690)

* [mono] Temporarily disable two tests that fail on arm64 LLVM FullAOT. (#64180)

* Delete stale reference in System.Drawing.Primitives (#64202)

* Respond to feedback in GenerateMultiTargetRoslynComponentTargetsFile (#63943)

* Respond to feedback in GenerateMultiTargetRoslynComponentTargetsFile

Two small follow up changes from #58446

- Fix a typo that breaks incremental build. Forgot to use MSBuild property syntax
- Instead of having the infrastructure hard-code removing 'Abstractions', packages can set their own Disable source gen property name.

* PR feedback

* Use the static HashData(Stream) method in more places

* Add executable bit to tizen sh files (#64216)

* Bump Intellisense package version to latest from `dotnet7-transport` (#63352)

* Ensure that we aren't accidentally generating instructions for unsupported ISAs (#64140)

* Assert that the ISA of the set intrinsic ID is supported

* Ensure gtNewSimdCmpOpAllNode and gtNewSimdCmpOpAnyNode don't generate AVX2 instructions when not supported

* Ensure codegen for Vector128.Dot when SSSE3 is disabled is correct

* Update src/coreclr/jit/hwintrinsiccodegenarm64.cpp

Co-authored-by: Jan Kotas <jkotas@microsoft.com>

* Ensure Vector256.Sum has a check for AVX2

Co-authored-by: Jan Kotas <jkotas@microsoft.com>

* Don't reference .NETFramework shims in libraries product or test composition (#64193)

* Don't reference .NETFramework shims

Stop referencing .NETFramework shims in libraries ref or source projects as those are supplementary and shouldn't impact the product composition.

* [Android][libs] Enable Internal.Console.Write in System.Private.CoreLib (#63949)

* [Android][libs] Enable Internal.Console.Write in System.Private.CoreLib

* [docs] Add debugging System.Private.CoreLib Internal.Console.Write

* Elaborate on debugging corelib log

* Address feedback

* Install v8 and Prebuild wasm (#64100)

* Port Mono to Raspberry Pi, ship as new linux-armv6 RID (#62594)

* Initial ARMv6 arch addition. Builds mono runtime, not CoreCLR (Mono already supports the CPU arch subset used by Raspberry Pi, whilst porting CoreCLR to e.g. VFPv2 would be major work)
* Build small clr subset on ARMv6, it's needed for SDK and we want to check it works

* Fix remote unwind (#64220)

The _OOP_find_proc_info was setting only a couple of members of the
unw_dyn_info_t instance on stack. So the remaining ones had random
values. The load_offset was a recently added member to the struct.
When we updated libunwind, this change came in. The load_offset was
random and that broke unwinding, as this offset is subtracted from
the IP when looking up unwind info.

The fix clears the whole struct. I have verified that the issue we had no
longer happens with the fix.

* Put back FindCaseSensitivePrefix regex alternation support (#64204)

* Put back FindCaseSensitivePrefix alternation support

* Fix the bug from the initial version, and add more comments

* Update tests to expect RemoteExecutor to check exit code (#64133)

* update generation_allocation_size correctly for SIP regions (#64176)

SIP regions need to update the corresponding generation's generation_allocation_size, and since this can be more than 1 gen older than the region's gen, we need to make sure every generation's alloc size gets updated.

* Android remove backward timezones (#64028)

Fixes #63693

It was discovered that Android produces duplicate TimeZone DisplayNames among all timezone IDs in GetSystemTimeZones. These duplicate DisplayNames occur across TimeZone IDs that are aliases, where all except one are backward timezone IDs.

If a name is changed, put its old spelling in the 'backward' file

From the Android TimeZone data file tzdata, it isn't obvious which TimeZone IDs are backward (I find it strange that they're included in the first place), however we discovered that on some versions of Android, there is an adjacent file tzlookup.xml that can aid us in determining which TimeZone IDs are "current" (not backward).

This PR aims to utilize tzlookup.xml when it exists and post-filters the populated TimeZone IDs in the AndroidTzData instance by removing IDs and their associated information (byteoffset and length) from the AndroidTzData instance if they are not found in tzlookup.xml. This relies on the assumption that all non-backward TimeZone IDs make it into the tzlookup.xml file.

This PR also adds a new TimeZoneInfo Test to check whether or not there are duplicate DisplayNames in GetSystemTimeZones
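
The post-filtering step described above can be sketched as follows, in Python for illustration only. `filter_backward_ids` is a hypothetical name, the byte offsets and lengths are made-up sample data, and passing `None` stands in for a device where tzlookup.xml does not exist.

```python
def filter_backward_ids(tzdata_entries, tzlookup_ids):
    """Keep only IDs that tzlookup lists; keep everything if tzlookup is absent.

    tzdata_entries maps a TimeZone ID to its (byte_offset, length) pair.
    """
    if tzlookup_ids is None:
        return dict(tzdata_entries)
    return {tz_id: info for tz_id, info in tzdata_entries.items()
            if tz_id in tzlookup_ids}


# Sample data: Asia/Calcutta is the backward alias of Asia/Kolkata.
entries = {"Asia/Kolkata": (0, 100), "Asia/Calcutta": (100, 100)}
current_ids = {"Asia/Kolkata"}
```

Removing the alias entries is what eliminates the duplicate DisplayNames, since each remaining ID maps to a distinct zone.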

* Update main branding to preview2 (#64219)

* Catch UnicodeEncodeErrors (#64251)

* Make XmlSerializer.Generator targets incremental (#64191)

* Make XmlSerializer.Generator targets incremental

Adding inputs and outputs to make XmlSerializer.Generator incremental

* Make sure that shared memory object name meets the length requirements (#64099)

Co-authored-by: Stephen Toub <stoub@microsoft.com>

* Fix PAL_wprintf for wide characters (#64181)

* [main] Update dependencies from dotnet/runtime dotnet/llvm-project (#64205)

* Update dependencies from https://github.com/dotnet/runtime build 20220123.5

Microsoft.NETCore.ILAsm , Microsoft.NETCore.DotNetHostPolicy , Microsoft.NETCore.DotNetHost , Microsoft.NETCore.App.Runtime.win-x64 , System.Runtime.CompilerServices.Unsafe , runtime.native.System.IO.Ports , Microsoft.NET.Sdk.IL , System.Text.Json
 From Version 7.0.0-alpha.1.22066.4 -> To Version 7.0.0-alpha.1.22073.5

* Update dependencies from https://github.com/dotnet/llvm-project build 20220123.1

runtime.win-x64.Microsoft.NETCore.Runtime.ObjWriter , runtime.win-arm64.Microsoft.NETCore.Runtime.ObjWriter , runtime.osx.10.12-x64.Microsoft.NETCore.Runtime.ObjWriter , runtime.osx.11.0-arm64.Microsoft.NETCore.Runtime.ObjWriter , runtime.linux-x64.Microsoft.NETCore.Runtime.ObjWriter , runtime.linux-musl-x64.Microsoft.NETCore.Runtime.ObjWriter , runtime.linux-musl-arm64.Microsoft.NETCore.Runtime.ObjWriter , runtime.linux-arm64.Microsoft.NETCore.Runtime.ObjWriter
 From Version 1.0.0-alpha.1.22070.1 -> To Version 1.0.0-alpha.1.22073.1

Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>

* Delete unused ApiCompat baseline files (#64190)

* Delete unused ApiCompat baseline files

* Delete ApiCompatBaseline.netfx.netstandardOnly.txt

* Remove manual .NETFramework baseline validation

* Delete ApiCompatBaseline.netcoreapp.netfx461.ignore.txt

* Delete ApiCompatBaseline.netcoreapp.netfx461.txt

* Improve Regex handling of anchors (#64177)

* Improve Regex handling of anchors

- Extend search for leading anchor to support alternations.  This means that an expression like `^abc|^def` will now observe the leading `^` whereas previously it didn't.
- Add a FindFirstChar optimization that jumps to the right position for a pattern that matches a computable max length and ends with an end anchor.

* Address PR feedback
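
The end-anchor optimization mentioned above can be sketched with Python's `re` module, which is used here only as a stand-in for the .NET engine. If a pattern can match at most `max_match_length` characters and must end at the very end of the input, no match can start earlier than `len(text) - max_match_length`. The helper names are hypothetical, and the sketch assumes an absolute end anchor (`\Z` in Python) with no lookbehind in the pattern.

```python
import re


def earliest_start(text_length, max_match_length):
    """First position at which an end-anchored, bounded-length match can begin."""
    return max(0, text_length - max_match_length)


def search_with_end_anchor(pattern, max_match_length, text):
    """Skip directly to the only region where a match is possible."""
    return pattern.search(text, earliest_start(len(text), max_match_length))
```

For `[a-c]{1,3}\Z` against a long input, the search starts three characters from the end instead of scanning from position 0, and it finds the same match.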

* Add the exception set for `ObjGetType` (#64106)

* Model NRE for ObjGetType

* Add tests

* [ILVerify] Fix casting check for arrays of generic parameters with class constraints (#64259)

Fixes #63999

* Use lower call count threshold for tiering in debug builds (#60945)

* Use lower call count threshold for tiering in debug builds

To exercise more paths during tests, see https://github.com/dotnet/runtime/pull/60886

* Skip tests using AsyncIO in FileSystemAclExtensionsTests where it's not supported (#64212)

The mono runtime does not yet support AsyncIO on Windows and there were some tests failing on CI because of it.
Fixes #64221

* Correct JsonNode.Root doc (#64238)

* Take ARMv6 out of PlatformGroup All (#64267)

* Take ARMv6 out of PlatformGroup All, CoreCLR assumes this means full support

Co-authored-by: Alexander Köplinger <alex.koeplinger@outlook.com>

* Only send to Helix for rolling build, due to small Helix queue (#64274)

* Add ref field runtime feature indication (#64167)

* Add ref field runtime feature indication

Co-authored-by: Stephen Toub <stoub@microsoft.com>

* Faster IndexOf for substrings (#63285)

* Improve "lastChar == firstChar" case, also, use IndexOf directly if value.Length == 1

* Try plain IndexOf first, to optimize cases where even first char of value is never met

* add 1-byte implementation

* copyrights

* fix copy-paste mistake

* Initial LastIndexOf impl

* More efficient LastIndexOf

* fix bug in Char version (we need to clear two lowest bits in the mask) & temporarily remove AdvSimd impl

* use ResetLowestSetBit

* Fix bug

* Add two-byte LastIndexOf

* Fix build

* Minor optimizations

* optimize cases with two-byte/two-char values

* Remove gotos, fix build

* fix bug in LastIndexOf

* Make sure String.LastIndexOf is optimized

* Use xplat simd helpers - implicit ARM support

* fix arm

* Delete \

* Use Vector128.IsHardwareAccelerated

* Fix build

* Use IsAllZero

* Address feedback

* Address feedback

* micro-optimization, do-while is better here since mask is guaranteed to be non-zero

* Address feedback

* Use clever trick I borrowed from IndexOfAny for trailing elements

* give up on +1 bump for SequenceCompare

* Clean up

* Clean up

* fix build

* Add debug asserts

* Clean up: give up on the unrolled trick - too little value from code bloat

* Add a test

* Fix build

* Add byte-specific test

* Fix build

* Update IndexOfSequence.byte.cs
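
The commits above mention comparing the first and last characters of the value before doing a full comparison. A scalar sketch of that filtering idea follows, in Python; the real change is vectorized and differs in detail, and `index_of` is a hypothetical name.

```python
def index_of(haystack, needle):
    """Find needle by checking its first and last characters before a full compare."""
    if not needle:
        return 0
    if len(needle) == 1:
        # Single-character values go straight to the plain search.
        return haystack.find(needle)
    first, last = needle[0], needle[-1]
    offset = len(needle) - 1
    for i in range(len(haystack) - offset):
        # Cheap filter: both ends must match before the costly comparison runs.
        if haystack[i] == first and haystack[i + offset] == last:
            if haystack[i:i + len(needle)] == needle:
                return i
    return -1
```

The two-character filter rejects most candidate positions cheaply, so the full comparison runs only where a match is plausible.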

* [main] Update dependencies from dotnet/arcade dotnet/xharness dotnet/icu dotnet/hotreload-utils dotnet/llvm-project (#64265)

* Update dependencies from https://github.com/dotnet/arcade build 20220124.13

Microsoft.DotNet.XUnitConsoleRunner , Microsoft.DotNet.CodeAnalysis , Microsoft.DotNet.Build.Tasks.Workloads , Microsoft.DotNet.Build.Tasks.Templating , Microsoft.DotNet.Build.Tasks.TargetFramework.Sdk , Microsoft.DotNet.Build.Tasks.Packaging , Microsoft.DotNet.Build.Tasks.Installers , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Build.Tasks.Archives , Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.ApiCompat , Microsoft.DotNet.XUnitExtensions , Microsoft.DotNet.GenAPI , Microsoft.DotNet.VersionTools.Tasks , Microsoft.DotNet.GenFacades , Microsoft.DotNet.SharedFramework.Sdk , Microsoft.DotNet.RemoteExecutor , Microsoft.DotNet.PackageTesting , Microsoft.DotNet.Helix.Sdk
 From Version 2.5.1-beta.22071.6 -> To Version 2.5.1-beta.22074.13

* Update dependencies from https://github.com/dotnet/xharness build 20220124.1

Microsoft.DotNet.XHarness.CLI , Microsoft.DotNet.XHarness.TestRunners.Common , Microsoft.DotNet.XHarness.TestRunners.Xunit
 From Version 1.0.0-prerelease.22071.1 -> To Version 1.0.0-prerelease.22074.1

* Update dependencies from https://github.com/dotnet/icu build 20220124.5

Microsoft.NETCore.Runtime.ICU.Transport
 From Version 7.0.0-preview.2.22071.2 -> To Version 7.0.0-preview.2.22074.5

* Update dependencies from https://github.com/dotnet/hotreload-utils build 20220124.1

Microsoft.DotNet.HotReload.Utils.Generator.BuildTool
 From Version 1.0.2-alpha.0.22069.1 -> To Version 1.0.2-alpha.0.22074.1

* Update dependencies from https://github.com/dotnet/llvm-project build 20220124.2

runtime.win-x64.Microsoft.NETCore.Runtime.Mono.LLVM.Tools , runtime.win-x64.Microsoft.NETCore.Runtime.Mono.LLVM.Sdk , runtime.linux-arm64.Microsoft.NETCore.Runtime.Mono.LLVM.Sdk , runtime.linux-arm64.Microsoft.NETCore.Runtime.Mono.LLVM.Tools , runtime.linux-x64.Microsoft.NETCore.Runtime.Mono.LLVM.Sdk , runtime.linux-x64.Microsoft.NETCore.Runtime.Mono.LLVM.Tools , runtime.osx.10.12-x64.Microsoft.NETCore.Runtime.Mono.LLVM.Sdk , runtime.osx.10.12-x64.Microsoft.NETCore.Runtime.Mono.LLVM.Tools
 From Version 11.1.0-alpha.1.22067.2 -> To Version 11.1.0-alpha.1.22074.2

Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>

* Add CancellationToken to TextReader.ReadXAsync (#61898)

Co-authored-by: Adam Sitnik <adam.sitnik@gmail.com>
Co-authored-by: Stephen Toub <stoub@microsoft.com>

* Restrict parallelism in LLVM FullAOT compile, to prevent OOM (#63800)

* Restrict parallelism in FullAOT compile, to prevent OOM

* Reduce parallelism further, due to more OOM

* Moved AssemblyName helpers to managed (#62866)

* Moved ComputePublicKeyToken to managed

* Managed assembly name parsing (adapted from nativeaot)

* Fix for HostActivation failures.

* PR feedback (RuntimeAssemblyName is back to CoreRT + other comments)

* remove AssemblyNameNative::Init from the .hpp

* remove AppX compat ifdef

* renamed instance fields to convention used in C#

* `Argument_InvalidAssemblyName`   should be   `InvalidAssemblyName`. Majority of use is `FileLoadException`.

* remove `this.`

* PR feedback (assign to fields, bypass properties)

* missed this change in the rebase

* "low-hanging fruit" perf tweaks.

* move one-user helpers to where they are used.

* removed ActiveIssue for #45032

* remove AssemblyNameHelpers.cs from corelib

* Remove the List when detecting duplicates. Support PublicKey.

* whitespace

* Fix managed implementation to match the new tests.

* Some minor cleanup.

* Do not validate culture too early

* PR feedback

* use SR.InvalidAssemblyName

* Report the input string when throwing FileLoadException

* tweaked couple comments

* Disable RegexReductionTests tests on browser

* Fix formatting of resource string where excess arguments are passed (#63824)

* Fix formatting of resource string where excess arguments are passed. #63607

* Fix BuildCharExceptionArgs and ECCurve.Validate

* Fix CA2208

* Fix CA2208. Remove paramName because it is in error message

* Code review fixes

* Code review fixes

* Add Regex.Count string overloads (#64289)

* Clarify purpose of PDB Document hashing (#64306)

Fixes #63505

* Fix arm64/PInvoke so that NESTED_ENTRY/NESTED_END labels match. (#64296)

This was exposed by building on arm64 with gcc-12,
wherein the assembler complained about not being able
to evaluate the constant expression for .size for the symbol
on NESTED_END.  Since the symbol on NESTED_END is not
referenced anywhere else in the code base,
I concluded that it was wrong, and NESTED_ENTRY was right.

I have not tested this on anything but arm64 + gcc-12

* When decommitting, leaving one instead of two pages in regions case. (#64243)

* Ensure that canceled Task.Delays invoke continuations asynchronously from Cancel (#64217)

* Add gen folder moving gen projects from src folder (#64231)

* Fix minor typos in GC documentation. (#64298)

* Explicitly specify four subdirectories to use as part of the paths for -pmi_path arguments and expand the paths on a remote machine in src/coreclr/scripts/superpmi-collect.proj (#64308)

* Disable RegexReductionTests on browser (#64312)

* Add UnreachableException (#63922)

* [mono] Recognize new names for Xamarin.iOS etc assemblies (#64278)

They are being renamed in https://github.com/xamarin/xamarin-macios/pull/13847

* Remove usage of codecvt from corerun (#64157)

* Remove usage of codecvt from corerun

* Update src/coreclr/hosts/corerun/corerun.cpp

Co-authored-by: Aaron Robinson <arobins@microsoft.com>

Co-authored-by: Aaron Robinson <arobins@microsoft.com>

* Refactor FileStatus.Unix. (#62721)

* Refactor FileStatus.Unix.

- Moves InitiallyDirectory out of FileStatus into FileSystemInfo.
In FileSystemInfo it can be a readonly field making its usage clearer.
And FileStatus can then directly be used to implement some FileSystem methods
without allocating an intermediate FileInfo/DirectoryInfo.

- Treat not exists/exist as initialized states to avoid wrongly assuming
initialized means the file cache is valid, which isn't so when the file does
not exist.

- Use 0 for tracking uninitialized to make default(FileStatus) uninitialized.

* Fix unique VNs for `ADDR`s (#64230)

* Add the test

* Fix unique VNs for ADDRs

They need to keep the exception sets.

* Implemented hierarchy of attributes. (#64201)

* Implemented hierarchy of attributes.

* Shortened.

* Fixed overlooked test naming and simplified.

* Partial refactor.

* Update the managed type system to more gracefully fail when calling a varargs method. (#64286)

* Update the managed type system to more gracefully fail when calling a varargs method.

* Use ThrowHelper instead of manually throwing the exception.

* Update src/coreclr/tools/Common/TypeSystem/Ecma/EcmaSignatureParser.cs

Co-authored-by: Jan Kotas <jkotas@microsoft.com>

* [mono] Add some missing Internal.Runtime.CompilerServices.Unsafe intrinsics. (#64314)

* Remove usage of FEATURE_CORESYSTEM (#63850)

* Remove usage of FEATURE_CORESYSTEM from coreclr.

* [main] Update dependencies from dotnet/arcade dotnet/runtime-assets (#64331)

* Update dependencies from https://github.com/dotnet/arcade build 20220125.6

Microsoft.DotNet.XUnitConsoleRunner , Microsoft.DotNet.CodeAnalysis , Microsoft.DotNet.Build.Tasks.Workloads , Microsoft.DotNet.Build.Tasks.Templating , Microsoft.DotNet.Build.Tasks.TargetFramework.Sdk , Microsoft.DotNet.Build.Tasks.Packaging , Microsoft.DotNet.Build.Tasks.Installers , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Build.Tasks.Archives , Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.ApiCompat , Microsoft.DotNet.XUnitExtensions , Microsoft.DotNet.GenAPI , Microsoft.DotNet.VersionTools.Tasks , Microsoft.DotNet.GenFacades , Microsoft.DotNet.SharedFramework.Sdk , Microsoft.DotNet.RemoteExecutor , Microsoft.DotNet.PackageTesting , Microsoft.DotNet.Helix.Sdk
 From Version 2.5.1-beta.22074.13 -> To Version 2.5.1-beta.22075.6

* Update dependencies from https://github.com/dotnet/runtime-assets build 20220125.1

Microsoft.DotNet.CilStrip.Sources , System.ComponentModel.TypeConverter.TestData , System.Drawing.Common.TestData , System.IO.Compression.TestData , System.IO.Packaging.TestData , System.Net.TestData , System.Private.Runtime.UnicodeData , System.Runtime.Numerics.TestData , System.Runtime.TimeZoneData , System.Security.Cryptography.X509Certificates.TestData , System.Text.RegularExpressions.TestData , System.Windows.Extensions.TestData
 From Version 7.0.0-beta.22060.1 -> To Version 7.0.0-beta.22075.1

Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>

* Fixes bad log method generation in certain cases. (#64311)

In certain cases, when a developer mistakenly places ILogger, Exception, or LogLevel in the message template, the code generator will now produce the expected warning and make sure the code still compiles and runs correctly.

Prior to this fix, the generated code would fail to compile when any of ILogger, Exception, or LogLevel was placed in the message template incorrectly.

Fixes #64310

* Fix IsMutuallyAuthenticated on Linux and OSX (#63945)

* WIP - prepared a failing test

* Fix IsMutuallyAuthenticated on Linux

* Fix failing unit tests

* Minor cleanup

* Port changes to OSX

* Fix comment

* Invoke cert selection inline, don't allocate new credentials on Linux/OSX

* Fix tests on OSX

* Code review feedback

* Move tests to separate file

* Fix build

* Fix Failing tests

* Support {Last}IndexOfAny with sets after {lazy} loops (#64254)

When emitting backtracking loops, the loop consumes as much as possible and then backtracks through the consumed input.  Rather than doing this one character at a time, we previously added use of LastIndexOf to search for the next place the literal after the loop matches.  We can also augment that to use LastIndexOfAny to search for a small set that comes after a loop instead of a literal.

Similarly when emitting backtracking lazy loops, rather than consuming one character and trying the rest of the expression and then consuming another character and trying the rest of the expression, we previously added an optimization to use IndexOf{Any} to find the next possible location of a match based on the literal that comes after the lazy loop.  And we can similarly augment that to support a small set after the lazy loop.

This is particularly helpful for IgnoreCase, as we're on a path to replacing literals with sets that contain all equivalent casings of that character.
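
As a rough illustration of the idea (not the code the regex emitter actually generates), the two searches can be sketched with the span helpers `IndexOfAny` and `LastIndexOfAny`. The class and method names below are hypothetical, and the sketch assumes the loop is followed by a single character drawn from a small set such as `[kK]`:

```csharp
using System;

// Hypothetical helper names; a simplified sketch of the optimization described above.
static class LoopThenSetSketch
{
    // Greedy loop followed by a set: the loop consumed input[start..loopEnd).
    // Instead of giving back one character at a time, jump straight to the last
    // position in that range holding a member of the set. Returns -1 if none.
    // To keep backtracking after a failed attempt, call again with loopEnd = result.
    public static int GreedyBacktrackTo(ReadOnlySpan<char> input, int start, int loopEnd, ReadOnlySpan<char> set)
    {
        int i = input.Slice(start, loopEnd - start).LastIndexOfAny(set);
        return i < 0 ? -1 : start + i;
    }

    // Lazy loop followed by a set: instead of consuming one character and retrying
    // the rest of the expression, jump to the next position holding a member of the set.
    // Returns -1 if none.
    public static int LazyAdvanceTo(ReadOnlySpan<char> input, int start, ReadOnlySpan<char> set)
    {
        int i = input.Slice(start).IndexOfAny(set);
        return i < 0 ? -1 : start + i;
    }
}
```

For the input `"xxxKyyk"` and the set `"kK"` (the kind of set an IgnoreCase `k` could turn into), the lazy search starting at 0 lands on index 3, while the greedy search over the whole input lands on index 6 and, if the rest of the expression fails there, on index 3 when retried with `loopEnd = 6`.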

* Fix race conditions in SystemEvents shutdown logic (#62773)

* Fix race conditions in SystemEvents shutdown logic

When the application is terminated through Restart Manager the event broadcasting window will get the `WM_CLOSE` message. The message gets handled by passing it to `DefWndProc` which calls `DestroyWindow` on the window itself thus making the window handle invalid. The `Shutdown` method expects the window handle to be valid to post `WM_QUIT` message to terminate the thread running the message loop but that's no longer possible under these conditions.

Additionally, there's a second race condition with the `s_eventThreadTerminated` event, which is created during shutdown and set conditionally. A race between the threads could cause it to be created when the window message thread is already shutting down, so it would never be set. Waiting for it in the `Shutdown` method would then cause a deadlock. The event is also completely unnecessary, since a `Join` is performed on the thread itself.

The fix has several changes that act together:
- `s_eventThreadTerminated` event is removed completely in favor of only relying on `Thread.Join`
- `WM_DESTROY` message is detected (which happens as a result of WM_CLOSE calling `DefWndProc` which in turn calls `DestroyWindow`) and handled by shutting down the message loop thread
- The message loop itself is rewritten to use a standard `GetMessageW` loop. The reasoning for not using it no longer seems valid, since AppDomain shutdowns are performed differently
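
The shutdown pattern in the list above can be sketched in a platform-neutral way. This is only an analogy, not the SystemEvents implementation: the Win32 message queue is replaced by a `BlockingCollection<int>`, `QuitMessage` is a hypothetical stand-in for `WM_QUIT`, and all type and member names are made up for illustration. The point it shows is that the thread's own exit, observed through `Thread.Join`, is the only completion signal, so there is no separate event that could be created too late and never set:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical type; an analogy for a message-loop thread that is shut down via Join.
sealed class MessagePumpSketch
{
    private const int QuitMessage = -1; // stand-in for WM_QUIT (assumption)
    private readonly BlockingCollection<int> _queue = new BlockingCollection<int>();
    private readonly Thread _thread;
    private int _handled;

    public MessagePumpSketch()
    {
        _thread = new Thread(Run) { IsBackground = true };
        _thread.Start();
    }

    // Number of non-quit messages processed so far.
    public int Handled => Volatile.Read(ref _handled);

    public void Post(int message) => _queue.Add(message);

    private void Run()
    {
        // Blocking "get message" loop: waits for the next message and
        // leaves the loop when the quit message arrives.
        foreach (int message in _queue.GetConsumingEnumerable())
        {
            if (message == QuitMessage)
            {
                break;
            }
            Interlocked.Increment(ref _handled);
        }
    }

    public void Shutdown()
    {
        Post(QuitMessage);
        // The thread exiting is the completion signal; no extra event is needed.
        _thread.Join();
    }
}
```

Because the queue is first-in, first-out, every message posted before `Shutdown` is handled before the loop exits, and `Join` returns only after that.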

* Add unit test.

* Add braces

* Add marshaller for TypeLoad failure cases (#64317)

This marshaller is used when a marshaller is configured incorrectly, mostly on fields.

* Add additional loop table asserts (#64126)

1. Assert that top-level loops are basic block disjoint
2. Assert LPFLG_ITER related flags are legal

In addition:
1. Create a `optClearLoopIterInfo` phase to clear various bits in the loop
table that are known to no longer be valid, to prevent bad asserts or JitDump
output on their values.
2. Move the EndPhase call in Phase::PostPhase so that it happens early, not late.
This causes any subsequent asserts due to post-phase checking to be
marked with the correct phase, in cases where there was a nested phase
executed (such as liveness re-computation).
3. Convert PHASE_INSERT_GC_POLLS to use EndPhase checking
4. Convert fgDetermineFirstCodeBlock to return a PhaseStatus
5. Some minor cleanup in optUpdateLoopsBeforeRemoveBlock()
(this was extracted from some bigger changes)

* Moved AssemblyName helpers to managed (part 2) (#63915)

* implement GetAssemblyName via dynamic call to MetadataReader

* A few more file-locking tests.

* fix #28153

* no need for version when getting MetadataReader

* rename the argument to match AssemblyName

* perf tweaks

* use memory-mapped file to read metadata

* adjust tests for the new implementation

* use "bufferSize: 1" when stream is going to be mapped.

* null-conditional operator.

* do Dispose before re-throwing

* get rid of the platform-specific/native stuff

* remove assemblyname.hpp

* remove `VerifyIsAssembly()`

* PR feedback

* put back gStdMngIEnumerableFuncs and the others

* Fix several bugs in NullabilityInfoContext. (#64143)

* Fix several bugs in NullabilityInfoContext.

* Reverse ASG(CLS_VAR, ...) (#63957)

This helps with register allocation. Consider:
```
***** BB01
STMT00001 ( 0x000[E-] ... ??? )
N003 ( 18, 10) [000003] -ACXG-------              *  ASG       ref    $c0
N001 (  3,  4) [000002] ----G--N----              +--*  CLS_VAR   ref    Hnd=0x8fec230 Fseq[hackishFieldName]
N002 ( 14,  5) [000000] --CXG-------              \--*  CALL      ref    CscBench.GetMscorlibPathCore $c0
```
The rationalizer will rewrite it to what is effectively:
```
***** BB01
STMT00001 ( 0x000[E-] ... ??? )
N004 ( 18, 12) [000003] -ACXG---R---              *  ASG       ref
N003 (  3,  6) [000002] n---G--N----              +--*  IND       ref
N002 (  1,  4) [000006] H-----------              |  \--*  CLS_VAR_ADDR byref  Hnd=0x8fec230
N001 ( 14,  5) [000000] --CXG-------              \--*  CALL      ref    CscBench.GetMscorlibPathCore
```
And the final LIR will look like:
```
               [000006] ------------                 IL_OFFSET void   INLRT @ 0x000[E-]
N001 (  3,  4) [000002] ----G--N----         t2 =    CLS_VAR_ADDR byref  Hnd=0x8fec230
N002 ( 14,  5) [000000] --CXG-------         t0 =    CALL      ref    CscBench.GetMscorlibPathCore $c0
                                                  /--*  t2     byref
                                                  +--*  t0     ref
N003 ( 18, 10) [000003] -A-XG-------              *  STOREIND  ref
               [000007] ------------                 IL_OFFSET void   INLRT @ 0x00A[E-]
N001 (  0,  0) [000004] ------------                 RETURN    void   $180
```
Since this store must use a barrier, `CLS_VAR_ADDR` won't be contained and will have to be evaluated
separately. Because its value is live across a call, it'll get spilled and reloaded. Reversing the ASG
fixes the problem:
```
------------ BB01 [000..00B) (return), preds={} succs={}
               [000006] ------------                 IL_OFFSET void   INLRT @ 0x000[E-]
N001 ( 14,  5) [000000] --CXG-------         t0 =    CALL      ref    CscBench.GetMscorlibPathCore $c0
N002 (  3,  4) [000002] ----G--N----         t2 =    CLS_VAR_ADDR byref  Hnd=0x8fec230
                                                  /--*  t2     byref
                                                  +--*  t0     ref
N003 ( 18, 10) [000003] -A-XG-------              *  STOREIND  ref
               [000007] ------------                 IL_OFFSET void   INLRT @ 0x00A[E-]
N001 (  0,  0) [000004] ------------                 RETURN    void   $180
```

* Fixes a few issues for dprintf on OSX (#64076)

* [Codespaces] Make it possible to run wasm samples in the browser (#64277)

With these changes, running the following in the Codespace will open the local browser to a page served from the codespace hosting the WASM sample:

```console
cd src/mono/sample/wasm/browser
make
make run-browser
```


* Set EMSDK_PATH in .devcontainer.json

   We provision Emscripten as part of the devcontainer prebuild.

   Set EMSDK_PATH to allow rebuilding the wasm runtime to work without any additional ceremony

* Install dotnet-serve into .dotnet-tools-global

* [wasm] Don't try to open browser if running in Codespaces

* .devcontainer: add global tools dir to PATH

* .devcontainer: forward port 8000

   This enables running the mono wasm samples in the local browser:

* [wasm] samples: also check for dotnet-serve on the path

   On Codespaces we install dotnet-se…
@ghost ghost locked as resolved and limited conversation to collaborators Feb 19, 2022
Development

Successfully merging this pull request may close these issues.

Consider using SortedList instead of Dictionary in HttpHeaders
10 participants