Migrate coreclr's worker thread pool to be able to use the portable thread pool in opt-in fashion #38225

Merged: 48 commits merged into dotnet:master on Oct 21, 2020

@kouvel kouvel commented Jun 22, 2020

  • Enables using the portable thread pool with coreclr as opt-in. The change is off by default for now and can be enabled with COMPlus_ThreadPool_UsePortableThreadPool=1. Once it has had bake time and is seen to be stable, the config flag would ideally be removed at a reasonable time in the future, and the relevant parts of the native implementation deleted.
  • The IO thread pool is not being migrated in this change, and remains on the native side
  • My goal was to get compatible behavior, compatibility with diagnostics tools, and perf similar to the native implementation in coreclr. I tried to avoid changing scheduling behavior, behavior of heuristics, etc., compared with that implementation.
  • The eventual goal is to have one mostly managed thread pool implementation that can be shared between runtimes, to ease maintenance going forward

Commit descriptions:

  • "Add dependencies"
    • Ported LowLevelLock from CoreRT, and moved LowLevelSpinWaiter to shared. Since we support Thread.Interrupt(), they were necessary in the wait subsystem in CoreRT partly to support that, and were also used in the portable thread pool implementation where a pending thread interrupt on a thread pool thread would otherwise crash the process. Interruptible waits are already used in the managed side of the thread pool in the queue implementations. It may be reasonable to ignore the thread interrupt problem and suggest that it not be used on thread pool threads, but for now I just brought in the dependencies to keep behavior consistent with the native implementation.
  • "Add config var"
    • Added config var COMPlus_ThreadPool_UsePortableThreadPool (disabled by default for now)
    • Flowed the new config var to the managed side and set up a mechanism to flow all of the thread pool config vars
    • Removed debug-only config var COMPlus_ThreadpoolTickCountAdjustment, which didn't seem to be too useful
    • Specialized native and managed thread pool paths based on the config var. Added assertions to paths that should not be reached depending on the config var.
  • "Move portable RegisteredWaitHandle implementation to shared ThreadPool.cs"
    • Just moved the portable implementation, no functional changes. In preparation for merging the two implementations.
  • "Merge RegisteredWaitHandle implementations"
    • Merged implementations of RegisteredWaitHandle using the portable version as the primary and specializing small parts of it for coreclr
    • Fixed PortableThreadPool's registered waits to track SafeWaitHandles instead of WaitHandles similarly to the native implementation. The SafeWaitHandle in a WaitHandle can be modified, so it is retrieved once and reused thereafter. Also added/removed refs for the SafeWaitHandles that are registered.
  • "Separate portable-only portion of RegisteredWaitHandle"
    • Separated RegisteredWaitHandle.UnregisterPortable into a different file, no functional changes. Those paths reference PortableThreadPool, which is conditionally included unlike ThreadPool.cs. Just for consistency such that the new file can be conditionally included similarly to PortableThreadPool.
  • "Fix timers, tiered compilation, introduced time-sensitive work item queue to simulate coreclr behavior"
    • Wired work items queued from the native side (appdomain timer callback, tiered compilation background work callback) to queue them into the managed side
    • The timer thread calls into managed code to queue the callback
    • Some tiered compilation work item queuing paths cannot call managed code, so used a timer with zero due time instead
    • Added a queue of "time-sensitive" work items to the managed side to mimic how work items queued from the native side ran previously. In particular, when using the native thread pool, if the global queue is backed up, the native work items still run ahead of the queued items periodically (based on the Dispatch quantum). They could potentially be queued into the global queue instead, but if that queue is backed up, doing so could significantly and perhaps artificially delay the appdomain timer callback and the tiering background jitting. I didn't want to change the behavior in an observable (and potentially bad) way here for now. A good time to revisit this would be when IO completion handling is added to the portable thread pool; the native work items could then be handled somewhat similarly.
  • "Implement ResetThreadPoolThread, set thread names for diagnostics"
    • Aside from that, setting the thread names (at OS level) allows debuggers to identify the threads better as before. For threads that may run user code, the thread Name property is kept as null as before, such that it may be set without exception.
  • "Cache-line-separate PortableThreadPool._numRequestedWorkers similarly to coreclr"
    • Was missed before, separated it for consistency
  • "Post wait completions to the IO completion port on Windows for coreclr, similarly to before"
    • On Windows, wait completions are queued to the IO thread pool, which is still implemented on the native side. On Unixes, they are queued to the global queue.
  • "Reroute managed gate thread into unmanaged side to perform gate activites, don't use unmanaged gate thread"
    • When the config var is enabled, removed the gate thread from the native side. Instead, the gate thread on the managed side calls into the native side to perform gate activities for the IO completion thread pool, and returns a value to indicate whether the gate thread is still necessary.
    • Also added a native-to-managed entry point to request the gate thread to run for the IO completion thread pool
  • "Flow config values from CoreCLR to the portable thread pool for compat"
    • Flowed the rest of the thread pool config vars to the managed side, such that COMPlus variables continue to work with the portable thread pool
    • Config var values are stored in AppContext, made the names consistent for supported and unsupported values
  • "Port - ..." * 3
    • Ported a few fixes that did not make it into the portable thread pool implementation
  • "Fix ETW events"
    • Fixed the EventSource used by the portable thread pool, added missing events
    • For now, the event source uses the same name and GUID as the native side. It seems to work for now for ETW, we may switch to a separate provider (along with updating tools) before enabling the portable thread pool by default.
    • For enqueue/dequeue events, changed to use the object's hash code as the work item identifier instead of the pointer since the pointer may change between enqueue and dequeue
  • "Fix perf of counts structs"
    • Structs used for multiple counts with interlocked operations were implemented with explicit struct layout and field offsets. The JIT seems to generate stack-based code for such structs and it was showing up as higher overhead in perf profiles compared to the equivalent native implementation. Slower code in compare-exchange loops can cause a larger gap of time between the read and the compare-exchange, which can also cause higher contention.
    • Changed the structs to use manual bit manipulation instead, and microoptimized some paths. The code is still not as good as that generated by C++, but it seems to perform similarly based on perf profiles.
    • Code size also improved in many cases, for example one of the larger differences was in MaybeAddWorkingWorker(), which decreased from 585 bytes to 382 bytes and with far fewer stack memory operations
  • "Fix perf of dispatch loop"
    • Just some minor tweaks as I was looking at perf profiles and code of Dispatch()
  • "Fix perf of ThreadInt64PersistentCounter"
    • The implementation used to count completed work items was using ThreadLocal<T>, which turned out to be too slow for that purpose according to perf profiles
    • Changed it to rely on the user of the component to provide an object that tracks the count, which the user of the component would obtain from a ThreadStatic field
    • Also removed the thread-local lookup per iteration in one of the hot paths in Dispatch() and improved inlining
  • "Miscellaneous perf fixes"
    • A few more small tweaks as I was looking at perf profiles and code
    • In ConcurrentQueue, added check for empty into the fast path
    • For the portable thread pool, updated to trigger the worker thread Wait event after the short spin-wait completes and before actually waiting; the event is otherwise too verbose when profiling and changes performance characteristics
    • Cache-line-separated the gate thread running state as is done in the native implementation
    • Accessing PortableThreadPool.ThreadPoolInstance multiple times was generating less than ideal code that was noticeable in perf profiles. Tried to avoid it especially in hot paths, and in some cases where unnecessary for consistency if nothing else.
    • Removed an extra call to Environment.TickCount in Dispatch() per iteration
    • Noticed that a field that was intended to be cache-line-separated was not actually being separated; see #38215 ("ThreadPoolWorkQueue.numOutstandingThreadRequests is not being padded as requested, despite the explicit sequential layout"). Fixed.
  • "Fix starvation heuristic"
    • Described in comment
  • "Implement worker tracking"
    • Implemented the equivalent in the portable thread pool along with raising the relevant event
  • "Use smaller stack size for threads that don't run user code"
    • Using the same stack size as in the native side for those threads
  • "Note some SOS dependencies, small fixes in hill climbing to make equivalent to coreclr"
  • "Port some tests from CoreRT"
    • Also improved some of the tests
  • "Fail-fast in thread pool native entry points specific to thread pool implementations based on config"
    • Scanned all of the managed-to-native entry points from the thread pool and thread-entry functions, and promoted some assertions to be verified in all builds with fail-fast. May help to know in release builds when a path that should not be taken is taken and to avoid running further along that path.
  • "Fix SetMinThreads() and SetMaxThreads() to return true only when both changes are successful with synchronization"
    • These are a bit awkward when the portable thread pool is enabled, because the worker thread pool is on the managed side and the IO thread pool is on the native side, yet the methods should return true only when both changes are valid and otherwise return false without making any changes
    • Added some managed-to-native entry points to allow checking validity before making the changes, all under a lock taken by the managed side
  • "Fix registered wait removals for fairness since there can be duplicate system wait objects in the wait array"
    • Described in comment
  • "Allow multiple DotNETRuntime event providers/sources in EventPipe"
    • Temporary change to EventPipe to be able to get events from dotnet-trace
    • For now, the event source uses the same name and GUID as the native side. It seems to work for now for ETW, and with this change it seems to work with EventPipe for getting events. Subscribing to the NativeRuntimeEventSource does not get thread pool events yet, that is left for later. We may switch to a separate provider (along with updating tools) before enabling the portable thread pool by default, as a long-term solution.
  • "Fix registered wait handle timeout logic in the wait thread"
    • The timeout logic was comparing against how long the last wait took and was not timing out waits sometimes, fixed to consider the total time since the last reset of timeout instead
  • "Fix Browser build"
    • Updated the Browser-specific thread pool variant based on the other changes
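
To illustrate the "Fix perf of counts structs" item above, here is a minimal sketch of packing several counts into one integer with manual bit manipulation and updating them in a compare-exchange loop. The type names, field names, and 16-bit layout are assumptions made for illustration; they are not the actual PortableThreadPool layout.

```csharp
using System.Threading;

// Hypothetical illustration: two 16-bit counts packed into one 32-bit value.
// Names and bit positions are assumptions, not the real thread pool layout.
public struct PackedCounts
{
    private const int ProcessingShift = 0;
    private const int ExistingShift = 16;
    private const uint FieldMask = 0xFFFF;

    private uint _data;

    public PackedCounts(uint data) => _data = data;

    public uint Data => _data;

    public int NumProcessing
    {
        get => (int)((_data >> ProcessingShift) & FieldMask);
        set => _data = (_data & ~(FieldMask << ProcessingShift)) | (((uint)value & FieldMask) << ProcessingShift);
    }

    public int NumExisting
    {
        get => (int)((_data >> ExistingShift) & FieldMask);
        set => _data = (_data & ~(FieldMask << ExistingShift)) | (((uint)value & FieldMask) << ExistingShift);
    }
}

public static class CountsDemo
{
    // Stored as int so that Interlocked.CompareExchange(ref int, ...) can be used.
    private static int s_counts;

    public static int NumProcessing =>
        new PackedCounts((uint)Volatile.Read(ref s_counts)).NumProcessing;

    // Increments the processing count unless it has reached maxProcessing.
    public static bool TryAddProcessingWorker(int maxProcessing)
    {
        while (true)
        {
            int observed = Volatile.Read(ref s_counts);
            var counts = new PackedCounts((uint)observed);
            if (counts.NumProcessing >= maxProcessing)
                return false;

            counts.NumProcessing = counts.NumProcessing + 1;

            // Publish only if no other thread changed the value since the read.
            if (Interlocked.CompareExchange(ref s_counts, (int)counts.Data, observed) == observed)
                return true;
        }
    }
}
```

The work between the read and the compare-exchange is kept short on purpose: as the description notes, slower code in that window widens the gap in which another thread can change the value, which raises contention.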

Corresponding PR to update SOS: dotnet/diagnostics#1274
Fixes #32020
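
The "Fix registered wait handle timeout logic in the wait thread" item can be sketched roughly as follows. This is a simplified model under stated assumptions: `WaitTimeoutTracker` and its members are hypothetical names, and the "last wait only" check is my reading of the described bug, not the original code.

```csharp
using System;

// Hypothetical model of a registered wait's timeout bookkeeping.
public sealed class WaitTimeoutTracker
{
    private readonly int _timeoutMs;
    private int _timeoutStartMs; // tick count at the last reset of the timeout

    public WaitTimeoutTracker(int timeoutMs, int nowMs)
    {
        _timeoutMs = timeoutMs;
        _timeoutStartMs = nowMs;
    }

    // Called when the timeout should start over, for example after the wait is signaled.
    public void Reset(int nowMs) => _timeoutStartMs = nowMs;

    // Assumed shape of the bug: only the duration of the most recent wait is considered,
    // so a wait thread that wakes up early for other handles may never see a timeout.
    public bool HasTimedOutLastWaitOnly(int lastWaitDurationMs) => lastWaitDurationMs >= _timeoutMs;

    // Fixed shape: compare against the total time since the last reset.
    // Unchecked subtraction tolerates tick count wraparound.
    public bool HasTimedOut(int nowMs) => unchecked(nowMs - _timeoutStartMs) >= _timeoutMs;

    // How long the wait thread may still block for this registration.
    public int RemainingMs(int nowMs) => Math.Max(0, _timeoutMs - unchecked(nowMs - _timeoutStartMs));
}
```

For example, with a 100 ms timeout, a wait thread that wakes at 60 ms for another handle and again at 120 ms has two waits of 60 ms each. Neither wait alone reaches the timeout, but the total time since the reset does.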

@kouvel kouvel added NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) area-System.Threading labels Jun 22, 2020
@kouvel kouvel added this to the 5.0.0 milestone Jun 22, 2020
@kouvel kouvel self-assigned this Jun 22, 2020
@kouvel kouvel added NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) and removed NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) labels Jun 22, 2020

kouvel commented Jun 22, 2020

Corresponding PR to update SOS: dotnet/diagnostics#1274

Change is ready for review, but flagging as NO-MERGE for now, until both PRs are ready.


kouvel commented Jun 22, 2020

Perf data and testing info to follow


kouvel commented Jun 22, 2020

ASP.NET RPS perf results

| Benchmark | Machine | OS | Connections | Clr before | Clr after | Diff | Clr after with PTP | Diff from before | Mono JIT before | Mono JIT after | Diff |
|---|---|---|---|---|---|---|---|---|---|---|---|
| PlaintextPlatform | 28-proc x64 | Linux | 512 | 14467899 | 14456871 | -0.1% | 14819954 | 2.4% | 9626436 | 9679046 | 0.5% |
| Plaintext | 28-proc x64 | Linux | 512 | 5265058 | 5286626 | 0.4% | 5344409 | 1.5% | 2855862 | 2890546 | 1.2% |
| JsonPlatform | 28-proc x64 | Linux | 512 | 1164882 | 1169727 | 0.4% | 1209226 | 3.8% | 830938 | 842005 | 1.3% |
| Json | 28-proc x64 | Linux | 512 | 954094 | 958050 | 0.4% | 987212 | 3.5% | 619826 | 624145 | 0.7% |

No change when portable thread pool is disabled, some small improvements when enabled (some are within the error margin). I wasn't seeing regression or improvement before, but many things changed and hopefully it'll stay.

FortunesPlatform/Fortunes are currently not working. On Windows I was seeing large swings in the numbers without full CPU usage, and the results were very different from Linux; I will try later on a different machine. I wasn't able to test on arm64, as updating the build with my locally cross-built native binaries doesn't seem to work (even without any changes). I can verify after it's merged and an SDK with the change is produced.

@kouvel kouvel changed the title [WIP] Migrate coreclr's worker thread pool to be able to use the portable thread pool in opt-in fashion Migrate coreclr's worker thread pool to be able to use the portable thread pool in opt-in fashion Jun 22, 2020

kouvel commented Jun 22, 2020

Microbenchmark perf. The benchmark measures throughput with a very short CPU-intensive delay in each work item, trying to measure mostly the overhead of the thread pool for very short work items, in sustained fashion or in bursty fashion. The length of the burst is the number multiplied by proc count. The benchmark is mostly useful for finding larger regressions; sometimes even large differences don't translate into real-world impact, especially as work items do more work.

Windows x64 8-proc

| Test | Clr before | Clr after | Diff | Clr after with PTP | Diff from before |
|---|---|---|---|---|---|
| Global sustained | 9334 | 9301 | -0.3% | 9313 | -0.2% |
| Global 1*proc burst | 1181 | 1183 | 0.2% | 1175 | -0.5% |
| Global 4*proc burst | 3363 | 3420 | 1.7% | 3350 | -0.4% |
| Global 16*proc burst | 5809 | 5898 | 1.5% | 5859 | 0.9% |
| Global 64*proc burst | 6973 | 7019 | 0.7% | 6978 | 0.1% |
| Global 256*proc burst | 7148 | 7264 | 1.6% | 7191 | 0.6% |
| Local sustained | 19968 | 20711 | 3.7% | 21423 | 7.3% |
| Local 1*proc burst | 1174 | 1177 | 0.2% | 1168 | -0.5% |
| Local 4*proc burst | 3684 | 3725 | 1.1% | 3767 | 2.2% |
| Local 16*proc burst | 8705 | 8818 | 1.3% | 9036 | 3.8% |
| Local 64*proc burst | 13433 | 13605 | 1.3% | 14184 | 5.6% |
| Local 256*proc burst | 15435 | 15560 | 0.8% | 16188 | 4.9% |

Linux x64 8-proc VM

| Test | Clr before | Clr after | Diff | Clr after with PTP | Diff from before | Mono JIT before | Mono JIT after | Diff | Notes for Mono JIT |
|---|---|---|---|---|---|---|---|---|---|
| Global sustained | 8954 | 8827 | -1.4% | 8747 | -2.3% | 7098 | 7508 | 5.8% | |
| Global 1*proc burst | 121 | 114 | -5.5% | 132 | 9.4% | 35 | 41 | 18.7% | |
| Global 4*proc burst | 1670 | 1696 | 1.5% | 1694 | 1.4% | 104 | 122 | 17.1% | |
| Global 16*proc burst | 4125 | 4111 | -0.3% | 4230 | 2.5% | 3056 | 3195 | 4.6% | |
| Global 64*proc burst | 6005 | 6025 | 0.3% | 6017 | 0.2% | 5054 | 5256 | 4.0% | |
| Global 256*proc burst | 6719 | 6695 | -0.4% | 6765 | 0.7% | 5846 | 6068 | 3.8% | |
| Local sustained | 16499 | 17801 | 7.9% | 19305 | 17.0% | 9328 | 10261 | 10.0% | |
| Local 1*proc burst | 114 | 110 | -4.1% | 125 | 9.5% | 34 | 46 | 36.5% | High error |
| Local 4*proc burst | 1634 | 1698 | 3.9% | 1805 | 10.4% | 153 | 235 | 53.3% | Very high error |
| Local 16*proc burst | 5219 | 5275 | 1.1% | 5518 | 5.7% | 2170 | 3485 | 60.7% | Very high error |
| Local 64*proc burst | 10008 | 10283 | 2.7% | 10845 | 8.4% | 6193 | 6573 | 6.1% | |
| Local 256*proc burst | 12883 | 13395 | 4.0% | 14335 | 11.3% | 7708 | 8362 | 8.5% | |

For Clr results, the tests with the regressions appear to be multimodal; I'm not sure why. I don't think it's significant.

The three Mono tests flagged in the notes column seem to have very high error margins before and after the change, so they can be ignored. I had collected the Mono perf numbers earlier, when my machine was reporting lower numbers on all runtimes; my machine does that sometimes.

Windows arm64 8-proc

| Test | Clr before | Clr after | Diff | Clr after with PTP | Diff from before |
|---|---|---|---|---|---|
| Global sustained | 4600 | 4611 | 0.2% | 4547 | -1.2% |
| Global 1*proc burst | 562 | 575 | 2.3% | 645 | 14.8% |
| Global 4*proc burst | 1458 | 1475 | 1.2% | 1645 | 12.9% |
| Global 16*proc burst | 2684 | 2725 | 1.5% | 2813 | 4.8% |
| Global 64*proc burst | 3298 | 3340 | 1.3% | 3343 | 1.4% |
| Global 256*proc burst | 3468 | 3506 | 1.1% | 3489 | 0.6% |
| Local sustained | 8662 | 8667 | 0.1% | 8858 | 2.3% |
| Local 1*proc burst | 557 | 574 | 2.9% | 644 | 15.7% |
| Local 4*proc burst | 1540 | 1545 | 0.3% | 1792 | 16.4% |
| Local 16*proc burst | 3464 | 3565 | 2.9% | 3748 | 8.2% |
| Local 64*proc burst | 5102 | 5271 | 3.3% | 5234 | 2.6% |
| Local 256*proc burst | 5371 | 5412 | 0.8% | 5509 | 2.6% |

Code:

using System;
using System.Diagnostics;
using System.Linq;
using System.Runtime.CompilerServices;
using System.Threading;

namespace ThreadPoolWorkThroughput
{
    class Program
    {
        private static void Main(string[] args)
        {
            if (args.Length <= 0)
            {
                Console.WriteLine("Usage: ThreadPoolWorkThroughput <global|local> [burstLengthProcCountMultiplier]");
                return;
            }

            bool preferLocal;
            if ("global".Equals(args[0], StringComparison.OrdinalIgnoreCase))
                preferLocal = false;
            else if ("local".Equals(args[0], StringComparison.OrdinalIgnoreCase))
                preferLocal = true;
            else
            {
                Console.WriteLine("Invalid first parameter");
                return;
            }

            int processorCount = Environment.ProcessorCount;
            int burstLength = 0;
            if (args.Length > 1)
            {
                if (int.TryParse(args[1], out int burstLengthProcCountMultiplier) && burstLengthProcCountMultiplier > 0)
                    burstLength = burstLengthProcCountMultiplier * processorCount;
                else
                {
                    Console.WriteLine("Invalid second parameter");
                    return;
                }
            }

            ThreadPoolWorkThroughput(processorCount, preferLocal, burstLength);
        }

        private static void ThreadPoolWorkThroughput(int threadCount, bool preferLocal, int burstLength)
        {
            if (burstLength > 0 && burstLength < threadCount)
                burstLength = threadCount;

            var startTest = new ManualResetEvent(false);
            var threadOperationCounts = new int[(threadCount + 1) * 16];
            var workItemsScheduled = 0;
            var workItemsCompleted = new AutoResetEvent(false);
            ThreadPool.SetMinThreads(threadCount, threadCount);
            ThreadPool.SetMaxThreads(threadCount, threadCount);

            Action<int> workItem = null;
            workItem = toQueue =>
            {
                bool isSustained = burstLength <= 0;
                do
                {
                    if (isSustained)
                        ++toQueue;
                    else if (toQueue <= 0)
                        break;

                    bool localPreferLocal = preferLocal;
                    do
                    {
                        ThreadPool.UnsafeQueueUserWorkItem(workItem, 0, localPreferLocal);
                    } while (--toQueue > 0);
                } while (false);

                var tld = t_data ?? CreateThreadLocalData();
                Delay(tld);
                ++threadOperationCounts[tld.threadArrayIndex];
                if (!isSustained && Interlocked.Decrement(ref workItemsScheduled) == 0)
                    workItemsCompleted.Set();
            };

            var threadReady = new AutoResetEvent(false);
            Thread producerThread;
            if (burstLength <= 0)
            {
                producerThread = new Thread(() =>
                {
                    bool localPreferLocal = preferLocal;
                    int initialWorkItemCount = threadCount * 8;
                    threadReady.Set();
                    startTest.WaitOne();
                    for (int i = 0; i < initialWorkItemCount; ++i)
                        ThreadPool.UnsafeQueueUserWorkItem(workItem, 1, localPreferLocal);
                });
            }
            else
            {
                producerThread = new Thread(() =>
                {
                    var localThreadCount = threadCount;
                    bool localPreferLocal = preferLocal;
                    var localBurstLength = burstLength;
                    threadReady.Set();
                    startTest.WaitOne();
                    while (true)
                    {
                        Interlocked.Exchange(ref workItemsScheduled, localBurstLength);

                        int toQueueTotal = localBurstLength - localThreadCount;
                        int toQueuePerWorkItem = toQueueTotal <= 0 ? 0 : toQueueTotal / localThreadCount;
                        int toQueueExtra = toQueueTotal <= 0 ? 0 : toQueueTotal - toQueuePerWorkItem * localThreadCount;
                        for (int i = 0; i < localThreadCount; ++i)
                        {
                            int toQueue = toQueuePerWorkItem;
                            if (toQueueExtra > 0)
                            {
                                --toQueueExtra;
                                ++toQueue;
                            }
                            ThreadPool.UnsafeQueueUserWorkItem(workItem, toQueue, localPreferLocal);
                        }

                        workItemsCompleted.WaitOne();
                    }
                });
            }
            producerThread.IsBackground = true;
            producerThread.Start();
            threadReady.WaitOne();

            Run(startTest, threadOperationCounts);
        }

        private static void Run(ManualResetEvent startTest, int[] threadOperationCounts)
        {
            var sw = new Stopwatch();
            int threadCount = threadOperationCounts.Length / 16 - 1;
            var afterWarmupOperationCounts = new long[threadCount];
            var operationCounts = new long[threadCount];
            startTest.Set();

            // Warmup

            Thread.Sleep(1000);

            for (int j = 0; j < 4; ++j)
            {
                for (int i = 0; i < threadCount; ++i)
                    afterWarmupOperationCounts[i] = threadOperationCounts[(i + 1) * 16];

                // Measure

                sw.Restart();
                Thread.Sleep(500);
                sw.Stop();

                for (int i = 0; i < threadCount; ++i)
                    operationCounts[i] = threadOperationCounts[(i + 1) * 16];
                for (int i = 0; i < threadCount; ++i)
                    operationCounts[i] -= afterWarmupOperationCounts[i];

                double score = operationCounts.Sum() / sw.Elapsed.TotalMilliseconds;
                Console.WriteLine($"Score: {score,15:0.000000}");
            }
        }

        private sealed class ThreadLocalData
        {
            private static int s_previousThreadArrayIndex;

            public int threadArrayIndex = Interlocked.Increment(ref s_previousThreadArrayIndex) * 16;
            public Random rng = new Random();
            public int delayFibSum;
        }

        [ThreadStatic]
        private static ThreadLocalData t_data;

        [MethodImpl(MethodImplOptions.NoInlining)]
        private static ThreadLocalData CreateThreadLocalData()
        {
            var tld = new ThreadLocalData();
            t_data = tld;
            return tld;
        }

        private static void Delay(ThreadLocalData tld) => tld.delayFibSum += Fib(tld.rng.Next(4, 10));

        [MethodImpl(MethodImplOptions.NoInlining)]
        private static int Fib(int n) => n <= 1 ? n : Fib(n - 2) + Fib(n - 1);
    }
}
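
As a reading aid for the burst mode of the benchmark above, the following sketch isolates how one burst is divided: the producer queues one seed work item per thread, and each seed then queues its share of the remaining items. `BurstSplit` is a hypothetical helper that mirrors the arithmetic in the producer loop; it is not part of the benchmark and assumes a positive burst length.

```csharp
public static class BurstSplit
{
    // Returns, for each of the threadCount seed work items, how many
    // additional work items that seed queues during one burst.
    public static int[] Split(int burstLength, int threadCount)
    {
        // The benchmark raises short bursts to at least one item per thread.
        if (burstLength < threadCount)
            burstLength = threadCount;

        int toQueueTotal = burstLength - threadCount;   // items beyond the seeds
        int toQueuePerWorkItem = toQueueTotal / threadCount;
        int toQueueExtra = toQueueTotal - toQueuePerWorkItem * threadCount;

        var result = new int[threadCount];
        for (int i = 0; i < threadCount; ++i)
        {
            // The first toQueueExtra seeds take one extra item each.
            result[i] = toQueuePerWorkItem + (i < toQueueExtra ? 1 : 0);
        }
        return result;
    }
}
```

On an 8-proc machine with a multiplier of 4, the burst length is 32: 8 seeds, each queuing 3 more items.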


kouvel commented Jun 22, 2020

General testing done:

  • ThreadPool, Timer, and Thread tests
  • Thread pool features - starvation on worker threads, starvation on IO completion threads, hill climbing
  • Anything that was added like each reroute based on config, worker tracking, etc.
  • SOS ThreadPool -ti -wi, VS/WinDbg/lldb thread views
  • PerfView, perfcollect, dotnet-trace for events (currently doesn't seem to be working for mono), thread type identification from events
  • dotnet-counters, profile comparisons with events and some benchmarks to see that thread pool is behaving similarly
    • perfcollect was not showing EventSource events including existing ones, so when the portable thread pool is enabled those would not show up currently
  • CscRoslynSource


kouvel commented Jun 22, 2020

Looks like Mono folks are already added, CC some more people

- For a registered wait that is automatically unregistered (due to `executeOnlyOnce: true`), the registered wait handle gets added to the array of pending removals, and this automatic unregister does not wait for the removal to actually happen
- If shortly after that a user calls `Unregister(null)` on the same registered wait handle, it is supposed to wait for the removal to actually happen, but was not because the handle is already in the array of pending removals
- A `Dispose` on the wait handle shortly after `Unregister` returns would delete the safe handle and `DangerousRelease` upon removal would throw and crash the process
- Fixed by waiting when a registered wait handle is pending removal, regardless of whether the caller of `Unregister` added the handle to the array of pending removals or if it was added by anyone else
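
The sequence of calls described above can be sketched as follows. This is only an illustration of the usage pattern (one-shot registered wait, then `Unregister(null)`, then `Dispose`), using the public `ThreadPool.RegisterWaitForSingleObject` API; the class name and the 30-second safety timeout are assumptions, and the sketch does not attempt to reproduce the timing of the original crash.

```csharp
using System;
using System.Threading;

public static class RegisteredWaitScenario
{
    // Returns true if the one-shot callback ran before cleanup.
    public static bool Run()
    {
        var waitEvent = new AutoResetEvent(false);
        var callbackRan = new ManualResetEventSlim(false);

        // executeOnlyOnce: true means the wait is unregistered automatically
        // after the callback is queued for the first signal or timeout.
        RegisteredWaitHandle registration = ThreadPool.RegisterWaitForSingleObject(
            waitEvent,
            (state, timedOut) => callbackRan.Set(),
            null,
            Timeout.Infinite,
            true);

        waitEvent.Set();
        bool ran = callbackRan.Wait(TimeSpan.FromSeconds(30));

        // An explicit Unregister shortly after the automatic one. Per the fix
        // described above, this waits for a pending removal to complete.
        registration.Unregister(null);

        // Disposing the wait handle right after Unregister returns is the step
        // that previously could race with the pending removal.
        waitEvent.Dispose();
        return ran;
    }
}
```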

kouvel commented Oct 20, 2020

Rebased to fix conflict

@kouvel kouvel closed this Oct 20, 2020
@kouvel kouvel reopened this Oct 20, 2020
@mangod9 mangod9 added this to Epics in Core-Runtime .net 9 Oct 21, 2020
@kouvel kouvel merged commit 2a234f9 into dotnet:master Oct 21, 2020
@kouvel kouvel deleted the ThreadPool branch October 21, 2020 17:18
@mangod9 mangod9 moved this from Epics to Complete in Core-Runtime .net 9 Oct 21, 2020
kouvel added a commit to dotnet/diagnostics that referenced this pull request Oct 25, 2020
…led (#1274)

Related to and depends on dotnet/runtime#38225

When the managed portable thread pool is enabled:
- Made the `ThreadPool` command work as expected, including `ThreadPool -ti` to show in-memory hill climbing thread adjustment history
- After the command queries the native side for info, it looks for a couple of static variables to determine if the portable thread pool is available and enabled
- If it's enabled, it collects equivalent information from the managed side
- Verified that the command works with and without the changes in dotnet/runtime#38225, and with the changes in both modes (portable thread pool enabled and disabled)
layomia added a commit to layomia/dotnet_runtime that referenced this pull request Nov 10, 2020
* Arm32 Crossgen2 initial support (#43243)

- Fix type layout bugs
  - Sequential or Explicit layout classes without explicit field offsets on arm32 should align their fields based on the start of the field list of the object
  - The field base offset used for R2R calculation on Arm32 should respect the RequiresAlign8 flag
  - Computing true for requiresAlign8 in auto field layout should set the alignment of a class to 8 during auto layout
  - If a class derives from a type which requires 8 byte alignment, set the derived class to require higher alignment
- Align the EH info table on 4 byte boundaries
- Set the thumb bit on the arm32 personality routine RVA in XData
- Enable Crossgen2 smoke test for arm
- Adjust architecture specific type layout tests to match CoreCLR behavior
- Fix alignment of Export functions within PE file

* Remove unsafe code from System.Web.HttpUtility (#43422)

* Fix the android cmake build. (#43421)

* [browser][crypto] Remove restraining not supported attribute Primitives (#43387)

* [browser][crypto] Remove restraining not supported attribute

- The modules included within the System.Security.Cryptography.Primitives module should still be available for use outside of browser os.

* Address review comments.  Remove `<IncludePlatformAttributes>` attribute as well

* Fix StaticTestGenerator (#43432)

It's rotted a bit.

* Clean up DependencyModel Json read/write (#43376)

* Clean up DependencyContextWriter

- Remove UnifiedJsonWriter
- Remove ArrayBufferWriter and write to the Stream directly

* Clean up DependencyContextJsonReader

- Remove UnifiedJsonReader
- Move any reader logic to extension methods

* Add MetadataToken getter override to builder classes (#43330)

* [mono] Include hostpolicy/hostfxr in mono desktop runtime packs (#42729)

Currently mono desktop runtime packs don't include `libhostfxr.*` and `libhostpolicy*` libs needed for corehost in self contained mode.

Co-authored-by: Alexander Köplinger <alex.koeplinger@outlook.com>

* Remove some unsafe code from Console (#43368)

* Remove some unsafe code from Console

* preserve original byte-by-byte decoding

* remove unsafe declaration where not needed

Co-authored-by: Stephen Toub <stoub@microsoft.com>

* Bind byte array from base64 config value (#43150)

* Bind byte array from base64 config value

* Add test case for failure

* Add test for null case

* Remove unnecessary null check

Co-authored-by: Thomas Levesque <thomaslevesque@users.noreply.github.com>

* Add w32subset.h into monoutils_sources. (#43435)

Co-authored-by: lateralusX <lateralusX@users.noreply.github.com>

* [runtime] Add an options API. (#32595)

* [runtime] Add an options API.

Add a general options API to the runtime, based on the flags API in Google V8:

```https://chromium.googlesource.com/v8/v8.git/+/refs/heads/master/src/flags/```

Supported features:
* Definition of runtime options in a declarative way.
* Options are mapped to C globals.
* BOOL/INT/STRING data types.
* Generic option parsing code.
* Generic usage code.
* Read-only flags for build-time optimization.

This is designed to replace the many option parsing functions in
the runtime, MONO_DEBUG, the many mono_set_... functions etc.

* Fix the build.

Co-authored-by: vargaz <vargaz@users.noreply.github.com>
Co-authored-by: Zoltan Varga <vargaz@gmail.com>

* Revert "add better handling of SECBUFFER_EXTRA during TLS handshake on Windows (#42427)" (#43442)

This reverts commit 51f6b8bd3a2a38c432b1cd1f7c465c256f5f699c.

* Fix runincontext testing (#43446)

After some recent shuffles of tests location, running the coreclr tests
with runincontext option stopped working. This change fixes it by fixing
the script path in the run.py.

* Delete CoreDllMain, remove DLL_THREAD_DETACH from EEDllMain, just rely on thread local destructor (#43423)

* [debugger][wasm] Implement Debugger.IsAttached on wasm (#42532)

* Debugger.IsAttached is now working on wasm and can be used to detect whether a debugger is attached.
Fix #42411

* Update src/mono/wasm/runtime/library_mono.js

Co-authored-by: Ryan Lucia <ryan@luciaonline.net>

* Using the infrastructure to not send dynamically loaded assemblies if debugger is not attached.
Changing where to check if the assembly is already added to avoid unnecessary checks.

* Checking the assembly name size.

Co-authored-by: Ryan Lucia <ryan@luciaonline.net>

* Add a makefile sample to run test-browser (#43382)

* Add back nightly build table (#43392)

* Update README table generator.
* Link README to new table location.
* Add 6.0 coreclr runtime links for table generation.
* Add generated table to the dogfooding page.
* Update some stale references to sleet feeds, older versions of the runtime, and deprecated packages.
* Add subset for table generation and reorder table to frontload OS groups.

Co-authored-by: Jan Kotas <jkotas@microsoft.com>

* Delete superfluous suffix

* JIT: ensure bbflags get treated as 64 bit literals (#43451)

Expressions like `~(BBF_KEEP_BBJ_ALWAYS)` were being evaluated as 32 bit signed
quantities, leading to mask value `00000000_7FFFFFFF` instead of the desired
`FFFFFFFF_7FFFFFFF`, causing inadvertent clearing of flags with higher value.

SPMI diffs showed the only flag loss that impacted codegen was `BBF_HAS_CALL`,
which feeds into the CSE heuristics. So no known correctness issue, but it is
certainly possible to also lose `BBF_DOMINATED_BY_EXCEPTIONAL_ENTRY` or
`BBF_HAS_SUPPRESSGC_CALL` and that may be more serious.
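A minimal sketch of the literal-width problem described above. The flag names and values are hypothetical, not the JIT's actual definitions. When a flag is defined with a 32-bit literal, its complement is computed at 32 bits and then widened, so the upper half of the mask is zero; a 64-bit literal keeps the upper bits set.

```cpp
#include <cstdint>

// Hypothetical flags for illustration; not the JIT's real definitions.
constexpr unsigned int  FLAG_BIT31_NARROW = 0x80000000;        // 32-bit literal
constexpr std::uint64_t FLAG_BIT31_WIDE   = 0x80000000ull;     // 64-bit literal
constexpr std::uint64_t FLAG_BIT40        = 0x10000000000ull;  // a flag above bit 31

// The complement is computed at 32 bits and then zero-extended, so the mask
// is 0x00000000_7FFFFFFF and every flag above bit 31 is cleared as well.
constexpr std::uint64_t clearNarrow(std::uint64_t flags)
{
    return flags & ~FLAG_BIT31_NARROW;
}

// The complement is computed at 64 bits, so the mask is 0xFFFFFFFF_7FFFFFFF
// and only bit 31 is cleared.
constexpr std::uint64_t clearWide(std::uint64_t flags)
{
    return flags & ~FLAG_BIT31_WIDE;
}
```

With both `FLAG_BIT40` and bit 31 set, `clearNarrow` returns zero, while `clearWide` leaves `FLAG_BIT40` intact.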

* Fix two tests for runincontext testing (#43457)

* Disable two tests for runincontext testing

These tests are not compatible with running using the runincontext tool.

* Reflect PR feedback - make the test work instead

* Remove unused Unix PKCS12 shims

Because the PKCS#12/PFX import is now done with managed code, the PKCS12 shim was dead code.

* Modify System.Net.HttpListener to throw PNSE at assembly level on browser (#43401)

* JIT: some small profile related fixes (#43408)

1. If we're inheriting a fraction of the profile weight of a profiled block,
mark the inheriting block as profiled. This prevents methods like
`optSetBlockWeights` or `optMarkLoopBlocks` from coming along later and setting
the weights to something else. Since the full inheritance method has similar
logic, make it delegate to the fractional one, with a scale of 100 (no scaling).

2. If we switch from Tier0 to FullOpt, make sure to clear the BBINSTR flag,
else we'll put probes into optimized code.

3. Dump edge weights in the dot graph, if we have them.

4. Only dump the flow graph twice per phase.
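Item 1 can be sketched roughly as follows. This is a simplified model with hypothetical type and function names, not the JIT's actual code: the fractional helper carries the "profiled" mark over to the inheriting block, and full inheritance delegates to it with a scale of 100.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for a JIT basic block.
struct Block
{
    std::uint64_t weight;
    bool          hasProfileWeight;
};

// Inherit `percent` of the source block's weight. If the source weight came
// from profile data, the inheriting block is marked as profiled too, so that
// later heuristics can tell they should not overwrite it.
constexpr Block inheritWeightPercentage(Block source, unsigned percent)
{
    return Block{source.weight * percent / 100, source.hasProfileWeight};
}

// Full inheritance delegates to the fractional version (100 = no scaling).
constexpr Block inheritWeight(Block source)
{
    return inheritWeightPercentage(source, 100);
}
```

Routing both paths through one helper keeps the "mark as profiled" rule in a single place.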

* adjust renegotiation tests to match product change (#43123)

* adjust renegotiation tests to match product change

* add assert for validationCount

* Add issues.targets exclusions for Crossgen2 Pri 1 Tests (#43473)

This baselines the remaining test failures with GH issues to track for further investigation.

* Fix new warnings (#43097)

* Fix CA1416 warnings in runtime repo

* Port ACL OpenExisting overloads for EventWaitHandle/Mutex/Semaphore (#43134)

* Add methods to ref file.

* Add empty methods to src files.

* Add the .NET Framework version of these methods for .NET Standard.

* Move OpenExistingResult enum to Common and consume it where needed.

* Remove Unix comment in Windows-only file.

* Document OpenExistingResult enum.

* Make out result parameters nullable.

* Add exception resource string.

* Implement EventWaitHandleAcl methods.

* Implement MutexAcl methods.

* Implement SemaphoreAcl methods.

* Remove unnecessary check for null or empty name.

* Document the EventWaitHandleAcl methods.

* Document the MutexAcl methods.

* Document the SemaphoreAcl methods.

* Add negative enum check. Fix incorrect cast.

* Add NotNullWhen attribute to TryOpenExisting out parameter. Adjust docs.

* Add EventWaitHandle basic unit tests.

* Add Semaphore basic unit tests

* Add Mutex basic unit tests.

* Add Mutex exception handling unit tests.

* Add EventWaitHandle exception handling unit tests.

* Add Semaphore exception handling unit tests.

* EventWaitHandle and Mutex throw DirectoryNotFoundException when PathNotFound. Adjust documentation.

* Nullability in ref file.

* Do not check for rights out of range value, let Windows handle it. Adjust unit tests accordingly.

* Spacing.

* Remove enum range test. Add PathNotFound tests.

Co-authored-by: Carlos Sanchez Lopez <carlossanlop@users.noreply.github.com>

* Fix xunit analyzers to run on library test projects (#43459)

We don't want most analyzers running over our test code currently (some rules could be enabled with varying degrees of effort), but we do want the xunit analyzers running, and they haven't been.  Fix that by creating a new ruleset specific to library tests, and switching over to use it when building library test projects.

* Remove some unnecessary unsafe usage (#43430)

* Fix nullable warnings in struct constructors (#43472)

* Add additional URI schemes (WIP) (#43375)

* Remove unused Common Extensions code (#43452)

* Make more suitable SPC instance methods static (#43280)

Co-authored-by: Jan Kotas <jkotas@microsoft.com>

* Enable Mono substitutions and attributes exclusion for mobile (#43507)

* [mono] Ensure MonoAssemblyName is in sync between managed and native (#43536)

We also no longer appear to need the NETCORE or DISABLE_REMOTING defines in msbuild, so remove them

* Port changes from dotnet/runtimelab (#43496)

- Fix build errors in System.Globalization.Native with libraries/Native warning level, add System.Globalization.Native back to the libraries/Native build to protect it,
- Misc other changes

* test for 2164 and corert 8246 (#43511)

* Ongoing cmake build work. (#43519)

* Ongoing cmake build work.

* Fix llvm support when cross compiling
* Fix/enable ios support.

* Add support for amd64->arm/arm64 cross builds on CI.

* [master] Update dependencies from dotnet/arcade dotnet/xharness dotnet/llvm-project dotnet/icu mono/linker (#43355)

Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>
Co-authored-by: Viktor Hofer <viktor.hofer@microsoft.com>
Co-authored-by: Jeremy Koritzinsky <jekoritz@microsoft.com>
Co-authored-by: Marek Safar <marek.safar@gmail.com>

* [threads] At shutdown, don't wait for native threads that aren't calling Mono (#43174)

* [test] Call P/Invoke callback delegates from foreign threads

Change post-detach-1.cs to also have versions that call reverse pinvokes from
foreign threads that were not attached to mono.

The foreign threads should not prevent GC and should not prevent Mono from
shutting down using mono_manage_internal.

* [threads] Don't wait for native threads that aren't calling Mono

If a thread is started from native code, and it interacts with the runtime (by
calling a thunk that invokes a managed method), the runtime will attach the
thread - it will create a `MonoInternalThread` and add it to the list of
threads hash (in threads.c).

If the thread returns from the managed call, it will still be recorded by the
runtime, and even though it is no longer running managed code, it will prevent
shutdown.  The problem is that when we try to suspend the thread in order to abort
it, `mono_thread_state_init_from_handle` will see a NULL domain (because
`mono_threads_detach_coop_internal` will restore it to NULL when a managed
method returns back to native code).  (On systems using POSIX signals to
suspend, the same check is in `mono_thread_state_init_from_sigctx`).  As a
result, `mono_threads_suspend_begin_async_suspend` (or `suspend_signal_handler`
on POSIX) will set `suspend_can_continue` to FALSE, and
`mono_thread_info_safe_suspend_and_run` will not run the suspend callback.

As a result, when `mono_manage_internal` calls `abort_threads`, it will add the
thread handle to the wait list, but it will not actually request the thread to
abort.  As a result, after `abort_threads` returns, the subsequent call to
`wait_for_tids` will block until the native thread terminates (at which point
the TLS destructor will notify the thread handle and wait_for_tids will
unblock).

This commit changes the behavior of `abort_threads` to ignore threads that do
not run `async_suspend_critical` and not to add them to the wait list.  As a
result, if a native thread calls into managed and then returns to native code,
the runtime will not wait for it.
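The new rule can be modeled very roughly as below. This is an illustrative sketch only: `ThreadInfo`, `suspendCallbackRan`, and `countThreadsToWaitFor` are hypothetical names, and the real code in threads.c works with thread handles and a wait list rather than a simple count.

```cpp
#include <cstddef>

// Hypothetical, simplified model of a thread as seen at shutdown.
struct ThreadInfo
{
    int  id;
    bool suspendCallbackRan; // false for a native thread that has left managed code
};

// Count the threads that shutdown should wait for: only those for which the
// suspend callback actually ran, and which were therefore asked to abort.
constexpr std::size_t countThreadsToWaitFor(const ThreadInfo* threads, std::size_t count)
{
    std::size_t waiting = 0;
    for (std::size_t i = 0; i < count; ++i)
    {
        if (threads[i].suspendCallbackRan)
        {
            ++waiting;
        }
    }
    return waiting;
}

// Example: thread 2 called into managed code once and then returned to native code.
constexpr ThreadInfo exampleThreads[] = {{1, true}, {2, false}, {3, true}};
```

In this model only threads 1 and 3 are waited for, so a native thread that never calls back into the runtime cannot block shutdown.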

* [threads] Warn if mono_thread_manage_internal can't abort a thread

Give a hint to embedders to aid debugging

* rename AbortThreadData:thread_will_abort field

It's used to keep track of whether the thread will eventually throw a TAE (and
thus that we need to wait for it).

The issue is that under full coop suspend, we treat threads in GC
Safe (BLOCKING) state as if they're suspended and always execute
async_abort_critical.  So the field has nothing to do with whether the thread
was successfully suspended, but rather whether it will (eventually) be aborted.

* [threads] Fix async_abort_critical for full coop

If the foreign external thread doesn't have any managed methods on its
callstack, but it once called a native-to-managed wrapper, it will be left by
mono_threads_detach_coop in GC Safe (BLOCKING) state.  But under full coop, GC
Safe state is considered suspended, so mono_thread_info_safe_suspend_and_run
will run async_abort_critical for the thread.

But the thread may never call into Mono again, in which case it will never
safepoint and acknowledge the abort interruption.  So set thread_will_abort to
FALSE in this case, so that mono_thread_manage_internal won't try to wait for it.

---

Related to an issue first identified in https://github.com/mono/mono/pull/18517

---

This supersedes mono/mono#18656


Co-authored-by: lambdageek <lambdageek@users.noreply.github.com>

* add better handling of SECBUFFER_EXTRA during TLS handshake on Windows (#43475)

* add better handling of SECBUFFER_EXTRA during TLS handshake on Windows (#42427)

* add better handling of SECBUFFER_EXTRA during TLS handshake on Windows

* fix boundary check

* fix spelling

* update Authentication_IncorrectServerName_Fail test

* feedback from review

* Update src/libraries/System.Net.Security/tests/FunctionalTests/CertificateValidationRemoteServer.cs

Co-authored-by: Stephen Toub <stoub@microsoft.com>

Co-authored-by: Stephen Toub <stoub@microsoft.com>

* Convert more interop to use function pointers (#43514)

* Fix  TimeZoneInfo to handle Yukon zone (#43550)

* Fix async iterators to clear out state upon completion (#43522)

AsyncIteratorMethodBuilder was only doing its clean-up for completion (e.g. zeroing out the state machine and context, removing the object from a debugger-incited tracking table) if the last call to the iterator was part of asynchronous completion; if the last MoveNextAsync completed synchronously, the used code path could miss that cleanup work.

* Add test to validate precedence of roll-forward and roll-forward-on-no-candidate-fx settings (#43510)

The comment on the test documents the desired precedence order.

Moved some of the existing tests which validate the behavior of the two settings together into a new separate test class.

* Refactor NonAscii bit mask usage (#43537)

* Refactor NonAscii bit mask usage

The Vector128 type is being left in a Blazor WASM application because Utf8Utility.GetPointerToFirstInvalidByte is always creating one, even though it isn't used.

I refactored the code such that the bit mask is no longer created on platforms it is not used, since it is only being used by Arm64.

* Improve performance of polymorphism (#42538)

* Use Span-based CreateHMAC where possible

* Add crossBuild parameter to yaml (#43319)

* Add crossBuild parameter to yaml

* Use NetCorePublic-Pool pool instead of AzDO hosted pool for Browser jobs (#43589)

The hosted pool runs into no disk space issues.

* Equals and GetHashCode for Reflection.Pointer (#42547)

* JIT: don't inline methods with small stackallocs if the call site is … (#43516)

The logic in `fgInlinePrependStatements` that zero-initializes locals doesn't
kick in for jit temps introduced when small stackallocs are optimized. So if we
inline a method with a small stackalloc into a loop, the memory for the
stackalloc doesn't get properly re-zeroed on each iteration.

Fix by disallowing such inlines by adding an extra check: the call site must
not be in a loop.

Closes #43391.
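The failure mode can be illustrated with a small model. This is not JIT code, and all names are hypothetical: the callee expects a zeroed scratch buffer on every call, but once it is "inlined" into a loop with a temp that is zeroed only once, later iterations see stale data.

```cpp
// Hypothetical callee: expects a fresh, zeroed scratch buffer.
// Returns true if the buffer was all zero on entry, then dirties it.
constexpr bool calleeSawZeroedBuffer(int (&scratch)[4])
{
    bool zeroed = true;
    for (int value : scratch)
    {
        if (value != 0)
        {
            zeroed = false;
        }
    }
    scratch[0] = 1; // the callee writes to its scratch space
    return zeroed;
}

// Models the inlined case: the temp is zeroed once, before the loop.
constexpr int staleIterationsWhenZeroedOnce(int iterations)
{
    int scratch[4] = {0, 0, 0, 0};
    int stale = 0;
    for (int i = 0; i < iterations; ++i)
    {
        if (!calleeSawZeroedBuffer(scratch))
        {
            ++stale;
        }
    }
    return stale;
}

// Models a real call: every iteration gets a freshly zeroed buffer.
constexpr int staleIterationsWhenZeroedPerCall(int iterations)
{
    int stale = 0;
    for (int i = 0; i < iterations; ++i)
    {
        int scratch[4] = {0, 0, 0, 0};
        if (!calleeSawZeroedBuffer(scratch))
        {
            ++stale;
        }
    }
    return stale;
}
```

In the zeroed-once model every iteration after the first sees a dirty buffer, which is why the fix simply refuses the inline when the call site is in a loop.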

* Fix CG2 outerloop comparison runs and OSX leg warning (#43547)

The outerloop run has OSX checked test runs for CG2 and CG2 composite modes. Currently both would use a log upload artifact with the same name (coreclr__TestRunLogs_R2R_CG2_OSX_x64_checked_outerloop). Disambiguate the two with a different `LogNamePrefix` for composite runs.

The crossgen2 comparison runs are failing to create the baseline crossgen'd framework because the live libraries zip's internal path has changed: .net5 has been replaced with .net6. The build should have failed when we tried to copy from the wrong folder, but the error got eaten and we ended up with a malformed framework folder. Adjust the inline yml scripts so that if they fail, the containing build task fails too.

* First draft of Dynamic Pgo proposal (#43371)

First draft of Dynamic Pgo proposal.

* Remove UnsupportedOsPlatform from CryptoConfig as a utility class (#43611)

* Remove UnsupportedOsPlatform from CryptoConfig as it is a safe utility class

* Clean up the other references

* Add CSV map file generation for compiler diagnostics (#43612)

Add `--csvmap` switch to Crossgen2 which causes it to generate node summary CSV files that are parsable by tests. The intent is to use this for size on disk perf tests so just the node type statistics and individual node map are implemented in CSV files.  We can add section and relocs easily if we think they'll be useful in future.

* add activity support for android sample (#43504)

* [debugger] Switch to GC Unsafe in signal handler callbacks (#43600)

If the runtime gets a single step or breakpoint signal while it is already
running native code for a P/Invoke, it will be in GC Safe mode.  Switch back to
GC Unsafe to run the debugger engine steps.

Addresses https://github.com/mono/mono/issues/20490

Co-authored-by: lambdageek <lambdageek@users.noreply.github.com>

* disable DefaultConnect_EndToEnd_Ok on Windows7 (#43628)

* disable DefaultConnect_EndToEnd_Ok on Windows7

* update platform check

* update platform check

* [iOS] Add mono runtime and AppleAppBuilder pkgproj for iOS sample (#43048)

* Add iOS sample pkgproj

* Add iOS Sample pkgproj to descriptions

* Add iOS sample pkgproj ProjectReference

* Add project reference to build AppleAppBuilder.csproj

* Fixup AppleAppBuilder assembly path

* Add AppleAppBuilder Packaging target

* Remove UI file to allow and encourage sample users to modify the UI

* Move package from dotnet6-transport to dotnet6 feed

Co-authored-by: Mitchell Hwang <mitchell.hwang@microsoft.com>

* [System.IO.Compression] ZipHelper.DosTimeToDateTime handle empty LastModified fields in zip archive entry header without internal exception (#43008)

* [System.IO.Compression] ZipHelper.DosTimeToDateTime handle empty LastModified field without internal exception to improve debugging performance on several zip files opening asynchronously

* [Test][Compression][ZipArchiveEntry] Add unit test to test empty lastModified field in zip entry

* refactor unit test code

* do not use arraypool in tests

* fix test after azure pipeline checks with errors

* fix invalid assert in new test

* improve usability of  NegotiateStreamInvalidOperationTest (#43622)

* [browser][debugger] Clean up MessageId logic to prepare for sessions in the test harness (#43188)

* Clean up MessageId logic

* Update src/mono/wasm/debugger/DebuggerTestSuite/InspectorClient.cs

Co-authored-by: Ankit Jain <radical@gmail.com>

* Update src/mono/wasm/debugger/DebuggerTestSuite/InspectorClient.cs

Co-authored-by: Ankit Jain <radical@gmail.com>

* Update src/mono/wasm/debugger/DebuggerTestSuite/InspectorClient.cs

Co-authored-by: Ankit Jain <radical@gmail.com>

Co-authored-by: Ankit Jain <radical@gmail.com>

* Fixing stale version badges in docs (#43558)

Fixing stale version badges in docs

From the issue description:
Improper cache-control in generated badges caused browsers to cache SVG
badges for a year. It is therefore recommended to modify the related
README.md content with a simple find and replace from: _version_badge.svg
to _version_badge.svg?no-cache
This will trigger GitHub to compute and use a different/new Camo proxy URL.

Fix #3822

* improve reliability of SslStream tests with failing certificate validation (#43570)

* improve reliability of SslStream tests with failing certificate validation

* Update src/libraries/System.Net.Security/src/System/Net/Security/SslStream.Implementation.cs

Co-authored-by: Stephen Toub <stoub@microsoft.com>

* fix failing ALPN test on old OpenSSL

Co-authored-by: Stephen Toub <stoub@microsoft.com>

* Mono: Turn on cmake build by default (#42119)

Turns it on everywhere except Windows.

* Sync shared code from aspnetcore (#43624)

* Use ThrowHelper inside ValueType types (#43634)

* Display the real offset after branch tightening (#43469)

* display the real offset after branch tightening

* emit offset for cold codeblock

* move isColdBlock inside #DEBUG

* fix formatting

* Revert "fix formatting"

This reverts commit 0744e1e432e12c5bddd4ce8c208660e31d96f12a.

* Revert "move isColdBlock inside #DEBUG"

This reverts commit 60e511b0bfb1836476a5fcf05ff3450d32c3d907.

* Revert "emit offset for cold codeblock"

This reverts commit e760beb2fcd1a5fcfd22ddf47eb6939a7094562d.

* Use emitCurCodeOffs() to display correct offset

* Clarify pre-merge commit squash guidance (#43664)

* Update MulticoreJit (#39996)

- Enable Generic Methods in MulticoreJit
- Enable NDirect Stub in MulticoreJit

* Fix missing signatures for Cross bitness DAC symbols (#43500)

* Unify paths used for cross-bit components

* Pass down buildArchitecture for signing

* Pass target properties to signing

* Automatically trigger the ilasm round-trip test on PRs (#43666)

For changes to the ilasm or ildasm source code, trigger
the ilasm round-trip pipeline.

There could, of course, be other changes that could affect
ilasm/ildasm, but this at least catches the primary ones.

* Mono: Fix the windows cmake build. (#43658)

This will not use cmake on windows, it just fixes the conditionals so the windows build doesn't fail if cmake is enabled by default.

* [Browser] don't pass redundant args to wasm (#41608)

* Enable implicit fallthrough warning (#43397)

* Enable implicit fallthrough warning

This change enables warnings for implicit fallthrough in
switch cases and fixes all the cases where the warning
was reported.
It also fixes some places where the fall through was incorrect. 
Fortunately, these places were not causing functional issues.
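As an illustration of what such a warning looks for, here is a small hypothetical example, assuming a C++17 compiler and a warning such as `-Wimplicit-fallthrough` in GCC or Clang. The commit does not say which flag or annotation macro the runtime uses. An intentional fall-through is annotated explicitly, and every other case ends with `break`.

```cpp
// Hypothetical example; the enum and function are illustrative only.
enum class Kind { Small, Medium, Large };

constexpr int costOf(Kind kind)
{
    int cost = 0;
    switch (kind)
    {
        case Kind::Large:
            cost += 100;
            [[fallthrough]]; // intentional: Large also pays the Medium cost
        case Kind::Medium:
            cost += 10;
            break; // without this break, Medium would silently pay the Small cost
        case Kind::Small:
            cost += 1;
            break;
    }
    return cost;
}
```

Removing the annotation would make the compiler flag the `Large` case, and removing the first `break` would change the result for `Medium` without any other visible sign, which is the kind of mistake the warning is meant to surface.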

* Disable / fix failing test in runincontext testing (#43663)

* Disable all profiler tests since they launch a secondary process
  and process launch creates an infinite event loop in the
  SocketAsyncEngine on Linux. Since runincontext loads even
  framework assemblies into the unloadable context, locals in this
  loop prevent unloading.
  The tests were working before Process.Start moved to using sockets.
* Fix the multifoldertest to work under runincontext - the shell
  script generated from the .csproj was passing an absolute path
  for the multifolder.dll to the runincontext.sh/cmd instead of
  a relative path that is used in all other tests and that the
  runincontext expects.

* Add test leg to the PR build to run libraries tests on Android emulators (#37585)

* Add public JsonElement.ParseValue() and TryParseValue() (#43601)

* [mono] Copy image data with AssemblyLoadContext.LoadFromStream (#43592)

We don't actually pin the byte array, so it must be copied or it can be overwritten once we run a GC on the LOH.

Fixes https://github.com/dotnet/runtime/issues/43402

Tested manually that it fixes the issue using the associated repro. This isn't really something that lends itself to a test, so that's the best I can do.

* Fix incremental build of tasks.proj for mobile (#43674)

* Fix incremental build of tasks.proj for mobile

The semaphore file that is used as the input for deciding whether to rebuild tasks.proj wasn't properly taking the mobile task projects into account.
This resulted in e.g. WasmAppBuilder not being built if you built for desktop before, resulting in a build error.

We now use the conditioned project references as an input instead of globbing through all nested projects.

Co-authored-by: Viktor Hofer <viktor.hofer@microsoft.com>

* [master] Update dependencies from mono/linker dotnet/arcade dotnet/xharness dotnet/llvm-project (#43583)

* Update dependencies from https://github.com/dotnet/arcade build 20201015.7

Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Build.Tasks.Packaging , Microsoft.DotNet.Build.Tasks.SharedFramework.Sdk , Microsoft.DotNet.Build.Tasks.TargetFramework.Sdk , Microsoft.DotNet.CodeAnalysis , Microsoft.DotNet.GenAPI , Microsoft.DotNet.GenFacades , Microsoft.DotNet.XUnitExtensions , Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.RemoteExecutor , Microsoft.DotNet.VersionTools.Tasks , Microsoft.DotNet.XUnitConsoleRunner , Microsoft.DotNet.ApiCompat
 From Version 6.0.0-beta.20514.1 -> To Version 6.0.0-beta.20515.7

* Update dependencies from https://github.com/dotnet/xharness build 20201019.2

Microsoft.DotNet.XHarness.CLI , Microsoft.DotNet.XHarness.TestRunners.Xunit
 From Version 1.0.0-prerelease.20516.1 -> To Version 1.0.0-prerelease.20519.2

* Update dependencies from https://github.com/mono/linker build 20201020.1

Microsoft.NET.ILLink.Tasks
 From Version 6.0.0-alpha.1.20516.1 -> To Version 6.0.0-alpha.1.20520.1

* Update dependencies from https://github.com/mono/linker build 20201020.2

Microsoft.NET.ILLink.Tasks
 From Version 6.0.0-alpha.1.20516.1 -> To Version 6.0.0-alpha.1.20520.2

* Update dependencies from https://github.com/dotnet/arcade build 20201016.5

Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Build.Tasks.Packaging , Microsoft.DotNet.Build.Tasks.SharedFramework.Sdk , Microsoft.DotNet.Build.Tasks.TargetFramework.Sdk , Microsoft.DotNet.CodeAnalysis , Microsoft.DotNet.GenAPI , Microsoft.DotNet.GenFacades , Microsoft.DotNet.XUnitExtensions , Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.RemoteExecutor , Microsoft.DotNet.VersionTools.Tasks , Microsoft.DotNet.XUnitConsoleRunner , Microsoft.DotNet.ApiCompat
 From Version 6.0.0-beta.20514.1 -> To Version 6.0.0-beta.20516.5

* Update dependencies from https://github.com/dotnet/llvm-project build 20201019.1

runtime.linux-x64.Microsoft.NETCore.Runtime.Mono.LLVM.Tools , runtime.win-x64.Microsoft.NETCore.Runtime.Mono.LLVM.Tools , runtime.win-x64.Microsoft.NETCore.Runtime.Mono.LLVM.Sdk , runtime.osx.10.12-x64.Microsoft.NETCore.Runtime.Mono.LLVM.Tools , runtime.osx.10.12-x64.Microsoft.NETCore.Runtime.Mono.LLVM.Sdk , runtime.linux-x64.Microsoft.NETCore.Runtime.Mono.LLVM.Sdk , runtime.linux-arm64.Microsoft.NETCore.Runtime.Mono.LLVM.Tools , runtime.linux-arm64.Microsoft.NETCore.Runtime.Mono.LLVM.Sdk
 From Version 9.0.1-alpha.1.20512.1 -> To Version 9.0.1-alpha.1.20519.1

Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>

* RyuJIT: Don't emit null checks for constant strings (#37245)

* Fold "ldstr ==/!== null" (mikedn's PR)

* Formatting

* check IsUnsigned for GT_GT

* Address feedback

* Update steps in debugging instruction of CLR (#43557)

* Update CLR debugging instruction steps.

* Use suggested BCL path

Co-authored-by: Jan Vorlicek <janvorli@microsoft.com>

* Mention the usage cases of CORE_LIBRARIES.

Co-authored-by: Jan Vorlicek <janvorli@microsoft.com>

* Remove non-cmake build support from mono.proj. (#43678)

* [Android] Add AndroidAppBuilder pkgproj for Android sample (#43216)

In preparation to bring mono samples to `dotnet/samples`, files that are most likely to change will be packaged into a nuget package to be downloaded and consumed on the `dotnet/samples` end rather than mirroring changes.

This PR expands the process that created the NuGet package for `Microsoft.NETCore.BrowserDebugHost.Transport` to build a NuGet package for the Android sample.

Co-authored-by: Mitchell Hwang <mitchell.hwang@microsoft.com>

* Add test leg to the PR build to run libraries tests on Android devices (#42209)

The on-device version of https://github.com/dotnet/runtime/pull/37585

Co-authored-by: Premek Vysoky <premek.vysoky@microsoft.com>
Co-authored-by: Santiago Fernandez Madero <safern@microsoft.com>

* Migrate coreclr's worker thread pool to be able to use the portable thread pool in opt-in fashion (#38225)

- Enables using the portable thread pool with coreclr as opt-in. Change is off by default for now, and can be enabled with COMPlus_ThreadPool_UsePortableThreadPool=1. Once it's had bake time and seen to be stable, at a reasonable time in the future the config flag would ideally be removed and the relevant parts of native implementation deleted.
- The IO thread pool is not being migrated in this change, and remains on the native side
- My goal was to get compatible behavior, compatible with diagnostics tools, and similar perf to the native implementation in coreclr. Tried to avoid changing scheduling behavior, behavior of heuristics, etc., compared with that implementation.
- The eventual goal is to have one mostly managed thread pool implementation that can be shared between runtimes, to ease maintenance going forward

Commit descriptions:
- "Add dependencies"
  - Ported LowLevelLock from CoreRT, and moved LowLevelSpinWaiter to shared. Since we support Thread.Interrupt(), they were necessary in the wait subsystem in CoreRT partly to support that, and were also used in the portable thread pool implementation where a pending thread interrupt on a thread pool thread would otherwise crash the process. Interruptible waits are already used in the managed side of the thread pool in the queue implementations. It may be reasonable to ignore the thread interrupt problem and suggest that it not be used on thread pool threads, but for now I just brought in the dependencies to keep behavior consistent with the native implementation.
- "Add config var"
  - Added config var COMPlus_ThreadPool_UsePortableThreadPool (disabled by default for now)
  - Flowed the new config var to the managed side and set up a mechanism to flow all of the thread pool config vars
  - Removed debug-only config var COMPlus_ThreadpoolTickCountAdjustment, which didn't seem to be too useful
  - Specialized native and managed thread pool paths based on the config var. Added assertions to paths that should not be reached depending on the config var.
- "Move portable RegisteredWaitHandle implementation to shared ThreadPool.cs"
  - Just moved the portable implementation, no functional changes. In preparation for merging the two implementations.
- "Merge RegisteredWaitHandle implementations"
  - Merged implementations of RegisteredWaitHandle using the portable version as the primary and specializing small parts of it for coreclr
  - Fixed PortableThreadPool's registered waits to track SafeWaitHandles instead of WaitHandles similarly to the native implementation. The SafeWaitHandle in a WaitHandle can be modified, so it is retrieved once and reused thereafter. Also added/removed refs for the SafeWaitHandles that are registered.
- "Separate portable-only portion of RegisteredWaitHandle"
  - Separated RegisteredWaitHandle.UnregisterPortable into a different file, no functional changes. Those paths reference PortableThreadPool, which is conditionally included unlike ThreadPool.cs. Just for consistency such that the new file can be conditionally included similarly to PortableThreadPool.
- "Fix timers, tiered compilation, introduced time-sensitive work item queue to simulate coreclr behavior"
  - Wired work items queued from the native side (appdomain timer callback, tiered compilation background work callback) to queue them into the managed side
  - The timer thread calls into managed code to queue the callback
  - Some tiered compilation work item queuing paths cannot call managed code, so used a timer with zero due time instead
  - Added a queue of "time-sensitive" work items to the managed side to mimic how work items queued from the native side ran previously. In particular, if the global queue is backed up, when using the native thread pool the native work items still run ahead of them periodically (based on the Dispatch quantum). They could potentially be queued into the global queue, but if it's backed up, that could significantly and perhaps artificially delay the appdomain timer callback and the tiering background jitting. I didn't want to change the behavior in an observable (and potentially bad) way here for now. A good time to revisit this would be when IO completion handling is added to the portable thread pool; the native work items could then be handled somewhat similarly.
- "Implement ResetThreadPoolThread, set thread names for diagnostics"
  - Aside from that, setting the thread names (at the OS level) lets debuggers identify the threads better, as before. For threads that may run user code, the thread Name property is kept as null, as before, so that it may be set without an exception.
- "Cache-line-separate PortableThreadPool._numRequestedWorkers similarly to coreclr"
  - Was missed before, separated it for consistency
- "Post wait completions to the IO completion port on Windows for coreclr, similarly to before"
  - On Windows, wait completions are queued to the IO thread pool, which is still implemented on the native side. On Unixes, they are queued to the global queue.
- "Reroute managed gate thread into unmanaged side to perform gate activites, don't use unmanaged gate thread"
  - When the config var is enabled, removed the gate thread from the native side. Instead, the gate thread on the managed side calls into the native side to perform gate activities for the IO completion thread pool, and returns a value to indicate whether the gate thread is still necessary.
  - Also added a native-to-managed entry point to request the gate thread to run for the IO completion thread pool
- "Flow config values from CoreCLR to the portable thread pool for compat"
  - Flowed the rest of the thread pool config vars to the managed side, such that COMPlus variables continue to work with the portable thread pool
  - Config var values are stored in AppContext, made the names consistent for supported and unsupported values
- "Port - ..." * 3
  - Ported a few fixes that did not make it into the portable thread pool implementation
- "Fix ETW events"
  - Fixed the EventSource used by the portable thread pool, added missing events
  - For now, the event source uses the same name and GUID as the native side. It seems to work for now for ETW; we may switch to a separate provider (along with updating tools) before enabling the portable thread pool by default.
  - For enqueue/dequeue events, changed to use the object's hash code as the work item identifier instead of the pointer since the pointer may change between enqueue and dequeue
- "Fix perf of counts structs"
  - Structs used for multiple counts with interlocked operations were implemented with explicit struct layout and field offsets. The JIT seems to generate stack-based code for such structs and it was showing up as higher overhead in perf profiles compared to the equivalent native implementation. Slower code in compare-exchange loops can cause a larger gap of time between the read and the compare-exchange, which can also cause higher contention.
  - Changed the structs to use manual bit manipulation instead, and microoptimized some paths. The code is still not as good as that generated by C++, but it seems to perform similarly based on perf profiles.
  - Code size also improved in many cases, for example one of the larger differences was in MaybeAddWorkingWorker(), which decreased from 585 bytes to 382 bytes and with far fewer stack memory operations
- "Fix perf of dispatch loop"
  - Just some minor tweaks as I was looking at perf profiles and code of Dispatch()
- "Fix perf of ThreadInt64PersistentCounter"
  - The implementation used to count completed work items was using `ThreadLocal<T>`, which turned out to be too slow for that purpose according to perf profiles
  - Changed it to rely on the user of the component to provide an object that tracks the count, which the user of the component would obtain from a ThreadStatic field
  - Also removed the thread-local lookup per iteration in one of the hot paths in Dispatch() and improved inlining
- "Miscellaneous perf fixes"
  - A few more small tweaks as I was looking at perf profiles and code
  - In ConcurrentQueue, added check for empty into the fast path
  - For the portable thread pool, updated to trigger the worker thread Wait event after the short spin-wait completes and before actually waiting; the event is otherwise too verbose when profiling and changes performance characteristics
  - Cache-line-separated the gate thread running state as is done in the native implementation
  - Accessing PortableThreadPool.ThreadPoolInstance multiple times was generating less than ideal code that was noticeable in perf profiles. Tried to avoid it especially in hot paths, and in some cases where unnecessary for consistency if nothing else.
  - Removed an extra call to Environment.TickCount in Dispatch() per iteration
  - Noticed that a field that was intended to be cache-line-separated was not actually being separated, see https://github.com/dotnet/runtime/issues/38215, fixed
- "Fix starvation heuristic"
  - Described in comment
- "Implement worker tracking"
  - Implemented the equivalent in the portable thread pool along with raising the relevant event
- "Use smaller stack size for threads that don't run user code"
  - Using the same stack size as in the native side for those threads
- "Note some SOS dependencies, small fixes in hill climbing to make equivalent to coreclr"
  - Corresponds with PR that updates SOS: https://github.com/dotnet/diagnostics/pull/1274
  - Also fixed a couple of things to work similarly to the native implementation
- "Port some tests from CoreRT"
  - Also improved some of the tests
- "Fail-fast in thread pool native entry points specific to thread pool implementations based on config"
  - Scanned all of the managed-to-native entry points from the thread pool and thread-entry functions, and promoted some assertions to be verified in all builds with fail-fast. May help to know in release builds when a path that should not be taken is taken and to avoid running further along that path.
- "Fix SetMinThreads() and SetMaxThreads() to return true only when both changes are successful with synchronization"
  - These are a bit awkward when the portable thread pool is enabled: they should return true only when both changes are valid, and otherwise return false without making any changes, but the worker thread pool is on the managed side while the IO thread pool is on the native side
  - Added some managed-to-native entry points to allow checking validity before making the changes, all under a lock taken by the managed side
- "Fix registered wait removals for fairness since there can be duplicate system wait objects in the wait array"
  - Described in comment
- "Allow multiple DotNETRuntime event providers/sources in EventPipe"
  - Temporary change to EventPipe to be able to get events from dotnet-trace
  - For now, the event source uses the same name and GUID as the native side. This seems to work for ETW for now, and with this change it seems to work with EventPipe for getting events. Subscribing to the NativeRuntimeEventSource does not get thread pool events yet; that is left for later. We may switch to a separate provider (along with updating tools) before enabling the portable thread pool by default, as a long-term solution.
- "Fix registered wait handle timeout logic in the wait thread"
  - The timeout logic was comparing against how long the last wait took and sometimes was not timing out waits. Fixed to consider the total time since the last reset of the timeout instead
- "Fix Browser build"
  - Updated the Browser-specific thread pool variant based on the other changes
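
As a rough illustration of the "Fix perf of counts structs" item above, the sketch below packs several counts into one integer using shifts and masks, so that the whole set could be updated with a single compare-exchange. The field names and the 16-bit widths are assumptions made for illustration, not the actual layout, and since Python has no compare-exchange primitive only the bit manipulation is shown.

```python
# Hypothetical layout: three 16-bit counts packed into one integer.
# Field names and bit positions are illustrative assumptions.
MASK16 = 0xFFFF
SHIFTS = {"processing": 0, "existing": 16, "goal": 32}


def get_count(packed: int, field: str) -> int:
    """Extract one 16-bit count from the packed value."""
    return (packed >> SHIFTS[field]) & MASK16


def set_count(packed: int, field: str, value: int) -> int:
    """Return a new packed value with one count replaced."""
    if not 0 <= value <= MASK16:
        raise ValueError("count does not fit in 16 bits")
    shift = SHIFTS[field]
    # Clear the old field bits, then OR in the new value.
    return (packed & ~(MASK16 << shift)) | (value << shift)


counts = set_count(0, "existing", 5)
counts = set_count(counts, "goal", 8)
counts = set_count(counts, "processing", get_count(counts, "processing") + 1)
```

In a compare-exchange loop, the new packed value would be computed from a snapshot like this and published only if the shared value still equals the snapshot.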

Corresponding PR to update SOS: https://github.com/dotnet/diagnostics/pull/1274
Fixes https://github.com/dotnet/runtime/issues/32020
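
The registered wait timeout fix described in the list above can be sketched as follows. This is a minimal illustration with hypothetical names, assuming a wait thread that wakes up several times before a given registered wait expires; it is not the actual wait thread code.

```python
def has_timed_out(now_ms: int, last_reset_ms: int, timeout_ms: int) -> bool:
    """Compare the timeout against total time since the last reset."""
    return now_ms - last_reset_ms >= timeout_ms


# A registered wait with a 100 ms timeout, last reset at t = 0.
# The wait thread wakes at t = 60 and t = 120, so each individual wait
# lasted only 60 ms. Comparing 60 ms against the timeout would never
# expire the wait; comparing the total elapsed time does.
last_reset = 0
timeout = 100
wakeups = [60, 120]
results = [has_timed_out(t, last_reset, timeout) for t in wakeups]
```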

* Delete NetEventSource.Fail (#43579)

At some point some Debug.Asserts/Fails were replaced by this NetEventSource.Fail helper, which both Debug.Fails and fires an EventSource event.  But asserts in our code base are intended for things that should never happen, and we needn't be emitting events for them (if we did want to emit events for them, we'd need to tackle the other ~20,000 Debug.Assert/Fails in the codebase).

I've deleted NetEventSource.Fail, and fixed up the call sites.  Some were simply replaced by Debug.Assert/Fail.  Some were deleted entirely, when from code inspection it looked like they could actually be hit, but were guarded by a check for the event source being enabled and thus were unlikely to have been triggered in our previous testing.  Etc.

* Code sample for supporting dynamic objects (#42097)

* Fix crossgen2 armel build (#42811)

Build fails because libjitinterface_armel.so and libclrjit_unix_armel_x64.so are not found after #41126.

Signed-off-by: Timur <t.mustafin@partner.samsung.com>

* Fix fallthrough cases in portable thread pool change (#43701)

* Fix and test crossgen2 on arm and x86 (#42998)

Fix last issues preventing the Pri0 tests from passing under crossgen2 for arm and x86
- Add support for stackprobe helper on arm32
- Fix field layout for x86 structures that contain long enums
  - Add test suite for these enum scenarios 
- Report the same alignment info to the jit for crossgen2 based compilation as was done in the core runtime

Also enable testing targeting x86 and arm

* [llvm] Fix some simd issues. (#43647)

* [llvm] Add support for MONO_TYPE_FTNPTR.

* [llvm] Fix support for some SIMD intrinsics.

* Always use OP_SSE41_ROUNDS with 2 arguments, JIT opcodes can't
  have optional arguments.
* Convert arguments to intrinsic calls, sometimes they have
  slightly different pointer or vector types.

* [jit] Refactor the SIMD intrinsics handling code to share more code.

Avoid emitting LLVM intrinsics which are not enabled since it
would cause llc to fail.

* Replace command-line parser for Crossgen2 (#43655)

Switch back to the old command-line parser for Crossgen2 to improve performance of parsing arguments.

* Extend allowed Task.Delay/CTS TimeSpan values to match Timer (#43708)

For some reason, Task.Delay(TimeSpan, ...) and CancellationTokenSource.CancelAfter(TimeSpan) cut off the max allowed timeout at int.MaxValue milliseconds, whereas Timer's TimeSpan support (which is used under the covers) goes all the way to UInt32.MaxValue - 2.  This changes Task/CancellationTokenSource to match, effectively doubling the allowed length of the timeouts.
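
The "effectively doubling" claim can be checked with a little arithmetic. The sketch below uses the two limits exactly as stated in the description above and converts them from milliseconds to days.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000

INT32_MAX = 2**31 - 1            # previous cutoff, in milliseconds
UINT32_MAX = 2**32 - 1
NEW_LIMIT_MS = UINT32_MAX - 2    # new limit as stated in the description

old_days = INT32_MAX / MS_PER_DAY
new_days = NEW_LIMIT_MS / MS_PER_DAY
ratio = NEW_LIMIT_MS / INT32_MAX

print(f"old limit: {old_days:.2f} days, new limit: {new_days:.2f} days")
```

The old limit works out to a little under 25 days and the new one to a little under 50 days.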

* Slim down Path::Combine by removing dependency on ValueTuple`8 (#43582)

Co-authored-by: Jan Kotas <jkotas@microsoft.com>

* Fix issue #43714. Guid fallback for BigEndian had incorrect offsets. (#43718)

* Align MailAddress.GetHashCode with Equals (#43573)

* Bring back missing Exit_Failure case (#43665)

This script label and code was removed in a previous change
even though it is still required. Bring it back.

* Fix ildasm of certain floating-point numbers (#43673)

Change #42848 altered comments to print numbers in comments in
little-endian format, but went too far and changed two places
that print out numbers outside of comments that are later parsed
by ilasm in round-trip testing.

Fixes #43672

* Fix perf issue in System.Diagnostics.Activity (#43710)

* Update "Input Image Architectures" table in r2rdump README.md (#43727)

* gc: shorten background thread name to fit Linux name limit (#43679)

* gc: shorten background thread name to fit Linux name limit

* Allow dynamic code sample to compile on v3.0-5.0 (#43703)

* [jit] Fix gsharedvt constrained calls to Object.GetType () under netcore. (#43729)

* [jit] Fix gsharedvt constrained calls to Object.GetType () under netcore.

Fixes https://github.com/dotnet/runtime/issues/35674.

* Update src/mono/mono/mini/jit-icalls.c

Co-authored-by: Aleksey Kliger (λgeek) <akliger@gmail.com>

Co-authored-by: Aleksey Kliger (λgeek) <akliger@gmail.com>

* Change list-processed to ps (#42293) (#42297)

Co-authored-by: Sunguoyun <sunguoyun@loongson.cn>

* Increase Hosting tests timeout (#43695)

* Increase Hosting tests timeout

There are some environments (checked CoreCLR, no tiered compilation) where the current timeout is insufficient. Increasing the timeout so the tests don't fail in these environments.

Fix #43389

* Make DebugDirectoryBuilder.AddCodeViewEntry public (#43267)

* Trim quoted file names passed to Crossgen2 (#43746)

* Fixes to allow using OS-provided threadpool (#43726)

Backport from dotnet/runtimelab:NativeAOT

* Move Thread PNSE check to managed to cut more dependencies (#43730)

* Disable test against https://github.com/dotnet/runtime/issues/43754 (#43755)

* Fix the signature of Interop.Sys.Log () to match the native signature. (#43744)

Signature mismatches cause errors on some platforms like wasm.

* Add missing opcodes to k_rgnStackPushes (#42246)

The rewriter defined two extra opcodes, CEE_COUNT and CEE_SWITCH_ARG, but does not define them in the k_rgnStackPushes array. This can cause out-of-bounds reads when computing the value of maxstack.
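
The class of bug described above is a lookup table that has fewer entries than the enum used to index it. A minimal sketch, with a shortened opcode list and placeholder push counts that are assumptions for illustration (only CEE_COUNT and CEE_SWITCH_ARG are named in the description):

```python
from enum import IntEnum


class Opcode(IntEnum):
    CEE_NOP = 0
    CEE_LDARG_0 = 1
    CEE_POP = 2
    CEE_COUNT = 3         # extra entries defined after the real opcodes
    CEE_SWITCH_ARG = 4


# Before: no entries for the two extra opcodes, so indexing with them
# reads past the end of the table (undefined behavior in native code).
pushes_short = [0, 1, 0]

# After: one entry per enum member. The values for the extra entries
# are placeholders, not the values used by the actual fix.
pushes_full = [0, 1, 0, 0, 0]


def covers_all_opcodes(table) -> bool:
    """A table indexed by Opcode needs exactly one entry per member."""
    return len(table) == len(Opcode)
```

A check of this kind, done once at startup or at compile time, catches the mismatch before any lookup happens.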

* [mono] Use CMAKE_CURRENT_SOURCE_DIR instead of CMAKE_SOURCE_DIR (#43715)

CLion creates a directory named ".idea" as a sibling to the top-level
CMakeLists.txt that describes the project; with src/mono, git clean -dXf
will delete src/mono/.idea because ".idea/" is an ignored pattern in
.gitignore.

One workaround is to create an out-of-tree CMakeLists.txt that contains
nothing but add_subdirectory(relative/path/to/src/mono), but this changes
the value of CMAKE_SOURCE_DIR to something other than what our CMake
build files expect.

* Allow WasmAppBuilder to run after the publish step. Change the (#43742)

wasm sample to use publish.

* Fix NegotiateStream handling of EOF (#43739)

In my refactoring of NegotiateStream to use async/await, I broke its handling of EOF, with it throwing an exception instead of returning 0.  This fixes it to correctly handle EOF.

* Free allocated buffer after UTF8 encode on FreeBSD (#43431)

* Free allocated buffer after UTF8 encode on FreeBSD

* Address PR feedback

* [master] Update dependencies from mono/linker (#43698)

[master] Update dependencies from mono/linker

* Use non-generic TaskCompletionSource where possible in System.Threading.Channels (#40953)

* Use non-generic TaskCompletionSource where possible

* Apply suggestions from code review

Co-authored-by: Stephen Toub <stoub@microsoft.com>

Co-authored-by: Stephen Toub <stoub@microsoft.com>

* Remove Common Extensions HashCodeCombiner (#43454)

* Remove Common Extensions HashCodeCombiner

Fix #33259

* Add unit tests for FilePatternMatch.GetHashCode.

* Add link to Discord server (#43765)

* Add link to Discord server

* Update README.md

Co-authored-by: Stephen Toub <stoub@microsoft.com>

Co-authored-by: Stephen Toub <stoub@microsoft.com>

* Suppress trim analysis warnings inside reflection implementation (#43594)

See the Justification for each of the added suppressions.

Also annotate GetMethodBody as potentially dangerous with trimming.
Trimming can change IL of a method (can remove branches, remove local variables, change some instructions, ...) - as such accessing the actual body of a method is potentially dangerous.

* Fix issue #36999 (#43723)

* Created two resource strings for exceptions.
Replaced static strings with resource.

* Fixed formatting.

Co-authored-by: Nikolay-parhimovich <>

* Improve error message thrown by Microsoft.Extensions.Configuration.Json.JsonConfigurationFileParser when the root element of a JSON config file is not an object. (#42780)

JSON configuration files must have an object as the root element. Previously, the exception message produced by Microsoft.Extensions.Configuration.Json.JsonConfigurationFileParser when a non-object root element was parsed was Error: Unsupported JSON token 'TOKEN_TYPE' was found. This error message was vague and caused confusion. This commit updates the error message to specifically mention that the root element must be an object.

Fix dotnet/extensions#3543.

* Fix null reload token for ConfigurationProvider (#43306)

* Fix null reload token for ConfigurationProvider
* Add not null assert

* Setting `WebProxy.BypassList` to null throws (#40656)

* Argument validation in WebProxy BypassList setter

* Updated a failing test

* Added tests for new BypassList behavior

* Updated the tests for new BypassList behavior

* Added AllowNull attribute on BypassList property

* Update src/libraries/System.Net.WebProxy/src/System/Net/WebProxy.cs

Co-authored-by: Stephen Toub <stoub@microsoft.com>

* Mono: Clang bin path not added to PATH for VS2019 builds. (#43505)

Building full AOT using CI scripts won't work when VS2019 IDE is present (not falling back to VS2019 build tools).

This happens due to a typo in setup-vs-msvcbuild-env.bat. Since CI doesn't install the VS IDE but uses the build tools (where the clang bin
path is added to PATH), the problem only shows up in local development environments using the VS2019 IDE.

Co-authored-by: lateralusX <lateralusX@users.noreply.github.com>

* Block the runtime on ready to early bind breakpoints (#43260)

* Block in ready

* Remove hideWebDriver option and always do it

* Rename attach method

* Fix issue: ConventionsBuilder and open generic export with constructor dependencies (#43003)

* Add new function to scan the open generic's constructors when the type is generic

Co-authored-by: HuyLuong <huy.luong@orientsoftware.com>

* [w32socket] Turn WireGuard ENOKEY errno to WSAENETUNREACH (#43734)

When the destination IP on the packet doesn't match any WireGuard peer (such as
if the peer is disconnected while the app is running), the sender may get an ENOKEY errno.

This is mentioned in Section 3 "Send/Receive" of
https://www.wireguard.com/papers/wireguard.pdf

Fixes https://github.com/mono/mono/issues/20503

Co-authored-by: lambdageek <lambdageek@users.noreply.github.com>

* improve link detection for WIFI on macOS (#43737)

* improve link detection for WIFI on macOS

* enable test back

* correct condition

* improve type detection for WIFI

* Update src/libraries/Native/Unix/System.Native/pal_interfaceaddresses.c

Co-authored-by: Stephen Toub <stoub@microsoft.com>

Co-authored-by: Stephen Toub <stoub@microsoft.com>

* Fix inconsistency of the 'CanRead' property after disposing an HTTP content stream (#43766)

* Fix inconsistency of the 'CanRead' property after disposing an HTTP content stream

* Update src/libraries/System.Net.Http/tests/FunctionalTests/RawConnectionStreamTest.cs

Co-authored-by: Marie Píchová <11718369+ManickaP@users.noreply.github.com>

Co-authored-by: Angelo Breuer <46497296+angelobreuer@users.noreply.github.com>
Co-authored-by: Marie Píchová <11718369+ManickaP@users.noreply.github.com>

* [mono] Use lax Roslyn and analyzers settings for samples (#43769)

* [mono] Use lax Roslyn and analyzers settings for samples

The samples are often used for debugging runtime problems by modifying the code
to reproduce issues.  The code analyzers and warnings-as-errors add a papercut before a modified
sample can run.

* [mono] Make HelloWorld sample Makefile settings overridable

allow MONO_CONFIG and MONO_ENV_OPTIONS to be set from the outside

* [wasm][debugger] Fix check for already loaded assemblies (#43747)

We send assembly loaded events to the proxy based off events from the
debugger engine. And we check that it isn't an assembly that was already
loaded. This check has a bug in computing the assembly name, from the
filename, which caused these events to be sent even for already loaded
assemblies.

* Pass -target arm or -target  arm64 to superpmi.exe when replaying in superpmi.py (#43784)

* Pass -target (arm|arm64) argument to superpmi.exe when replaying for arm/arm64 in superpmi.py

* Add the -target argument only in altjit in superpmi.py

* Update Android instrumentation runner to be closer to upstream (#43775)

The upstream instrumentation runners don't use runOnMainSync() in the onStart() method, update our runner to do the same and add a bit more logging.

Also fixed a small typo in configure.cmake that I happened to notice.

* Use function pointers for interop (Unix) (#43793)

* [master] Update dependencies from dotnet/arcade Microsoft/vstest dotnet/llvm-project dotnet/runtime-assets mono/linker (#43768)

[master] Update dependencies from dotnet/arcade Microsoft/vstest dotnet/llvm-project dotnet/runtime-assets mono/linker

* Disable JIT counters if the JIT is disabled. (#43808)

* Fix RHEL7 socket dispose hang, and extend coverage (#43409)

Fix #42686 by doing a graceful close if the abortive connect(AF_UNSPEC) call fails on Linux, and improve a couple of related tests:
- Extend RetryHelper with an exception filter
- Connect, Accept, Send, Receive, SendFile cancellation tests: make sure cancellation failures are not masked by RetryHelper (use exception filter)
- Connect, Accept, Send, Receive cancellation tests: also test IPV6 and DualMode sockets

* Fix to set the inner exception for ALC event (#43667)

* Fix to set the inner exception for ALC event

Removes the exception handling at
CLRPrivBinderCoreCLR::BindAssemblyByName so that
the inner exception can be set when the default
AssemblyLoadContext.Resolving handler throws

* Fixing the test for alc.default

* Enhanced the tests by customizing the exception type that
will be thrown by the handlers

* [w32process-win32] Implement System.Diagnostics.Process::MainWindowHandle. (#43724)

L.A.Noire splash screen calls it to check if the game has opened its window, and stays forever visible if this is not implemented.

Co-authored-by: rbernon <rbernon@users.noreply.github.com>

* Ensure IBC data is copied to the expected location (#43644)

* [docs] Add area-System.Reflection-mono area owners (#43825)

* CoreCLR runtime tests + Mono LLVM AOT on arm64 Linux (#41751)

This change:

- Adds new options to mono.proj:

    - MonoAOTEnableLLVM, which enables (or disables) building a Mono AOT
    cross compiler with LLVM;

    - MonoAOTLLVMDir, which specifies the path to a copy of LLVM suitable for
    building a Mono AOT cross compiler, and is optional; and

    - BuildMonoAOTCrossCompilerOnly, which allows building a Mono AOT
    cross compiler without building an associated full Mono runtime.

- Changes offsets-tool.py's user interface slightly; '--include-prefix' is
renamed to '--prefix' and may be specified multiple times. While this latter
feature isn't necessary to build a Mono cross compiler today, because we don't
use distribution-supplied cross-compilation headers on CI, it does make it
easier to experiment with the offsets tool using arbitrary header layouts.  For
example, Debian's arm64 cross-compilation packages scatter useful header files
across `/usr/lib/gcc-cross/aarch64-linux-gnu` and `/usr/aarch64-linux-gnu`.

- Updates the docker image used for arm64 cross-compilation to a newer revision
that includes libclang (see also
https://github.com/dotnet/dotnet-buildtools-prereqs-docker/pull/375).

- Adds some tests that currently fail to compile with Mono LLVM AOT to
issues.targets.

- Adds a Linux_arm64 LLVM AOT job to CI. Nothing particularly fancy is done to
build the Mono LLVM AOT cross compiler; it is built as a step in the same job
that also sends Linux_arm64 tests to Helix.

* ServiceProcess Controller Refactor (#43797)

1. Simplified if checks.
2. Use expression property syntax.
3. Used using declaration.
4. Inlined out declaration.
5. Removed unwanted unsafe modifier.
6. Removed redundant casting.

* Enhance #43238 so that it covers DefineScope method group with the tests to reflect the behavior added in the PR. (#43790)

* Bump emscripten to 2.0.6. (#43800)

* Bump emscripten to 2.0.6.

* Define HAVE_SYS_RANDOM_H on wasm, it's not detected correctly.

* Fixed bug in ReadOnlyDictionary's IDictionary.this[object] implementation. This method didn't adhere to IDictionary's contract to return null on a missing key. (#36926)

* Add cancellable and AddressFamily-specific name resolution. (#33420)

Add AddressFamily-specific name resolution and cancellation support for Windows. Resolves #939

* Setting value of enums didn't properly widen the value when setting (#43779)

Use VerifierCorElementType instead of SignatureCorElementType to specify the element type of the target field
- This will send the runtime down the path which can perform primitive widening

* Merge PAL's _wcslwr into _wcslwr_s (#43265)

In (non-palsuite) product code, `_wcslwr` is only used within PAL
inside `_wcslwr_unsafe()` method, which is exposed as `_wcslwr_s` for
PAL consumers. PR inlines the usage of `_wcslwr` in `_wcslwr_unsafe`
and fixes up PAL tests.

* [WIP][interp] Unify execution and valuetype stacks (#43681)

Before this change, an InterpFrame contained 3 regions of data: args + locals, valuetype stack, execution stack. Each entry on the execution stack was a stackval structure. The space for valuetypes was allocated separately, since they have various sizes. When pushing a valuetype on the stack, we first allocated the space for it on the vtstack and then pushed the address of the region on the execution stack. This change merges the execution stack with the valuetype stack, meaning we now push variable-sized data on the execution stack. In order to keep track of the current stack location, whenever we push a type on the stack during the transform phase, we also keep track of the offset where this value will reside on the stack, as well as the size it occupies. All callsites need to be informed how much they need to pop the stack for the arguments. While called code can access this space normally (the args are special locals belonging to the frame and are accessed directly as such), external code needs a new mechanism to detect each argument for a given frame. This is achieved with the lazily initialized arg_offsets array on an InterpMethod. The method doesn't need to be compiled for this array to be correctly initialized.

Why :
- this simplifies handling of valuetypes, their storage follows the same rules as a normal objref/primitive type
- removes the common use of the vt_sp variable. The compiler no longer needs to reserve it in a register during the switch loop, and we no longer need to save it with each call. The sp and ip now become the only variables describing the execution state in a method.
- the flow of the data on the execution stack is well behaved now (with the exception of a few opcodes that update directly based on the stack offset). We were using the vtstack for some magic storage (MINT_VTRESULT for example)
- this makes it such that the stack offset of every value is easily known at compile time, making it possible to completely drop the execution stack approach, and have every opcode have a unique dreg and a list of sregs (which are mapped to a certain stack offset). This will enable more advanced optimizations during compile stage.
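
The offset tracking described above can be sketched as a small compile-time model of the unified stack. This is an illustration only: the class name, the 8-byte slot alignment, and the example sizes are assumptions, not details of the interpreter.

```python
SLOT = 8  # assumed stack slot alignment in bytes


def align_up(n: int, a: int = SLOT) -> int:
    """Round n up to the next multiple of a."""
    return (n + a - 1) // a * a


class TransformStack:
    """Tracks (offset, size) for each value on a single unified stack."""

    def __init__(self):
        self.entries = []   # list of (offset, size) pairs
        self.top = 0        # offset of the next free byte

    def push(self, size: int) -> int:
        """Record a value of the given size and return its stack offset."""
        offset = self.top
        size = align_up(size)
        self.entries.append((offset, size))
        self.top = offset + size
        return offset

    def pop(self) -> int:
        """Remove the top value and return the offset it occupied."""
        offset, _size = self.entries.pop()
        self.top = offset
        return offset


stack = TransformStack()
a = stack.push(4)     # a 32-bit integer
b = stack.push(24)    # a 24-byte value type stored inline
c = stack.push(8)     # an object reference
```

Because every push records its offset, the location of each value is known at transform time, which is the property the last bullet above relies on.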

Co-authored-by: BrzVlad <BrzVlad@users.noreply.github.com>

* Revert "Try re-enabling IBC on macOS. (#39801)" (#43839)

This reverts commit 3a4298cf15379678c4d437a6554a1453706cc3b3.

* Statically linking coreclr and clrjit in single file host (#43556)

* Statically linking coreclr and clrjit in single file host.

* setting g_hmodCoreCLR on Unix

* System.Globalization.Native.lib must build with coreclr to be linkable with it

* Always use system unwind libs on FREEBSD

* no DllMain when coreclr is statically linked

* Handle cases when coreclr configuration is different from libraries

* Adding and using PAL_GetPalHostModule

* simplify SslStream_StreamToStream_Alpn_NonMatchingProtocols_Fail test (#43625)

* simplify SslStream_StreamToStream_Alpn_NonMatchingProtocols_Fail test

* feedback from review

* Improve annotations for XLinq methods taking params object[] (#43717)

* Improve annotations for XLinq classes taking params object[]

* annotate ref

* address Jozkee's feedback

* Fix XStreamingElement ctor to take nullable content

* [RyuJIT] Propagate gtFlags in Vector.Create (#43578)

Propagate GTF_CALL if needed in GT_LIST. Use gtNewListNode. Ignore test for Mono

* [Portable thread pool] Don't spin-wait on semaphore when hill climbing stops a thread from processing work (#43840)

* Add IDictionary_Generic_Tests test for multiple values with hash collisions (#43836)

* Add IDictionary_Generic_Tests test for multiple values with hash collisions

* Update src/libraries/System.Collections.Concurrent/tests/ConcurrentDictionary/ConcurrentDictionary.Generic.Tests.cs

Co-authored-by: Eirik Tsarpalis <eirik.tsarpalis@gmail.com>

* Update src/libraries/System.Collections.Concurrent/tests/ConcurrentDictionary/ConcurrentDictionary.Generic.Tests.cs

Co-authored-by: Eirik Tsarpalis <eirik.tsarpalis@gmail.com>

* Set TARGET_SIZEOF_VOID_P and SIZEOF_REGISTER correctly when cross compiling. (#43851)

* Rewrite Socket.ConnectAsync for DNS with async/await (#43661)

* Avoid several WildcardBindForConnectIfNecessary allocations on each connect

* Rewrite Socket.ConnectAsync for DNS with async/await

Rips out all of the APM code that was previously used to implement this and replaces it with {Value}Task-based async/await implementations.

* add missing CBOR xmldocs (#43882)

* RyuJIT: Fold Popcnt.PopCount with constant argument (#37836)

Intrinsify BitOperations.PopCount for constant input

* [RyuJIT] Fold "(X op C1) op C2" to "X op (C1 op C2)" for commutative operators (#43567)

Fold "(X op C1) op C2" to "X op (C1 op C2)"
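
A minimal sketch of this folding on a toy expression tree, where a node is a tuple `(op, left, right)` and leaves are variable names or integer constants. The set of operators and the tree shape are assumptions for illustration, and Python integers do not model fixed-width overflow, so this is not how the JIT represents or folds its trees.

```python
import operator

# Operators treated as foldable here; this set is an assumption.
FOLDABLE = {
    "add": operator.add,
    "mul": operator.mul,
    "and": operator.and_,
    "or": operator.or_,
    "xor": operator.xor,
}


def fold(node):
    """Rewrite (X op C1) op C2 into X op (C1 op C2) when both ops match."""
    if not isinstance(node, tuple):
        return node  # leaf: variable name or constant
    op, left, right = node
    left = fold(left)
    right = fold(right)
    if (op in FOLDABLE and isinstance(right, int)
            and isinstance(left, tuple) and left[0] == op
            and isinstance(left[2], int)):
        # Combine the two constants and keep X as the left operand.
        return (op, left[1], FOLDABLE[op](left[2], right))
    return (op, left, right)


folded = fold(("add", ("add", "x", 3), 4))
```

Folding bottom-up means longer chains such as `((x * 2) * 3) * 4` collapse to a single operation with one constant.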

* Fix covariant return type validation for canon parents (#43843)

The return type validation was rejecting cases when the method being
overridden had canon type in its generic arguments.
This change fixes the problem by using parent method type instantiation
for constructing the SigTypeContext in such case.
It also adds a regression test.

* Reimplement Socket.Begin/EndSend/Receive on Send/ReceiveAsync (#43886)

* Remove a volatile access from Task.Id (#43891)

* Add info about how named mutexes work on Unix into code from the original PR (#43161)

* Add exception case xml comment for ExecutionContext.Restore

* Add some info about how named mutexes work on Unixes into code from the PR

* Fix test for 32-bit platforms (#43888)

- Fixed an assertion failure. `WorkerCounter` shouldn't be used in the native thread pool implementation when the portable thread pool is enabled (all the counts will be zero), fixed a missed case in `GetAvailableThreads`.
- The native implementation uses a smaller max default worker thread count by default on 32-bit platforms, allowing configured values to go beyond that. Fixed the portable thread pool implementation to do similar, instead of limiting the max including for configured values.

* Make runtime tests run with Android (#42683)

* Prototype for runtime tests running with Android

* Conditionally collect app dependencies

* Switch android sample from publish to build

* Switch Android sample from publish to build

* Modify AndroidTestRunner

* Only build and test one test

* Clean up some changes

* Update new run test script path and update AndroidTestRunner to use files under CoreRoot

* Add Helix configuration for Android_x64

* Disable test which relies on coreclr System.Private.CoreLib.dll

* Fixed format and removed irrelevant parameters

* Removed unused parameters and fixed comment

* Revert my AndroidAppBuilder related changes to prepare for…
@dotnet dotnet locked as resolved and limited conversation to collaborators Dec 8, 2020