This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Retry: Enable EventPipe across Unix and Windows #15611

Merged
merged 14 commits into from Jan 2, 2018


nategraf

@nategraf nategraf commented Dec 21, 2017

This PR enables the native components of EventPipe and SampleProfiler to be used on Unix and Windows (expanded from only Linux).

The two largest components are the creation of native interfaces to ETW that match those used for LTTng, and the fixing of contracts within EventPipe code. Additionally, bug fixes, build changes, and modifications to the SampleProfiler were made to address errors and enable Windows builds.

Follow-up #14772
Resolves #15607

Member

@brianrob brianrob left a comment

LGTM modulo addressing the build and test issues that were found on x64 Pri1 and ARM.

@jkotas
Member

jkotas commented Dec 21, 2017

@dotnet test Windows_NT arm Cross Checked Innerloop Build and Test please
@dotnet test Windows_NT arm64 Cross Checked Innerloop Build and Test please
@dotnet test Windows_NT armlb Cross Checked Innerloop Build and Test please

@jkotas
Member

jkotas commented Dec 21, 2017

arm build failing with:

13:09:58 CMake Error at src/vm/CMakeLists.txt:523 (add_subdirectory):
13:09:58   add_subdirectory given source
13:09:58   "D:/j/workspace/arm_cross_che---9fe37e18/bin/obj/Windows_NT.arm.Checked/eventing/eventpipe"
13:09:58   which is not an existing directory.

@benaadams
Member

Revert fixed the ETW failures https://github.com/dotnet/coreclr/issues/15607

@dotnet-bot test Windows_NT x64 Checked corefx_baseline

@nategraf
Author

@dotnet-bot test Windows_NT pri1 please

@benaadams
Member

https://github.com/dotnet/coreclr/issues/15607 occurs again

Test Result (29 failures / +29)
--
System.Net.Mail.Tests.LoggingTest.EventSource_EventsRaisedAsExpected
System.Threading.Tasks.Tests.EtwTests.TestEtw
System.Diagnostics.Tests.DiagnosticSourceEventSourceBridgeTests.LinuxNewLineConventions
BasicEventSourceTests.FuzzyTests.Test_Write_Fuzzy
BasicEventSourceTests.TestEventCounter.Test_Write_Metric_EventListener
BasicEventSourceTests.TestsManifestGeneration.Test_EventSource_NamedEventSource
BasicEventSourceTests.TestsWriteEvent.Test_WriteEvent_ByteArray_Manifest_EventListener
BasicEventSourceTests.TestsWriteEvent.Test_WriteEvent_ComplexData_SelfDescribing_EventListener
BasicEventSourceTests.TestsWriteEvent.Test_WriteEvent_Manifest_EventListener_UseEvents
BasicEventSourceTests.TestsWriteEvent.Test_WriteEvent_SelfDescribing_EventListener
BasicEventSourceTests.TestsWriteEvent.Test_WriteEvent_ByteArray_Manifest_EventListener_UseEvents
BasicEventSourceTests.TestsWriteEvent.Test_WriteEvent_ByteArray_SelfDescribing_EventListener
BasicEventSourceTests.TestsWriteEvent.Test_WriteEvent_SelfDescribing_EventListener_UseEvents
BasicEventSourceTests.TestsWriteEvent.Test_WriteEvent_NoAttribute
BasicEventSourceTests.TestsWriteEvent.Test_WriteEvent_Manifest_EventListener
BasicEventSourceTests.TestsWriteEventToListener.Test_WriteEvent_InvalidCalls
BasicEventSourceTests.TestsWriteEventToListener.Test_WriteEvent_ArgsCornerCases
BasicEventSourceTests.TestsWriteEventToListener.Test_WriteEvent_ArgsBasicTypes
BasicEventSourceTests.TestsWrite.Test_Write_T_In_Manifest_Serialization
BasicEventSourceTests.TestsWrite.Test_Write_T_EventListener
BasicEventSourceTests.TestsWrite.Test_Write_T_EventListener_UseEvents
System.Buffers.ArrayPool.Tests.ArrayPoolUnitTests.ReturnBufferFiresDiagnosticEvent
System.Threading.Tests.EtwTests.TestEtw
System.Net.Security.Tests.LoggingTest.EventSource_EventsRaisedAsExpected
System.Data.Tests.DataCommonEventSourceTest.InvokeCodeThatShouldFirEvents_EnsureEventsFired
System.Linq.Parallel.Tests.EtwTests.TestEtw
System.Collections.Concurrent.Tests.EtwTests.TestEtw
System.Net.Primitives.Functional.Tests.LoggingTest.EventSource_EventsRaisedAsExpected
System.Threading.Tasks.Dataflow.Tests.EtwTests.TestEtw

@nategraf
Author

Thanks @benaadams I'm working on it

@benaadams
Member

FYI there's also https://github.com/dotnet/coreclr/issues/15537 in the errors for the run which is definitely unrelated

@benaadams
Member

@dotnet-bot test Windows_NT x64 Checked corefx_baseline

@nategraf
Author

@dotnet-bot test Windows_NT x64 Checked corefx_baseline

@nategraf
Author

The change that originally caused #15607 is that I enabled FeaturePerfTracing in the managed areas, assuming it would be a purely additive and compatible change. I have since figured out that certain places within managed code treat EventPipe and EventListener logging as mutually exclusive, and that work will be required to make both usable at once.

In order to avoid expanding this PR further, I am opening https://github.com/dotnet/coreclr/issues/15656 and disabling tests which require use of the EventPipe's managed surface.

@nategraf
Author

@jkotas could you please kick off the ARM builds?

@jkotas
Member

jkotas commented Dec 28, 2017

@dotnet test Windows_NT arm Cross Checked Innerloop Build and Test please
@dotnet test Windows_NT arm64 Cross Checked Innerloop Build and Test please
@dotnet test Windows_NT armlb Cross Checked Innerloop Build and Test please

@jkotas
Member

jkotas commented Dec 28, 2017

@dotnet-bot test Ubuntu arm Cross Debug Innerloop Build please

@nategraf
Author

@dotnet-bot test Windows_NT x64 Checked corefx_baseline

@nategraf
Author

@BruceForstall @jkotas @benaadams all the concerns (Failures in ARM builds, failures in pri1 Windows builds, failed corefx tests, and erroneously introduced python2.7 dependency) of this PR's first iteration have been addressed. With your signoff I'll merge this PR.

@@ -15,12 +15,12 @@ internal struct EventPipeProviderConfiguration
[MarshalAs(UnmanagedType.LPWStr)]
private string m_providerName;
private UInt64 m_keywords;
- private uint m_loggingLevel;
+ private UInt32 m_loggingLevel;
Member

Nit: Our convention is to use language keywords instead of BCL types (i.e. int, string, float instead of Int32, String, Single, etc.): https://github.com/dotnet/corefx/blob/master/Documentation/coding-guidelines/coding-style.md. So this should rather have been changed the other way around.

Author

This was changed to fix a marshalling error on x86. Is there perhaps another solution to the marshalling error?

Member

@jkotas jkotas Dec 29, 2017

I do not see how changing uint to UInt32 can fix a marshalling error in this file. uint and UInt32 are different names for the same thing in C#. It is just a matter of style convention which one to use.

Member

Same with long and Int64. These are also different names for the same thing.

Author

Here is the commit which fixed the marshalling error. If those are irrelevant, what fixed it?
Just the changes in the C++ side?
97d5c0b

Member

Just the changes in the C++ side?

I think so.

Author

Alright. I'll try reverting the changes to the C# code and test this again

@@ -0,0 +1,108 @@
from filecmp import dircmp
Member

License header?

Author

Added now. Thanks

@benaadams
Member

benaadams commented Dec 29, 2017

Add to PR summary

Resolves https://github.com/dotnet/coreclr/issues/15607

Then should auto close on merge?

@BruceForstall
Member

@dotnet-bot test Windows_NT arm Cross Checked Innerloop Build and Test
@dotnet-bot test Windows_NT arm Cross Checked jitstress1 Build and Test
@dotnet-bot test Windows_NT arm Cross Checked jitstress2 Build and Test
@dotnet-bot test Windows_NT arm64 Cross Checked Innerloop Build and Test
@dotnet-bot test Windows_NT arm64 Cross Checked jitstress1 Build and Test
@dotnet-bot test Windows_NT arm64 Cross Checked jitstress2 Build and Test
@dotnet-bot test Windows_NT x64 Checked jitstress1
@dotnet-bot test Windows_NT x64 Checked jitstress2
@dotnet-bot test Windows_NT x86 Checked jitstress1
@dotnet-bot test Windows_NT x86 Checked jitstress2
@dotnet-bot test Windows_NT x86 Checked corefx_baseline
@dotnet-bot test Windows_NT x64 Checked corefx_baseline
@dotnet-bot test Windows_NT x86_arm_altjit Checked Build and Test
@dotnet-bot test Windows_NT x86_arm_altjit Checked jitstress1
@dotnet-bot test Windows_NT armlb Cross Checked Innerloop Build and Test
@dotnet-bot test Windows_NT armlb Cross Checked jitstress1 Build and Test
@dotnet-bot test Windows_NT armlb Cross Checked jitstress2 Build and Test

@BruceForstall
Member

@nategraf If you can, you should avoid pushing any changes until the jobs I triggered have finished, otherwise we'll lose their results and they'll need to get re-triggered.

@nategraf
Author

If you click on the ❌ or ✔ next to commit hash in between comments you can see any old results

@nategraf
Author

Jitstress2

20:20:25 Pass: 9436 Fail: 0 Complete: 84% Start:20:20:24 Test:[_relstress3.cmd_9615] JIT\Methodical\refany\_relstress3\_relstress3.cmd
20:20:25 Pass: 9436 Fail: 1 Complete: 84% Start:20:20:24 Test:[_dbglcs_gcref.cmd_9606] JIT\Methodical\VT\port\_dbglcs_gcref\_dbglcs_gcref.cmd

Jitstress1

19:34:51 Pass: 4106 Fail: 0 Complete: 36% Start:19:34:51 Test:[fieldExprUnchecked1.cmd_4187] JIT\jit64\opt\cse\fieldExprUnchecked1\fieldExprUnchecked1.cmd
19:34:51 Pass: 4106 Fail: 1 Complete: 36% Start:19:34:51 Test:[_rellcs_gcref.cmd_4177] JIT\Methodical\VT\port\_rellcs_gcref\_rellcs_gcref.cmd
...
19:48:46 Pass: 9434 Fail: 1 Complete: 84% Start:19:48:47 Test:[_dbglcs.cmd_9614] JIT\Methodical\Arrays\lcs\_dbglcs\_dbglcs.cmd
19:48:46 Pass: 9434 Fail: 2 Complete: 84% Start:19:48:47 Test:[_dbglcs_gcref.cmd_9606] JIT\Methodical\VT\port\_dbglcs_gcref\_dbglcs_gcref.cmd

Do you think these failures could be related to these changes, or might they be unrelated?

@BruceForstall
Member

@nategraf Looks like those armlb failures have been occurring for a while. You can ignore them.

Member

@BruceForstall BruceForstall left a comment

Some questions about the build.cmd changes

build.cmd Outdated
set __IntermediatesIncDir=%__IntermediatesDir%\src\inc
set __IntermediatesEventingDir=%__IntermediatesDir%\eventing

echo Laying out dynamically generated files consumed by the build system
Member

I appreciate you have these "echo" status messages. Can you please make sure they all start with echo %__MsgPrefix%, so we know where the message came from?

Author

Fixed in %__MsgPrefix%

build.cmd Outdated
%PYTHON% -B -Wall %__SourceDir%\scripts\genEventing.py --inc %__IntermediatesIncDir% --dummy %__IntermediatesIncDir%\etmdummy.h --man %__SourceDir%\vm\ClrEtwAll.man --nonextern || exit /b 1

echo Laying out dynamically generated EventPipe Implementation
%PYTHON% -B -Wall %__SourceDir%\scripts\genEventPipe.py --man %__SourceDir%\vm\ClrEtwAll.man --intermediate %__IntermediatesEventingDir%\eventpipe --nonextern || exit /b 1
Member

If one of the Python scripts fails, will it output an appropriate error message, so we know what failed? Because you're going to immediately exit the build script with no additional error message.

Author

Yes. It lets python print out the stack trace like normal
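
A minimal sketch of the behavior described here, assuming a stock CPython interpreter: an uncaught exception prints a traceback to stderr and the process exits with a non-zero status, which is what allows `|| exit /b 1` in build.cmd to stop the build. The helper name `run_failing_script` and the error message are hypothetical and not taken from the PR.

```python
import subprocess
import sys


def run_failing_script():
    """Run a child Python process that raises, and capture its result."""
    # The one-line child script stands in for a generator script that hits an error.
    return subprocess.run(
        [sys.executable, "-c", "raise RuntimeError('generation failed')"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )


result = run_failing_script()
# A non-zero return code is what a shell `||` operator reacts to.
print("exit code:", result.returncode)
# The traceback on stderr is the only diagnostic the caller sees.
print(result.stderr)
```

The traceback names the exception type and message, so the build log should show which script failed even though build.cmd itself prints nothing further.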

@@ -380,6 +390,45 @@ for /f "tokens=*" %%s in ('%DotNetCli% msbuild "%OptDataProjectFilePath%" /t:Dum
set __IbcOptDataVersion=%%s
)

REM =========================================================================================
REM ===
REM === Generate source files for eventing
Member

This phase looks unconditional, whether we're building native, corelib, etc. Is that appropriate?

Author

I'll take another look at what types of builds actually need these files

Author

I figured out better conditioning. Thanks for catching this

@@ -380,6 +390,45 @@ for /f "tokens=*" %%s in ('%DotNetCli% msbuild "%OptDataProjectFilePath%" /t:Dum
set __IbcOptDataVersion=%%s
)

REM =========================================================================================
REM ===
REM === Generate source files for eventing
Member

How does this affect incremental build? If I run "build.cmd" twice in a row, will it generate new files that will cause the subsequent builds to build things they shouldn't need to?

Author

No. I created a utility which ensures files are only written if the copy on disk is different
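
The PR's actual utility is not shown in this conversation (only its first line, `from filecmp import dircmp`, appears above), so the following is a sketch of the general technique rather than the author's implementation. It assumes generated files are text and compares content before writing; the function name `write_if_changed` and the sample file name are hypothetical.

```python
import os
import tempfile


def write_if_changed(path, content):
    """Write content to path only if the file is missing or differs.

    Returns True when the file was (re)written, False when left untouched.
    """
    try:
        with open(path, "r", encoding="utf-8") as existing:
            if existing.read() == content:
                return False  # identical: leave the file and its timestamp alone
    except FileNotFoundError:
        pass  # no previous copy, fall through and write
    with open(path, "w", encoding="utf-8") as out:
        out.write(content)
    return True


# Usage: the second call is a no-op, so the modification time is unchanged.
workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "generated.h")
first = write_if_changed(target, "#define EVENT_COUNT 3\n")
mtime_before = os.stat(target).st_mtime_ns
second = write_if_changed(target, "#define EVENT_COUNT 3\n")
mtime_after = os.stat(target).st_mtime_ns
third = write_if_changed(target, "#define EVENT_COUNT 4\n")
print(first, second, mtime_before == mtime_after, third)
```

Because timestamp-based build systems decide what to rebuild from modification times, skipping the write when nothing changed keeps a second `build.cmd` run from invalidating downstream targets.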

Member

@BruceForstall BruceForstall left a comment

The build.cmd changes look good to me. I don't have any comments on the rest of it.

@nategraf
Author

@dotnet-bot test Ubuntu arm Cross Debug Innerloop Build please

@nategraf nategraf merged commit c1bbdae into dotnet:master Jan 2, 2018
@ghost

ghost commented Jan 3, 2018

This commit is breaking the build for people who keep their pythons in directories named "C:\Program Files". The pathname needs to be quoted.
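
To illustrate why the quoting matters, here is a small sketch that uses Python's `shlex` module in non-POSIX mode as a rough stand-in for whitespace-based command-line tokenizing. It is only an approximation of how cmd.exe parses a line, and the interpreter path and script name are hypothetical examples.

```python
import shlex

# Hypothetical install location containing a space.
python_path = r"C:\Program Files\Python27\python.exe"
arguments = "-B -Wall"

# Unquoted: the space inside the path splits it into two tokens.
unquoted_tokens = shlex.split(python_path + " " + arguments, posix=False)
print(unquoted_tokens[0])

# Quoted: the whole path stays together as the first token
# (non-POSIX mode keeps the surrounding quotes in the token).
quoted_tokens = shlex.split('"' + python_path + '" ' + arguments, posix=False)
print(quoted_tokens[0])
```

In the unquoted case the first token is only `C:\Program`, which is not an executable, so the command fails before the script ever runs.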

@k15tfu

k15tfu commented Jan 4, 2020

btw, seems that it also fixed ICorProfilerInfo::ForceGC on macOS because it always returned E_FAIL w/o FEATURE_EVENT_TRACE: https://github.com/dotnet/coreclr/blob/v1.1.13/src/vm/proftoeeinterfaceimpl.cpp#L4825
