[cdac] cdac-build-tool #100650

Merged
lambdageek merged 67 commits into dotnet:main
Apr 19, 2024

lambdageek
Member

@lambdageek lambdageek commented Apr 4, 2024

Contributes to #99298

cDAC Build Tool

Summary

The purpose of cdac-build-tool is to generate a .c file that contains a JSON cDAC contract descriptor.

It works by processing one or more object files containing data descriptors and zero or more text
files that specify contracts.
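
As a rough illustration of the object-file processing step, here is a minimal sketch in C of how a scraper could locate a descriptor blob and infer the target byte order. It leans on two points from this PR's commit notes: the magic is stored as a `uint64_t` so that it follows the platform endianness, and the scraper reads the whole file into memory. The magic value, the `sketch_` names, and the result enum are all assumptions made for illustration; the real tool is not written in C and its constants are not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical magic value, chosen only so that the two byte orders differ. */
#define SKETCH_MAGIC UINT64_C(0x0123456789ABCDEF)

typedef enum {
    SCRAPE_NOT_FOUND = 0,
    SCRAPE_LITTLE_ENDIAN,
    SCRAPE_BIG_ENDIAN
} sketch_scrape_result;

/* Scan an in-memory object file for the 8-byte magic in either byte order.
   On a match, store the offset of the first match in *offset and report
   which byte order matched. */
static sketch_scrape_result sketch_find_magic(const uint8_t *buf, size_t len,
                                              size_t *offset)
{
    uint8_t le[8], be[8];
    assert(buf != NULL && offset != NULL);

    /* Build both encodings of the magic explicitly, independent of the host. */
    for (int i = 0; i < 8; i++) {
        le[i] = (uint8_t)(SKETCH_MAGIC >> (8 * i));
        be[7 - i] = le[i];
    }

    for (size_t i = 0; i + 8 <= len; i++) {
        if (memcmp(buf + i, le, 8) == 0) { *offset = i; return SCRAPE_LITTLE_ENDIAN; }
        if (memcmp(buf + i, be, 8) == 0) { *offset = i; return SCRAPE_BIG_ENDIAN; }
    }
    return SCRAPE_NOT_FOUND;
}
```

Comparing against explicitly built byte sequences, rather than reading a `uint64_t` through a cast, keeps the scan independent of host alignment and host byte order, which matters when the tool runs on a different platform than the one the object file targets.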

Running

% cdac-build-tool compose [-v] -o contractdescriptor.c -c contracts.txt datadescriptor.o

.NET runtime build integration

cdac-build-tool is meant to run as a CMake custom command.
It consumes a target platform object file and emits a C source
file that contains a JSON contract descriptor. The C source
is then included in the normal build and link steps to create the runtime.
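
To make the shape of the emitted file more concrete, the following is a hedged sketch of what a generated C source could look like. Only the exported symbol name `DotNetRuntimeContractDescriptor` comes from this PR's commit notes; the struct layout, the field names, the magic value, and the JSON payload are placeholders invented for illustration and do not reflect the real schema.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Placeholder payload: the real JSON schema is defined elsewhere, not here. */
static const char sketch_json[] = "{\"placeholder\":true}";

/* Hypothetical layout of the emitted descriptor. */
struct sketch_contract_descriptor {
    uint64_t    magic;     /* an integer, so it follows the target's endianness */
    uint32_t    json_size; /* payload length in bytes, without the trailing NUL */
    const char *json;      /* the embedded JSON text */
};

/* The generated file would define a single exported instance. Positional
   initializers are used so the same text also compiles as pre-C++20 C++. */
const struct sketch_contract_descriptor DotNetRuntimeContractDescriptor = {
    UINT64_C(0x0123456789ABCDEF), /* hypothetical magic */
    (uint32_t)(sizeof sketch_json - 1),
    sketch_json
};

/* Sanity check a reader might perform: the recorded size matches the text. */
static int sketch_descriptor_ok(const struct sketch_contract_descriptor *d)
{
    assert(d != NULL);
    return d->json != NULL && strlen(d->json) == d->json_size;
}
```

Because the descriptor is plain constant data, it can be compiled and linked like any other runtime source file, which is what lets the tool slot in as a CMake custom command.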

The contract descriptor source file depends on contract-aux-data.c which is a source file that contains
the definitions of the "indirect pointer data" that is referenced by the data descriptor. This is typically the addresses of important global variables in the runtime.
Constants and build flags are embedded directly in the JSON payload.
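
The split between embedded constants and indirect pointer data can be pictured with a small C sketch. It assumes the JSON refers to a global by its index into a pointer table that is linked into the runtime; the table name, the stand-in globals, and the lookup helper are all hypothetical and exist only to show the indirection.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for important runtime globals; the real ones live in the runtime. */
static int sketch_g_thread_store;
static int sketch_g_gc_settings;

/* Hypothetical pointer-data table. Addresses are only known at link or load
   time, so they cannot be written into the JSON text. The JSON would carry
   an index into this table instead. */
static const void *const sketch_pointer_data[] = {
    &sketch_g_thread_store, /* index 0 */
    &sketch_g_gc_settings,  /* index 1 */
};

#define SKETCH_POINTER_DATA_COUNT \
    (sizeof sketch_pointer_data / sizeof sketch_pointer_data[0])

/* Resolve an index taken from the descriptor to an address, or NULL if the
   index is out of range. */
static const void *sketch_resolve_indirect(size_t index)
{
    return index < SKETCH_POINTER_DATA_COUNT ? sketch_pointer_data[index] : NULL;
}
```

Values that are fixed at build time, such as constants and build flags, need no such table and can be written into the JSON as literals.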

Multiple data descriptor source files may be specified (for example if they are produced by different components of the runtime, or by different source languages). The final JSON payload will be a composition of all the data descriptors.

Multiple contracts text files may be specified. This may be useful if some contracts are conditionally included (for example if they are platform-specific). The final JSON payload will be a composition of all the contracts files.

flowchart TB
  headers("runtime headers")
  data_header("datadescriptor.h")
  data_src("datadescriptor.c")
  compile_data["clang"]
  data_obj("datadescriptor.o")
  contracts("contracts.txt")
  globals("contractpointerdata.c")
  build[["cdac-build-tool"]]
  descriptor_src("contractdescriptor.c")
  vm("runtime sources")
  compile_runtime["clang"]
  runtime_lib(["libcoreclr.so"])

  headers -.-> data_src
  headers ~~~ data_header
  data_header -.-> data_src
  headers -.-> globals
  headers -.-> vm
  data_src --> compile_data --> data_obj --> build
  contracts ---> build
  build --> descriptor_src
  descriptor_src --> compile_runtime
  data_header -.-> globals ----> compile_runtime
  vm ----> compile_runtime --> runtime_lib

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Apr 4, 2024
@lambdageek lambdageek added area-Diagnostics-coreclr NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Apr 4, 2024
…nConverter.cs

Co-authored-by: Elinor Fung <elfung@microsoft.com>
@lambdageek
Member Author

All the windows build lanes are failing with something like:

.dotnet\sdk\9.0.100-preview.3.24204.13\Microsoft.Common.CurrentVersion.targets(4806,5): error MSB3026: (NETCORE_ENGINEERING_TELEMETRY=Build) Could not copy "D:\a\_work\1\s\artifacts\obj\ILLink.Tasks\Debug\net9.0\ILLink.Tasks.dll" to "D:\a\_work\1\s\artifacts\bin\ILLink.Tasks\Debug\net9.0\ILLink.Tasks.dll". Beginning retry 1 in 1000ms. The process cannot access the file 'D:\a\_work\1\s\artifacts\bin\ILLink.Tasks\Debug\net9.0\ILLink.Tasks.dll' because it is being used by another process. The file is locked by: ".NET Host (5476)"

which looks like we're running the in-tree artifacts/bin/.../ILLink.Tasks.dll at the same time as we're building it (a second time?). I guess the obvious suspect is the compilation of cdac-build-tool, although I'm unclear why that would use the ILLink tasks from the in-tree build.

@lambdageek lambdageek merged commit 4abe399 into dotnet:main Apr 19, 2024
149 of 155 checks passed
@elinor-fung
Member

I think _RequiresLiveILLink is ending up true because EnableSingleFileAnalyzer is set to true in

<EnableSingleFileAnalyzer Condition="
'$(EnableSingleFileAnalyzer)' == '' and
'$(TargetFrameworkIdentifier)' == '.NETCoreApp' and
'$(IsSourceProject)' == 'true'">true</EnableSingleFileAnalyzer>

Maybe we should set EnableSingleFileAnalyzer to false instead of the private-per-underscore-convention property - similar to what we do for illink tasks/analyzers themselves.

@shushanhf
Contributor

shushanhf commented Apr 19, 2024

@lambdageek @elinor-fung

After this PR, there is an error during the coreclr cmake-config step when running this command:

#### linux-debian-AMD64 and loongarch64 both fail
./build-runtime.sh -debug -loongarch64 -nopgooptimize -skipmanaged

-- Looking for process_vm_readv
-- Looking for process_vm_readv - found
CMake Error at debug/runtimeinfo/CMakeLists.txt:57 (message):
  No cdac-build-tool set or does not exist


-- Configuring incomplete, errors occurred!
See also "/home/qiao/work_qiao/runtime/artifacts/obj/coreclr/linux.loongarch64.Debug/CMakeFiles/CMakeOutput.log".
See also "/home/qiao/work_qiao/runtime/artifacts/obj/coreclr/linux.loongarch64.Debug/CMakeFiles/CMakeError.log".


set(GENERATED_CDAC_DESCRIPTOR_DIR "${CMAKE_CURRENT_BINARY_DIR}/cdac")
set(CONTRACT_DESCRIPTOR_OUTPUT "${GENERATED_CDAC_DESCRIPTOR_DIR}/contract-descriptor.c")
if("${CDAC_BUILD_TOOL_BINARY_PATH}" STREQUAL "" OR NOT EXISTS "${CDAC_BUILD_TOOL_BINARY_PATH}")
Member

nit:

- "${CDAC_BUILD_TOOL_BINARY_PATH}" STREQUAL ""
+ NOT CDAC_BUILD_TOOL_BINARY_PATH

it checks both "not defined" and "evaluates to empty string".

add_custom_command(
OUTPUT "${CONTRACT_DESCRIPTOR_OUTPUT}"
VERBATIM
COMMAND dotnet ${CDAC_BUILD_TOOL_BINARY_PATH} compose -o "${CONTRACT_DESCRIPTOR_OUTPUT}" -c "${CONTRACT_FILE}" $<TARGET_OBJECTS:cdac_data_descriptor>
Member

@am11 am11 Apr 19, 2024

  1. src/coreclr/build-runtime.sh direct invocation skips requiring dotnet/msbuild, which is necessary for new platform port work. This PR assumes dotnet/msbuild are always available. We should handle that case.

  2. Secondly, can we use the published cdac-build-tool instead of the assembly path, so that instead of requiring dotnet in PATH it uses an executable (AOT'd, singlefilehost, or apphost-based; it doesn't matter which)? If that's not possible (e.g. the crossbuild case), then please pass DOTNET_HOST_PATH (set by the SDK to the executing dotnet(1)) via cmakeargs as well.

Member Author

Can we just make embedding the contract descriptor optional if someone is doing a build without msbuild?

Member

@am11 am11 Apr 19, 2024

If making it optional doesn't affect debugging, that's good. Second part is:

-      <_CoreClrBuildArg Include="-cmakeargs &quot;-DCDAC_BUILD_TOOL_BINARY_PATH=$(RuntimeBinDir)cdac-build-tool\cdac-build-tool.dll&quot;" />
+      <_CoreClrBuildArg Include="-cmakeargs &quot;-DCDAC_BUILD_TOOL_BINARY_PATH='$(RuntimeBinDir)cdac-build-tool/cdac-build-tool.dll'&quot; -cmakeargs &quot;-DCLR_DOTNET_HOST_PATH='$(DOTNET_HOST_PATH)'&quot;" />
-    COMMAND dotnet ${CDAC_BUILD_TOOL_BINARY_PATH} compose -o "${CONTRACT_DESCRIPTOR_OUTPUT}" -c "${CONTRACT_FILE}" $<TARGET_OBJECTS:cdac_data_descriptor>
+    COMMAND "${CLR_DOTNET_HOST_PATH}" "${CDAC_BUILD_TOOL_BINARY_PATH}" compose -o "${CONTRACT_DESCRIPTOR_OUTPUT}" -c "${CONTRACT_FILE}" $<TARGET_OBJECTS:cdac_data_descriptor>

which will resolve to path/to/runtime/.dotnet/dotnet.

Member Author

> If making it optional doesn't affect debugging, that's good

in the future, it will make using something like dotnet-sos impossible

I don't understand what scenarios don't have a .NET sdk available. any new platform bringup will start with cross compiling from an existing platform with a .NET sdk and a cross compiler toolchain, right? that case should be handled by the cmake+msbuild infrastructure

I added #101297


-    COMMAND dotnet ${CDAC_BUILD_TOOL_BINARY_PATH} compose -o "${CONTRACT_DESCRIPTOR_OUTPUT}" -c "${CONTRACT_FILE}" $<TARGET_OBJECTS:cdac_data_descriptor>
+    COMMAND "${CLR_DOTNET_HOST_PATH}" "${CDAC_BUILD_TOOL_BINARY_PATH}" compose -o "${CONTRACT_DESCRIPTOR_OUTPUT}" -c "${CONTRACT_FILE}" $<TARGET_OBJECTS:cdac_data_descriptor>

I had something like this before and it broke source build because it was pointing to the repo root .dotnet/dotnet, not to artifacts/sb/.dotnet/dotnet. AFAICT it's fine to just call "dotnet" from the PATH. eng/common/build.sh always adds the right SDK dir to the PATH; that is done by InitializeDotNetCli, which is source-build aware.

Member Author

Update: I'm educating myself over in dotnet/installer#19534 (comment)

I think previously I used something other than DOTNET_HOST_PATH

Member Author

Updated #101297 to use DOTNET_HOST_PATH if it's set

matouskozak pushed a commit to matouskozak/runtime that referenced this pull request Apr 30, 2024
# cDAC Build Tool

## Summary

The purpose of `cdac-build-tool` is to generate a `.c` file that contains a JSON cDAC contract descriptor.

It works by processing one or more object files containing data descriptors and zero or more text
files that specify contracts.

## Running

```console
% cdac-build-tool compose [-v] -o contractdescriptor.c -c contracts.txt datadescriptor.o
```
## .NET runtime build integration

`cdac-build-tool` is meant to run as a CMake custom command.
It consumes a target platform object file and emits a C source
file that contains a JSON contract descriptor.  The C source
is then included in the normal build and link steps to create the runtime.

The contract descriptor source file depends on `contract-aux-data.c` which is a source file that contains
the definitions of the "indirect pointer data" that is referenced by the data descriptor.  This is typically the addresses of important global variables in the runtime.
Constants and build flags are embedded directly in the JSON payload.

Multiple data descriptor source files may be specified (for example if they are produced by different components of the runtime, or by different source languages).  The final JSON payload will be a composition of all the data descriptors.

Multiple contracts text files may be specified.  This may be useful if some contracts are conditionally included (for example if they are platform-specific).  The final JSON payload will be a composition of all the contracts files.

```mermaid
flowchart TB
  headers("runtime headers")
  data_header("datadescriptor.h")
  data_src("datadescriptor.c")
  compile_data["clang"]
  data_obj("datadescriptor.o")
  contracts("contracts.txt")
  globals("contractpointerdata.c")
  build[["cdac-build-tool"]]
  descriptor_src("contractdescriptor.c")
  vm("runtime sources")
  compile_runtime["clang"]
  runtime_lib(["libcoreclr.so"])

  headers -.-> data_src
  headers ~~~ data_header
  data_header -.-> data_src
  headers -.-> globals
  headers -.-> vm
  data_src --> compile_data --> data_obj --> build
  contracts ---> build
  build --> descriptor_src
  descriptor_src --> compile_runtime
  data_header -.-> globals ----> compile_runtime
  vm ----> compile_runtime --> runtime_lib
```


--- 

* add implementation notes

* add an emitter

* read in the directory header

* contract parsing

* indirect pointer value support

* move sample to tool dir

* Take baselines from the docs/design/datacontracts/data dir

  We don't parse them yet, however

* Add README

* fix BE

   Store the magic as a uint64_t so that it will follow the platform endianness.

   Store endmagic as bytes so that it directly follows the name pool - and fix the endmagic check not to look at the endianness

* hook up cdac-build-tool to the coreclr build; export DotNetRuntimeContractDescriptor

* cleanup; add contracts.txt

* add diagram to README

* move implementation notes

* better verbose output from ObjectFileScraper

* turn off whole program optimizations for data-descriptor.obj

   On windows /GL creates object files that cdac-build-tool cannot read

   It's ok to do this because we don't ship data-descriptor.obj as part of the product - it's only used to generate the cDAC descriptor

* C++-ify and add real Thread offsets

* no C99 designated initializers in C++ until C++20

* build data descriptor after core runtime

* fix gcc build

* simplify ObjectFileScraper

   just read the whole file into memory

* invoke 'dotnet cdac-build-tool.dll' instead of 'dotnet run --project'

* clean up macro boilerplate

* platform flags

* turn off verbose output

* can't use constexpr function in coreclr

   because debugreturn.h defines a `return` macro that expands to something that is not c++11 constexpr

* Rename "aux data" to "pointer data"

* rename "data-descriptor" to "datadescriptor"

* simplify linking

* cdac-build-tool don't build dotnet tool; turn on analyzers

* rationalize naming; update docs; add some inline comments

* rename cdac.h to cdacoffsets.h

* improve output: hex offsets; improved formatting

* don't throw in ParseContracts; add line numbers to errors

* change input format for contracts to jsonc

* add custom JsonConverter instances for the compact json representation

* simplify; bug fix - PointerDataCount include placeholder

* one more set of feedback changes: simpler json converters

* set _RequiresLiveILLink=false for cdac-build-tool.csproj

   fixes windows builds:

   error MSB3026: (NETCORE_ENGINEERING_TELEMETRY=Build) Could not copy "D:\a\_work\1\s\artifacts\obj\ILLink.Tasks\Debug\net9.0\ILLink.Tasks.dll" to "D:\a\_work\1\s\artifacts\bin\ILLink.Tasks\Debug\net9.0\ILLink.Tasks.dll". Beginning retry 1 in 1000ms. The process cannot access the file 'D:\a\_work\1\s\artifacts\bin\ILLink.Tasks\Debug\net9.0\ILLink.Tasks.dll' because it is being used by another process. 

   
---------

Co-authored-by: Elinor Fung <elfung@microsoft.com>
Co-authored-by: Aaron Robinson <arobins@microsoft.com>
@github-actions github-actions bot locked and limited conversation to collaborators May 20, 2024