Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds Tracy portable source file support for vmfb files #16223

Closed
wants to merge 3 commits into from

Conversation

KoolJBlack
Copy link
Member

DRAFT in progress...

Progress on #15699

Copy link
Collaborator

@benvanik benvanik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is going to need some reworking - there's an invalid assumption here that there's 1 source file per export which is not true. if you pass a 1GB .mlir file with 800 dispatch exports as input this will embed 800 copies of the entire 1GB .mlir file. you also need to shift all logic to the compiler - the only thing the runtime should do is take a pointer from the library and pass it to tracy. you also need to handle other tracing providers (even if they no-op today) as tracy is not the only way to trace.

if (options.debugLevel >= 1) {
if (auto loc = findFirstFileLoc(exportOp.getLoc())) {
sourceFile = loc->getFilename().str();
sourceLine = loc->getLine();
if (options.debugLevel >= 3) {
std::string error;
auto file = mlir::openInputFile(loc->getFilename(), &error);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what this is doing, but I'm pretty sure it's not what we want. Not only will this load a file that won't always exist (remember that locs may reference files from way earlier in the pipeline) but the file may be binary, 100GB, etc.

What we want to embed is the hal.executable.variant contents.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's an example of how to get the contents of an MLIR Operation into a string: https://github.com/openxla/iree/blob/09e5d1d9ad7d640b7b6078aab316ea0322c0f37f/compiler/src/iree/compiler/Dialect/HAL/Target/VulkanSPIRV/VulkanSPIRVTarget.cpp#L235-L241

Pulling arbitrary source text could work if the location is a FileLineColLoc (see https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/IR/BuiltinLocationAttributes.td) and the file exists / is accessible. That could be interesting as a way to get portable .py sources embedded, for example.

The existing source information embedding works implicitly together with the location snapshotting and file writing that "dump executables" performs. This should decide if it wants to follow that or do something more explicit.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For "something more explicit", we could have an enum flag (using clEnumValN) like --iree-hal-executable-source-locations=[original, mlir] (or other names, idk), that would let you choose between

  • using the source locs already in there (what this PR does right now)
  • using the MLIR hal.executable.variant contents

Decoupling from --iree-hal-dump-executable-sources-to would also make this easier to use.

Maybe for this PR start with just embedded MLIR contents, unless you really want input .py/.mlir files for your use cases. Once the plumbing is complete then we can figure out how to make it more flexible and what flags to define.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benvanik: agreed this is not desired, I was debugging here.

The problem is at the SerializeExecutablesPass, the hal.executable.variant does not contain the correct source that is indicated by the Loc that is shown in tracy currently. If you compile a model (MobileBert for example), we'll either embed the source locs from the input mlir/py file (depending on input), or in the case of using --iree-hal-dump-executable-sources-to, the output from the DumpExecutableSourcesPass (what mlir::openInputFile() is reading as a test).

At SerializeExecutablesPass this is no longer available. This gist shows DumpExecutableSourcesPass vs SerializeExecutablesPass: https://gist.github.com/KoolJBlack/cec5481510e3852d87e070f2dcc22aec

Essentially I need access to source higher up in the pipeline. With Scott's suggestion for an explicit flag: --iree-hal-executable-source-locations=[original, mlir], I could leverage this to add the MLIR to the executable op at the DumpExecutableSourcesPass as an attribute and carry it down to SerializeExecutablesPass. This won't handle Locs that point to the input file however, and I'm not sure how to handle src locs that are provided with the input mlir that point to companion files (like python). This still per executable op however which could duplicate source until deduped during serialization.

Suggestions?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could leverage this to add the MLIR to the executable op at the DumpExecutableSourcesPass as an attribute and carry it down to SerializeExecutablesPass.

Rather than overload what that pass does, I'd add a new pass (added to the pipeline at the same point here: https://github.com/openxla/iree/blob/main/compiler/src/iree/compiler/Dialect/HAL/Transforms/Passes.cpp). That would decouple this from "dump executable sources".

I was thinking start simple:

  • If --iree-hal-executable-debug-level=3 is set (or a new flag), have a pass there snapshot the variant op as a string (not a file), stashed onto an attribute
  • Later in executable serialization, look for that attribute and copy the text into what is serialized (for the runtime / Tracy to use)

Then after the first PR, consider also using MLIR source locations as a way to look up and embed input python files, if that would be useful (input python files can be referenced today but not portably)

Copy link
Collaborator

@ScottTodd ScottTodd Feb 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just had a few more ideas, if snapshotting the variant as a string in an attribute is tricky (not sure how string escaping works there).

  • Clone the variant ops into a module that is excluded from future compiler passes (not sure how excluding would work, but that would keep it in IR/memory the whole way, without worrying about serialize/deserializing)
  • Dump the variant ops to tool temp files, separately from DumpExecutableSourcesPass (maybe in that pass, but at least to separate files), then remember the paths to those files in attributes, not MLIR source locs. Then have code like what you already have to read those temp files back into memory here.

i32Type,
i8PtrType,
},
"iree_hal_executable_src_loc_v0_t",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrong name

llvm::Constant::getNullValue(srcLocType->getPointerTo());
if (mode == Mode::INCLUDE_REFLECTION_ATTRS) {
SmallVector<llvm::Constant *> exportSrcFileValues;
for (auto dispatch : exports) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this granularity doesn't work - we don't want a source file per export as all exports may come from the same file or any combination of files

@@ -35,7 +35,7 @@ struct TargetOptions {
// 1: minimal debug information
// 2: default debug information
// 3: maximal debug information
int debugLevel;
int debugLevel = 2;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please split out these fixes to a separate PR

/*data=*/(void*)file_mapping);
}

int tracy_file_contents_sort_cmp(const void* a, const void* b) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as with all other methods in this code prefix with iree_


// Alloc tracy file mapping
if (iree_status_is_ok(status)) {
status = iree_allocator_malloc(host_allocator, sizeof(**out_file_mapping),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as mentioned all of this should be done in the compiler - it's part of the contract that they are sorted - we don't want extra allocs


// Register file mappings
if (iree_status_is_ok(status)) {
iree_tracing_register_custom_file_contents(file_mapping);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note that tracy is not the only tracing implementation - any function defined here needs to be defined as part of the full tracing suite (such as -DIREE_TRACING_PROVIDER=console)

Comment on lines +57 to +62
iree_hal_executable_library_setup_tracing(library, host_allocator, \
out_file_mapping)
#else
#define IREE_HAL_EXECUTABLE_LIBRARY_SETUP_TRACING(library, host_allocator, \
out_file_mapping) \
iree_ok_status()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have an inline function or actually return - a standalone iree_ok_status() is weird

iree_allocator_t host_allocator, tracy_file_mapping** out_file_mapping);
#define IREE_HAL_EXECUTABLE_LIBRARY_SETUP_TRACING(library, host_allocator, \
out_file_mapping) \
iree_hal_executable_library_setup_tracing(library, host_allocator, \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can always define the function and have it no-op inside itself (like it already does) - no need for macros or switches here

@@ -30,6 +30,11 @@ typedef struct iree_hal_elf_executable_t {
// Name used for the file field in tracy and debuggers.
iree_string_view_t identifier;

#if (IREE_TRACING_FEATURES)
// Optional mapping of filenames to custom contents.
tracy_file_mapping* file_mapping;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

benvanik added a commit that referenced this pull request Mar 14, 2024
This adds a new `CaptureExecutableSourcesPass` that allows for capture
of individual `hal.executable.variant` ops at any number of compilation
stages. Unlike the `DumpExecutableSourcesPass` this does not change the
original source locations in the IR and instead captures both the
textual IR and the remapped locations within it of each executable
export at the time the pass is run. The textual IR is stored as
resources and associated with the variants through linking and made
available to serialization for embedding in target-specific formats.

Because this increases compilation time (generating all of the sources
multiple times per executable is expensive) and bloats binaries the
capture is only enabled with the `--iree-hal-executable-debug-level=3`
or greater flag set (default is `=2`).

As part of this PR the CPU, Vulkan, and legacy ROCM formats have been
updated to store the new information and source it at runtime. This is a
breaking change to the executable library binary format. I'm not quite
happy with it, but it's probably good enough for the next 6mo-1yr.

To make this more usable a copy button has been added to the tracy
source view: wolfpld/tracy#750
Now clicking on a dispatch in the CPU or GPU timeline will show the
source and the copy button can be used to get it in the clipboard. The
source can then be run through `iree-compile
--compile-mode=hal-executable` to generate binaries.

![image](https://github.com/openxla/iree/assets/75337/e690e1f8-52a7-40db-a3aa-5fd7e781791b)

Fixes #15699. Closes #16223.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants