Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add functionality to extract specified kernels from app so that they can be replayed independently #294

Merged
merged 33 commits into from Apr 5, 2023

Conversation

Novermars
Copy link
Contributor

(Partially) Implements #290

Description of Changes

First iteration to add functionality to extract specified kernels from app so that they can be replayed independently, currently WIP due to some bugs and missing features.

Things that have to be done/clarified before it can be merged

  1. Right now users have to copy run.py manually to the folder where the other files for the replaying, make it automatically
  2. Current limitations:
    • Only works for OpenCL buffers
    • Only works for OpenCL program which use clCreateProgramWithSource, building from SPIRV functionality is tbd.
  3. Right now the gws(_offset) and lws and build options/flags are written to separate files, which are then read by the python script. It would be nicer if these are written into the script automatically.

Testing Done

Tried with a few basic OpenCL examples on Ubuntu 20.04, partially opening this PR so that I can test more complex programs.

Novermars and others added 14 commits February 22, 2023 17:41
…of(void*)

Python script handles the duplicate arguments
Adds a script that automatically captures a specified kernel and then validates it by comparing the replayed results to the dumped results
Fixed typo (Bump->Dump)
Now works on Windows and Linux
Getting the binaries from the cl_program (or cl_kernel with an Intel extension) has proved to be very unreliable.
This version saves the device binaries which the program inputs into clCreateProgramWithBinary().
The python script then uses these binaries to build the program
@Novermars Novermars marked this pull request as ready for review March 1, 2023 15:03
@Novermars Novermars changed the title [WIP] Add functionality to extract specified kernels from app so that they can be replayed independently Add functionality to extract specified kernels from app so that they can be replayed independently Mar 1, 2023
@bashbaug
Copy link
Contributor

I just started having a look at this and in general it looks really great - thank you for your contribution!

I'll try to have a more thorough review in the next couple days. Thanks again!

@Novermars
Copy link
Contributor Author

@bashbaug I just added functionality to dump a kernel by name instead of by an enqueue number, should be useful if you're just interested in replaying the kernel instead of a specific kernel.

You can also now specify how often you want to run a kernel with the run.py script, for better profiling :)

Copy link
Contributor

@bashbaug bashbaug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again, this all looks really good even though I have a few comments.

Let's see if we can fix some of the simpler issues before merging. It's fine not to fix all of them right now. The sooner we can merge these changes the sooner people can start using it and finding bugs and the sooner we can fix them. 😄

My biggest priority is keeping things fast and robust when the replay controls are not enabled, if that helps.

CMakeLists.txt Show resolved Hide resolved
docs/capture_single_kernels.md Outdated Show resolved Hide resolved
intercept/src/dispatch.cpp Show resolved Hide resolved
intercept/src/dispatch.cpp Outdated Show resolved Hide resolved
intercept/src/dispatch.cpp Show resolved Hide resolved
intercept/src/intercept.cpp Outdated Show resolved Hide resolved
intercept/src/intercept.h Outdated Show resolved Hide resolved
scripts/capture_and_validate.py Outdated Show resolved Hide resolved
scripts/capture_and_validate.py Outdated Show resolved Hide resolved
scripts/rsl_run.py Show resolved Hide resolved
@bashbaug
Copy link
Contributor

I had a few warnings when I built on Linux as well, mostly for -Wparentheses. Might be worth checking the CI build logs (when they finally complete, things have been running slow recently...).

Example:

intercept/src/intercept.h:2289:17: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]

@Novermars
Copy link
Contributor Author

I just had a look at the warnings on my linux machine, there I can indeed see them! Consider them fixed.

I'll push some changes already this evening (Germany time) and will do the rest tomorrow, taking your feedback on some questions in consideration :)

Novermars and others added 2 commits March 29, 2023 19:53
These changes are not final yet and incomplete
@Novermars
Copy link
Contributor Author

Novermars commented Mar 31, 2023

For some reason, I now run into segfaults for certain apps, since I've switched to std::string/std::vector instead of manual new/delete. Intercept-Layer reports exit code C0000005 for the child process...

I'll revert back to manual memory management to see if that fixes the problem, then one by one go back to the modern variants while looking at my test suite.

@Novermars
Copy link
Contributor Author

Okay, this should work again. I have no clue why, but seems like one of the std::string constructors does something which I did not expect, now using the other constructor should be better.

@bashbaug It should be ready for merging now, assuming C++17 is okay for you :)

@bashbaug
Copy link
Contributor

bashbaug commented Apr 3, 2023

It should be ready for merging now

Awesome! I'll do a few final spot checks and aim to merge this by EOD. Thank you!

@bashbaug bashbaug merged commit e49110b into intel:main Apr 5, 2023
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants