to be able to override invoke type
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
@DrizztDoUrden Please review this PR.
Vasili's branch went in, so now it's a PR from develop. Is this OK?
* Implemented igemm data invokers
* Missing invoker
* Removed accidental redundant unused FFT calls
* Fixed kernel name split
* ConvHipImplicitGemmBwdDataV4R1 invoker
* Removed precompilation duplication
* Removed RunAndMeasureSolution's from header
* Finished merging RaMS into invokers
* Self-copy/self-move tidy fix
* Fixed is_detected_v tidy warning
* fixed conv_common.hpp build
* Updated static_assert to use {}
* env vars for custom cache location and env var to spoof arch
* added PrecompileKernels fct call
* calling PrecompileKernels
* hip code path for compiling only, tobeadded: ocl
* added env var for num_cu and removed custom cache dir
* added custom bin cache location to avoid MIOPEN_VERSION_TWEAK
* implemented GPU'less compiling for ocl backend
* [Squashed] Generic search with invokers
* Missing PrecompileSolutions
* taking bin cache location as full path, and updated gen_search to not throw escape err
Co-authored-by: Vasilii Filippov <vasja.filippov94@outlook.com>
Co-authored-by: Vasilii Filippov <vasilii.filippov@amd.com>
Co-authored-by: daniellowell <daniel.lowell@amd.com>
MIOPEN_STRINGIZE(MIOPEN_VERSION_TWEAK);
auto p = boost::filesystem::path{miopen::ExpandUser(cache_dir)} / version;

const char* const custom = miopen::GetStringEnv(MIOPEN_CUSTOM_CACHE_DIR{});
[Quality] This is a developer's feature, so the name must at least contain DEBUG. It is also a binary cache, so let's indicate that as well.
[Recommendation] MIOPEN_DEBUG_BINARY_CACHE_DIR
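For illustration only, the override rule under discussion could look roughly like the sketch below. It assumes the override value is taken as a full path (as one of the commit messages says) and that an unset or empty variable falls back to the versioned default directory. `ResolveBinaryCacheDir` and its parameters are hypothetical names, not functions from the code base.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper, not part of the PR.
// override_dir: raw value of the cache-dir env var (nullptr when unset).
// default_root: the usual cache root, already expanded (e.g. the user's cache dir).
// version:      version string appended to the default root.
inline fs::path ResolveBinaryCacheDir(const char* override_dir,
                                      const fs::path& default_root,
                                      const std::string& version)
{
    // Assumption: a non-empty override is used verbatim as a full path.
    if(override_dir != nullptr && *override_dir != '\0')
        return fs::path{override_dir};

    // Otherwise keep the versioned default location.
    return default_root / version;
}
```

Keeping the function free of `getenv` calls makes the rule easy to unit-test; the caller would pass in the value it read from the environment.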
const char* const num_cu = miopen::GetStringEnv(MIOPEN_DEVICE_CU{});
if(num_cu != nullptr && strlen(num_cu) > 0)
{
    return boost::lexical_cast<std::size_t>(num_cu);
}
[Quality]
- Developers' envvar should contain DEBUG.
- [Recommendation] MIOPEN_DEBUG_ENFORCE_DEVICE_CU.
- [Recommendation] boost::lexical_cast should not be used.
- All the checks and conversions should be moved to a function that performs the job once, to minimize delays.
[Hint] We have Value() in env.hpp for this purpose.
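A minimal sketch of the "check and convert once" idea, assuming C++17 is available. This is not the `Value()` helper from env.hpp, whose signature is not shown in this thread; `ParseSize` and `EnforcedCuCount` are hypothetical names used only for illustration.

```cpp
#include <charconv>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <optional>
#include <system_error>

// Hypothetical helper: parse a non-negative decimal integer without
// boost::lexical_cast. Returns std::nullopt for null, empty, or malformed
// input (including trailing characters) instead of throwing.
inline std::optional<std::size_t> ParseSize(const char* text)
{
    if(text == nullptr)
        return std::nullopt;
    const char* const end = text + std::strlen(text);
    std::size_t value     = 0;
    const auto result     = std::from_chars(text, end, value);
    if(result.ec != std::errc{} || result.ptr != end)
        return std::nullopt;
    return value;
}

// Hypothetical accessor: the environment is read and parsed on the first
// call only; later calls return the cached result.
inline std::optional<std::size_t> EnforcedCuCount()
{
    static const std::optional<std::size_t> cached =
        ParseSize(std::getenv("MIOPEN_DEBUG_ENFORCE_DEVICE_CU"));
    return cached;
}
```

A caller could then write `EnforcedCuCount().value_or(actual_cu_count)`, so the string checks are not repeated on every query.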
std::string Handle::GetDeviceName() const
{
    const char* const arch = miopen::GetStringEnv(MIOPEN_DEVICE_ARCH{});
[Quality] This change is not necessary
What exactly are you referring to? Returning the env var specified over the actual arch we are running on is necessary to compile for a different architecture.
We already have MIOPEN_DEBUG_ENFORCE_DEVICE. See #307 (comment)
const char* const c_and_r = miopen::GetStringEnv(MIOPEN_COMPILE_AND_RUN{});
std::string compile_and_run;
if(c_and_r != nullptr && strlen(c_and_r) > 0)
{
    compile_and_run = c_and_r;
}

{
    current_solution = s.GetSolution(context, current_config, true);

if(compile_and_run == "0")
if(IsEnabled(MIOPEN_DEBUG_COMPILE_ONLY{}))
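To show the shape of the suggested check, here is a rough, self-contained sketch of a tag-type boolean env var. Everything in the `sketch` namespace is hypothetical, including the set of accepted spellings; it only imitates the pattern and is not the library's own `IsEnabled`.

```cpp
#include <cstdlib>
#include <cstring>

namespace sketch {

// Pure helper: which raw values count as "enabled". The accepted spellings
// here are an assumption made for this example.
inline bool IsEnabledText(const char* text)
{
    if(text == nullptr)
        return false;
    return std::strcmp(text, "1") == 0 || std::strcmp(text, "on") == 0 ||
           std::strcmp(text, "yes") == 0 || std::strcmp(text, "true") == 0;
}

// Hypothetical tag type that carries the variable name.
struct MIOPEN_DEBUG_COMPILE_ONLY
{
    static const char* Name() { return "MIOPEN_DEBUG_COMPILE_ONLY"; }
};

// Reads the variable once per tag type and caches the answer.
template <class Tag>
bool IsEnabled(Tag)
{
    static const bool cached = IsEnabledText(std::getenv(Tag::Name()));
    return cached;
}

} // namespace sketch
```

With this shape the call site becomes a single boolean test, `if(sketch::IsEnabled(sketch::MIOPEN_DEBUG_COMPILE_ONLY{}))`, instead of a string comparison against "0".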
    kernels.push_back(kernel);
}

std::vector<Program> programs = PrecompileKernels(profile_h, kernels);
[Quality] std::ignore = ...
[Note] This is just a waste of time for developers' builds (BUILD_DEV=On). Not that important, just FYI.
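The `std::ignore` suggestion can be illustrated with a small stand-alone sketch. `Program`, `PrecompileKernelsSketch`, and `CompileOnlyStep` are hypothetical stand-ins, and the `[[nodiscard]]` attribute is an assumption added here to show why an explicit discard is useful.

```cpp
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the real Program type.
struct Program
{
};

// Hypothetical stand-in for PrecompileKernels: returns one Program per kernel.
[[nodiscard]] inline std::vector<Program> PrecompileKernelsSketch(std::size_t kernel_count)
{
    return std::vector<Program>(kernel_count);
}

inline void CompileOnlyStep(std::size_t kernel_count)
{
    // The result is discarded on purpose: only the side effect of compiling
    // matters here, so no unused named variable is introduced.
    std::ignore = PrecompileKernelsSketch(kernel_count);
}
```

Assigning to `std::ignore` documents the intent and avoids an unused-variable warning without naming a `programs` vector that is never read.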
OCLKernel(SharedProgramPtr p, const std::string& kernel_name)
    : program(p), kernel(CreateKernel(p.get(), kernel_name))
{
}
[Quality] Do not add a new ctor. Just update existing one: read MIOPEN_DEBUG_COMPILE_ONLY and skip loading and running the kernels, if enabled.
}

std::vector<Program> programs = PrecompileKernels(profile_h, kernels);
continue;
[Quality] You may wish to add a new monitoring method to Heartbeat and use it here, to show signs of life when log level < 6.
* Resolve #307 (comment) and #307 (comment)
* Partially resolve #307 (comment)
* Resolve #307 (comment)
* Remove W/A for https://ontrack-internal.amd.com/browse/SWDEV-225285
I am posting this PR preemptively so Tuna people can see my changes.
The idea is:
The following steps need to happen:
Build with MIOPEN_SQLITE_KERN_CACHE=ON.
To compile we need to set the following env vars:
export MIOPEN_DEBUG_CONV_GEMM=0 (GEMM code path has not been implemented for this)
export MIOPEN_CUSTOM_CACHE_DIR=test (this will be appended to ~/.cache)
export MIOPEN_FIND_ENFORCE=4
export MIOPEN_COMPILE_AND_RUN=0 (compile only, escape generic search early)
export MIOPEN_DEVICE_ARCH=gfx906 (or whatever other supported arch)
export MIOPEN_DEVICE_CU=60 (or any other num_cu)
To run:
export MIOPEN_DEBUG_CONV_GEMM=0 (this can also be left out and invokers will actually compile and run as well)
export MIOPEN_CUSTOM_CACHE_DIR=test (must be same location as compile so it can find compiled kernels)
export MIOPEN_FIND_ENFORCE=4
export MIOPEN_COMPILE_AND_RUN=2 (anything but 0, can also be left unset)
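The rule described in the two lists above ("0" means compile only, anything else or unset means compile and run) can be written down as a small pure function. This is a sketch for discussion; `SearchMode` and `ModeFromEnv` are hypothetical names, not code from this PR.

```cpp
#include <cstring>

// Hypothetical enum naming the two behaviours described in the PR text.
enum class SearchMode
{
    CompileOnly,
    CompileAndRun
};

// raw: value of MIOPEN_COMPILE_AND_RUN, or nullptr when the variable is unset.
inline SearchMode ModeFromEnv(const char* raw)
{
    // Only the exact string "0" selects compile-only; unset, empty, and any
    // other value keep the normal compile-and-run behaviour.
    const bool compile_only = raw != nullptr && std::strcmp(raw, "0") == 0;
    return compile_only ? SearchMode::CompileOnly : SearchMode::CompileAndRun;
}
```

A caller would pass `std::getenv("MIOPEN_COMPILE_AND_RUN")` and branch on the returned mode.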
I have tested as such:
Feel free to propose some actual utest I can add.
Blocked by: #203