[ROCm] AMDGPU compiler fixes #41641
Deleting temporary files after compilation
g_hsacoCache.request_count++;
if (hit) g_hsacoCache.hit_count++;
if (!(g_hsacoCache.request_count % 50))
  VLOG(0) << "HSACO cache: " << g_hsacoCache.request_count << " requests, "
Can you bump this to VLOG(1)?
Yes
@@ -584,18 +640,29 @@ StatusOr<std::vector<uint8>> EmitModuleToHsaco(
  std::string tempdir_name = tempdir_vector.front();
  VLOG(1) << "Compile-time artifacts located at: " << tempdir_name;

  bool keep_tempfiles = false;
  TF_CHECK_OK(tensorflow::ReadBoolFromEnvVar("TF_ROCM_XLA_TEMPFILES",
Would TF_ROCM_KEEP_XLA_TEMPFILES be clearer?
Sure
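For readers unfamiliar with the pattern, the opt-in switch discussed above can be sketched in standard C++. This is only an illustration: `ReadBoolEnv` is a hypothetical helper, not TensorFlow's `ReadBoolFromEnvVar`, and the set of accepted spellings ("1", "true", "0", "false") is an assumption.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical stand-in for an environment-variable switch such as
// TF_ROCM_KEEP_XLA_TEMPFILES. Returns default_val when the variable is
// unset or holds a value this sketch does not recognize.
bool ReadBoolEnv(const char* name, bool default_val) {
  const char* raw = std::getenv(name);
  if (raw == nullptr) return default_val;
  const std::string value(raw);
  if (value == "1" || value == "true") return true;
  if (value == "0" || value == "false") return false;
  return default_val;
}
```

With a helper like this, the cleanup step would run only when `ReadBoolEnv("TF_ROCM_KEEP_XLA_TEMPFILES", false)` returns false, so temporary files are deleted by default and kept only on request.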
@@ -584,18 +640,29 @@ StatusOr<std::vector<uint8>> EmitModuleToHsaco(
  std::string tempdir_name = tempdir_vector.front();
Would it be possible (and better?) to pick a new sub-directory each time, instead of different file names?
Not sure what's the benefit in that. More library calls, and all the files are going to be deleted anyway. Also, there's a theoretical possibility that we won't be able to delete the subdirectory if it is still open in some subprocess when we get to the end of the function.
In any event, that is an additional feature and shouldn't go into this PR, I think.
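The naming scheme the PR actually uses is not shown in this thread. As a rough sketch of the "different file names" approach, one option is to combine the process ID with a per-process counter, so that concurrent processes (for example, one per GPU under Horovod) and repeated compilations within a process never collide. `UniqueTempName` and its name layout are hypothetical, and `getpid` assumes a POSIX system.

```cpp
#include <atomic>
#include <string>
#include <unistd.h>  // getpid (POSIX)

// Hypothetical helper: builds "<dir>/<pid>_<n><ext>", where n increases on
// every call within the process. The pid separates concurrent processes;
// the atomic counter separates calls (including calls from several threads).
std::string UniqueTempName(const std::string& dir, const std::string& ext) {
  static std::atomic<unsigned long> counter{0};
  const unsigned long n = counter.fetch_add(1);
  return dir + "/" + std::to_string(static_cast<long>(getpid())) + "_" +
         std::to_string(n) + ext;
}
```

This keeps everything in one directory, which matches the author's preference for fewer library calls over creating a sub-directory per compilation.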
std::string ir_str;
llvm::raw_string_ostream stream(ir_str);
stream << *module;
std::string str = stream.str();
You can use ir_str, no need to copy it into str.
OK
std::string str = stream.str();
// Delete the first two lines, since they usually vary even when the rest of
// the code is the same (but verify that they are what we expect).
if (str.size() >= 13 && str.substr(0, 13) == "; ModuleID = ") {
Would it make sense to use llvm string processing helpers here, or even llvm::Regex?
The code does the job as written. Why bring in nonstandard APIs and add more complexity?
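For reference, the header-stripping step can be written with the standard library alone. The sketch below is not the PR's code: `DropLineWithPrefix` and `NormalizeIr` are hypothetical names, and it assumes the second varying line begins with `source_filename = `, which is how LLVM normally prints it but is not visible in the snippet above.

```cpp
#include <cstddef>
#include <string>

// Removes the first line of `s` if it starts with `prefix`.
// Returns true when a line was removed, false when `s` is left untouched.
bool DropLineWithPrefix(std::string& s, const std::string& prefix) {
  if (s.compare(0, prefix.size(), prefix) != 0) return false;
  const std::size_t eol = s.find('\n');
  s.erase(0, eol == std::string::npos ? s.size() : eol + 1);
  return true;
}

// Strips the two header lines that tend to differ between otherwise
// identical modules, but only when they look as expected.
std::string NormalizeIr(std::string ir) {
  if (DropLineWithPrefix(ir, "; ModuleID = ")) {
    DropLineWithPrefix(ir, "source_filename = ");  // assumed second line
  }
  return ir;
}
```

Two modules that differ only in these header lines then normalize to the same string, which is what makes the text usable as a cache key.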
if (dump_lls) {
  static int hsaco_count = 0;
  char name[256];
  sprintf(name, "/tmp/%d.ll", hsaco_count);
I don't think we should use C libraries here. Can you please modernize this block?
OK
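One possible shape for the modernized block, shown here as a sketch rather than the change that was eventually committed: build the path with `std::string` and `std::to_string` instead of `sprintf` into a fixed buffer, and write the file with `std::ofstream`. `DumpPath` and `DumpIr` are hypothetical names, and the atomic counter is an addition of this sketch to keep the numbering safe across threads.

```cpp
#include <atomic>
#include <fstream>
#include <string>

// Builds "/tmp/<index>.ll" without a fixed-size character buffer.
std::string DumpPath(int index) {
  return "/tmp/" + std::to_string(index) + ".ll";
}

// Writes `ir` to the next numbered dump file; returns false on I/O failure.
bool DumpIr(const std::string& ir) {
  static std::atomic<int> hsaco_count{0};
  std::ofstream out(DumpPath(hsaco_count.fetch_add(1)));
  out << ir;
  return out.good();
}
```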
This PR:
- Ensures that the AMDGPU compiler's temporary files have distinct names from instance to instance (important for multi-GPU training, e.g. with Horovod) and that they are deleted after compilation.
- Adds an HSACO cache to speed up compilation when multiple identical IRs are requested.
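To make the caching idea concrete, here is a minimal sketch of a cache keyed on the IR text, with the request and hit counters mentioned in the review. It is an illustration under assumptions, not the PR's `g_hsacoCache`: the class name, the use of the full IR string as the key, and the mutex are all choices of this sketch.

```cpp
#include <cstdint>
#include <mutex>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical in-memory cache mapping IR text to a compiled HSACO binary.
class HsacoCacheSketch {
 public:
  // Counts the request; on a hit, copies the binary into *out and returns true.
  bool Find(const std::string& ir, std::vector<uint8_t>* out) {
    std::lock_guard<std::mutex> lock(mu_);
    ++request_count_;
    auto it = entries_.find(ir);
    if (it == entries_.end()) return false;
    ++hit_count_;
    *out = it->second;
    return true;
  }

  // Stores (or overwrites) the binary compiled from `ir`.
  void Add(const std::string& ir, const std::vector<uint8_t>& hsaco) {
    std::lock_guard<std::mutex> lock(mu_);
    entries_[ir] = hsaco;
  }

  int request_count() const {
    std::lock_guard<std::mutex> lock(mu_);
    return request_count_;
  }
  int hit_count() const {
    std::lock_guard<std::mutex> lock(mu_);
    return hit_count_;
  }

 private:
  mutable std::mutex mu_;
  std::unordered_map<std::string, std::vector<uint8_t>> entries_;
  int request_count_ = 0;
  int hit_count_ = 0;
};
```

A caller would try `Find` first, compile only on a miss, and then `Add` the result. For the key to match across compilations, the IR should have its instance-specific header lines removed before lookup.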