-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[NVPTX] improve identifier renaming for PTX #79459
Conversation
@Artem-B, could you please review this change when you have a minute? Thanks! |
if (C == '.' || C == '@' || C == '<' || C == '>') { | ||
ValidNameStream << "_$_"; | ||
} else { | ||
if (isAlnum(C) || C == '_' || C == '$') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PTX also allows '%' as the first character:
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#identifiers
followsym: [a-zA-Z0-9_$]
identifier: [a-zA-Z]{followsym}* | {[_$%]{followsym}+
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LLVM will throw a fatal error for '%' in symbol names in MCSymbol::print
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's worth a comment explaining the discrepancy between what PTX allows and what we're checking for here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, added
ValidNameStream << C; | ||
} else { | ||
ValidNameStream << "_$_"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh. I think we'll have a problem when we have symbols that have different invalid characters in the same place. E.g. str-
and str.
.
I think ideally we need to encode specific symbols instead. So, instead of just _$_
for all invalid characters, emit "$"+hex(char)
which will be unambiguous.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
setName
will add a postfix number to avoid collisions, so I think correctness is fine. Do you think adding the hex values in necessary for readability? I agree the current renaming leaves much to be desired but the hex scheme suggested will also have ambiguous cases. (i.e. str.
and str$2E
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
setName will add a postfix number to avoid collisions
That would make such escaping inconsistent between TUs which may be a problem for public symbols that need to match across TUs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This renaming is already limited to names with local linkage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I think we do have some bugs open for generating invalid global PTX identifiers, but that's a bigger, and somewhat different can of worms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, I agree there are a lot of unaddressed edge cases here but that's beyond the scope of this PR
7c91f18
to
a422079
Compare
@Artem-B, assuming there aren't any other issues you'd like me to fix, could you please land this on my behalf? Thanks! |
Update
NVPTXAssignValidGlobalNames
to convert all characters which are illegal in PTX identifiers to_$_
. (PTX ISA: 4.4 Identifiers).