Fix MasterSolutionLibrary indexing for multiple architecture build #1888
I'll leave it to the Tensile team to approve, but as soon as you can I would add a trivial test that uses your CSV output to validate the CSV itself and to check that counts (or something else) match the actual code object data. It could be a post_build CMake step or just a Python test.
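A test along these lines could be sketched as below. This is only an illustration under stated assumptions: the CSV layout is not shown in this thread, so the `architecture` column name, the helper names, and the `expected_counts` mapping (which would have to be derived from the actual code object data) are all hypothetical.

```python
import csv
import io
from collections import Counter


def count_rows_per_architecture(csv_text, arch_column="architecture"):
    """Count CSV rows per architecture (the column name is an assumption)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[arch_column] for row in reader)


def check_counts(csv_text, expected_counts):
    """Return a list of mismatch messages; an empty list means the counts agree."""
    actual = count_rows_per_architecture(csv_text)
    problems = []
    for arch in sorted(set(actual) | set(expected_counts)):
        found = actual.get(arch, 0)
        wanted = expected_counts.get(arch, 0)
        if found != wanted:
            problems.append(f"{arch}: csv has {found}, expected {wanted}")
    return problems
```

The same check could run as a post-build step by reading the generated file and failing the build when the returned list is non-empty.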
Tensile/SolutionLibrary.py
solutionIndexMap = {architectureName:int(offset*pow(2,16))
    for architectureName,offset in zip(architectureList,range(len(architectureList)))}
While you debug, you might just want to use the gfx number, as-is in hex, for the high 16 bits. I thought solution indices aren't going to be preserved across releases, so if there is ever trouble you can drop back to a sequential int.
Do you mean that you prefer the upper 16 bits to be, for example, 0x90a for gfx90a?
Yes, but if you think you'll need more than 65536 solutions for a given gfx soon, you may need to drop down to fewer bits for the gfx number.
The idea was just that, while analyzing and debugging, you don't have to look at the ordered architecture list to map the sequence number in the high 16 bits (4 hex digits) back to a gfx number; you can just read it off in hex.
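The suggestion in this thread can be sketched as follows. This is a minimal illustration, not the code in the PR: `encode_solution_index` and `decode_solution_index` are hypothetical helpers, and the sketch assumes plain architecture names such as `gfx90a` whose suffix parses as hex and fits in the high bits.

```python
from typing import Tuple

INDEX_BITS = 16                     # low bits hold the per-architecture index
INDEX_MASK = (1 << INDEX_BITS) - 1  # 0xFFFF


def encode_solution_index(arch_name: str, local_index: int) -> int:
    """Pack the gfx number (read as hex) above a per-architecture index."""
    if not arch_name.startswith("gfx"):
        raise ValueError(f"unexpected architecture name: {arch_name!r}")
    gfx_number = int(arch_name[3:], 16)  # "gfx90a" -> 0x90a
    if not 0 <= local_index <= INDEX_MASK:
        raise ValueError(
            f"local index {local_index} does not fit in {INDEX_BITS} bits"
        )
    return (gfx_number << INDEX_BITS) | local_index


def decode_solution_index(index: int) -> Tuple[str, int]:
    """Recover (architecture name, per-architecture index) from a packed index."""
    return f"gfx{index >> INDEX_BITS:x}", index & INDEX_MASK


print(hex(encode_solution_index("gfx90a", 5)))  # 0x90a0005
```

Printed in hex, the architecture is visible directly in the leading digits, which is the debugging convenience described above. The trade-off raised in the thread still applies: 16 low bits cap each architecture at 65536 solutions.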
newSolutions[curIndex] = s
curIndex += 1
You should add a guard against overflowing the chosen bucket size; it can use a named constant or 65536.
Bucket sizes are now 262144. Instead, I added a check for architecture clobbering.
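A guard of the kind discussed here might look like the sketch below. It is an illustration under assumptions rather than the PR's implementation: `assign_indices` and its arguments are hypothetical, the bucket size is the 262144 figure quoted above, and architectures are simply numbered in the order they are supplied.

```python
BUCKET_SIZE = 262144  # per-architecture index range quoted in the thread (2**18)


def assign_indices(solutions_by_arch, bucket_size=BUCKET_SIZE):
    """Give every solution a global index inside its architecture's bucket.

    solutions_by_arch: dict mapping architecture name -> list of solutions.
    Returns a dict mapping global index -> (architecture name, solution).
    """
    new_solutions = {}
    for bucket, (arch, solutions) in enumerate(solutions_by_arch.items()):
        # Refuse to spill into the next architecture's index range.
        if len(solutions) > bucket_size:
            raise ValueError(
                f"{arch}: {len(solutions)} solutions overflow a bucket of {bucket_size}"
            )
        cur_index = bucket * bucket_size
        for s in solutions:
            new_solutions[cur_index] = (arch, s)
            cur_index += 1
    return new_solutions
```

Raising before any index is written keeps one architecture's solutions from silently overwriting another's, which is the failure mode both the overflow guard and the clobbering check are meant to catch.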
I see the gfx906 test failed with the following error message: terminate called after throwing an instance of 'std::invalid_argument'. I am not sure whether this is caused by your change.
Some merged code from develop is showing up in your change list.
a9a58d0 to 6267093
I do not have any further comments.
Well, since you may have failures in develop right now, I will approve in the hope of nudging the other reviewers. I presume the validation scan was run manually and passed.
Can it be run on the failing pipeline?
@bragadeesh or @AlexBrownAMD, who is the scrum master for this sprint? You should want to get reviews within a day or so; or, @yenong-amd, you should assign the people who must review before merge to help push it along.
I think @lringham was going to test this on some of his tickets.
@nakajee Can you please help me merge? I don't have merge privilege. Thanks!
Done
Hotfix: Fix MasterSolutionLibrary indexing for multiple architecture build (#1888)
Implements a fix for solution index collision in multi-architecture build by