New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue 11788 - [x86] Valgrind unhandled instruction bytes: 0xC8 0x8 0x0 0x0 #3609
Conversation
…8 0x0 0x0 Use combination of PUSH, MOV and SUB instead of unsupported ENTER instruction
Thanks, I look forward to being able to use valgrind. |
Unfortunately it's going to take more than this patch. Next up is this: http://valgrind.10908.n7.nabble.com/how-to-disable-unhandled-instruction-bytes-error-reporting-td42100.html |
Are the remaining unhandled instructions only floating point? Or will it still fault on certain integer instructions? |
I don't know - with this patch hello world will work under valgrind, but that doesn't mean there aren't other broken integer instructions. |
buf->writeByte(0); | ||
buf->writeByte(0); | ||
off += 6; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a rather inefficient way to do things, with all the repeated calls to writeByte. A smaller way is to use a fixed size array, assigning the values to the array, then passing the array in one call to buf->write():
unsigned char code[11];
unsigned char* p = &code[0];
code[0] = 0x50 + BP;
if (I64)
{ code[1] = REX | REX_W;
++p;
}
p[1] = 0x8B;
p[2] = modregrm(3,BP,SP);
if (align)
{
if (I64)
{
p[3] = REX | REX_W;
++p;
}
p[3] = 0x81;
... etc ...
}
buf->write(code, p - code + 3);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't that basically the same thing? Doesn't the outbuf essentially do this internally? For code like this that is run only once per program, I would think clarity is more important, and the code is dead simple.
I'll change it if you feel strongly about it, let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right in that the code is logically the same. But if you look at the object code, you'll find my proposed version is dramatically shorter. I'd prefer the shorter version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer @yebblies version, it's easier to read & maintain.
What's the point of the micro-optimization? Tradeoff the time of people maintaining the backend for a few bytes of object code? HDD is cheap and life is short.
Perhaps if you we're more active in the maintenance of the backend this type of "optimization" would be more palatable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the point of the micro-optimization?
Sloth and bloat seeps in through a thousand cracks.
Perhaps if you we're more active in the maintenance of the backend this type of "optimization" would be more palatable.
Uncalled for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right in that the code is logically the same. But if you look at the object code, you'll find my proposed version is dramatically shorter. I'd prefer the shorter version.
Looking at the code again, your version is ridiculously complicated, and does not match the surrounding code, which uses repeated calls to writeByte
. It also does not appear to be significantly shorter, except for the fact that you removed the comments and some of the code.
This is not a frequently-called function, and micro-optimizing it at the expense of readability is not something I'm going to do.
Also note that align
is usually (always?) zero, resulting in the same number of calls to writeByte
as the original code.
Please reconsider merging this as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If align is 0, your version does 3 calls to writeByte() while mine does one call to write().
I do not agree that my version is ridiculously complicated, all it does is simply fill an array. I understand your argument about micro-optimizing something, but in this case do not agree with it. The road to a slow compiler is filled with individually insignificant slowdowns, and a primary attraction of dmd is ridiculously fast compile times.
But I turned on auto-merge because I'm tired of arguing about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It also does not appear to be significantly shorter
It is in terms of generated code. Take a look at the asm output and see.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If align is 0, your version does 3 calls to writeByte() while mine does one call to write().
And the original code does 4 calls. (And I don't think it's actually possible for align to be non-zero, so this code is already an optimization)
I do not agree that my version is ridiculously complicated, all it does is simply fill an array. I understand your argument about micro-optimizing something, but in this case do not agree with it. The road to a slow compiler is filled with individually insignificant slowdowns, and a primary attraction of dmd is ridiculously fast compile times.
Since "ridiculously complicated" is a very subjective term I guess we'll have to agree to disagree here. I do not consider the readability hit worth the unnoticeable difference in performance. Also, I would have done it that way if the surrounding code did it that way.
But I turned on auto-merge because I'm tired of arguing about it.
Thank you.
Auto-merge toggled on |
Issue 11788 - [x86] Valgrind unhandled instruction bytes: 0xC8 0x8 0x0 0x0
Use combination of PUSH, MOV and SUB instead of unsupported ENTER instruction, as recommended here.
Also, why the hell is this code generated manually? @WalterBright