-
Notifications
You must be signed in to change notification settings - Fork 562
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
i#2440: Macros for creating conditional instructions #4500
i#2440: Macros for creating conditional instructions #4500
Conversation
Add macros to create conditional instructions Add functions conidtional predicate operands Add tests to verify the added functiononality
opnd_t | ||
opnd_create_cond(int cond) | ||
{ | ||
/* FIXME i#1569: Move definition of dr_pred_type_t to opnd.h for AArch64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please create an issue separate from the AArch64 master issue #1569 to track this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created: #4502.
int | ||
opnd_get_cond(opnd_t opnd) | ||
{ | ||
/* FIXME i#1569: Move definition of dr_pred_type_t to opnd.h for AArch64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above.
for (reg_id_t rn = DR_REG_X0; rn <= DR_REG_XZR; rn++) { | ||
if (rn == DR_REG_XSP) { | ||
continue; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the code format checker insist on having braces for single-line if
statements?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if code formatter insists on it. I believe using braces improves readability of code, so I'd rather keep it like this.
* \note AArch64-only. | ||
*/ | ||
opnd_t | ||
opnd_invert_cond(opnd_t opnd); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs some thought as to how to integrate with existing interfaces and code, especially with ARM code.
The API already has instr_invert_predicate
. I would agree it seems less of an instr-level operation, but it would be confusing to have two different sets of interfaces.
The API also already has predicates passed as dr_pred_type_t to things like XINST_CREATE_jump_cond
in arm/instr_create.h. If a new type of operand, even if it's an immediate underneath is created, it would be best to use it everywhere. Whatever is decided, it seems best to have the same approach for all platforms, rather than special AArch64-only functions that don't match ARM functions yet do the same thing. That makes for confusing interfaces and complex cross-platform clients.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be a very big change to move all the condition-related interfaces that currently exist on the level of instructions to the level of operands. I don't think I know enough about other platforms to come up with a uniform solution, and I don't know if such change actually makes sense for x86 (after all instr_invert_predicate
was create for some reason).
Anyway, I'd like to do it in steps rather than in a single big pull request which is very difficult to merge. At least some guidance, suggestions or ideas would be helpful.
I would start with #4502, but again as a separate patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now I could remove opnd_invert_cond
and use instr_invert_predicate
around instr_create_...
to flip the operand, however I'm not sure if instr_invert_predicate
is actually implemented for AArch64.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See https://dynamorio.org/dynamorio_docs/API_BT.html#sec_predication which lists one key current use of predicates/conditions:
- An instruction-level modifier that is not an operand:
instr_{get,set}_predicate
;INSTR_PRED
short form. While perhaps this could have been an operand (along with other things like condition code effects: tradeoffs with matching ISA conventions), it is not. Yet we want to be able to invert these predicates too. The currentinstr_invert_predicate
does not actually operate oninstr_t
: it operates ondr_pred_type_t
and thus operates on the instr-level predicates as well as operand-level.
Another current use, similar to the macros added here:
XINST_CREATE_jump_cond()
. This, on x86, arm, and aarch64, takes in adr_pred_type_t
-typed value. I would think that the signature/operation of this instruction should match things like theINSTR_CREATE_ccmn
added here. One shouldn't take adr_pred_type_t
while the other takes anopnd_t
. Maybe the argument could be made that we should change theXINST_CREATE_jump_cond
to take anopnd_t
of this new type: though note that the condition turns into different opcodes on x86, the instruction-level predicate field on arm, and an operand on aarch64, so they are not all treating it as an operand. See also other high-level discussions on what should constitute an operand versus an opcode or sub-opcode: i#4329: Implement effective address of DC cache operations #4386 (comment). Perhaps these aarch64 comparison should be using separate opcodes, rather than operands that change behavior which goes against what operands are supposed to do. Having a single opcode doing all these different comparisons is sort of like having one opcode "arithmetic" with an operand that says whether it's an add or a subtract or whatnot, instead of separate add and subtract opcodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WDYT about splitting up CCMN and CCMP?
Why only these instructions? All conditional instructions in AArch64 (all mentioned in this PR) have cond
operand.
Also, in my opinion, inventing non-documented opcodes may create extra confusion. If some notion is not common to the rest of the architectures it's not the reason to disallow low-level access as documented. I think in DynamoRIO, when we work at the level of instructions and operands, we don't really have the benefit of "write once run everywhere". In my understanding, the point of the tool is to allow working on the lowest possible level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The behavior comes from the opcode, not the operands.
How about compare and swap instructions (e.g. CAS)? The behaviour depends on the value of the register being equal to the value loaded from memory. It is similar to the conditional instructions, although in the latter case the comparison has happened before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be useful to think of the pseudocode that specifies instruction behaviors. CCMP is specified by one pseudocode function, and the operands are inputs to that function. That single behavior function acts uniformly on all values of the operands. So it makes sense to treat it as one instruction, rather than single out one operand and split it into one instruction for each value of that operand.
The situation with SYS instructions is very different (similar to CDP instructions in 32-bit Arm) - they really are separate functions, selected by a function code, and it does make much more sense to split them out, at least when their specific functions are being differently handled by DynamoRIO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AssadHashmi WDYT about splitting up CCMN and CCMP?
If I understand correctly we have 3 choices:
1 - Create a new operand for cond
.
2 - Split out the opcodes into specific cond
types, e.g. ccmp_eq
, ccmp_ne
etc.
3 - Use existing dr_pred_type_t
as in XINST_CREATE_jump_cond
(alias for INSTR_CREATE_bcond
).
1 requires a non-trivial amount of work for a type which has essentially the same intent as dr_pred_type_t
.
2 is not consistent with AArch64's existing codec. We may run into problems as I did in #4386
3 would be my choice. It makes use of an existing type with the same intent which has already been implemented and does not involve a lot of codec reworking.
e.g.
#define INSTR_CREATE_ccmp(dc, Rn, Op, nzcv, cond) \
(INSTR_PRED(instr_create_0dst_3src(dc, OP_ccmp, Rn, Op, nzcv), (cond)))
Which users can call with:
INSTR_CREATE_ccmp(dc, opnd_create_reg(DR_REG_X1),
opnd_create_immed_int(0b01010, OPSZ_5b),
opnd_create_immed_int(0b1111, OPSZ_4b), DR_PRED_EQ);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AssadHashmi highlights what is perhaps the main issue here: using a simple and consistent set of abstractions and types in the IR. If we take each encoded behavior immediate integer that AArch64 presents as an operand (but I would call a sub-opcode and another ISA might well make part of the opcode, such as x86 conditional moves) and place it as a separate operand in DR's IR, we then need to make a new operand type with a new set of rules and enums and handling code and introduce a new concept to tool writers for what could logically fit into existing concepts. We could end up with a dozen new operand types without much benefit but incurring non-trivial costs. Using an existing IR concept keeps the IR simpler and cleaner and more consistent.
we don't really have the benefit of "write once run everywhere".
We can make it as easy as possible, though. It does make a difference: each IR choice that lets a tool or analysis library be more platform-agnostic makes tool writing easier.
In my understanding, the point of the tool is to allow working on the lowest possible level.
None of any of the discussion here has any bearing on whether a tool writer can generate any precise machine code desired: we're only talking about how to represent ISA concepts in the DR IR. None of the choices affect whether the ISA concept can be represented or whether the tool writer can generate it. And while the IR should be more abstracted and cross-platform, we can and do have layers that are closer to the assembly conventions of that ISA: the INSTR_CREATE_ macros try to look more like assembly, disassembly has both an abstracted DR format and an ISA format, and we'd love to have an even closer layer where assembler strings are converted into IR (can't seem to find it in the tracker...#4510) when writing code for a specific machine. But code analyzing and manipulating at the IR level should have as few types and special operand concepts as possible.
core/ir/aarch64/instr_create.h
Outdated
* Creates a CCMN (Conditional Compare Negative) instruction. | ||
* \param dc The void * dcontext used to allocate memory for the instr_t. | ||
* \param Rn The source register. | ||
* \param Op Either a 5-bit immediate (use opnd_create_immed_uint(val, OPSZ_5b) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this really need to be created with the OPSZ_5b
size? Could we change that and have the encoder handle any declared size? Unless there are multiple encoding variants that differ in immediate sizes, the encoder shouldn't care so long as the value fits in 5 bits? It would be best for usability to not require anyone to specify all these precise bit counts. If a bit count does have to specified, there should be a named constant like OPSZ_ccmn
or sthg like x86 has for OPSZ_xlat
, OPSZ_lgdt
, etc., and in addition maybe this macro should take just the integer and supply the size or set the size of the opnd before passing it on or sthg.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For these instructions we need an unsigned integer immediate operand. The existing API for creating such operands (opnd_create_immed_uint
) requires to specify an opnd_size_t
value. How do you suggest to create an operand without using opnd_create_immed_uint
? Or do you suggest to change opnd_create_immed_uint
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Allow any size. Or additionally maybe add opnd_create_immed_uint_unsized
and internally it sets OPSZ_4 or sthg and the a64 encoder ignores the size since it has no encoding decisions based on size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought it is helpful to give an example of how to create a correct operand. It would be better if this macro took just an integer value, but most of INSTR_CREATE_
macros accept operands. I can remove OPSZ_5b
from the doc string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can remove OPSZ_5b from the doc string.
SGTM. The other suggestions (like an _unsized version) are separate/larger scope from this PR. I'm pretty sure the a64 encoder today ignores the size completely, so the user could pass OPSZ_NA or sthg.
Enables the diagnostic option -steal_reg_at_reset for AArch64, generalizing the existing ARM code. Switches -reset_at_fragment_count to work in release build by using the sum of the release stats num_bbs and num_traces. Adds a release-build syslog for an informational notification when any reset occurs. Adds a test of -steal_reg_at_reset. Includes the key fix for #4497 since it could show up on the new test or with use of this now-enabled option; the full fix for that with better tests for its code path will come in separately. Issue: #1569, #4497
Instead of using partial struct initialisation, initialise the required fields individually. We've found that the time taken to zero a large struct shows up as noticeable overhead for optimised clients. So, instead of using partial struct initialisation, we set the fields required individually.
DR translates a fault in the code cache to a fault at the corresponding application address. This is done using ilist reconstruction for the fragment where the fault occurred. But, this does not work as expected when the DR client changes instrumentation during execution; currently, drcachesim does this to enable tracing after -trace_after_instrs. The reconstructed basic block gets the new instrumentation whereas the one in code cache has the old one. This causes issues during fault handling. In the current drcachesim case, it appears as though a meta-instr has faulted because the reconstructed ilist has a meta-instr at the code cache fault pc. This issue may manifest differently if the basic block with the new instrumentation is smaller than the old one (unlike the drcachesim 'meta-instr faulted' case) and the faulting address lies beyond the end of the new instrumented basic block. We may see an ASSERT_NOT_REACHED due to the ilist walk ending before the faulting code cache pc was found in the reconstructed ilist. In the existing code, drcachesim attempts to avoid this by flushing old fragments using dr_unlink_flush_region after it switches to the tracing instrumentation. However, due to the flush being asynch, there's a race and the flush does not complete in time. This PR adds support for a callback in the synchronous dr_flush_region API. The callback is executed after the flush but before the threads are resumed. Using the dr_flush_region callback to change drcachesim instrumentation ensures that old instrumentation is not applied after the flush and the new one is not applied before. Fixes: #1369
Adds two gdb python scripts I've developed that may be useful to others: 1) drsymload: loads DR symbols regardless of gdb's current state, which may include having DR symbols at the wrong address. It does this by reading /proc/self/maps and running objdump on libdynamorio.so. Ideally this would be integrated into a revived libdynamorio.so-gdb.py: that's part of #2100. 2) memquery: prints the /proc/self/maps line for a given address. I'm still shocked gdb doesn't provide such a command natively. Issue: #2100
While investigating #4460 I found that reg_ptr_used in insert_save_addr is uninitialized locally and insert_obtain_addr only writes it when true, leaving it uninitialized for the false case: thus we may re-instate the buffer pointer in cases where we don't need to. I believe this is only a performance issue. Issue: #4460
Change-Id: Ia3699cc5eb8074a10d8ce9e6c362c1bffd0cf477
To sum up the discussion, I would like to freeze this PR until I have time to submit a patch for #4504 (or someone else does it). |
Add macros to create conditional instructions
Add functions conditional predicate operands
Add tests to verify the added functionality
Issue: #2440