i#2440: Macros for creating conditional instructions #4500

yury-khrustalev · 2020-10-29T10:11:14Z

Add macros to create conditional instructions
Add functions conditional predicate operands
Add tests to verify the added functionality

Issue: #2440

Add macros to create conditional instructions Add functions conidtional predicate operands Add tests to verify the added functiononality

AssadHashmi · 2020-10-29T16:09:29Z

core/ir/opnd_shared.c

+opnd_t
+opnd_create_cond(int cond)
+{
+    /* FIXME i#1569: Move definition of dr_pred_type_t to opnd.h for AArch64


Please create an issue separate from the AArch64 master issue #1569 to track this.

Created: #4502.

AssadHashmi · 2020-10-29T16:09:47Z

core/ir/opnd_shared.c

+int
+opnd_get_cond(opnd_t opnd)
+{
+    /* FIXME i#1569: Move definition of dr_pred_type_t to opnd.h for AArch64


AssadHashmi · 2020-10-29T16:12:50Z

suite/tests/api/ir_aarch64.c

+    for (reg_id_t rn = DR_REG_X0; rn <= DR_REG_XZR; rn++) {
+        if (rn == DR_REG_XSP) {
+            continue;
+        }


Does the code format checker insist on having braces for single-line if statements?

I'm not sure if code formatter insists on it. I believe using braces improves readability of code, so I'd rather keep it like this.

derekbruening · 2020-10-29T18:26:36Z

core/ir/opnd.h

+ * \note AArch64-only.
+ */
+opnd_t
+opnd_invert_cond(opnd_t opnd);


I think this needs some thought as to how to integrate with existing interfaces and code, especially with ARM code.

The API already has instr_invert_predicate. I would agree it seems less of an instr-level operation, but it would be confusing to have two different sets of interfaces.

The API also already has predicates passed as dr_pred_type_t to things like XINST_CREATE_jump_cond in arm/instr_create.h. If a new type of operand, even if it's an immediate underneath is created, it would be best to use it everywhere. Whatever is decided, it seems best to have the same approach for all platforms, rather than special AArch64-only functions that don't match ARM functions yet do the same thing. That makes for confusing interfaces and complex cross-platform clients.

It would be a very big change to move all the condition-related interfaces that currently exist on the level of instructions to the level of operands. I don't think I know enough about other platforms to come up with a uniform solution, and I don't know if such change actually makes sense for x86 (after all instr_invert_predicate was create for some reason).

Anyway, I'd like to do it in steps rather than in a single big pull request which is very difficult to merge. At least some guidance, suggestions or ideas would be helpful.

I would start with #4502, but again as a separate patch.

Right now I could remove opnd_invert_cond and use instr_invert_predicate around instr_create_... to flip the operand, however I'm not sure if instr_invert_predicate is actually implemented for AArch64.

See https://dynamorio.org/dynamorio_docs/API_BT.html#sec_predication which lists one key current use of predicates/conditions:

An instruction-level modifier that is not an operand: instr_{get,set}_predicate; INSTR_PRED short form. While perhaps this could have been an operand (along with other things like condition code effects: tradeoffs with matching ISA conventions), it is not. Yet we want to be able to invert these predicates too. The current instr_invert_predicate does not actually operate on instr_t: it operates on dr_pred_type_t and thus operates on the instr-level predicates as well as operand-level.

Another current use, similar to the macros added here:

XINST_CREATE_jump_cond(). This, on x86, arm, and aarch64, takes in a dr_pred_type_t-typed value. I would think that the signature/operation of this instruction should match things like the INSTR_CREATE_ccmn added here. One shouldn't take a dr_pred_type_t while the other takes an opnd_t. Maybe the argument could be made that we should change the XINST_CREATE_jump_cond to take an opnd_t of this new type: though note that the condition turns into different opcodes on x86, the instruction-level predicate field on arm, and an operand on aarch64, so they are not all treating it as an operand. See also other high-level discussions on what should constitute an operand versus an opcode or sub-opcode: i#4329: Implement effective address of DC cache operations #4386 (comment). Perhaps these aarch64 comparison should be using separate opcodes, rather than operands that change behavior which goes against what operands are supposed to do. Having a single opcode doing all these different comparisons is sort of like having one opcode "arithmetic" with an operand that says whether it's an add or a subtract or whatnot, instead of separate add and subtract opcodes.

Continuing in the thread of the last few sentences: xref other issues where we want to split up the current aarch64 opcodes which have behavior-controlling integer operands which are perhaps better as separate opcodes rather than operands: #4388, #4393.

WDYT about splitting up CCMN and CCMP?

Why only these instructions? All conditional instructions in AArch64 (all mentioned in this PR) have cond operand.

Also, in my opinion, inventing non-documented opcodes may create extra confusion. If some notion is not common to the rest of the architectures it's not the reason to disallow low-level access as documented. I think in DynamoRIO, when we work at the level of instructions and operands, we don't really have the benefit of "write once run everywhere". In my understanding, the point of the tool is to allow working on the lowest possible level.

The behavior comes from the opcode, not the operands.

How about compare and swap instructions (e.g. CAS)? The behaviour depends on the value of the register being equal to the value loaded from memory. It is similar to the conditional instructions, although in the latter case the comparison has happened before.

It might be useful to think of the pseudocode that specifies instruction behaviors. CCMP is specified by one pseudocode function, and the operands are inputs to that function. That single behavior function acts uniformly on all values of the operands. So it makes sense to treat it as one instruction, rather than single out one operand and split it into one instruction for each value of that operand.

The situation with SYS instructions is very different (similar to CDP instructions in 32-bit Arm) - they really are separate functions, selected by a function code, and it does make much more sense to split them out, at least when their specific functions are being differently handled by DynamoRIO.

@AssadHashmi WDYT about splitting up CCMN and CCMP?

If I understand correctly we have 3 choices:

1 - Create a new operand for cond.
2 - Split out the opcodes into specific cond types, e.g. ccmp_eq, ccmp_ne etc.
3 - Use existing dr_pred_type_t as in XINST_CREATE_jump_cond (alias for INSTR_CREATE_bcond).

1 requires a non-trivial amount of work for a type which has essentially the same intent as dr_pred_type_t.

2 is not consistent with AArch64's existing codec. We may run into problems as I did in #4386

3 would be my choice. It makes use of an existing type with the same intent which has already been implemented and does not involve a lot of codec reworking.
e.g.

#define INSTR_CREATE_ccmp(dc, Rn, Op, nzcv, cond) \ (INSTR_PRED(instr_create_0dst_3src(dc, OP_ccmp, Rn, Op, nzcv), (cond)))

Which users can call with:

INSTR_CREATE_ccmp(dc, opnd_create_reg(DR_REG_X1), opnd_create_immed_int(0b01010, OPSZ_5b), opnd_create_immed_int(0b1111, OPSZ_4b), DR_PRED_EQ);

@AssadHashmi highlights what is perhaps the main issue here: using a simple and consistent set of abstractions and types in the IR. If we take each encoded behavior immediate integer that AArch64 presents as an operand (but I would call a sub-opcode and another ISA might well make part of the opcode, such as x86 conditional moves) and place it as a separate operand in DR's IR, we then need to make a new operand type with a new set of rules and enums and handling code and introduce a new concept to tool writers for what could logically fit into existing concepts. We could end up with a dozen new operand types without much benefit but incurring non-trivial costs. Using an existing IR concept keeps the IR simpler and cleaner and more consistent.

we don't really have the benefit of "write once run everywhere".

We can make it as easy as possible, though. It does make a difference: each IR choice that lets a tool or analysis library be more platform-agnostic makes tool writing easier.

In my understanding, the point of the tool is to allow working on the lowest possible level.

None of any of the discussion here has any bearing on whether a tool writer can generate any precise machine code desired: we're only talking about how to represent ISA concepts in the DR IR. None of the choices affect whether the ISA concept can be represented or whether the tool writer can generate it. And while the IR should be more abstracted and cross-platform, we can and do have layers that are closer to the assembly conventions of that ISA: the INSTR_CREATE_ macros try to look more like assembly, disassembly has both an abstracted DR format and an ISA format, and we'd love to have an even closer layer where assembler strings are converted into IR (can't seem to find it in the tracker...#4510) when writing code for a specific machine. But code analyzing and manipulating at the IR level should have as few types and special operand concepts as possible.

derekbruening · 2020-10-30T03:02:43Z

core/ir/aarch64/instr_create.h

+ * Creates a CCMN (Conditional Compare Negative) instruction.
+ * \param dc      The void * dcontext used to allocate memory for the instr_t.
+ * \param Rn      The source register.
+ * \param Op      Either a 5-bit immediate (use opnd_create_immed_uint(val, OPSZ_5b)


Does this really need to be created with the OPSZ_5b size? Could we change that and have the encoder handle any declared size? Unless there are multiple encoding variants that differ in immediate sizes, the encoder shouldn't care so long as the value fits in 5 bits? It would be best for usability to not require anyone to specify all these precise bit counts. If a bit count does have to specified, there should be a named constant like OPSZ_ccmn or sthg like x86 has for OPSZ_xlat, OPSZ_lgdt, etc., and in addition maybe this macro should take just the integer and supply the size or set the size of the opnd before passing it on or sthg.

For these instructions we need an unsigned integer immediate operand. The existing API for creating such operands (opnd_create_immed_uint) requires to specify an opnd_size_t value. How do you suggest to create an operand without using opnd_create_immed_uint? Or do you suggest to change opnd_create_immed_uint?

Allow any size. Or additionally maybe add opnd_create_immed_uint_unsized and internally it sets OPSZ_4 or sthg and the a64 encoder ignores the size since it has no encoding decisions based on size.

I thought it is helpful to give an example of how to create a correct operand. It would be better if this macro took just an integer value, but most of INSTR_CREATE_ macros accept operands. I can remove OPSZ_5b from the doc string.

I can remove OPSZ_5b from the doc string.

SGTM. The other suggestions (like an _unsized version) are separate/larger scope from this PR. I'm pretty sure the a64 encoder today ignores the size completely, so the user could pass OPSZ_NA or sthg.

Enables the diagnostic option -steal_reg_at_reset for AArch64, generalizing the existing ARM code. Switches -reset_at_fragment_count to work in release build by using the sum of the release stats num_bbs and num_traces. Adds a release-build syslog for an informational notification when any reset occurs. Adds a test of -steal_reg_at_reset. Includes the key fix for #4497 since it could show up on the new test or with use of this now-enabled option; the full fix for that with better tests for its code path will come in separately. Issue: #1569, #4497

Instead of using partial struct initialisation, initialise the required fields individually. We've found that the time taken to zero a large struct shows up as noticeable overhead for optimised clients. So, instead of using partial struct initialisation, we set the fields required individually.

DR translates a fault in the code cache to a fault at the corresponding application address. This is done using ilist reconstruction for the fragment where the fault occurred. But, this does not work as expected when the DR client changes instrumentation during execution; currently, drcachesim does this to enable tracing after -trace_after_instrs. The reconstructed basic block gets the new instrumentation whereas the one in code cache has the old one. This causes issues during fault handling. In the current drcachesim case, it appears as though a meta-instr has faulted because the reconstructed ilist has a meta-instr at the code cache fault pc. This issue may manifest differently if the basic block with the new instrumentation is smaller than the old one (unlike the drcachesim 'meta-instr faulted' case) and the faulting address lies beyond the end of the new instrumented basic block. We may see an ASSERT_NOT_REACHED due to the ilist walk ending before the faulting code cache pc was found in the reconstructed ilist. In the existing code, drcachesim attempts to avoid this by flushing old fragments using dr_unlink_flush_region after it switches to the tracing instrumentation. However, due to the flush being asynch, there's a race and the flush does not complete in time. This PR adds support for a callback in the synchronous dr_flush_region API. The callback is executed after the flush but before the threads are resumed. Using the dr_flush_region callback to change drcachesim instrumentation ensures that old instrumentation is not applied after the flush and the new one is not applied before. Fixes: #1369

Adds two gdb python scripts I've developed that may be useful to others: 1) drsymload: loads DR symbols regardless of gdb's current state, which may include having DR symbols at the wrong address. It does this by reading /proc/self/maps and running objdump on libdynamorio.so. Ideally this would be integrated into a revived libdynamorio.so-gdb.py: that's part of #2100. 2) memquery: prints the /proc/self/maps line for a given address. I'm still shocked gdb doesn't provide such a command natively. Issue: #2100

While investigating #4460 I found that reg_ptr_used in insert_save_addr is uninitialized locally and insert_obtain_addr only writes it when true, leaving it uninitialized for the false case: thus we may re-instate the buffer pointer in cases where we don't need to. I believe this is only a performance issue. Issue: #4460

Change-Id: Ia3699cc5eb8074a10d8ce9e6c362c1bffd0cf477

yury-khrustalev · 2020-11-03T19:13:02Z

To sum up the discussion, I would like to freeze this PR until I have time to submit a patch for #4504 (or someone else does it).
The idea is to use instr_invert_predicate function instead of opnd_invert_cond proposed here (which will be remove from the patch). I hope this is OK.

) This adds immediate and register variants of CCMP and CCMN to the codec. The implementation also addresses the issues raised in PR #4500. Issues: #2440, #2443, #2626

yury-khrustalev added 4 commits October 29, 2020 10:06

i#2440: Macros for creating conditional instructions

14bfb4b

Add macros to create conditional instructions Add functions conidtional predicate operands Add tests to verify the added functiononality

i#2440: Fix formatting issues

08ce2c4

i#2440: Fix more formatting issues

3c13a13

i#2440: Fix more formatting issues

16f30ba

AssadHashmi assigned yury-khrustalev Oct 29, 2020

AssadHashmi reviewed Oct 29, 2020

View reviewed changes

derekbruening requested changes Oct 29, 2020

View reviewed changes

derekbruening reviewed Oct 30, 2020

View reviewed changes

derekbruening and others added 7 commits November 2, 2020 10:11

Merge branch 'master' into i2440-add-conditional-instr-macros

f985439

i#2440: Remove sizes of imm operands from docstings

c84871d

Change-Id: Ia3699cc5eb8074a10d8ce9e6c362c1bffd0cf477

yury-khrustalev mentioned this pull request Dec 29, 2020

i#4504: Implement instr_invert_predicate for AArch64 #4637

Merged

yury-khrustalev closed this Apr 17, 2021

derekbruening mentioned this pull request Nov 18, 2021

i#2440: AArch64 v8.0 codec: add CCMP and CCMN INSTR_CREATE macros #5216

Merged

AssadHashmi mentioned this pull request Jul 18, 2024

AArch64 CCMP is not marked as reading NZCV, nor writing NZCV unconditionally #6871

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

i#2440: Macros for creating conditional instructions #4500

i#2440: Macros for creating conditional instructions #4500

yury-khrustalev commented Oct 29, 2020 •

edited

Loading

AssadHashmi Oct 29, 2020

yury-khrustalev Oct 29, 2020

AssadHashmi Oct 29, 2020

AssadHashmi Oct 29, 2020

yury-khrustalev Oct 29, 2020

derekbruening Oct 29, 2020

yury-khrustalev Oct 29, 2020

yury-khrustalev Oct 29, 2020

derekbruening Oct 30, 2020

derekbruening Oct 30, 2020

yury-khrustalev Oct 30, 2020

yury-khrustalev Oct 30, 2020

algrant-arm Nov 2, 2020

AssadHashmi Nov 2, 2020

derekbruening Nov 2, 2020

derekbruening Oct 30, 2020

yury-khrustalev Oct 30, 2020

derekbruening Oct 30, 2020

yury-khrustalev Oct 30, 2020

derekbruening Oct 30, 2020

yury-khrustalev commented Nov 3, 2020

i#2440: Macros for creating conditional instructions #4500

i#2440: Macros for creating conditional instructions #4500

Conversation

yury-khrustalev commented Oct 29, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

yury-khrustalev commented Nov 3, 2020

yury-khrustalev commented Oct 29, 2020 •

edited

Loading