Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

arm: Add ARM_OP_ADDR #771

Closed
wants to merge 2 commits into from
Closed

Conversation

akihikodaki
Copy link
Contributor

@akihikodaki akihikodaki commented Sep 8, 2016

ATTENTION: THIS CHANGE BREAKS COMPATIBILITY

I found the assumption that address is stored as ARM_OP_IMM is actually already broken with #762. However, it is no longer possible to put it to imm since address can be greater than INT32_MAX. Doing so can result in anything, especially faults in arithmetic operations.

Moreover, the old binding in Ocaml ignores bit 31 since the length of int in the language is 31 bits. It means addresses should be treated differently in the binding.

Though the series of changes is a bug fix, breaking compatibility is inevitable.

See also #765.

@radare
Copy link
Contributor

radare commented Apr 14, 2017

please rebase

@radare
Copy link
Contributor

radare commented Apr 14, 2017

ok to break compat for me unless it introduces regressions

@pranith
Copy link
Contributor

pranith commented Mar 4, 2021

Can you please generate a new PR on libcapstone?

@akihikodaki
Copy link
Contributor Author

Rebased to commit 3f46b83 with commit 2e6575f, and it is mirrored to libcapstone/libcapstone#3.

@pranith
Copy link
Contributor

pranith commented Mar 5, 2021

@aquynh LGTM.

@XVilka
Copy link
Contributor

XVilka commented Jun 16, 2023

Relevant to the ARM auto-sync work #1949 and #2045

@Rot127
Copy link
Collaborator

Rot127 commented Jun 16, 2023

Indeed it is. But I would not add ARM_OP_ADDR as additional operand type. The underlying problem is more that the immediate type is int32_t and not uint64_t + some information how to interpret it.

The new operand mapping tables contain the type of each operand. (although they seem to be off. E.g. the address is signed. Maybe we need to fix the td files for this:

Here an example:

{ /* ARM_VLD1LNdAsm_16 (384) - ARM_INS_VLD1 - vld1${p}.16	$list, $addr */
{
  { CS_OP_REG, CS_AC_READ, { CS_DATA_TYPE_f64, CS_DATA_TYPE_v8i8, CS_DATA_TYPE_v4i16, CS_DATA_TYPE_v2i32, CS_DATA_TYPE_v1i64, CS_DATA_TYPE_v2f32, CS_DATA_TYPE_v4f16, CS_DATA_TYPE_v4bf16, CS_DATA_TYPE_LAST } }, /* list - DPR */
  { CS_OP_IMM, CS_AC_READ, { CS_DATA_TYPE_i32, CS_DATA_TYPE_LAST } }, /* list - i32imm */
  { CS_OP_MEM | CS_OP_REG, CS_AC_READ, { CS_DATA_TYPE_i32, CS_DATA_TYPE_LAST } }, /* addr - GPR */
  { CS_OP_MEM | CS_OP_IMM, CS_AC_READ, { CS_DATA_TYPE_i32, CS_DATA_TYPE_LAST } }, /* addr - i32imm */
  { CS_OP_IMM, CS_AC_READ, { CS_DATA_TYPE_i32, CS_DATA_TYPE_LAST } }, /* p - i32imm */
  { CS_OP_IMM, CS_AC_READ, { CS_DATA_TYPE_i32, CS_DATA_TYPE_LAST } }, /* p - i32imm */
  { 0 }
}},

So I would propose to solve this problem by making arm_op->imm an uint64_t and add a helper function like

typedef union {
  uint32_t u32;
  int32_t s32;
  float f32;
  ...
} cast_result;

/// Cast the given @imm to its data type specified by LLVM.
/// It returns the cs_data_type the @imm was casted to.
/// The corresponding field in @result is set to the casted value.
cs_data_type map_cast_operand(uint64_t imm, cast_result *result);

@akihikodaki
Copy link
Contributor Author

I'm closing this pull request. As @Rot127 says, the sign of the immediate should be determined by the instruction.
ARM_OP_ADDR clarifies an immediate is unsigned if it's PC-relative, but it cannot tell the sign of the immediate otherwise. There should be some common mechanism to determine the sign of the immediate, and such a mechanism can be implemented easily with a table @Rot127 suggests.

However I suggest to leave imm 32-bit instead of converting it to uint64_t. There is no need for extending it to 64-bit.

@akihikodaki akihikodaki deleted the next_addr branch June 21, 2023 06:58
@Rot127
Copy link
Collaborator

Rot127 commented Jun 21, 2023

@akihikodaki Would you mind open an issue about it? Just so we do not forget it and we can assign it to a milestone.

However I suggest to leave imm 32-bit instead of converting it to uint64_t. There is no need for extending it to 64-bit.

Will include this suggestion in #1949

@akihikodaki
Copy link
Contributor Author

@akihikodaki Would you mind open an issue about it? Just so we do not forget it and we can assign it to a milestone.

Done: #2056

Rot127 added a commit to Rot127/capstone that referenced this pull request Jul 4, 2023
kabeor pushed a commit that referenced this pull request Jul 19, 2023
* Add auto-sync updater.

* Update Capstone core with auto-sync changes.

* Update ARM via auto-sync.

* Make changes to arch modules which are introduced by auto-sync.

* Update tests for ARM.

* Fix build warnings for make

* Remove meson.build

* Print shift amount in decimal

* Patch non LLVM register alias.

* Change type of immediate operand to unsiged (due to: #771)

* Replace all occurances of a register with its alias.

* Fix printing of signed imms

* Print rotate amount in decimal

* CHange imm type to int64_t to match LLVM imm type.

* Fix search for register names, by completing string first.

* Print ModImm operands always in decimal

* Use number format of previous capstone version.

* Correct implicit writes and update_flags according to SBit.

* Add missing test for RegImmShift

* Reverse incorrect comparision.

* Set shift information for move instructions.

* Set mem access for all memory operands

* Set subtracted flag if offset is negative.

* Add flag for post-index memory operands.

* Add detail op for BX_RET and MOVPCLR

* Use instruction post_index operand.

* Add VPOP and VPUSH as unique CS IDs.

* Add shifting info for MOVsr.

* Add TODOs.

* Add in LLVM hardcoded operands to detail.

* Move detail editing from InstPrinter to Mapping

* Formatting

* Add removed check.

* Add writeback register and constraints to RFEI instructions.

* Translate shift immediate

* Print negative immediates

* Remove duplicate invalid entry

* Add CS groups to instructions

* Fix write attriutes of stores.

* Add missing names of added instructions

* Fix LLVM bug

* Add more post_index flags

* http -> https

* Make generated functions static

* Remove tab prefix for alias instructions.

* Set ValidateMCOperand to NULL.

* Fix AddrMode3Operand operands

* Allow getting system and banked register name via API

* Add writeback to STC/LDC instructions.

* Fix (hopefully) last case where disp is negative and subtracted = true

* Remove accidentially introduced regressions
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants