-
Notifications
You must be signed in to change notification settings - Fork 209
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Cavium Octeon MIPS instructions, and general MIPS fixes #5615
Conversation
This makes it easier to add opcodes, and also is a little easier on the eyes. This also makes it more explicit that there won't be any out-of-bound read weirdness if accessing a row/column of an array that hasn't been specified.
These were some instructions that were added in MIPS III, but also exist in V, VI, but weren't in the decode tables.
This is to prepare for adding an option to support Cavium MIPS instructions.
These were in version 4 for some reason, but not version 6...
This was using the wrong shift direction, and also an unintuitive order in the AND masking.
This wasn't performing the shift at all, and was masking the wrong value for the register whose value to keep. The order of the ANDs was also unintuitive, starting with the constant.
There are a bunch of these in the linux kernel I'm working with, and this helps cut down on the number of "unimplemented LLIL" to sift through.
This also sets the second immediate to 0 to clarify that cop2 only has one immediate field.
Note that we don't have any code samples that use these instructions yet, but they're basically the same as their 32-bit counterparts in terms of disassembly.
Bitwise instructions (AND, ANDI, NOR, OR, ORI, XOR, XORI...there may be others, they're not conveniently listed anywhere) don't sign-extend the result, so for 64-bit architectures the whole register gets used.
Previously, this didn't account for a carry bit from the add to LO going to the add of HI, and it's a bit clearer with register splitting anyways.
These just sign extend to the width of the register.
With the exception of the unaligned store/loads, this should be all of the cavium instructions. We don't have any code samples for VMM0 or VMULU, so we leave them unlifted for now.
This seemed to have assumed that if the code reaches the point where it's lifting a lone LWL/LWR/SWL/SWR, then it didn't get caught by the code that checked for pairs of LWL/LWR/SWL/SWR, and so it must be a 2-byte read...however, this assumption doesn't hold for code like this: ``` swl v0, (v1) <random other instructions> swr v0, 3(v1) ``` In this case, the intent is still an unaligned store, just split up for whatever reason the compiler did it, and this isn't a 2-byte store, which is what the LLIL was. The actual LLIL is going to be a lot trickier, but in the meantime it's probably better to admit we can't lift this code, rather than lifting it incorrectly.
If BEQZ gets to exist, it only makes sense to make BNEZ too.
noone seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
Note that LWL/LWR hasn't been tested as much, since the only places these are used in the binary sample I have are right next to each other, so they get treated as a pair and lifted together. There didn't seem to be a clean way to lift these instructions that wouldn't involve masking pointers, which isn't friendly to users or automated analysis, so I settled on intrinsics. One fidelity trade-off is that the LLIL indicates that the entire 32-bit/64-bit value is read/written to for each instruction. The decision to be optimistic about how many bytes get accessed was chosen to help make sure structure xrefs get noticed. For example, suppose there's a 4-byte value at offset 7 of a struct; when this gets set by SWL rX, 7(rY)/SWR rX, 0xa(rY), we want to make sure it gets caught by each instruction, since each one writes that offset. An unfortunate side effect is that in this example, the LLIL indicates that offset 0xb also gets written to by the SWR instruction, which is not true. However, it's generally more annoying when xrefs aren't originally found than having to filter through xrefs that do exist, so this trade-off seemed worth it to me.
So I do believe we can support this as a separate architecture using some of the newer platform callbacks. I can look them up when I get to a computer, but we do this now for a couple of architectures. |
This uses a way less hacky way to recognize whether or not to use the octeon MIPS architecture versus the vanilla MIPS architecture than a settings menu.
il.Const(registerSize, (1<<op4.immediate)-1), | ||
il.ShiftLeft(registerSize, ReadILOperand(il, instr, 2, registerSize), | ||
il.Const(1, op3.immediate))))); | ||
il.LogicalShiftRight(registerSize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch!
As it stands right now this does not cause any regressions and the changes which do impact normal MIPS binaries are positive, I am inclined to merge this as-is. The Cavium specific instructions are not easily reviewable, however given the other changes to the MIPS architecture were correct and high-quality, the lifting of those instructions are not a cause for concern. |
Added in |
There's one minor change I'm tempted to make to this after the next stable is out, that would be to look into For the samples I've seen, I think would be generally viable and clean up intrinsics somewhat, but my sample size hasn't been enormous. You have any thoughts about this? This is strictly speaking less precise but possibly more useful. |
Bear with me if I'm doing anything weird, this is literally the first time I've ever made a pull request on github. I also do most of my work on a non-internet-connected machine (whose git user name is noone), so hopefully nothing weird that came from that either.
This pull request adds support for Cavium MIPS instructions; these processors are also MIPS64 processors, so I added missing MIPS64 instructions that make Linux disassemble more nicely. While I was doing work here, I figured I might as well fix some lifting bugs as I found them.
Explanations for some potentially head-scratching decisions:
Un-done work:
An unsolicited suggestion would be that we don't need all of the MIPS version decode tables for 1 through 6; only a few are used anyways (the one for MIPS32, MIPS64, and cavium64), and as far as I can tell there aren't any forwards-incompatible instructions; the one example I saw was MIPS_LHI that only exists in version 1, which isn't an instruction in the documentation and only seems to be in SPIM documentation.