Possible Add Mips architecture. #14
Comments
I guess I could do some work based on the ARM target, and by looking at some code from the Boomerang project, for both PPC and MIPS. |
Not sure if it is as easy as adding mips to https://github.com/yegord/snowman/blob/master/src/nc/core/arch/ArchitectureRepository.cpp |
Should be pretty straightforward if I use my Boomerang skills (I might abandon the Boomerang project (as I am the admin of it) in favor of this one). |
Should not be too difficult. One needs to write a MipsArchitecture class, add it to ArchitectureRepository, make the parsers aware of this new architecture, write MipsDisassembler, make sure that it works (add test inputs, run If someone is willing to invest his/her time in this, I will be glad to guide you through the details. |
As said, being the admin of http://boomerang.sf.net, it should not be too difficult for me to re-implement that stuff. |
Then assign the issue to yourself and go ahead. :) |
I guess implementing the disassembly for MIPS was quite an easy task (took me about 2h max). There should not be many problems implementing SPARC and PPC support either. |
@techbliss you can clone it from my fork... ;-) And no, I haven't implemented the registers and instruction set for the decompiler yet, so hold your horses. |
@yegord One question: What is the 'domain' in the register aspect? |
When you describe the semantics of an instruction in IR, you describe it in terms of operations on a certain virtual machine. This virtual machine has multiple flat non-overlapping address spaces. Consequently, a memory location in this virtual machine is identified by a triplet (domain, address, size), where domain identifies one of the non-overlapping address spaces. In the code the triplet is usually stored as an ir::MemoryLocation object. Registers are obviously a kind of memory, so they must be assigned a MemoryLocation. In MipsRegisterTable.i you specify the domain, address, and size for each register's MemoryLocation. The only requirement is that the MemoryLocations of non-overlapping registers do not overlap, and those of overlapping registers overlap in the right parts. See, for example, the definition of RAX, EAX, AX, AL, AH in X86RegisterTable.i. |
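To make the (domain, address, size) triplet concrete, here is a minimal sketch of the overlap rule. The struct below is a hypothetical stand-in written for this thread, not the real ir::MemoryLocation, and the domain numbers and bit offsets are illustrative assumptions rather than the values in X86RegisterTable.i.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for ir::MemoryLocation: (domain, bit address, bit size).
struct MemoryLocation {
    int domain;
    std::int64_t addr;  // offset in bits within the domain
    std::int64_t size;  // width in bits

    std::int64_t end() const { return addr + size; }

    // Two locations overlap when they share a domain and their bit ranges intersect.
    bool overlaps(const MemoryLocation &other) const {
        return domain == other.domain && addr < other.end() && other.addr < end();
    }

    // True when `other` lies entirely inside this location.
    bool covers(const MemoryLocation &other) const {
        return domain == other.domain && addr <= other.addr && other.end() <= end();
    }
};

// Illustrative domain numbers (assumed, not taken from the real register table).
constexpr int DOMAIN_A = 1;
constexpr int DOMAIN_B = 2;

const MemoryLocation RAX{DOMAIN_A, 0, 64};
const MemoryLocation EAX{DOMAIN_A, 0, 32};  // low half of RAX
const MemoryLocation AX{DOMAIN_A, 0, 16};   // low half of EAX
const MemoryLocation AL{DOMAIN_A, 0, 8};    // low byte of AX
const MemoryLocation AH{DOMAIN_A, 8, 8};    // high byte of AX
const MemoryLocation RBX{DOMAIN_B, 0, 64};  // unrelated register
```

With this layout, AL and AH both sit inside AX without touching each other, and RBX never overlaps RAX because it lives in a different domain.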
Hmm... I guess indirectly accessed registers such as $lo and $hi would qualify for another domain on the MIPS arch. |
I am not a specialist in MIPS, but LO and HI look like usual registers, so, no special handling is required. You can put them anywhere, as long as they do not overlap with the others. |
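As a rough illustration of "put them anywhere, as long as they do not overlap", the sketch below lays out the 32 general-purpose registers back to back and places HI and LO right after them. The struct, the single domain number, and the 32-bit slot width are assumptions made for this example; this is not the contents of MipsRegisterTable.i.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical register table entry: name plus (domain, bit address, bit size).
struct RegisterSlot {
    const char *name;
    int domain;
    std::int64_t addr;
    std::int64_t size;
};

constexpr int MIPS_REGISTER_DOMAIN = 1;  // assumed domain number
constexpr std::int64_t REG_BITS = 32;    // assumed 32-bit registers

// Lays out $zero..$ra back to back, then HI and LO in the next two free slots.
std::vector<RegisterSlot> mipsRegisterLayout() {
    static const char *const gpr[32] = {
        "zero", "at", "v0", "v1", "a0", "a1", "a2", "a3",
        "t0", "t1", "t2", "t3", "t4", "t5", "t6", "t7",
        "s0", "s1", "s2", "s3", "s4", "s5", "s6", "s7",
        "t8", "t9", "k0", "k1", "gp", "sp", "fp", "ra"};
    std::vector<RegisterSlot> slots;
    for (int i = 0; i < 32; ++i) {
        slots.push_back({gpr[i], MIPS_REGISTER_DOMAIN, i * REG_BITS, REG_BITS});
    }
    slots.push_back({"hi", MIPS_REGISTER_DOMAIN, 32 * REG_BITS, REG_BITS});
    slots.push_back({"lo", MIPS_REGISTER_DOMAIN, 33 * REG_BITS, REG_BITS});
    return slots;
}

// Returns true if any two slots in the same domain share at least one bit.
bool anyOverlap(const std::vector<RegisterSlot> &slots) {
    for (std::size_t i = 0; i < slots.size(); ++i) {
        for (std::size_t j = i + 1; j < slots.size(); ++j) {
            const RegisterSlot &a = slots[i];
            const RegisterSlot &b = slots[j];
            if (a.domain == b.domain && a.addr < b.addr + b.size && b.addr < a.addr + a.size) {
                return true;
            }
        }
    }
    return false;
}
```

A check like anyOverlap could be run once in a test to catch copy-and-paste mistakes in a hand-written register table.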
Okay... now I've successfully implemented the NOP opcode with inline assembly in the output ;-) Is the semantics described somewhere? Feel free to take a look at: https://github.com/nihilus/snowman/blob/master/src/nc/arch/mips/MipsInstructionAnalyzer.cpp |
Your code looks completely reasonable. The assignments from the AND and MOVE instructions do not show up in the decompiler code, because the decompiler thinks that their results are not used. It makes sense to implement loads/stores from/to memory first, to make the decompiler actually generate something: it assumes that values written to untraceable locations in memory are used. Implementing jumps and probably calls is the second step: arguments of jumps and calls are always assumed to be used. The semantics of IR is essentially defined by the implementation, but there is nothing there that one would not expect. IR is a control-flow graph ( Statements have expressions (implemented as expression trees, the base class is IR can be generated by just writing C++ code like I hope this helps. Let me know if you have other questions. |
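The reason register-only code produces no output can be shown with a toy backward liveness pass over one basic block. This is only a sketch under simplifying assumptions (nothing is live at the end of the block, and stores and jumps always count as used); the types and names are made up for the example and are not Snowman's dataflow analysis.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

enum class Kind { Assign, Store, Jump };

// Toy statement: an Assign writes `dst`; every kind may read the registers in `uses`.
struct Stmt {
    Kind kind;
    std::string dst;                // register written (Assign only)
    std::vector<std::string> uses;  // registers read
};

// Walks the block backwards and returns, per statement, whether it must be kept.
// Assumption: no register is live after the last statement of the block.
std::vector<bool> markLive(const std::vector<Stmt> &block) {
    std::set<std::string> live;
    std::vector<bool> keep(block.size(), false);
    for (std::size_t i = block.size(); i-- > 0;) {
        const Stmt &s = block[i];
        if (s.kind == Kind::Assign) {
            if (live.count(s.dst) == 0) {
                continue;  // the result is never read afterwards: dead assignment
            }
            live.erase(s.dst);
        }
        keep[i] = true;  // stores and jumps are always treated as used
        live.insert(s.uses.begin(), s.uses.end());
    }
    return keep;
}
```

In this model a block holding only an AND and a MOVE on registers is dropped entirely, while appending a store of the MOVE result keeps the whole chain alive, which matches the advice to implement loads and stores first.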
Thx @yegord for the verbose explanation. Is this documented somewhere? I could create a TeX/PDF file with info like this and more to ship with the releases. "Implementor's handbook". :-) |
There is an outdated (and removed) |
An update on the progress: I got back my book 'See MIPS Run, 2nd ed' today and successfully implemented "load word" and "branch / jump". |
@techbliss @hlide Things are moving in the correct direction. |
Now I should try to add branch delay handling. |
I'm only detailing the MIPS architecture prior to MIPS[32|64]R6.
Not likely: when (and ONLY when) the jump/call is taken, the next instruction in address order is executed as a delay slot before jumping/calling to the target instruction. When not taken, the next instruction is just executed normally after the untaken jump/call. Equivalent IR for
Likely: when (and ONLY when) the jump/call is taken, the next instruction in address order is executed as a delay slot before jumping/calling to the target instruction. When not taken, the next instruction is skipped after the untaken jump/call. Equivalent IR for
|
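The two cases above can be summarised as the order in which addresses get executed. The function below is a small model written for this discussion, assuming fixed 4-byte instructions (so no MIPS16) and pre-R6 semantics; the function name and signature are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns the addresses executed starting at a branch located at `pc`,
// up to and including the first instruction reached once the branch resolves.
// Assumes 4-byte instructions and pre-R6 delay slot semantics.
std::vector<std::uint32_t> executionOrder(std::uint32_t pc, std::uint32_t target,
                                          bool likely, bool taken) {
    const std::uint32_t delaySlot = pc + 4;
    const std::uint32_t fallThrough = pc + 8;
    std::vector<std::uint32_t> order{pc};
    if (taken) {
        // Taken: the delay slot runs, then control goes to the target.
        order.push_back(delaySlot);
        order.push_back(target);
    } else if (likely) {
        // Untaken branch-likely: the delay slot is skipped (nullified).
        order.push_back(fallThrough);
    } else {
        // Untaken ordinary branch: the next instruction simply runs in sequence.
        order.push_back(delaySlot);
        order.push_back(fallThrough);
    }
    return order;
}
```

The only difference between the two flavours shows up on the untaken path, which is why the likely variant needs the extra skip.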
AFAIK R6 reuses some opcodes from the pre-R6 ISA to do things like 'aui'. R6 is out of scope at the moment, actually. |
Do I see you are trying to use the "invisible" program counter @hlide? |
Well, you need to "insert" the delay slot instruction into the IR of the branch instruction when taken, something like:
|
The likely case is a little tricky, as you need to skip the next instruction if untaken:
|
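One way to picture folding the delay slot into the branch is to emit a few lines of pseudo-IR per branch. The sketch below produces plain strings in a made-up notation (it is not Snowman's IR or its C++ generation syntax), and the function name and parameters are hypothetical. Note that in the ordinary case the condition is saved before the delay slot runs, since the slot instruction may overwrite a register the condition reads.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Emits pseudo-IR lines for one conditional branch plus its delay slot.
// `cond`, `slot`, `target` and `fallThrough` are plain text placeholders.
std::vector<std::string> liftBranch(const std::string &cond, const std::string &slot,
                                    const std::string &target, const std::string &fallThrough,
                                    bool likely) {
    std::vector<std::string> ir;
    if (!likely) {
        // Ordinary branch: evaluate the condition first, always run the slot, then branch.
        ir.push_back("tmp = " + cond);
        ir.push_back(slot);
        ir.push_back("if tmp goto " + target + " else goto " + fallThrough);
    } else {
        // Branch-likely: the slot runs only on the taken path.
        ir.push_back("if " + cond + " goto taken else goto " + fallThrough);
        ir.push_back("taken:");
        ir.push_back(slot);
        ir.push_back("goto " + target);
    }
    return ir;
}
```

Here `fallThrough` stands for the address after the delay slot, so on the untaken likely path the slot instruction is never reached.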
@hlide Yes, that is why I brought Sweetman's book to Domino's Pizza. Because branch likely is a conditional. Gives me a headache... However, I met a friend there, so now we are drinking beer. Why don't you fork me and help me out here? ;-) |
Here is an example I did to translate MIPS code into x86-64 code: As you can see, the pale yellow part is the translated BNE instruction and the pale blue part is the delay slot instruction. The CFG is built on x86-64 instructions, not on MIPS instructions, which makes it easier to end a basic block at a jump instruction. I was thinking you could do something similar by replacing the x86-64 code with IR code. |
I would love some documentation on the IR code, beyond @yegord giving me a lot of interesting info. :-) |
Ok, I had a look at the source and it's pretty much impossible to handle likely/delay slots with the actual |
@hlide Thanks for your explanations. They are even better than mine. :-) Concerning the interface, I was thinking more of patching the previous instruction, when you generate the current one, if the previous instruction is one from the list #14 (comment). Generating directly what is needed by looking at the next instruction looks like a good idea and should be somewhat simpler to implement. |
@nihilus Could you elaborate on what kind of issue you expect? |
Upd. This post contained nonsense. Deleted. |
@yegord: that the return address will be set prematurely and not to the address AFTER the call. |
is the latest trunk. |
seems to be more like what we want... AFAIK, correct? |
Probably something like that (minus bogus):
case MIPS_INS_JALR: {
    if (op_count > 1) {
        auto ret = std::make_unique<core::ir::Intrinsic>(core::ir::Intrinsic::RETURN_ADDRESS, architecture_->bitness());
        _[operand(0) ^= ret];
    }
    delayslot(_)[call(operand(op_count - 1)), jump(constant(nextDirectSuccessorAddress))];
    break;
} |
Explanation:
so we should handle $s0 as a return address in this case. Another possibility:
So you may consider possible transfer of the return address from a register to another one. |
@hlide I think the op_count == 1 was added because it is either 1 or 2. |
On Tue, Aug 04, 2015 at 08:17:05AM -0700, Markus Gothe wrote:
I would like to note that these are no-ops: you just create an Yegor Derevenets |
On Mon, Aug 03, 2015 at 09:52:59PM -0700, hlide wrote:
I see what you want to do. However, dataflow analysis is currently Theoretically, after functions are constructed, one could do a separate, register_with_return_address := return_address_intrinsic. Then, when the function will be analyzed, the decompiler will know that Yegor Derevenets |
@yegord: So how do you suggest we use it? |
On Tue, Aug 04, 2015 at 12:10:52PM -0700, Markus Gothe wrote:
Let's fix the problem first. My initial suggestion was referring to the In the assumption that this is a common pattern, I suggested to force To make decompiler think so, one needs to generate for a jump to $ra the
auto ret = std::make_unique<core::ir::Jump>(
    std::make_unique<core::ir::Intrinsic>(
        core::ir::Intrinsic::RETURN_ADDRESS,
        32))
This statement must be added to the basic block to which you would The intrinsic expression essentially returns a value marked with a flag If you have a different problem, please take time to define it
Yegor Derevenets |
Ok, I see your point. Well it is true that we can assume a return is pretty much done through "jr $ra", so your suggestion will be enough to start with. |
On Tue, Aug 04, 2015 at 12:56:34PM -0700, hlide wrote:
I just pushed to GitHub a small patch that allows you to write just _[jump(return_address())]; instead of those four lines. Yegor Derevenets |
Thanks yegord, we will try to create separate issues the next time. It is true this one is breaking a record in number of posts. |
How about adding tests for the architecture, making sure they work, cleaning up the code, and submitting a patch? |
@yegord hopefully I'll have something for a pull request by tomorrow. |
eliminate dead assignment sony by calls.
For the record: I've just added 64-bit / n32 ABI support for MIPS now. |
@hlide I found this IDA Pro plugin in Python for resolving MIPS ELF symbols at http://syscall.eu. I think we need to patch the ELF loader with the info given in this plugin. |
@yegord this was a very, very long issue thread, and since there hasn't been any activity for so long, maybe I should close it again. I see you're waiting for a pull request from @nihilus, but according to him you banned him, so he can't do such a thing. Keep up the good work |
As long as nobody is really working on the project, it does not really matter if the issue is open or closed. So, let's keep the status quo. :) If @nihilus is willing to submit a proper patch, with code for one architecture, added in one commit, working, with proper tests having correct answers, he is welcome to do it now. AFAIR last time we ended up with a patch adding support for two architectures in lots of commits, and for the second architecture there were no input files at all. At the same time there was significant traffic from him in other issues not adding any value. |
Okay let's keep it running :) |
Hey Yegord.
I see you switched to the Capstone engine.
So the question is:
Do you think it would be possible to add the MIPS architecture to the decompiler as well,
since Capstone already includes that architecture?
https://github.com/aquynh/capstone/tree/master/arch/Mips
Or would it be a lot of work to add MIPS support?
Regards