Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible Add Mips architecture. #14

Open
techbliss opened this issue Jun 15, 2015 · 306 comments
Open

Possible Add Mips architecture. #14

techbliss opened this issue Jun 15, 2015 · 306 comments

Comments

@techbliss
Copy link

Hey Yegord.

I see you switched to Capstone engine.
So Question is.
You think it would be possible to add MIPS architecture also to the decompiler,
since Capstone already have the architecture included in the capstone engine.

https://github.com/aquynh/capstone/tree/master/arch/Mips
Or would it be much work to add MIPS support.
Regards

@nihilus
Copy link

nihilus commented Jun 15, 2015

I guess I could do some work based on the ARM-target, and by looking at some code from the boomerangproject, for both PPC and MIPS.

@techbliss
Copy link
Author

Not sure if it is as easy as adding mips to https://github.com/yegord/snowman/blob/master/src/nc/core/arch/ArchitectureRepository.cpp
updating snowman arch with mips from Capstone.
much of the code is already there from ARM arch, calling conventions and little /big endian.
offcause making a cmake list to wrapp it up(not for the faint hearted )
Im not a c++ guy, i only write in python.Lets here Yegord if it is not to much trouble to add.

@nihilus
Copy link

nihilus commented Jun 15, 2015

Should be pretty straight forward if I use my boomerang skills (I might abandon the boomerang project (as I am the admin of it) in favor of this one).

@yegord
Copy link
Owner

yegord commented Jun 15, 2015

Should not be too difficult. One needs to write MipsArchitecture class, add it to ArchitectureRepository make the parsers aware of this new architecture, write MipsDisassembler, make sure that it works (add test inputs, run nocode --print-instructions tests/mips/*), describe memory locations of registers (MipsRegisterTable.i), describe a calling convention, make sure the decompiler generates some code (nocode tests/mips/*), define the semantics of instructions (MipsInstructionAnalyzer), make sure the decompiler starts generating sensible code. Some proficiency in C++ is required.

If someone is willing to invest his/her time in this, I will be glad to guide you through the details.

@nihilus
Copy link

nihilus commented Jun 16, 2015

As said being the admin of http://boomerang.sf.net it should not be too much difficulties for me to re-implement that stuff.

@yegord
Copy link
Owner

yegord commented Jun 16, 2015

Then assign the issue to yourself and go ahead. :)

@nihilus
Copy link

nihilus commented Jun 18, 2015

I guess implementing the disassembly for MIPS was quite an easy task (took me about 2h max). Should not be much problems to implement SPARC and PPC support either.

@techbliss
Copy link
Author

@nihilus that was fast.Possible pull request to @yegord he can review proposed changes.So we can build and try it out.If he agrees.
Also do it output sensible code ?

@nihilus
Copy link

nihilus commented Jun 19, 2015

@techbliss you can clone it from my fork... ;-) And no I'vent implemented the registers and instruction set for the decompiler yet so hold your horses.

@nihilus
Copy link

nihilus commented Jun 21, 2015

@yegord One question: What is the 'domain' in the register aspect?

@yegord
Copy link
Owner

yegord commented Jun 21, 2015

When you describe the semantics of an instruction in IR, you describe it in terms of operations on a certain virtual machine. This virtual machine has multiple flat non-overlapping address spaces. Consequently, a memory location in this virtual machine is identified by a triplet (domain, address, size), where domain identifies the one of the non-overlapping address spaces. In the code the triplet is usually stored as ir::MemoryLocation object.

Registers are obviously a kind of memory, so, they must be assigned a MemoryLocation. In the MipsRegisterTable.i you specify the domain, address, and size for each register's MemoryLocation. The only requirement is that you make MemoryLocations of non-overlapping registers not overlap, and those of overlapping registers overlap in the right parts. See, for example, the definition of RAX, EAX, AX, AL, AH in X86RegisterTable.i.

@nihilus
Copy link

nihilus commented Jun 21, 2015

Hmm... I guess imdirectably registers such as $lo and $hi would classify for another domain on the MIPS arch.

@yegord
Copy link
Owner

yegord commented Jun 21, 2015

I am not a specialist in MIPS, but LO and HI look like usual registers, so, no special handling is required. You can put them anywhere, as long as they do not overlap with the others.

@nihilus
Copy link

nihilus commented Jun 22, 2015

Okey... now I've successfully implemented to NOP-OP-code with inline assembly in the output ;-)
I still have some problems to figure out how to make the rest make sense.

Is the semantics described somewhere? Feel free to take a look at: https://github.com/nihilus/snowman/blob/master/src/nc/arch/mips/MipsInstructionAnalyzer.cpp

@yegord
Copy link
Owner

yegord commented Jun 23, 2015

Your code looks completely reasonable.

The assignments from the AND and MOVE instructions do not show up in the decompiler code, because the decompiler thinks that their results are not used. It makes sense to implement loads/stores from/to memory in the first place, to make decompiler actually generate something: it assumes that values written to untraceable locations in memory are used. Implementing jumps and probably calls is the second step: arguments of jumps and calls are always assumed to be used.

The semantics of IR is essentially defined by the implementation, but there is nothing there that one would not expect. IR is a control-flow graph (ir::Program), it consists of basic blocks — linear sequences of statements, possibly having a starting address (ir::BasicBlock). A statement (ir::Statement) can be an inline assembly (ir::InlineAssembly), an assignment (ir::Assignment), a jump (ir::Jump, can be conditional or unconditional), a function call (ir::Call), a termination statement (ir::Halt), or one of the several black-magic statements used only in calling conventions-related business.

Statements have expressions (implemented as expressions trees, the base class is ir::Term) as arguments. An expression always has a size, and the sizes are strictly checked, so, if you screwed up, an assertion will fail, and you will immediately know. For example, operands of ir::Assignment must have the same sizes. Expressions include up-to-64-bit constants (ir::Constant), memory location access (e.g., access to a register, ir::MemoryLocationAccess), dereference (ir::Dereference, like ir::MemoryLocationAccess, but the address being accessed is computed, i.e., is an IR expression), unary and binary operators (ir::UnaryOperator, ir::BinaryOperator). BinaryOperator, similar to assignment, usually requires that the operands have the same size, and the binary expression itself has the same size as the operands (in bit-shift operators the size of the right operand does not matter, comparison operators always return 1-bit-sized value). How these expressions are evaluated is in detail defined in ir/dflow/DataflowAnalyzer.cpp and ir/dflow/AbstractValue.h.

IR can be generated by just writing C++ code like basicBlock->pushBack(std::make_unique<Assignment>(std::make_unique<MemoryLocationAccess>(reg1->memoryLocation()), std::make_unique<MemoryLocationAccess>(reg2->memoryLocation())));, however, it is verbose and boring. This is why Alex Fokin invented a C++-expression-templates-based domain-specific language, currently situated in irgen::expressions namespace. The idea there is to use special types and overload operators, so that operations on objects of that types create objects of even more complex types, and so on. For instance, if you write Expr<T1>() + Expr<T2>(), then the result is an object of type Sum<Expr<T1>, Expr<T2>>. Such expressions is what you write within square brackets. The expression is passed to operator[] of ExpressionFactoryCallback, which in turn passes it to ExpressionFactory::createStatement. The latter infers sizes of expressions, creates an IR statement, returns it, and ExpressionFactoryCallback appends it to the basic block that you passed to its constructor. This is what is going on when you write _[operand(0) ^= operand(0) + operand(1)].

I hope, this helps. Let me know if you have other questions.

@nihilus
Copy link

nihilus commented Jun 24, 2015

Thx @yegord for the verbose explanation. Is this documented somewhere? I could create a TeX/PDF file with info like this and more to ship with the releases. "Implementor's handbook". :-)

@yegord
Copy link
Owner

yegord commented Jun 24, 2015

There is an outdated (and removed) doc/developer/main.tex (can be found, e.g., in the root commit), there is nothing up-to-date on the topic.
I think, I will eventually expand the explanations from here and put them into doc/ir.asciidoc.

@nihilus
Copy link

nihilus commented Jun 29, 2015

An update on the progress, I got back my book 'See MIPS Run, 2nd ed' today and successfully implemented "load word" and "branch / imp".

@nihilus
Copy link

nihilus commented Jul 12, 2015

@techbliss @hlide Things are moving in the correct direction.

@nihilus
Copy link

nihilus commented Jul 12, 2015

Now I should try to add branch delay handling.

@hlide
Copy link

hlide commented Jul 12, 2015

I'm detailing only for MIPS architecture prior to MIPS[32|64]R6.

  • direct unconditional jump using J-type instruction : J target
  • indirect unconditional jump using R-type instruction : JR rs
  • direct unconditional call using J-type instruction : JAL target
  • indirect unconditional call using R-type instruction : JALR rd,rs (rd is usually $ra)
  • direct conditional jump using I-type instruction : Bxx rs[, rt], target
  • direct conditional likely jump using I-type instruction : BxxL rs[, rt], target
  • direct conditional call using I-type instruction : BxxAL rs, target
  • direct conditional likely call using I-type instruction : BxxALL rs, target
  • direct COPx conditional jump using I-type instruction : BCx[F/T] target
  • direct COPx conditional likely jump using I-type instruction : BCx[F/T]L target

When not likely and ONLY when the jump/call is taken, the next instruction in address is executed as a delay slot before jumping/calling to target instruction. When not taken, the next instruction is just executed normally after the untaken jump/call.

equivalent IR for BNE r1, r2, target for instance:

insn2ir(address):
    ...
    if opcode = BNE:
        return
            _[
                jump(~(operand(0) == operand(1)), _[insn2ir(address + 4), operand(3)], directSuccessor())
            ]
    ...

When likely and ONLY when the jump/call is taken, the next instruction in address is executed as a delay slot before jumping/calling to target instruction. When no taken, the next instruction is skipped after the untaken jump/call.

equivalent IR for BNEL r1, r2, target for instance:

insn2ir(address):
    ...
    if opcode = BNEL:
        return
            _[
                jump(~(operand(0) == operand(1)), _[insn2ir(address + 4), operand(3)], address+8)
            ]
    ...

@nihilus
Copy link

nihilus commented Jul 12, 2015

Afaik R6 uses the some opcodes from pre-R6 ISA to do things like 'aui'. R6 is out of the scope atm actually.

@nihilus
Copy link

nihilus commented Jul 12, 2015

Do I see you are trying to use the "invisible" program counter @hlide?

@hlide
Copy link

hlide commented Jul 12, 2015

Well you need to "insert" the delay slot instruction in the IR of the branch instruction when taken.

something like:

    if (taken)
    {
        insert_ir_instruction_for(address + 4);
        goto target;
    }

@hlide
Copy link

hlide commented Jul 12, 2015

the likely is a little tricky as you need to skip the next instruction if untaken:

    if (taken)
    {
        insert_ir_instruction_for(address + 4);
        goto target;
    }
    else
    {
        goto address + 8;
    }

@nihilus
Copy link

nihilus commented Jul 12, 2015

@hlide Yes that is why I brought Sweetman's book to Domino's pizza. Because branch likely is a conditional. Gives me headache... However I met a friend there so now we are drinking beer.

Why dont you fork me and help me out here? ;-)

@hlide
Copy link

hlide commented Jul 12, 2015

Here an example I did to translate MIPS code into X86-64 code:

bne

As you can see, the pale yellow part is the translated BNE instruction and the pale blue part is the delay slot instruction. The CFG is done on x86-64 instructions, not on MIPS instructions, which makes easier to end a basic block at a jump instruction. I was thinking you could do something similar by replacing the X86-64 code by IR code.

@nihilus
Copy link

nihilus commented Jul 12, 2015

I would love some documentation on the IR code except @yegord giving me lot of interesting info. :-)

@hlide
Copy link

hlide commented Jul 12, 2015

Ok, I had a look on the source and it's pretty impossible to handle likely/delay slots with the actual IRGenerator::generate() and MipsInstructionAnalyzerImpl::createStatements(const MipsInstruction *instruction, core::ir::Program *program) functions which cannot insert IR of the next instruction. We need the ability to call MipsInstructionAnalyzerImpl::createStatements(const MipsInstruction *instruction, core::ir::Program *program) recursively when dealing with likely/delay slots.

@yegord
Copy link
Owner

yegord commented Jul 12, 2015

@hlide Thanks for your explanations. They are even better than mine. :-)

Concerning the interface, I was thinking more of patching the previous instruction, when you generate the current one, if the previous instruction is one from the list #14 (comment).

Generating directly what is needed by looking at the next instruction looks like a good idea and should be somewhat simpler to implement. IRGenerator knows the whole set of instructions, so, nothing stops you from passing it to InstructionAnalyzer::createStatements. Your instr2ir can then become something like addStatementsImplementingInstructionToBasicBlock(const MipsInstruction *, ir::BasicBlock *).

@yegord
Copy link
Owner

yegord commented Aug 3, 2015

@nihilus Could you elaborate on what kind of issue you expect?

@yegord
Copy link
Owner

yegord commented Aug 3, 2015

Upd. This post contained nonsense. Deleted.

@nihilus
Copy link

nihilus commented Aug 3, 2015

@yegord: that the return address will be set prematurely and not to the adress AFTER the call.

@nihilus
Copy link

nihilus commented Aug 3, 2015

@hlide

          case MIPS_INS_JALR: {
                _[operand(0) ^= constant(nextDirectSuccessorAddress)];
                delayslot(_)[call(operand(op_count - 1))];
                break;
            }
            case MIPS_INS_BAL: /* Fall-through */
            case MIPS_INS_JAL: {
                if (op_count == 1){
                    _[regizter(MipsRegisters::ra()) ^= constant(nextDirectSuccessorAddress)];
                    delayslot(_)[call(operand(0))];
                } else {
                    _[operand(0) ^= constant(nextDirectSuccessorAddress)];
                    delayslot(_)[call(operand(op_count - 1))];
                }
                break;
            }

is the latest trunk.

@nihilus
Copy link

nihilus commented Aug 3, 2015

@hlide

            case MIPS_INS_JALR: {
                _[regizter(MipsRegisters::ra()) ^= constant(nextDirectSuccessorAddress)];
                _[operand(0) ^= constant(nextDirectSuccessorAddress)];
                delayslot(_)[call(operand(op_count - 1))];
                _[jump(constant(nextDirectSuccessorAddress))];
                break;
            }
            case MIPS_INS_BAL: /* Fall-through */
            case MIPS_INS_JAL: {
                if (op_count == 1){
                    _[regizter(MipsRegisters::ra()) ^= constant(nextDirectSuccessorAddress)];
                    delayslot(_)[call(operand(0))];
                    _[jump(constant(nextDirectSuccessorAddress))];
                } else {
                    _[regizter(MipsRegisters::ra()) ^= constant(nextDirectSuccessorAddress)];
                    _[operand(0) ^= constant(nextDirectSuccessorAddress)];
                    delayslot(_)[call(operand(op_count - 1))];
                    _[jump(constant(nextDirectSuccessorAddress))];
                }
                break;
            }

seems to be more like what we want... AFAIK, correct?

@nihilus
Copy link

nihilus commented Aug 3, 2015

@yegord @hlide still we have this:

400be0: lw $ra, 0x18($sp)
400be4: nop 
400be8: jr $ra

resulting in:

    goto v5;
}

void fun_400bc8(int32_t a0) {
    int32_t v2;

    goto v2;
}

@yegord any clues how to make a special case when loading / storing to $ra?

@hlide
Copy link

hlide commented Aug 4, 2015

Probably something like that (minus bogus):

            case MIPS_INS_JALR: {
                if (op_count > 1) {
                    auto ret = std::make_unique<core::ir::Intrinsic>(core::ir::Intrinsic::RETURN_ADDRESS, architecture_->bitness());
                    _[operand(0) ^= ret];
                } 
                delayslot(_)[call(operand(op_count - 1)), jump(constant(nextDirectSuccessorAddress))];
                break;
            }

@hlide
Copy link

hlide commented Aug 4, 2015

Explanation:

    ...
    jalr $s0, $s1 ; save return address in $s0 and jump to address given by $s1
    ...
target:
    ...
    jr $s0 ; jump back to caller
    ...

so we should handle $s0 as a return address in this case.

Another possibility:

    ...
    jalr $s0, $s1 ; save return address in $s0 and jump to address given by $s1
    ...
target:
    ...
    move $ra, $s0 ; set $ra with the return address
    ...
    jr $ra ; jump back to caller
    ...

So you may consider possible transfer of the return address from a register to another one.

@nihilus
Copy link

nihilus commented Aug 4, 2015

@hlide I think the op_count == 1 was added because it is either 1 or 2.

@yegord
Copy link
Owner

yegord commented Aug 4, 2015

On Tue, Aug 04, 2015 at 08:17:05AM -0700, Markus Gothe wrote:

  •     std::make_uniquecore::ir::Intrinsic(core::ir::Intrinsic::RETURN_ADDRESS, architecture_->bitness());
    
    [...]
  • std::make_uniquecore::ir::Intrinsic(core::ir::Intrinsic::RETURN_ADDRESS, architecture_->bitness());
    

I would like to note that these are no-ops: you just create an
intrinsic, and it is deleted at the end of the statement.

Yegor Derevenets

@yegord
Copy link
Owner

yegord commented Aug 4, 2015

On Mon, Aug 03, 2015 at 09:52:59PM -0700, hlide wrote:

        case MIPS_INS_JALR: {
            auto ret = std::make_unique<core::ir::Intrinsic>(core::ir::Intrinsic::RETURN_ADDRESS, architecture_->bitness());
            _[operand(0) ^= ret];
            delayslot(_)[call(operand(op_count - 1)), jump(constant(nextDirectSuccessorAddress))];
            break;
        }

I see what you want to do. However, dataflow analysis is currently
intraprocedural, so, when the call target is analyzed, the decompiler
will not know that whatever was operand(0) contains the return address.

Theoretically, after functions are constructed, one could do a separate,
architecture-specific pass to detect in which register which function
gets the return address, and prepend the entry of each function with an
assignment of the form

register_with_return_address := return_address_intrinsic.

Then, when the function will be analyzed, the decompiler will know that
a certain register contains the return address and recognize jumps as
returns accordingly.

Yegor Derevenets

@nihilus
Copy link

nihilus commented Aug 4, 2015

@yegord: So how do you siggest we use it?

@yegord
Copy link
Owner

yegord commented Aug 4, 2015

On Tue, Aug 04, 2015 at 12:10:52PM -0700, Markus Gothe wrote:

@yegord: So how do you siggest we use it?

Let's fix the problem first. My initial suggestion was referring to the
following problem: a function got a return address not in $ra, but
somewhere on the stack. It copied it to $ra and returned by jumping to
$ra. The decompiler with its intraprocedural dataflow analysis could not
know that that value on the stack that was copied to $ra and jumped to
is a return address.

In the assumption that this is a common pattern, I suggested to force
the decompiler to think that any jump to $ra is a return. This is
formally incorrect, but de-facto may be useful.

To make decompiler think so, one needs to generate for a jump to $ra the
following statement (a jump to an address being a special intrinsic):

auto ret = std::make_unique<core::ir::Jump>(
    std::make_unique<core::ir::Intrinsic>(
        core::ir::Intrinsic::RETURN_ADDRESS,
        32))

This statement must be added to the basic block to which you would
otherwise add the jump to $ra. It can be done by something like
_(std::move(ret)).

The intrinsic expression essentially returns a value marked with a flag
'Hey! I am the return address from the current function!'. When
generating LikeC code, the decompiler checks if the value being jumped
to has this flag set. If this is the case, the decompiler generates a
return. If it is not, it generates an ugly goto. So, for the jump ret
it will always generate a return, and this is what you probably want.

If you have a different problem, please take time to define it
precisely, possibly, create a different issue for that.

Yegor Derevenets

@hlide
Copy link

hlide commented Aug 4, 2015

Ok, I see your point. Well it is true that we can assume a return is pretty much done through "jr $ra", so your suggestion will be enough to start with.

@yegord
Copy link
Owner

yegord commented Aug 4, 2015

On Tue, Aug 04, 2015 at 12:56:34PM -0700, hlide wrote:

Ok, I see your point. Well it is true that we can assume a return is
pretty much done through "jr $ra", so your suggestion will be enough
to start with.

I just pushed to github a small patch that allows you to write just

_[jump(return_address())];

instead of that four lines.

Yegor Derevenets

@hlide
Copy link

hlide commented Aug 4, 2015

Thanks yegord, we will try to create separate issues the next time. It is true this one is breaking a record in number of posts.

@nihilus
Copy link

nihilus commented Aug 5, 2015

Thanks @yegord and I agree with you and @hlide on breaking down things in separate issues.

@yegord
Copy link
Owner

yegord commented Aug 5, 2015

@yegord
Copy link
Owner

yegord commented Aug 19, 2015

How about adding tests for the architecture, making sure they work, cleaning up the code, and submitting a patch?

@nihilus
Copy link

nihilus commented Aug 19, 2015

@yegord I've got the test in my fork of the testsuite... However I need to fix the ELFparser to patch the GOT/PLT for getting correct addresses for external functions (it's a MIPS quirk). I will merge fixes from @hlide tonight.

@nihilus
Copy link

nihilus commented Aug 22, 2015

@yegord hopefully I'll have something for a pull request by tomorrow.

@nihilus
Copy link

nihilus commented Sep 1, 2015

@yegord now I've sorted out some issues with the delay slot etc. and created a pull request here #47

uxmal pushed a commit to uxmal/snowman that referenced this issue Sep 4, 2015
eliminate dead assignment sony by calls.
@nihilus
Copy link

nihilus commented Sep 21, 2015

For the record: I've just added 64-bits / n32 ABI support for MIPS now.

@nihilus
Copy link

nihilus commented Oct 16, 2015

@hlide I found this IDA Pro plugin in python for resolving MIPS ELF symbols at http://syscall.eu I think we need to patch the ELF loader with the info given in this plugin.

@techbliss
Copy link
Author

techbliss commented Oct 9, 2016

@yegord this was a very very long issue thread, and since there haven't been activity so long, maybe I should close it again,. I see your whating for a pull request from @Nihilius but, according to him you banned him, so he can't due such thing.
So not sure to close it if your still whating for such pull request.
however it was properly the longest issue thread I have seen for a while, should I close it or keep it open.

Keep up the good work

@yegord
Copy link
Owner

yegord commented Oct 9, 2016

As long as nobody is really working on the project, it does not really matter if the issue is open or closed. So, let's keep the status quo. :)

If @nihilus is willing to submit a proper patch, with code for one architecture, added in one commit, working, with proper tests having correct answers, he is welcome to do it now. AFAIR last time we ended up with a patch adding support for two architectures in lots of commits, for the second architecture there were no input files at all. And at the same time there was a significant traffic from him in other issues not adding any value.

@techbliss
Copy link
Author

Okay let's keep it running :)
I think you would need to unblock @Nihilius. I know he gets a little intuiastic some times, but he does great work also.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants