Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IR Optimization #123

Open
ghost opened this issue Apr 4, 2017 · 4 comments
Open

IR Optimization #123

ghost opened this issue Apr 4, 2017 · 4 comments

Comments

@ghost
Copy link

ghost commented Apr 4, 2017

Consider this assembly code:
41b00f: lea eax, [edx*4+0x0] 41b016: sub eax, edx 41b018: add eax, eax 41b01a: mov edx, eax 41b01c: shl eax, 0x3 41b01f: add edx, eax
Which translates to:
(edx10 * 4 - edx10) * 2 + (edx10 * 4 - edx10) * 2 * 8)
but is actually a multiplication by 54
We can fold this in the LikeC tree, but struct members and global variable types are determined prior to tree generation from what I can tell. So I wrote up some proof-of-concept code for analyzing preceding IR statements in the current basic block and merging them. Is this a good idea, or would you prefer it be implemented a different way?

@ghost
Copy link
Author

ghost commented Apr 4, 2017

The following is what the prologue of createStatements in X86InstructionAnalyzer looks like with this added:
`
assert(instr != nullptr);
assert(program != nullptr);
using namespace nc::core::ir;
currentInstruction_ = instr;
BasicBlock *thisBlock = program->getBasicBlockForInstruction(instr);
auto disassembleInstr = [this](const X86Instruction *x86instr) {
ud_set_pc(&ud_obj_, x86instr->addr());
ud_set_input_buffer(&ud_obj_, const_cast<uint8_t *>(x86instr->bytes()),
checked_caststd::size_t(x86instr->size()));
ud_disassemble(&ud_obj_);
};
auto disassembleCurrent = &disassembleInstr, instr { disassembleInstr(instr); };

    disassembleCurrent();
    assert(ud_obj_.mnemonic != UD_Iinvalid);

    BasicBlock *cachedDirectSuccessor = nullptr;
    auto directSuccessor = [&]() -> BasicBlock * {
        if (!cachedDirectSuccessor) {
            cachedDirectSuccessor = program->createBasicBlock(instr->endAddr());
        }
        return cachedDirectSuccessor;
    };

    auto gatherBlockX86Instructions = [](BasicBlock *blk) -> std::vector<const X86Instruction *> {
        std::vector<const X86Instruction *> result;
        foreach (auto stmt, blk->statements()) {
            bool alreadyAdded = false;
            const X86Instruction *currInstr = static_cast<const X86Instruction *>(stmt->instruction());
            foreach (auto instr, result) {
                if (instr == currInstr) {
                    alreadyAdded = true;
                    break;
                }
            }
            if (!alreadyAdded)
                result.push_back(currInstr);
        }
        return std::move(result);
    };
    auto gatherIRStmtsForInsn = [thisBlock](const X86Instruction *x86insn) {
        std::vector<Statement *> ownedIR;
        foreach (Statement *stmt, thisBlock->statements()) {
            if (stmt->instruction() == x86insn)
                ownedIR.push_back(stmt);
        }
        return ownedIR;
    };
    auto findMemLocWrite = [this](std::vector<Statement *> stmts,
                                  const MemoryLocation &memloc) -> core::ir::Statement * {

        foreach (auto stmt, stmts) {
            /*
             * register potentially clobbered
             */
            if (stmt->is<Call>() || stmt->is<Jump>() || stmt->is<InlineAssembly>())
                return nullptr;
            if (auto asg = stmt->as<Assignment>()) {
                auto assigned = asg->left();
                if (auto locAccess = assigned->asMemoryLocationAccess()) {
                    if (locAccess->memoryLocation().covers(memloc) || locAccess->memoryLocation().overlaps(memloc))
                        return stmt;
                }
            }
        }
        // no preceding write in this range
        return nullptr;
    };
    auto findRegisterLocWrite = [this, &findMemLocWrite](std::vector<Statement *> stmts,
                                                         ud_type reg) -> core::ir::Statement * {
        auto regAccess = createRegisterAccess(reg);
        // if(!regAccess)
        //    return nullptr;
        assert(regAccess);
        auto regMemAccess = regAccess->asMemoryLocationAccess();
        // if(!regMemAccess)
        //    return nullptr;
        assert(regMemAccess);
        auto regMemLoc = regMemAccess->memoryLocation();

        return findMemLocWrite(stmts, regMemLoc);
    };

    X86ExpressionFactory factory(architecture_);
    X86ExpressionFactoryCallback _(factory, thisBlock, instr);

    using namespace core::irgen::expressions;

    auto getPreceding = [&]() -> const X86Instruction * {
        // auto stmts = thisBlock->statements();
        if (thisBlock->statements().size()) {
            auto back = thisBlock->statements().back();
            if (back) {
                return static_cast<const X86Instruction *>(back->instruction());
            }
        }
        return nullptr;
    };
    auto removeIRStatements = [&thisBlock](std::vector<Statement *> stmts) {
        foreach (auto stmt, stmts)
            thisBlock->erase(stmt);

    };
    auto pred = getPreceding();

    const core::arch::Instruction *predecessor = getPreceding();
    ud_operand udOp0 = std::move(ud_obj_.operand[0]);
    ud_operand udOp1 = std::move(ud_obj_.operand[1]);
    ud_operand udOp2 = std::move(ud_obj_.operand[2]);

And here is a (messy) example of it being used to combine instructions:
case UD_Iadd: {

        if (udOp0.type == UD_OP_REG && udOp0.type == udOp1.type && udOp0.base == udOp1.base) {
            if (pred) {
                auto predIR = gatherIRStmtsForInsn(pred);
                auto lastWrite = findRegisterLocWrite(predIR, udOp0.base);
                // only one preceding IR statement, and its a write to this one
                if (predIR.size() == 1 && lastWrite) {
                    auto fusedTerm = predIR[0]->as<Assignment>()->right()->clone();
                    auto asBinary = fusedTerm->as<BinaryOperator>();
                    if (asBinary && asBinary->operatorKind() == BinaryOperator::MUL) {
                        assert(asBinary->right()->is<Constant>());
                        auto constantRight = asBinary->right()->as<Constant>();

                        SizedValue distributed(constantRight->size(), constantRight->value().value() * 2);
                        _[temporary(operand(0).size()) ^= TermExpression(std::make_unique<BinaryOperator>(
                              BinaryOperator::MUL, lastWrite->as<Assignment>()->left()->clone(),
                              static_cast<std::unique_ptr<Term>>(std::make_unique<Constant>(distributed)),
                              asBinary->size()))];
                    } else
                        _[temporary(operand(0).size()) ^= TermExpression(std::move(fusedTerm)) * constant(2)];
                    removeIRStatements(predIR);
                    goto flags_add;
                } else if (lastWrite) {
                    disassembleInstr(pred);
                    auto predo0 = std::move(ud_obj_.operand[0]);
                    auto predo1 = std::move(ud_obj_.operand[1]);
                    if (ud_obj_.mnemonic == UD_Iadd) {
                        if (predo0.type == udOp0.type && predo1.type == udOp1.type && predo0.base == udOp0.base &&
                            predo1.base == udOp1.base) {
                            assert(predIR.size() > 1);
                            auto right = lastWrite->as<Assignment>()->right();
                            if (!right->is<BinaryOperator>()) {
                                assert(right->is<MemoryLocationAccess>());
                                /*
                                 * find the write to the temporary register instead
                                 */
                                auto tempWrite =
                                    findMemLocWrite(predIR, right->asMemoryLocationAccess()->memoryLocation());
                                assert(tempWrite->is<Assignment>());
                                right = tempWrite->as<Assignment>()->right();
                                assert(right->is<BinaryOperator>());
                            }
                            auto asBinary = right->as<BinaryOperator>();
                            assert(asBinary->operatorKind() == BinaryOperator::MUL);
                            auto constantRight = asBinary->right()->as<Constant>();
                            SizedValue distributed(constantRight->size(), constantRight->value().value() * 2);

                            _[temporary(operand(0).size()) ^= TermExpression(std::make_unique<BinaryOperator>(
                                  BinaryOperator::MUL, lastWrite->as<Assignment>()->left()->clone(),
                                  static_cast<std::unique_ptr<Term>>(std::make_unique<Constant>(distributed)),
                                  asBinary->size()))];

                            disassembleCurrent();
                            removeIRStatements(predIR);
                            goto flags_add;
                        }
                    }
                    disassembleCurrent();
                }
            }
            _[temporary(operand(0).size()) ^= operand(0) * constant(2)];
        } else
            _[temporary(operand(0).size()) ^= operand(0) + operand(1)];
    flags_add:
        _[cf ^= unsigned_(temporary(operand(0).size())) < unsigned_(operand(0)),
          operand(0) ^= temporary(operand(0).size()), pf ^= intrinsic(), zf ^= operand(0) == constant(0),
          sf ^= signed_(operand(0)) < constant(0), of ^= intrinsic(), af ^= intrinsic(), less ^= ~(sf == of),
          less_or_equal ^= less | zf, below_or_equal ^= cf | zf];

        break;
    }

`
This combines multiple adds that have the same destination and source into a single multiplication IR statement, removing the IR statements generated from the previous instruction, including ones that modify the flags, as their values are discarded by the current instruction anyway.

@yegord
Copy link
Owner

yegord commented Apr 4, 2017 via email

@yegord
Copy link
Owner

yegord commented Apr 4, 2017

Probably you will benefit from reading nc/core/ir/misc/PatternRecognition.cpp — you might want to implement a function similar to those in that file, called, e.g., recognizeMultiplicationByConstant().

Once you have this function, you could also change recognizeArrayAccess() to use it.

@ghost
Copy link
Author

ghost commented Apr 4, 2017

Thank you for the information, it can be a little difficult figuring out the best way to do these things sometimes when you're not familiar with the code. I'll try adding something after decompile()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant