Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EthContractService only get get_function_sighashes on basicblocks[0] #349

Open
hanh-gamejam opened this issue May 16, 2022 · 7 comments
Open

Comments

@hanh-gamejam
Copy link

hanh-gamejam commented May 16, 2022

I running a ethereum-etl for get data from my private block chain running on geth. but it detect fail type of contract.

After reading code I saw that eth_contract_service only read functions sighashes on basicblocks[0]. So is it ok?

def get_function_sighashes(self, bytecode):

    def get_function_sighashes(self, bytecode):
        bytecode = clean_bytecode(bytecode)
        if bytecode is not None:
            evm_code = EvmCode(contract=Contract(bytecode=bytecode), static_analysis=False, dynamic_analysis=False)
            evm_code.disassemble(bytecode)
            basic_blocks = evm_code.basicblocks
            if basic_blocks and len(basic_blocks) > 0:
                init_block = basic_blocks[0]
                instructions = init_block.instructions
                push4_instructions = [inst for inst in instructions if inst.name == 'PUSH4']
                return sorted(list(set('0x' + inst.operand for inst in push4_instructions)))
            else:
                return []
        else:
            return []
@gddezero
Copy link

I agree that only checking basicblocks[0] is not enough. I have an example of DAI smart contract 0x6B175474E89094C44Da98b954EedeAC495271d0F. By checking the function signatures in basicblocks[0] will disqualify DAI as an ERC20 token.

My suggestion is iterate through all basicblocks and check function signatures.

@gddezero
Copy link

The easiest way to correct this is:
change
init_block = basic_blocks[0]
instructions = init_block.instructions
to
instructions = [inst for block in basic_blocks for inst in block.instructions]

In addition the whole traces need to be re-parsed to identify all ERC20 and ERC721 tokens.

@pmdyy
Copy link

pmdyy commented Sep 2, 2022

    def get_function_sighashes(self, bytecode):
        bytecode = clean_bytecode(bytecode)
        if bytecode is not None:
            evm_code = EvmCode(contract=Contract(bytecode=bytecode), static_analysis=False, dynamic_analysis=False)
            evm_code.disassemble(bytecode)
            basic_blocks = evm_code.basicblocks
            if basic_blocks and len(basic_blocks) > 0:
                push4_instructions = []
                for basic_block in basic_blocks:
                    instructions = basic_block.instructions
                    push4_instructions.extend([inst for inst in instructions if inst.name == 'PUSH4'])
                return sorted(list(set('0x' + inst.operand for inst in push4_instructions)))
            else:
                return []
        else:
            return []

This fixes the problem when a contract has more than 6 PUSH4 instructions.

@wartstone
Copy link

does this fix for erc20, erc721 detection?

@imrankhan37
Copy link

Hey, could you show some proofs that this fixes token detection

@hanh-gamejam
Copy link
Author

I quick fix this function in the code bellow

    def get_function_sighashes(self, bytecode):
        bytecode = clean_bytecode(bytecode)
        if bytecode is not None:
            evm_code = EvmCode(contract=Contract(bytecode=bytecode), static_analysis=False, dynamic_analysis=False)
            evm_code.disassemble(bytecode)
            tmp = list()
            basic_blocks = evm_code.basicblocks
            if basic_blocks and len(basic_blocks) > 0:
                for i in range(len(basic_blocks)):
                    init_block = basic_blocks[i]
                    instructions = init_block.instructions
                    push4_instructions = [inst for inst in instructions if inst.name == 'PUSH4']
                    tmp = tmp + list(set('0x' + inst.operand for inst in push4_instructions))
                return sorted(list(set(tmp)))
            else:
                return []
        else:
            return []

It work with ERC20, ERC721. Note that if you want to validate ERC1155 you must be remove 00fdd58e (hash of balanceOf(address,uint256) ) function because if you optimize contract when deploy, it will become PUSH3 instead of PUSH4

@cagostino
Copy link

@hanh-gamejam i'm interested in this as well. As noted in referenced issue #353 , BAYC not shown as is_erc721=True. I've checked the function sighashes output by current ethereum-etl setup and the ones required for confirming is_erc721 are not present. if this is this solved by the above solution, should this be a pull request and move this forward?

Two other PRs that have already been started that are relevant: #267 , #282 . Perhaps this solution could be folded in somehow?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants