
Implement DAG builder for parsing Ethereum logs #17

Closed
medvedev1088 opened this issue Jul 28, 2019 · 6 comments

Comments


medvedev1088 commented Jul 28, 2019

The DAG builder is simply a Python function that generates an Airflow DAG given a list of parameters. An example of a DAG builder can be found here.

The input to the DAG builder for parsing Ethereum logs is a list of JSON files, each of which is a table definition file. An example table definition file is given below:

{
    "table": {
        "project_name": "blockchain-etl",
        "dataset_name": "zeroex",
        "table_name": "Exchange_event_LogFill",
        "table_description": "Lorem ipsum.",
        "schema": [
            {
                "name": "block_timestamp",
                "description": "Lorem ipsum.",
                "type": "TIMESTAMP"
            },
            {
                "name": "maker",
                "type": "STRING"
            },
            {
                "name": "taker",
                "type": "STRING"
            },
            {
                "name": "feeRecipient",
                "type": "STRING"
            },
            {
                "name": "makerToken",
                "type": "STRING"
            },
            {
                "name": "takerToken",
                "type": "STRING"
            },
            {
                "name": "filledMakerTokenAmount",
                "type": "STRING"
            },
            {
                "name": "filledTakerTokenAmount",
                "type": "STRING"
            },
            {
                "name": "paidMakerFee",
                "type": "STRING"
            },
            {
                "name": "paidTakerFee",
                "type": "STRING"
            },
            {
                "name": "tokens",
                "type": "STRING"
            },
            {
                "name": "orderHash",
                "type": "STRING"
            }
        ]
    },
    "parser": {
        "type": "log",
        "contract_address": "0x12459c951127e0c374ff9105dda097662a027093",
        "abi": {
            "anonymous": false,
            "inputs": [
                {
                    "indexed": true,
                    "name": "maker",
                    "type": "address"
                },
                {
                    "indexed": false,
                    "name": "taker",
                    "type": "address"
                },
                {
                    "indexed": true,
                    "name": "feeRecipient",
                    "type": "address"
                },
                {
                    "indexed": false,
                    "name": "makerToken",
                    "type": "address"
                },
                {
                    "indexed": false,
                    "name": "takerToken",
                    "type": "address"
                },
                {
                    "indexed": false,
                    "name": "filledMakerTokenAmount",
                    "type": "uint256"
                },
                {
                    "indexed": false,
                    "name": "filledTakerTokenAmount",
                    "type": "uint256"
                },
                {
                    "indexed": false,
                    "name": "paidMakerFee",
                    "type": "uint256"
                },
                {
                    "indexed": false,
                    "name": "paidTakerFee",
                    "type": "uint256"
                },
                {
                    "indexed": true,
                    "name": "tokens",
                    "type": "bytes32"
                },
                {
                    "indexed": false,
                    "name": "orderHash",
                    "type": "bytes32"
                }
            ],
            "name": "LogFill",
            "type": "event"
        },
        "field_mapping": {
            "TODO": "if necessary define rules for mapping abi fields to BigQuery table columns"
        }
    }
}

The output is an Airflow DAG:

  • For each table definition file, a PythonOperator task should be created that executes a BigQuery query job with a destination table (an example can be found here).
  • The SQL for the BigQuery job should be generated from the table and parser definitions using a Jinja template (an example Jinja template can be found here; an example log parsing query can be found here).
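The steps above can be sketched in plain Python. This is only an illustration of the intended shape, not the actual implementation: the names `render_parse_sql` and `build_parse_tasks` are hypothetical, the source logs table is assumed to be the public `bigquery-public-data.crypto_ethereum.logs` dataset, and a real builder would render a full Jinja template that decodes each ABI field and would wrap each (task_id, sql) pair in an Airflow PythonOperator.

```python
import json
from pathlib import Path


def render_parse_sql(table_def):
    """Render a simplified log-parsing query from a table definition.

    Sketch only: a real implementation would use a Jinja template that
    decodes each ABI field from the raw log `data` and `topics`; here we
    only build the column list and the contract-address filter.
    """
    table = table_def["table"]
    parser = table_def["parser"]
    columns = ",\n    ".join(f["name"] for f in table["schema"])
    return (
        f"SELECT\n    {columns}\n"
        # Assumed source table; the destination would be
        # {project_name}.{dataset_name}.{table_name} from the definition.
        f"FROM `bigquery-public-data.crypto_ethereum.logs`\n"
        f"WHERE address = '{parser['contract_address']}'"
    )


def build_parse_tasks(definition_files):
    """Yield one (task_id, sql) pair per table definition file.

    In the real DAG builder each pair would become a PythonOperator that
    runs the SQL as a BigQuery query job with a destination table.
    """
    for path in definition_files:
        table_def = json.loads(Path(path).read_text())
        task_id = f"parse_{table_def['table']['table_name']}"
        yield task_id, render_parse_sql(table_def)
```

A driver DAG file would then call the builder once per dataset directory, so adding a new parsed table is just a matter of dropping in a new JSON definition file.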
@medvedev1088 medvedev1088 changed the title Implement DAG for parsing Ethereum logs Implement DAG generator for parsing Ethereum logs Jul 28, 2019
@medvedev1088 medvedev1088 changed the title Implement DAG generator for parsing Ethereum logs Implement DAG builder for parsing Ethereum logs Jul 28, 2019
@gitcoinbot

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


This issue now has a funding of 500.0 DAI (500.0 USD @ $1.0/DAI) attached to it.


gitcoinbot commented Sep 19, 2019


Work has been started.

These users each claimed they can complete the work within two weeks.
Please review their action plans below:

1) igetgames has applied to start work (Funders only: approve worker | reject worker).

I will implement the DAG builder according to ethereum-etl-airflow/issues#17 and the README.
2) askeluv has started work.

Create an Airflow DAG builder which can later be used to parse Ethereum logs given an ABI, for example 0x transactions and ENS events.

Learn more on the Gitcoin Issue Details page.


askeluv commented Sep 24, 2019

Should we also add a field event_topic inside the parser object, so we can filter out the right log rows? Or is there perhaps some easy way to convert from the event name to the topic / signature?

E.g. Fill events for 0x should be topic 0x0bcc4c97732e47d9946f229edb95f5b6323f601300e4690de719993f3c371129


medvedev1088 commented Sep 24, 2019

@askeluv There is an example here of converting an event signature, e.g. Transfer(address,address,uint256), to its hash: https://github.com/blockchain-etl/ethereum-etl/blob/develop/ethereumetl/cli/get_keccak_hash.py. It uses the keccak function from the eth_utils lib.

I've also found these two functions in eth_utils/abi.py:

def event_signature_to_log_topic(event_signature: str) -> bytes:
    return keccak(text=event_signature.replace(" ", ""))


def event_abi_to_log_topic(event_abi: Dict[str, Any]) -> bytes:
    event_signature = _abi_to_signature(event_abi)
    return event_signature_to_log_topic(event_signature)
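So no extra event_topic field should be needed: the topic can be derived from the ABI already in the parser definition. The canonical signature string that gets hashed can be built with a few lines of stdlib Python. This is a simplified sketch of what `_abi_to_signature` does (it ignores nested tuple types); hashing the result, e.g. with `eth_utils.keccak(text=...)`, yields the log topic to filter on.

```python
def event_abi_to_signature(event_abi):
    """Build the canonical event signature string from an event ABI dict,
    e.g. {"name": "Transfer", "inputs": [...]} ->
    'Transfer(address,address,uint256)'.

    Simplified sketch: assumes flat (non-tuple) input types, as in the
    LogFill ABI above. keccak of this string gives the event's topic0.
    """
    arg_types = ",".join(inp["type"] for inp in event_abi["inputs"])
    return f"{event_abi['name']}({arg_types})"
```

For the LogFill ABI in the issue description this produces `LogFill(address,address,address,address,address,uint256,uint256,uint256,uint256,bytes32,bytes32)`, which the generated SQL could hash once at build time to get the topic filter.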

@gitcoinbot


Work for 500.0 DAI (500.0 USD @ $1.0/DAI) has been submitted by:

  1. @askeluv

@ceresstation please take a look at the submitted work:


@gitcoinbot


The funding of 500.0 DAI (500.0 USD @ $1.0/DAI) attached to this issue has been approved & issued to @askeluv.
