implementation of internal transaction exporter #104

Merged
merged 3 commits into from Oct 11, 2018
38 changes: 38 additions & 0 deletions README.md
@@ -31,6 +31,13 @@ Export ERC20 and ERC721 token details ([Schema](#tokenscsv), [Reference](#export
--provider-uri https://mainnet.infura.io --output tokens.csv
```

Export traces ([Schema](#tracescsv), [Reference](#export_tracespy)):

```bash
> python export_traces.py --start-block 0 --end-block 500000 \
--provider-uri file://$HOME/Library/Ethereum/parity.ipc --output traces.csv
```

[LIMITATIONS](#limitations)

## Table of Contents
@@ -43,6 +50,7 @@ Export ERC20 and ERC721 token details ([Schema](#tokenscsv), [Reference](#export
- [logs.csv](#logscsv)
- [contracts.csv](#contractscsv)
- [tokens.csv](#tokenscsv)
- [traces.csv](#tracescsv)
- [Exporting the Blockchain](#exporting-the-blockchain)
- [Export in 2 Hours](#export-in-2-hours)
- [Command Reference](#command-reference)
@@ -151,6 +159,24 @@ name | string |
decimals | bigint |
total_supply | numeric |

### traces.csv

Column | Type |
-----------------------------|-------------|
block_number | bigint |
transaction_hash | hex_string |
from_address | address |
to_address | address |
value | numeric |
contract_address | address |
input | hex_string |
trace_type | string |
gas | bigint |
gas_used | bigint |
subtraces | bigint |
trace_address | string |
error | string |

You can find column descriptions in [https://github.com/medvedev1088/ethereum-etl-airflow](https://github.com/medvedev1088/ethereum-etl-airflow/tree/master/dags/resources/stages/raw/schemas)

Note: for the `address` type all hex characters are lower-cased.
@@ -270,6 +296,7 @@ Additional steps:
- [export_receipts_and_logs.py](#export_receipts_and_logspy)
- [export_contracts.py](#export_contractspy)
- [export_tokens.py](#export_tokenspy)
- [export_traces.py](#export_tracespy)
- [get_block_range_for_date.py](#get_block_range_for_datepy)
- [get_keccak_hash.py](#get_keccak_hashpy)

@@ -391,6 +418,17 @@ Then export ERC20 / ERC721 tokens:

You can tune `--max-workers` for performance.

##### export_traces.py

This command uses the `trace_filter` JSON RPC API, which Infura does not support, so you will need a local Parity node.

```bash
> python export_traces.py --start-block 0 --end-block 500000 \
--provider-uri file://$HOME/Library/Ethereum/parity.ipc --batch-size 100 --output traces.csv
```

You can tune `--batch-size` and `--max-workers` for performance.

##### get_block_range_for_date.py

```bash
38 changes: 38 additions & 0 deletions ethereumetl/domain/trace.py
@@ -0,0 +1,38 @@
# MIT License
#
# Copyright (c) 2018 Evgeniy Filatov, evgeniyfilatov@gmail.com
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.


class EthTrace(object):
    def __init__(self):
        self.block_number = None
        self.transaction_hash = None
        self.from_address = None
        self.to_address = None
        self.value = None
        self.contract_address = None
        self.input = None
        self.trace_type = None
        self.gas = None
        self.gas_used = None
        self.subtraces = None
        self.trace_address = None
        self.error = None
75 changes: 75 additions & 0 deletions ethereumetl/jobs/export_traces_job.py
@@ -0,0 +1,75 @@
# MIT License
#
# Copyright (c) 2018 Evgeniy Filatov, evgeniyfilatov@gmail.com
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

from ethereumetl.executors.batch_work_executor import BatchWorkExecutor
from ethereumetl.jobs.base_job import BaseJob
from ethereumetl.utils import validate_range
from ethereumetl.mappers.trace_mapper import EthTraceMapper


class ExportTracesJob(BaseJob):
    def __init__(
            self,
            start_block,
            end_block,
            batch_size,
            web3,
            item_exporter,
            max_workers):
        validate_range(start_block, end_block)
        self.start_block = start_block
        self.end_block = end_block

        self.web3 = web3

        self.batch_work_executor = BatchWorkExecutor(batch_size, max_workers)
        self.item_exporter = item_exporter

        self.trace_mapper = EthTraceMapper()

    def _start(self):
        self.item_exporter.open()

    def _export(self):
        self.batch_work_executor.execute(
            range(self.start_block, self.end_block + 1),
            self._export_batch,
            total_items=self.end_block - self.start_block + 1
        )

    def _export_batch(self, block_number_batch):
        assert len(block_number_batch) > 0

        filter_params = {
            'fromBlock': hex(block_number_batch[0]),
            'toBlock': hex(block_number_batch[-1]),
        }

        json_traces = self.web3.parity.traceFilter(filter_params)

        for json_trace in json_traces:
            trace = self.trace_mapper.json_dict_to_trace(json_trace)
            self.item_exporter.export_item(self.trace_mapper.trace_to_dict(trace))

    def _end(self):
        self.batch_work_executor.shutdown()
        self.item_exporter.close()
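
Reviewer note: the way `_export_batch` turns a batch of block numbers into `trace_filter` parameters can be sketched in isolation as below. This is a minimal sketch, not the PR's code: `split_into_batches` is a hypothetical stand-in for whatever `BatchWorkExecutor` does internally (its source is not part of this diff), and it assumes batches are consecutive chunks of at most `batch_size` blocks.

```python
def build_trace_filter_params(block_number_batch):
    """Build trace_filter params for a contiguous, non-empty batch of blocks."""
    if not block_number_batch:
        raise ValueError('batch must not be empty')
    # The JSON RPC API expects block numbers as hex strings.
    return {
        'fromBlock': hex(block_number_batch[0]),
        'toBlock': hex(block_number_batch[-1]),
    }


def split_into_batches(start_block, end_block, batch_size):
    """Hypothetical helper: yield consecutive lists of block numbers.

    Assumption: BatchWorkExecutor chunks the range in a similar way.
    """
    blocks = range(start_block, end_block + 1)
    for i in range(0, len(blocks), batch_size):
        yield list(blocks[i:i + batch_size])


batches = list(split_into_batches(0, 249, 100))
params = [build_trace_filter_params(batch) for batch in batches]
```

With `--batch-size 100` and blocks 0 to 249, this yields three requests, the last one covering only 50 blocks. Each request is inclusive on both ends, which matches the `end_block + 1` in `_export`.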
51 changes: 51 additions & 0 deletions ethereumetl/jobs/exporters/traces_item_exporter.py
@@ -0,0 +1,51 @@
# MIT License
#
# Copyright (c) 2018 Evgeniy Filatov, evgeniyfilatov@gmail.com
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.


from ethereumetl.jobs.exporters.composite_item_exporter import CompositeItemExporter

FIELDS_TO_EXPORT = [
    'block_number',
    'transaction_hash',
    'from_address',
    'to_address',
    'value',
    'contract_address',
    'input',
    'trace_type',
    'gas',
    'gas_used',
    'subtraces',
    'trace_address',
    'error',
]


def traces_item_exporter(traces_output):
    return CompositeItemExporter(
        filename_mapping={
            'trace': traces_output
        },
        field_mapping={
            'trace': FIELDS_TO_EXPORT
        }
    )
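
Reviewer note: the role of `FIELDS_TO_EXPORT` (fixing the column set and order of `traces.csv`) can be illustrated with the standard library alone. This is only a sketch under an assumption: `CompositeItemExporter` is not shown in this diff, so the code below uses `csv.DictWriter` as a stand-in, and `write_items` is a hypothetical helper. The field list is shortened for readability.

```python
import csv
import io

# Shortened, illustrative subset of the real field list.
FIELDS = ['block_number', 'transaction_hash', 'trace_type', 'value']


def write_items(items, fields, stream):
    """Write dict items as CSV rows, keeping only the listed fields in order."""
    # extrasaction='ignore' drops keys such as 'type' that are not columns.
    writer = csv.DictWriter(stream, fieldnames=fields, extrasaction='ignore')
    writer.writeheader()
    for item in items:
        writer.writerow(item)


buffer = io.StringIO()
write_items(
    [{'type': 'trace', 'block_number': 1, 'transaction_hash': '0xabc',
      'trace_type': 'call', 'value': 10}],
    FIELDS,
    buffer,
)
csv_lines = buffer.getvalue().splitlines()
```

The `'type': 'trace'` key added by `trace_to_dict` selects the output file through `filename_mapping`, so it is not expected to appear as a column.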
94 changes: 94 additions & 0 deletions ethereumetl/mappers/trace_mapper.py
@@ -0,0 +1,94 @@
# MIT License
#
# Copyright (c) 2018 Evgeniy Filatov, evgeniyfilatov@gmail.com
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.


from ethereumetl.domain.trace import EthTrace
from ethereumetl.utils import hex_to_dec, to_normalized_address


class EthTraceMapper(object):
    def json_dict_to_trace(self, json_dict):
        trace = EthTrace()

        trace.block_number = json_dict.get('blockNumber', None)
        trace.transaction_hash = json_dict.get('transactionHash', None)
        trace.subtraces = json_dict.get('subtraces', None)
        trace.trace_address = json_dict.get('traceAddress', [])

        error = json_dict.get('error', None)

        if error:
            trace.error = error

        action = json_dict.get('action', {})
        result = json_dict.get('result', {})

        trace_type = json_dict.get('type', None)

        # common fields in call/create
        if trace_type in ('call', 'create'):
            trace.from_address = to_normalized_address(action.get('from', None))
            trace.value = hex_to_dec(action.get('value', None))
            trace.gas = hex_to_dec(action.get('gas', None))
            trace.gas_used = hex_to_dec(result.get('gasUsed', None))

        # process 'call' traces
        if trace_type == 'call':
            trace.trace_type = action.get('callType', None)

            trace.to_address = to_normalized_address(action.get('to', None))
            trace.input = action.get('input', None)
        else:
            trace.trace_type = trace_type

        # process other traces
        if trace_type == 'create':
            trace.contract_address = result.get('address', None)
            trace.to_address = to_normalized_address(0)
            trace.input = action.get('init', None)
        elif trace_type == 'suicide':
            trace.from_address = to_normalized_address(action.get('address', None))
            trace.to_address = to_normalized_address(action.get('refundAddress', None))
            trace.value = hex_to_dec(action.get('balance', None))
        elif trace_type == 'reward':
            trace.to_address = to_normalized_address(action.get('author', None))
            trace.value = hex_to_dec(action.get('value', None))

        return trace

    def trace_to_dict(self, trace):
        return {
            'type': 'trace',
            'block_number': trace.block_number,
            'transaction_hash': trace.transaction_hash,
            'from_address': trace.from_address,
            'to_address': trace.to_address,
            'value': trace.value,
            'contract_address': trace.contract_address,
            'input': trace.input,
            'trace_type': trace.trace_type,
            'gas': trace.gas,
            'gas_used': trace.gas_used,
            'subtraces': trace.subtraces,
            'trace_address': trace.trace_address,
            'error': trace.error,
        }
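
Reviewer note: to make the `call` branch of the mapper concrete, here is a self-contained sketch that flattens one trace. Everything in it is illustrative: `hex_to_dec` and `to_normalized_address` are simplified stand-ins for the helpers in `ethereumetl.utils` (their source is not in this diff), `flatten_call_trace` is a hypothetical function covering only `call` traces, and the sample trace uses made-up addresses and hashes.

```python
def hex_to_dec(hex_string):
    """Stand-in: convert a hex string to int, passing None through."""
    if hex_string is None:
        return None
    return int(hex_string, 16)


def to_normalized_address(address):
    """Stand-in: lower-case string addresses, pass other values through."""
    if not isinstance(address, str):
        return address
    return address.lower()


def flatten_call_trace(json_dict):
    """Hypothetical helper: map a 'call' trace to flat columns."""
    action = json_dict.get('action', {})
    result = json_dict.get('result', {})
    return {
        'block_number': json_dict.get('blockNumber'),
        'transaction_hash': json_dict.get('transactionHash'),
        'from_address': to_normalized_address(action.get('from')),
        'to_address': to_normalized_address(action.get('to')),
        'value': hex_to_dec(action.get('value')),
        'gas': hex_to_dec(action.get('gas')),
        'gas_used': hex_to_dec(result.get('gasUsed')),
        # For calls, the exported trace_type is the callType, not 'call'.
        'trace_type': action.get('callType'),
    }


# Made-up sample in the shape of a trace_filter result entry.
sample_trace = {
    'type': 'call',
    'blockNumber': 46147,
    'transactionHash': '0x' + '11' * 32,
    'subtraces': 0,
    'traceAddress': [],
    'action': {
        'callType': 'call',
        'from': '0x' + 'AB' * 20,
        'to': '0x' + 'CD' * 20,
        'gas': '0x5208',
        'value': '0xde0b6b3a7640000',
        'input': '0x',
    },
    'result': {'gasUsed': '0x0', 'output': '0x'},
}

row = flatten_call_trace(sample_trace)
```

Note that for `call` traces the `trace_type` column carries the `callType` value (for example `call` or `delegatecall`), while the other trace types export the top-level `type` unchanged.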
57 changes: 57 additions & 0 deletions export_traces.py
@@ -0,0 +1,57 @@
# MIT License
#
# Copyright (c) 2018 Evgeniy Filatov, evgeniyfilatov@gmail.com
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.


import argparse

from web3 import Web3

from ethereumetl.jobs.export_traces_job import ExportTracesJob
from ethereumetl.logging_utils import logging_basic_config
from ethereumetl.providers.auto import get_provider_from_uri
from ethereumetl.thread_local_proxy import ThreadLocalProxy
from ethereumetl.jobs.exporters.traces_item_exporter import traces_item_exporter

logging_basic_config()

parser = argparse.ArgumentParser(
    description='Exports traces using trace_filter JSON RPC API.')
parser.add_argument('-s', '--start-block', default=0, type=int, help='Start block')
parser.add_argument('-e', '--end-block', required=True, type=int, help='End block')
parser.add_argument('-b', '--batch-size', default=100, type=int, help='The number of blocks to filter at a time.')
parser.add_argument('-o', '--output', default='-', type=str, help='The output file. If not specified stdout is used.')
parser.add_argument('-w', '--max-workers', default=5, type=int, help='The maximum number of workers.')
parser.add_argument('-p', '--provider-uri', required=True, type=str,
                    help='The URI of the web3 provider e.g. '
                         'file://$HOME/.local/share/io.parity.ethereum/jsonrpc.ipc or http://localhost:8545/')

args = parser.parse_args()

job = ExportTracesJob(
    start_block=args.start_block,
    end_block=args.end_block,
    batch_size=args.batch_size,
    web3=ThreadLocalProxy(lambda: Web3(get_provider_from_uri(args.provider_uri))),
    item_exporter=traces_item_exporter(args.output),
    max_workers=args.max_workers)

job.run()