Datasets
{Un}pack datasets are ClickHouse database dumps that you can use freely.
We tried numerous approaches to making this massive amount of data publicly available while keeping the amount you need to download reasonable. The idea is that, even with limited bandwidth, you can download the datasets given some time and start using the tool-chain locally.
We have not tested exports or imports on smaller machines. The machine that currently processes the datasets (ingestion and export) has 256 GB of RAM and a 48-core AMD Threadripper processor. In the future we plan to address this and publish proper reports on the wiki. Below, under the examples, you can see how long it took us to dump, compress and decompress the data.
We'll keep it short: the bullet points below give a high-level overview of how the datasets are generated.
- Syncer service ingests the data into the database.
- Executing the export command, which takes all of the available tables from the database and creates a {table}.clickhouse dump for each.
- Executing the compress command, which compresses the data using 7z compression and creates {table}.clickhouse.7z files.
- Executing make compress-ethereum, which compresses the raw ethereum source code.
- Running scripts/upload.sh by executing make upload, which uploads the clickhouse and ethereum datasets to https://r2.unpack.dev/.
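
The publishing steps in the list above can be sketched as a small driver script. This is only an illustration: it assumes the command headings on this page (for example "unpack datasets clickhouse export") are the full command lines, that the unpack binary and make are on your PATH, and that the make targets are run from the repository root. The names PIPELINE and run_pipeline are hypothetical and not part of the tool-chain.

```python
import subprocess

# Hypothetical ordering of the publishing steps described on this page.
# The command strings are taken from this page; required flags or working
# directories, if any, are not covered here.
PIPELINE = [
    ["unpack", "datasets", "clickhouse", "export"],    # {table}.clickhouse dumps
    ["unpack", "datasets", "clickhouse", "compress"],  # {table}.clickhouse.7z files
    ["make", "compress-ethereum"],                     # raw ethereum source code
    ["make", "upload"],                                # runs scripts/upload.sh
]


def run_pipeline(dry_run=True):
    """Return the planned command lines; execute them only if dry_run is False.

    Execution stops at the first failing step because check=True raises
    subprocess.CalledProcessError on a non-zero exit code.
    """
    planned = []
    for cmd in PIPELINE:
        planned.append(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
    return planned


if __name__ == "__main__":
    for line in run_pipeline(dry_run=True):
        print(line)
```

Keeping the default at dry_run=True means the sketch only prints the plan; the syncer step is left out because it is a long-running service rather than a one-off command.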
Navigate to Import to see how you can import datasets. Check out Compress to see how you can compress datasets.
unpack datasets clickhouse export
Below you can see how a one-liner exports whole datasets into the .clickhouse format.
Exporting tables for database: default . Please be patient. This WILL take a while...
Exported table: contracts to /home/nevio/dev/unpack/inspector/datasets/contracts.clickhouse (size: 4.71 GB)
Exported table: metadata to /home/nevio/dev/unpack/inspector/datasets/metadata.clickhouse (size: 0.03 GB)
Exported table: tokens to /home/nevio/dev/unpack/inspector/datasets/tokens.clickhouse (size: 0.01 GB)
Exported table: ast to /home/nevio/dev/unpack/inspector/datasets/ast.clickhouse (size: 0.01 GB)
Exported table: cfg to /home/nevio/dev/unpack/inspector/datasets/cfg.clickhouse (size: 0.05 GB)
Exported table: constructors to /home/nevio/dev/unpack/inspector/datasets/constructors.clickhouse (size: 1.39 GB)
Exported table: standards to /home/nevio/dev/unpack/inspector/datasets/standards.clickhouse (size: 1.42 GB)
Exported table: variables to /home/nevio/dev/unpack/inspector/datasets/variables.clickhouse (size: 1.51 GB)
Exported table: functions to /home/nevio/dev/unpack/inspector/datasets/functions.clickhouse (size: 40.98 GB)
Exported table: events to /home/nevio/dev/unpack/inspector/datasets/events.clickhouse (size: 2.06 GB)
Successfully exported tables for database: default - Completed in 1m57.11145471s
Navigate to Export to see how you can export datasets.
unpack datasets clickhouse import
Importing tables into the database. Please be patient. This WILL take a while...
Imported data into table: contracts
Imported data into table: metadata
Imported data into table: tokens
Imported data into table: ast
Imported data into table: cfg
Imported data into table: constructors
Imported data into table: standards
Imported data into table: variables
Imported data into table: functions
Imported data into table: events
Successfully imported tables into the database. Completed in 3m32.002066679s
unpack datasets clickhouse compress
Compressing exported files. Please be patient. This WILL take a while...
Compressed file: /home/nevio/dev/unpack/inspector/datasets/contracts.clickhouse.7z (size: 0.19 GB)
Compressed file: /home/nevio/dev/unpack/inspector/datasets/metadata.clickhouse.7z (size: 0.01 GB)
Compressed file: /home/nevio/dev/unpack/inspector/datasets/tokens.clickhouse.7z (size: 0.00 GB)
Compressed file: /home/nevio/dev/unpack/inspector/datasets/ast.clickhouse.7z (size: 0.00 GB)
Compressed file: /home/nevio/dev/unpack/inspector/datasets/cfg.clickhouse.7z (size: 0.00 GB)
Compressed file: /home/nevio/dev/unpack/inspector/datasets/constructors.clickhouse.7z (size: 0.05 GB)
Compressed file: /home/nevio/dev/unpack/inspector/datasets/standards.clickhouse.7z (size: 0.01 GB)
Compressed file: /home/nevio/dev/unpack/inspector/datasets/variables.clickhouse.7z (size: 0.09 GB)
Compressed file: /home/nevio/dev/unpack/inspector/datasets/functions.clickhouse.7z (size: 2.23 GB)
Compressed file: /home/nevio/dev/unpack/inspector/datasets/events.clickhouse.7z (size: 0.08 GB)
Successfully compressed exported files. Completed in 18m34.316607237s
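
To get a feel for how much the compression step saves, the sizes from the export log and the compress log on this page can be compared. The sketch below only does arithmetic on those logged values; the names EXPORTED_GB, COMPRESSED_GB and compression_summary are made up for this example. Because the logs round to two decimals (several archives show as 0.00 GB), only the totals are compared and the resulting ratio is approximate.

```python
# Sizes in GB, copied from the export and compress logs on this page.
EXPORTED_GB = {
    "contracts": 4.71, "metadata": 0.03, "tokens": 0.01, "ast": 0.01,
    "cfg": 0.05, "constructors": 1.39, "standards": 1.42,
    "variables": 1.51, "functions": 40.98, "events": 2.06,
}
COMPRESSED_GB = {
    "contracts": 0.19, "metadata": 0.01, "tokens": 0.00, "ast": 0.00,
    "cfg": 0.00, "constructors": 0.05, "standards": 0.01,
    "variables": 0.09, "functions": 2.23, "events": 0.08,
}


def compression_summary(exported, compressed):
    """Return (total exported GB, total compressed GB, overall ratio)."""
    total_in = sum(exported.values())
    # Only count archives that have a matching exported table.
    total_out = sum(compressed[table] for table in exported)
    if total_out <= 0:
        raise ValueError("compressed total must be positive")
    return total_in, total_out, total_in / total_out


if __name__ == "__main__":
    total_in, total_out, ratio = compression_summary(EXPORTED_GB, COMPRESSED_GB)
    print(f"exported: {total_in:.2f} GB, compressed: {total_out:.2f} GB, "
          f"ratio: about {ratio:.1f}x")
```

With the logged values this comes to roughly 52 GB exported versus under 3 GB compressed, which is the trade-off behind the 18-minute compression time.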
unpack datasets clickhouse decompress
Decompressing exported database files. Please be patient. This WILL take a while...
Decompressed file: /home/nevio/dev/unpack/inspector/datasets/contracts.clickhouse
Decompressed file: /home/nevio/dev/unpack/inspector/datasets/metadata.clickhouse
Decompressed file: /home/nevio/dev/unpack/inspector/datasets/tokens.clickhouse
Decompressed file: /home/nevio/dev/unpack/inspector/datasets/ast.clickhouse
Decompressed file: /home/nevio/dev/unpack/inspector/datasets/cfg.clickhouse
Decompressed file: /home/nevio/dev/unpack/inspector/datasets/constructors.clickhouse
Decompressed file: /home/nevio/dev/unpack/inspector/datasets/standards.clickhouse
Decompressed file: /home/nevio/dev/unpack/inspector/datasets/variables.clickhouse
Decompressed file: /home/nevio/dev/unpack/inspector/datasets/functions.clickhouse
Decompressed file: /home/nevio/dev/unpack/inspector/datasets/events.clickhouse
Successfully decompressed exported files. Completed in 3m35.361370553s
unpack datasets clickhouse upload
Uploading exported archive datasets to Cloudflare R2. Please be patient. This WILL take a while...
Uploading (contracts.clickhouse.7z): 190.985 MiB, 100%, ETA 0s
Uploading (metadata.clickhouse.7z): 11.128 MiB, 100%, ETA 0s
Uploading (tokens.clickhouse.7z): 3.243 MiB, 100%, ETA 0s
Uploading (ast.clickhouse.7z): 3.015 MiB, 100%, ETA 0s
Uploading (cfg.clickhouse.7z): 5.093 MiB, 100%, ETA 0s
Uploading (constructors.clickhouse.7z): 51.992 MiB, 100%, ETA 0s
Uploading (standards.clickhouse.7z): 10.909 MiB, 100%, ETA 0s
Uploading (variables.clickhouse.7z): 93.205 MiB, 100%, ETA 0s
Uploading (functions.clickhouse.7z): 2.225 GiB, 100%, ETA 0s
Uploading (events.clickhouse.7z): 76.833 MiB, 100%, ETA 0s
Successfully uploaded exported archive datasets to Cloudflare R2. Completed in 4m24.593401168s
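
Since the datasets are meant to be usable on slower connections too, the archive sizes in the upload log above give a rough way to estimate download time for a given bandwidth. This is a back-of-the-envelope sketch: ARCHIVES_MIB and estimate_download_seconds are hypothetical names, the bandwidth is whatever you pass in, and the ethereum archive is not included because its size is not listed in this log.

```python
# Archive sizes in MiB, copied from the upload log above.
# functions.clickhouse.7z is logged as 2.225 GiB, i.e. 2.225 * 1024 MiB.
ARCHIVES_MIB = {
    "contracts": 190.985, "metadata": 11.128, "tokens": 3.243,
    "ast": 3.015, "cfg": 5.093, "constructors": 51.992,
    "standards": 10.909, "variables": 93.205,
    "functions": 2.225 * 1024, "events": 76.833,
}


def estimate_download_seconds(archives_mib, megabits_per_second):
    """Estimate transfer time, ignoring protocol overhead and latency."""
    if megabits_per_second <= 0:
        raise ValueError("bandwidth must be positive")
    total_bytes = sum(archives_mib.values()) * 1024 * 1024
    bytes_per_second = megabits_per_second * 1_000_000 / 8
    return total_bytes / bytes_per_second


if __name__ == "__main__":
    for mbps in (10, 50, 100):
        minutes = estimate_download_seconds(ARCHIVES_MIB, mbps) / 60
        print(f"{mbps:>4} Mbit/s: about {minutes:.1f} min")
```

The estimate is a lower bound on real transfer time, since it assumes the link is fully used for the whole download.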
unpack datasets download
Downloading exported archive datasets from Cloudflare R2. Please be patient. This WILL take a while...
Destination path: /home/nevio/dev/unpack/inspector/datasets-test
Database Tables: [contracts metadata tokens ast cfg constructors standards variables functions events]
Blockchains: [ethereum]
[11/12] Downloading ethereum.7z... 100% [========================================] (20 MB/s)
Successfully downloaded exported archive datasets from Cloudflare R2. Completed in 3m3.04008376s
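
The "Completed in ..." timings across the logs on this page can be put side by side with a few lines of code. The sketch below assumes the timings use a Go-style duration format (for example 1m57.11145471s) and parses only the h, m, s and ms units; parse_duration, TIMINGS and consumer_path_seconds are hypothetical names for this example, and the numbers apply to the machine described at the top of this page.

```python
import re

# Seconds per unit; "ms" is listed before "m" in the regex so it wins.
_UNITS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 1e-3}
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")

# Timings copied from the "Completed in ..." lines on this page.
TIMINGS = {
    "export": "1m57.11145471s",
    "import": "3m32.002066679s",
    "compress": "18m34.316607237s",
    "decompress": "3m35.361370553s",
    "upload": "4m24.593401168s",
    "download": "3m3.04008376s",
}


def parse_duration(text):
    """Convert a duration such as '18m34.3s' into seconds."""
    pos, total = 0, 0.0
    for match in _TOKEN.finditer(text):
        if match.start() != pos:  # unparsed characters between tokens
            raise ValueError(f"unexpected duration format: {text!r}")
        total += float(match.group(1)) * _UNITS[match.group(2)]
        pos = match.end()
    if pos == 0 or pos != len(text):
        raise ValueError(f"unexpected duration format: {text!r}")
    return total


def consumer_path_seconds(timings):
    """Time for the steps a dataset user runs: download, decompress, import."""
    return sum(parse_duration(timings[step])
               for step in ("download", "decompress", "import"))


if __name__ == "__main__":
    for step, raw in TIMINGS.items():
        print(f"{step:<11}{parse_duration(raw):>10.1f} s")
    print(f"download + decompress + import: "
          f"{consumer_path_seconds(TIMINGS) / 60:.1f} min")
```

On these figures compression is by far the slowest step, while the steps a dataset user actually runs (download, decompress, import) add up to roughly ten minutes on comparable hardware and bandwidth.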