ipld-eth-state-snapshot/scripts at v1.11.6-statediff-5.0.8 · cerc-io/ipld-eth-state-snapshot

History

Name		Name	Last commit message	Last commit date
parent directory ..
README.md		README.md
filter-bad-rows.sh		filter-bad-rows.sh
find-bad-rows.sh		find-bad-rows.sh

README.md

Data Validation

For a given table in the ipld-eth-db schema, we know the number of columns to be expected in each row in the data dump:

Table Expected columns

public.nodes 5

public.blocks 3

eth.header_cids 16

eth.state_cids 8

eth.storage_cids 9

Table	Expected columns
public.nodes	5
public.blocks	3
eth.header_cids	16
eth.state_cids	8
eth.storage_cids	9

Find Bad Data

Run the following command to find any rows having unexpected number of columns:

./scripts/find-bad-rows.sh -i <input-file> -c <expected-columns> -o [output-file] -d [include-data]

input-file -i: Input data file path
expected-columns -c: Expected number of columns in each row of the input file
output-file -o: Output destination file path (default: STDOUT)
include-data -d: Whether to include the data row in the output (true | false) (default: false)

The output is of format: row number, number of columns, the data row

Eg:

./scripts/find-bad-rows.sh -i eth.state_cids.csv -c 8 -o res.txt -d true

Output:

1 9 1500000,xxxxxxxx,0x83952d392f9b0059eea94b10d1a095eefb1943ea91595a16c6698757127d4e1c,,baglacgzasvqcntdahkxhufdnkm7a22s2eetj6mx6nzkarwxtkvy4x3bubdgq,\x0f,0,f,/blocks/,DMQJKYBGZRQDVLT2CRWVGPQNNJNCCJU7GL7G4VAI3LZVK4OL5Q2ARTI

Eg:

./scripts/find-bad-rows.sh -i public.nodes.csv -c 5 -o res.txt -d true
./scripts/find-bad-rows.sh -i public.blocks.csv -c 3 -o res.txt -d true
./scripts/find-bad-rows.sh -i eth.header_cids.csv -c 16 -o res.txt -d true
./scripts/find-bad-rows.sh -i eth.state_cids.csv -c 8 -o res.txt -d true
./scripts/find-bad-rows.sh -i eth.storage_cids.csv -c 9 -o res.txt -d true

Data Cleanup

In case of column count mismatch, data from file mode dumps can't be imported readily into ipld-eth-db.

Filter Bad Data

Run the following command to filter out rows having unexpected number of columns:

./scripts/filter-bad-rows.sh -i <input-file> -c <expected-columns> -o <output-file>

input-file -i: Input data file path
expected-columns -c: Expected number of columns in each row of the input file

output-file -o: Output destination file path

Eg:

./scripts/filter-bad-rows.sh -i public.nodes.csv -c 5 -o cleaned-public.nodes.csv
./scripts/filter-bad-rows.sh -i public.blocks.csv -c 3 -o cleaned-public.blocks.csv
./scripts/filter-bad-rows.sh -i eth.header_cids.csv -c 16 -o cleaned-eth.header_cids.csv
./scripts/filter-bad-rows.sh -i eth.state_cids.csv -c 8 -o cleaned-eth.state_cids.csv
./scripts/filter-bad-rows.sh -i eth.storage_cids.csv -c 9 -o cleaned-eth.storage_cids.csv

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

scripts

scripts

README.md

Data Validation

Find Bad Data

Data Cleanup

Filter Bad Data

Files

scripts

Directory actions

More options

Directory actions

More options

Latest commit

History

scripts

Folders and files

parent directory

README.md

Data Validation

Find Bad Data

Data Cleanup

Filter Bad Data