
Error when deploying a node from a snapshot: mdbx_env_open: MDBX_CORRUPTED, even though there seems to be enough space #10814

Open
hanzhenlong1314 opened this issue Jun 20, 2024 · 25 comments

@hanzhenlong1314

log:

meta_checktxnid:11415 catch invalid root_page_txnid 11557706 for maindb.mod_txnid 24300513 (workaround for incoherent flaw of unified page/buffer cache)
meta_waittxnid:11454 bailout waiting for valid snapshot (workaround for incoherent flaw of unified page/buffer cache)
mdbx_setup_dxb:16208 error -30796, while updating meta.geo: from l3-n749199549-u939524096/s2048-g1024 (txn#24300516), to l3-n749199549-u1006632960/s2048-g1024 (txn#24300517)
[EROR] [06-19|03:40:14.411] Erigon startup err="mdbx_env_open: MDBX_CORRUPTED: Maybe free space is over on disk. Otherwise it's hardware failure. Before creating issue please use tools like https://www.memtest86.com/ to test RAM and tools like https://www.smartmontools.org/ to test Disk. To handle hardware risks: use ECC RAM, use RAID of disks, run multiple application instances (or do backups). If hardware checks passed - check FS settings - 'fsync' and 'flock' must be enabled. Otherwise - please create issue in Application repo. On default DURABLE mode, power outage can't cause this error. On other modes - power outage may break last transaction and mdbx_chk can recover db in this case, see '-t' and '-0|1|2' options., label: chaindata, trace: [kv_mdbx.go:357 node.go:367 node.go:370 backend.go:245 node.go:124 main.go:66 make_app.go:54 command.go:276 app.go:333 app.go:307 main.go:34 proc.go:267 asm_amd64.s:1650]"

command:
docker run -d --name ok-erigon -u root -p 7011:30303 -p 7012:8545 -p 7013:9090 -v /data4/poly:/root/erigon/data/ ok-erigon --chain=bor-mainnet --bor.heimdall=https://heimdall-api.polygon.technology/ --http.addr=0.0.0.0 --http.vhosts=* --http.corsdomain=* --http.api=eth,erigon,engine,debug,trace --db.size.limit=15TB --datadir=/root/erigon/data/ --torrent.download.rate=512mb

I am using this snapshot: I unzipped it and replaced the mdbx.dat in chaindata with it.
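
The geometry values in the mdbx_setup_dxb log line above can be sanity-checked with a little arithmetic. The sketch below is only an interpretation: it assumes that the n and u fields are page counts (pages in use and upper bound) and that the database uses a 16 KiB page size, neither of which is stated in this thread.

```shell
#!/bin/sh
# ASSUMPTIONS (not confirmed in this thread): the "n" and "u" fields of the
# geometry string are page counts, and the database page size is 16 KiB.
PAGE_SIZE=16384
TIB=1099511627776

# Convert a page count to whole TiB (integer division, rounds down).
pages_to_tib() {
    echo $(( $1 * PAGE_SIZE / TIB ))
}

pages_to_tib 939524096     # old upper bound (u939524096):  prints 14
pages_to_tib 1006632960    # new upper bound (u1006632960): prints 15
pages_to_tib 749199549     # pages in use    (n749199549):  prints 11
```

Under these assumptions the failing open was trying to raise the upper bound from 14 TiB to 15 TiB, which would match --db.size.limit=15TB, and about 11 TiB of pages are in use, in the same range as the 12T that du reports later in the thread.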


@AskAlexSharov
Collaborator

AskAlexSharov commented Jun 20, 2024

Continue with the recommendations from the error message:

  1. Check free space.
  2. It may be a hardware failure. Please use tools like https://www.memtest86.com/ to test RAM and tools like https://www.smartmontools.org/ to test the disk.
  3. If the hardware checks pass: run make db-tools; mdbx_chk may recover the db. See the '-t' and '-0|1|2' options.
  4. Also double-check that fsync is not disabled on the server.

@hanzhenlong1314
Author

I found that I cannot start the node with snapshots, but it starts normally without them. Is there something wrong with the snapshots?

@AskAlexSharov
Collaborator

"Cannot start" is not nearly enough information.

@hanzhenlong1314
Author

I checked the free disk space and the memory, and both are sufficient.

@AskAlexSharov
Collaborator

What does "cannot start" mean? Is there an error message? Logs? Is it stuck? Or something else?

@hanzhenlong1314
Author

Even though the memory and disk are sufficient, the startup error is:

meta_checktxnid:11415 catch invalid root_page_txnid 11557706 for maindb.mod_txnid 24300513 (workaround for incoherent flaw of unified page/buffer cache)
meta_waittxnid:11454 bailout waiting for valid snapshot (workaround for incoherent flaw of unified page/buffer cache)
mdbx_setup_dxb:16208 error -30796, while updating meta.geo: from l3-n749199549-u939524096/s2048-g1024 (txn#24300516), to l3-n749199549-u1006632960/s2048-g1024 (txn#24300517)
[EROR] [06-19|03:40:14.411] Erigon startup err="mdbx_env_open: MDBX_CORRUPTED: Maybe free space is over on disk. Otherwise it's hardware failure. Before creating issue please use tools like https://www.memtest86.com/ to test RAM and tools like https://www.smartmontools.org/ to test Disk. To handle hardware risks: use ECC RAM, use RAID of disks, run multiple application instances (or do backups). If hardware checks passed - check FS settings - 'fsync' and 'flock' must be enabled. Otherwise - please create issue in Application repo. On default DURABLE mode, power outage can't cause this error. On other modes - power outage may break last transaction and mdbx_chk can recover db in this case, see '-t' and '-0|1|2' options., label: chaindata, trace: [kv_mdbx.go:357 node.go:367 node.go:370 backend.go:245 node.go:124 main.go:66 make_app.go:54 command.go:276 app.go:333 app.go:307 main.go:34 proc.go:267 asm_amd64.s:1650]"

@AskAlexSharov
Collaborator

It may be a hardware failure. Please use tools like https://www.memtest86.com/ to test RAM and tools like https://www.smartmontools.org/ to test the disk.

@hanzhenlong1314
Author

./memtester 100G 10
memtester version 4.5.1 (64-bit)
Copyright (C) 2001-2020 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).

pagesize is 4096
pagesizemask is 0xfffffffffffff000
want 102400MB (107374182400 bytes)
got 102400MB (107374182400 bytes), trying mlock ...locked.
Loop 1/10:
My test shows that the memory page size is 4 KB. Will this affect node startup? When I set db.pagesize=4kb, the node reported a different error:

erigon/data/ ok-erigon --chain=bor-mainnet --bor.heimdall=https://heimdall-api.polygon.technology --http.addr=0.0.0.0 --http.vhosts=* --http.corsdomain=* --http.api=eth,erigon,engine,debug,trace --db.size.limit=13TB --db.pagesize=4kb --datadir=/root/erigon/data/ --torrent.download.rate=512mb
[root@c01_docker_solfullnode_pap_hk poly-archive]# docker logs -f ok-erigon
[INFO] [06-25|08:31:57.497] logging to file system log dir=/root/erigon/data/logs file prefix=erigon log level=info json=false
[INFO] [06-25|08:31:57.497] Build info git_branch= git_tag= git_commit=
[INFO] [06-25|08:31:57.497] Starting Erigon on Bor Mainnet...
[INFO] [06-25|08:31:57.498] Maximum peer count ETH=100 total=100
[INFO] [06-25|08:31:57.498] starting HTTP APIs port=8545 APIs=eth,erigon,engine,debug,trace
[INFO] [06-25|08:31:57.498] torrent verbosity level=WRN
[INFO] [06-25|08:31:59.601] Set global gas cap cap=50000000
[INFO] [06-25|08:31:59.602] [Downloader] Running with ipv6-enabled=true ipv4-enabled=true download.rate=512mb upload.rate=4mb
[INFO] [06-25|08:31:59.602] Opening Database label=chaindata path=/root/erigon/data/chaindata
[EROR] [06-25|08:31:59.602] Erigon startup err="mdbx_env_set_geometry: MDBX_TOO_LARGE: Database is too large for current system, e.g. could NOT be mapped into RAM"
@AskAlexSharov

@AskAlexSharov
Collaborator

A 4kb pagesize can address a db of at most 8TB, so --db.size.limit=13TB is too much for a 4kb pagesize. Set it smaller.
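
The limit above follows from simple arithmetic if one assumes a fixed maximum page count. This is a minimal sketch: the 2^31 page limit is inferred from the "4kb pagesize, 8TB" figure in this comment rather than taken from Erigon or libmdbx documentation, and TB is treated as TiB (2^40 bytes).

```shell
#!/bin/sh
# ASSUMPTION: the database can hold at most 2^31 pages
# (inferred from "4kb pagesize -> 8TB" in this thread).
MAX_PAGES=2147483648
TIB=1099511627776

# Print the maximum database size in TiB for a page size given in bytes.
max_db_tib() {
    echo $(( MAX_PAGES * $1 / TIB ))
}

# Succeed if a size limit of $2 TiB fits a page size of $1 bytes.
limit_fits() {
    [ "$2" -le "$(max_db_tib "$1")" ]
}

max_db_tib 4096    # prints 8
max_db_tib 8192    # prints 16
limit_fits 4096 13 || echo "a 13TB limit does not fit a 4kb pagesize"
```

By the same arithmetic, an 8kb pagesize would allow up to 16 TiB, which is enough for a 13TB limit.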

@hanzhenlong1314
Author

So I should change the memory page size to 8kb, rather than simply setting db.pagesize=8kb, right? @AskAlexSharov

@hanzhenlong1314
Author

/data2/erigon/build/bin/integration mdbx_to_mdbx --datadir /data1/erigon_temp --chaindata /data2/poly-archive/erigon_data --chaindata.to /data1/poly/chaindata/
INFO[06-27|14:21:09.692] logging to file system log dir=/data1/erigon_temp/logs file prefix=integration log level=info json=false
panic: fail to open mdbx: mdbx_txn_begin: MDBX_PROBLEM: Unexpected internal error, transaction should be aborted, label: chaindata, trace: [kv_mdbx.go:369 kv_mdbx.go:475 backup.go:33 refetence_db.go:120 command.go:987 command.go:1115 command.go:1039 command.go:1032 main.go:18 proc.go:267 asm_amd64.s:1650]

goroutine 1 [running]:
github.com/ledgerwatch/erigon-lib/kv/mdbx.MdbxOpts.MustOpen({{0x1e3d520, 0xc000844fa0}, 0xc000b35e50, 0xc0014f80d0, {0x7ffd8c41f0b1, 0x1f}, 0x0, 0x20000000000, 0x40000000, 0xffffffffffffffff, ...})
github.com/ledgerwatch/erigon-lib@v0.0.0-00010101000000-000000000000/kv/mdbx/kv_mdbx.go:477 +0xc5
github.com/ledgerwatch/erigon-lib/kv/backup.OpenPair({0x7ffd8c41f0b1, 0x1f}, {0x7ffd8c41f0e0, 0x16}, 0x0, 0x0, {0x1e3d520, 0xc000844fa0})
github.com/ledgerwatch/erigon-lib@v0.0.0-00010101000000-000000000000/kv/backup/backup.go:33 +0x29c
github.com/ledgerwatch/erigon/cmd/integration/commands.glob..func5(0xc000691200?, {0x199afa6?, 0x4?, 0x199ae3e?})
github.com/ledgerwatch/erigon/cmd/integration/commands/refetence_db.go:120 +0x8c
github.com/spf13/cobra.(*Command).execute(0x2a8e000, {0xc0014d40c0, 0x6, 0x6})
github.com/spf13/cobra@v1.8.0/command.go:987 +0xaa3
github.com/spf13/cobra.(*Command).ExecuteC(0x2a8e5c0)
github.com/spf13/cobra@v1.8.0/command.go:1115 +0x3ff
github.com/spf13/cobra.(*Command).Execute(...)
github.com/spf13/cobra@v1.8.0/command.go:1039
github.com/spf13/cobra.(*Command).ExecuteContext(0x462d9c?, {0x1e334d0?, 0xc000b32af0?})
github.com/spf13/cobra@v1.8.0/command.go:1032 +0x47
main.main()
github.com/ledgerwatch/erigon/cmd/integration/main.go:18 +0xe6

Copying the data fails. Please help me.

@AskAlexSharov
Collaborator

AskAlexSharov commented Jun 27, 2024

If you need a database larger than 8TB, you can't use a 4kb pagesize. Use 8kb or more (re-create the target db).

@hanzhenlong1314
Author

I understand this problem now. I changed to db.pagesize=8KB, but the following error occurred when copying the snapshot data with integration mdbx_to_mdbx:

Running version: 2.60.0
System: Linux c01_docker_solfullnode_pap_hk 5.10.134-16.3.al8.x86_64 #1 SMP Tue Mar 26 18:54:05 CST 2024 x86_64 x86_64 x86_64 GNU/Linux
Disk size: 20T
Memory: 250G @AskAlexSharov

/data2/erigon/build/bin/integration mdbx_to_mdbx --datadir /data1/erigon_temp --chaindata /data2/poly-archive/erigon_data --chaindata.to /data1/poly/chaindata/
INFO[06-27|14:21:09.692] logging to file system log dir=/data1/erigon_temp/logs file prefix=integration log level=info json=false
panic: fail to open mdbx: mdbx_txn_begin: MDBX_PROBLEM: Unexpected internal error, transaction should be aborted, label: chaindata, trace: [kv_mdbx.go:369 kv_mdbx.go:475 backup.go:33 refetence_db.go:120 command.go:987 command.go:1115 command.go:1039 command.go:1032 main.go:18 proc.go:267 asm_amd64.s:1650]

goroutine 1 [running]:
github.com/ledgerwatch/erigon-lib/kv/mdbx.MdbxOpts.MustOpen({{0x1e3d520, 0xc000844fa0}, 0xc000b35e50, 0xc0014f80d0, {0x7ffd8c41f0b1, 0x1f}, 0x0, 0x20000000000, 0x40000000, 0xffffffffffffffff, ...})
github.com/ledgerwatch/erigon-lib@v0.0.0-00010101000000-000000000000/kv/mdbx/kv_mdbx.go:477 +0xc5
github.com/ledgerwatch/erigon-lib/kv/backup.OpenPair({0x7ffd8c41f0b1, 0x1f}, {0x7ffd8c41f0e0, 0x16}, 0x0, 0x0, {0x1e3d520, 0xc000844fa0})
github.com/ledgerwatch/erigon-lib@v0.0.0-00010101000000-000000000000/kv/backup/backup.go:33 +0x29c
github.com/ledgerwatch/erigon/cmd/integration/commands.glob..func5(0xc000691200?, {0x199afa6?, 0x4?, 0x199ae3e?})
github.com/ledgerwatch/erigon/cmd/integration/commands/refetence_db.go:120 +0x8c
github.com/spf13/cobra.(*Command).execute(0x2a8e000, {0xc0014d40c0, 0x6, 0x6})
github.com/spf13/cobra@v1.8.0/command.go:987 +0xaa3
github.com/spf13/cobra.(*Command).ExecuteC(0x2a8e5c0)
github.com/spf13/cobra@v1.8.0/command.go:1115 +0x3ff
github.com/spf13/cobra.(*Command).Execute(...)
github.com/spf13/cobra@v1.8.0/command.go:1039
github.com/spf13/cobra.(*Command).ExecuteContext(0x462d9c?, {0x1e334d0?, 0xc000b32af0?})
github.com/spf13/cobra@v1.8.0/command.go:1032 +0x47
main.main()
github.com/ledgerwatch/erigon/cmd/integration/main.go:18 +0xe6

Copying the data fails. Please help me.

@hanzhenlong1314
Author

hanzhenlong1314 commented Jun 27, 2024

This procedure seems to be for copying data between an old and a new node, not for importing a snapshot. The snapshot contains only one file, mdbx.dat. Is that why the command fails? @AskAlexSharov

  0. You will need 2x disk space (can be on different disks).
  1. Stop Erigon.
  2. Create a new db with the new --db.pagesize:
    ONLY_CREATE_DB=true ./build/bin/erigon --datadir=/erigon-new/ --chain="$CHAIN" --db.pagesize=8kb --db.size.limit=12T
    If Erigon doesn't stop after 1 min, just stop it.
  3. Build integration: cd erigon; make integration
  4. Run: ./build/bin/integration mdbx_to_mdbx --chaindata /existing/erigon/path/chaindata/ --chaindata.to /erigon-new/chaindata/
  5. cp -R /existing/erigon/path/snapshots /erigon-new/snapshots
  6. Start Erigon in the new datadir as usual.
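
The "2x disk space" requirement in the steps above can be checked before starting the copy. This is a rough sketch using only POSIX du, df and awk; the helper names are hypothetical, and the example paths are the placeholders from the steps, not real locations.

```shell
#!/bin/sh
# Size of a directory tree in KiB (POSIX `du -s -k`).
dir_kib() {
    du -s -k "$1" | awk '{ print $1 }'
}

# Free KiB on the filesystem that holds the given path.
# With `df -P -k` the data row is line 2 and "Available" is column 4.
free_kib() {
    df -P -k "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if the filesystem holding $2 has room for a full copy of $1.
can_copy() {
    [ "$(free_kib "$2")" -ge "$(dir_kib "$1")" ]
}

# Example with the placeholder paths from the steps above:
# can_copy /existing/erigon/path/chaindata /erigon-new || echo "not enough free space"
```

The check is conservative in one direction only: it compares against the current size of the source, so it does not account for any growth of the target db after the copy.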

@AskAlexSharov
Collaborator

Try checking whether both dbs are fine, for example with:
mdbx_stat -ef /data1/poly/chaindata/
mdbx_stat -ef /data2/poly-archive/erigon_data

@hanzhenlong1314
Author

@AskAlexSharov
./mdbx_stat -ef /data1/poly/chaindata/
mdbx_stat v0.12.9-16-gfff3fbd8 (2024-03-06T22:58:31+03:00, T-c5e6e3a4f75727b9e0039ad420ae167d3487d006)
Running for /data1/poly/chaindata/...
./mdbx_stat: mdbx_env_open() error -30794 MDBX_VERSION_MISMATCH: DB version mismatch libmdbx

[root@c01_docker_solfullnode_pap_hk bin]# ./mdbx_stat -ef /data2/poly-archive/erigon_data
mdbx_stat v0.12.9-16-gfff3fbd8 (2024-03-06T22:58:31+03:00, T-c5e6e3a4f75727b9e0039ad420ae167d3487d006)
Running for /data2/poly-archive/erigon_data...
./mdbx_stat: mdbx_txn_begin() error -30779 MDBX_PROBLEM: Unexpected internal error, transaction should be aborted

@AskAlexSharov
Collaborator

git --no-pager log -1 --oneline
make db-tools
du -h /data1/poly/chaindata/
./build/bin/mdbx_stat -ef /data1/poly/chaindata/
du -h /data2/poly-archive/erigon_data
./build/bin/mdbx_stat -ef /data2/poly-archive/erigon_data
./build/bin/mdbx_chk -0 -d /data2/poly-archive/erigon_data
./build/bin/mdbx_chk -1 -d /data2/poly-archive/erigon_data
./build/bin/mdbx_chk -2 -d /data2/poly-archive/erigon_data

Please use triple backticks for output formatting: https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax#quoting-code

@hanzhenlong1314
Author

1. du -h /data2/poly-archive/erigon_data
12T /data2/poly-archive/erigon_data

2. ./build/bin/mdbx_stat -ef /data2/poly-archive/erigon_data
mdbx_stat v0.12.9-16-gfff3fbd8 (2024-03-06T22:58:31+03:00, T-c5e6e3a4f75727b9e0039ad420ae167d3487d006)
Running for /data2/poly-archive/erigon_data...
./build/bin/mdbx_stat: mdbx_txn_begin() error -30779 MDBX_PROBLEM: Unexpected internal error, transaction should be aborted

3. ./build/bin/mdbx_chk -0 -d /data2/poly-archive/erigon_data
mdbx_chk v0.12.9-16-gfff3fbd8 (2024-03-06T22:58:31+03:00, T-c5e6e3a4f75727b9e0039ad420ae167d3487d006)
Running for /data2/poly-archive/erigon_data in 'read-only' mode...
! mdbx_txn_begin() failed, error -30779 MDBX_PROBLEM: Unexpected internal error, transaction should be aborted

4. ./build/bin/mdbx_chk -1 -d /data2/poly-archive/erigon_data
mdbx_chk v0.12.9-16-gfff3fbd8 (2024-03-06T22:58:31+03:00, T-c5e6e3a4f75727b9e0039ad420ae167d3487d006)
Running for /data2/poly-archive/erigon_data in 'read-only' mode...
! mdbx_txn_begin() failed, error -30779 MDBX_PROBLEM: Unexpected internal error, transaction should be aborted

5. ./build/bin/mdbx_chk -2 -d /data2/poly-archive/erigon_data
mdbx_chk v0.12.9-16-gfff3fbd8 (2024-03-06T22:58:31+03:00, T-c5e6e3a4f75727b9e0039ad420ae167d3487d006)
Running for /data2/poly-archive/erigon_data in 'read-only' mode...
! mdbx_txn_begin() failed, error -30779 MDBX_PROBLEM: Unexpected internal error, transaction should be aborted

@AskAlexSharov
Collaborator

./build/bin/mdbx_chk -vvv /data2/poly-archive/erigon_data

If it returns the same error, you can try:

git pull
git checkout e35_mdbx_v0_13
make db-tools
./build/bin/mdbx_chk -vvv /data2/poly-archive/erigon_data

If this command doesn't return anything new, then something is wrong with your database. Maybe you backed it up the wrong way (without shutting down Erigon). Maybe your hardware is broken: you can use tools like https://www.memtest86.com/ to test RAM and tools like https://www.smartmontools.org/ to test the disk.

Don't forget to git checkout release/2.60 afterwards.
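
One way to rule out the "backup taken without shutting down Erigon" case mentioned above is to refuse to copy while the process is alive and to verify the copy afterwards. This is a hedged sketch, not an Erigon tool: the helper names are hypothetical, it assumes the process is named exactly erigon and is visible to pgrep, and it relies on sha256sum from GNU coreutils. Hashing a multi-terabyte mdbx.dat will take a long time.

```shell
#!/bin/sh
# Succeed if a process named exactly "erigon" is running (pgrep from procps).
erigon_running() {
    pgrep -x erigon >/dev/null 2>&1
}

# Print the SHA-256 digest of a file (sha256sum from GNU coreutils).
sha256_of() {
    sha256sum "$1" | awk '{ print $1 }'
}

# Copy file $1 to $2 only while Erigon is stopped, then verify the copy.
safe_copy() {
    if erigon_running; then
        echo "erigon is still running; stop it first" >&2
        return 1
    fi
    cp "$1" "$2" && [ "$(sha256_of "$1")" = "$(sha256_of "$2")" ]
}

# Example (paths are placeholders):
# safe_copy /old/chaindata/mdbx.dat /backup/mdbx.dat || echo "backup failed"
```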

@hanzhenlong1314
Author

Yes.

[root@c01_docker_solfullnode_pap_hk erigon]# git branch
* (HEAD detached at v2.60.0)
  main
[root@c01_docker_solfullnode_pap_hk erigon]# ./build/bin/mdbx_stat -ef /data2/poly-archive/erigon_data
mdbx_stat v0.12.0-71-g1cac6536 (2022-07-28T09:57:31+07:00, T-9a6d7e5b917e5fbd14dc51835fa749d092aa1d72)
Running for /data2/poly-archive/erigon_data...
./build/bin/mdbx_stat: mdbx_txn_begin() error -30796 MDBX_CORRUPTED: Database is corrupted
[root@c01_docker_solfullnode_pap_hk erigon]# ./build/bin/mdbx_chk -0 -d /data2/poly-archive/erigon_data
mdbx_chk v0.12.0-71-g1cac6536 (2022-07-28T09:57:31+07:00, T-9a6d7e5b917e5fbd14dc51835fa749d092aa1d72)
Running for /data2/poly-archive/erigon_data in 'read-only' mode...
! bailout waiting for valid snapshot (workaround for incoherent flaw of unified page/buffer cache)
! mdbx_txn_begin() failed, error -30796 MDBX_CORRUPTED: Database is corrupted
[root@c01_docker_solfullnode_pap_hk erigon]# ./build/bin/mdbx_chk -1 -d /data2/poly-archive/erigon_data
mdbx_chk v0.12.0-71-g1cac6536 (2022-07-28T09:57:31+07:00, T-9a6d7e5b917e5fbd14dc51835fa749d092aa1d72)
Running for /data2/poly-archive/erigon_data in 'read-only' mode...
! bailout waiting for valid snapshot (workaround for incoherent flaw of unified page/buffer cache)
! mdbx_txn_begin() failed, error -30796 MDBX_CORRUPTED: Database is corrupted
[root@c01_docker_solfullnode_pap_hk erigon]# ./build/bin/mdbx_chk -2 -d /data2/poly-archive/erigon_data
mdbx_chk v0.12.0-71-g1cac6536 (2022-07-28T09:57:31+07:00, T-9a6d7e5b917e5fbd14dc51835fa749d092aa1d72)
Running for /data2/poly-archive/erigon_data in 'read-only' mode...
! bailout waiting for valid snapshot (workaround for incoherent flaw of unified page/buffer cache)
! mdbx_txn_begin() failed, error -30796 MDBX_CORRUPTED: Database is corrupted

@Giulio2002
Collaborator

Giulio2002 commented Jun 27, 2024

Hey, when you say snapshot, do you mean the ones provided by Polygon? Where did you download the snapshot from?

@hanzhenlong1314
Author

MDBX

Yes, after the snapshot is unzipped, there is only one mdbx.dat file

@Giulio2002
Collaborator

Giulio2002 commented Jul 1, 2024

Hey, @mh0lt, can you ping the Polygon guys about this specific issue? It is not an Erigon problem. I am leaving this issue open until we receive a response from the maintainers of those snapshots.
