
fix: avoid data loss after truncate on init volume #5259

Closed
wants to merge 2 commits

Conversation

kmlebedev (Contributor) commented Jan 29, 2024

What problem are we solving?

#4991

This is not a complete fix for the underlying problem; it restricts deleting data from .dat files when the .idx file is broken. I want to avoid this:

```
volume_checking.go:123 Truncate /data/bucket_254.dat from 421324863000 bytes to 234354667858 bytes!
```

If we do not truncate, the volume goes into read-only mode and the index can be rebuilt manually with `weed fix`.

It is not the best solution, but it is better than nothing. The correct thing to do would be to rebuild the index, but that is an expensive operation.

How are we solving the problem?

Avoid truncating the .dat file when initializing a volume.
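
A minimal sketch of the intended behavior, with entirely hypothetical names (this is not the actual SeaweedFS code): on a data/index size mismatch, report an error and flip the volume to read-only instead of truncating the .dat file, leaving the repair to `weed fix`.

```go
// Hypothetical sketch only: names and structure are illustrative,
// not the real SeaweedFS volume loader.
package storage

import "fmt"

type Volume struct {
	dataFileSize int64 // actual size of the .dat file on disk
	indexedSize  int64 // size implied by the last .idx entry
	readOnly     bool
}

// checkDataIntegrity refuses to truncate on a mismatch and instead marks
// the volume read-only so the index can be rebuilt with `weed fix`.
func (v *Volume) checkDataIntegrity() error {
	if v.dataFileSize > v.indexedSize {
		// Previously the loader would truncate the .dat file here, which
		// is destructive when the .idx file is the corrupted side.
		v.readOnly = true
		return fmt.Errorf("data size mismatch: dat=%d bytes, idx implies %d bytes; "+
			"volume set read-only, rebuild the index with weed fix",
			v.dataFileSize, v.indexedSize)
	}
	return nil
}
```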

How is the PR tested?

Locally, after the fix:

```sh
make server
warp put --insecure --host 192.168.3.43:8000 --noclear --obj.size 1KiB --concurrent 300 --noprefix --duration 20s --access-key some_access_key1 --secret-key some_secret_key1
# stop server
truncate -s 219632 /var/folders/g9/bmz0lmtj5d1714v0g7pkdqvc0000gn/T/warp-benchmark-bucket_7.idx
make server
```

```
W0130 01:12:05.589731 volume_checking.go:123 Truncate /var/folders/g9/bmz0lmtj5d1714v0g7pkdqvc0000gn/T/warp-benchmark-bucket_7.dat from 21829240 bytes to 14980336 bytes!
I0130 01:12:05.594169 volume_loading.go:128 volumeDataIntegrityChecking failed data size mismatch
I0

volume id:7  size:21829240  collection:"warp-benchmark-bucket"  file_count:13727  read_only:true  version:3  modified_at_second:1706554626
```

Before the fix:

```
I0130 13:23:16.356967 volume_loading.go:121 open to write file /var/folders/g9/bmz0lmtj5d1714v0g7pkdqvc0000gn/T/warp-benchmark-bucket_7.idx
W0130 13:23:16.357495 volume_checking.go:121 Truncate /var/folders/g9/bmz0lmtj5d1714v0g7pkdqvc0000gn/T/warp-benchmark-bucket_7.dat from 21829240 bytes to 14980336 bytes!
I0130 13:23:16.374413 needle_map_memory.go:54 max file key: 97964 for file: /var/folders/g9/bmz0lmtj5d1714v0g7pkdqvc0000gn/T/warp-benchmark-bucket_7.idx
I0130 13:23:16.374423 disk_location.go:182 data file /var/folders/g9/bmz0lmtj5d1714v0g7pkdqvc0000gn/T//warp-benchmark-bucket_7.dat, replication=000 v=3 size=14980336 ttl=
I0130 13:23:16.378492 volume_layout.go:406 Volume 7 becomes writable
```

Checks

  • I have added unit tests if possible.
  • I will add related wiki document changes and link to this PR after merging.

chrislusf (Collaborator) commented Jan 30, 2024

Please add some comments here.

I feel there should be a better way for this purpose.

kmlebedev (Contributor, Author) commented Jan 30, 2024

There are two possible directions for improvement (a sketch of the second check follows this list):

  1. Eliminate Truncate entirely. In some cases truncation helps trim a last needle that was not completely written, but we would need to detect that specific case and confirm that is what actually happened.
  2. Detect that the index is completely broken, for example because it was written during disk problems or sorted incorrectly, and ask the operator to fix it manually.
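
A minimal sketch of what the second check might look like. It assumes 16-byte .idx entries with a 4-byte big-endian offset at bytes 8..12; that layout is an assumption made for illustration, not the verified on-disk format, and the monotonic-offset rule is only a heuristic:

```go
// Hypothetical heuristic: treat the index as broken if it has a partial
// trailing entry or if non-zero offsets ever decrease (a symptom of
// incorrect sorting). The 16-byte entry layout is assumed, not verified.
package main

import (
	"encoding/binary"
	"fmt"
	"os"
)

const idxEntrySize = 16 // assumed: 8-byte key, 4-byte offset, 4-byte size

func indexLooksBroken(idxPath string) (bool, error) {
	data, err := os.ReadFile(idxPath)
	if err != nil {
		return false, err
	}
	if len(data)%idxEntrySize != 0 {
		return true, nil // partial trailing entry
	}
	var prevOffset uint32
	for i := 0; i+idxEntrySize <= len(data); i += idxEntrySize {
		offset := binary.BigEndian.Uint32(data[i+8 : i+12])
		if offset == 0 {
			continue // deletion tombstones are assumed to carry offset 0
		}
		if offset < prevOffset {
			return true, nil // offsets went backwards: likely mis-sorted
		}
		prevOffset = offset
	}
	return false, nil
}

func main() {
	broken, err := indexLooksBroken("warp-benchmark-bucket_7.idx")
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	fmt.Println("index looks broken:", broken)
}
```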

chrislusf (Collaborator) commented

Let's remove the truncation from the verifyNeedleIntegrity(). It is not re-entrant and does not match the function name.
Currently it is called twice.

kmlebedev (Contributor, Author) commented

> Let's remove the truncation from the verifyNeedleIntegrity(). It is not re-entrant and does not match the function name.
> Currently it is called twice.

I have seen cases where truncation is necessary, but I suggest limiting it to the last one or two files from the index. Perhaps the first idea for limiting the deleted size is not the best one.

kmlebedev (Contributor, Author) commented

I looked at every place where truncation is used to revert to a previous state after an error while writing or appending data.

So if a panic happens during a write, partially written data can remain on disk; when the volume is mounted, a check verifies that the last few needles were written in their entirety. But this check does not cover extreme cases where the .dat or .idx file itself is significantly truncated.
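
A simplified sketch of that kind of tail check; the names are illustrative, and it only compares sizes, whereas a real check would parse and validate the needle itself:

```go
// Hypothetical sketch of a tail check: the needle referenced by the last
// .idx entry must fit entirely inside the .dat file.
package storage

import (
	"fmt"
	"os"
)

// verifyLastNeedle fails if offset+size of the last index entry runs past
// the end of the .dat file, i.e. the last write did not complete.
func verifyLastNeedle(datPath string, lastOffset, lastSize int64) error {
	fi, err := os.Stat(datPath)
	if err != nil {
		return err
	}
	if lastOffset+lastSize > fi.Size() {
		return fmt.Errorf("last needle truncated: needs %d bytes, %s has only %d",
			lastOffset+lastSize, datPath, fi.Size())
	}
	// Note: this catches a partially written tail, but not a .dat or .idx
	// file that was truncated far earlier, which is the case discussed here.
	return nil
}
```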

SmsS4 (Contributor) commented Feb 16, 2024

Isn't it better to take the maximum offset of the last 10 index entries (or maybe the last 1% of entries) and only truncate at that offset?

kmlebedev (Contributor, Author) commented

> Isn't it better to take the maximum offset of the last 10 index entries (or maybe the last 1% of entries) and only truncate at that offset?

Yes, good thought.
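
A hedged sketch of that idea, reusing the same assumed 16-byte entry layout as above (again an assumption, not the verified format): compute the maximum offset among the last N index entries and refuse any truncation below that point.

```go
// Hypothetical sketch: bound how far back a truncation may roll the .dat
// file by looking only at the tail of the index. Entry layout is assumed.
package main

import (
	"encoding/binary"
	"fmt"
	"os"
)

const (
	idxEntrySize = 16 // assumed: 8-byte key, 4-byte offset, 4-byte size
	tailEntries  = 10 // how many trailing entries to inspect
)

// safeTruncateOffset returns the largest offset seen in the last
// tailEntries index entries; callers would refuse any truncation that
// cuts the .dat file below this point.
func safeTruncateOffset(idxPath string) (uint32, error) {
	data, err := os.ReadFile(idxPath)
	if err != nil {
		return 0, err
	}
	data = data[:len(data)-len(data)%idxEntrySize] // drop any partial tail entry
	start := len(data) - tailEntries*idxEntrySize
	if start < 0 {
		start = 0
	}
	var maxOffset uint32
	for i := start; i+idxEntrySize <= len(data); i += idxEntrySize {
		if off := binary.BigEndian.Uint32(data[i+8 : i+12]); off > maxOffset {
			maxOffset = off
		}
	}
	return maxOffset, nil
}

func main() {
	off, err := safeTruncateOffset("warp-benchmark-bucket_7.idx")
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	fmt.Println("do not truncate below offset", off)
}
```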

chrislusf added a commit that referenced this pull request Feb 24, 2024
chrislusf (Collaborator) commented

Fixed in 7c45992.

chrislusf closed this Mar 6, 2024
kmlebedev deleted the avoid_data_loss branch Mar 28, 2024