
HDDS-7111. SCMHADBTransactionBuffer not flushed in time #3670

Closed
wants to merge 1 commit

Conversation

Xushaohong
Contributor

What changes were proposed in this pull request?

We recently found that after deleting all the keys, the cluster still retains some data. One of the reasons is the following:

SCM adds deleted blocks into transactions when it receives the request from OM. When HA is enabled, DBTransactionBuffer is implemented as SCMHADBTransactionBufferImpl, and inside this implementation the buffer is not flushed immediately. Normally it is flushed when SCM takes a snapshot, and the default snapshot gap threshold is 1000. If the user puts little load on the cluster (no new Ratis log entries are written), the buffer stays pending in memory indefinitely. The real deletion happens in SCMBlockDeletingService, which scans the DB to get the transactions; it cannot see the transactions still sitting in the in-memory buffer. This is why the DNs have not yet received the deleted block info.

This PR adds a flush monitor that checks the buffer regularly and triggers a flush when it is non-empty. This could also serve as a common mechanism if other cases like the deleted-blocks one appear in the future. A simplified sketch of the idea follows.
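A minimal sketch of the periodic flush-monitor idea, with a hypothetical `TransactionBuffer` interface standing in for the real buffer; this is an illustration of the mechanism, not the exact patch code:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class BufferFlushMonitor {

  /** Minimal view of the transaction buffer the monitor needs (assumed interface). */
  public interface TransactionBuffer {
    boolean hasPendingTransactions();
    void flush() throws Exception;
  }

  private final TransactionBuffer buffer;
  private final long intervalSeconds;
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  public BufferFlushMonitor(TransactionBuffer buffer, long intervalSeconds) {
    this.buffer = buffer;
    this.intervalSeconds = intervalSeconds;
  }

  /** Start the periodic check; flush only when there is buffered work. */
  public void start() {
    scheduler.scheduleWithFixedDelay(() -> {
      try {
        if (buffer.hasPendingTransactions()) {
          buffer.flush();
        }
      } catch (Exception e) {
        // Log and retry on the next tick; a missed flush is not fatal here.
        System.err.println("Buffer flush failed: " + e.getMessage());
      }
    }, intervalSeconds, intervalSeconds, TimeUnit.SECONDS);
  }

  public void stop() {
    scheduler.shutdown();
  }
}
```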

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-7111

How was this patch tested?

UT

@Xushaohong
Contributor Author

@ChenSammi PTAL~

@JacksonYao287
Contributor

Thanks @Xushaohong for the patch!
I would prefer not to use DBTransactionBuffer for DeleteBlockLog, since deletion is not time-sensitive. We can just use a RocksDB write batch for each OM delete request and commit it to RocksDB directly, so that the iterator can always get the latest view of the deleted block log. A rough sketch of this alternative is shown below.

@errose28
Contributor

Hi @Xushaohong. The issue you've brought up is definitely valid. We have discussed this as well in HDDS-6721. After some discussion, the preferred solution in that Jira was to add a time-based Ratis snapshot feature in RATIS-1583, which has not been implemented yet. IMO a time-based Ratis snapshot is the easiest way forward because it would provide a general-purpose Ratis feature that we can use with a single config. For example, have Ratis take a snapshot (flush to DB) every 1000 transactions or every 10 minutes (arbitrary example values), whichever comes first; a sketch of that trigger logic is shown below.
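A minimal sketch of the "whichever comes first" trigger logic described above (illustrative only; `SnapshotTrigger` is a made-up class, not the RATIS-1583 implementation):

```java
import java.time.Duration;
import java.time.Instant;

public class SnapshotTrigger {

  private final long txnThreshold;       // e.g. 1000 transactions
  private final Duration timeThreshold;  // e.g. 10 minutes
  private long txnsSinceSnapshot = 0;
  private Instant lastSnapshot = Instant.now();

  public SnapshotTrigger(long txnThreshold, Duration timeThreshold) {
    this.txnThreshold = txnThreshold;
    this.timeThreshold = timeThreshold;
  }

  /** Called for every applied transaction; returns true when a snapshot is due. */
  public synchronized boolean onTransactionApplied() {
    txnsSinceSnapshot++;
    return snapshotDue();
  }

  /** Also called from a periodic timer so an idle cluster still snapshots eventually. */
  public synchronized boolean snapshotDue() {
    boolean countHit = txnsSinceSnapshot >= txnThreshold;
    boolean timeHit = txnsSinceSnapshot > 0
        && Duration.between(lastSnapshot, Instant.now()).compareTo(timeThreshold) >= 0;
    return countHit || timeHit;
  }

  public synchronized void onSnapshotTaken() {
    txnsSinceSnapshot = 0;
    lastSnapshot = Instant.now();
  }
}
```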

@Xushaohong
Contributor Author


Makes sense, but the refactoring in Ratis will take a long time to come into Ozone. If I have time, I will take a look. Closing this PR now.

@errose28
Contributor

Thanks @Xushaohong, I will close this for now. If you need a temporary workaround to get the data deleted, you can restart the SCMs to force the logs to flush.

@errose28 closed this on Aug 16, 2022