[feature-wip](unique-key-merge-on-write) speed up publish_txn #11557

yixiutt · 2022-08-05T08:18:52Z

In the original design, the delete bitmap was calculated in publish_txn. This
operation is slow because it has to load segment data and look up row keys in
the previous rowsets and segments. Publish version tasks must also run in order,
so the slow calculation leads to timeouts in publish_txn.

This PR separates the delete_bitmap calculation into two parts. The first part is
done when the memtable is flushed, so this work can run in parallel. The final
delete_bitmap is then calculated in publish_txn: it gets the set of rowset_ids
that should be included and removes rowsets that have been compacted. The rowset
difference between memtable flush and publish_txn is very small, so publish_txn
becomes very fast. In our test, publish_txn costs about 10 ms.
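
The two-phase idea described above can be illustrated with a minimal sketch. This is not the code from this PR: every name here (`Rowsets`, `FlushResult`, `mark_deleted`, `calc_at_flush`, `finish_at_publish`) is hypothetical, rowsets are modeled as plain in-memory key lists, and the delete bitmap is modeled as a set of (rowset id, row index) pairs rather than a real bitmap.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified model; none of these names are Doris APIs.
using RowsetId = int;
using RowLocation = std::pair<RowsetId, uint32_t>;  // (rowset id, row index)
// Each rowset is modeled as a list of keys; the row index is the position.
using Rowsets = std::map<RowsetId, std::vector<std::string>>;
using DeleteBitmap = std::set<RowLocation>;

// Mark every row in the rowsets listed in `ids` whose key is in `new_keys`.
void mark_deleted(const Rowsets& rowsets, const std::set<RowsetId>& ids,
                  const std::set<std::string>& new_keys, DeleteBitmap* bitmap) {
    assert(bitmap != nullptr);
    for (RowsetId id : ids) {
        auto it = rowsets.find(id);
        if (it == rowsets.end()) continue;
        const std::vector<std::string>& keys = it->second;
        for (uint32_t row = 0; row < keys.size(); ++row) {
            if (new_keys.count(keys[row]) > 0) {
                bitmap->insert(RowLocation(id, row));
            }
        }
    }
}

struct FlushResult {
    std::set<RowsetId> seen;  // rowsets that were visible at flush time
    DeleteBitmap bitmap;      // partial delete bitmap computed at flush time
};

// Phase 1: runs at memtable flush, so many loads can do this in parallel.
FlushResult calc_at_flush(const Rowsets& visible,
                          const std::set<std::string>& new_keys) {
    FlushResult result;
    for (const auto& kv : visible) result.seen.insert(kv.first);
    mark_deleted(visible, result.seen, new_keys, &result.bitmap);
    return result;
}

// Phase 2: runs in the serial publish step and only handles the difference
// between the rowsets seen at flush time and the rowsets visible now.
void finish_at_publish(const Rowsets& visible,
                       const std::set<std::string>& new_keys,
                       FlushResult* result) {
    std::set<RowsetId> current;
    std::set<RowsetId> added;
    for (const auto& kv : visible) {
        current.insert(kv.first);
        if (result->seen.count(kv.first) == 0) added.insert(kv.first);
    }
    // Drop entries that point into rowsets removed by compaction.
    for (auto it = result->bitmap.begin(); it != result->bitmap.end();) {
        if (current.count(it->first) == 0) {
            it = result->bitmap.erase(it);
        } else {
            ++it;
        }
    }
    // Only the newly visible rowsets still need a key lookup.
    mark_deleted(visible, added, new_keys, &result->bitmap);
    result->seen = current;
}
```

In this sketch the expensive scan over all visible rowsets happens in `calc_at_flush`, while `finish_at_publish` scans only rowsets that appeared after the flush, which is why the serial step stays cheap when that difference is small.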

Proposed changes

Issue Number: close #xxx

Problem summary

Describe your changes.

Checklist(Required)

Does it affect the original behavior:
- Yes
- No
- I don't know
Has unit tests been added:
- Yes
- No
- No Need
Has document been added or modified:
- Yes
- No
- No Need
Does it need to update dependencies:
- Yes
- No
Are there any changes that cannot be rolled back:
- Yes (If Yes, please explain WHY)
- No

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

be/src/olap/rowset/rowset_tree.cpp

be/src/olap/memtable.cpp

zhannngchen

LGTM

github-actions · 2022-08-05T13:06:44Z

PR approved by anyone and no changes requested.

be/src/olap/memtable.cpp

be/src/olap/rowset/beta_rowset_writer.cpp

be/src/olap/rowset/rowset_tree.h

be/src/olap/txn_manager.cpp

dataroaring

LGTM

github-actions · 2022-08-08T10:52:20Z

PR approved by at least one committer and no changes requested.

zhannngchen reviewed Aug 5, 2022

be/src/olap/rowset/rowset_tree.cpp (outdated, resolved)

be/src/olap/memtable.cpp (outdated, resolved)

yixiutt force-pushed the yixiu_pk_publish branch 2 times, most recently from ccbc414 to ad47937 Compare August 5, 2022 13:01

zhannngchen previously approved these changes Aug 5, 2022

github-actions bot added the reviewed label Aug 5, 2022

yixiutt dismissed zhannngchen’s stale review via 2bb522f August 8, 2022 02:26

yixiutt force-pushed the yixiu_pk_publish branch 2 times, most recently from 2bb522f to d5468ef Compare August 8, 2022 02:53

dataroaring reviewed Aug 8, 2022

be/src/olap/memtable.cpp (outdated, resolved)

be/src/olap/rowset/beta_rowset_writer.cpp (resolved)

be/src/olap/rowset/rowset_tree.h (resolved)

be/src/olap/txn_manager.cpp (outdated, resolved)

yixiutt force-pushed the yixiu_pk_publish branch from d5468ef to 0695464 Compare August 8, 2022 09:06

yixiutt force-pushed the yixiu_pk_publish branch from 0695464 to abfe0d1 Compare August 8, 2022 09:09

dataroaring approved these changes Aug 8, 2022

github-actions bot added the approved label (indicates a PR has been approved by one committer) Aug 8, 2022

dataroaring merged commit 0a5fd99 into apache:master Aug 8, 2022
