Implement a way to build an index without blocking concurrent data modifications #7967

hvlad · 2024-01-16T22:23:45Z

Currently, while an index is being built, no modifications of the table data are allowed. This is required to create a correct index that does not miss new keys inserted during the build.

The time during which such a read lock is required could be significantly shortened.

aafemt · 2024-01-16T22:47:23Z

The time can be shortened all the way to zero, IMHO. Consider this:

The index gets the state "active but unusable".
DML operations maintain it as usual, but SELECTs ignore it.
A background table scan runs to add any missing nodes.
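
To make the three steps above concrete, here is a minimal toy sketch in Python. It is not Firebird code: `ToyTable`, `index_active`, `index_usable` and `background_scan` are hypothetical names, a set of (key, record id) pairs stands in for the b-tree, and record versions and transactions are ignored entirely.

```python
class ToyTable:
    """Toy model: rows maps record id -> key; index is a set of (key, record id)."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.index = set()
        self.index_active = False   # DML maintains the index
        self.index_usable = False   # SELECTs are allowed to use the index

    def insert(self, rid, key):
        self.rows[rid] = key
        if self.index_active:
            self.index.add((key, rid))

    def delete(self, rid):
        key = self.rows.pop(rid)
        if self.index_active:
            # The node may not have been added by the scan yet; discard() tolerates that.
            self.index.discard((key, rid))

    def select(self, key):
        if self.index_usable:
            return sorted(r for k, r in self.index if k == key)
        # Index is "active but unusable": fall back to a full scan.
        return sorted(r for r, k in self.rows.items() if k == key)

    def background_scan(self, batch=2):
        """Generator: adds missing nodes in batches, yielding so DML can interleave."""
        self.index_active = True
        pending = list(self.rows)
        for start in range(0, len(pending), batch):
            for rid in pending[start:start + batch]:
                if rid in self.rows:          # skip records deleted meanwhile
                    self.index.add((self.rows[rid], rid))
            yield
        self.index_usable = True


def demo():
    table = ToyTable({1: "b", 2: "a", 3: "c", 4: "a"})
    scan = table.background_scan()
    next(scan)             # first batch done, DML is not blocked
    table.insert(5, "a")   # concurrent insert goes straight into the index
    table.delete(4)        # concurrent delete of a not-yet-scanned record
    for _ in scan:         # let the scan finish
        pass
    return table
```

Because adding a node that already exists is a no-op in this model, the scan and the concurrent DML can touch the same record in either order and still converge on the same index contents.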

hvlad · 2024-01-17T18:33:08Z

This approach creates a less dense b-tree and could be much slower than our fast_load().

Currently I am thinking about a combined approach:
at the 1st stage, the engine builds the "main" b-tree using a table snapshot and fast_load(), while user attachments maintain a separate ("small") b-tree through usual DML activity;
at the 2nd stage, the "main" b-tree is "published" as the index b-tree and maintained by user attachments as usual, while the engine merges the "small" b-tree into the "main" b-tree;
after the merge finishes, the index is allowed to be used in SELECTs.
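
The stages above could be sketched roughly as follows. This is a toy Python model under loud assumptions, not the engine's design: `TwoStageBuild`, `build_main`, `publish` and `merge_small` are hypothetical names, sorted lists stand in for b-trees, `sorted()` stands in for fast_load(), and only inserts are modelled because the handling of deletions at the 1st stage is still an open question.

```python
import bisect


class TwoStageBuild:
    """Toy model of the combined approach; nodes are (key, record id) tuples."""

    def __init__(self, rows):
        self.rows = rows                 # live table: record id -> key
        self.snapshot = dict(rows)       # table snapshot taken when the build starts
        self.main = []                   # "main" b-tree stand-in (sorted list)
        self.small = []                  # "small" b-tree stand-in (sorted list)
        self.published = False
        self.usable = False              # may SELECTs use the index?

    def insert(self, rid, key):
        """User DML: goes to "small" before publishing, to "main" afterwards."""
        self.rows[rid] = key
        target = self.main if self.published else self.small
        bisect.insort(target, (key, rid))

    def build_main(self):
        """1st stage: dense bulk build from the snapshot (stand-in for fast_load())."""
        self.main = sorted((key, rid) for rid, key in self.snapshot.items())

    def publish(self):
        """2nd stage begins: "main" becomes the index b-tree maintained by DML."""
        self.published = True

    def merge_small(self):
        """2nd stage: merge "small" into "main", then allow SELECTs to use the index."""
        for node in self.small:
            bisect.insort(self.main, node)
        self.small = []
        self.usable = True


rows = {1: "b", 2: "a"}
build = TwoStageBuild(rows)
build.insert(3, "c")     # arrives during the 1st stage, lands in "small"
build.build_main()
build.insert(4, "0")     # still the 1st stage, lands in "small"
build.publish()
build.insert(5, "z")     # 2nd stage, lands directly in "main"
build.merge_small()
```

In this model the bulk-built part keeps whatever density the bulk loader gives it, and only the keys that arrived during the 1st stage are inserted one by one.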

Not sure how to handle deletion of index keys at the 1st stage.

aafemt · 2024-01-17T20:10:08Z

> This approach creates a less dense b-tree and could be much slower than our fast_load().

Yes, this is the price of uninterrupted DB operations, which may be acceptable.

> Not sure how to handle deletion of index keys at the 1st stage.

The 1st stage runs in snapshot mode, so garbage collection is blocked and node deletions shouldn't occur at all, no?

hvlad self-assigned this Jan 16, 2024

hvlad added the type: improvement label Jan 16, 2024

dyemanov changed the title ~~Implement a way to build index without blocking of data modifications~~ Implement a way to build an index without blocking concurrent data modifications Jan 22, 2024
