Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added datablock out of order ops #1022

Merged
merged 8 commits into from
Mar 27, 2020
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
25 changes: 25 additions & 0 deletions src/util/datablock/datablock.c
Original file line number Diff line number Diff line change
Expand Up @@ -180,3 +180,28 @@ void DataBlock_Free(DataBlock *dataBlock) {
assert(pthread_mutex_destroy(&dataBlock->mutex) == 0);
rm_free(dataBlock);
}

/* --------- Out of Order interface implementation --------*/

void *DataBlock_AllocateItemOutOfOrder(DataBlock *dataBlock, uint64_t idx) {
DataBlock_Accommodate(dataBlock, idx);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please explain why are we calling this function.

dataBlock->itemCount++;
Block *block = GET_ITEM_BLOCK(dataBlock, idx);
idx = ITEM_POSITION_WITHIN_BLOCK(idx);
DataBlockItemHeader *item_header = (DataBlockItemHeader *)block->data + (idx * block->itemSize);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getting the header is becoming a common operation, please introduce a function which given a block and an item index retrieves the item header.

MARK_HEADER_AS_NOT_DELETED(item_header);
return ITEM_DATA(item_header);
}

void DataBlock_DeleteItemOutOfOrder(DataBlock *dataBlock, uint64_t idx) {
DataBlock_Accommodate(dataBlock, idx);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please explain why are we calling this function.

// Get block.
uint blockIdx = ITEM_INDEX_TO_BLOCK_INDEX(idx);
Block *block = dataBlock->blocks[blockIdx];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use GET_ITEM_BLOCK

uint blockPos = ITEM_POSITION_WITHIN_BLOCK(idx);
uint offset = blockPos * block->itemSize;
DataBlockItemHeader *item_header = (DataBlockItemHeader *)block->data + offset;
// Delete
MARK_HEADER_AS_DELETED(item_header);
dataBlock->deletedIdx = array_append(dataBlock->deletedIdx, idx);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there should be a single function which does both, as marking an item as delete means adding item index to the free_list

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is what the out of order deletion does.
In order to have a function which both inserts and index to the free array and mark its item as deleted, the input should be the index. From this index, you need to retrieve the item from the datablock.

}
17 changes: 17 additions & 0 deletions src/util/datablock/datablock.h
Original file line number Diff line number Diff line change
Expand Up @@ -81,3 +81,20 @@ void DataBlock_DeleteItem(DataBlock *dataBlock, uint64_t idx);

// Free block.
void DataBlock_Free(DataBlock *block);

/*----------- Data Block Out of Order Building -----------*/

/*
* Note!!!
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note!

* This API is to be used only during out of order re-construction of a data block.
* DO NOT USE THE BELOW METHODS COMBINED WITH THE NORMAL DATA BLACK API.
*/

// Return a pointer to allocated item in a given index.
// If needed, allocates new space.
// The values in indices smaller then idx are undefined if not allocated before.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Elements in the range [max_allocated_item.. idx] are neither considered deleted nor allocated,
Retrieving such an element will return garbage.

void *DataBlock_AllocateItemOutOfOrder(DataBlock *dataBlock, uint64_t idx);

// Marks an item within a given dataBlock, in a given index as deleted. Allocates additional space if need.
// Item distructor is not called in this call.
void DataBlock_DeleteItemOutOfOrder(DataBlock *dataBlock, uint64_t idx);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a more appropriate name for this would be DataBlock_MarkAsDeleted
the function will update the free_list and will set the element as deleted.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be better to have these declaration and maybe their implementation in a separated file e.g. OO_DataBlock, taking this thought to the extreme, we could place that file under serialisers folder

46 changes: 46 additions & 0 deletions tests/unit/test_datablock.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -187,3 +187,49 @@ TEST_F(DataBlockTest, RemoveItem) {
DataBlock_Free(dataBlock);
}

TEST_F(DataBlockTest, OutOfOrderBuilding) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this test would be more complete if we were to delete actual items, at the moment the delete_arr include indices which weren't used.

Also this new ability to introduce elements out of order will cause an issue,
suppose we call DataBlock_AllocateItemOutOfOrder only once with ID 5
what will happen if we try to iterate over the datablock? will items 0-4 will be returned?

// This test checks for a fragmented, data block out of order re-construction.
DataBlock *dataBlock = DataBlock_New(1, sizeof(int), NULL);
int insert_arr1[4] = {8, 2, 3, 6};
int delete_arr[2] = {4, 7};
int insert_arr2[4] = {9, 1, 5, 0};
int expected[8] = {0, 1, 2, 3, 5, 6, 8, 9};

// Insert the first array.
for(int i = 0; i < 4; i++) {
int *item = (int *)DataBlock_AllocateItemOutOfOrder(dataBlock, insert_arr1[i]);
*item = insert_arr1[i];
}
ASSERT_EQ(4, dataBlock->itemCount);
ASSERT_EQ(0, array_len(dataBlock->deletedIdx));

// Mark deleted values.
for(int i = 0; i < 2; i++)
DataBlock_DeleteItemOutOfOrder(dataBlock, delete_arr[i]);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's a line break include curly braces.


ASSERT_EQ(4, dataBlock->itemCount);
ASSERT_EQ(2, array_len(dataBlock->deletedIdx));

// Add another array
for(int i = 0; i < 4; i++) {
int *item = (int *)DataBlock_AllocateItemOutOfOrder(dataBlock, insert_arr2[i]);
*item = insert_arr2[i];
}

ASSERT_EQ(8, dataBlock->itemCount);
ASSERT_EQ(2, array_len(dataBlock->deletedIdx));

// Validate
DataBlockIterator *it = DataBlock_Scan(dataBlock);
for(int i = 0; i < 8; i++) {
int *item = (int *)DataBlockIterator_Next(it);
ASSERT_EQ(*item, expected[i]);
}
ASSERT_FALSE(DataBlockIterator_Next(it));

ASSERT_TRUE(dataBlock->deletedIdx[0] == 4 || dataBlock->deletedIdx[0] == 7);
ASSERT_TRUE(dataBlock->deletedIdx[1] == 4 || dataBlock->deletedIdx[1] == 7);

DataBlock_Free(dataBlock);
}