Disk array packed headers #3557
Conversation
Force-pushed from c7c0c6e to 6bb464b.
I'm not really sure why this is requiring a larger buffer pool size for the multi copy test to succeed. I'm suspicious that it might be caused by a subtle bug.
```diff
@@ -3,6 +3,7 @@
 #include <string>

 #include "catalog/catalog_entry/node_table_catalog_entry.h"
+#include "common/types/internal_id_t.h"
```
Double check if this include is necessary.
I'm not actually sure why that was added; I don't recall modifying that file, but it is where `table_id_t` is declared.
```diff
@@ -79,7 +79,7 @@ struct BufferPoolConstants {
 static constexpr uint64_t DEFAULT_VM_REGION_MAX_SIZE = (uint64_t)1 << 43; // (8TB)
 #endif

-static constexpr uint64_t DEFAULT_BUFFER_POOL_SIZE_FOR_TESTING = 1ull << 26; // (64MB)
+static constexpr uint64_t DEFAULT_BUFFER_POOL_SIZE_FOR_TESTING = 1ull << 27; // (128MB)
```
I wonder which test fails due to the buffer manager size.
It's the multi copy test (I've run into this before; that time it was caused by a bug), which I think writes a gigabyte or so of hash index data. But it doesn't need all that memory at one time, so it should be fine with a small buffer pool.
```cpp
void prepareCommit();

void checkpointInMemory() {
    for (size_t i = 0; i < headersForWriteTrx.size(); i++) {
```
Should we avoid `size_t`? As I understand it, it's not compatible on 32-bit platforms.
It's not incompatible, it's just that we can't expect it to be 64-bit on a 32-bit platform, so it shouldn't be used for things like iterating over the elements in the disk array, where we expect to be able to go beyond `UINT32_MAX`. But `std::vector::size` returns a `size_t` anyway (or at least usually does; according to cppreference, it's a `size_type` defined in the vector implementation).
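For illustration only (this is not code from the PR), a minimal sketch of the distinction: an explicit `uint64_t` index guarantees the full range on every platform, while `size_t` is only as wide as the platform's address space, so it can overflow when counting on-disk elements on a 32-bit build.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical example. On a 32-bit platform, size_t is typically
// 32 bits, so a size_t-indexed loop cannot count past UINT32_MAX.
// Disk arrays can hold more elements than that, so use an explicit
// 64-bit index when iterating over on-disk data.
void iterateDiskArrayElements(uint64_t numElements) {
    for (uint64_t i = 0; i < numElements; i++) {
        // process element i ...
    }
}

// For in-memory containers this concern doesn't apply:
// std::vector::size() returns std::vector<T>::size_type (size_t in
// practice), and a vector can never hold more elements than size_t
// can represent anyway.
void iterateVector(const std::vector<int>& values) {
    for (size_t i = 0; i < values.size(); i++) {
        // process values[i] ...
    }
}
```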
Force-pushed from 6bb464b to b9244a2.
Force-pushed from b9244a2 to 30e17bc.
Packs the disk array headers into as few pages as possible (each header page points to the next one to support adding new disk arrays to the disk array collection) and packs the hash index headers into two pages.
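As a rough sketch of that layout (the field names and page size here are illustrative assumptions, not the PR's actual definitions): each header page holds as many disk array headers as fit, plus the page index of the next header page, so appending a new disk array only means following the chain to the last page.

```cpp
#include <cstdint>

// Illustrative only; the real layout lives in this PR's
// DiskArrayCollection implementation and may differ.
static constexpr uint64_t PAGE_SIZE = 4096;        // assumed page size
static constexpr uint64_t INVALID_PAGE_IDX = UINT64_MAX;

struct DiskArrayHeader {
    uint64_t numElements;      // hypothetical fields
    uint64_t firstPIPPageIdx;
};

// Headers are packed into one page, and each header page links to
// the next one (INVALID_PAGE_IDX marks the end of the chain).
struct HeaderPage {
    uint64_t nextHeaderPageIdx;
    DiskArrayHeader headers[(PAGE_SIZE - sizeof(uint64_t)) / sizeof(DiskArrayHeader)];
};
static_assert(sizeof(HeaderPage) <= PAGE_SIZE);
```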
I've also fixed some padding issues related to #3484, but there still seems to be a small amount of uninitialized data in the main hash index file (either due to something in the slot, or maybe related to the disk array write iterator).
A new class `DiskArrayCollection` gets passed around with the `MetadataDAHInfo` structure, since the indices no longer refer to individual pages that can be easily read independently. Instead, the `DiskArrayCollection` reads all the headers when initialized, and the `MetadataDAHInfo` provides the index of the header for a particular chunk and an easy way of constructing the disk array.
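To make that relationship concrete, here's a hedged sketch of the access pattern described above (the interfaces are assumptions for illustration, not the PR's actual API): the collection loads every header up front by walking the header-page chain, and `MetadataDAHInfo` carries an index into the collection rather than a page that can be read on its own.

```cpp
#include <cstdint>
#include <vector>

// Illustrative sketch only; the real classes in this PR differ.
struct DiskArrayHeader { /* packed on-disk header fields */ };

class DiskArrayCollection {
public:
    DiskArrayCollection(/* file handle, first header page index */) {
        // On construction, follow the header-page chain and read every
        // header into memory, since headers no longer occupy one page
        // each and can't be fetched independently.
    }

    const DiskArrayHeader& getHeader(uint64_t headerIdx) const {
        return headers[headerIdx];
    }

private:
    std::vector<DiskArrayHeader> headers;
};

// Instead of the page index of a standalone header page,
// MetadataDAHInfo now identifies a chunk's header by its position
// in the collection.
struct MetadataDAHInfo {
    uint64_t headerIdx; // index of this chunk's header in the collection
};
```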