Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add key management for WAL #214

Merged
merged 8 commits into from
Jun 25, 2024
Merged

Conversation

dAdAbird
Copy link
Member

  • Make the *.map *.dat processing code aware of custom databases and
    table spaces

  • Add XLog GUC and init the keyring based on that. Only FS for now

  • Make the internal/external key infrastructure work with custom
    (not stored in the database) keyrings.

  • Check and create an internal key for XLog during the server start.
    If the key is created (not the first start with the EncryptWAL), then
    upload it into the cache. We can't read the key from files while
    writing the XLog to the disk as it happens in the critical section and
    no palloc is allowed.

  • Create a custom cache for the global catalog external key as we can't
    use PG's hashmap during the (again, no pallocs in critical section).

  • During the server start, when pg_tde module is loading and it needs to
    read *.map, *.dat file, InitFileAccess is yet to be called, hence Vfd
    isn't ready to use. The same gonna happen during recovery. So use raw
    pread/pwrite calls istead.

{
Assert(fheader);

*bytes_read = FileRead(tde_file, fheader, TDE_FILE_HEADER_SIZE, 0, WAIT_EVENT_DATA_FILE_READ);
/* TODO: pgstat_report_wait_start / pgstat_report_wait_end */
*bytes_read = pg_pread(fd, fheader, TDE_FILE_HEADER_SIZE, 0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use some dyanmic switching between FileRead/pg_pread and other functions (if(filestuffinitialilzed) { doone; } else { dotheother; })?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea. I'll look into that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately there is no way to detect it w/o core changes. The backend checks it by asserting static variable and there is no external function without side effects that we can use.

init_keyring(void)
{
EncryptionState->keyring->type = get_keyring_provider_from_typename(KRingProviderType);
switch (EncryptionState->keyring->type)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about implementing a separate keyring mechanism for the global catalog (we have to refactor the other anyway), but I definitely don't like duplicating the configuration part. Can't we just extract load_keyring_provider_options and reuse it here? (and similarly to files, if the postgres JSON API doesn't work during initialization/recovery, we can't use it)

We can keep this as-is with a TODO to fix this as part of the keyring metadata recovery, but then it needs those todos (and issues)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added TODO and created an issue #218

@dAdAbird dAdAbird requested a review from dutow June 17, 2024 15:36
* Make the *.map *.dat processing code aware of custom databases and
  table spaces
* Add XLog GUC and init the keyring based on that. Only FS for now
* Make the internal/external key infrastructure work with custom
  (not stored in the database) keyrings.
* Check and create an internal key for XLog during the server start.
If the key is created (not the first start with the EncryptWAL), then
upload it into the cache. We can't read the key from files while
writing the XLog to the disk as it happens in the critical section and
no palloc is allowed.

* Create a custom cache for the global catalog external key as we can't
use PG's hashmap during the (again, no pallocs in critical section).
During the server start, when pg_tde module is loading and it needs to
read *.map, *.dat file, InitFileAccess is yet to be called, hence Vfd
isn't ready to use. The same gonna happen during recovery. So use raw
pread/pwrite calls istead.
enc_rel_key_data = tde_encrypt_rel_key(mkey, rel_key_data, rlocator);
pg_tde_write_key_map_entry(rlocator, enc_rel_key_data, &mkey->keyInfo);

pg_tde_put_key_into_map(rlocator->relNumber, rel_key_data);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Putting the Xlog key into the keymap means every new backend is born with this entry in its list. This is fine, but it also means we would need to search for the Xlog key from the map every time we need it. Currently, this key will always sit at the head of the list, which means we will get it at the first iteration, but when we start putting more global keys in the same map, that will take its toll during the search for key operation.
What do you think about using a separate cache store (possibly in shared memory) for the global keys?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion.
My take on it:
We need an Xlog key to be stored in the top memory context because of the InternalKey.ctx. This context if being initialized by OpenSSL with the pointer to the encryption context. And should the cache be in the shared memory, new backends would try to use the context of the first initialized backend (hence segfault). Having the case in the top memory context means every backend inherits it with the NULL context and initializes it exclusively. Using shared memory here would mean a pool of contexts, locking etc.
But we can have a global catalog cache in TopMemory context. I just don't want to do it in the PR, the reasons being:

  1. The PR is meant to be a minimum working version and it is already over bloated.
  2. I'd like to address it along with the key rotation as it may add some additional challenges and requirements for the keys caching.

Another thing: I'd change InternalKey cache in general. Now it is a linked list with objects allocated all over the memory and traversing such a list is an absolute performance nightmare (CPU caches etc).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a TODO comment

@dAdAbird dAdAbird merged commit a670e46 into Percona-Lab:smgr Jun 25, 2024
8 checks passed
@dAdAbird dAdAbird deleted the xlog_key_mgmt branch June 25, 2024 11:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants