Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PS-9181: Add caching of dictionary table for component_masking_functions #5302

Open
wants to merge 9 commits into
base: release-8.4.0-1
Choose a base branch
from

Conversation

oleksandr-kachan
Copy link
Contributor

https://perconadev.atlassian.net/browse/PS-9181

  • Added caching of mysql.masking_dictionaries table content.
  • Implemented masking_dictionaries_flush() UDF which flushes data from the masking dictionaries table to the memory ache.
  • Added component_masking.dictionaries_flush_interval_seconds system variable.
  • Added actual flusher thread. It periodically rereads content of dictionary table and updates in-memory cache.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Clang-Tidy found issue(s) with the introduced code (1/3)


#include <mysql/components/services/log_builtins.h>
#include <mysql/psi/mysql_thread.h>
#include <mysqld_error.h>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ clang-diagnostic-error ⚠️
mysqld_error.h file not found

std::mutex flusher_mutex_;
std::condition_variable flusher_condition_var_;

PSI_thread_key psi_flusher_thread_key_;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ cppcoreguidelines-pro-type-member-init ⚠️
constructor does not initialize these fields: psi_flusher_thread_key_, flusher_thread_attr_

Suggested change
PSI_thread_key psi_flusher_thread_key_;
PSI_thread_key psi_flusher_thread_key_{};


PSI_thread_key psi_flusher_thread_key_;
my_thread_handle flusher_thread_;
my_thread_attr_t flusher_thread_attr_;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ cppcoreguidelines-pro-type-member-init ⚠️
constructor does not initialize these fields: psi_flusher_thread_key_, flusher_thread_attr_

Suggested change
my_thread_attr_t flusher_thread_attr_;
my_thread_attr_t flusher_thread_attr_{};

Comment on lines +48 to +46
bool contains(const std::string &dictionary_name,
const std::string &term) const;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ readability-convert-member-functions-to-static ⚠️
method contains can be made static

Suggested change
bool contains(const std::string &dictionary_name,
const std::string &term) const;
static bool contains(const std::string &dictionary_name,
const std::string &term) ;

}

bool query_cache::contains(const std::string &dictionary_name,
const std::string &term) const {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ readability-convert-member-functions-to-static ⚠️
method contains can be made static

Suggested change
const std::string &term) const {
const std::string &term) {

const std::string &term) const;
// returns a copy of the string to avoid race conditions
// an empty string is returned if the dictionary does not exist
std::string get_random(const std::string &dictionary_name) const;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ readability-convert-member-functions-to-static ⚠️
method get_random can be made static

Suggested change
std::string get_random(const std::string &dictionary_name) const;
static std::string get_random(const std::string &dictionary_name) ;

return acquired_dict_cache.contains(dictionary_name, term);
}

std::string query_cache::get_random(const std::string &dictionary_name) const {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ readability-convert-member-functions-to-static ⚠️
method get_random can be made static

Suggested change
std::string query_cache::get_random(const std::string &dictionary_name) const {
std::string query_cache::get_random(const std::string &dictionary_name) {

}

void query_cache::init_thd() noexcept {
auto *thd = new THD;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ cppcoreguidelines-owning-memory ⚠️
initializing non-owner THD * with a newly created gsl::owner<>

}

void query_cache::init_thd() noexcept {
auto *thd = new THD;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ bugprone-unhandled-exception-at-new ⚠️
missing exception handler for allocation failure at new

std::unique_ptr<THD> flusher_thd_;

void init_thd() noexcept;
void release_thd() noexcept;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ readability-convert-member-functions-to-static ⚠️
method release_thd can be made static

Suggested change
void release_thd() noexcept;
static void release_thd() noexcept;

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Clang-Tidy found issue(s) with the introduced code (2/3)

masking_functions::sql_context sql_ctx{global_command_services::instance()};
auto query{dict_query_builder_->select_all_from_dictionary()};
auto local_dict_cache{std::make_unique<bookshelf>()};
sql_context::row_callback<2> result_inserter{[&terms = *local_dict_cache](

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ cppcoreguidelines-init-variables ⚠️
variable result_inserter is not initialized

Suggested change
sql_context::row_callback<2> result_inserter{[&terms = *local_dict_cache](
sql_context::row_callback<2> result_inserter = 0{[&terms = *local_dict_cache](

@@ -33,9 +35,13 @@
#include <mysqld_error.h>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ clang-diagnostic-error ⚠️
mysqld_error.h file not found

return 1;
}

std::string check_error_message;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ cppcoreguidelines-init-variables ⚠️
variable check_error_message is not initialized

Suggested change
std::string check_error_message;
std::string check_error_message = 0;

@@ -174,8 +225,11 @@
REQUIRES_SERVICE(mysql_string_substr),
REQUIRES_SERVICE(mysql_string_compare),

REQUIRES_PSI_THREAD_SERVICE,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ cppcoreguidelines-pro-type-const-cast ⚠️
do not use const_cast

REQUIRES_SERVICE(mysql_command_query),
REQUIRES_SERVICE(mysql_command_query_result),
REQUIRES_SERVICE(mysql_command_field_info),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ cppcoreguidelines-pro-type-const-cast ⚠️
do not use const_cast

@@ -187,6 +241,8 @@
REQUIRES_SERVICE(mysql_current_thread_reader),
REQUIRES_SERVICE(mysql_thd_security_context),
REQUIRES_SERVICE(global_grants_check),
REQUIRES_SERVICE(component_sys_variable_register),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ cppcoreguidelines-pro-type-const-cast ⚠️
do not use const_cast

@@ -187,6 +241,8 @@
REQUIRES_SERVICE(mysql_current_thread_reader),
REQUIRES_SERVICE(mysql_thd_security_context),
REQUIRES_SERVICE(global_grants_check),
REQUIRES_SERVICE(component_sys_variable_register),
REQUIRES_SERVICE(component_sys_variable_unregister),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ cppcoreguidelines-pro-type-const-cast ⚠️
do not use const_cast


namespace masking_functions {

class bookshelf {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ cppcoreguidelines-special-member-functions ⚠️
class bookshelf defines a non-default destructor, a copy assignment operator, a move constructor and a move assignment operator but does not define a copy constructor

#include <mysql/components/services/component_sys_var_service.h>
#include <mysql/components/services/log_builtins.h>

#include <mysqld_error.h>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ clang-diagnostic-error ⚠️
mysqld_error.h file not found

throw std::runtime_error{"Error while executing SQL select query"};
}

unsigned int actual_number_of_fields;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ cppcoreguidelines-init-variables ⚠️
variable actual_number_of_fields is not initialized

Suggested change
unsigned int actual_number_of_fields;
unsigned int actual_number_of_fields = 0;

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Clang-Tidy found issue(s) with the introduced code (3/3)

@@ -1229,6 +1247,7 @@ DECLARE_STRING_UDF_AUTO(mask_uk_nin)
DECLARE_STRING_UDF_AUTO(mask_uuid)
DECLARE_STRING_UDF_AUTO(gen_blocklist)
DECLARE_STRING_UDF_AUTO(gen_dictionary)
DECLARE_STRING_UDF_AUTO(masking_dictionaries_flush)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ cppcoreguidelines-init-variables ⚠️
variable initid is not initialized

@@ -1229,6 +1247,7 @@
DECLARE_STRING_UDF_AUTO(mask_uuid)
DECLARE_STRING_UDF_AUTO(gen_blocklist)
DECLARE_STRING_UDF_AUTO(gen_dictionary)
DECLARE_STRING_UDF_AUTO(masking_dictionaries_flush)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ misc-unused-parameters ⚠️
parameter initid is unused


bool dictionary::contains(const std::string &term) const noexcept {
// TODO: in c++20 change to terms_.contains(term)
return terms_.count(term) > 0U;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ readability-container-contains ⚠️
use contains to check for membership

Suggested change
return terms_.count(term) > 0U;
return terms_.contains(term)U;

https://perconadev.atlassian.net/browse/PS-9148

- Added caching of mysql.masking_dictionaries table content.
- Implemented masking_dictionaries_flush() UDF which flushes data
  from the masking dictionaries table to the memory cache.
https://perconadev.atlassian.net/browse/PS-9148

The masking_functions.masking_database system variable for the
masking_functions component specifies database used for data
masking dictionaries.
oleksandr-kachan and others added 7 commits June 22, 2024 01:42
https://perconadev.atlassian.net/browse/PS-9148

- Added component_masking.dictionaries_flush_interval_seconds system
  variable.
- Added actual flusher thread. It periodically rereads content of
  dictionary table and updates in-memory cache.
https://perconadev.atlassian.net/browse/PS-9148

Introduced 'dictionary' and 'bookshelf' classes for storing terms on
per-dictionary level.
Reworked 'query_cache' to utilize these two new classes.
https://perconadev.atlassian.net/browse/PS-9148

Introduced 'component_sys_variable_service_tuple' class for groupping comonent
system variable registration services (supposed to be used with
'primitive_singleton' class template).

'query_cache' now expects 'query_builder' and 'flusher_interval_seconds' as its
constructor's parameters.

Eliminates custom MySQL types (like 'ulonglong') and its includes (like
'my_inttypes.h') from the publicly facing headers.

'query_cache' is now explicitly initialized / deinitialized in the component's
'init()'' / 'deinit()'' functions via 'primitive_singleton' interface.

'query_cache' helper thread-related methods made private.
https://perconadev.atlassian.net/browse/PS-9148

As std::string_view::data() is not guaranteed to be null-terminated, it is
not safe to use it in old c-functions accepting 'const char *'.
Some constants converted to arrays of char 'const char buffer[]{"value"}'.
https://perconadev.atlassian.net/browse/PS-9148

'command_service_tuple' struct extended with one more member - 'field_info'
service.

Reworked 'query_cache' class: instead of loading terms from the database in
constructor, this operation is now performed in first attempt to access one
of the dictionary methods ('contains()' / 'get_random()' / 'remove()' /
'insert()'). This is done in order to overcome a limitation that does not
allow 'mysql_command_query' service to be used from inside the componment
initialization function.
Fixed problem with 'm_dict_cache' shared pointer updated concurrently from
different threads.
Exceptions thrown from the cache loading function no longer escape the
flusher thread.

De-coupled 'sql_context' and 'bookshelf' classes: 'sql_context' now accepts a
generic insertion callback that can be used to populate any type of containers.

'component_masking_functions.dictionary_operations' MTR test case extended with
additional checks for flushed / unflushed dictionary cache.
https://perconadev.atlassian.net/browse/PS-9148

Both 'dictionary' and 'bookshelf' classes no longer include their own
'std::shared_mutex' to protect data. Instead, we now have a single
'std::shared_mutex' at the 'query_cache' level.

The return value of the 'get_random()' method in both 'dictionary' and
'bookshelf' classes changed from 'optional_string' to 'std::string_view'. Empty
(default constructed) 'std::string_view' is used as an indicator of an
unsuccessful operation.
'get_random()' method in the 'query_cache' class still returns a string by
value to avoid race conditions.

Changed the behaviour of the 'sql_context::execute_dml()' method - it now
throws when SQL errors (like "no table found", etc.) occur.
https://perconadev.atlassian.net/browse/PS-9181

The component_masking_functions fails to start cache flusger thread
from within compoent init function.

Make query_cache loading lazy. This allows to postpone flusher
initialization till the moment when component and server are
complettely initialized.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants