Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PS-8854: Convert binlog_utils_udf to a component with function auto-registration #8

Closed
wants to merge 2 commits into from

Conversation

inikep
Copy link
Owner

@inikep inikep commented Aug 16, 2023

No description provided.

@inikep inikep force-pushed the PS-8865-binlog_utils_udf branch 2 times, most recently from 73da9c6 to e97df7c Compare August 17, 2023 12:26
@inikep inikep closed this Aug 18, 2023
inikep added a commit that referenced this pull request Aug 24, 2023
PS-5741: Incorrect use of memset_s in keyring_vault.

Fixed the usage of memset_s. The arguments should be:
void memset_s(void *dest, size_t dest_max, int c, size_t n)
where the 2nd argument is size of buffer and the 3rd is
argument is character to fill.

---------------------------------------------------------------------------

PS-7769 - Fix use-after-return error in audit_log_exclude_accounts_validate

---

*Problem:*

`st_mysql_value::val_str` might return a pointer to `buf` which after
the function called is deleted. Therefore the value in `save`, after
reuturnin from the function, is invalid.

In this particular case, the error is not manifesting as val_str`
returns memory allocated with `thd_strmake` and it does not use `buf`.

*Solution:*

Allocate memory with `thd_strmake` so the memory in `save` is not local.

---------------------------------------------------------------------------

Fix test main.bug12969156 when WITH_ASAN=ON

*Problem:*

ASAN complains about stack-buffer-overflow on function `mysql_heartbeat`:

```
==90890==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fe746d06d14 at pc 0x7fe760f5b017 bp 0x7fe746d06cd0 sp 0x7fe746d06478
WRITE of size 24 at 0x7fe746d06d14 thread T16777215

Address 0x7fe746d06d14 is located in stack of thread T26 at offset 340 in frame
    #0 0x7fe746d0a55c in mysql_heartbeat(void*) /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:62

  This frame has 4 object(s):
    [48, 56) 'result' (line 66)
    [80, 112) '_db_stack_frame_' (line 63)
    [144, 200) 'tm_tmp' (line 67)
    [240, 340) 'buffer' (line 65) <== Memory access at offset 340 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
Thread T26 created by T25 here:
    #0 0x7fe760f5f6d5 in __interceptor_pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cpp:216
    #1 0x557ccbbcb857 in my_thread_create /home/yura/ws/percona-server/mysys/my_thread.c:104
    #2 0x7fe746d0b21a in daemon_example_plugin_init /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:148
    #3 0x557ccb4c69c7 in plugin_initialize /home/yura/ws/percona-server/sql/sql_plugin.cc:1279
    #4 0x557ccb4d19cd in mysql_install_plugin /home/yura/ws/percona-server/sql/sql_plugin.cc:2279
    #5 0x557ccb4d218f in Sql_cmd_install_plugin::execute(THD*) /home/yura/ws/percona-server/sql/sql_plugin.cc:4664
    #6 0x557ccb47695e in mysql_execute_command(THD*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5160
    #7 0x557ccb47977c in mysql_parse(THD*, Parser_state*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5952
    #8 0x557ccb47b6c2 in dispatch_command(THD*, COM_DATA const*, enum_server_command) /home/yura/ws/percona-server/sql/sql_parse.cc:1544
    #9 0x557ccb47de1d in do_command(THD*) /home/yura/ws/percona-server/sql/sql_parse.cc:1065
    #10 0x557ccb6ac294 in handle_connection /home/yura/ws/percona-server/sql/conn_handler/connection_handler_per_thread.cc:325
    #11 0x557ccbbfabb0 in pfs_spawn_thread /home/yura/ws/percona-server/storage/perfschema/pfs.cc:2198
    #12 0x7fe760ab544f in start_thread nptl/pthread_create.c:473
```

The reason is that `my_thread_cancel` is used to finish the daemon thread. This is not and orderly way of finishing the thread. ASAN does not register the stack variables are not used anymore which generates the error above.

This is a benign error as all the variables are on the stack.

*Solution*:

Finish the thread in orderly way by using a signalling variable.

---------------------------------------------------------------------------

PS-8204: Fix XML escape rules for audit plugin

https://jira.percona.com/browse/PS-8204

There was a wrong length specified for some XML
escape rules. As a result of this terminating null symbol from
replacement rule was copied into resulting string. This lead to
quer text truncation in audit log file.
In addition added empty replacement rules for '\b' and 'f' symbols
which just remove them from resulting string. These symboles are
not supported in XML 1.0.
inikep added a commit that referenced this pull request Oct 16, 2023
PS-5741: Incorrect use of memset_s in keyring_vault.

Fixed the usage of memset_s. The arguments should be:
void memset_s(void *dest, size_t dest_max, int c, size_t n)
where the 2nd argument is size of buffer and the 3rd is
argument is character to fill.

---------------------------------------------------------------------------

PS-7769 - Fix use-after-return error in audit_log_exclude_accounts_validate

---

*Problem:*

`st_mysql_value::val_str` might return a pointer to `buf` which after
the function called is deleted. Therefore the value in `save`, after
reuturnin from the function, is invalid.

In this particular case, the error is not manifesting as val_str`
returns memory allocated with `thd_strmake` and it does not use `buf`.

*Solution:*

Allocate memory with `thd_strmake` so the memory in `save` is not local.

---------------------------------------------------------------------------

Fix test main.bug12969156 when WITH_ASAN=ON

*Problem:*

ASAN complains about stack-buffer-overflow on function `mysql_heartbeat`:

```
==90890==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fe746d06d14 at pc 0x7fe760f5b017 bp 0x7fe746d06cd0 sp 0x7fe746d06478
WRITE of size 24 at 0x7fe746d06d14 thread T16777215

Address 0x7fe746d06d14 is located in stack of thread T26 at offset 340 in frame
    #0 0x7fe746d0a55c in mysql_heartbeat(void*) /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:62

  This frame has 4 object(s):
    [48, 56) 'result' (line 66)
    [80, 112) '_db_stack_frame_' (line 63)
    [144, 200) 'tm_tmp' (line 67)
    [240, 340) 'buffer' (line 65) <== Memory access at offset 340 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
Thread T26 created by T25 here:
    #0 0x7fe760f5f6d5 in __interceptor_pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cpp:216
    #1 0x557ccbbcb857 in my_thread_create /home/yura/ws/percona-server/mysys/my_thread.c:104
    #2 0x7fe746d0b21a in daemon_example_plugin_init /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:148
    #3 0x557ccb4c69c7 in plugin_initialize /home/yura/ws/percona-server/sql/sql_plugin.cc:1279
    #4 0x557ccb4d19cd in mysql_install_plugin /home/yura/ws/percona-server/sql/sql_plugin.cc:2279
    #5 0x557ccb4d218f in Sql_cmd_install_plugin::execute(THD*) /home/yura/ws/percona-server/sql/sql_plugin.cc:4664
    #6 0x557ccb47695e in mysql_execute_command(THD*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5160
    #7 0x557ccb47977c in mysql_parse(THD*, Parser_state*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5952
    #8 0x557ccb47b6c2 in dispatch_command(THD*, COM_DATA const*, enum_server_command) /home/yura/ws/percona-server/sql/sql_parse.cc:1544
    #9 0x557ccb47de1d in do_command(THD*) /home/yura/ws/percona-server/sql/sql_parse.cc:1065
    #10 0x557ccb6ac294 in handle_connection /home/yura/ws/percona-server/sql/conn_handler/connection_handler_per_thread.cc:325
    #11 0x557ccbbfabb0 in pfs_spawn_thread /home/yura/ws/percona-server/storage/perfschema/pfs.cc:2198
    #12 0x7fe760ab544f in start_thread nptl/pthread_create.c:473
```

The reason is that `my_thread_cancel` is used to finish the daemon thread. This is not and orderly way of finishing the thread. ASAN does not register the stack variables are not used anymore which generates the error above.

This is a benign error as all the variables are on the stack.

*Solution*:

Finish the thread in orderly way by using a signalling variable.

---------------------------------------------------------------------------

PS-8204: Fix XML escape rules for audit plugin

https://jira.percona.com/browse/PS-8204

There was a wrong length specified for some XML
escape rules. As a result of this terminating null symbol from
replacement rule was copied into resulting string. This lead to
quer text truncation in audit log file.
In addition added empty replacement rules for '\b' and 'f' symbols
which just remove them from resulting string. These symboles are
not supported in XML 1.0.

---------------------------------------------------------------------------

PS-8854: Add main.percona_udf MTR test

Add a test to check FNV1A_64, FNV_64, and MURMUR_HASH user-defined functions.
inikep added a commit that referenced this pull request Oct 16, 2023
PS-5741: Incorrect use of memset_s in keyring_vault.

Fixed the usage of memset_s. The arguments should be:
void memset_s(void *dest, size_t dest_max, int c, size_t n)
where the 2nd argument is size of buffer and the 3rd is
argument is character to fill.

---------------------------------------------------------------------------

PS-7769 - Fix use-after-return error in audit_log_exclude_accounts_validate

---

*Problem:*

`st_mysql_value::val_str` might return a pointer to `buf` which after
the function called is deleted. Therefore the value in `save`, after
reuturnin from the function, is invalid.

In this particular case, the error is not manifesting as val_str`
returns memory allocated with `thd_strmake` and it does not use `buf`.

*Solution:*

Allocate memory with `thd_strmake` so the memory in `save` is not local.

---------------------------------------------------------------------------

Fix test main.bug12969156 when WITH_ASAN=ON

*Problem:*

ASAN complains about stack-buffer-overflow on function `mysql_heartbeat`:

```
==90890==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fe746d06d14 at pc 0x7fe760f5b017 bp 0x7fe746d06cd0 sp 0x7fe746d06478
WRITE of size 24 at 0x7fe746d06d14 thread T16777215

Address 0x7fe746d06d14 is located in stack of thread T26 at offset 340 in frame
    #0 0x7fe746d0a55c in mysql_heartbeat(void*) /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:62

  This frame has 4 object(s):
    [48, 56) 'result' (line 66)
    [80, 112) '_db_stack_frame_' (line 63)
    [144, 200) 'tm_tmp' (line 67)
    [240, 340) 'buffer' (line 65) <== Memory access at offset 340 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
Thread T26 created by T25 here:
    #0 0x7fe760f5f6d5 in __interceptor_pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cpp:216
    #1 0x557ccbbcb857 in my_thread_create /home/yura/ws/percona-server/mysys/my_thread.c:104
    #2 0x7fe746d0b21a in daemon_example_plugin_init /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:148
    #3 0x557ccb4c69c7 in plugin_initialize /home/yura/ws/percona-server/sql/sql_plugin.cc:1279
    #4 0x557ccb4d19cd in mysql_install_plugin /home/yura/ws/percona-server/sql/sql_plugin.cc:2279
    #5 0x557ccb4d218f in Sql_cmd_install_plugin::execute(THD*) /home/yura/ws/percona-server/sql/sql_plugin.cc:4664
    #6 0x557ccb47695e in mysql_execute_command(THD*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5160
    #7 0x557ccb47977c in mysql_parse(THD*, Parser_state*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5952
    #8 0x557ccb47b6c2 in dispatch_command(THD*, COM_DATA const*, enum_server_command) /home/yura/ws/percona-server/sql/sql_parse.cc:1544
    #9 0x557ccb47de1d in do_command(THD*) /home/yura/ws/percona-server/sql/sql_parse.cc:1065
    #10 0x557ccb6ac294 in handle_connection /home/yura/ws/percona-server/sql/conn_handler/connection_handler_per_thread.cc:325
    #11 0x557ccbbfabb0 in pfs_spawn_thread /home/yura/ws/percona-server/storage/perfschema/pfs.cc:2198
    #12 0x7fe760ab544f in start_thread nptl/pthread_create.c:473
```

The reason is that `my_thread_cancel` is used to finish the daemon thread. This is not and orderly way of finishing the thread. ASAN does not register the stack variables are not used anymore which generates the error above.

This is a benign error as all the variables are on the stack.

*Solution*:

Finish the thread in orderly way by using a signalling variable.

---------------------------------------------------------------------------

PS-8204: Fix XML escape rules for audit plugin

https://jira.percona.com/browse/PS-8204

There was a wrong length specified for some XML
escape rules. As a result of this terminating null symbol from
replacement rule was copied into resulting string. This lead to
quer text truncation in audit log file.
In addition added empty replacement rules for '\b' and 'f' symbols
which just remove them from resulting string. These symboles are
not supported in XML 1.0.

---------------------------------------------------------------------------

PS-8854: Add main.percona_udf MTR test

Add a test to check FNV1A_64, FNV_64, and MURMUR_HASH user-defined functions.
inikep added a commit that referenced this pull request Oct 25, 2023
PS-5741: Incorrect use of memset_s in keyring_vault.

Fixed the usage of memset_s. The arguments should be:
void memset_s(void *dest, size_t dest_max, int c, size_t n)
where the 2nd argument is size of buffer and the 3rd is
argument is character to fill.

---------------------------------------------------------------------------

PS-7769 - Fix use-after-return error in audit_log_exclude_accounts_validate

---

*Problem:*

`st_mysql_value::val_str` might return a pointer to `buf` which after
the function called is deleted. Therefore the value in `save`, after
reuturnin from the function, is invalid.

In this particular case, the error is not manifesting as val_str`
returns memory allocated with `thd_strmake` and it does not use `buf`.

*Solution:*

Allocate memory with `thd_strmake` so the memory in `save` is not local.

---------------------------------------------------------------------------

Fix test main.bug12969156 when WITH_ASAN=ON

*Problem:*

ASAN complains about stack-buffer-overflow on function `mysql_heartbeat`:

```
==90890==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fe746d06d14 at pc 0x7fe760f5b017 bp 0x7fe746d06cd0 sp 0x7fe746d06478
WRITE of size 24 at 0x7fe746d06d14 thread T16777215

Address 0x7fe746d06d14 is located in stack of thread T26 at offset 340 in frame
    #0 0x7fe746d0a55c in mysql_heartbeat(void*) /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:62

  This frame has 4 object(s):
    [48, 56) 'result' (line 66)
    [80, 112) '_db_stack_frame_' (line 63)
    [144, 200) 'tm_tmp' (line 67)
    [240, 340) 'buffer' (line 65) <== Memory access at offset 340 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
Thread T26 created by T25 here:
    #0 0x7fe760f5f6d5 in __interceptor_pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cpp:216
    #1 0x557ccbbcb857 in my_thread_create /home/yura/ws/percona-server/mysys/my_thread.c:104
    #2 0x7fe746d0b21a in daemon_example_plugin_init /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:148
    #3 0x557ccb4c69c7 in plugin_initialize /home/yura/ws/percona-server/sql/sql_plugin.cc:1279
    #4 0x557ccb4d19cd in mysql_install_plugin /home/yura/ws/percona-server/sql/sql_plugin.cc:2279
    #5 0x557ccb4d218f in Sql_cmd_install_plugin::execute(THD*) /home/yura/ws/percona-server/sql/sql_plugin.cc:4664
    #6 0x557ccb47695e in mysql_execute_command(THD*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5160
    #7 0x557ccb47977c in mysql_parse(THD*, Parser_state*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5952
    #8 0x557ccb47b6c2 in dispatch_command(THD*, COM_DATA const*, enum_server_command) /home/yura/ws/percona-server/sql/sql_parse.cc:1544
    #9 0x557ccb47de1d in do_command(THD*) /home/yura/ws/percona-server/sql/sql_parse.cc:1065
    #10 0x557ccb6ac294 in handle_connection /home/yura/ws/percona-server/sql/conn_handler/connection_handler_per_thread.cc:325
    #11 0x557ccbbfabb0 in pfs_spawn_thread /home/yura/ws/percona-server/storage/perfschema/pfs.cc:2198
    #12 0x7fe760ab544f in start_thread nptl/pthread_create.c:473
```

The reason is that `my_thread_cancel` is used to finish the daemon thread. This is not and orderly way of finishing the thread. ASAN does not register the stack variables are not used anymore which generates the error above.

This is a benign error as all the variables are on the stack.

*Solution*:

Finish the thread in orderly way by using a signalling variable.

---------------------------------------------------------------------------

PS-8204: Fix XML escape rules for audit plugin

https://jira.percona.com/browse/PS-8204

There was a wrong length specified for some XML
escape rules. As a result of this terminating null symbol from
replacement rule was copied into resulting string. This lead to
quer text truncation in audit log file.
In addition added empty replacement rules for '\b' and 'f' symbols
which just remove them from resulting string. These symboles are
not supported in XML 1.0.

---------------------------------------------------------------------------

PS-8854: Add main.percona_udf MTR test

Add a test to check FNV1A_64, FNV_64, and MURMUR_HASH user-defined functions.
inikep added a commit that referenced this pull request Oct 26, 2023
PS-5741: Incorrect use of memset_s in keyring_vault.

Fixed the usage of memset_s. The arguments should be:
void memset_s(void *dest, size_t dest_max, int c, size_t n)
where the 2nd argument is size of buffer and the 3rd is
argument is character to fill.

---------------------------------------------------------------------------

PS-7769 - Fix use-after-return error in audit_log_exclude_accounts_validate

---

*Problem:*

`st_mysql_value::val_str` might return a pointer to `buf` which after
the function called is deleted. Therefore the value in `save`, after
reuturnin from the function, is invalid.

In this particular case, the error is not manifesting as val_str`
returns memory allocated with `thd_strmake` and it does not use `buf`.

*Solution:*

Allocate memory with `thd_strmake` so the memory in `save` is not local.

---------------------------------------------------------------------------

Fix test main.bug12969156 when WITH_ASAN=ON

*Problem:*

ASAN complains about stack-buffer-overflow on function `mysql_heartbeat`:

```
==90890==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fe746d06d14 at pc 0x7fe760f5b017 bp 0x7fe746d06cd0 sp 0x7fe746d06478
WRITE of size 24 at 0x7fe746d06d14 thread T16777215

Address 0x7fe746d06d14 is located in stack of thread T26 at offset 340 in frame
    #0 0x7fe746d0a55c in mysql_heartbeat(void*) /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:62

  This frame has 4 object(s):
    [48, 56) 'result' (line 66)
    [80, 112) '_db_stack_frame_' (line 63)
    [144, 200) 'tm_tmp' (line 67)
    [240, 340) 'buffer' (line 65) <== Memory access at offset 340 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
Thread T26 created by T25 here:
    #0 0x7fe760f5f6d5 in __interceptor_pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cpp:216
    #1 0x557ccbbcb857 in my_thread_create /home/yura/ws/percona-server/mysys/my_thread.c:104
    #2 0x7fe746d0b21a in daemon_example_plugin_init /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:148
    #3 0x557ccb4c69c7 in plugin_initialize /home/yura/ws/percona-server/sql/sql_plugin.cc:1279
    #4 0x557ccb4d19cd in mysql_install_plugin /home/yura/ws/percona-server/sql/sql_plugin.cc:2279
    #5 0x557ccb4d218f in Sql_cmd_install_plugin::execute(THD*) /home/yura/ws/percona-server/sql/sql_plugin.cc:4664
    #6 0x557ccb47695e in mysql_execute_command(THD*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5160
    #7 0x557ccb47977c in mysql_parse(THD*, Parser_state*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5952
    #8 0x557ccb47b6c2 in dispatch_command(THD*, COM_DATA const*, enum_server_command) /home/yura/ws/percona-server/sql/sql_parse.cc:1544
    #9 0x557ccb47de1d in do_command(THD*) /home/yura/ws/percona-server/sql/sql_parse.cc:1065
    #10 0x557ccb6ac294 in handle_connection /home/yura/ws/percona-server/sql/conn_handler/connection_handler_per_thread.cc:325
    #11 0x557ccbbfabb0 in pfs_spawn_thread /home/yura/ws/percona-server/storage/perfschema/pfs.cc:2198
    #12 0x7fe760ab544f in start_thread nptl/pthread_create.c:473
```

The reason is that `my_thread_cancel` is used to finish the daemon thread. This is not and orderly way of finishing the thread. ASAN does not register the stack variables are not used anymore which generates the error above.

This is a benign error as all the variables are on the stack.

*Solution*:

Finish the thread in orderly way by using a signalling variable.

---------------------------------------------------------------------------

PS-8204: Fix XML escape rules for audit plugin

https://jira.percona.com/browse/PS-8204

There was a wrong length specified for some XML
escape rules. As a result of this terminating null symbol from
replacement rule was copied into resulting string. This lead to
quer text truncation in audit log file.
In addition added empty replacement rules for '\b' and 'f' symbols
which just remove them from resulting string. These symboles are
not supported in XML 1.0.

---------------------------------------------------------------------------

PS-8854: Add main.percona_udf MTR test

Add a test to check FNV1A_64, FNV_64, and MURMUR_HASH user-defined functions.
inikep added a commit that referenced this pull request Oct 26, 2023
PS-5741: Incorrect use of memset_s in keyring_vault.

Fixed the usage of memset_s. The arguments should be:
void memset_s(void *dest, size_t dest_max, int c, size_t n)
where the 2nd argument is size of buffer and the 3rd is
argument is character to fill.

---------------------------------------------------------------------------

PS-7769 - Fix use-after-return error in audit_log_exclude_accounts_validate

---

*Problem:*

`st_mysql_value::val_str` might return a pointer to `buf` which after
the function called is deleted. Therefore the value in `save`, after
reuturnin from the function, is invalid.

In this particular case, the error is not manifesting as val_str`
returns memory allocated with `thd_strmake` and it does not use `buf`.

*Solution:*

Allocate memory with `thd_strmake` so the memory in `save` is not local.

---------------------------------------------------------------------------

Fix test main.bug12969156 when WITH_ASAN=ON

*Problem:*

ASAN complains about stack-buffer-overflow on function `mysql_heartbeat`:

```
==90890==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fe746d06d14 at pc 0x7fe760f5b017 bp 0x7fe746d06cd0 sp 0x7fe746d06478
WRITE of size 24 at 0x7fe746d06d14 thread T16777215

Address 0x7fe746d06d14 is located in stack of thread T26 at offset 340 in frame
    #0 0x7fe746d0a55c in mysql_heartbeat(void*) /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:62

  This frame has 4 object(s):
    [48, 56) 'result' (line 66)
    [80, 112) '_db_stack_frame_' (line 63)
    [144, 200) 'tm_tmp' (line 67)
    [240, 340) 'buffer' (line 65) <== Memory access at offset 340 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
Thread T26 created by T25 here:
    #0 0x7fe760f5f6d5 in __interceptor_pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cpp:216
    #1 0x557ccbbcb857 in my_thread_create /home/yura/ws/percona-server/mysys/my_thread.c:104
    #2 0x7fe746d0b21a in daemon_example_plugin_init /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:148
    #3 0x557ccb4c69c7 in plugin_initialize /home/yura/ws/percona-server/sql/sql_plugin.cc:1279
    #4 0x557ccb4d19cd in mysql_install_plugin /home/yura/ws/percona-server/sql/sql_plugin.cc:2279
    #5 0x557ccb4d218f in Sql_cmd_install_plugin::execute(THD*) /home/yura/ws/percona-server/sql/sql_plugin.cc:4664
    #6 0x557ccb47695e in mysql_execute_command(THD*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5160
    #7 0x557ccb47977c in mysql_parse(THD*, Parser_state*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5952
    #8 0x557ccb47b6c2 in dispatch_command(THD*, COM_DATA const*, enum_server_command) /home/yura/ws/percona-server/sql/sql_parse.cc:1544
    #9 0x557ccb47de1d in do_command(THD*) /home/yura/ws/percona-server/sql/sql_parse.cc:1065
    #10 0x557ccb6ac294 in handle_connection /home/yura/ws/percona-server/sql/conn_handler/connection_handler_per_thread.cc:325
    #11 0x557ccbbfabb0 in pfs_spawn_thread /home/yura/ws/percona-server/storage/perfschema/pfs.cc:2198
    #12 0x7fe760ab544f in start_thread nptl/pthread_create.c:473
```

The reason is that `my_thread_cancel` is used to finish the daemon thread. This is not and orderly way of finishing the thread. ASAN does not register the stack variables are not used anymore which generates the error above.

This is a benign error as all the variables are on the stack.

*Solution*:

Finish the thread in orderly way by using a signalling variable.

---------------------------------------------------------------------------

PS-8204: Fix XML escape rules for audit plugin

https://jira.percona.com/browse/PS-8204

There was a wrong length specified for some XML
escape rules. As a result of this terminating null symbol from
replacement rule was copied into resulting string. This lead to
quer text truncation in audit log file.
In addition added empty replacement rules for '\b' and 'f' symbols
which just remove them from resulting string. These symboles are
not supported in XML 1.0.

---------------------------------------------------------------------------

PS-8854: Add main.percona_udf MTR test

Add a test to check FNV1A_64, FNV_64, and MURMUR_HASH user-defined functions.
inikep added a commit that referenced this pull request Jan 17, 2024
PS-5741: Incorrect use of memset_s in keyring_vault.

Fixed the usage of memset_s. The arguments should be:
void memset_s(void *dest, size_t dest_max, int c, size_t n)
where the 2nd argument is size of buffer and the 3rd is
argument is character to fill.

---------------------------------------------------------------------------

PS-7769 - Fix use-after-return error in audit_log_exclude_accounts_validate

---

*Problem:*

`st_mysql_value::val_str` might return a pointer to `buf` which after
the function called is deleted. Therefore the value in `save`, after
reuturnin from the function, is invalid.

In this particular case, the error is not manifesting as val_str`
returns memory allocated with `thd_strmake` and it does not use `buf`.

*Solution:*

Allocate memory with `thd_strmake` so the memory in `save` is not local.

---------------------------------------------------------------------------

Fix test main.bug12969156 when WITH_ASAN=ON

*Problem:*

ASAN complains about stack-buffer-overflow on function `mysql_heartbeat`:

```
==90890==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fe746d06d14 at pc 0x7fe760f5b017 bp 0x7fe746d06cd0 sp 0x7fe746d06478
WRITE of size 24 at 0x7fe746d06d14 thread T16777215

Address 0x7fe746d06d14 is located in stack of thread T26 at offset 340 in frame
    #0 0x7fe746d0a55c in mysql_heartbeat(void*) /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:62

  This frame has 4 object(s):
    [48, 56) 'result' (line 66)
    [80, 112) '_db_stack_frame_' (line 63)
    [144, 200) 'tm_tmp' (line 67)
    [240, 340) 'buffer' (line 65) <== Memory access at offset 340 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
Thread T26 created by T25 here:
    #0 0x7fe760f5f6d5 in __interceptor_pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cpp:216
    #1 0x557ccbbcb857 in my_thread_create /home/yura/ws/percona-server/mysys/my_thread.c:104
    #2 0x7fe746d0b21a in daemon_example_plugin_init /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:148
    #3 0x557ccb4c69c7 in plugin_initialize /home/yura/ws/percona-server/sql/sql_plugin.cc:1279
    #4 0x557ccb4d19cd in mysql_install_plugin /home/yura/ws/percona-server/sql/sql_plugin.cc:2279
    #5 0x557ccb4d218f in Sql_cmd_install_plugin::execute(THD*) /home/yura/ws/percona-server/sql/sql_plugin.cc:4664
    #6 0x557ccb47695e in mysql_execute_command(THD*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5160
    #7 0x557ccb47977c in mysql_parse(THD*, Parser_state*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5952
    #8 0x557ccb47b6c2 in dispatch_command(THD*, COM_DATA const*, enum_server_command) /home/yura/ws/percona-server/sql/sql_parse.cc:1544
    #9 0x557ccb47de1d in do_command(THD*) /home/yura/ws/percona-server/sql/sql_parse.cc:1065
    #10 0x557ccb6ac294 in handle_connection /home/yura/ws/percona-server/sql/conn_handler/connection_handler_per_thread.cc:325
    #11 0x557ccbbfabb0 in pfs_spawn_thread /home/yura/ws/percona-server/storage/perfschema/pfs.cc:2198
    #12 0x7fe760ab544f in start_thread nptl/pthread_create.c:473
```

The reason is that `my_thread_cancel` is used to finish the daemon thread. This is not and orderly way of finishing the thread. ASAN does not register the stack variables are not used anymore which generates the error above.

This is a benign error as all the variables are on the stack.

*Solution*:

Finish the thread in orderly way by using a signalling variable.

---------------------------------------------------------------------------

PS-8204: Fix XML escape rules for audit plugin

https://jira.percona.com/browse/PS-8204

There was a wrong length specified for some XML
escape rules. As a result of this terminating null symbol from
replacement rule was copied into resulting string. This lead to
quer text truncation in audit log file.
In addition added empty replacement rules for '\b' and 'f' symbols
which just remove them from resulting string. These symboles are
not supported in XML 1.0.

---------------------------------------------------------------------------

PS-8854: Add main.percona_udf MTR test

Add a test to check FNV1A_64, FNV_64, and MURMUR_HASH user-defined functions.
inikep added a commit that referenced this pull request Jan 17, 2024
PS-5741: Incorrect use of memset_s in keyring_vault.

Fixed the usage of memset_s. The arguments should be:
void memset_s(void *dest, size_t dest_max, int c, size_t n)
where the 2nd argument is size of buffer and the 3rd is
argument is character to fill.

---------------------------------------------------------------------------

PS-7769 - Fix use-after-return error in audit_log_exclude_accounts_validate

---

*Problem:*

`st_mysql_value::val_str` might return a pointer to `buf` which after
the function called is deleted. Therefore the value in `save`, after
reuturnin from the function, is invalid.

In this particular case, the error is not manifesting as val_str`
returns memory allocated with `thd_strmake` and it does not use `buf`.

*Solution:*

Allocate memory with `thd_strmake` so the memory in `save` is not local.

---------------------------------------------------------------------------

Fix test main.bug12969156 when WITH_ASAN=ON

*Problem:*

ASAN complains about stack-buffer-overflow on function `mysql_heartbeat`:

```
==90890==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fe746d06d14 at pc 0x7fe760f5b017 bp 0x7fe746d06cd0 sp 0x7fe746d06478
WRITE of size 24 at 0x7fe746d06d14 thread T16777215

Address 0x7fe746d06d14 is located in stack of thread T26 at offset 340 in frame
    #0 0x7fe746d0a55c in mysql_heartbeat(void*) /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:62

  This frame has 4 object(s):
    [48, 56) 'result' (line 66)
    [80, 112) '_db_stack_frame_' (line 63)
    [144, 200) 'tm_tmp' (line 67)
    [240, 340) 'buffer' (line 65) <== Memory access at offset 340 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
Thread T26 created by T25 here:
    #0 0x7fe760f5f6d5 in __interceptor_pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cpp:216
    #1 0x557ccbbcb857 in my_thread_create /home/yura/ws/percona-server/mysys/my_thread.c:104
    #2 0x7fe746d0b21a in daemon_example_plugin_init /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:148
    #3 0x557ccb4c69c7 in plugin_initialize /home/yura/ws/percona-server/sql/sql_plugin.cc:1279
    #4 0x557ccb4d19cd in mysql_install_plugin /home/yura/ws/percona-server/sql/sql_plugin.cc:2279
    #5 0x557ccb4d218f in Sql_cmd_install_plugin::execute(THD*) /home/yura/ws/percona-server/sql/sql_plugin.cc:4664
    #6 0x557ccb47695e in mysql_execute_command(THD*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5160
    #7 0x557ccb47977c in mysql_parse(THD*, Parser_state*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5952
    #8 0x557ccb47b6c2 in dispatch_command(THD*, COM_DATA const*, enum_server_command) /home/yura/ws/percona-server/sql/sql_parse.cc:1544
    #9 0x557ccb47de1d in do_command(THD*) /home/yura/ws/percona-server/sql/sql_parse.cc:1065
    #10 0x557ccb6ac294 in handle_connection /home/yura/ws/percona-server/sql/conn_handler/connection_handler_per_thread.cc:325
    #11 0x557ccbbfabb0 in pfs_spawn_thread /home/yura/ws/percona-server/storage/perfschema/pfs.cc:2198
    #12 0x7fe760ab544f in start_thread nptl/pthread_create.c:473
```

The reason is that `my_thread_cancel` is used to finish the daemon thread. This is not and orderly way of finishing the thread. ASAN does not register the stack variables are not used anymore which generates the error above.

This is a benign error as all the variables are on the stack.

*Solution*:

Finish the thread in orderly way by using a signalling variable.

---------------------------------------------------------------------------

PS-8204: Fix XML escape rules for audit plugin

https://jira.percona.com/browse/PS-8204

There was a wrong length specified for some XML
escape rules. As a result of this terminating null symbol from
replacement rule was copied into resulting string. This lead to
quer text truncation in audit log file.
In addition added empty replacement rules for '\b' and 'f' symbols
which just remove them from resulting string. These symboles are
not supported in XML 1.0.

---------------------------------------------------------------------------

PS-8854: Add main.percona_udf MTR test

Add a test to check FNV1A_64, FNV_64, and MURMUR_HASH user-defined functions.
inikep pushed a commit that referenced this pull request Feb 8, 2024
In the TCP Transporter, request a TLS key rotation after each
2^32 bytes are sent.

Note that there is no visibility into whether this has occured.

Change-Id: I70e1fff8b20305d565efc5a44e7ecb827da22dca
inikep pushed a commit that referenced this pull request Feb 8, 2024
Support the standard --ndb-tls-search-path in ndb_mgm.

Also support the --ndb-mgm-tls option, but in ndb_mgm, this enum
option can take a third value, "deferred", that is not supported
by other programs.

Add a --test-tls option, which makes a single, verbose attempt to
connect using TLS, then exits with status 0 on success or 1 on error.

Add a new interactive command, "START TLS". This can perform the
TLS negotiation that was deferred with --ndb-mgm-tls=deferred.

Change-Id: I711245b723183ffe97f87dd0d7d15d2349730c77
inikep added a commit that referenced this pull request Feb 8, 2024
PS-5741: Incorrect use of memset_s in keyring_vault.

Fixed the usage of memset_s. The arguments should be:
void memset_s(void *dest, size_t dest_max, int c, size_t n)
where the 2nd argument is size of buffer and the 3rd is
argument is character to fill.

---------------------------------------------------------------------------

PS-7769 - Fix use-after-return error in audit_log_exclude_accounts_validate

---

*Problem:*

`st_mysql_value::val_str` might return a pointer to `buf` which after
the function called is deleted. Therefore the value in `save`, after
reuturnin from the function, is invalid.

In this particular case, the error is not manifesting as val_str`
returns memory allocated with `thd_strmake` and it does not use `buf`.

*Solution:*

Allocate memory with `thd_strmake` so the memory in `save` is not local.

---------------------------------------------------------------------------

Fix test main.bug12969156 when WITH_ASAN=ON

*Problem:*

ASAN complains about stack-buffer-overflow on function `mysql_heartbeat`:

```
==90890==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fe746d06d14 at pc 0x7fe760f5b017 bp 0x7fe746d06cd0 sp 0x7fe746d06478
WRITE of size 24 at 0x7fe746d06d14 thread T16777215

Address 0x7fe746d06d14 is located in stack of thread T26 at offset 340 in frame
    #0 0x7fe746d0a55c in mysql_heartbeat(void*) /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:62

  This frame has 4 object(s):
    [48, 56) 'result' (line 66)
    [80, 112) '_db_stack_frame_' (line 63)
    [144, 200) 'tm_tmp' (line 67)
    [240, 340) 'buffer' (line 65) <== Memory access at offset 340 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
Thread T26 created by T25 here:
    #0 0x7fe760f5f6d5 in __interceptor_pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cpp:216
    #1 0x557ccbbcb857 in my_thread_create /home/yura/ws/percona-server/mysys/my_thread.c:104
    #2 0x7fe746d0b21a in daemon_example_plugin_init /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:148
    #3 0x557ccb4c69c7 in plugin_initialize /home/yura/ws/percona-server/sql/sql_plugin.cc:1279
    #4 0x557ccb4d19cd in mysql_install_plugin /home/yura/ws/percona-server/sql/sql_plugin.cc:2279
    #5 0x557ccb4d218f in Sql_cmd_install_plugin::execute(THD*) /home/yura/ws/percona-server/sql/sql_plugin.cc:4664
    #6 0x557ccb47695e in mysql_execute_command(THD*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5160
    #7 0x557ccb47977c in mysql_parse(THD*, Parser_state*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5952
    #8 0x557ccb47b6c2 in dispatch_command(THD*, COM_DATA const*, enum_server_command) /home/yura/ws/percona-server/sql/sql_parse.cc:1544
    #9 0x557ccb47de1d in do_command(THD*) /home/yura/ws/percona-server/sql/sql_parse.cc:1065
    #10 0x557ccb6ac294 in handle_connection /home/yura/ws/percona-server/sql/conn_handler/connection_handler_per_thread.cc:325
    #11 0x557ccbbfabb0 in pfs_spawn_thread /home/yura/ws/percona-server/storage/perfschema/pfs.cc:2198
    #12 0x7fe760ab544f in start_thread nptl/pthread_create.c:473
```

The reason is that `my_thread_cancel` is used to finish the daemon thread. This is not and orderly way of finishing the thread. ASAN does not register the stack variables are not used anymore which generates the error above.

This is a benign error as all the variables are on the stack.

*Solution*:

Finish the thread in orderly way by using a signalling variable.

---------------------------------------------------------------------------

PS-8204: Fix XML escape rules for audit plugin

https://jira.percona.com/browse/PS-8204

There was a wrong length specified for some XML
escape rules. As a result of this terminating null symbol from
replacement rule was copied into resulting string. This lead to
quer text truncation in audit log file.
In addition added empty replacement rules for '\b' and 'f' symbols
which just remove them from resulting string. These symboles are
not supported in XML 1.0.

---------------------------------------------------------------------------

PS-8854: Add main.percona_udf MTR test

Add a test to check FNV1A_64, FNV_64, and MURMUR_HASH user-defined functions.
inikep pushed a commit that referenced this pull request Mar 15, 2024
…ocal DDL

         executed

https://perconadev.atlassian.net/browse/PS-9018

Problem
-------
In high concurrency scenarios, MySQL replica can enter into a deadlock due to a
race condition between the replica applier thread and the client thread
performing a binlog group commit.

Analysis
--------
It needs at least 3 threads for this deadlock to happen

1. One client thread
2. Two replica applier threads

How this deadlock happens?
--------------------------
0. Binlog is enabled on replica, but log_replica_updates is disabled.

1. Initially, both "Commit Order" and "Binlog Flush" queues are empty.

2. Replica applier thread 1 enters the group commit pipeline to register in the
   "Commit Order" queue since `log-replica-updates` is disabled on the replica
   node.

3. Since both "Commit Order" and "Binlog Flush" queues are empty, the applier
   thread 1

   3.1. Becomes leader (In Commit_stage_manager::enroll_for()).

   3.2. Registers in the commit order queue.

   3.3. Acquires the lock MYSQL_BIN_LOG::LOCK_log.

   3.4. Commit Order queue is emptied, but the lock MYSQL_BIN_LOG::LOCK_log is
        not yet released.

   NOTE: SE commit for applier thread is already done by the time it reaches
         here.

4. Replica applier thread 2 enters the group commit pipeline to register in the
   "Commit Order" queue since `log-replica-updates` is disabled on the replica
   node.

5. Since the "Commit Order" queue is empty (emptied by applier thread 1 in 3.4), the
   applier thread 2

   5.1. Becomes leader (In Commit_stage_manager::enroll_for())

   5.2. Registers in the commit order queue.

   5.3. Tries to acquire the lock MYSQL_BIN_LOG::LOCK_log. Since it is held by applier
        thread 1 it will wait until the lock is released.

6. Client thread enters the group commit pipeline to register in the
   "Binlog Flush" queue.

7. Since "Commit Order" queue is not empty (there is applier thread 2 in the
   queue), it enters the conditional wait `m_stage_cond_leader` with an
   intention to become the leader for both the "Binlog Flush" and
   "Commit Order" queues.

8. Applier thread 1 releases the lock MYSQL_BIN_LOG::LOCK_log and proceeds to update
   the GTID by calling gtid_state->update_commit_group() from
   Commit_order_manager::flush_engine_and_signal_threads().

9. Applier thread 2 acquires the lock MYSQL_BIN_LOG::LOCK_log.

   9.1. It checks if there is any thread waiting in the "Binlog Flush" queue
        to become the leader. Here it finds the client thread waiting to be
        the leader.

   9.2. It releases the lock MYSQL_BIN_LOG::LOCK_log and signals on the
        cond_var `m_stage_cond_leader` and enters a conditional wait until the
        thread's `tx_commit_pending` is set to false by the client thread
       (will be done in the
       Commit_stage_manager::process_final_stage_for_ordered_commit_group()
       called by client thread from fetch_and_process_flush_stage_queue()).

10. The client thread wakes up from the cond_var `m_stage_cond_leader`.  The
    thread has now become a leader and it is its responsibility to update GTID
    of applier thread 2.

    10.1. It acquires the lock MYSQL_BIN_LOG::LOCK_log.

    10.2. Returns from `enroll_for()` and proceeds to process the
          "Commit Order" and "Binlog Flush" queues.

    10.3. Fetches the "Commit Order" and "Binlog Flush" queues.

    10.4. Performs the storage engine flush by calling ha_flush_logs() from
          fetch_and_process_flush_stage_queue().

    10.5. Proceeds to update the GTID of threads in "Commit Order" queue by
          calling gtid_state->update_commit_group() from
          Commit_stage_manager::process_final_stage_for_ordered_commit_group().

11. At this point, we will have

    - Client thread performing GTID update on behalf if applier thread 2 (from step 10.5), and
    - Applier thread 1 performing GTID update for itself (from step 8).

    Due to the lack of proper synchronization between the above two threads,
    there exists a time window where both threads can call
    gtid_state->update_commit_group() concurrently.

    In subsequent steps, both threads simultaneously try to modify the contents
    of the array `commit_group_sidnos` which is used to track the lock status of
    sidnos. This concurrent access to `update_commit_group()` can cause a
    lock-leak resulting in one thread acquiring the sidno lock and not
    releasing at all.

-----------------------------------------------------------------------------------------------------------
Client thread                                           Applier Thread 1
-----------------------------------------------------------------------------------------------------------
update_commit_group() => global_sid_lock->rdlock();     update_commit_group() => global_sid_lock->rdlock();

calls update_gtids_impl_lock_sidnos()                   calls update_gtids_impl_lock_sidnos()

set commit_group_sidno[2] = true                        set commit_group_sidno[2] = true

                                                        lock_sidno(2) -> successful

lock_sidno(2) -> waits

                                                        update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()`

                                                        if (commit_group_sidnos[2]) {
                                                          unlock_sidno(2);
                                                          commit_group_sidnos[2] = false;
                                                        }

                                                        Applier thread continues..

lock_sidno(2) -> successful

update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()`

if (commit_group_sidnos[2]) { <=== this check fails and lock is not released.
  unlock_sidno(2);
  commit_group_sidnos[2] = false;
}

Client thread continues without releasing the lock
-----------------------------------------------------------------------------------------------------------

12. As the above lock-leak can also happen the other way i.e, the applier
    thread fails to unlock, there can be different consequences hereafter.

13. If the client thread continues without releasing the lock, then at a later
    stage, it can enter into a deadlock with the applier thread performing a
    GTID update with stack trace.

    Client_thread
    -------------
    #1  __GI___lll_lock_wait
    #2  ___pthread_mutex_lock
    #3  native_mutex_lock                                       <= waits for commit lock while holding sidno lock
    #4  Commit_stage_manager::enroll_for
    #5  MYSQL_BIN_LOG::change_stage
    #6  MYSQL_BIN_LOG::ordered_commit
    #7  MYSQL_BIN_LOG::commit
    #8  ha_commit_trans
    #9  trans_commit_implicit
    #10 mysql_create_like_table
    #11 Sql_cmd_create_table::execute
    #12 mysql_execute_command
    percona#13 dispatch_sql_command

    Applier thread
    --------------
    #1  ___pthread_mutex_lock
    #2  native_mutex_lock
    #3  safe_mutex_lock
    #4  Gtid_state::update_gtids_impl_lock_sidnos               <= waits for sidno lock
    #5  Gtid_state::update_commit_group
    #6  Commit_order_manager::flush_engine_and_signal_threads   <= acquires commit lock here
    #7  Commit_order_manager::finish
    #8  Commit_order_manager::wait_and_finish
    #9  ha_commit_low
    #10 trx_coordinator::commit_in_engines
    #11 MYSQL_BIN_LOG::commit
    #12 ha_commit_trans
    percona#13 trans_commit
    percona#14 Xid_log_event::do_commit
    percona#15 Xid_apply_log_event::do_apply_event_worker
    percona#16 Slave_worker::slave_worker_exec_event
    percona#17 slave_worker_exec_job_group
    percona#18 handle_slave_worker

14. If the applier thread continues without releasing the lock, then at a later
    stage, it can perform recursive locking while setting the GTID for the next
    transaction (in set_gtid_next()).

    In debug builds the above case hits the assertion
    `safe_mutex_assert_not_owner()` meaning the lock is already acquired by the
    replica applier thread when it tries to re-acquire the lock.

Solution
--------
In the above problematic example, when seen from each thread
individually, we can conclude that there is no problem in the order of lock
acquisition, thus there is no need to change the lock order.

However, the root cause for this problem is that multiple threads can
concurrently access to the array `Gtid_state::commit_group_sidnos`.

In its initial implementation, it was expected that threads should
hold the `MYSQL_BIN_LOG::LOCK_commit` before modifying its contents. But it
was not considered when upstream implemented WL#7846 (MTS:
slave-preserve-commit-order when log-slave-updates/binlog is disabled).

With this patch, we now ensure that `MYSQL_BIN_LOG::LOCK_commit` is acquired
when the client thread (binlog flush leader) when it tries to perform GTID
update on behalf of threads waiting in "Commit Order" queue, thus providing a
guarantee that `Gtid_state::commit_group_sidnos` array is never accessed
without the protection of `MYSQL_BIN_LOG::LOCK_commit`.
inikep pushed a commit that referenced this pull request Apr 9, 2024
Summary:
1. Disable more tests that were missed last time due to missing features
2. handler_basic.test : move FLUSH STATS *after* CREATE TABLE as CREATE TABLE also calls handler::write_row for innodb, not surprisingly.
3. Account for explain select format differences.
4. rocksdb.test: disable portion of test that use partition, sort the result for better stability
5. consistent_snapshot_mixed_engines: 8.0 now properly supports cross-engine snapshots (it calls the callback for each engine to create snasphot) so rebaseline for the behavioral differences - now innodb and rocksdb can share the same snapshot.
6. rpl_rocksdb_snapshot: rebaseline for binlog position change and don't use master_auto_position with master_log_pos.
7. rpl_crash_safe_wal_corrupt/rpl_gtid_crash_safe_wal_corrupt : Enable idempotent_recovery. Note they still need a missing patch to work completely.
8. rpl_gtid_rocksdb_sys_header: need to set rpl_allow_error=1 as the test is expecting binlog corruption
9. delete singledelete_idempotent_table as we don't have rbr_idempotent_table anymore.

Reviewed By: lloyd

Differential Revision: D17676018
inikep pushed a commit that referenced this pull request Jun 14, 2024
This reverts commit 0a661dc0d6dd8b498b0ed9f59207b9814d380fbe.

Some mtr tests have leaks originating from
    #7 0x7f579a2a28cc  (/lib/x86_64-linux-gnu/libprotobuf-lite.so.23+0x4d8cc) (BuildId: 1af51cdba58393c136bc541bbfc26ee7568c8c78)
    #8 0x7f579a2a44cc in google::protobuf::internal::InitSCCImpl(google::protobuf::internal::SCCInfoBase*) (/lib/x86_64-linux-gnu/libprotobuf-lite.so.23+0x4f4cc) (BuildId: 1af51cdba58393c136bc541bbfc26ee7568c8c78)
    #9 0x56352824ad0d in google::protobuf::internal::InitSCC(google::protobuf::internal::SCCInfoBase*) /usr/include/google/protobuf/generated_message_util.h:240:5

Change-Id: I52ea02cee282a9521a2b153b4d883314ed15eb5c
inikep pushed a commit that referenced this pull request Jun 14, 2024
Problem:
Starting ´ndb_mgmd --bind-address´ may potentially cause abnormal
program termination in MgmtSrvr destructor when ndb_mgmd restart itself.

  Core was generated by `ndb_mgmd --defa'.
  Program terminated with signal SIGABRT,   Aborted.
  #0  0x00007f8ce4066b8f in raise () from /lib64/libc.so.6
  #1  0x00007f8ce4039ea5 in abort () from /lib64/libc.so.6
  #2  0x00007f8ce40a7d97 in __libc_message () from /lib64/libc.so.6
  #3  0x00007f8ce40af08c in malloc_printerr () from /lib64/libc.so.6
  #4  0x00007f8ce40b132d in _int_free () from /lib64/libc.so.6
  #5  0x00000000006e9ffe in MgmtSrvr::~MgmtSrvr (this=0x28de4b0) at
mysql/8.0/storage/ndb/src/mgmsrv/MgmtSrvr.cpp:
890
  #6  0x00000000006ea09e in MgmtSrvr::~MgmtSrvr (this=0x2) at mysql/8.0/
storage/ndb/src/mgmsrv/MgmtSrvr.cpp:849
  #7  0x0000000000700d94 in mgmd_run () at
mysql/8.0/storage/ndb/src/mgmsrv/main.cpp:260
  #8  0x0000000000700775 in mgmd_main (argc=<optimized out>,
argv=0x28041d0) at mysql/8.0/storage/ndb/src/
mgmsrv/main.cpp:479

Analysis:
While starting up, the ndb_mgmd will allocate memory for bind_address in
order to potentially rewrite the parameter. When ndb_mgmd restart itself
the memory will be released and dangling pointer causing double free.

Fix:
Drop support for bind_address=[::], it is not documented anywhere, is
not useful and doesn't work.
This means the need to rewrite bind_address is gone and bind_address
argument need neither alloc or free.

Change-Id: I7797109b9d8391394587188d64d4b1f398887e94
inikep pushed a commit that referenced this pull request Jun 14, 2024
…ocal DDL

         executed

https://perconadev.atlassian.net/browse/PS-9018

Merge remote-tracking branch 'venki/PS-9018-8.0-gca' into HEAD

Problem
-------
In high concurrency scenarios, MySQL replica can enter into a deadlock due to a
race condition between the replica applier thread and the client thread
performing a binlog group commit.

Analysis
--------
It needs at least 3 threads for this deadlock to happen

1. One client thread
2. Two replica applier threads

How this deadlock happens?
--------------------------
0. Binlog is enabled on replica, but log_replica_updates is disabled.

1. Initially, both "Commit Order" and "Binlog Flush" queues are empty.

2. Replica applier thread 1 enters the group commit pipeline to register in the
   "Commit Order" queue since `log-replica-updates` is disabled on the replica
   node.

3. Since both "Commit Order" and "Binlog Flush" queues are empty, the applier
   thread 1

   3.1. Becomes leader (In Commit_stage_manager::enroll_for()).

   3.2. Registers in the commit order queue.

   3.3. Acquires the lock MYSQL_BIN_LOG::LOCK_log.

   3.4. Commit Order queue is emptied, but the lock MYSQL_BIN_LOG::LOCK_log is
        not yet released.

   NOTE: SE commit for applier thread is already done by the time it reaches
         here.

4. Replica applier thread 2 enters the group commit pipeline to register in the
   "Commit Order" queue since `log-replica-updates` is disabled on the replica
   node.

5. Since the "Commit Order" queue is empty (emptied by applier thread 1 in 3.4), the
   applier thread 2

   5.1. Becomes leader (In Commit_stage_manager::enroll_for())

   5.2. Registers in the commit order queue.

   5.3. Tries to acquire the lock MYSQL_BIN_LOG::LOCK_log. Since it is held by applier
        thread 1 it will wait until the lock is released.

6. Client thread enters the group commit pipeline to register in the
   "Binlog Flush" queue.

7. Since "Commit Order" queue is not empty (there is applier thread 2 in the
   queue), it enters the conditional wait `m_stage_cond_leader` with an
   intention to become the leader for both the "Binlog Flush" and
   "Commit Order" queues.

8. Applier thread 1 releases the lock MYSQL_BIN_LOG::LOCK_log and proceeds to update
   the GTID by calling gtid_state->update_commit_group() from
   Commit_order_manager::flush_engine_and_signal_threads().

9. Applier thread 2 acquires the lock MYSQL_BIN_LOG::LOCK_log.

   9.1. It checks if there is any thread waiting in the "Binlog Flush" queue
        to become the leader. Here it finds the client thread waiting to be
        the leader.

   9.2. It releases the lock MYSQL_BIN_LOG::LOCK_log and signals on the
        cond_var `m_stage_cond_leader` and enters a conditional wait until the
        thread's `tx_commit_pending` is set to false by the client thread
       (will be done in the
       Commit_stage_manager::process_final_stage_for_ordered_commit_group()
       called by client thread from fetch_and_process_flush_stage_queue()).

10. The client thread wakes up from the cond_var `m_stage_cond_leader`.  The
    thread has now become a leader and it is its responsibility to update GTID
    of applier thread 2.

    10.1. It acquires the lock MYSQL_BIN_LOG::LOCK_log.

    10.2. Returns from `enroll_for()` and proceeds to process the
          "Commit Order" and "Binlog Flush" queues.

    10.3. Fetches the "Commit Order" and "Binlog Flush" queues.

    10.4. Performs the storage engine flush by calling ha_flush_logs() from
          fetch_and_process_flush_stage_queue().

    10.5. Proceeds to update the GTID of threads in "Commit Order" queue by
          calling gtid_state->update_commit_group() from
          Commit_stage_manager::process_final_stage_for_ordered_commit_group().

11. At this point, we will have

    - Client thread performing GTID update on behalf if applier thread 2 (from step 10.5), and
    - Applier thread 1 performing GTID update for itself (from step 8).

    Due to the lack of proper synchronization between the above two threads,
    there exists a time window where both threads can call
    gtid_state->update_commit_group() concurrently.

    In subsequent steps, both threads simultaneously try to modify the contents
    of the array `commit_group_sidnos` which is used to track the lock status of
    sidnos. This concurrent access to `update_commit_group()` can cause a
    lock-leak resulting in one thread acquiring the sidno lock and not
    releasing at all.

-----------------------------------------------------------------------------------------------------------
Client thread                                           Applier Thread 1
-----------------------------------------------------------------------------------------------------------
update_commit_group() => global_sid_lock->rdlock();     update_commit_group() => global_sid_lock->rdlock();

calls update_gtids_impl_lock_sidnos()                   calls update_gtids_impl_lock_sidnos()

set commit_group_sidno[2] = true                        set commit_group_sidno[2] = true

                                                        lock_sidno(2) -> successful

lock_sidno(2) -> waits

                                                        update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()`

                                                        if (commit_group_sidnos[2]) {
                                                          unlock_sidno(2);
                                                          commit_group_sidnos[2] = false;
                                                        }

                                                        Applier thread continues..

lock_sidno(2) -> successful

update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()`

if (commit_group_sidnos[2]) { <=== this check fails and lock is not released.
  unlock_sidno(2);
  commit_group_sidnos[2] = false;
}

Client thread continues without releasing the lock
-----------------------------------------------------------------------------------------------------------

12. As the above lock-leak can also happen the other way i.e, the applier
    thread fails to unlock, there can be different consequences hereafter.

13. If the client thread continues without releasing the lock, then at a later
    stage, it can enter into a deadlock with the applier thread performing a
    GTID update with stack trace.

    Client_thread
    -------------
    #1  __GI___lll_lock_wait
    #2  ___pthread_mutex_lock
    #3  native_mutex_lock                                       <= waits for commit lock while holding sidno lock
    #4  Commit_stage_manager::enroll_for
    #5  MYSQL_BIN_LOG::change_stage
    #6  MYSQL_BIN_LOG::ordered_commit
    #7  MYSQL_BIN_LOG::commit
    #8  ha_commit_trans
    #9  trans_commit_implicit
    #10 mysql_create_like_table
    #11 Sql_cmd_create_table::execute
    #12 mysql_execute_command
    percona#13 dispatch_sql_command

    Applier thread
    --------------
    #1  ___pthread_mutex_lock
    #2  native_mutex_lock
    #3  safe_mutex_lock
    #4  Gtid_state::update_gtids_impl_lock_sidnos               <= waits for sidno lock
    #5  Gtid_state::update_commit_group
    #6  Commit_order_manager::flush_engine_and_signal_threads   <= acquires commit lock here
    #7  Commit_order_manager::finish
    #8  Commit_order_manager::wait_and_finish
    #9  ha_commit_low
    #10 trx_coordinator::commit_in_engines
    #11 MYSQL_BIN_LOG::commit
    #12 ha_commit_trans
    percona#13 trans_commit
    percona#14 Xid_log_event::do_commit
    percona#15 Xid_apply_log_event::do_apply_event_worker
    percona#16 Slave_worker::slave_worker_exec_event
    percona#17 slave_worker_exec_job_group
    percona#18 handle_slave_worker

14. If the applier thread continues without releasing the lock, then at a later
    stage, it can perform recursive locking while setting the GTID for the next
    transaction (in set_gtid_next()).

    In debug builds the above case hits the assertion
    `safe_mutex_assert_not_owner()` meaning the lock is already acquired by the
    replica applier thread when it tries to re-acquire the lock.

Solution
--------
In the above problematic example, when seen from each thread
individually, we can conclude that there is no problem in the order of lock
acquisition, thus there is no need to change the lock order.

However, the root cause for this problem is that multiple threads can
concurrently access to the array `Gtid_state::commit_group_sidnos`.

In its initial implementation, it was expected that threads should
hold the `MYSQL_BIN_LOG::LOCK_commit` before modifying its contents. But it
was not considered when upstream implemented WL#7846 (MTS:
slave-preserve-commit-order when log-slave-updates/binlog is disabled).

With this patch, we now ensure that `MYSQL_BIN_LOG::LOCK_commit` is acquired
when the client thread (binlog flush leader) when it tries to perform GTID
update on behalf of threads waiting in "Commit Order" queue, thus providing a
guarantee that `Gtid_state::commit_group_sidnos` array is never accessed
without the protection of `MYSQL_BIN_LOG::LOCK_commit`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
1 participant