This repository has been archived by the owner on Feb 10, 2023. It is now read-only.

[feature] add TokuDB row operations #1

Closed
BohuTANG opened this issue Oct 16, 2016 · 0 comments

Comments

@BohuTANG
Collaborator

BohuTANG commented Oct 16, 2016

[Summary]
Add insert/read/update/delete operations-per-second counters to measure performance.

[Usage]
mysql> SHOW ENGINE TOKUDB STATUS;

....
Type: TokuDB
Name: row: operation status
Status: Number of rows inserted 3905, updated 3350, deleted 1016, read 3461
657.09 inserts/s, 322.88 updates/s, 876.90 deletes/s, 1230.48 reads/s
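
As a rough illustration of how the counters are exercised (the table name t1 and the values below are illustrative, not taken from this issue):

```
CREATE TABLE t1 (a INT PRIMARY KEY, b INT) ENGINE=TokuDB;
INSERT INTO t1 VALUES (1, 10), (2, 20), (3, 30);  -- counted as rows inserted
UPDATE t1 SET b = b + 1 WHERE a = 2;              -- counted as rows updated
DELETE FROM t1 WHERE a = 3;                       -- counted as rows deleted
SELECT * FROM t1;                                 -- counted as rows read
SHOW ENGINE TOKUDB STATUS;                        -- look for the "row: operation status" section
```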

[Test Plan]
mtr with ASAN

[Reviewed By]
N/A

@BohuTANG BohuTANG changed the title from "FEATURE: add TokuDB row operations" to "[feature] add TokuDB row operations" on Oct 16, 2016
BohuTANG added a commit that referenced this issue Oct 16, 2016
[Summary]
Add insert/read/update/delete operations-per-second counters to measure performance.

[Usage]
mysql> SHOW ENGINE TOKUDB STATUS;

....
Type: TokuDB
Name: row: operation status
Status: Number of rows inserted 3905, updated 3350, deleted 1016, read 3461
657.09 inserts/s, 322.88 updates/s, 876.90 deletes/s, 1230.48 reads/s

[Test Plan]
./mtr --suite=tokudb suite/tokudb/t/row_operations.test [OK]
ASAN [OK]

[Reviewed By]
N/A
BohuTANG pushed a commit that referenced this issue Dec 31, 2016
ha_rocksdb::open() should check table->found_next_number_field,
not table->next_number_field (like InnoDB does)
BohuTANG pushed a commit that referenced this issue Dec 31, 2016
Two testcases are empty because RocksDB-SE doesn't support the feature:
- autoincrement.test
- autoinc_secondary.test
BohuTANG pushed a commit that referenced this issue Dec 31, 2016
Summary:
This adds two features to RocksDB.
1. Supporting START TRANSACTION WITH CONSISTENT SNAPSHOT
2. Getting current binlog position in addition to #1.
With these features, mysqldump can take consistent logical backup.

The second feature is done by START TRANSACTION WITH
CONSISTENT ROCKSDB SNAPSHOT. This is Facebook's extension, and
it works like existing START TRANSACTION WITH CONSISTENT INNODB SNAPSHOT.

This diff changes some existing code and behaviors.

- Original Facebook-MySQL always started an InnoDB transaction
regardless of the engine clause. For example, START TRANSACTION WITH
CONSISTENT MYISAM SNAPSHOT was accepted but it actually started an
InnoDB transaction, not MyISAM. This patch does not allow specifying
an engine that does not support consistent snapshots.

mysql> start transaction with consistent myisam snapshot;
ERROR 1105 (HY000): Consistent Snapshot is not supported for this engine

Currently only InnoDB and RocksDB support consistent snapshot.
To check engines, I modified sql/sql_yacc.yy, trans_begin()
and ha_start_consistent_snapshot() to pass handlerton.

- Changed constant name from
  MYSQL_START_TRANS_OPT_WITH_CONS_INNODB_SNAPSHOT to
MYSQL_START_TRANS_OPT_WITH_CONS_ENGINE_SNAPSHOT, because it's no longer
InnoDB dependent.

- When not setting engine, START TRANSACTION WITH CONSISTENT SNAPSHOT
takes both InnoDB and RocksDB snapshots, and both InnoDB and RocksDB
participate in transaction. When executing COMMIT, both InnoDB and
RocksDB modifications are committed. Remember that XA is not supported yet,
so mixing engines is not recommended anyway.

- When setting engine, START TRANSACTION WITH CONSISTENT.. takes
snapshot for the specified engine only. But it starts both
InnoDB and RocksDB transactions.

Test Plan: mtr --suite=rocksdb,rocksdb_rpl --repeat=3

Reviewers: hermanlee4, jonahcohen, jtolmer, tian.xia, maykov, spetrunia

Reviewed By: spetrunia

Subscribers: steaphan

Differential Revision: https://reviews.facebook.net/D32355
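
A hedged usage sketch of the statements described above (the database/table names are illustrative, and the exact columns returned for the binlog position may differ by build):

```
-- Engine-specific form: snapshot for RocksDB only; per the summary above,
-- it also reports the current binlog position.
START TRANSACTION WITH CONSISTENT ROCKSDB SNAPSHOT;
SELECT COUNT(*) FROM test.t1;   -- reads see the snapshot taken above
COMMIT;

-- Engine-agnostic form: takes both InnoDB and RocksDB snapshots.
START TRANSACTION WITH CONSISTENT SNAPSHOT;
COMMIT;

-- Engines without consistent-snapshot support are rejected:
START TRANSACTION WITH CONSISTENT MYISAM SNAPSHOT;
-- ERROR 1105 (HY000): Consistent Snapshot is not supported for this engine
```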
BohuTANG pushed a commit that referenced this issue Dec 31, 2016
Summary:
Don't update secondary indexes whose columns are not included in
table->write_map at query start.

Test Plan: mtr

Reviewers: maykov, yoshinorim, hermanlee4, jonahcohen

Reviewed By: jonahcohen

Differential Revision: https://reviews.facebook.net/D33471
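
As an illustration of the intent (schema and names are assumptions, not from the commit), an UPDATE that only touches a non-indexed column should leave secondary indexes alone:

```
CREATE TABLE t (
  id INT PRIMARY KEY,
  k  INT,
  v  INT,
  KEY k_idx (k)
) ENGINE=RocksDB;

INSERT INTO t VALUES (1, 100, 0);

-- Only column v is in table->write_map here, so with this change the
-- secondary index k_idx does not need to be deleted and re-inserted.
UPDATE t SET v = v + 1 WHERE id = 1;
```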
BohuTANG pushed a commit that referenced this issue Dec 31, 2016
Summary:
Inside index_next_same() call, we should
1. first check whether the record matches the index
   lookup prefix,
2. then check pushed index condition.

If we try to check #2 without checking #1 first, we may walk
off the index lookup prefix and scan till the end of the index.

Test Plan: Run mtr

Reviewers: hermanlee4, maykov, jtolmer, yoshinorim

Reviewed By: yoshinorim

Differential Revision: https://reviews.facebook.net/D38769
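
A hypothetical query shape where this ordering matters (table and column names are assumptions):

```
CREATE TABLE t (
  a INT,
  b INT,
  c INT,
  PRIMARY KEY (a, b)
) ENGINE=RocksDB;

-- Lookup prefix: (a = 5). Pushed index condition: (b % 2 = 0).
-- index_next_same() must first verify that the current row still matches
-- a = 5; evaluating the pushed condition first could walk past the a = 5
-- range and scan to the end of the index.
SELECT * FROM t WHERE a = 5 AND b % 2 = 0;
```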
BohuTANG pushed a commit that referenced this issue Dec 31, 2016
Summary:
MyRocks had two bugs when calculating index scan cost.
1. block_size was not considered. This made the covering index scan cost
(both full index scan and range scan) much higher.
2. ha_rocksdb::records_in_range() may have estimated more rows
than the estimated number of rows in the table. This was wrong,
and the MySQL optimizer decided to use a full index scan even though
a range scan was more efficient.

This diff fixes #1 by setting stats.block_size at ha_rocksdb::open(),
and fixes #2 by reducing the number of estimated rows if it was
larger than stats.records.

Test Plan:
mtr, updating some affected test cases, and new test case
rocksdb_range2

Reviewers: hermanlee4, jkedgar, spetrunia

Reviewed By: spetrunia

Subscribers: MarkCallaghan, webscalesql-eng

Differential Revision: https://reviews.facebook.net/D55869
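
A hedged example of the kind of plan choice affected (schema is illustrative):

```
CREATE TABLE t (
  pk INT PRIMARY KEY,
  a  INT,
  KEY a_idx (a)
) ENGINE=RocksDB;

-- Before the fix, records_in_range() could report more rows than
-- stats.records, so the optimizer might pick a full index scan here;
-- with the fix, the cheaper range scan on a_idx can be chosen.
EXPLAIN SELECT pk, a FROM t WHERE a BETWEEN 10 AND 20;
```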
BohuTANG added a commit that referenced this issue Dec 24, 2017
Summary:
In the XA transaction 'XA END' phase (thd_sql_command is SQLCOM_END), the TokuDB slave creates transactions for both trx->sp_level and trx->stmt; this causes toku_xids_can_create_child to abort, since trx->sp_level->xids is 0x00.

How to reproduce:
With tokudb_debug=32, run the following queries on the master:
create table t1(a int)engine=tokudb;

xa start 'x1';
insert into t1 values(1);
xa end 'x1';
xa prepare 'x1';
xa commit 'x1';

xa start 'x2';
insert into t1 values(2);
xa end 'x2';
xa prepare 'x2';
xa commit 'x2';

Slave debug info:
xa start 'x1';
insert into t1 values(1);
xa end 'x1';
xa prepare 'x1';
xa commit 'x1';
2123 0x7ff2d44c5830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:6533 ha_tokudb::external_lock trx (nil) (nil) (nil) (nil) 0 0
2123 /u01/tokudb/storage/tokudb/tokudb_txn.h:127 txn_begin begin txn (nil) 0x7ff2d44a3000 67108864 r=0
2123 0x7ff2d44c5830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:6426 ha_tokudb::create_txn created master 0x7ff2d44a3000
2123 /u01/tokudb/storage/tokudb/tokudb_txn.h:127 txn_begin begin txn 0x7ff2d44a3000 0x7ff2d44a3100 1 r=0
2123 0x7ff2d44c5830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:6468 ha_tokudb::create_txn created stmt 0x7ff2d44a3000 sp_level 0x7ff2d44a3100
2123 0x7ff2d44c5830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:4120 ha_tokudb::write_row txn 0x7ff2d44a3100
2123 /u01/tokudb/storage/tokudb/hatoku_hton.cc:942 tokudb_commit commit trx 0 txn 0x7ff2d44a3100 syncflag 512

xa start 'x2';
insert into t1 values(2);
xa end 'x2';
xa prepare 'x2';
xa commit 'x2';
2123 0x7ff2d44c5830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:6533 ha_tokudb::external_lock trx 0x7ff2d44a3000 (nil) 0x7ff2d44a3000 (nil) 0 0
2123 /u01/tokudb/storage/tokudb/tokudb_txn.h:127 txn_begin begin txn 0x7ff2d44a3000 0x7ff2d44a3000 1 r=0
2123 0x7ff2d44c5830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:6468 ha_tokudb::create_txn created stmt 0x7ff2d44a3000 sp_level 0x7ff2d44a3000
2123 0x7ff2d44c5830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:4120 ha_tokudb::write_row txn 0x7ff2d44a3000
2017-12-24T08:36:45.347405Z 11 [ERROR] TokuDB: toku_db_put: Transaction cannot do work when child exists

2017-12-24T08:36:45.347444Z 11 [Warning] Slave: Got error 22 from storage engine Error_code: 1030
2017-12-24T08:36:45.347448Z 11 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with 'SLAVE START'. We stopped at log 'mysql-bin.000001' position 1007
2123 /u01/tokudb/storage/tokudb/hatoku_hton.cc:972 tokudb_rollback rollback 0 txn 0x7ff2d44a3000
Segmentation fault (core dumped)

This crash is caused by parent->xids being 0x00.
The core stack info:
(gdb) bt
#0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=11) at ../sysdeps/unix/sysv/linux/pthread_kill.c:62
#1  0x0000000000f6b647 in my_write_core (sig=sig@entry=11) at /u01/tokudb/mysys/stacktrace.c:249
#2  0x000000000086b945 in handle_fatal_signal (sig=11) at /u01/tokudb/sql/signal_handler.cc:223
#3  <signal handler called>
#4  toku_xids_can_create_child (xids=0x0) at /u01/tokudb/storage/tokudb/PerconaFT/ft/txn/xids.cc:93
#5  0x000000000080531f in toku_txn_begin_with_xid (parent=0x7f0bf501c280, txnp=0x7f0bf50a3490, logger=0x7f0c415e66c0, xid=..., snapshot_type=TXN_SNAPSHOT_CHILD, container_db_txn=0x7f0bf50a3400, for_recovery=false, read_only=false) at /u01/tokudb/storage/tokudb/PerconaFT/ft/txn/txn.cc:137
#6  0x00000000007aa6a2 in toku_txn_begin (env=0x7f0c819fde00, stxn=0x7f0bf50a3300, txn=0x7f0bf500dca8, flags=<optimized out>) at /u01/tokudb/storage/tokudb/PerconaFT/src/ydb_txn.cc:579
#7  0x0000000000f99323 in txn_begin (thd=0x7f0bf504bfc0, flags=1, txn=0x7f0bf500dca8, parent=0x7f0bf50a3300, env=<optimized out>) at /u01/tokudb/storage/tokudb/tokudb_txn.h:116
#8  ha_tokudb::create_txn (this=0x7f0bf50c8830, thd=0x7f0bf504bfc0, trx=0x7f0bf500dca0) at /u01/tokudb/storage/tokudb/ha_tokudb.cc:6458
#9  0x0000000000fa48f9 in ha_tokudb::external_lock (this=0x7f0bf50c8830, thd=0x7f0bf504bfc0, lock_type=1) at /u01/tokudb/storage/tokudb/ha_tokudb.cc:6544
#10 0x00000000008d46eb in handler::ha_external_lock (this=0x7f0bf50c8830, thd=thd@entry=0x7f0bf504bfc0, lock_type=lock_type@entry=1) at /u01/tokudb/sql/handler.cc:8352
#11 0x0000000000e4f3b4 in lock_external (count=1, tables=0x7f0bf5050688, thd=0x7f0bf504bfc0) at /u01/tokudb/sql/lock.cc:389
#12 mysql_lock_tables (thd=thd@entry=0x7f0bf504bfc0, tables=<optimized out>, count=<optimized out>, flags=0) at /u01/tokudb/sql/lock.cc:325
#13 0x0000000000cd0b6d in lock_tables (thd=thd@entry=0x7f0bf504bfc0, tables=0x7f0bf4d11020, count=<optimized out>, flags=flags@entry=0) at /u01/tokudb/sql/sql_base.cc:6705
#14 0x0000000000cd61f2 in open_and_lock_tables (thd=0x7f0bf504bfc0, tables=0x7f0bf4d11020, flags=flags@entry=0, prelocking_strategy=prelocking_strategy@entry=0x7f0c89629680) at /u01/tokudb/sql/sql_base.cc:6523
#15 0x0000000000ee09eb in open_and_lock_tables (flags=0, tables=<optimized out>, thd=<optimized out>) at /u01/tokudb/sql/sql_base.h:484
#16 Rows_log_event::do_apply_event (this=0x7f0bf50ab4a0, rli=0x7f0c87762800) at /u01/tokudb/sql/log_event.cc:10911
#17 0x0000000000ed71c0 in Log_event::apply_event (this=this@entry=0x7f0bf50ab4a0, rli=rli@entry=0x7f0c87762800) at /u01/tokudb/sql/log_event.cc:3329
#18 0x0000000000f1d233 in apply_event_and_update_pos (ptr_ev=ptr_ev@entry=0x7f0c89629940, thd=thd@entry=0x7f0bf504bfc0, rli=rli@entry=0x7f0c87762800) at /u01/tokudb/sql/rpl_slave.cc:4761
#19 0x0000000000f280a8 in exec_relay_log_event (rli=0x7f0c87762800, thd=0x7f0bf504bfc0) at /u01/tokudb/sql/rpl_slave.cc:5276
#20 handle_slave_sql (arg=<optimized out>) at /u01/tokudb/sql/rpl_slave.cc:7491
#21 0x00000000013c6184 in pfs_spawn_thread (arg=0x7f0bf5bea820) at /u01/tokudb/storage/perfschema/pfs.cc:2185
#22 0x00007f0c885126ba in start_thread (arg=0x7f0c8962a700) at pthread_create.c:333
#23 0x00007f0c87d293dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
(gdb) f 10
#10 0x00000000008d46eb in handler::ha_external_lock (this=0x7f0bf50c8830, thd=thd@entry=0x7f0bf504bfc0, lock_type=lock_type@entry=1) at /u01/tokudb/sql/handler.cc:8352
8352    /u01/tokudb/sql/handler.cc: No such file or directory.
(gdb) p thd->lex->sql_command
 = SQLCOM_END

With the fixed patch, the debug info is:
xa start 'x1';
insert into t1 values(1);
xa end 'x1';
xa prepare 'x1';
xa commit 'x1';
24111 0x7f4aba6c4830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:6534 ha_tokudb::external_lock trx (nil) (nil) (nil) (nil) 0 0
24111 /u01/tokudb/storage/tokudb/tokudb_txn.h:127 txn_begin begin txn (nil) 0x7f4aba689000 67108864 r=0
24111 0x7f4aba6c4830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:6469 ha_tokudb::create_txn created stmt (nil) sp_level 0x7f4aba689000
24111 0x7f4aba6c4830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:4120 ha_tokudb::write_row txn 0x7f4aba689000
24111 /u01/tokudb/storage/tokudb/hatoku_hton.cc:942 tokudb_commit commit trx 0 txn 0x7f4aba689000 syncflag 512

xa start 'x2';
insert into t1 values(2);
xa end 'x2';
xa prepare 'x2';
xa commit 'x2';
24111 0x7f4aba6c4830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:6534 ha_tokudb::external_lock trx (nil) (nil) (nil) (nil) 0 0
24111 /u01/tokudb/storage/tokudb/tokudb_txn.h:127 txn_begin begin txn (nil) 0x7f4aba689000 67108864 r=0
24111 0x7f4aba6c4830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:6469 ha_tokudb::create_txn created stmt (nil) sp_level 0x7f4aba689000
24111 0x7f4aba6c4830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:4120 ha_tokudb::write_row txn 0x7f4aba689000
24111 /u01/tokudb/storage/tokudb/hatoku_hton.cc:942 tokudb_commit commit trx 0 txn 0x7f4aba689000 syncflag 512

Test:
mtr --suite=tokudb xa

Reviewed by: Rik
prohaska7 added a commit that referenced this issue Jan 5, 2018
…ecuting global constructors (before main gets called). the assert catches global mutex initialization which seems to be problematic. since tokudb has a global mutex (open_tables_mutex) and tokudb is statically linked into mysqld (plugin type mandatory), the assert fires.

we could:
1. remove the assert, or
2. rewrite tokudb to remove the open_tables_mutex, or
3. compile tokudb into a shared library.

this commit compiles tokudb into a shared library for debug builds.

GNU gdb (Ubuntu 8.0.1-0ubuntu1) 8.0.1
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from bin/mysqld...done.
[New LWP 24606]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `bin/mysqld --initialize'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007f4498d8df5d in __GI_abort () at abort.c:90
#2  0x00007f4498d83f17 in __assert_fail_base (fmt=<optimized out>, assertion=assertion@entry=0x5581f567ed23 "safe_mutex_inited",
    file=file@entry=0x5581f567ecf0 "/home/rfp/projects/xelabs-server/mysys/thr_mutex.c", line=line@entry=39,
    function=function@entry=0x5581f567f040 <__PRETTY_FUNCTION__.6977> "safe_mutex_init") at assert.c:92
#3  0x00007f4498d83fc2 in __GI___assert_fail (assertion=0x5581f567ed23 "safe_mutex_inited", file=0x5581f567ecf0 "/home/rfp/projects/xelabs-server/mysys/thr_mutex.c", line=39,
    function=0x5581f567f040 <__PRETTY_FUNCTION__.6977> "safe_mutex_init") at assert.c:101
#4  0x00005581f4ab1a40 in safe_mutex_init (mp=0x5581f632b8e0 <TOKUDB_SHARE::_open_tables_mutex>, attr=0x5581f6336550 <my_fast_mutexattr>,
    file=0x5581f57ac188 "/home/rfp/projects/xelabs-server/storage/tokudb/tokudb_thread.h", line=207) at /home/rfp/projects/xelabs-server/mysys/thr_mutex.c:39
#5  0x00005581f500aa9f in my_mutex_init (mp=0x5581f632b8e0 <TOKUDB_SHARE::_open_tables_mutex>, attr=0x5581f6336550 <my_fast_mutexattr>,
    file=0x5581f57ac188 "/home/rfp/projects/xelabs-server/storage/tokudb/tokudb_thread.h", line=207) at /home/rfp/projects/xelabs-server/include/thr_mutex.h:167
#6  0x00005581f500ac25 in inline_mysql_mutex_init (key=4294967295, that=0x5581f632b8e0 <TOKUDB_SHARE::_open_tables_mutex>, attr=0x5581f6336550 <my_fast_mutexattr>,
    src_file=0x5581f57ac188 "/home/rfp/projects/xelabs-server/storage/tokudb/tokudb_thread.h", src_line=207)
    at /home/rfp/projects/xelabs-server/include/mysql/psi/mysql_thread.h:668
#7  0x00005581f5043ee5 in tokudb::thread::mutex_t::mutex_t (this=0x5581f632b8e0 <TOKUDB_SHARE::_open_tables_mutex>, key=4294967295)
    at /home/rfp/projects/xelabs-server/storage/tokudb/tokudb_thread.h:207
#8  0x00005581f5043e68 in tokudb::thread::mutex_t::mutex_t (this=0x5581f632b8e0 <TOKUDB_SHARE::_open_tables_mutex>)
    at /home/rfp/projects/xelabs-server/storage/tokudb/tokudb_thread.h:45
#9  0x00005581f50433a4 in __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535) at /home/rfp/projects/xelabs-server/storage/tokudb/ha_tokudb.cc:38
#10 0x00005581f5043936 in _GLOBAL__sub_I__ZN6tokudb8metadata4readEP9__toku_dbP13__toku_db_txnyPvmPm () at /home/rfp/projects/xelabs-server/storage/tokudb/ha_tokudb.cc:9061
#11 0x00005581f529aefd in __libc_csu_init ()
#12 0x00007f4498d76150 in __libc_start_main (main=0x5581f4025aea <main(int, char**)>, argc=2, argv=0x7ffe9dcd7118, init=0x5581f529aeb0 <__libc_csu_init>,
    fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffe9dcd7108) at ../csu/libc-start.c:264
#13 0x00005581f4025a0a in _start ()
(gdb) q
prohaska7 added a commit that referenced this issue Jan 8, 2018
BohuTANG added a commit that referenced this issue Jan 18, 2018
TokuDB crashes during shutdown due to a PFS key double-free.

(gdb) bt
#0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=11) at ../sysdeps/unix/sysv/linux/pthread_kill.c:62
#1  0x0000000000f6b3e7 in my_write_core (sig=sig@entry=11) at /u01/tokudb/mysys/stacktrace.c:249
#2  0x000000000086b6e5 in handle_fatal_signal (sig=11) at /u01/tokudb/sql/signal_handler.cc:223
#3  <signal handler called>
#4  destroy_mutex (pfs=0x7f8fb71c9900) at /u01/tokudb/storage/perfschema/pfs_instr.cc:327
#5  0x00000000013c574a in pfs_destroy_mutex_v1 (mutex=<optimized out>) at /u01/tokudb/storage/perfschema/pfs.cc:1833
#6  0x0000000000fc350a in inline_mysql_mutex_destroy (that=0x1f84ea0 <tokudb_map_mutex>) at /u01/tokudb/include/mysql/psi/mysql_thread.h:681
#7  tokudb::thread::mutex_t::~mutex_t (this=0x1f84ea0 <tokudb_map_mutex>, __in_chrg=<optimized out>) at /u01/tokudb/storage/tokudb/tokudb_thread.h:214
#8  0x00007f8fbb38cff8 in _run_exit_handlers (status=status@entry=0, listp=0x7f8fbb7175f8 <_exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:82
#9  0x00007f8fbb38d045 in __GI_exit (status=status@entry=0) at exit.c:104
#10 0x000000000085b7b5 in mysqld_exit (exit_code=exit_code@entry=0) at /u01/tokudb/sql/mysqld.cc:1205
#11 0x0000000000865fa6 in mysqld_main (argc=46, argv=0x7f8fbaeb3088) at /u01/tokudb/sql/mysqld.cc:5430
#12 0x00007f8fbb373830 in __libc_start_main (main=0x78d700 <main(int, char**)>, argc=10, argv=0x7ffd36ab1a28, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffd36ab1a18) at ../csu/libc-start.c:291
#13 0x00000000007a7b79 in _start ()

And the AddressSanitizer errors:
==27219==ERROR: AddressSanitizer: heap-use-after-free on address 0x7f009b12d118 at pc 0x00000265d80b bp 0x7ffd3bb7ffb0 sp 0x7ffd3bb7ffa0
READ of size 8 at 0x7f009b12d118 thread T0
#0 0x265d80a in destroy_mutex(PFS_mutex*) /u01/tokudb/storage/perfschema/pfs_instr.cc:323
#1 0x1cfaef3 in inline_mysql_mutex_destroy /u01/tokudb/include/mysql/psi/mysql_thread.h:681
#2 0x1cfaef3 in tokudb::thread::mutex_t::~mutex_t() /u01/tokudb/storage/tokudb/tokudb_thread.h:214
#3 0x7f009d512ff7 (/lib/x86_64-linux-gnu/libc.so.6+0x39ff7)
#4 0x7f009d513044 in exit (/lib/x86_64-linux-gnu/libc.so.6+0x3a044)
#5 0x9a8239 in mysqld_exit /u01/tokudb/sql/mysqld.cc:1205
#6 0x9b210f in unireg_abort /u01/tokudb/sql/mysqld.cc:1175
#7 0x9b629e in init_server_components /u01/tokudb/sql/mysqld.cc:4509
#8 0x9b8aca in mysqld_main(int, char**) /u01/tokudb/sql/mysqld.cc:5001
#9 0x7f009d4f982f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
#10 0x7e8308 in _start (/home/ubuntu/mysql_20161216/bin/mysqld+0x7e8308)
dbkernel pushed a commit to lulabs/tokudb that referenced this issue Nov 7, 2018
Group Replication implements conflict detection in multi-primary mode
to avoid write errors on parallel operations.
Conflict detection is also engaged in single-primary mode in the
particular case of a primary change when the new primary still has a
backlog to apply. Until the backlog is flushed, conflict detection
is enabled to prevent write errors between the backlog and incoming
transactions.

The conflict detection data, which we name certification info, is
also used to detect dependencies between accepted transactions,
dependencies which rule the transaction schedule on the
parallel applier.

To prevent the certification info from growing forever, all members
periodically exchange their GTID_EXECUTED sets, whose full
intersection gives the set of transactions that are
applied on all members. Future transactions cannot conflict with
this set since all members are operating on top of it, so we can
safely remove from the certification info all write-sets that
belong to those transactions.
More details at WL#6833: Group Replication: Read-set free
Certification Module (DBSM Snapshot Isolation).

However, a corner case was found in which the garbage collection was
purging more data than it should.
The scenario is:
 1) Group with 2 members;
 2) Member1 executes:
      CREATE TABLE t1(a INT, b INT, PRIMARY KEY(a));
      INSERT INTO t1 VALUE(1, 1);
    Both members have a GTID_EXECUTED= UUID:1-4
    Both members certification info has:
       Hash of item in Writeset     snapshot version (Gtid_set)
       #1                           UUID1:1-4
 3) member1 executes TA
      UPDATE t1 SET b=10 WHERE a=1;
    and blocks immediately before sending the transaction to the group.
    This transaction has snapshot_version: UUID:1-4
 4) member2 executes TB
      UPDATE t1 SET b=10 WHERE a=1;
    This transaction has snapshot_version: UUID:1-4
    It goes through the complete path and is committed.
    This transaction has GTID: UUID:1000002
    Both members have a GTID_EXECUTED= UUID:1-4:1000002
    Both members certification info has:
       Hash of item in Writeset     snapshot version (Gtid_set)
       #1                           UUID1:1-4:1000002
 5) member2 becomes extremely slow in processing transactions, we
    simulate that by holding the transaction queue to the GR
    pipeline.
    Transaction delivery is still working, but the transaction will
    be blocked before certification.
 6) member1 is able to send its TA transaction; let's recall that
    this transaction has snapshot_version: UUID:1-4.
    On conflict detection on member1, it will conflict with #1,
    since this snapshot_version does not contain the snapshot_version
    of #1, that is, TA was executed on an earlier version than TB.
    On member2 the transaction will be delivered and will be put on
    hold before conflict detection.
 7) meanwhile the certification info garbage collection kicks in.
    Both members have a GTID_EXECUTED= UUID:1-4:1000002
    Its intersection is UUID:1-4:1000002
    Both members certification info has:
       Hash of item in Writeset     snapshot version (Gtid_set)
       #1                           UUID1:1-4:1000002
    The condition to purge write-sets is:
       snapshot_version.is_subset(intersection)
    We have
       "UUID:1-4:1000002".is_subset("UUID:1-4:1000002")
    which is true, so we remove #1.
    Both members certification info has:
       Hash of item in Writeset     snapshot version (Gtid_set)
       <empty>
 8) member2 gets back to normal and we release transaction TA; let's
    recall that this transaction has snapshot_version: UUID:1-4.
    On conflict detection, since the certification info is empty,
    the transaction will be allowed to proceed, which is incorrect;
    it must roll back (as on member1) since it conflicts with TB.

The problem is in the certification garbage collection, more
precisely in the condition used to purge data: we cannot leave the
certification info empty, otherwise this situation can happen.
The condition must be changed to
       snapshot_version.is_subset_not_equals(intersection)
which will always leave a placeholder to detect delayed conflicting
transactions.

So a trace of the solution is (starting on step 7):
 7) meanwhile the certification info garbage collection kicks in.
    Both members have a GTID_EXECUTED= UUID:1-4:1000002
    Its intersection is UUID:1-4:1000002
    Both members certification info has:
       Hash of item in Writeset     snapshot version (Gtid_set)
       #1                           UUID1:1-4:1000002
    The condition to purge write-sets is:
       snapshot_version.is_subset_not_equals(intersection)
    We have
       "UUID:1-4:1000002".is_subset_not_equals("UUID:1-4:1000002")
    which is false, so we do not remove #1.
    Both members certification info has:
       Hash of item in Writeset     snapshot version (Gtid_set)
       #1                           UUID1:1-4:1000002
 8) member2 gets back to normal and we release transaction TA; let's
    recall that this transaction has snapshot_version: UUID:1-4.
    On conflict detection on member2, it will conflict with #1,
    since this snapshot_version does not contain the snapshot_version
    of #1, that is, TA was executed on an earlier version than TB.

This is the same scenario that we see in this bug, though here the
pipeline is being blocked by the distributed recovery procedure,
that is, while the joining member is applying the missing data
through the recovery channel, the incoming data is being queued.
Meanwhile the certification info garbage collection kicks in and
purges more data than it should; the result is that conflicts are
not detected.
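
The purge condition can be illustrated with MySQL's GTID_SUBSET() function (the UUID below is made up; the actual fix is in the C++ Gtid_set code, this is only an SQL analogue). GTID_SUBSET(s1, s2) is true even when the two sets are equal, which is the old, too-aggressive condition; "subset but not equal" additionally requires that s2 is not a subset of s1:

```
SET @snapshot = '3e11fa47-71ca-11e1-9e33-c80aa9429562:1-4:1000002';
SET @stable   = '3e11fa47-71ca-11e1-9e33-c80aa9429562:1-4:1000002';

SELECT GTID_SUBSET(@snapshot, @stable) AS old_condition,            -- 1: would purge the write-set
       GTID_SUBSET(@snapshot, @stable)
         AND NOT GTID_SUBSET(@stable, @snapshot) AS new_condition;  -- 0: keep the placeholder
```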
BohuTANG pushed a commit that referenced this issue Apr 3, 2019
Upstream commit ID : fb-mysql-5.6.35/8cb1dc836b68f1f13e8b2655b2b8cb2d57f400b3
PS-5217 : Merge fb-prod201803

Summary:
Original report: https://jira.mariadb.org/browse/MDEV-15816

To reproduce this bug, follow the steps below:

client 1:
USE test;
CREATE TABLE t1 (i INT) ENGINE=MyISAM;
HANDLER t1 OPEN h;
CREATE TABLE t2 (i INT) ENGINE=RocksDB;
LOCK TABLES t2 WRITE;

client 2:
FLUSH TABLES WITH READ LOCK;

client 1:
INSERT INTO t2 VALUES (1);

So client 1 acquired the lock and set m_lock_rows = RDB_LOCK_WRITE.
Then client 2 called store_lock(TL_IGNORE) and m_lock_rows was wrongly
set to RDB_LOCK_NONE, as shown below:

```
 #0  myrocks::ha_rocksdb::store_lock (this=0x7fffbc03c7c8, thd=0x7fffc0000ba0, to=0x7fffc0011220, lock_type=TL_IGNORE)
 #1  get_lock_data (thd=0x7fffc0000ba0, table_ptr=0x7fffe84b7d20, count=1, flags=2)
 #2  mysql_lock_abort_for_thread (thd=0x7fffc0000ba0, table=0x7fffbc03bbc0)
 #3  THD::notify_shared_lock (this=0x7fffc0000ba0, ctx_in_use=0x7fffbc000bd8, needs_thr_lock_abort=true)
 #4  MDL_lock::notify_conflicting_locks (this=0x555557a82380, ctx=0x7fffc0000cc8)
 #5  MDL_context::acquire_lock (this=0x7fffc0000cc8, mdl_request=0x7fffe84b8350, lock_wait_timeout=2)
 #6  Global_read_lock::lock_global_read_lock (this=0x7fffc0003fe0, thd=0x7fffc0000ba0)
```

Finally, client 1 "INSERT INTO..." hits the Assertion 'm_lock_rows == RDB_LOCK_WRITE'
failed in myrocks::ha_rocksdb::write_row()

Fix this bug by not setting m_lock_rows if lock_type == TL_IGNORE.

Closes facebook/mysql-5.6#838
Pull Request resolved: facebook/mysql-5.6#871

Differential Revision: D9417382

Pulled By: lth

fbshipit-source-id: c36c164e06c
BohuTANG pushed a commit that referenced this issue Jun 17, 2019
…E TO A SERVER

Problem
========================================================================
Running the GCS tests with ASAN occasionally reports a use-after-free of
the server reference that the acceptor_learner_task uses.

Here is an excerpt of ASAN's output:

==43936==ERROR: AddressSanitizer: heap-use-after-free on address 0x63100021c840 at pc 0x000000530ff8 bp 0x7fc0427e8530 sp 0x7fc0427e8520
WRITE of size 8 at 0x63100021c840 thread T3
    #0 0x530ff7 in server_detected /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:962
    #1 0x533814 in buffered_read_bytes /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:1249
    #2 0x5481af in buffered_read_msg /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:1399
    #3 0x51e171 in acceptor_learner_task /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:4690
    #4 0x562357 in task_loop /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task.c:1140
    #5 0x5003b2 in xcom_taskmain2 /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:1324
    #6 0x6a278a in Gcs_xcom_proxy_impl::xcom_init(unsigned short, node_address*) /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_proxy.cc:164
    #7 0x59b3c1 in xcom_taskmain_startup /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_control_interface.cc:107
    #8 0x7fc04a2e4dd4 in start_thread (/lib64/libpthread.so.0+0x7dd4)
    #9 0x7fc047ff2bfc in __clone (/lib64/libc.so.6+0xfebfc)

0x63100021c840 is located 64 bytes inside of 65688-byte region [0x63100021c800,0x63100022c898)
freed by thread T3 here:
    #0 0x7fc04a5d7508 in __interceptor_free (/lib64/libasan.so.4+0xde508)
    #1 0x52cf86 in freesrv /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:836
    #2 0x52ea78 in srv_unref /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:868
    #3 0x524c30 in reply_handler_task /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:4914
    #4 0x562357 in task_loop /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task.c:1140
    #5 0x5003b2 in xcom_taskmain2 /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:1324
    #6 0x6a278a in Gcs_xcom_proxy_impl::xcom_init(unsigned short, node_address*) /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_proxy.cc:164
    #7 0x59b3c1 in xcom_taskmain_startup /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_control_interface.cc:107
    #8 0x7fc04a2e4dd4 in start_thread (/lib64/libpthread.so.0+0x7dd4)

previously allocated by thread T3 here:
    #0 0x7fc04a5d7a88 in __interceptor_calloc (/lib64/libasan.so.4+0xdea88)
    #1 0x543604 in mksrv /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:721
    #2 0x543b4c in addsrv /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:755
    #3 0x54af61 in update_servers /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:1747
    #4 0x501082 in site_install_action /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:1572
    #5 0x55447c in import_config /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/site_def.c:486
    #6 0x506dfc in handle_x_snapshot /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:5257
    #7 0x50c444 in xcom_fsm /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:5325
    #8 0x516c36 in dispatch_op /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:4510
    #9 0x521997 in acceptor_learner_task /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:4772
    #10 0x562357 in task_loop /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task.c:1140
    #11 0x5003b2 in xcom_taskmain2 /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:1324
    #12 0x6a278a in Gcs_xcom_proxy_impl::xcom_init(unsigned short, node_address*) /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_proxy.cc:164
    #13 0x59b3c1 in xcom_taskmain_startup /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_control_interface.cc:107
    #14 0x7fc04a2e4dd4 in start_thread (/lib64/libpthread.so.0+0x7dd4)

Analysis
========================================================================
The server structure is reference counted by the associated sender_task
and reply_handler_task.
When they finish, they unreference the server, which leads to its memory
being freed.

However, the acceptor_learner_task keeps a "naked" reference to the
server structure.
Under the right ordering of operations, i.e. the sender_task and
reply_handler_task terminating after the acceptor_learner_task acquires,
but before it uses, the reference to the server structure, the
acceptor_learner_task ends up accessing the server structure after it
has been freed.

Solution
========================================================================
Let the acceptor_learner_task also reference count the server structure
so it is not freed while still in use.

Reviewed-by: André Negrão <andre.negrao@oracle.com>
Reviewed-by: Venkatesh Venugopal <venkatesh.venugopal@oracle.com>
RB: 21209
BohuTANG pushed a commit that referenced this issue May 7, 2020
… keyring

Issue: when the server is started without a keyring, the redo log encryption
checks were never performed, redo log encryption wasn't initialized, but the
setting was left on, misleading the user. Later operations which triggered
this check could possibly fail.

Issue #2: redo log encryption didn't work properly when it was turned on using
the command line: in some cases, the redo log remained unencrypted (until
another operation triggered the redo log encryption setup methods to be called
again)

Both issues were caused by the refactoring of the periodic, once every
second encryption checks in upstream, in PS-5189.

This commit:

* changes the code so redo log encryption routines are called at least twice
during every startup with the encryption settings on. This fixes issue #2.
* changes the code so redo log encryption routines are called at least once
during every startup, even without a keyring or on a read-only server: this fixes
issue #1.
* adds a test ensuring that in an invalid configuration (without a keyring)
the server remains running, but the redo log encryption setting correctly displays the
OFF status
* adds a test ensuring that when the redo log is correctly configured (using the
dynamic variable or a command line parameter) the log is encrypted, and no unencrypted
data is present in it.
* backports additional redo log related bugfixes/refactorings from 8.0
(92198f4). The original commit contains fixes
for both the redo and the undo log, the backport only includes the redo log related
parts refactored to match our changes.
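
A quick way to observe the corrected behavior (variable name as used for redo log encryption in Percona Server; shown as an assumption about how one would check, not as part of the commit):

```
-- Started without a keyring plugin, the setting should now report OFF
-- instead of silently staying ON:
SHOW GLOBAL VARIABLES LIKE 'innodb_redo_log_encrypt';

-- When turned on from the command line (e.g. mysqld --innodb-redo-log-encrypt=ON)
-- with a keyring loaded, the variable stays ON and newly written redo log
-- blocks are encrypted.
```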