
make -j8 failed #2

Closed
guimingyue opened this issue Oct 20, 2021 · 1 comment

@guimingyue
OS: Ubuntu 20.04 LTS (GNU/Linux 5.4.0-42-generic x86_64)

[ 61%] Building CXX object plugin/x/client/CMakeFiles/mysqlxclient_lite.dir/xprotocol_impl.cc.o
In file included from /home/ubuntu/develop/galaxyengine/sql/sql_cmd.h:31:0,
                 from /home/ubuntu/develop/galaxyengine/sql/sql_plugin.h:34,
                 from /home/ubuntu/develop/galaxyengine/include/mysql/plugin.h:35,
                 from /home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc:29:
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc: In member function ‘bool ha_sequence::setup_base_engine()’:
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc:623:15: error: ‘FALSE’ was not declared in this scope
   DBUG_RETURN(FALSE);
               ^
/home/ubuntu/develop/galaxyengine/include/my_dbug.h:156:13: note: in definition of macro ‘DBUG_RETURN’
     return (a1);                              \
             ^~
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc:623:15: note: suggested alternative: ‘FILE’
   DBUG_RETURN(FALSE);
               ^
/home/ubuntu/develop/galaxyengine/include/my_dbug.h:156:13: note: in definition of macro ‘DBUG_RETURN’
     return (a1);                              \
             ^~
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc:626:15: error: ‘TRUE’ was not declared in this scope
   DBUG_RETURN(TRUE);
               ^
/home/ubuntu/develop/galaxyengine/include/my_dbug.h:156:13: note: in definition of macro ‘DBUG_RETURN’
     return (a1);                              \
             ^~
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc: In member function ‘bool ha_sequence::setup_base_handler(MEM_ROOT*)’:
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc:662:17: error: ‘TRUE’ was not declared in this scope
     DBUG_RETURN(TRUE);
                 ^
/home/ubuntu/develop/galaxyengine/include/my_dbug.h:156:13: note: in definition of macro ‘DBUG_RETURN’
     return (a1);                              \
             ^~
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc:664:15: error: ‘FALSE’ was not declared in this scope
   DBUG_RETURN(FALSE);
               ^
/home/ubuntu/develop/galaxyengine/include/my_dbug.h:156:13: note: in definition of macro ‘DBUG_RETURN’
     return (a1);                              \
             ^~
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc:664:15: note: suggested alternative: ‘FILE’
   DBUG_RETURN(FALSE);
               ^
/home/ubuntu/develop/galaxyengine/include/my_dbug.h:156:13: note: in definition of macro ‘DBUG_RETURN’
     return (a1);                              \
             ^~
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc: In member function ‘bool ha_sequence::get_from_handler_file(const char*, MEM_ROOT*)’:
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc:678:27: error: ‘FALSE’ was not declared in this scope
   if (m_file) DBUG_RETURN(FALSE);
                           ^
/home/ubuntu/develop/galaxyengine/include/my_dbug.h:156:13: note: in definition of macro ‘DBUG_RETURN’
     return (a1);                              \
             ^~
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc:678:27: note: suggested alternative: ‘FILE’
   if (m_file) DBUG_RETURN(FALSE);
                           ^
/home/ubuntu/develop/galaxyengine/include/my_dbug.h:156:13: note: in definition of macro ‘DBUG_RETURN’
     return (a1);                              \
             ^~
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc:682:15: error: ‘FALSE’ was not declared in this scope
   DBUG_RETURN(FALSE);
               ^
/home/ubuntu/develop/galaxyengine/include/my_dbug.h:156:13: note: in definition of macro ‘DBUG_RETURN’
     return (a1);                              \
             ^~
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc:682:15: note: suggested alternative: ‘FILE’
   DBUG_RETURN(FALSE);
               ^
/home/ubuntu/develop/galaxyengine/include/my_dbug.h:156:13: note: in definition of macro ‘DBUG_RETURN’
     return (a1);                              \
             ^~
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc:685:15: error: ‘TRUE’ was not declared in this scope
   DBUG_RETURN(TRUE);
               ^
/home/ubuntu/develop/galaxyengine/include/my_dbug.h:156:13: note: in definition of macro ‘DBUG_RETURN’
     return (a1);                              \
             ^~
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc: In member function ‘bool ha_sequence::new_handler_from_sequence_info(MEM_ROOT*)’:
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc:704:17: error: ‘TRUE’ was not declared in this scope
     DBUG_RETURN(TRUE);
                 ^
/home/ubuntu/develop/galaxyengine/include/my_dbug.h:156:13: note: in definition of macro ‘DBUG_RETURN’
     return (a1);                              \
             ^~
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc:706:15: error: ‘FALSE’ was not declared in this scope
   DBUG_RETURN(FALSE);
               ^
/home/ubuntu/develop/galaxyengine/include/my_dbug.h:156:13: note: in definition of macro ‘DBUG_RETURN’
     return (a1);                              \
             ^~
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc:706:15: note: suggested alternative: ‘FILE’
   DBUG_RETURN(FALSE);
               ^
/home/ubuntu/develop/galaxyengine/include/my_dbug.h:156:13: note: in definition of macro ‘DBUG_RETURN’
     return (a1);                              \
             ^~
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc: In member function ‘bool ha_sequence::initialize_sequence(MEM_ROOT*)’:
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc:722:19: error: ‘TRUE’ was not declared in this scope
       DBUG_RETURN(TRUE);
                   ^
/home/ubuntu/develop/galaxyengine/include/my_dbug.h:156:13: note: in definition of macro ‘DBUG_RETURN’
     return (a1);                              \
             ^~
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc:725:17: error: ‘TRUE’ was not declared in this scope
     DBUG_RETURN(TRUE);
                 ^
/home/ubuntu/develop/galaxyengine/include/my_dbug.h:156:13: note: in definition of macro ‘DBUG_RETURN’
     return (a1);                              \
             ^~
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc:730:17: error: ‘TRUE’ was not declared in this scope
     DBUG_RETURN(TRUE);
                 ^
/home/ubuntu/develop/galaxyengine/include/my_dbug.h:172:7: note: in definition of macro ‘DBUG_EXECUTE_IF’
       a1                                 \
       ^~
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc:730:5: note: in expansion of macro ‘DBUG_RETURN’
     DBUG_RETURN(TRUE);
     ^
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc:733:15: error: ‘FALSE’ was not declared in this scope
   DBUG_RETURN(FALSE);
               ^
/home/ubuntu/develop/galaxyengine/include/my_dbug.h:156:13: note: in definition of macro ‘DBUG_RETURN’
     return (a1);                              \
             ^~
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc:733:15: note: suggested alternative: ‘FILE’
   DBUG_RETURN(FALSE);
               ^
/home/ubuntu/develop/galaxyengine/include/my_dbug.h:156:13: note: in definition of macro ‘DBUG_RETURN’
     return (a1);                              \
             ^~
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc: In member function ‘virtual int ha_sequence::create(const char*, TABLE*, HA_CREATE_INFO*, dd::Table*)’:
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc:1154:68: error: ‘TRUE’ was not declared in this scope
   if (get_from_handler_file(name, ha_thd()->mem_root)) DBUG_RETURN(TRUE);
                                                                    ^
/home/ubuntu/develop/galaxyengine/include/my_dbug.h:156:13: note: in definition of macro ‘DBUG_RETURN’
     return (a1);                              \
             ^~
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc: In member function ‘virtual int ha_sequence::delete_table(const char*, const dd::Table*)’:
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc:1193:68: error: ‘TRUE’ was not declared in this scope
   if (get_from_handler_file(name, ha_thd()->mem_root)) DBUG_RETURN(TRUE);
                                                                    ^
/home/ubuntu/develop/galaxyengine/include/my_dbug.h:156:13: note: in definition of macro ‘DBUG_RETURN’
     return (a1);                              \
             ^~
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc: In member function ‘virtual int ha_sequence::rename_table(const char*, const char*, const dd::Table*, dd::Table*)’:
/home/ubuntu/develop/galaxyengine/sql/ha_sequence.cc:1307:68: error: ‘TRUE’ was not declared in this scope
   if (get_from_handler_file(from, ha_thd()->mem_root)) DBUG_RETURN(TRUE);
                                                                    ^
/home/ubuntu/develop/galaxyengine/include/my_dbug.h:156:13: note: in definition of macro ‘DBUG_RETURN’
     return (a1);                              \
             ^~
[ 61%] Building CXX object plugin/x/client/CMakeFiles/mysqlxclient_lite.dir/xrow_impl.cc.o
make[2]: *** [sql/CMakeFiles/sequence.dir/build.make:63: sql/CMakeFiles/sequence.dir/ha_sequence.cc.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:10502: sql/CMakeFiles/sequence.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[ 61%] Building CXX object plugin/x/client/CMakeFiles/mysqlxclient_lite.dir/xquery_result_impl.cc.o
[ 61%] Building CXX object plugin/x/client/CMakeFiles/mysqlxclient.dir/xcompression_impl.cc.o
[ 61%] Building CXX object plugin/x/client/CMakeFiles/mysqlxclient_lite.dir/xsession_impl.cc.o
[ 61%] Building CXX object plugin/x/client/CMakeFiles/mysqlxclient.dir/xrow.cc.o
[ 61%] Building CXX object plugin/x/client/CMakeFiles/mysqlxclient.dir/validator/capability_compression_validator.cc.o
[ 61%] Building CXX object plugin/x/client/CMakeFiles/mysqlxclient.dir/galaxy_protocol_string.cc.o
[ 61%] Building CXX object plugin/x/client/CMakeFiles/mysqlxclient.dir/__/__/__/sql-common/net_ns.cc.o
[ 61%] Building CXX object plugin/x/client/CMakeFiles/mysqlxclient_lite.dir/xconnection_impl.cc.o
[ 61%] Building CXX object plugin/x/client/CMakeFiles/mysqlxclient_lite.dir/xcompression_negotiator.cc.o
[ 62%] Building CXX object plugin/x/client/CMakeFiles/mysqlxclient_lite.dir/xcompression_impl.cc.o
[ 62%] Building CXX object plugin/x/client/CMakeFiles/mysqlxclient_lite.dir/xrow.cc.o
[ 62%] Building CXX object plugin/x/client/CMakeFiles/mysqlxclient_lite.dir/validator/capability_compression_validator.cc.o
[ 62%] Building CXX object plugin/x/client/CMakeFiles/mysqlxclient_lite.dir/galaxy_protocol_string.cc.o
[ 62%] Building CXX object plugin/x/client/CMakeFiles/mysqlxclient_lite.dir/__/__/__/sql-common/net_ns.cc.o
[ 62%] Linking CXX static library libmysqlxclient.a
[ 62%] Built target mysqlxclient
[ 62%] Linking CXX static library libmysqlxclient_lite.a
[ 62%] Built target mysqlxclient_lite
make: *** [Makefile:163: all] Error 2
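For context, the errors above all have the same cause: the `TRUE`/`FALSE` macros are no longer in scope in `ha_sequence.cc` (modern MySQL 8.0 headers dropped them in favor of the C++ `bool` literals). Below is a minimal, self-contained sketch of the kind of local fix — either a compatibility shim or replacing the macros with `true`/`false` at the call sites. The `DBUG_RETURN` definition here is a stand-in (the real macro in `my_dbug.h` also does trace bookkeeping), and `setup_base_engine_ok` is an illustrative stand-in for the failing member functions, not the real code:

```cpp
#include <cassert>

// Stand-in for the DBUG_RETURN macro from my_dbug.h; the real macro
// also performs debug-trace bookkeeping before returning.
#define DBUG_RETURN(a1) return (a1)

// Compatibility shim: if the tree no longer defines TRUE/FALSE,
// map them to the C++ bool literals (one possible local fix; the
// alternative is editing the call sites to use true/false directly).
#ifndef TRUE
#define TRUE true
#define FALSE false
#endif

// Sketch of the failing pattern from ha_sequence.cc, compiling again
// once the shim (or the literal replacement) is in place.
static bool setup_base_engine_ok(bool engine_found) {
  if (!engine_found) DBUG_RETURN(TRUE);  // error path
  DBUG_RETURN(FALSE);                    // success
}
```

With the shim in place (or with `TRUE`/`FALSE` rewritten as `true`/`false`), the `'FALSE' was not declared in this scope` diagnostics disappear and the translation unit builds.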
@AliMH3410

Could you please tell me how you solved this?

xiewajueji pushed a commit that referenced this issue May 5, 2024
A heap-buffer-overflow in libmysqlxclient when
- auth-method is MYSQL41
- the "server" sends a nonce that is shorter than 20 bytes.

==2466857==ERROR: AddressSanitizer: heap-buffer-overflow on address
#0 0x4a7b76 in memcpy (routertest_component_routing_splicer+0x4a7b76)
#1 0x7fd3a1d89052 in SHA1_Update (/libcrypto.so.1.1+0x1c2052)
#2 0x63409c in compute_mysql41_hash_multi(unsigned char*, char const*,
   unsigned int, char const*, unsigned int)
   ...

RB: 25305
Reviewed-by: Lukasz Kotula <lukasz.kotula@oracle.com>
xiewajueji pushed a commit that referenced this issue May 5, 2024
TABLESPACE STATE DOES NOT CHANGE THE SPACE TO EMPTY

After the commit for Bug#31991688, it was found that an idle system may
not ever get around to truncating an undo tablespace when it is SET INACTIVE.
Actually, it takes about 128 seconds before the undo tablespace is finally
truncated.

There are three main tasks for the function trx_purge().
1) Process the undo logs and apply changes to the data files.
   (May be multiple threads)
2) Clean up the history list by freeing old undo logs and rollback
   segments.
3) Truncate undo tablespaces that have grown too big or are SET INACTIVE
   explicitly.

Bug#31991688 made sure that steps 2 & 3 are not done too often.
Concentrating this effort keeps the purge lag from growing too large.
By default, trx_purge() does step#1 128 times before attempting steps
#2 & #3 which are called 'truncate' steps.  This is set by the setting
innodb_purge_rseg_truncate_frequency.

On an idle system, trx_purge() is called once per second if it has nothing
to do in step 1.  After 128 seconds, it will finally do steps 2 (truncating
the undo logs and rollback segments which reduces the history list to zero)
and step 3 (truncating any undo tablespaces that need it).

The function that the purge coordinator thread uses to make these repeated
calls to trx_purge() is called srv_do_purge(). When trx_purge() returns
having done nothing, srv_do_purge() returns to srv_purge_coordinator_thread()
which will put the purge thread to sleep.  It is woken up again once per
second by the master thread in srv_master_do_idle_tasks(), if not sooner
by any of several other threads and activities.

This is how an idle system can wait 128 seconds before the truncate steps
are done and an undo tablespace that was SET INACTIVE can finally become
'empty'.

The solution in this patch is to modify srv_do_purge() so that if trx_purge()
did nothing and there is an undo space that was explicitly set to inactive,
it will immediately call trx_purge again with do_truncate=true so that steps
#2 and #3 will be done.

This does not affect the effort by Bug#31991688 to keep the purge lag from
growing too big on sysbench UPDATE NO_KEY. With this change, the purge lag
has to be zero and there must be a pending explicit undo space truncate
before this extra call to trx_purge is done.

Approved by Sunny in RB#25311
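The control flow described in this commit can be summed up in a small, simplified model. This is not the real InnoDB API — `PurgeState`, `trx_purge_model`, and `srv_do_purge_model` are illustrative names, and the real functions take very different arguments — but it captures the patch's behavior: when the regular purge pass does nothing and an explicit undo-space truncate is pending, call purge again immediately with truncation enabled instead of waiting roughly 128 idle seconds:

```cpp
#include <cassert>

// Hypothetical, simplified model of the purge coordinator state.
struct PurgeState {
  int pages_handled_last_call = 0;          // did the regular pass do anything?
  bool explicit_undo_truncate_pending = false;  // undo space SET INACTIVE
  int truncate_calls = 0;                   // how often truncation ran
};

// Stand-in for trx_purge(): returns the amount of work done, and counts
// calls that were made with truncation enabled.
static int trx_purge_model(PurgeState *s, bool do_truncate) {
  if (do_truncate) s->truncate_calls++;
  return s->pages_handled_last_call;
}

// The patched srv_do_purge() behavior: if the regular pass was idle and an
// undo space was explicitly SET INACTIVE, immediately re-invoke purge with
// do_truncate=true so steps #2 and #3 run without the 128-second delay.
static void srv_do_purge_model(PurgeState *s) {
  int handled = trx_purge_model(s, /*do_truncate=*/false);
  if (handled == 0 && s->explicit_undo_truncate_pending) {
    trx_purge_model(s, /*do_truncate=*/true);
    s->explicit_undo_truncate_pending = false;
  }
}
```

Note that the extra call only fires when the purge lag is zero, which is why the Bug#31991688 batching behavior on busy sysbench UPDATE NO_KEY workloads is unaffected.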
xiewajueji pushed a commit that referenced this issue May 5, 2024
…TH VS 2019 [#2] [noclose]

storage\ndb\src\kernel\blocks\backup\Backup.cpp(2807,37): warning C4805: '==': unsafe mix of type 'Uint32' and type 'bool' in operation

Change-Id: I0582c4e40bcfc69cdf3288ed84ad3ac62c9e4b80
xiewajueji pushed a commit that referenced this issue May 5, 2024
Use --cluster-config-suffix in mtr.

Change-Id: I667984cfe01c597510c81f80802532df490bf5e6
xiewajueji pushed a commit that referenced this issue May 5, 2024
…ING TABLESPACES

The occurrence of this message is a minor issue fixed by change #1 below.
But during testing, I found that if mysqld is restarted while remote and
local tablespaces are discarded, especially if the tablespaces to be imported
are already in place at startup, then many things can go wrong.  There were
various asserts that occurred depending on timing. During all the testing
and debugging, the following changes were made.

1. Prevent the stats thread from complaining about a missing tablespace.
   See dict_stats_update().
2. Prevent a discarded tablespace from being opened at startup, even if the
   table to be imported is already in place. See Validate_files::check().
3. dd_tablespace_get_state_enum() was refactored to separate the normal
   way to do it in v8.0, which is to use "state" key in
   dd::tablespaces::se_private_date, from the non-standard way which is
   to check undo::spaces or look for the old key value pair of
   "discarded=true". This allowed the new call to this routine by the
   change in fix #2 above.
4. Change thd_tablespace_op() in sql/sql_thd_api.cc such that instead of
   returning 1 if the DDL requires an implicit tablespace, it returns the
   DDL operation flag.  This can still be interpreted as a boolean, but it
   can also be used to determine if the op is an IMPORT or a DISCARD.
5. With that change, the annoying message that a space is discarded can be
   avoided during an import when it needs to be discarded.
6. Several test cases were corrected now that the useless "is discarded"
   warning is no longer being written.
7. Two places where dd_tablespace_set_state() was called to set the state
   to either "discard" or "normal" were consolidated to a new version of
   dd_tablespace_set_state(thd, dd_space_id, space_name, dd_state).
8. This new version of dd_tablespace_set_state() was used in
   dd_commit_inplace_alter_table() to make sure that in all three places
   where the dd is changed to identify a discarded tablespace, it is
   identified in dd::Tablespace::se_private_data as well as
   dd::Table::se_private_data or dd::Partition::se_private_data.
   The reason it is necessary to record this in dd::Tablespace is that
   during startup, boot_tablespaces() and Validate::files::check() are
   only traversing dd::Tablespace.  And that is where fix #2 is done!
9. One of the asserts that occurred was during IMPORT TABLESPACE after a
   restart that found a discarded 5.7 tablespace in the v8.0 discarded
   location. This assert occurred in Fil_shard::get_file_size() just after
   ER_IB_MSG_272.  The 5.7 file did not have the SDI flag, but the v8.0
   space that was discarded did have that flag.  So the flags did not match.
   That crash was fixed by setting the fil_space_t::flags to what it is in
   the tablespace header page.  A descriptive comment was added.
10. There was a section in fil_ibd_open() that checked
   `if (space != nullptr) {` and if true, it would close and free stuff
   then immediately crash.  I think I remember many years ago adding that
   assert because I did not think it actually occurred. Well it did occur
   during my testing before I added fix #2 above.  This made fil_ibd_open()
   assume that the file was NOT already open.
   So fil_ibd_open() is now changed to allow for that possibility by adding
   `if (space != nullptr) {return DB_SUCCESS}` further down.
   Since fil_ibd_open() can be called with a `validate` boolean, the routine
   now attempts to do all the validation whether or not the tablespace is
   already open.

The following are non-functional changes;
- Many code documentation lines were added or improved.
- dict_sys_t::s_space_id renamed to dict_sys_t::s_dict_space_id in order
  to clarify better which space_id it referred to.
- For the same reason, change s_dd_space_id to s_dd_dict_space_id.
- Replaced `table->flags2 & DICT_TF2_DISCARDED`
  with `dict_table_is_discarded(table)` in dict0load.cc
- A redundant call to ibuf_delete_for_discarded_space(space_id) was deleted
  from fil_discard_tablespace() because it is also called higher up in
  the call stack in row_import_for_mysql().
- Deleted the declaration of `row_import_update_discarded_flag()` since
  the definition no longer exists.  It was deleted when we switched from
  `discarded=true` to 'state=discarded' in dd::Tablespace::se_private_data
  early in v8.0 development.

Approved by Mateusz in RB#26077
xiewajueji pushed a commit that referenced this issue May 5, 2024
Memory leaks detected when running testMgm with ASAN build.

bld_asan$> mtr test_mgm

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x3004ed in malloc
(trunk/bld_asan/runtime_output_directory/testMgm+0x3004ed)
    #1 0x7f794d6b0b46 in ndb_mgm_create_logevent_handle
trunk/bld_asan/../storage/ndb/src/mgmapi/ndb_logevent.cpp:85:24
    #2 0x335b4b in runTestMgmApiReadErrorRestart(NDBT_Context*,
NDBT_Step*)
trunk/bld_asan/../storage/ndb/test/ndbapi/testMgm.cpp:652:32

Add support for using unique_ptr for all functions in mgmapi that return
pointer to something that need to be released.
Move existing functionality for ndb_mgm_configuration to same new file.
Use ndb_mgm namespace for new functions and remove implementation
details from both the new and old functionality
Use new functionality to properly release allocated memory.

Change-Id: Id455234077c4ed6756e93bf7f40a1e93179af1a0
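The pattern this commit introduces — wrapping mgmapi's C-style allocated handles in `std::unique_ptr` with a custom deleter so they are released on every exit path — can be sketched as follows. The real API functions (`ndb_mgm_create_logevent_handle`, etc.) are replaced here by stand-ins so the sketch is self-contained, and the `ndb_mgm` namespace details are omitted:

```cpp
#include <cassert>
#include <memory>

// Stand-ins for a C-style mgmapi handle and its create/destroy pair;
// a global counter lets us observe leaks in the sketch.
struct LogeventHandle { /* opaque */ };

static int live_handles = 0;
static LogeventHandle *create_handle() { ++live_handles; return new LogeventHandle; }
static void destroy_handle(LogeventHandle *h) { --live_handles; delete h; }

// Custom deleter so unique_ptr calls the C-style destroy function
// instead of plain delete.
struct HandleDeleter {
  void operator()(LogeventHandle *h) const { destroy_handle(h); }
};
using unique_handle = std::unique_ptr<LogeventHandle, HandleDeleter>;

static bool leak_free_use() {
  {
    unique_handle h(create_handle());
    // ... use h.get() with the C API; early returns are now safe ...
  }  // destroy_handle runs automatically when h goes out of scope
  return live_handles == 0;
}
```

The leak reported by ASAN in runTestMgmApiReadErrorRestart() is exactly the kind an early-return path produces, and RAII wrappers of this shape eliminate it mechanically rather than by auditing every exit path.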
xiewajueji pushed a commit that referenced this issue May 5, 2024
Remove the unused "ndb_table_statistics_row" struct

Change-Id: I62982d005d50a0ece7d92b3861ecfa8462a05661
xiewajueji pushed a commit that referenced this issue May 5, 2024
Patch #2: Support multi-valued indexes for prepared statements.

Parameters to prepared statements are not marked as constant, only as
constant during statement execution; however, only constant values are
considered for use with multi-valued indexes.

Replace const_item() with const_for_execution() to enable use of
such parameters with multi-valued indexes.

This is a contribution by Yubao Liu.

Change-Id: I8cf843a95d2657e5fcc67a04df65815f9ad3154a
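The distinction the patch relies on can be modeled in a few lines. The method names mirror the server's Item interface (`const_item()`, `const_for_execution()`), but the classes below are toy illustrations, not the real Item hierarchy: a literal is constant outright, while a prepared-statement `?` parameter is only constant for the duration of one execution:

```cpp
#include <cassert>

// Toy model of the Item constness predicates.
struct Item {
  virtual bool const_item() const { return false; }
  // By default, "constant for this execution" falls back to plain constness.
  virtual bool const_for_execution() const { return const_item(); }
  virtual ~Item() = default;
};

struct Item_int : Item {   // a literal such as 42
  bool const_item() const override { return true; }
};

struct Item_param : Item {  // a prepared-statement '?' parameter
  bool const_item() const override { return false; }
  bool const_for_execution() const override { return true; }
};

// The patch: qualify values for multi-valued index use with
// const_for_execution() instead of const_item(), so parameters
// qualify alongside literals.
static bool usable_with_mv_index(const Item &i) {
  return i.const_for_execution();
}
```

With the old `const_item()` check, `Item_param` would be rejected and the multi-valued index skipped for prepared statements; the replacement predicate accepts both.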
xiewajueji pushed a commit that referenced this issue May 5, 2024
This error happens for queries such as:

SELECT ( SELECT 1 FROM t1 ) AS a,
  ( SELECT a FROM ( SELECT x FROM t1 ORDER BY a ) AS d1 );

Query_block::prepare() for query block #4 (corresponding to the 4th
SELECT in the query above) calls setup_order() which again calls
find_order_in_list(). That function replaces an Item_ident for 'a' in
Query_block.order_list with an Item_ref pointing to query block #2.
Then Query_block::merge_derived() merges query block #4 into query
block #3. The Item_ref mentioned above is then moved to the order_list
of query block #3.

In the next step, find_order_in_list() is called for query block #3.
At this point, 'a' in the select list has been resolved to another
Item_ref, also pointing to query block #2. find_order_in_list()
detects that the Item_ref in the order_list is equivalent to the
Item_ref in the select list, and therefore decides to replace the
former with the latter. Then find_order_in_list() calls
Item::clean_up_after_removal() recursively (via Item::walk()) for the
order_list Item_ref (since that is no longer needed).

When calling clean_up_after_removal(), no
Cleanup_after_removal_context object is passed. This is the actual
error, as there should be a context pointing to query block #3 that
ensures that clean_up_after_removal() only purge Item_subselect.unit
if both of the following conditions hold:

1) The Item_subselect should not be in any of the Item trees in the
   select list of query block #3.

2) Item_subselect.unit should be a descendant of query block #3.

These conditions ensure that we only purge Item_subselect.unit if we
are sure that it is not needed elsewhere. But without the right
context, query block #2 gets purged even if it is used in the select
lists of query blocks #1 and #3.

The fix is to pass a context (for query block #3) to clean_up_after_removal().
Both of the above conditions then become false, and Item_subselect.unit is
not purged. As an additional shortcut, find_order_in_list() will not call
clean_up_after_removal() if real_item() of the order item and the select
list item are identical.

In addition, this commit changes clean_up_after_removal() so that it
requires the context to be non-null, to prevent similar errors. It
also simplifies Item_sum::clean_up_after_removal() by removing window
functions unconditionally (and adds a corresponding test case).

Change-Id: I449be15d369dba97b23900d1a9742e9f6bad4355
xiewajueji pushed a commit that referenced this issue May 5, 2024
#2]

If the schema distribution client detects timeout, but before freeing
the schema object if the coordinator receives the schema event, then
coordinator instead of returning the function, will process the stale
schema event.

The coordinator does not know if the schema distribution time out is
detected by the client. It starts processing the schema event whenever
the schema object is valid. So, introduce a new variable to indicate
the state of the schema object and change the state when client detect
the schema distribution timeout or when the schema event is received by
the coordinator. So that both coordinator and client can be in sync.

Change-Id: Ic0149aa9a1ae787c7799a675f2cd085f0ac0c4bb
xiewajueji pushed a commit that referenced this issue May 5, 2024
…ILER WARNINGS

Remove some stringop-truncation warning using cstrbuf.

Change-Id: I3ab43f6dd8c8b0b784d919211b041ac3ad4fad40
xiewajueji pushed a commit that referenced this issue May 5, 2024
Patch #1 caused several problems in mysql-trunk related to ndbinfo
initialization and upgrade, including the failure of the test
ndb_76_inplace_upgrade and the failure of all NDB MTR tests in Pushbuild
on Windows. This patch fixes these issues, including fixes for
bug#33726826 and bug#33730799.

In ndbinfo, revert the removal of ndb$blocks and ndb$index_stats and the
change of blocks and index_stats from views to tables.

Improve the ndbinfo schema upgrade & initialization logic to better handle
such a change in the future. This logic now runs in two passes: first it
drops the known tables and views from current and previous versions, then
it creates the tables and views for the current version.

Add a new class method NdbDictionary::printColumnTypeDescription(). This
is needed for the ndbinfo.columns table in patch #2 but was missing from
patch #1. Add boilerplate index lookup initialization code that was
also missing.

Fix ndbinfo prefix determination on Windows.

Change-Id: I422856bcad4baf5ae9b14c1e3a1f2871bd6c5f59
xiewajueji pushed a commit that referenced this issue May 5, 2024
When creating a NdbEventOperationImpl it need reference to a
NdbDictionary::Event. Creating a NdbDictionary::Event involves a
roundtrip to NDB in order to "open" the Event and return the Event
instance. This may fail and is not suitable for doing in a constructor.

Fix by moving the opening of NdbDictionary::Event out of
NdbEventOperationImpl constructor.

Change-Id: I5752f8b636ddd31672ac95f59b8f272a41cddfa9
xiewajueji pushed a commit that referenced this issue May 5, 2024
* PROBLEM

The test "ndb.ndb_bug17624736" was constantly failing in
[daily|weekly]-8.0-cluster branches in PB2, whether on `ndb-ps` or
`ndb-default-big` profile test runs. The high-level reason for the
failure was the installation of a duplicate entry in the Data
Dictionary in respect to the `engine`-`se_private_id` pair, even when
the previous table definition should have been dropped.

* LOW-LEVEL EXPLANATION

When data nodes fail and need to reorganize, the MySQL servers
connected start to synchronize the schema definition in their own Data
Dictionary. The `se_private_id` for NDB tables installed in the DD is
the same as the NDB table ID, hereafter referred to as just ID, and
thus an `engine`-`se_private_id` pair is installed in
`tables.engine`. It is common for tables to be updated with different IDs,
such as when an ALTER table or a DROP/CREATE occurs. The previous
table definition, gotten by table full qualified name ("schema.table"
format), is usually sufficient to be dropped and hence the new table
to be installed with the new ID, since it is assumed that no other
table definition is installed with that ID. However, on the
synchronization phase, if the data node failure caused a previous
table definition *of a different table than the one to be installed*
to still exist with the ID to be installed, then that old definition
won't be dropped and thus a duplicate entry warning will be logged on
the THD.

Example:
t1 - id=13,version=1
t2 - id=15,version=1
<failures and synchronization>
t1 = id=9,version=2
t2 = id=13,version=2 (previous def=15, but ndbcluster-13 still exists)

One of the reasons for the error is that on
`Ndb_dd_client::install_table` the name is used to fetch the previous
definition while on `Ndb_dd_client::store_table` the ID is used
instead. Also, `Ndb_dd_client::install_table` should be able to drop
the required table definitions on the DD in order to install the new
one, as dictated by the data nodes. It was just dropping the one found
by the name of the table to be installed.

* SOLUTION

The solution was to add procedures to check if the ID to be installed
is different than the previous, then it must be checked if an old
table definition already exists with that ID. If it does, drop it
also.

Additionally, some renaming (`object_id` to `spi`, refering to
`se_private_id`) and a new struct were employed to make it
simpler to keep the pair (ID-VERSION) together and respectively
install these on the new table's definition SE fields.

Change-Id: Ie671a5fc58646e02c21ef1299309303f33173e95
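The install logic described in the SOLUTION section can be modeled with a small map keyed by name, where each entry also carries an `se_private_id`. This is a toy illustration — `DD::install_table` is not the real `Ndb_dd_client` API — but it shows the fix: dropping the definition found by name is not enough; any *other* stale definition still holding the incoming ID must be dropped too, or the `engine`-`se_private_id` pair would be duplicated:

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model of the Data Dictionary: "schema.table" -> se_private_id.
struct DD {
  std::map<std::string, int> id_by_name;

  void install_table(const std::string &name, int id) {
    // Drop the previous definition found by fully qualified name.
    id_by_name.erase(name);
    // The fix: also drop a stale definition of a *different* table that
    // still holds the ID being installed (possible after data-node
    // failure and schema synchronization).
    for (auto it = id_by_name.begin(); it != id_by_name.end(); ++it) {
      if (it->second == id) { id_by_name.erase(it); break; }
    }
    id_by_name[name] = id;  // no duplicate engine-se_private_id pair now
  }
};
```

Replaying the commit's example — t1 at ID 13 and t2 at 15 before the failure, t2 reinstalled at 13 and t1 at 9 afterwards — the loop removes t1's stale 13 entry before t2 claims it, which is exactly the case the old by-name-only drop missed.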
xiewajueji pushed a commit that referenced this issue May 5, 2024
-- Patch #1: Persist secondary load information --

Problem:
We need a way of knowing which tables were loaded to HeatWave after
MySQL restarts due to a crash or a planned shutdown.

Solution:
Add a new "secondary_load" flag to the `options` column of mysql.tables.
This flag is toggled after a successful secondary load or unload. The
information about this flag is also reflected in
INFORMATION_SCHEMA.TABLES.CREATE_OPTIONS.

-- Patch #2 --

The second patch in this worklog triggers the table reload from InnoDB
after MySQL restart.

The recovery framework recognizes that the system restarted by checking
whether tables are present in the Global State. If there are no tables
present, the framework will access the Data Dictionary and find which
tables were loaded before the restart.

This patch introduces the "Data Dictionary Worker" - a MySQL service
recovery worker whose task is to query the INFORMATION_SCHEMA.TABLES
table from a separate thread and find all tables whose secondary_load
flag is set to 1.

All tables that were found in the Data Dictionary will be appended to
the list of tables that have to be reloaded by the framework from
InnoDB.

If an error occurs during restart recovery we will not mark the recovery
as failed. This is done because the types of failures that can occur
when the tables are reloaded after a restart are less critical compared
to previously existing recovery situations. Additionally, this code will
soon have to be adapted for the next worklog in this area so we are
proceeding with the simplest solution that makes sense.

A Global Context variable m_globalStateEmpty is added which indicates
whether the Global State should be recovered from an external source.

-- Patch #3 --

This patch adds the "rapid_reload_on_restart" system variable. This
variable is used to control whether tables should be reloaded after a
restart of mysqld or the HeatWave plugin. This variable is persistable
(i.e., SET PERSIST RAPID_RELOAD_ON_RESTART = TRUE/FALSE).

The default value of this variable is set to false.

The variable can be modified in OFF, IDLE, and SUSPENDED states.

-- Patch #4 --

This patch refactors the recovery code by removing all recovery-related
code from ha_rpd.cc and moving it to separate files:

  - ha_rpd_session_factory.h/cc:
  These files contain the MySQLAdminSessionFactory class, which is used
to create admin sessions in separate threads that can be used to issue
SQL queries.

  - ha_rpd_recovery.h/cc:
  These files contain the MySQLServiceRecoveryWorker,
MySQLServiceRecoveryJob and ObjectStoreRecoveryJob classes which were
previously defined in ha_rpd.cc. This file also contains a function that
creates the RecoveryWorkerFactory object. This object is passed to the
constructor of the Recovery Framework and is used to communicate with
the other section of the code located in rpdrecoveryfwk.h/cc.

This patch also renames rpdrecvryfwk to rpdrecoveryfwk for better
readability.

The include relationship between the files is shown on the following
diagram:

        rpdrecoveryfwk.h◄──────────────rpdrecoveryfwk.cc
            ▲    ▲
            │    │
            │    │
            │    └──────────────────────────┐
            │                               │
        ha_rpd_recovery.h◄─────────────ha_rpd_recovery.cc──┐
            ▲                               │           │
            │                               │           │
            │                               │           │
            │                               ▼           │
        ha_rpd.cc───────────────────────►ha_rpd.h       │
                                            ▲           │
                                            │           │
            ┌───────────────────────────────┘           │
            │                                           ▼
    ha_rpd_session_factory.cc──────►ha_rpd_session_factory.h

Other changes:
  - In agreement with Control Plane, the external Global State is now
  invalidated during recovery framework startup if:
    1) Recovery framework recognizes that it should load the Global
    State from an external source AND,
    2) rapid_reload_on_restart is set to OFF.

  - Addressed review comments for Patch #3: rapid_reload_on_restart is
  now also settable while the plugin is ON.

  - Provide a single entry point for processing external Global State
  before starting the recovery framework loop.

  - Change when the Data Dictionary is read. Now we will no longer wait
  for the HeatWave nodes to connect before querying the Data Dictionary.
  We will query it when the recovery framework starts, before accepting
  any actions in the recovery loop.

  - Change the reload flow by inserting fake global state entries for
  tables that need to be reloaded instead of manually adding them to a
  list of tables scheduled for reload. This method will be used for the
  next phase where we will recover from Object Storage so both recovery
  methods will now follow the same flow.

  - Update secondary_load_dd_flag added in Patch #1.

  - Increase timeout in wait_for_server_bootup to 300s to account for
  long MySQL version upgrades.

  - Add reload_on_restart and reload_on_restart_dbg tests to the rapid
  suite.

  - Add PLUGIN_VAR_PERSIST_AS_READ_ONLY flag to "rapid_net_orma_port"
  and "rapid_reload_on_restart" definitions, enabling their
  initialization from persisted values along with "rapid_bootstrap" when
  it is persisted as ON.

  - Fix numerous clang-tidy warnings in recovery code.

  - Prevent the suspended_basic and secondary_load_dd_flag tests from
  running on ASAN builds due to an existing issue when reinstalling the
  RAPID plugin.
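
The Global State invalidation rule from the first bullet above can be
sketched as a simple predicate. This is an illustrative C++ sketch with
a hypothetical function name, not the actual recovery framework code:

```cpp
#include <cassert>

// Sketch of the startup rule: the external Global State is invalidated
// only when the recovery framework would load it from an external
// source AND rapid_reload_on_restart is OFF.
bool should_invalidate_external_state(bool load_from_external,
                                      bool reload_on_restart) {
  return load_from_external && !reload_on_restart;
}
```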

-- Bug#33752387 --

Problem:
A shutdown of MySQL causes a crash in queries fired by DD worker.

Solution:
Prevent MySQL from killing DD worker's queries by instantiating a
DD_kill_immunizer before the queries are fired.
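
The RAII pattern behind the fix can be sketched as follows. This is an
illustrative C++ sketch modeled on the DD_kill_immunizer idea; the
Session struct and class name are hypothetical stand-ins (in the server
the immunity state lives on the THD):

```cpp
#include <cassert>

// Hypothetical stand-in for the session's kill-immunity flag.
struct Session {
  bool kill_immune = false;
};

// Sketch of an RAII kill immunizer: queries fired while the guard is
// alive cannot be killed by shutdown; the previous state is restored
// when the guard goes out of scope.
class KillImmunizer {
 public:
  explicit KillImmunizer(Session &s) : s_(s), saved_(s.kill_immune) {
    s_.kill_immune = true;
  }
  ~KillImmunizer() { s_.kill_immune = saved_; }

 private:
  Session &s_;
  bool saved_;
};
```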

-- Patch #5 --

Problem:
A table can be loaded before the DD Worker queries the Data Dictionary.
This means that the table will be wrongly processed as part of the external
global state.

Solution:
If the table is present in the current in-memory global state we will
not consider it as part of the external global state and we will not
process it by the recovery framework.
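
The filter can be sketched as a membership check. This is an
illustrative C++ sketch with hypothetical names; the real in-memory
global state is not a plain string set:

```cpp
#include <cassert>
#include <set>
#include <string>

// Sketch of the Patch #5 filter: a table already present in the
// in-memory global state was loaded after the restart, so it is not
// treated as part of the external global state.
bool part_of_external_state(const std::set<std::string> &in_memory_state,
                            const std::string &table) {
  return in_memory_state.count(table) == 0;
}
```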

-- Bug#34197659 --

Problem:
If a table reload after restart causes OOM the cluster will go into
RECOVERYFAILED state.

Solution:
Recognize when the tables are being reloaded after restart and do not
move the cluster into RECOVERYFAILED. In that case only the current
reload fails, and reloads of the other tables are still attempted.
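
The state transition can be sketched as follows. This is an
illustrative C++ sketch; the enum and function name are hypothetical
stand-ins for the cluster state machine:

```cpp
#include <cassert>

enum class ClusterState { RECOVERING, RECOVERYFAILED };

// Sketch of the fix: a reload failure moves the cluster to
// RECOVERYFAILED only when it is NOT a reload-after-restart; otherwise
// the cluster state is left as-is and only this table's reload fails.
ClusterState on_reload_failure(ClusterState current, bool restart_reload) {
  return restart_reload ? current : ClusterState::RECOVERYFAILED;
}
```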

Change-Id: Ic0c2a763bc338ea1ae6a7121ff3d55b456271bf0
xiewajueji pushed a commit that referenced this issue May 5, 2024
Add various json fields in the new JSON format. Have json field
"access_type" with value "index" for many scans that use one form of
index or another. Plans with "access_type=index" have additional
fields such as index_access_type, covering, lookup_condition,
index_name, etc. The value of index_access_type further tells us
what specific type of index scan it is, such as Index range scan or
Index lookup scan.

Join plan nodes have access_type=join. Such plans will, again, have
additional json fields that tell us whether it's a hash join, merge
join, and whether it is an antijoin, semijoin, etc.

If a plan node is the root of a subquery subtree, it additionally
has the field 'subquery' with value "true". Such plan nodes will also
have fields like "location=projection" and "dependent=true",
corresponding to the TREE format synopsis:
Select #2 (subquery in projection; dependent)

If a json field is absent, its value should be interpreted as either
0, empty, or false, depending on its type.
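
The absent-field convention can be sketched as a small lookup helper.
This is an illustrative C++ sketch, not the server's JSON code; the
plan node is simplified here to a string-to-string map:

```cpp
#include <cassert>
#include <map>
#include <string>

// Sketch of the reading convention: an absent JSON field defaults to
// false / 0 / empty depending on its type, supplied as `fallback`.
std::string json_field_or(const std::map<std::string, std::string> &node,
                          const std::string &key,
                          const std::string &fallback) {
  auto it = node.find(key);
  return it != node.end() ? it->second : fallback;
}
```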

A side effect of this commit is that for AccessPath::REF, the phrase
"iterate backwards" is changed to "reverse".

New test file added to test format=JSON with hypergraph optimizer.

Change-Id: I816af3ec546c893d4fc0c77298ef17d49cff7427
xiewajueji pushed a commit that referenced this issue May 5, 2024
Enh#34350907 - [Nvidia] Allow DDLs when tables are loaded to HeatWave
Bug#34433145 - WL#15129: mysqld crash Assertion `column_count == static_cast<int64_t>(cp_table-
Bug#34446287 - WL#15129: mysqld crash at rapid::data::RapidNetChunkCtx::consolidateEncodingsDic
Bug#34520634 - MYSQLD CRASH : Sql_cmd_secondary_load_unload::mysql_secondary_load_or_unload
Bug#34520630 - Failed Condition: "table_id != InvalidTableId"

Currently, DDL statements such as ALTER TABLE*, RENAME TABLE, and
TRUNCATE TABLE are not allowed if a table has a secondary engine
defined. The statements fail with the following error: "DDLs on a table
with a secondary engine defined are not allowed."

This worklog lifts this restriction for tables whose secondary engine is
RAPID.

A secondary engine hook is called in the beginning (pre-hook) and in the
end (post-hook) of a DDL statement execution. If the DDL statement
succeeds, the post-hook will direct the recovery framework to reload the
table in order to reflect that change in HeatWave.
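
The pre-hook/post-hook pattern can be sketched as an RAII guard. This
is an illustrative C++ sketch with hypothetical names; the real hooks
live in ha_rpd_hooks.cc and the reload is triggered by marking the
table stale (WL#14914):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Sketch of a DDL hook guard: the pre-hook runs on construction and,
// if the DDL succeeded, the post-hook (run on destruction) directs the
// recovery framework to reload the table.
class DdlHook {
 public:
  DdlHook(std::vector<std::string> &log, std::string table)
      : log_(log), table_(std::move(table)) {
    log_.push_back("pre-hook " + table_);
  }
  void ddl_succeeded() { success_ = true; }
  ~DdlHook() {
    if (success_) log_.push_back("reload " + table_);  // post-hook
  }

 private:
  std::vector<std::string> &log_;
  std::string table_;
  bool success_ = false;
};
```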

Currently, all DDL statements that were previously disallowed will
trigger a reload. This can be improved in the future by checking
whether the DDL operation has an impact on HeatWave. However, detecting
all edge cases of this behavior is not straightforward, so this has
been left as future work.

Additionally, if a DDL modifies the table schema in a way that makes it
incompatible with HeatWave (e.g., dropping a primary key column) the
reload will fail silently. There is no easy way to recognize whether the
table schema will become incompatible with HeatWave in a pre-hook.

List of changes:
  1) [MySQL] Add new HTON_SECONDARY_ENGINE_SUPPORTS_DDL flag to indicate
whether a secondary engine supports DDLs.
  2) [MySQL] Add RAII hooks for RENAME TABLE and TRUNCATE TABLE, modeled
on the ALTER TABLE hook.
  3) Define HeatWave hooks for ALTER TABLE, RENAME TABLE, and TRUNCATE
TABLE statements.
  4) If a table reload is necessary, trigger it by marking the table as
stale (WL#14914).
  5) Move all change propagation & DDL hooks to ha_rpd_hooks.cc.
  6) Adjust existing tests to support table reload upon DDL execution.
  7) Extract code related to RapidOpSyncCtx into ha_rpd_sync_ctx.cc,
and the PluginState enum into ha_rpd_fsm.h.

* Note: ALTER TABLE statements related to secondary engine setting and
loading were allowed before:
    - ALTER TABLE <TABLE> SECONDARY_UNLOAD,
    - ALTER TABLE SECONDARY_ENGINE = NULL.

-- Bug#34433145 --
-- Bug#34446287 --

--Problem #1--
Crashes occur in Change Propagation when the CP thread tries to apply
DMLs of tables with a new schema to the not-yet-reloaded table in
HeatWave.

--Solution #1--
Remove the table from Change Propagation before marking it as stale,
and revert the original change in rpd_binlog_parser.cc where we checked
whether the table was stale before continuing with binlog parsing. The
original change is no longer necessary since the table is removed from
CP before being marked as stale.
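
The ordering that this fix enforces can be sketched as follows. This is
an illustrative C++ sketch with hypothetical names; the real CP and
stale bookkeeping are not plain string sets:

```cpp
#include <cassert>
#include <set>
#include <string>

// Sketch of the ordering fix: the table leaves Change Propagation
// BEFORE it is marked stale, so the CP thread can never apply
// new-schema DMLs to a not-yet-reloaded table.
void mark_table_stale(std::set<std::string> &cp_tables,
                      std::set<std::string> &stale_tables,
                      const std::string &table) {
  cp_tables.erase(table);      // 1) stop change propagation first
  stale_tables.insert(table);  // 2) only then mark the table stale
}
```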

--Problem #2--
In case of a failed reload, tables are not removed from Global State.

--Solution #2--
Keep track of whether the table was reloaded because it was marked as
STALE. In that case we do not want the Recovery Framework to retry the
reload and therefore we can remove the table from the Global State.

-- Bug#34520634 --

Problem:
Allowing the primary engine of a table with a defined secondary engine
to be changed hits an assertion in mysql_secondary_load_or_unload().

Example:
    CREATE TABLE t1 (col1 INT PRIMARY KEY) SECONDARY_ENGINE = RAPID;
    ALTER TABLE t1 ENGINE = BLACKHOLE;
    ALTER TABLE t1 SECONDARY_LOAD; <- assertion hit here

Solution:
Disallow changing the primary engine for tables with a defined secondary
engine.

-- Bug#34520630 --

Problem:
A debug assertion is hit in rapid_gs_is_table_reloading_from_stale
because the table was dropped in the meantime.

Solution:
Instead of asserting, just return false if the table is not present in
the Global State.
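
The fixed check can be sketched as follows. This is an illustrative
C++ sketch; the map representation and the state-name string are
hypothetical stand-ins for the real Global State structures:

```cpp
#include <cassert>
#include <map>
#include <string>

// Sketch of the fix: report false when the table is absent from the
// Global State (it may have been dropped concurrently) instead of
// hitting a debug assertion.
bool is_table_reloading_from_stale(
    const std::map<std::string, std::string> &global_state,
    const std::string &table) {
  auto it = global_state.find(table);
  return it != global_state.end() && it->second == "INRECOVERY_FROM_STALE";
}
```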

This patch also replaces rapid_gs_is_table_reloading_from_stale with a
more specific check (its logic is inlined in load_table()). This check
now also
covers the case when a table was dropped/unloaded before the Recovery
Framework marked it as INRECOVERY. In that case, if the reload fails we
should not have an entry for that table in the Global State.

The patch also adjusts the dict_types MTR test, where we no longer
expect tables to be in the UNAVAIL state after a failed reload.
Additionally, recovery2_ddls.test is adjusted to not try to offload
queries running on Performance Schema.

Change-Id: I6ee390b1f418120925f5359d5e9365f0a6a415ee