Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ASAN, syntax and misc bugfixes #1332

Merged
merged 14 commits into from
Nov 2, 2023
Merged

Conversation

midenok
Copy link
Collaborator

@midenok midenok commented Oct 27, 2023

See fix descriptions in commit messages.

@davidducos
Copy link
Member

Hi @midenok,
Wow!! NICE WORK! thanks you for working on those issue, I will be reviewing the code.
Btw, take into account that the checks didn't pass. Let me know if you need help fixing those.

@midenok midenok force-pushed the corp_latest branch 2 times, most recently from a6b1774 to da493ed Compare October 27, 2023 22:41
@midenok
Copy link
Collaborator Author

midenok commented Oct 27, 2023

Hi @midenok, Wow!! NICE WORK! thanks you for working on those issue, I will be reviewing the code. Btw, take into account that the checks didn't pass. Let me know if you need help fixing those.

Ok. Now I hope everything is Ok.

@midenok
Copy link
Collaborator Author

midenok commented Oct 30, 2023

@davidducos I guess tests passed but some of them was interrupted. Right?

CMakeLists.txt Outdated Show resolved Hide resolved
@@ -548,7 +601,10 @@ void initialize_sql_statement(GString *statement){
if (is_mysql_like()) {
if (set_names_statement)
g_string_printf(statement,"%s;\n",set_names_statement);
g_string_append(statement, "/*!40014 SET UNIQUE_CHECKS=0*/;\n");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not append this to the files, on myloader we have the --defaults-file and the --defaults-extra-file to configure the [myloader_session_variables] to set this options. Honestly, I'm not sure why the FOREIGN_KEY_CHECKS is still here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should this be set conditionally? What's the point in UNIQUE_CHECKS=1?

Copy link
Collaborator Author

@midenok midenok Oct 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How it can not fail with FOREIGN_KEY_CHECKS=1? Do you track correct table creation order? This not only requires right order of DDL, but DML as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not arguing against the usage of FOREIGN_KEY_CHECKS=0 nor UNIQUE_CHECKS=0, what I'm saying is that the backup should not keep those SET, that is why we implemented the [myloader_session_variables] which has those SET by default in /etc/mydumper.cnf.
In this why, you can easy add any SET to the connections, not only FOREIGN_KEY_CHECKS or UNIQUE_CHECKS.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The question stays: why do you want them set conditionally if it does not make sense?

Copy link
Collaborator Author

@midenok midenok Oct 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I got it about UNIQUE_CHECKS.

FOREIGN_KEY_CHECKS=1 doesn't work at all, so it is in the code, OK.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about SQL_NOTES and NO_AUTO_VALUE_ON_ZERO. I guess NO_AUTO_VALUE_ON_ZERO should be there unconditionally, otherwise it produces different result on restored database.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SQL_NOTES can be added to the default configuration file.
We are setting NO_AUTO_VALUE_ON_ZERO to sql_mode in the default configuration file.

take into account that nowadays it is really important the configuration file, which is loaded by default from /etc/mydumper.cnf when you install it. However, when you compile MyDumper, it is not copy to the default location, that is why it is not loaded. That is why, I recommend you to compile and install if you are going to perform tests.

Copy link
Collaborator Author

@midenok midenok Oct 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The software should work correctly without any configuration files. Different result between the origin and the copy should be considered as bug. Default normally means something that doesn't require any explicit configuration settings.

Copy link
Collaborator Author

@midenok midenok Oct 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated this change to commit 25d0dce

@@ -937,7 +937,9 @@ GString *get_insertable_fields(MYSQL *conn, char *database, char *table) {
g_string_append(field_list, ",");
}

gchar *tb = g_strdup_printf("`%s`", row[0]);
char *field_name= backtick_protect(row[0]);
gchar *tb = g_strdup_printf("`%s`", field_name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double backtick protection???

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you see test case in the link of commit message? It fixes bug with backticks in the name (backtick is the part of the name).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

create table ```\"'a` (i int);

--source include/mydumper_dump.inc
--source include/myloader_restore.inc
show tables;
drop table ```\"'a`;

Copy link
Collaborator Author

@midenok midenok Oct 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated this commit to b180d5c for one more fix: Newline in identifier names breaks myloader restore

CREATE TABLE `tab
one` (a int);

@@ -1076,7 +1078,10 @@ void get_primary_key_separated_by_comma(struct db_table * dbt) {
}else{
g_string_append(field_list, ",");
}
g_string_append(field_list, (char*)list->data);
char *field_name= backtick_protect((char*) list->data);
gchar *tb = g_strdup_printf("`%s`", field_name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double backtick protection???

@@ -1152,7 +1157,7 @@ gboolean new_db_table( struct db_table **d, MYSQL *conn, struct configuration *c
dbt = g_new(struct db_table, 1);
dbt->status = UNDEFINED;
dbt->database = database;
dbt->table = g_strdup(table);
dbt->table = backtick_protect(table);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this will cause problems, I think that we use dbt->table enclosed by backtick on several queries. Did you review this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It passed quite long test with mischievous test cases adapted from mysqldump. See above answer and bug description.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you share the include files so I can add this test to the project?

Copy link
Collaborator Author

@midenok midenok Oct 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are run under MTR.

I'm not sure how would you run it because it checks the output against the .result file, but at least you can check runnability.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should add -M to mydumper_args on mydumper_dump.inc, to check that the consistency

@davidducos
Copy link
Member

Hi @midenok,
about #1332 (comment)
No, tests didn't pass. We perform tests over multiple combination of options to detect possible issue and as you can see, 3 of them hang at #541, and 1 at #545. When this happens, I usually start a SSH test and check the test that failed.
From what I can see, this might be related to stream + gzip, but I might be wrong.

@midenok midenok force-pushed the corp_latest branch 5 times, most recently from 98d9fae to 7ef8c00 Compare October 30, 2023 19:36
davidducos
davidducos previously approved these changes Oct 30, 2023
@midenok midenok force-pushed the corp_latest branch 2 times, most recently from 2aea39f to 6866b0d Compare October 31, 2023 10:55
@midenok midenok changed the title Bugfixes ASAN, syntax and misc bugfixes Oct 31, 2023
@midenok
Copy link
Collaborator Author

midenok commented Oct 31, 2023

Hi @midenok, about #1332 (comment) No, tests didn't pass. We perform tests over multiple combination of options to detect possible issue and as you can see, 3 of them hang at #541, and 1 at #545. When this happens, I usually start a SSH test and check the test that failed. From what I can see, this might be related to stream + gzip, but I might be wrong.

I guess it was 266dd7f. So I removed it and now I see tests are a passing. Let's merge without it. For 266dd7f I will create another PR (and I expect there at least one more bugfix).

@midenok
Copy link
Collaborator Author

midenok commented Oct 31, 2023

@davidducos Btw, I see you create empty merge commits from pull requests. There was rebase option when you merge PR.

Typo fix: set_names_str is not the full statement string,
set_names_statement is the statement string according to
initialize_set_names().

Bug description:

https://jira.mariadb.org/browse/CONTRIB-5
jemalloc is known to eat more memory. OTOH there is testimony that it
may not achieve any significant performance at all:

https://stackoverflow.com/a/33993215/3824341

Bug description:

https://jira.mariadb.org/browse/CONTRIB-11
Omit format was "db.table" per line. Now it also accepts just "db" to
skip the whole db (including its schema).

Excluding some databases from dump is important feature because the
list of included databases may be not known and/or much bigger.

Bug description:

https://jira.mariadb.org/browse/CONTRIB-9
The logic in write_view_definition_into_file() of making duplicate
CREATE TABLE statements for views commented in the code as "we create
tables as workaround for view dependencies" pursues the goal of fixing
Bug#10927.

But this was done with undefined storage engine. Default storage
engine prohibits such amount of columns. We add ENGINE=MEMORY which
allows such amount of columns.

Original fix for Bug#10927 adds TEMPORARY keyword but we cannot add it
as it will fail for multiple connections.

Bug description:

https://jira.mariadb.org/browse/CONTRIB-14
Related to CONTRIB-14 and multi-threaded creation of views. When one
thread already dropped table v1 but did not yet created view v1
another thread creates view v2 which selects from v1 and fails.

This cannot be fixed with LOCK TABLES as lock is closed when table is
dropped. This cannot be fixed with CREATE OR REPLACE as this doesn't
work on different type (table cannot be replaced by view). As a quick
fix we make --max-threads-for-post-actions 1 by default, the number of
threads used for views creation.

Bug description:

https://jira.mariadb.org/browse/CONTRIB-17
UNHEX() works only with --hex-blob option.

Bug description:

https://jira.mariadb.org/browse/CONTRIB-20
When strlen() == 0 -1 addresses before buffer boundary. while()
condition was always false because buffer[strlen(buffer)] is always 0,
otherwise how strlen() would work?

We incremented chunk size from 256 to 4096 to speedup large loads.

Bug description:

https://jira.mariadb.org/browse/CONTRIB-21
Related to 96e1d8d change of

 /*  if (use_savepoints &&
       mysql_query(td->thrconn, "ROLLBACK TO SAVEPOINT mydumper")) {
     g_critical("Rollback to savepoint failed: %s", mysql_error(td->thrconn));
   }*/
-  tj->td=NULL;
   free_table_job(tj);
+  tj->td=NULL;
   g_free(job);

which is wrong as free_table_job() frees tj in the same commit.
When some SSL option like --key or --cert is wrong or missing SSL does
very vague messaging:

  Error connection to database: TLS/SSL error: No such file or directory
  Error connection to database: TLS/SSL error: The requested data were not available.
  Error connection to database: TLS/SSL error: Error while reading file.
  Error connection to database: TLS/SSL error: Base64 decoding error.

We can at least make this more understandable if the required option
is missing or if file not exists. The patch checks these criteria for
--ssl-mode=required (or preferred) for options --key and --cert and
for checking server identity mode it checks options --ca or --capath.

Bug description:

https://jira.mariadb.org/browse/CONTRIB-7
Protect backtick in name by repeating. This is different from
escaping, non-escaped and escaped names must include this.

g_string_replace() is included in 2.68, we include source code for
older glib.

Bug description:

https://jira.mariadb.org/browse/CONTRIB-15

Newline in identifier names breaks myloader restore

metadata INI format is broken with newlines in indentifier names. We
protect newlines with non-allowed Unicode symbol "\u10000".

Cleanup in get_database_table_part_name_from_filename(): dead code
removed.

"\u10000" is available from C99 so we added -std=gnu99 option.

https://jira.mariadb.org/browse/CONTRIB-23
@midenok midenok force-pushed the corp_latest branch 2 times, most recently from a8d5f4e to 87e4864 Compare October 31, 2023 19:09
@midenok
Copy link
Collaborator Author

midenok commented Oct 31, 2023

I added one more fix 87e4864. Everything else if any I will add in new PR.

@davidducos davidducos added this to the Release 0.15.3-1 milestone Nov 1, 2023
@davidducos davidducos added the Fix label Nov 1, 2023
src/myloader.c Outdated
if (m > 4 || s > 1 || (s == 1 && r >= 1))
g_hash_table_insert(set_session_hash,g_strdup("SQL_MODE"), g_strdup("'NO_AUTO_VALUE_ON_ZERO'"));
if (m > 4 || s > 1 || (s == 1 && r >= 11))
g_hash_table_insert(set_session_hash,g_strdup("SQL_NOTES"), g_strdup("0"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be added to mydumper.cnf

src/myloader.c Outdated
Comment on lines 119 to 127
if (m >= 4) {
if (m > 4 || s > 0 || r >= 14) {
g_hash_table_insert(set_session_hash,g_strdup("UNIQUE_CHECKS"), g_strdup("0"));
g_hash_table_insert(set_session_hash,g_strdup("FOREIGN_KEY_CHECKS"), g_strdup("0"));
}
if (m > 4 || s > 1 || (s == 1 && r >= 1))
g_hash_table_insert(set_session_hash,g_strdup("SQL_MODE"), g_strdup("'NO_AUTO_VALUE_ON_ZERO'"));
if (m > 4 || s > 1 || (s == 1 && r >= 11))
g_hash_table_insert(set_session_hash,g_strdup("SQL_NOTES"), g_strdup("0"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is why we have mydumper.cnf, to avoid this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Configuration file should not be mandatory for correct work of software. This is user unfriendly! Otherwise you should fail it to run if it doesn't find configuration file. What exactly did you want to avoid? There is only 4 variables and the number doesn't grow for long time.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this commit to PR #1339 so we could discuss it separately.

../src/extra/mydumper/src/mydumper_working_thread.c:276:7: error: variable 'databases' is used uninitialized whenever '||' condition is true [-Werror,-Wsometimes-uninitialized]
  if (mysql_query(td->thrconn, "SHOW DATABASES") ||
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../src/extra/mydumper/src/mydumper_working_thread.c:281:33: note: uninitialized use occurs here
  while ((row = mysql_fetch_row(databases))) {
                                ^~~~~~~~~
../src/extra/mydumper/src/mydumper_working_thread.c:276:7: note: remove the '||' if its condition is always false
  if (mysql_query(td->thrconn, "SHOW DATABASES") ||
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../src/extra/mydumper/src/mydumper_working_thread.c:274:23: note: initialize the variable 'databases' to silence this warning
  MYSQL_RES *databases;
                      ^
                       = NULL
1 error generated.
On restoring sequences with stored values myloader fails with
following error:

  ** (myloader:2371085): WARNING **: 19:57:57.649: Thread 13: Error restoring -1: /*!40101 SET NAMES binary*/;
   Commands out of sync; you can't run this command now

That is because it did run:

  SELECT SETVAL(`s1`, 1101, 0);

myloader cannot run SELECT, otherwise it has to receive
resultset. Without getting resultset next command fails with "Commands
out of sync".

The patch replaces SELECT with DO which discards result output.

Bug description:

https://jira.mariadb.org/browse/CONTRIB-24
@davidducos davidducos merged commit bb2d23f into mydumper:master Nov 2, 2023
37 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants