Modifications to support use of the DBD::MariaDB driver instead of DBD::mysql #1160

Merged

taniwallach
Member

Modifications to allow use of the DBD::MariaDB driver instead of DBD::mysql.

The DBD::MariaDB driver handles UTF-8 by default. It does not need mysql_enable_utf8mb4 or mysql_enable_utf8, and connecting fails if either is given.

These changes go far beyond the initial one mentioned in #1150 (comment), which was needed to allow using the new driver.

See:


Also included are modifications to prevent the mysqldump of MySQL 8+ from reporting errors about column statistics being missing from MariaDB databases, by disabling column-statistics when the installed mysqldump command supports that option.
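A minimal sketch of how such a check could look. This is not the code from this pull request: the helper name `extra_mysqldump_options` is hypothetical, and the detection method (looking for the option name in the output of `mysqldump --help`) is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper (not from this PR): given the text printed by
# `mysqldump --help`, return the extra option needed to silence the
# column statistics error, or an empty list if the client lacks it.
sub extra_mysqldump_options {
    my ($help_text) = @_;
    # Assumption: clients that support the option mention it in their help text.
    return $help_text =~ /--column-statistics/ ? ('--column-statistics=0') : ();
}

# Ask the installed client; fall back to an empty string if it is missing.
my $help    = `mysqldump --help 2>/dev/null` // '';
my @options = extra_mysqldump_options($help);
print "Extra mysqldump options: @options\n";
```

Keeping the decision in a function that takes the help text as an argument makes it possible to exercise both branches without having either client installed.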

See:

…mysql. It handles UTF-8 by default, and does not need mysql_enable_utf8mb4 or mysql_enable_utf8. Also mods to prevent mysqldump of mysql 8+ for reporting errors about column-statistics being missing from MariaDB databases by disabling column_statistic when the installed mysqldump command would support it.
@taniwallach taniwallach added NeedsTesting Tentatively fixed bug or implemented feature Enhancement enhances the software labels Nov 23, 2020
@taniwallach taniwallach added this to the WW 2.16 milestone Nov 23, 2020
@taniwallach
Member Author

I have tested:

The same tests need to also be run on a server which is still using the DBD::mysql driver.

@taniwallach
Member Author

I can report that in over 3 months of using DBD::MariaDB in production on a smaller server, I am no longer seeing aborted connections due to (Got an error reading communication packets).

I recently reset my main production server, but before I reset it I had checked for this issue and there was not any noticeable issue.

I do still see some "rare" aborted connections, but only those for (Got timeout reading communication packets) which occur once the wait_timeout "session" timeout is reached. On the server which has not been reset recently, there are 35 such warnings over a 5 week period, where MariaDB has wait_timeout = 86400 seconds (= 24 hours).

Over these 5 weeks there is only one other warning, which is triggered at Docker startup time:

[Warning] Aborted connection 8 to db: 'unconnected' user: 'unauthenticated' host: '172.30.0.3' (This connection closed normally without authentication)

Thus, at least in my setting (MariaDB 10.4 in a Docker container on the same host as WeBWorK in Docker), the dropped connection issue is no longer occurring, while it was happening with the "older" DBD::mysql driver. That driver is not being very actively maintained, and there are several reports of issues with it. (That motivated the creation of the forked driver.)

The number of seconds the server waits for activity on a connection before closing it.
from https://mariadb.com/docs/reference/mdb/system-variables/wait_timeout/

Sponsor Member

@drgrice1 drgrice1 left a comment

This looks good and passed all my tests, but you should add comments above the $database_dsn variable in site.conf explaining how to use this and describing the format that variable should be in.

@dlglin
Member

dlglin commented Feb 26, 2021

If we are looking at migrating to DBD::MariaDB then we also need to include this as a required perl module. It is not currently packaged in RedHat 7, so I had to install it from CPAN on my test server. This should be documented somewhere, and probably included in check_modules.pl.

@drgrice1
Sponsor Member

I am not sure that we should include DBD::MariaDB in check_modules.pl. The way this pull request is set up, the system administrator is given the choice to use DBD::MariaDB or DBD::mysql. So what we need is a mechanism to have check_modules.pl check that one of DBD::MariaDB or DBD::mysql is installed.
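One way such an "at least one of" check could be sketched is shown below. The helper name `first_available_module` is hypothetical, and this is not how check_modules.pl is actually structured; it only illustrates the idea of trying each driver in turn.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first module in the list that can be
# loaded, or nothing if none of them is installed.
sub first_available_module {
    my @candidates = @_;
    for my $module (@candidates) {
        # Convert Foo::Bar to Foo/Bar.pm so require accepts a runtime name.
        (my $file = "$module.pm") =~ s{::}{/}g;
        return $module if eval { require $file; 1 };
    }
    return;
}

my $driver = first_available_module('DBD::MariaDB', 'DBD::mysql');
print defined $driver
    ? "Found database driver: $driver\n"
    : "Neither DBD::MariaDB nor DBD::mysql is installed\n";
```

The order of the candidate list decides which driver is reported when both are present; which one should be preferred is a policy choice this sketch does not settle.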

@pstaabp
Sponsor Member

pstaabp commented Feb 28, 2021

I haven't been successful running these. On anything that hits the database I get an error; for example, for dump-OPL-tables.pl:

mysqldump: Got error: 2002: "Can't connect to MySQL server on 'db' (36)" when trying to connect
OPL database dump created: XXX/libraries/webwork-open-problem-library//TABLE-DUMP/OPL-tables.sql

I'm running the following version of mariadb:

mysql  Ver 10.19 Distrib 10.5.9-MariaDB, for osx10.16 (x86_64)

The only change I made was in site.conf to set

$database_dsn = "DBI:MariaDB:database=webwork;host=db;port=3306";

and then did

$ENABLE_UTF8MB4 = 0;

I'm sure I have some other setting wrong, but can't figure that out right now. Fortunately, changing it back worked fine.

I think overall, if we go with this, we need to do a bunch of documentation to allow people to try it out. This is probably fine to include as long as it doesn't break old things.

@drgrice1
Sponsor Member

@pstaabp: Most likely you need to change the line $database_dsn = "DBI:MariaDB:database=webwork;host=db;port=3306"; to $database_dsn = "DBI:MariaDB:database=webwork";

That is, unless you have your database set up to be accessed via a different host. In the docker setup the host is db.

@pstaabp
Sponsor Member

pstaabp commented Feb 28, 2021

@drgrice1 That did it and I got everything running above, however, I also tested OPL-update and got the error:

DBI connect('database=webwork','webworkWrite',...) failed: Unknown attribute mysql_enable_utf8 at /Users/pstaab/code/ww-docker/webwork2/bin/OPL-update line 100.

Looks like the database handler might need to be altered for MariaDB in this file.
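A sketch of the kind of adjustment this points at, assuming the connect attributes are chosen from the driver named in the DSN. The helper `connect_attributes` is hypothetical and is not the code in bin/OPL-update; only the attribute names mysql_enable_utf8 and mysql_enable_utf8mb4 come from the discussion.

```perl
use strict;
use warnings;

# Hypothetical helper: choose DBI connect attributes for the driver named
# in the DSN. The mysql_enable_* attributes are passed only to DBD::mysql,
# since DBD::MariaDB rejects them as unknown attributes.
sub connect_attributes {
    my ($dsn, $enable_utf8mb4) = @_;
    my %attr = (PrintError => 0, RaiseError => 1);
    # The "dbi" prefix is case-insensitive; the driver name is not.
    my ($driver) = $dsn =~ /^dbi:([^:]+):/i;
    if (defined $driver && $driver eq 'mysql') {
        if ($enable_utf8mb4) {
            $attr{mysql_enable_utf8mb4} = 1;
        } else {
            $attr{mysql_enable_utf8} = 1;
        }
    }
    return \%attr;
}

my $attr = connect_attributes('DBI:MariaDB:database=webwork', 1);
print join(', ', sort keys %$attr), "\n";
```

The returned hash reference would be passed as the fourth argument to DBI->connect.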

@drgrice1
Sponsor Member

@pstaabp: You are correct. That is an oversight.

@taniwallach: See @pstaabp's comment. You missed a case with the mysql_enable_utf8mb4 settings.

@taniwallach taniwallach self-assigned this Mar 2, 2021
@taniwallach taniwallach added the Do Not Merge Yet PR to allow others to inspect -- not ready for prime time label Mar 2, 2021
@taniwallach
Member Author

@pstaabp - Thanks for noticing the omission. I think the second commit fixes it.

I tested only so far as making sure that a bin/OPL-update run would start working with the change using both drivers.

I ran under Docker once with each of the options:

  • WEBWORK_DB_DSN: DBI:mysql:webwork:db:3306
  • WEBWORK_DB_DSN: DBI:MariaDB:database=webwork;host=db;port=3306

and in both cases the code started to report processing PG files.
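For reference, the two DSN strings above can be produced from the same three settings. The sketch below uses a hypothetical `build_dsn` helper; it reproduces only the two formats tested here and is not the logic in site.conf.

```perl
use strict;
use warnings;

# Hypothetical helper: build a DSN in the format each driver was tested with.
sub build_dsn {
    my ($driver, $name, $host, $port) = @_;
    # Positional form used with DBD::mysql in the test above.
    return "DBI:mysql:$name:$host:$port" if $driver eq 'mysql';
    # Keyed form used with DBD::MariaDB in the test above.
    return "DBI:MariaDB:database=$name;host=$host;port=$port" if $driver eq 'MariaDB';
    die "Unsupported database driver: $driver\n";
}

print build_dsn($_, 'webwork', 'db', 3306), "\n" for qw(mysql MariaDB);
```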

(I used bin/restore-OPL-tables.pl to quickly restore the tables.)

@pstaabp and @drgrice1 - If one of you would please do a quick test of OPL-update and remove the Do Not Merge label...

@taniwallach
Member Author

you should add comments above the $database_dsn variable in site.conf explaining how to use this and describing the format that variable should be in.

Being added.

@drgrice1
Sponsor Member

... you should add comments above the $database_dsn variable in site.conf explaining how to use this and describing the format that variable should be in.

Don't forget this.

@drgrice1
Sponsor Member

Cross posting!

… 2 drivers. Change a default switch setting for pdflatex.
@taniwallach
Member Author

Cross posting!

My fault, I should have recalled the second issue before pushing the changes to my branch.

Note also the pdflatex default setting change.

Sponsor Member

@drgrice1 drgrice1 left a comment

I think the documentation you added is decent. You might make a comment about case. Your listed formats have lower case dbi, and the examples are upper case. Perhaps point out where case matters (I believe this is only the database name webwork). Or just make them all consistent to prevent confusion on this matter.

Edit: Case also matters for MariaDB.
Edit2: Actually it seems the only place where case doesn't matter is for dbi. So probably just make the case consistent.

I think this pull request looks good other than that.

@taniwallach
Member Author

Maybe see if a symbolic link from /tmp/mysqld.sock to /var/run/mysqld/mysqld.sock will work.

This doesn't work. I don't think file system links work for sockets.

Sorry, I was not sure, but it was worth trying.

Setting the DSN on the machine in a custom manner might be the simplest solution for now.

I also don't know what you mean by this. This seems unrelated to the issue.

Try something like:

$database_dsn = "DBI:$database_driver:database=$database_name"
     . ';mariadb_socket=/var/run/mysqld/mysqld.sock';

as a manual setting after the regular block in site.conf.

I know you wanted to avoid this, and for Ubuntu 20.04 it apparently is not needed, but it should hopefully work.

@drgrice1
Sponsor Member

Oh, you meant the database DSN. I was thinking the general network DSN. Sorry.

Yeah, that works for general access, but fails when creating a course archive (and probably restoring one). It messes up the parsing of the database_dsn string that is done in the webwork code. The apache2 error.log shows messages about an unknown host mariadb_socket=/var/run/mysqld/mysqld.sock.

@drgrice1
Sponsor Member

Wait, this is a problem with this pull request. Course archiving is broken now in general. Even with the mysql driver. I am getting errors like:
Warning: Failed to dump table course_id_setting' with command '2>&1 /usr/bin/mysqldump --defaults-file=/tmp/NE7XaluFC1 webwork course_id_setting > /opt/webwork/courses/course_id/DATA/mysqldump/setting.sql' (exit=2 signal=0 core=0): mysqldump: Got error: 2005: "Unknown MySQL server host 'mysql' (-3)" when trying to connect

@drgrice1
Sponsor Member

This happens on Ubuntu 20.04 also.

@drgrice1
Sponsor Member

Anytime you are using the mysql DBD driver course archiving is broken. The MariaDB driver works with this on Ubuntu 20.04.

@taniwallach
Member Author

Oh, you meant the database DSN. I was thinking the general network DSN. Sorry.

Yeah, that works for general access, but fails when creating a course archive (and probably restoring one). It messes up the parsing of the database_dsn string that is done in the webwork code. The apache2 error.log shows messages about an unknown host mariadb_socket=/var/run/mysqld/mysqld.sock.

Wait, this is a problem with this pull request. Course archiving is broken now in general. Even with the mysql driver. I am getting errors like:
Warning: Failed to dump table course_id_setting' with command '2>&1 /usr/bin/mysqldump --defaults-file=/tmp/NE7XaluFC1 webwork course_id_setting > /opt/webwork/courses/course_id/DATA/mysqldump/setting.sql' (exit=2 signal=0 core=0): mysqldump: Got error: 2005: "Unknown MySQL server host 'mysql' (-3)" when trying to connect

I have archived courses with older versions of the code from this PR, but using a "remote" database where host and port are both set.

The changes in the PR were supposed to replace most of the cases where the database DSN is being parsed, but there are 2 places where that remained necessary.

  • lib/WeBWorK/DB/Schema/NewSQL/Std.pm
  • lib/WeBWorK/Utils/CourseManagement/sql_single.pm

where new code manually does it for the new driver. Now that we have additional DSN formats (for socket use on localhost), we have a problem which was not anticipated. The older code used DBD::mysql->_OdbcParse(), which also was a hack and did not address local socket access to the database.

I'm checking now with the most recent code.

@drgrice1
Sponsor Member

Note that this just generally fails. I was testing without the mariadb_socket option added in, but with the mysql driver enabled.

@taniwallach
Member Author

taniwallach commented Mar 24, 2021

Note that this just generally fails. I was testing without the mariadb_socket option added in, but with the mysql driver enabled.

I can archive courses (in Docker WW) with either setting of WEBWORK_DB_DRIVER:

  • WEBWORK_DB_DRIVER=mysql
  • WEBWORK_DB_DRIVER=MariaDB

when using

WEBWORK_DB_HOST=db
WEBWORK_DB_USER=webworkWrite
WEBWORK_DB_PORT=3306
WEBWORK_DB_PASSWORD=the real secret password
WEBWORK_DB_NAME=webwork

I think the problem is socket specific, and we will need to fix it by adjusting the special code in

  • lib/WeBWorK/DB/Schema/NewSQL/Std.pm
  • lib/WeBWorK/Utils/CourseManagement/sql_single.pm

to handle alternate formats of the DSN (without host= and port=).

Did archiving work via socket with the older code? It also was designed to handle only host and port.

@taniwallach
Member Author

Note the old code in those files had:

	# this is an internal function which we probably shouldn't be using here
	# but it's quick and gets us what we want (FIXME what about sockets, etc?)

so sockets were not covered before either.

@drgrice1
Sponsor Member

I am not using docker. Sockets aren't the problem. I say again, I am not using the socket option.

@drgrice1
Sponsor Member

This all worked with the old code without the socket option, and still should.

@drgrice1
Sponsor Member

It is definitely this pull request that caused this. If I revert back to before this pull request, course archiving works as it should.

@taniwallach
Member Author

It is definitely this pull request that caused this. If I revert back to before this pull request, course archiving works as it should.

If you revert - getting this code back in will be a mess.

Please test on your setup with a network based database DSN setting and look at what changed in the code in the 2 files I just mentioned.

I looked at /usr/lib/x86_64-linux-gnu/perl5/5.28/DBD/mysql.pm on my machine to see what _OdbcParse() does...

@drgrice1
Sponsor Member

That is not what I am saying. I am saying that if I locally revert to before this pull request, then archiving works. I am also saying that this is a big problem that needs to be fixed, or it is possible that this will need to be reverted.

@taniwallach
Member Author

I agree it needs to be fixed, and I have dropped other work to work on this now.

We need to understand how https://github.com/openwebwork/webwork2/blob/WeBWorK-2.16/lib/WeBWorK/DB/Schema/NewSQL/Std.pm is broken now...

At present, the only hint I find of what is wrong is in

   mysqldump: Got error: 2005: "Unknown MySQL server host 'mysql' (-3)" when trying to connect

from the error message you posted.

I think it would help for you to see what is being placed in the temporary config file created by the archiving process in both settings, as well as the command line being executed.

Commenting out the

    $my_cnf->unlink_on_destroy(1);

in sub _get_db_info might help grab the file, and adding something to output $dump_cmd to a file will help save that.

@drgrice1
Sponsor Member

I am not sure what has changed, but before this pull request _OdbcParse did not set $dsn{host} or $dsn{port} with the dsn dbi:mysql:webwork, and now (hacking in that same dsn) it sets $dsn{host}=mysql, and $dsn{port}=webwork.

@drgrice1
Sponsor Member

If you change line 285 of Std.pm to

DBD::mysql->_OdbcParse($dsn, \%dsn, ['dbi', 'driver', 'database', 'host', 'port']);

archiving works again with the default settings in site.conf as you have them with the exception of $database_driver="mysql".

@drgrice1
Sponsor Member

Another thing that works is to just use the code you have above for dbi:mariadb also for dbi:mysql.

@drgrice1
Sponsor Member

By the way, this also works (either way) if the mariadb_socket is added into the dsn.

@taniwallach
Member Author

Another thing that works is to just use the code you have above for dbi:mariadb also for dbi:mysql.

Then maybe that is the best approach. We need to make a call on what seems the best solution.

I did not want to change what was being done for the old driver, as it has case-based handling for different ways of writing the database DSN.

Until the middle of this PR's review process, the DSN was set directly and then parsed in many places. Some (archive/unarchive courses) used the "old" driver's _OdbcParse() routine, which allows host= or hostname= and dbname= or db=. (The code is below.) In order to avoid changing existing behavior unnecessarily, and with little interest in maintaining more parsing code, I figured leaving the old approach for the old driver was safest. It turns out that this is a mistake for localhost databases in some cases.

I'll happily experiment later or tomorrow with a revised file via Docker and using both driver options. However, I'm not set up to test a localhost DB.


If you change line 285 of Std.pm to
...

That line should not be active for the "new" driver, but only when $database_driver="mysql".

I can't really figure out how to see what is happening on your machine. I think a detailed "trace" of the relevant variables and the values they get along the process may help.

Could you post the code of DBD::mysql's internal _OdbcParse function on your box?

I have /usr/lib/x86_64-linux-gnu/perl5/5.30/DBD/mysql.pm in the Docker container, and it has:

sub _OdbcParse($$$) {
    my($class, $dsn, $hash, $args) = @_;
    my($var, $val);
    if (!defined($dsn)) {
	return;
    }
    while (length($dsn)) {
	if ($dsn =~ /([^:;]*\[.*]|[^:;]*)[:;](.*)/) {
	    $val = $1;
	    $dsn = $2;
	    $val =~ s/\[|]//g; # Remove [] if present, the rest of the code prefers plain IPv6 addresses
	} else {
	    $val = $dsn;
	    $dsn = '';
	}
	if ($val =~ /([^=]*)=(.*)/) {
	    $var = $1;
	    $val = $2;
	    if ($var eq 'hostname'  ||  $var eq 'host') {
		$hash->{'host'} = $val;
	    } elsif ($var eq 'db'  ||  $var eq 'dbname') {
		$hash->{'database'} = $val;
	    } else {
		$hash->{$var} = $val;
	    }
	} else {
	    foreach $var (@$args) {
		if (!defined($hash->{$var})) {
		    $hash->{$var} = $val;
		    last;
		}
	    }
	}
    }
}

@taniwallach
Member Author

taniwallach commented Mar 24, 2021

I think I see the issue - the new code uses database= in the DSN, but the _OdbcParse parser allows db= or dbname= but not database=.

Since we are now building the database DSN by "our standard" in conf/site.conf.dist and expect sites to adopt that standard in their site.conf, I think the best solution is to drop the cases, and just use the local "parser" of the DSN in the 2 relevant files.

Install instructions and release notes should make it very clear that modifying the DSN too much could break archiving.

Maybe we should rewrite the "parser" to allow multiple orders and skipped parameters, now that we are using the format with explicit

  • database=
  • host=
  • port=

so we no longer need to implicitly assume a fixed order.
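A rough sketch of what an order-independent parser for the keyed format could look like. The function name `parse_keyed_dsn` is hypothetical, and this is not the code that ended up in Std.pm or sql_single.pm; it only illustrates parsing by key rather than by position.

```perl
use strict;
use warnings;

# Hypothetical parser for DSNs of the form dbi:DRIVER:key=value;key=value.
# Keys may appear in any order and any of them may be omitted.
sub parse_keyed_dsn {
    my ($dsn) = @_;
    # The "dbi" prefix is matched case-insensitively; the driver is kept as written.
    my ($driver, $rest) = $dsn =~ /^dbi:([^:]+):(.*)$/i or return;
    my %info = (driver => $driver);
    for my $pair (split /;/, $rest) {
        my ($key, $value) = split /=/, $pair, 2;
        next unless defined $value;    # skip anything that is not key=value
        $info{$key} = $value;
    }
    return \%info;
}

my $info = parse_keyed_dsn('DBI:MariaDB:port=3306;database=webwork');
print join(', ', map { "$_=$info->{$_}" } sort keys %$info), "\n";
```

With this shape, extra keys such as mariadb_socket are simply carried along in the hash instead of being mistaken for a host, and a caller can test for the presence of host or port before writing them to the mysqldump config file.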

@taniwallach
Member Author

A better fix, but possibly a very complicated one, would be to get the "raw" settings from site.conf somewhere the database code can access directly, and drop the special DSN parsing in these 2 files.

@drgrice1
Sponsor Member

I found it. The difference is what is on line 274 in Std.pm before this pull request, and what you changed that to on line 281. It was a substitution that stripped off the "dbi:mysql" before the pull request. Now it is a match that does not strip off that part. In sql_single.pm you left it as a substitution. So if you change that back, you get the old behavior back and it works.
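The effect can be reproduced in isolation. The sketch below uses `parse_positional`, a simplified stand-in written for this illustration (it is not DBD::mysql's _OdbcParse, and the exact regular expressions in Std.pm are assumed), to show how leaving the prefix in place shifts every positional value.

```perl
use strict;
use warnings;

# Simplified stand-in for the positional behaviour of _OdbcParse:
# key=value parts are stored by key, bare parts fill the first unset name.
sub parse_positional {
    my ($dsn, $names) = @_;
    my %out;
    for my $part (split /[:;]/, $dsn) {
        if ($part =~ /^([^=]*)=(.*)$/) {
            $out{$1} = $2;
        } else {
            for my $name (@$names) {
                next if defined $out{$name};
                $out{$name} = $part;
                last;
            }
        }
    }
    return \%out;
}

my @names = ('database', 'host', 'port');

# Substitution: the prefix is removed before parsing.
(my $stripped = 'dbi:mysql:webwork') =~ s/^dbi:mysql://i;
my $old = parse_positional($stripped, \@names);

# Match only: the DSN is checked but left unchanged.
my $unstripped = 'dbi:mysql:webwork';
die "unexpected DSN\n" unless $unstripped =~ /^dbi:mysql:/i;
my $new = parse_positional($unstripped, \@names);

for my $result ($old, $new) {
    print join(', ', map { "$_=$result->{$_}" } sort keys %$result), "\n";
}
```

In the unstripped case the three bare parts land in database, host and port in that order, which is consistent with the host=mysql and port=webwork values reported earlier in this thread.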

@drgrice1
Sponsor Member

I think that it would be best to do as you suggest, and just use the new code for both drivers and both files.

@taniwallach
Member Author

I think that it would be best to do as you suggest, and just use the new code for both drivers and both files.

I will try to make a PR for this later today or tomorrow, unless you have time to work on it.

drgrice1 added a commit that referenced this pull request Mar 26, 2021
Merge after #1160 - Docker - update to using Ubuntu 20.04 + minor changes related to recent PRs
@taniwallach
Member Author

I found it. The difference is what is on line 274 in Std.pm before this pull request, and what you changed that to on line 281. It was a substitution that stripped off the "dbi:mysql" before the pull request. Now it is a match that does not strip off that part. In sql_single.pm you left it as a substitution. So if you change that back, you get the old behavior back and it works.

That seems to be the root bug... But as I posted in #1284 (comment) I did not manage to trigger the issue...

@taniwallach taniwallach deleted the dbd-mariadb-driver-support branch June 1, 2021 20:54