Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mariadb_use_result is broken #173

Closed
rovo89 opened this issue Nov 25, 2022 · 13 comments · Fixed by #181
Closed

mariadb_use_result is broken #173

rovo89 opened this issue Nov 25, 2022 · 13 comments · Fixed by #181

Comments

@rovo89
Copy link

rovo89 commented Nov 25, 2022

The following example displays Active: no and has no results:

my $dbh = DBI->connect(...);
my $sth = $dbh->prepare('SELECT id FROM sometable');
$sth->{'mariadb_use_result'} = 1;
$sth->execute();
print 'Active: ', $sth->{'Active'} ? 'yes' : 'no', "\n";
while (my $id = $sth->fetchrow_array()) {
  print $id, "\n";
}
$dbh->disconnect()

It works fine without mariadb_use_result. It works fine with DBD::mysql, both with and without mysql_use_result.

It might be related to 4f47a06. The execute() call returns 0E0 rows, but with mariadb_use_results this doesn't mean that there are no results.

Trace:

    DBI::db=HASH(0x13cffc0) trace level set to 0x0/9 (DBI @ 0x0/0) in DBI 1.642-ithread (pid 7788)
    -> prepare for DBD::MariaDB::db (DBI::db=HASH(0x13cff30)~0x13cffc0 'SELECT id FROM sometable') thr#9fb260
    >> func        DISPATCH (DBI::db=HASH(0x13cffc0) rc1/2 @2 g2 ima6 pid#7788) at /perlpath/DBD/MariaDB.pm line 180 via  at test.pl line 26
1   -> _async_check for DBD::MariaDB::db (DBI::db=HASH(0x13cffc0)~INNER) thr#9fb260
1   <- _async_check= ( 1 ) [1 items] at /perlpath/DBD/MariaDB.pm line 180 via  at test.pl line 26
    New 'DBI::st' (for DBD::MariaDB::st, parent=DBI::db=HASH(0x13cffc0), id=undef)
    dbih_setup_handle(DBI::st=HASH(0x13cfa08)=>DBI::st=HASH(0xb00bc0), DBD::MariaDB::st, 11fb088, Null!)
    dbih_make_com(DBI::db=HASH(0x13cffc0), feb8f0, DBD::MariaDB::st, 464, 0) thr#9fb260
    dbih_setup_attrib(DBI::st=HASH(0xb00bc0), Err, DBI::db=HASH(0x13cffc0)) SCALAR(0x11faf98) (already defined)
    dbih_setup_attrib(DBI::st=HASH(0xb00bc0), State, DBI::db=HASH(0x13cffc0)) SCALAR(0x11faff8) (already defined)
    dbih_setup_attrib(DBI::st=HASH(0xb00bc0), Errstr, DBI::db=HASH(0x13cffc0)) SCALAR(0x11fafc8) (already defined)
    dbih_setup_attrib(DBI::st=HASH(0xb00bc0), TraceLevel, DBI::db=HASH(0x13cffc0)) 9 (already defined)
    dbih_setup_attrib(DBI::st=HASH(0xb00bc0), FetchHashKeyName, DBI::db=HASH(0x13cffc0)) 'NAME_uc' (already defined)
    dbih_setup_attrib(DBI::st=HASH(0xb00bc0), HandleSetErr, DBI::db=HASH(0x13cffc0)) undef (not defined)
    dbih_setup_attrib(DBI::st=HASH(0xb00bc0), HandleError, DBI::db=HASH(0x13cffc0)) CODE(0x137f710) (already defined)
    dbih_setup_attrib(DBI::st=HASH(0xb00bc0), ReadOnly, DBI::db=HASH(0x13cffc0)) 1 (already defined)
    dbih_setup_attrib(DBI::st=HASH(0xb00bc0), Profile, DBI::db=HASH(0x13cffc0)) undef (not defined)
        -> mariadb_st_prepare_sv MYSQL_VERSION_ID 100505, SQL statement: SELECT id FROM sometable
        >- mariadb_st_free_result_sets
        <- mariadb_st_free_result_sets RC -1
        <- mariadb_st_free_result_sets
>count_params statement SELECT id FROM sometable
        <- mariadb_st_prepare_sv
    <- prepare= ( DBI::st=HASH(0x13cfa08) ) [1 items] at test.pl line 26
    -> STORE for DBD::MariaDB::st (DBI::st=HASH(0xb00bc0)~INNER 'mariadb_use_result' 1 (magic-r:p)) thr#9fb260
                -> mariadb_st_STORE_attrib for 11fb0a0, key mariadb_use_result
                <- mariadb_st_STORE_attrib for 11fb0a0, result 1
    <- STORE= ( 1 ) [1 items] at test.pl line 28
    -> execute for DBD::MariaDB::st (DBI::st=HASH(0x13cfa08)~0xb00bc0) thr#9fb260
 -> mariadb_st_execute_iv for 11fb0a0
        >- mariadb_st_free_result_sets
        <- mariadb_st_free_result_sets RC -1
        <- mariadb_st_free_result_sets
mariadb_st_internal_execute MYSQL_VERSION_ID 100505
>parse_params statement SELECT id FROM sometable
 <- mariadb_st_execute_iv returning imp_sth->row_num 0
    <- execute= ( '0E0' ) [1 items] at test.pl line 29
    -> FETCH for DBD::MariaDB::st (DBI::st=HASH(0xb00bc0)~INNER 'Active') thr#9fb260
    -> mariadb_st_FETCH_attrib for 11fb0a0, key Active
    .. FETCH DBI::st=HASH(0xb00bc0) 'Active' = ''
    <- FETCH= ( '' ) [1 items] at test.pl line 30
Active: no
    -> fetchrow_array for DBD::MariaDB::st (DBI::st=HASH(0x13cfa08)~0xb00bc0) thr#9fb260
        -> mariadb_st_fetch
    <- fetchrow_array= ( undef ) [1 items] at test.pl line 31
    -> disconnect for DBD::MariaDB::db (DBI::db=HASH(0x13cff30)~0x13cffc0) thr#9fb260
        mariadb_db_close_mysql: imp_dbh=feb8f0 pmysql=16ba410
    >> FETCH       DISPATCH (DBI::db=HASH(0x13cffc0) rc2/3 @2 g2 ima404 pid#7788) at test.pl line 34
1   <> FETCH= ( [ DBI::st=HASH(0x13cfa08) ] ) [1 items] ('ChildHandles' from cache) at test.pl line 34
    <- disconnect= ( 1 ) [1 items] at test.pl line 34
    -> DESTROY for DBD::MariaDB::st (DBI::st=HASH(0xb00bc0)~INNER) thr#9fb260

--> mariadb_st_finish

<-- mariadb_st_finish
    <- DESTROY= ( undef ) [1 items] at test.pl line 34 via  at test.pl line 34
    -> DESTROY for DBD::MariaDB::db (DBI::db=HASH(0x13cffc0)~INNER) thr#9fb260
    <- DESTROY= ( undef ) [1 items]
@pali
Copy link
Member

pali commented Nov 25, 2022

Interesting. I tried your example and it not only prints Active: no, but also fetchrow_array does return anything and moreover at the end of program perl prints additional error:

DBD::MariaDB::st DESTROY failed: Commands out of sync; you can't run this command now

As issue is related to the use_result feature, I looked into the mysql_use_result() and mysql_num_rows() documentation where is written:

https://dev.mysql.com/doc/c-api/5.7/en/mysql-use-result.html

You may not use ... , mysql_num_rows(), ... with a result returned from mysql_use_result(), nor may you issue other queries until mysql_use_result() has finished. (However, after you have fetched all the rows, mysql_num_rows() accurately returns the number of rows fetched.)

https://dev.mysql.com/doc/c-api/5.7/en/mysql-num-rows.html

If you use mysql_use_result(), mysql_num_rows() does not return the correct value until all the rows in the result set have been retrieved.

Which is interesting because DBD::MariaDB is calling mysql_num_rows() after the mysql_use_result() call and which is against what documentation says. I checked what mysql_num_rows() returns and it is zero value. Hence DBD::MariaDB things that SELECT statement does not return any row, it turn off $sth->{Active} attribute and it stops reading returned data. Commands out of sync error is then the result of stale data which DBD::MariaDB did not read.

I think that the main issue here is total absence of use_result tests. Except very simple t/rt91715.t there is no test.

Could you try following patch if it fixes issue for you?

diff --git a/dbdimp.c b/dbdimp.c
index 0f44089221f2..5853c6776dc8 100644
--- a/dbdimp.c
+++ b/dbdimp.c
@@ -4355,7 +4355,11 @@ static my_ulonglong mariadb_st_internal_execute(
           if (mysql_errno(*svsock))
             rows = -1;
           else if (*result)
+          {
             rows = mysql_num_rows(*result);
+            if (use_mysql_use_result && rows == 0)
+              rows = -2;
+          }
           else {
             rows = mysql_affected_rows(*svsock);
           }

@rovo89
Copy link
Author

rovo89 commented Nov 28, 2022

Thanks for the quick reply! Your fix works, however I find I have trouble understanding why it works and $sth->rows() always returns -2, so I looked for an alternative fix. Various things are in #174, maybe you could have a look at it. I might add a bit more tomorrow.

@pali
Copy link
Member

pali commented Nov 28, 2022

During the weekend I have looked at this issue and I implemented fix and added missing tests. I pushed code into my mariadb_use_result branch: master...pali:DBD-MariaDB:mariadb_use_result

Looks like that you figured out also other issues which I have in my branch :-)

I was planning to open a pull request with my fixes after CI testing patch is merged.

@rovo89
Copy link
Author

rovo89 commented Nov 28, 2022

Indeed, quite some similarities. ☺️ And you also fixed the bug with multiple async statements that I noticed, but didn't mention yet 😀

I hope I can find some time tomorrow to review your commits, maybe you also want to see if you can borrow a few things from my commits. I see we solved most things the same way, however from my "outsider" perspective the explicit checks for use_mysql_use_result are easier to understand than the return code -2. The docs for mysql_use_result() also mention that mysql_num_rows() will give a valid result after all rows have been fetched.

@pali
Copy link
Member

pali commented Nov 28, 2022

explicit checks for use_mysql_use_result are easier to understand than the return code -2

This does not fix the issue that $sth->rows() and $DBI::rows should return value as documented in DBI documentation (either correct value or -1).

Look at the test which I added.

And you cannot check imp_sth->use_mysql_use_result because application may change between DBI calls.

The docs for mysql_use_result() also mention that mysql_num_rows() will give a valid result after all rows have been fetched.

Yes, but this is basically unusable. You do not know number of rows, so you cannot know if you fetched all rows. What you can do is to move the next row and if it fails (you get NULL) then you know that you fetched all of them and you can call mysql_num_rows(). But at this stage $sth handle should be switched to non-active state and when you are in non-active state then application should not be interested in $sth->rows() anymore. So basically mysql API allows you to receive number of rows when nobody is interested in it, so there is no reason to call that function.

@rovo89
Copy link
Author

rovo89 commented Nov 29, 2022

This does not fix the issue that $sth->rows() and $DBI::rows should return value as documented in DBI documentation (either correct value or -1).

Well, the DBI docs also says:

Generally, you can only rely on a row count after a non-SELECT execute (for some specific operations like UPDATE and DELETE), or after fetching all the rows of a SELECT statement.

For SELECT statements, it is generally not possible to know how many rows will be returned except by fetching them all. Some drivers will return the number of rows the application has fetched so far, but others may return -1 until all rows have been fetched.

Which is exactly what I have implemented. I can imagine that the following would be useful for some scripts, e.g. when writing the results to a CSV:

my $sth = $dbh->prepare('SELECT id FROM sometable', {mariadb_use_result => 1});
$sth->execute();
while (my $id = $sth->fetchrow_array()) {
  print $id, "\n";
}
printf "Wrote %d rows\n", $sth->rows();

If you prefer -1 while fetching is in progress, how about setting the correct row count when it's safe to do so - after fetching all rows?

diff --git a/dbdimp.c b/dbdimp.c
index 91c9ad4..827f6f7 100644
--- a/dbdimp.c
+++ b/dbdimp.c
@@ -5263,6 +5263,8 @@ process:
         mariadb_dr_do_error(sth, mysql_errno(imp_dbh->pmysql),
                  mysql_error(imp_dbh->pmysql),
                  mysql_sqlstate(imp_dbh->pmysql));
+      if (imp_sth->row_num == (my_ulonglong)-2)
+        imp_sth->row_num = mysql_num_rows(imp_sth->result);
       if (!mysql_more_results(imp_dbh->pmysql))
         DBIc_ACTIVE_off(imp_sth);
       return Nullav;

Yes, but this is basically unusable.

Oh, I wasn't suggesting to call mysql_num_rows() to find out how many rows will be returned or when to stop looking. I just wanted to point out that mysql_num_rows() does also "return the number of rows the application has fetched so far", so I found it helpful to make $sth->rows() behave in the same way.

@rovo89
Copy link
Author

rovo89 commented Nov 29, 2022

Look at the test which I added.

Looks like you're covering all the combinations, great!

And you cannot check imp_sth->use_mysql_use_result because application may change between DBI calls.

Oh ok, and that is (should be) allowed while the statement is still active? I checked the source code and indeed it seems that would be possible to switch it on/off while fetching results, affecting the next result received. Which would only make sense if the application wants to handle results of multi-statement queries differently, but maybe there's an edge-case for that. Fine.

By the way, I hope you don't find my comments disrespectful - I'm new to this codebase and some circumstances aren't obvious, so I might be pressing for something that has good reasons not to be done. ;)

@pali
Copy link
Member

pali commented Nov 29, 2022

If you prefer -1 while fetching is in progress, how about setting the correct row count when it's safe to do so - after fetching all rows?

I have no objections for this feature.

Oh ok, and that is (should be) allowed while the statement is still active? I checked the source code and indeed it seems that would be possible to switch it on/off while fetching results, affecting the next result received. Which would only make sense if the application wants to handle results of multi-statement queries differently, but maybe there's an edge-case for that. Fine.

It is not only for (rather) rare multi-statement queries. It is possible to call $sth->execute() on the existing statement handle more times. Together with placeholders it is used very often. For example:

my $sth = $dbh->prepare("SELECT id FROM t WHERE name = ? AND req = ?");
$sth->execute("name1", "req1");
...
$sth->execute("name1", "req2");
...

This idiom is documented and also marked as typical call sequence in DBI documentation: https://metacpan.org/pod/DBI#Outline-Usage

By the way, I hope you don't find my comments disrespectful

Of course not :-)

@rovo89
Copy link
Author

rovo89 commented Nov 29, 2022

I have no objections for this feature.

Good. It's just those two lines, so I assume you'll put that into a commit yourself? Or shall I open a PR to your branch?
Oh, and I guess this should be covered by the tests then, to ensure that it's no longer -1.

It is possible to call $sth->execute() on the existing statement handle more times.

Sure, I do that a lot. Yeah, you're right, it's possible that someone changes the attribute while they're still fetching rows for the first $sth->execute() call. If really required, the fact that mysql_store_result() or mysql_use_result() was called could be stored separately, but I understand you don't see a need to distinguish for now.

@pali
Copy link
Member

pali commented Nov 30, 2022

See also https://dev.mysql.com/doc/c-api/5.7/en/c-api-threaded-clients.html

To use mysql_use_result(), you must ensure that no other thread uses the same connection until the result set is closed. However, it really is best for threaded clients that share the same connection to use mysql_store_result().

This applies not only for threads but also for multi-process application if it shares file descriptor with connection to server.

I think that we do not have any tests for threads. And moreover there are two kinds of thread usage in perl. First is classic standalone perl application with threads and second is application written in any other language (e.g. C) which spawns two threads and initialize two separate perl interpreters (one in every thread). Well written Perl XS module should work correctly with both scenarios. Second scenario is used for example by Apache and mod_perl. I tried to handle possible issues in DBD::MariaDB but because there is no automated test, there could be bugs (and such test is hard to write as it needs to be via man perlembed).

So mysql_use_result() just makes everything harder in multi thread or multi process scenarios.

@rovo89
Copy link
Author

rovo89 commented Nov 30, 2022

I'd definitely avoid using database/statement handles anywhere else than in the process that created it 😉

But as discussed in the other PR, DESTROY can get in your way without your knowledge. Even if you just connected to the database before forking, the exiting child process will close the connection. And if there were pending results for a multi-statement query, they will be finished (and therefore taken away from the parent process), also when you use mysql_store_result().

So yes, there are some more things to consider with mysql_use_result(), but also other approaches have their traps.

@rovo89
Copy link
Author

rovo89 commented Nov 30, 2022

Or let's just say: The application developer should carefully consider whether they want to use mariadb_use_result, but if they do, the library should handle it correctly, provided they follow the rules you mentioned.

@pali
Copy link
Member

pali commented Feb 10, 2023

I was planning to open a pull request with my fixes after CI testing patch is merged.

Sorry that it took longer time, but now both AppVeyor CI (Windows) and Github Actions CI (Linux) are implemented and enabled for all new pull requests.

So I opened a pull request #181 for mariadb_use_result attribute from my branch. Please look at it.

You can implement other features mentioned in this discussion on top of it.

@pali pali closed this as completed in 1fed8a0 Mar 6, 2023
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue Apr 2, 2024
Upstream changes:
1.23 2023-09-10
 - Add a missing break
   (perl5-dbi/DBD-MariaDB#163)
 - Signal error if mariadb_db_async_result() fails
   (perl5-dbi/DBD-MariaDB#162)
 - Update links to project website, issues and years
 - Fix compilation with some MariaDB client library 10.2 and 10.3 versions
 - Fix mariadb_use_result attribute
   (perl5-dbi/DBD-MariaDB#173)
 - Fix statements with multiple result sets in asynchronous mode
 - Fix mariadb_sockfd attribute for Windows
 - Croaks when changing AutoCommit attribute fails
   (perl5-dbi/dbi#104)
 - Various documentation and tests fixes
 - Fix support for MariaDB Connector/C prior to 3.1.3 version
 - Fix usage of Win32::GetShortPathName() in Makefile.PL
 - Build release tarball in TAR format (instead of PAX)
 - Allow to query and change mariadb_multi_statements attribute
 - Add connect option mariadb_auth_plugin for specifying auth plugin
 - Fix support for MySQL 8.0+ client library
   (perl5-dbi/DBD-MariaDB#191)
   (perl5-dbi/DBD-mysql#329)
 - Add Github Actions CI and Cirrus CI (FreeBSD) for automated testing

1.22 2022-04-22
 - Disable usage of libmysqld.a from MySQL 8.x series
 - Install README.pod into DBD/MariaDB/ subdirectory
   (perl5-dbi/DBD-MariaDB#146)
 - Do not export driver private C functions
 - Fix typo in error message
 - Fix compatibility with new MariaDB client and server versions
   (perl5-dbi/DBD-MariaDB#164)
   (perl5-dbi/DBD-MariaDB#167)
   (perl5-dbi/DBD-mysql#333)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants