Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix GH-11950: [mysqlnd] Fixed not to set CR_MALFORMED_PACKET to error if CR_SERVER_GONE_ERROR is already set #11951

Closed
wants to merge 1 commit into from

Conversation

SakiTakamachi
Copy link
Member

This is a PR to solve the issue #11950

@SakiTakamachi SakiTakamachi changed the title Fix GH-11950: Fixed not to set CR_MALFORMED_PACKET to error if CR_SERVER_GONE_ERROR is already set Fix GH-11950: [mysqlnd] Fixed not to set CR_MALFORMED_PACKET to error if CR_SERVER_GONE_ERROR is already set Aug 12, 2023
@Girgias
Copy link
Member

Girgias commented Aug 12, 2023

PHP 8.0 is out of support, please rebase onto 8.1

@kamil-tekiela
Copy link
Member

0d922aa

@SakiTakamachi
Copy link
Member Author

SakiTakamachi commented Aug 12, 2023

@kamil-tekiela

Thank you.

I know that PR.
However, even with code from the latest PHP-8.1 branch, this issue still seems to occur.

@SakiTakamachi SakiTakamachi changed the base branch from PHP-8.0 to PHP-8.1 August 12, 2023 21:58
@SakiTakamachi
Copy link
Member Author

SakiTakamachi commented Aug 12, 2023

@Girgias

PHP 8.0 is out of support, please rebase onto 8.1

Thanks, fixed it.

@kamil-tekiela
Copy link
Member

@kamil-tekiela

Thank you.

I know that PR. However, even with code from the latest PHP-8.1 branch, this issue still seems to occur.

Are you sure? When I execute your test script on both master and 8.1, I get this:

Uncaught mysqli_sql_exception: MySQL server has gone away

@SakiTakamachi
Copy link
Member Author

Hmm, this happens no matter how many times I try.

Fatal error: Uncaught mysqli_sql_exception: Malformed packet in /var/www/html/test/test.php:11

I'll check again to see if I've misconfigured something.

@SakiTakamachi
Copy link
Member Author

Although the result has not changed yet, I will describe my operating environment.

I are still investigating.

https://gist.github.com/SakiTakamachi/866c15ef2e8e333ae526a286b1428fbc

@SakiTakamachi
Copy link
Member Author

SakiTakamachi commented Aug 13, 2023

@kamil-tekiela

First of all, it looks like bytes_sent = 0; is not running in my environment....

(edit) I made the breakpoints a bit more descriptive and re-ran.

<?php

$mysqli = new mysqli("mysql", "root", "", "test"); // my env 
var_dump($mysqli->select_db("test2"));
// master branch

(gdb) break mysqlnd_read_header
Breakpoint 1 at 0x50cee7: file /var/www/html/php-src/ext/mysqlnd/mysqlnd_wireprotocol.c, line 244.
(gdb) break mysqlnd_protocol_frame_codec.c:202
Breakpoint 2 at 0x52397f: file /var/www/html/php-src/ext/mysqlnd/mysqlnd_protocol_frame_codec.c, line 202.
(gdb) break mysqlnd_protocol_frame_codec.c:203
Breakpoint 3 at 0x52398d: file /var/www/html/php-src/ext/mysqlnd/mysqlnd_protocol_frame_codec.c, line 203.
(gdb) break mysqlnd_protocol_frame_codec.c:205
Breakpoint 4 at 0x523a2d: file /var/www/html/php-src/ext/mysqlnd/mysqlnd_protocol_frame_codec.c, line 205.
(gdb) break mysqli_api.c:1620
Breakpoint 5 at 0x36c180: file /var/www/html/php-src/ext/mysqli/mysqli_api.c, line 1621.
(gdb) run ../test/test.php
Starting program: /var/www/html/php-src/sapi/cli/php ../test/test.php

Breakpoint 1, mysqlnd_read_header (pfc=0x555555989897 <php_tcp_sockop_connect+1546>, vio=0x7fffffffb700, header=0x7ffff7a05050, conn_stats=0x555555ce2340, error_info=0x33b00000000) at /var/www/html/php-src/ext/mysqlnd/mysqlnd_wireprotocol.c:244
244	{
(gdb) c
Continuing.

Breakpoint 2, mysqlnd_mysqlnd_pfc_send_pub (pfc=0x7ffff7a83140, vio=0x7ffff7a6c300, buffer=0x7fffffff9d40 "j", count=106, conn_stats=0x7ffff7a040c0, error_info=0x7ffff7a7a120) at /var/www/html/php-src/ext/mysqlnd/mysqlnd_protocol_frame_codec.c:202
202		if (bytes_sent <= 0) {
(gdb) c
Continuing.

Breakpoint 1, mysqlnd_read_header (pfc=0x8, vio=0x7ffff7a01230, header=0x0, conn_stats=0x15, error_info=0x1) at /var/www/html/php-src/ext/mysqlnd/mysqlnd_wireprotocol.c:244
244	{
(gdb) c
Continuing.

Breakpoint 2, mysqlnd_mysqlnd_pfc_send_pub (pfc=0x7ffff7a83140, vio=0x7ffff7a6c300, buffer=0x7ffff7a7f000 "", count=0, conn_stats=0x7ffff7a040c0, error_info=0x7ffff7a7a120) at /var/www/html/php-src/ext/mysqlnd/mysqlnd_protocol_frame_codec.c:202
202		if (bytes_sent <= 0) {
(gdb) c
Continuing.

Breakpoint 1, mysqlnd_read_header (pfc=0x7ffff7a77500, vio=0x7ffff7a7f000, header=0x4, conn_stats=0x0, error_info=0x55555597638b <_php_stream_write+197>) at /var/www/html/php-src/ext/mysqlnd/mysqlnd_wireprotocol.c:244
244	{
(gdb) c
Continuing.

Breakpoint 5, zif_mysqli_select_db (execute_data=0x555555a3b31c <zend_verify_internal_arg_types+220>, return_value=0x7fffffffc130) at /var/www/html/php-src/ext/mysqli/mysqli_api.c:1621
1621	{

/* -------------- mysql restart -------------- */

(gdb) c
Continuing.

Breakpoint 2, mysqlnd_mysqlnd_pfc_send_pub (pfc=0x7ffff7a83140, vio=0x7ffff7a6c300, buffer=0x7ffff7a7f000 "\006", count=6, conn_stats=0x7ffff7a040c0, error_info=0x7ffff7a7a120) at /var/www/html/php-src/ext/mysqlnd/mysqlnd_protocol_frame_codec.c:202
202		if (bytes_sent <= 0) {
(gdb) p bytes_sent
$1 = 10
(gdb) c
Continuing.

Breakpoint 1, mysqlnd_read_header (pfc=0x0, vio=0x0, header=0x0, conn_stats=0x0, error_info=0x0) at /var/www/html/php-src/ext/mysqlnd/mysqlnd_wireprotocol.c:244
244	{
(gdb) c
Continuing.

Warning: mysqli::select_db(): Error while reading INIT_DB's response packet. PID=109 in /var/www/html/test/test.php on line 4

Fatal error: Uncaught mysqli_sql_exception: Malformed packet in /var/www/html/test/test.php:4
Stack trace:
#0 /var/www/html/test/test.php(4): mysqli->select_db('test2')
#1 {main}
  thrown in /var/www/html/test/test.php on line 4
[Inferior 1 (process 109) exited with code 0377]

@SakiTakamachi
Copy link
Member Author

SakiTakamachi commented Aug 18, 2023

@kamil-tekiela
I investigated.

0d922aa
This fix works for errors on pfc->data->m.send() in mysqlnd_wireprotocol.c.

This issue is for pfc->data->m.receive().

I think the problem is that mysqlnd_read_packet_header_and_body correctly sets the error, but then the error is overwritten.

@kamil-tekiela
Copy link
Member

I still cannot reproduce this issue in my environment. I don't know what I am doing wrong. I would appreciate some step by step instructions.

If you really get the error "Error while reading INIT_DB's response packet." then it means something is seriously broken. This warning is defensive programming and we should never get there. So I don't think it's the case of overwritten error, but something more serious.

I would appreciate if you could share as much details as possible. I hope we can resolve it together.

@SakiTakamachi
Copy link
Member Author

SakiTakamachi commented Aug 19, 2023

@kamil-tekiela k

I see.

I'll keep investigating to see if I can find any clues.

I'm going to close this PR and write a continuation to the issue, but is that okay?

@SakiTakamachi
Copy link
Member Author

SakiTakamachi commented Aug 19, 2023

The error Error while reading INIT_DB's response packet. was set here.

php_error_docref(NULL, E_WARNING, "Error while reading %s's response packet. PID=%d", mysqlnd_command_to_text[command], getpid());

This if runs with condition !silent && error_info->error_no == CR_MALFORMED_PACKET.

This message is easily printed because send_command_handle_OK() was called in the previous switch and the error is overwritten there with CR_MALFORMED_PACKET.

ret = payload_decoder_factory->m.send_command_handle_OK(payload_decoder_factory, error_info, upsert_status, ignore_upsert_status, last_message);

SET_CLIENT_ERROR(error_info, CR_MALFORMED_PACKET, UNKNOWN_SQLSTATE, "Malformed packet");

@SakiTakamachi
Copy link
Member Author

To reproduce this error, restart the docker mysql container itself.
Or stop the container.

I'm trying to see if it can be reproduced with php and mysql on the host OS without using docker.

@SakiTakamachi
Copy link
Member Author

SakiTakamachi commented Aug 19, 2023

(edit) Oops, I forgot to build the local php. I'll rerun.

I have successfully reproduced it on the host OS.

repro code

// test.php

<?php

$driver = new mysqli_driver();
$driver->report_mode = MYSQLI_REPORT_ERROR|MYSQLI_REPORT_STRICT;

$mysqli = new mysqli("localhost", "root", "", "test"); // my env 

fgets(STDIN);
// Restarted mysql, then press enter

var_dump($mysqli->select_db("test2"));

my env

MacBook Pro 16-inch, 2019 (intel)

// php -v

PHP 8.2.9 (cli) (built: Aug 16 2023 21:20:30) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.2.9, Copyright (c) Zend Technologies
    with Zend OPcache v8.2.9, Copyright (c), by Zend Technologies
// mysql> select version();

+-----------+
| version() |
+-----------+
| 8.1.0     |
+-----------+

steps

  1. Open two terminals, one for php operations and one for restarting mysql.
  2. Run php test.php on the terminal for php. Without pressing anything, go to step3.
  3. Run sudo mysql.server restart on the terminal for mysql. Wait for mysql to finish restarting.
  4. Press the enter key on the terminal for php.

@kamil-tekiela
Copy link
Member

PHP 8.2.9 isn't the latest release in PHP 8.2 branch. My fix is included in 8.2.10

@SakiTakamachi
Copy link
Member Author

I forgot to build a fresh PHP on my host OS.

It could be reproduced when connecting from PHP in the Docker container to mysql on the host.
I tried building PHP8.3-dev on the host, but it certainly won't be reproducible in this case.

I'll look into it some more.

@SakiTakamachi
Copy link
Member Author

If using php of host OS, this ret will be FAIL.

ret = send_command(conn->payload_decoder_factory, COM_INIT_DB, (const zend_uchar*) db.s, db.l, FALSE,

And in gdb I get the following message: Thread 2 received signal SIGPIPE, Broken pipe

On the other hand, if using php in a docker container, the above ret will be PASS.
This causes the following line to be executed, which in turn FAIL and raises an error.

ret = send_command_handle_response(conn->payload_decoder_factory, PROT_OK_PACKET, FALSE, COM_INIT_DB, TRUE,

@kamil-tekiela
Copy link
Member

If using php of host OS, this ret will be FAIL.

ret = send_command(conn->payload_decoder_factory, COM_INIT_DB, (const zend_uchar*) db.s, db.l, FALSE,

And in gdb I get the following message: Thread 2 received signal SIGPIPE, Broken pipe
On the other hand, if using php in a docker container, the above ret will be PASS. This causes the following line to be executed, which in turn FAIL and raises an error.

ret = send_command_handle_response(conn->payload_decoder_factory, PROT_OK_PACKET, FALSE, COM_INIT_DB, TRUE,

That's exactly what my commit (0d922aa) fixed. I guess something must be wrong with your docker setup if compiling PHP 8.1 branch still produces this error. The function returned PASS because the value of sent in php_mysqlnd_cmd_write overflowed. I fixed it, so now the function returns FAIL correctly.

Please double check your docker setup. Check what the value of sent is and what the value of bytes_sent is. Let me know what you find out. If there's still something missing, we need to find a fix.

Good job on debugging though. This is a really good skill.

@SakiTakamachi
Copy link
Member Author

@kamil-tekiela

The docker write buffer results were exactly the same as when I had a normal connection.

// docker
php_sockop_write (stream=0x5555559079a8 <zend_string_release_ex+225>, buf=0x7fffffffbb80 "\240\273\377\377\377\177", count=140737348335360) at /var/www/html/php-src/main/streams/xp_socket.c:58
58	{
(gdb) 
59		php_netstream_data_t *sock = (php_netstream_data_t*)stream->abstract;
(gdb) 
63		if (!sock || sock->socket == -1) {
(gdb) 
67		if (sock->timeout.tv_sec == -1)
(gdb) 
70			ptimeout = &sock->timeout;
(gdb) 
73		didwrite = send(sock->socket, buf, XP_SOCK_BUF_SIZE(count), (sock->is_blocked && ptimeout) ? MSG_DONTWAIT : 0);
(gdb) 
__libc_send (fd=4, buf=0x7ffff7a7f000, len=10, flags=64) at ../sysdeps/unix/sysv/linux/send.c:24
24	../sysdeps/unix/sysv/linux/send.c: No such file or directory.
(gdb) 
28	in ../sysdeps/unix/sysv/linux/send.c
(gdb) 
php_sockop_write (stream=0x7ffff7a77600, buf=0x7ffff7a7f000 "\006", count=10) at /var/www/html/php-src/main/streams/xp_socket.c:75
75		if (didwrite <= 0) {
(gdb) p sock->socket
$8 = 4
(gdb) p count
$9 = 10
(gdb) p sock->is_blocked && ptimeout
$10 = 1
(gdb) p buf
$11 = 0x7ffff7a7f000 "\006"
(gdb) p didwrite
$12 = 10
// Mac
php_sockop_write (stream=0x102077900, buf=0x10207f000 "\006", count=10) at /Users/saki/develop/php-src/main/streams/xp_socket.c:59
59		php_netstream_data_t *sock = (php_netstream_data_t*)stream->abstract;
(gdb) 
63		if (!sock || sock->socket == -1) {
(gdb) 
67		if (sock->timeout.tv_sec == -1)
(gdb) 
70			ptimeout = &sock->timeout;
(gdb) 
67		if (sock->timeout.tv_sec == -1)
(gdb) 
73		didwrite = send(sock->socket, buf, XP_SOCK_BUF_SIZE(count), (sock->is_blocked && ptimeout) ? MSG_DONTWAIT : 0);
(gdb) p sock->socket
$1 = 4
(gdb) p count
$2 = 10
(gdb) p buf
$3 = 0x10207f000 "\006"
(gdb) p sock->is_blocked && ptimeout
$4 = 1
(gdb) s

Thread 2 received signal SIGPIPE, Broken pipe.
75		if (didwrite <= 0) {
(gdb) p didwrite
$5 = -1

After digging this far, I finally got an idea, but isn't this a problem with the specifications of socket communication?
read() can detect the disconnection immediately, but send() may not know that the connection has been disconnected until it is sent twice.

@SakiTakamachi
Copy link
Member Author

I think this prediction is probably correct.
When I performed a similar test with a combination of connecting to Docker's mysql from the host OS, the Malformed packet error occurred.

Perhaps the result was different because it was a combination of php and mysql of the host OS. If the services are in the same environment, the disconnection may be noticed the first time send() is executed.

@SakiTakamachi
Copy link
Member Author

@kamil-tekiela

Is there any more information you need?

@kamil-tekiela
Copy link
Member

I am busy with some other stuff at the moment. I will review it soon. I need to spend some more time to check what you are saying.

@SakiTakamachi
Copy link
Member Author

Okay, sorry for rushing you.

@adsr
Copy link
Contributor

adsr commented Oct 12, 2023

IMO the error emitted here:

SET_CLIENT_ERROR(error_info, CR_MALFORMED_PACKET, UNKNOWN_SQLSTATE, "Malformed packet");

is potentially misleading.

There's a difference between trying to read the socket and getting EOF and reading in an actually malformed packet. In both cases PACKET_READ returns FAIL. If there's metadata available at this line to determine if we hit a socket EOF (i.e., the underlying recv(2) returned 0), then IMO we should emit CR_SERVER_GONE_ERROR instead, which would fix your issue. There's already handling below that handles well-formed error packets. Ideally we would also distinguish packets that are well-formed but violate protocol (such as seeing a resultset packet in response to a COM_INIT_DB).

Looking at mysqlnd it looks there are lots of similar assumptions floating around. FAIL by itself is not enough to determine what error to emit.

@SakiTakamachi
Copy link
Member Author

@adsr

Yeah, PACKET_READ(payload_decoder_factory->conn, &ok_response) must be FAIL for the line you showed to be executed.

if (FAIL == (ret = PACKET_READ(payload_decoder_factory->conn, &ok_response))) {

And if this results in FAIL, CR_SERVER_GONE_ERROR may already be set on the line shown below.

SET_CLIENT_ERROR(error_info, CR_SERVER_GONE_ERROR, UNKNOWN_SQLSTATE, mysqlnd_server_gone);

Therefore, I changed it to check if an error has already been set when PACKET_READ(payload_decoder_factory->conn, &ok_response) == FAIL.

@adsr
Copy link
Contributor

adsr commented Oct 12, 2023

CR_SERVER_GONE_ERROR may already be set

Understood. I would bet this could happen in other places too. Rather than fixing your very specific case, could it be solved more generally in SET_CLIENT_ERROR itself? I.e., only set error_no if it's 0, and ensure mysqlnd_error_info_init is invoked at any userland entrypoint? I don't know if that's safe but at least conceptually it makes sense.

My comments re: CR_MALFORMED_PACKET still apply, but I suppose that's a separate issue.

@SakiTakamachi
Copy link
Member Author

I see, that makes sense.

As far as I could see, this was the only place where this could happen, so I was proposing a fix like this.

However, in a sense, my modification is like a monkey patch, and it may be wise to fundamentally change the mechanism to solve the problem.

@SakiTakamachi SakiTakamachi changed the base branch from PHP-8.1 to PHP-8.2 February 10, 2024 01:26
Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was able to reproduce the issue, and I see how it is caused by an error that was overwritten. This does fix the issue for me.
LGTM.

@nielsdos nielsdos linked an issue Feb 10, 2024 that may be closed by this pull request
@kamil-tekiela
Copy link
Member

@nielsdos Ok, can you please merge the PR then? As long as it fixes the issue and doesn't cause any others then let's merge it.

@nielsdos nielsdos closed this in 199e48b Feb 10, 2024
@SakiTakamachi
Copy link
Member Author

Thanks for checking and merging!

@SakiTakamachi SakiTakamachi deleted the fix/gh-11950 branch March 15, 2024 14:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

mysqlnd: error message is inappropriate in some cases
5 participants