
Mysql2::Error: This connection is still waiting for a result, try again once you have the result #772

Closed
mmozuras opened this issue Jul 12, 2016 · 12 comments

@mmozuras

mmozuras commented Jul 12, 2016

We've been experiencing this issue in production to varying degrees for the last two years:

Mysql2::Error: This connection is still waiting for a result, try again once you have the result

I've looked at previous issues (like #566, #532, #99), but didn't find anything that helps. We've also tried multiple things ourselves, but did not manage to solve it.

We're running:

mysql2 (0.4.4)
rails (4.2.5.2)
unicorn (4.9.0)

Ruby 2.2

Percona XtraDB Cluster 5.6 (based on Percona Server 5.6, which is in turn based on MySQL 5.6)

Stack traces don't show anything interesting; here's one example:

app/domain/user_msgs/for_thread.rb:35:in `user_msg_thread'
app/domain/user_msgs/for_thread.rb:27:in `among_receivers?'
app/domain/user_msgs/for_thread.rb:14:in `fetch'
app/domain/user_msgs/for_thread.rb:3:in `fetch'
app/controllers/api/v2/msg_threads_controller.rb:145:in `user_msg_thread'
app/controllers/api/v2/msg_threads_controller.rb:35:in `show'
config/initializers/1.patches/action_controller.rb:7:in `process_action'
app/controllers/api/v2/base_controller.rb:112:in `api'

We didn't notice any particular pattern to when it happens either:

[screenshot from 2016-07-12: occurrences of the error over time, with no obvious pattern]

Any help would be appreciated 😄

@dmitry-g

After some investigation, it appeared that the issue was caused by incorrect signal handling:

  1. A signal (like SIGINT or SIGQUIT) interrupted a Mysql2::Client query that was in progress.
  2. mysql_client_wrapper->active_thread was not reset to Qnil and still pointed to the current Unicorn thread.
  3. Custom signal handling didn't allow the Unicorn worker to shut down/restart, so the worker continued to handle requests.
  4. Every request that subsequently came to the worker failed, because mysql_client_wrapper->active_thread still pointed to rb_thread_current (Mysql2::Error: This connection is still waiting for a result, try again once you have the result).
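
To make this concrete, here is a minimal, hypothetical sketch of that failure mode (host, credentials, and the choice of signal are my assumptions; whether the interrupt actually lands mid-query depends on your Ruby and mysql2 versions):

require 'mysql2'

# A custom handler in the style described above: it raises out of whatever
# the worker is doing instead of letting the process shut down.
trap('QUIT') { raise 'custom shutdown handler fired' }

conn = Mysql2::Client.new(host: 'localhost', username: 'root') # assumed credentials

begin
  # Long-running query; send SIGQUIT to this process while it is waiting.
  conn.query('SELECT SLEEP(60)')
rescue RuntimeError => e
  puts "query interrupted: #{e.message}"
end

# If the interrupt landed while the client was waiting on the server,
# active_thread still points at this thread, and the next query fails with
# "This connection is still waiting for a result".
conn.query('SELECT 1')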

@richardlarocque

I ran into another, closely related variant of this issue. It can be triggered when someone uses Thread#kill to interrupt a long-running query. Here's a small repro:

#!/usr/bin/env ruby

require 'mysql2'

def long_query(connection)
  begin
    connection.query('SELECT SLEEP(1000);')
  rescue StandardError => e
    puts "Caught #{e}"
  ensure
    puts "Ensured"
  end
  puts 'done'
end

conn = Mysql2::Client.new(:host => "localhost",
                          :username => "root",
                          :password => 'p4ssw0rd')

t = Thread.new { long_query(conn) }
sleep 1
puts "Killing #{t}, #{t.status}"
t.kill
sleep 1
puts "Killed #{t}, #{t.status}"

conn.query('SELECT 1;')

On my machine, the output is:

Killing #<Thread:0x000000022f7ef0>, sleep
Ensured
Killed #<Thread:0x000000022f7ef0>, false
/var/lib/gems/2.3.0/gems/mysql2-0.4.4/lib/mysql2/client.rb:107:in `_query': This connection is in use by: #<Thread:0x000000022f7ef0@./main.rb:20 dead> (Mysql2::Error)
        from /var/lib/gems/2.3.0/gems/mysql2-0.4.4/lib/mysql2/client.rb:107:in `block in query'
        from /var/lib/gems/2.3.0/gems/mysql2-0.4.4/lib/mysql2/client.rb:106:in `handle_interrupt'
        from /var/lib/gems/2.3.0/gems/mysql2-0.4.4/lib/mysql2/client.rb:106:in `query'
        from ./main.rb:27:in `<main>'

That is, the killed thread is still marked as actively using the connection, even after it has died.


This variant is a bit worse than the ones caused by signals or remotely triggered exceptions, because there is no reasonable way to mask Thread#kill. The handle_interrupt trick from https://github.com/brianmario/mysql2/pull/731/files won't work here.
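
For reference, here is a rough sketch of the kind of masking that trick uses, based on the handle_interrupt frame visible in the backtrace above (the exact error class the gem masks is an assumption on my part):

require 'mysql2'
require 'timeout'

# Sketch only: defer an asynchronous Timeout::Error (delivered via
# Thread#raise) until the underlying query has returned, so the client is
# never left mid-query with its "active thread" bookkeeping dangling.
def masked_query(client, sql)
  Thread.handle_interrupt(Timeout::Error => :never) do
    client.query(sql)
  end
end

# As noted above, this approach does not protect against Thread#kill.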


I imagine this has been discussed before, but I haven't found any reference to it. Would it be reasonable to try to solve this with an exception handler in the C extension code? Exception handlers should still get run in the Thread#kill case. Would it be possible to register a handler that resets the connection state when an exception is raised while waiting for a response to a query?
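
To make the question concrete, here is a Ruby-level approximation of that idea (the real fix would have to live in the C extension; this sketch only makes sure a handle that may have been interrupted mid-query is discarded rather than reused):

require 'mysql2'

# Run a query on a thread that might be killed. The ensure block does run on
# Thread#kill (the "Ensured" line in the repro output above shows this), so
# the possibly-wedged handle is closed instead of being handed back to a pool.
def run_query_on_disposable_thread(connect_opts, sql)
  Thread.new do
    client = Mysql2::Client.new(connect_opts)
    begin
      client.query(sql)
    ensure
      client.close
    end
  end
end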

@sodabrew
Collaborator

sodabrew commented Oct 5, 2016

@richardlarocque I cannot think of any reasonable workaround for a Thread#kill. At best, the next time you try to use the connection it can say, "The handle is marked in-use by a dead thread, so you need to reconnect."
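
A hedged sketch of that recovery path, to illustrate (this is my own sketch, not current gem behaviour; the error-message match and the reconnect options are assumptions):

require 'mysql2'

# If the handle is marked in-use by a dead thread, drop it, reconnect, and
# retry once. Returns [client, result] so the caller keeps the fresh handle.
def query_with_dead_thread_recovery(client, sql, connect_opts)
  [client, client.query(sql)]
rescue Mysql2::Error => e
  raise unless e.message =~ /connection is in use by: .* dead/
  client.close rescue nil                   # discard the wedged handle
  fresh = Mysql2::Client.new(connect_opts)  # connect options are assumed
  [fresh, fresh.query(sql)]
end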

@curtp

curtp commented Nov 18, 2016

For those who serve their app with Passenger and use a script to kill Passenger processes: check your script. We run a script via cron to kill processes that consume too much RAM. Thanks to the research by @dmitry-g, I looked into our script and was able to improve it to remove the use of kill. Passenger 5.0 ships with the passenger-config command, which can gracefully shut down Passenger processes. Since we switched to it, this exception went away. Here is the script we're using now, for those who are interested (thanks to whoever came up with it, it has been very helpful):

for i in $(/usr/local/rvm/bin/rvm-shell -c "passenger-memory-stats" | grep RubyApp | awk '{if ($4 > 299 && $7 == "RubyApp:") print $1}'); do passenger-config detach-process $i; done

@sodabrew sodabrew modified the milestones: 0.4.5, 0.4.6 Nov 18, 2016
@sodabrew
Collaborator

sodabrew commented Jan 3, 2017

It looks like #811 would provide a fix for this problem.

@mmozuras
Author

mmozuras commented May 2, 2017

Thanks for the help, everyone! We no longer encounter this issue. 👍

@mmozuras mmozuras closed this as completed May 2, 2017
@eredi93

eredi93 commented May 16, 2017

This happened again; we are running the latest gem. Here is the log:

» 15 May 2017 17:10:35.192 i-07a3c182330826af5 xxxxxxx-rails:ActiveRecord::StatementInvalid (Mysql2::Error: This connection is still waiting for a result, try again once you have the result: SELECT `xxxxxx`.* FROM `xxxxxxxx` WHERE `xxxxxx`.`type` IN ('xxxxxxx', 'xxxxxxxx') AND `xxxxxxx`.`id` = 111111111 ORDER BY `xxxxxxxx`.`id` ASC LIMIT 1 /*application:IntercomApp,line:/app/controllers/intercom_base_controller.rb:28:in `append_info_to_payload'*/):
» 15 May 2017 17:10:35.192  i-07a3c182330826af5 xxxxxxx: config/initializers/stacktrace.rb:10:in `block in instrument'
» 15 May 2017 17:10:35.192 i-07a3c182330826af5 xxxxxxx: config/initializers/stacktrace.rb:8:in `instrument'
» 15 May 2017 17:10:35.192  i-07a3c182330826af5 xxxxxxx: config/initializers/mysql2_exception_bucket.rb:4:in `execute'
» 15 May 2017 17:10:35.193 i-07a3c182330826af5 xxxxxxx: config/initializers/mysql2_readonly_reconnect.rb:4:in `execute'
» 15 May 2017 17:10:35.193  i-07a3c182330826af5 xxxxxxx: app/controllers/xxxxxxx_base_controller.rb:28:in `append_info_to_payload'
» 15 May 2017 17:10:35.193  i-07a3c182330826af5 xxxxxxx: app/lib/xxxxxxx/rack/version_header.rb:10:in `call'
» 15 May 2017 17:10:35.193  i-07a3c182330826af5 xxxxxxx: app/lib/xxxxxxx/rack/version_metrics.rb:9:in `call'
» 15 May 2017 17:10:35.193  i-07a3c182330826af5 xxxxxxx: app/lib/xxxxxxx/rack/bad_request_for_broken_json.rb:11:in `call'
» 15 May 2017 17:10:35.193 i-07a3c182330826af5 xxxxxxx: app/lib/xxxxxxxx/rack/character_encoding_coercion.rb:20:in `call'

Can you help me debug this and see why it is still happening? @sodabrew @darnaut

@sodabrew
Collaborator

What are your initializers, mysql2_exception_bucket.rb and mysql2_readonly_reconnect.rb?

@austinhalpin

mysql2_exception_bucket.rb is basically just a catch-all for mysql2 exceptions: we're recording metrics whenever we see certain MySQL exceptions, for alerting purposes.

Example:

module ActiveRecord
  module Mysql2ExceptionBucket
    def execute(*)
      super
    rescue Mysql2::Error, ActiveRecord::StatementInvalid => e
      if e.message =~ /This connection is still waiting for a result/
        record_metric
      end
      raise
    end
  end
end

ActiveRecord::ConnectionAdapters::Mysql2Adapter.prepend(ActiveRecord::Mysql2ExceptionBucket)

mysql2_readonly_reconnect.rb is something we wrote to force the connection to reconnect!. After we moved to AWS Aurora, we found that failovers were no longer handled gracefully and needed a redeploy to bounce connections. To avoid that, we wrote this initializer:

module ActiveRecord
  module Mysql2ReadonlyReconnect
    def execute(*)
      super
    rescue Mysql2::Error, ActiveRecord::StatementInvalid => e
      if e.message =~ /read-only/
        Rails.logger.info("MySQL running in read-only mode: reconnecting before raising error")
        reconnect!
      end

      raise
    end
  end
end

ActiveRecord::ConnectionAdapters::Mysql2Adapter.prepend(ActiveRecord::Mysql2ReadonlyReconnect)

@sodabrew
Collaborator

Interesting re: Aurora failovers! Do you have metrics for Aurora failover events? I can imagine a sequence that connects these two errors:

  1. The Rails app sends a query.
  2. Aurora fails over, and the connection errors out silently.
  3. The Rails app tries to send the next query on the connection that erred silently: 'still waiting for a result'.
  4. The Rails app tries to send the next query on a connection that was sitting in the pool: 'in read-only mode'.

@austinhalpin

The event that @eredi93 reported was isolated to a single host and wasn't triggered during any database failover, which is why we're at a bit of a loss trying to figure out what the cause could be here. :(

@stefansundin
Contributor

@austinhalpin Thanks for sharing your initializer; it helped me! I'm also curious whether you've encountered any other gotchas in the year since your post. :)

I have been using Aurora a bit recently and encountered the same issue with failovers. At first I thought that connections were not reset when the failover happened. After some more testing, it appears that all connections are reset, but Rails often manages to reconnect before the DNS update has fully propagated. The Aurora cluster endpoint is a CNAME record with a 5-second TTL.

So my conclusion is that the mysql process on the old master restarts quickly enough to accept connections again (now as a slave) before the DNS has fully propagated. Clients that manage to connect are oblivious to what happened and try to write to the slave.

I changed your initializer slightly, using pool.disconnect! instead of reconnect! in order to force the whole connection pool to disconnect at once (this should hopefully reduce the number of errors that end users see). I also switched from a regex match to a plain string match.

require 'active_record'

module ActiveRecord
  module Mysql2ReadonlyReconnect
    def execute(*)
      super
    rescue Mysql2::Error, ActiveRecord::StatementInvalid => e
      if e.message.include?('The MySQL server is running with the --read-only option')
        pool.disconnect!
      end
      raise
    end
  end
end

ActiveRecord::ConnectionAdapters::Mysql2Adapter.prepend(ActiveRecord::Mysql2ReadonlyReconnect)

Finally, this document has been helpful for adopting Aurora best practices: https://d1.awsstatic.com/whitepapers/RDS/amazon-aurora-mysql-database-administrator-handbook.pdf

The document mentions "smart drivers" for Aurora. Basically, the table INFORMATION_SCHEMA.REPLICA_HOST_STATUS lets you discover the other replicas directly in SQL. You can also run SELECT @@innodb_read_only; to check whether you're actually on the master when that is what you expect.
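
Here is a rough sketch of what those checks look like from Ruby (the endpoint and credentials are placeholders, and the column names are as given in the handbook, so verify them against your Aurora version):

require 'mysql2'

# Hypothetical cluster endpoint and credentials.
conn = Mysql2::Client.new(
  host:     'my-cluster.cluster-xxxxxxxx.us-east-1.rds.amazonaws.com',
  username: 'app',
  password: ENV['DB_PASSWORD']
)

# @@innodb_read_only is 0 on the writer and 1 on a reader.
read_only = conn.query('SELECT @@innodb_read_only AS ro').first['ro'].to_i
warn 'connected to a reader, expected the writer' unless read_only.zero?

# "Smart driver" style discovery of the cluster's instances over SQL.
rows = conn.query('SELECT server_id, session_id FROM INFORMATION_SCHEMA.REPLICA_HOST_STATUS')
rows.each do |row|
  role = row['session_id'] == 'MASTER_SESSION_ID' ? 'writer' : 'reader'
  puts "#{row['server_id']} (#{role})"
end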

There are some other Aurora-specific variables available:

> show variables like 'aurora%';
+------------------------------------+------------------+
| Variable_name                      | Value            |
+------------------------------------+------------------+
| aurora_lab_mode                    | OFF              |
| aurora_load_from_s3_role           |                  |
| aurora_max_alter_table_log_entries | 1000             |
| aurora_max_connections_limit       | 16000            |
| aurora_select_into_s3_role         |                  |
| aurora_server_id                   | staging-aurora-1 |
| aurora_use_key_prefetch            | ON               |
| aurora_version                     | 2.02             |
+------------------------------------+------------------+
