
Error handling for broken connections #129

Open
Linuus opened this issue Apr 11, 2024 · 0 comments

Linuus commented Apr 11, 2024

Hi!

We're trying out your gem to send VoIP notifications from Sidekiq jobs, but we're having some issues with broken connections.

At first we were raising an error in the connection.on(:error) {} callback, like this:

      Apnotic::ConnectionPool.new(connection_config, size: 5) do |connection|
        connection.on(:error) do |exception|
          raise(PushNotification::Error, "Production APNs connection error: #{exception}")
        end
      end

That was a really bad idea, since the exception crashed the whole Sidekiq process and forced it to restart. We fixed this, and now we just report the exception to our error service instead.

      Apnotic::ConnectionPool.new(connection_config, size: 5) do |connection|
        connection.on(:error) do |exception|
          Sentry.capture_exception(exception)
        end
      end

Now, occasionally we get this error reported:

Errno::ECONNRESET: Connection reset by peer
  from openssl (3.2.0) lib/openssl/buffering.rb:211:in `sysread_nonblock'
  from openssl (3.2.0) lib/openssl/buffering.rb:211:in `read_nonblock'
  from net-http2 (0.18.5) lib/net-http2/client.rb:145:in `block in socket_loop'
  from net-http2 (0.18.5) lib/net-http2/client.rb:142:in `loop'
  from net-http2 (0.18.5) lib/net-http2/client.rb:142:in `socket_loop'
  from net-http2 (0.18.5) lib/net-http2/client.rb:114:in `block (2 levels) in ensure_open'

It's reported in the callback, and then 60 seconds later we get a timeout here:

    connection_pool(ios_voip_push_token).with do |connection|
      response = connection.push(apnotic_notification(notification, ios_voip_push_token))
      raise(TimeoutError) if response.nil?
      [...]
    end

I guess we can pass a shorter timeout to the push method, since 60 seconds seems fairly high.
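
A rough sketch of what that could look like, assuming `push` accepts a `timeout:` option in seconds and returns `nil` when the timeout expires (worth double-checking against the Apnotic README). The names `push_or_raise` and `PushTimeout`, and the 5-second default, are made up for illustration:

```ruby
# Hypothetical error class for illustration; not part of Apnotic.
class PushTimeout < StandardError; end

# Sends a notification with a shorter timeout and raises if no response arrives.
# Assumption: `connection.push` accepts `timeout:` (seconds) and returns nil on timeout.
def push_or_raise(connection, notification, timeout: 5)
  response = connection.push(notification, timeout: timeout)
  raise PushTimeout, "no APNs response within #{timeout}s" if response.nil?

  response
end
```

Inside the pool block above, this would replace the `connection.push(...)` call and the `response.nil?` check. It only shortens the wait; it does nothing to repair a broken connection.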

Anyway, once this started, it happened a lot: almost all of our pushes hit this connection reset error. Our push jobs are not retried, but I don't think retrying would help anyway, since the connections don't seem to be "healed".

Could there be an issue where connections are stuck in a broken state? Or are we supposed to handle these errors differently?
