Nats-io timeout #130

Closed
marcelohpf opened this issue Mar 18, 2020 · 2 comments

Comments

@marcelohpf

Summary

One of our games is having several issues caused by network instability and NATS.

When communication between NATS and the connector, game, or metagame servers is disrupted, the servers can't recover from the error. Even restarting the Pitaya pods doesn't clear the "nats timeout" error. We had to restart NATS and Pitaya together to resolve the issue.

Logs

Here are a few logs from the connector:

{
  "server": "connector",
  "level": "error",
  "version": "0.1.0",
  "msg": "Failed to process remote: nats: timeout",
  "source": "game",
  "time": "2020-03-16T03:48:37Z"
}

In our metagame:

{
  "source": "game",
  "time": "2020-03-14T17:58:21Z",
  "method": "playerHandler.Authenticate",
  "version": "0.1.0",
  "msg": "nats: timeout",
  "methodName": "bindSession",
  "server": "metagame",
  "level": "error"
}
{
  "server": "metagame",
  "level": "error",
  "version": "0.1.0",
  "msg": "error while trying to push session to front: nats: timeout",
  "source": "game",
  "time": "2020-03-14T17:58:23Z"
}

How to reproduce

So far we haven't been able to reproduce the error on demand, but it has happened a few times:

When we reloaded the network plugin. That interrupted some connections with NATS, and the errors started.

When we had a slow consumer in NATS and some games restarted.

Related issues

I found this issue: nats-io/stan.js#101

Versions

Nats: 1.1.0
Etcd: v3.3.10

@LucianoPC

I had this problem when I was writing a Ruby Pitaya framework. In my case, when the metagame shut down without removing its key from etcd, the Pitaya frontend didn't recognize that the metagame no longer existed. So when the frontend received a request, it tried to send it over NATS to a nonexistent metagame server, which produced this "Nats timeout".

@felipejfc
Contributor

This is expected behavior: if the server is listed in etcd, we expect it to be answering requests. That's why you should always insert the server's key with a lease that has a TTL, so that it auto-heals after a few seconds.
