Presence timeout on heavy load #2238

Closed
alirajabi opened this issue Apr 5, 2017 · 1 comment
alirajabi commented Apr 5, 2017

Environment

  • Elixir version (elixir -v): 1.4.2
  • Phoenix version (mix deps): 1.2.1
  • Operating system: Ubuntu server

Expected behavior

Presence.list and Presence.track complete without timing out.

Actual behavior

I have three nodes, each with the following hardware:
2 * Intel(R) Xeon(R) CPU E5-2695 v3 @ 2.30GHz
64 GB Ram
256 SSD

When each server handles ~300-400 channel requests per second and ~35,000 user connections (system load 15/56), I get timeout errors from the Presence.list and Presence.track calls.

Presence.list error:

** (stop) exited in: GenServer.call(App.Presence, {:list, "user:e6728e0c-3170-4863-b8b0-8e7e0b00e813"}, 5000)
    ** (EXIT) time out
    (elixir) lib/gen_server.ex:737: GenServer.call/3
    lib/phoenix/tracker.ex:199: Phoenix.Tracker.list/2
    (phoenix) lib/phoenix/presence.ex:236: Phoenix.Presence.list/2
    (app) web/channels/presence.ex:8: App.Presence.user_online?/1
    (app) web/channels/user_channel.ex:318: anonymous fn/1 in App.UserChannel.push_notification/1
    (elixir) lib/task/supervised.ex:85: Task.Supervised.do_apply/2
    (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
Function: #Function<2.63574933/0 in App.UserChannel./1>
    Args: []

Presence.track error:

** (stop) exited in: GenServer.call(App.Presence, {:track, #PID<0.28571.6>, "user:9cab5aab-7737-49bc-a4a2-faba1380b95a", "online", %{}}, 5000)
    ** (EXIT) time out
    (elixir) lib/gen_server.ex:737: GenServer.call/3
    (app) web/channels/user_channel.ex:67: anonymous fn/1 in App.UserChannel.handle_info/2
    (elixir) lib/task/supervised.ex:85: Task.Supervised.do_apply/2
    (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
Function: #Function<1.63574933/0 in App.UserChannel.handle_info/2>
    Args: []

This is the track code section:

  with {:ok, _} <- Presence.track(socket, "online", %{}) do
    # do something
  end
  {:noreply, socket}

This is the list code section:

  def user_online?(user_id) do
    ("user:" <> user_id)
    |> Presence.list()
    |> Map.has_key?("online")
  end
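Neither snippet above handles the exit that `GenServer.call/3` raises when the 5000 ms limit shown in the traces is reached, so the calling Task crashes. As a minimal sketch of one way to contain that, the wrapper below turns a timeout exit into an error tuple. `SlowServer` and `SafeCall` are hypothetical names used only for illustration, and `SlowServer` is just a stand-in for a busy tracker process, not part of Phoenix.

```elixir
defmodule SlowServer do
  # Hypothetical stand-in for an overloaded tracker: each call takes `delay` ms.
  use GenServer

  def start_link(delay), do: GenServer.start_link(__MODULE__, delay)

  def init(delay), do: {:ok, delay}

  def handle_call(:list, _from, delay) do
    Process.sleep(delay)
    {:reply, %{}, delay}
  end
end

defmodule SafeCall do
  # Wraps GenServer.call/3 and converts a timeout exit into {:error, :timeout}
  # instead of letting the caller crash.
  def call(server, request, timeout \\ 5_000) do
    {:ok, GenServer.call(server, request, timeout)}
  catch
    :exit, {:timeout, _} -> {:error, :timeout}
  end
end

# The server needs 200 ms per call, but the caller only waits 50 ms.
{:ok, pid} = SlowServer.start_link(200)
IO.inspect(SafeCall.call(pid, :list, 50))
```

This only keeps the caller alive; it does not reduce the load on the process being called, so the caller still has to decide what a timed-out lookup should mean (for example, treating the user as offline).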

I'm sure this is not a hardware resource issue. So what is the problem?

chrismccord (Member) commented

Your user_online? function is quite expensive to find a single user, but the work is being done in the client. That said, a Presence.list requires a server call so it is contributing to your timeouts. Please open the issue on the phoenix_pubsub side and we can take a look. In the long term, we can pool the trackers, but in the short term, you could shard the trackers yourself by topic, which would help.
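As a rough sketch of the short-term suggestion to shard the trackers by topic, the module below hashes each topic to one of several presence modules so that calls are spread across several processes instead of one. The module name `App.ShardedPresence`, the shard modules `App.Presence.Shard0` through `App.Presence.Shard3`, and the shard count of 4 are all assumptions for illustration. Each shard would have to be defined as its own `use Phoenix.Presence` module and started under the application's supervision tree, which is not shown here.

```elixir
defmodule App.ShardedPresence do
  # Hypothetical shard modules. Each one is assumed to be a separate
  # `use Phoenix.Presence` module that is already started and supervised.
  @shards {App.Presence.Shard0, App.Presence.Shard1,
           App.Presence.Shard2, App.Presence.Shard3}

  # Returns the list of shard modules.
  def shards, do: Tuple.to_list(@shards)

  # Deterministically maps a topic to one shard module, so the same topic
  # always goes to the same shard.
  def shard_for(topic) when is_binary(topic) do
    elem(@shards, :erlang.phash2(topic, tuple_size(@shards)))
  end

  # Delegates to the shard that owns the topic.
  def track(pid, topic, key, meta) do
    shard_for(topic).track(pid, topic, key, meta)
  end

  def list(topic), do: shard_for(topic).list(topic)
end
```

With this layout, `App.ShardedPresence.list("user:" <> user_id)` would replace the direct `Presence.list` call. A presence tracked through one shard is only visible through that same shard, so every node would need the same shard list in the same order.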
