
ETS ** Too many db tables ** error #5

Closed · long-tran opened this issue Nov 7, 2016 · 9 comments

@long-tran

Hi man, I've recently run into this problem in my production environment:

CRASH REPORT==== 7-Nov-2016::09:00:05 ===
  crasher:
    initial call: ranch_conns_sup:init/7
    pid: <0.8347.1>
    registered_name: []
    exception exit: {system_limit,
                        [{ets,new,[pdu_storage_by_sequence_number,[set]],[]},
                         {'Elixir.SMPPEX.PduStorage',init,1,
                             [{file,"lib/smppex/pdu_storage.ex"},{line,43}]},
                         {gen_server,init_it,6,
                             [{file,"gen_server.erl"},{line,328}]},
                         {proc_lib,init_p_do_apply,3,
                             [{file,"proc_lib.erl"},{line,247}]}]}
      in function  ranch_conns_sup:terminate/3 (src/ranch_conns_sup.erl, line 224)
    ancestors: [<0.8346.1>,<0.8345.1>]
    messages: []
    links: []
    dictionary: [{<0.8348.1>,true}]
    trap_exit: true
    status: running
    heap_size: 610
    stack_size: 27
    reductions: 261
  neighbours: 
.....
[error] ** Too many db tables **

It seems to have something to do with the pdu_storage. Is there any potential misconfiguration in the SMPPEX code?

Thanks,
Long

@savonarola
Contributor

savonarola commented Nov 7, 2016

Hello!

Thanks for the feedback.

There are two main reasons that may cause the problem:

  • something in your code creates many ETS tables, so that creating the next MC session fails when the system limit is exhausted;
  • all of the ETS tables are consumed by SMPPEX itself; in this case there should be many MC sessions that, for some reason, were not stopped.

So there are several questions I would like to ask to make the situation clearer:

  • How many simultaneous client connections does your server have when the crash occurs? Have you specified a custom max_connections transport option when starting the MC?
  • What are the names of the ETS tables that pollute the ETS space when the crash occurs? (This info can be obtained by running :ets.i(); a programmatic alternative is sketched just below.)
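
As a side note, here is a minimal sketch of how the table names could be collected programmatically instead of reading :ets.i/0 output by eye. It uses only standard :ets/:erlang calls, but assumes Elixir ≥ 1.10 for Enum.frequencies/1:

```elixir
# Count live ETS tables grouped by name, to spot which name is
# accumulating. :ets.all/0 lists every table; :ets.info/2 reads
# a single property of one table.
:ets.all()
|> Enum.map(&:ets.info(&1, :name))
|> Enum.frequencies()
|> Enum.sort_by(fn {_name, count} -> count end, :desc)
|> Enum.take(10)
|> IO.inspect(label: "top ETS table names")
```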

@savonarola
Contributor

Closing due to no reply.

@archseer
Contributor

archseer commented Apr 5, 2017

@savonarola Hi, we just ran into the same issue. My max_connections is at 600 (while the ETS table limit should be around 1400 by default), and I had a health checker trying to open (and close) a socket every 10 seconds:

12:47:58.117 [info]  mc_conn #PID<0.1832.0>, socket closed, stopping
12:48:08.117 [info]  mc_conn #PID<0.1838.0>, socket closed, stopping
12:48:08.117 [info]  mc_conn #PID<0.1841.0>, socket closed, stopping
12:48:18.117 [info]  mc_conn #PID<0.1844.0>, socket closed, stopping
12:48:18.117 [info]  mc_conn #PID<0.1847.0>, socket closed, stopping
12:48:28.117 [info]  mc_conn #PID<0.1850.0>, socket closed, stopping
12:48:28.117 [info]  mc_conn #PID<0.1853.0>, socket closed, stopping
12:48:38.117 [info]  mc_conn #PID<0.1856.0>, socket closed, stopping
...
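
For reference, the "around 1400 by default" limit mentioned above can be checked from a running node, along with current usage. A minimal sketch using only standard :erlang/:ets calls:

```elixir
# Print the VM's ETS table limit and how many tables currently exist.
# The default limit can be raised via the ERL_MAX_ETS_TABLES
# environment variable before starting the node.
IO.puts("ETS limit:  #{:erlang.system_info(:ets_limit)}")
IO.puts("ETS in use: #{length(:ets.all())}")
```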

After a few hours of this, though, any time the health checker opens a socket, we encounter this issue:

16:54:48.121 [error] Ranch listener #Reference<0.0.2.571> connection process start failure; SMPPEX.Session:start_link/4 returned: {:error, {{:badmatch, {:error, {:system_limit, [{:ets, :new, [:pdu_storage_by_sequence_number, [:set]], []}, {SMPPEX.PduStorage, :init, 1, [file: 'lib/smppex/pdu_storage.ex', line: 43]}, {:gen_server, :init_it, 6, [file: 'gen_server.erl', line: 328]}, {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 247]}]}}}, [{SMPPEX.MC, :init, 1, [file: 'lib/smppex/mc.ex', line: 386]}, {:gen_server, :init_it, 6, [file: 'gen_server.erl', line: 328]}, {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 247]}]}}

So it seems that the ETS table is not getting cleaned up properly when a Ranch socket is closed.
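
One way to confirm this suspicion would be a quick diagnostic like the following sketch, which lists any surviving tables with that name and their owner processes after the sockets have closed (the table name is taken from the crash report above):

```elixir
# If the cleanup were working, no table named
# :pdu_storage_by_sequence_number should outlive its session.
leaked =
  for tab <- :ets.all(),
      :ets.info(tab, :name) == :pdu_storage_by_sequence_number,
      do: {tab, :ets.info(tab, :owner)}

IO.inspect(leaked, label: "surviving pdu_storage tables")
```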

@archseer
Contributor

archseer commented Apr 5, 2017

Do note that we have no active connections to the instance, except for the health-check opening and closing the socket (so this is not a case of it being over-saturated with traffic).

savonarola reopened this Apr 6, 2017
@savonarola
Contributor

Hello!

Trying to reproduce the issue.

@archseer
Contributor

archseer commented Apr 6, 2017 via email

@archseer
Contributor

archseer commented Apr 6, 2017 via email

@savonarola
Contributor

savonarola commented Apr 6, 2017

Hello!

I have reproduced the issue. The reason was that a peer closing the socket is not considered an abnormal case, so the MC session stopped with :normal, leaving the child PduStorage alive and keeping its ETS table.

I have added the necessary cleanup.
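
For context, the mechanism in miniature (a sketch, not the actual SMPPEX code): a :normal exit signal does not terminate a linked process that isn't trapping exits, so a table-owning child outlives its parent:

```elixir
# Sketch of the leak: the parent exits with reason :normal, and the
# exit signal is ignored by the linked child, which is not trapping
# exits. The child, and the ETS table it owns, lives on.
spawn(fn ->
  spawn_link(fn ->
    :ets.new(:pdu_storage_by_sequence_number, [:set])

    # The table lives exactly as long as its owner process does.
    receive do
      _ -> :ok
    end
  end)

  # The parent returns here and exits with reason :normal; the link
  # does NOT take the child down, so the table leaks.
end)
```

The cleanup therefore has to stop the storage process (or delete its table) explicitly when the session terminates, since the :normal exit alone will not do it.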

@archseer
Contributor

archseer commented Apr 6, 2017

@savonarola as always, thank you for the swift fix! 🍻
