lua, CRASH ERROR terminated with reason: timeout #341
Comments
If I return false in my_auth_on_publish, the connection is disconnected each time, but my_auth_on_subscribe is fine.
function my_auth_on_publish(pub)
    return false
end
Maybe this is the key to the error: 2017-04-02 13:36:28.448 [warning] <0.937.0>@vmq_ranch:teardown:127 session stopped abnormally due to 'publish_not_authorized_3_1_1' |
Your comment above seems correct. Returning Regarding your first issue, I don't know, to be honest, as the log you're showing only describes a timeout that probably occurred when reaching out to your web server. Can you add some Lua |
1) Close connection question: Is this the same as this issue? I use mqtt.js, and if I use this config, everything is OK
2) The timeout question: body = http.body(response.ref) If I remove this line, I get the ERROR after 50 requests (I tested this 20+ times, and the count is always 50), so I don't think this is normal. I don't want to check the body of the HTTP response, because I only check the HTTP status, so why must I add this line? The HTTP server and VerneMQ are on the same host, and I just return 200 with an empty body every time, so I don't think this is a problem with my HTTP server. |
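One possible reading of the fixed count of 50, which is not confirmed anywhere in this thread: if the HTTP client keeps connections in a fixed-size pool and only hands a connection back once the response body has been read, then skipping the body read would exhaust the pool after exactly that many requests. The toy model below is a sketch of that idea only. `ConnectionPool`, `run_requests`, the pool size of 50 (chosen to match the observed count), and the 10 ms checkout timeout are all hypothetical and are not VerneMQ or Lua plugin APIs.

```python
import queue


class ConnectionPool:
    """Toy fixed-size pool: a connection only returns when release() is called."""

    def __init__(self, size):
        self._free = queue.Queue()
        for conn_id in range(size):
            self._free.put(conn_id)

    def checkout(self, timeout):
        # Raises queue.Empty if no connection becomes free within `timeout` seconds.
        return self._free.get(timeout=timeout)

    def release(self, conn_id):
        self._free.put(conn_id)


def run_requests(pool, count, read_body):
    """Issue `count` requests; return the index of the first one that times out, or None."""
    for i in range(count):
        try:
            conn_id = pool.checkout(timeout=0.01)
        except queue.Empty:
            return i
        if read_body:
            # Assumption: reading the body is what hands the connection back.
            pool.release(conn_id)
    return None


POOL_SIZE = 50  # assumed, picked to match the "after 50 requests" observation

print(run_requests(ConnectionPool(POOL_SIZE), 200, read_body=True))
print(run_requests(ConnectionPool(POOL_SIZE), 200, read_body=False))
```

In this model the run that reads the body never times out, while the run that skips it fails on the first request after the pool is used up. Whether the Lua HTTP client actually behaves this way would need to be checked against its implementation.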
Could anybody review the second question (the timeout question)? Thanks |
We haven't looked at it yet, thanks for your patience. |
Hello, " If I slow down the connection rate (e.g. 5 clients per second), VerneMQ is able to keep up and I can reach a steady state with all 4000 clients connected and subscribed. Thanks |
Hi @francescoPadovani81 thanks for reporting... a couple questions:
|
Hi @ioolkos ,
Thanks in advance for your help. Francesco |
@francescoPadovani81 The default timeout is set to 5000ms, which means for whatever reasons it took too long to process the Lua script. In this case the Lua script is very simple, it just accesses the database and returns its result. This means, either it took more than ~5000ms to access the configured database or there are too many requests that pile up in the This of course isn't a solution for your problem, but might help you understand the problem you're experiencing. |
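The pile-up described above can be illustrated with a small deterministic queue model: one worker handles requests strictly in order, and a caller gives up once its total wait exceeds 5000 ms. This is only a sketch. The 2 ms service time and the arrival rate of 2000 connections per second are made-up numbers, and `count_timeouts` is a hypothetical helper, not part of VerneMQ. Only the 5000 ms timeout, the 4000 clients, and the 5 clients per second figure come from the thread.

```python
def count_timeouts(n_clients, arrivals_per_sec, service_ms, timeout_ms=5000):
    """Single FIFO worker; return how many callers wait longer than timeout_ms."""
    interval_ms = 1000.0 / arrivals_per_sec
    worker_free_at = 0.0
    timeouts = 0
    for k in range(n_clients):
        arrival = k * interval_ms
        start = max(arrival, worker_free_at)
        finish = start + service_ms
        # The queued request is still processed even if its caller already gave up,
        # so a burst keeps delaying everyone behind it.
        worker_free_at = finish
        if finish - arrival > timeout_ms:
            timeouts += 1
    return timeouts


# Slow ramp-up: the worker is idle between requests, nobody times out.
print(count_timeouts(4000, arrivals_per_sec=5, service_ms=2))

# Connection spike: requests arrive faster than they are served, the queue grows,
# and the callers at the back exceed the timeout.
print(count_timeouts(4000, arrivals_per_sec=2000, service_ms=2))
```

The point of the model is that a fast backend does not help once arrivals outpace a single sequential worker: the waiting time grows with queue length, not with the per-request cost.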
Thanks so much @dergraf for your quick reply. Francesco |
This timeout is currently fixed; we'd have to make it configurable. However, no timeout isn't an option either, as it would just allow even more messages to accumulate in the mailbox of vmq_diversity_script, and this would only simulate the broker behaviour under connection-spike conditions, and connect latencies won't tell you much at this point. This could be OK for benchmarking but not for production. Hope this helps. |
I've just retried the load test while monitoring the CPU and redis-server: unfortunately there seems to be no Redis resource problem (Redis CPU usage is always under 2%). The CPU usage of the entire system is also fine (never more than 50%), and so are RAM and I/O (both Redis and VerneMQ look great from this point of view...) |
I have experienced the same problem in my VerneMQ + Redis installation.
This problem does not show up when using ACL, so I guess it is something related to Redis. Any suggestion would be appreciated |
Hm, |
Yep, that's a good question. The single obvious explanation is that access to Redis took too long. |
Is it worth using MongoDB as a replacement for Redis? Or maybe working on the Redis plugin to make it faster? |
@diego-gabriele It depends. But if you have a chance to run the same benchmark using Mongo, we'd see whether we run into the exact same bottleneck. |
We'd have to loadtest this somehow anyway. Access to Redis taking too long... this just sounds impossible :) ... like a programmer not loving pizza or something. |
I totally agree with ioolkos. I'm not so sure the problem is on the Redis side: when my load test crashed, Redis wasn't stressed; it was using under 2% of the CPU. And the "redis-cli --intrinsic-latency 100" command showed a max latency of about 60 microseconds, a more than acceptable value I think. Maybe the problem is a limit at a different layer. I don't know... |
#381 Should fix the crashes. |
I connected VerneMQ to MongoDB instead of Redis and the results in terms of subscribed users are EXACTLY the same. This means the bottleneck is somewhere inside the VerneMQ code. |
We believe we've found the bottleneck and are working on a fix. |
Any idea when this fix will be released? |
I hope soon (a week, maybe two), but I can't promise anything. |
any update? |
As soon as it's merged we'll let you know. |
We merged a new feature that makes it possible to load balance Lua hooks. In the current master, the database Lua wrappers have this feature enabled. We'll soon release 1.1.0, which contains this improvement. |
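To see why spreading hook calls over several script instances helps under a connection spike, here is a rough model that distributes requests round-robin over a number of independent FIFO workers and reports the worst latency. It is a sketch under stated assumptions: the thread does not say how VerneMQ balances the hooks, so round-robin is an assumption, `worst_latency_ms` is a hypothetical helper, and the 2 ms service time and 0.5 ms arrival interval are invented for illustration.

```python
def worst_latency_ms(n_requests, interval_ms, service_ms, n_workers):
    """Round-robin requests over n_workers FIFO workers; return the worst latency."""
    free_at = [0.0] * n_workers
    worst = 0.0
    for k in range(n_requests):
        arrival = k * interval_ms
        worker = k % n_workers  # assumed balancing strategy
        finish = max(arrival, free_at[worker]) + service_ms
        free_at[worker] = finish
        worst = max(worst, finish - arrival)
    return worst


# One script instance: the backlog grows and the worst latency passes 5000 ms.
print(worst_latency_ms(4000, interval_ms=0.5, service_ms=2, n_workers=1))

# Four instances: each one keeps up with its share, so latency stays flat.
print(worst_latency_ms(4000, interval_ms=0.5, service_ms=2, n_workers=4))
```

With these numbers a single worker cannot keep up, while four workers each receive a request only as often as they can finish one, so no queue forms.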
Of course i will |
The fixes have been released as part of the VerneMQ 1.1.0 release so I'm closing this - we'd of course appreciate any feedback you might have. |
When can we get 1.1.0? :-)
Best Regards
Lewis Deng
|
It's available on the download page: http://vernemq.com/downloads/ |
wow, GOOOOOOOD |
Environment
I use the Lua plugin and publish/subscribe every 10 ms, and I always get the CRASH ERROR.
VerneMQ and the HTTP service that the Lua script calls are on the same Ethernet network.
The HTTP service always returns 200.
2017-04-02 08:06:32.948 [error] <0.561.0> Ranch listener {{172,17,0,17},1883} terminated with reason: {timeout,{gen_server,call,[<0.381.0>,{call_function,auth_on_register,[{addr,<<"192.168.5.104">>},{port,61806},{mountpoint,<<>>},{client_id,<<"mqttjs_94a1dfdf">>},{username,<<"aaaaa">>},{password,<<"bbbbb">>},{clean_session,true}]}]}}
2017-04-02 08:06:38.951 [error] <0.563.0> CRASH REPORT Process <0.563.0> with 0 neighbours exited with reason: {timeout,{gen_server,call,[<0.381.0>,{call_function,auth_on_register,[{addr,<<"192.168.5.104">>},{port,61835},{mountpoint,<<>>},{client_id,<<"mqttjs_94a1dfdf">>},{username,<<"aaaaa">>},{password,<<"bbbbb">>},{clean_session,true}]}]}} in gen_server:call/2 line 204