Frequent EMSGSIZE with UDPSender #124

Closed
hekike opened this Issue Jun 6, 2017 · 9 comments


hekike commented Jun 6, 2017

I frequently run into an EMSGSIZE error with UDPSender.
I tried configuring a huge maxPacketSize, without any luck.

Has anyone else experienced a similar issue?


Member

vprithvi commented Jun 6, 2017

Could you provide more details?

  • Where do you see this error?
  • What is the MTU set for the network interface that you are using?

Member

yurishkuro commented Jun 6, 2017

That sounds like incorrect calculation of the cumulative message size, which is a somewhat tricky problem with thrift and compact encoding.

The max packet size should not be fiddled with, because it's what the agent expects; it will not accept different sizes.


zhongfox commented Aug 22, 2017

I reproduced this problem using jaeger-client@3.5.3 on Mac OS (MTU: 1500, Node version: 8.3)
by adding a callback here: https://github.com/uber/jaeger-client-node/blob/master/src/reporters/udp_sender.js#L139

like:

this._client.send(thriftBuffer, 0, thriftBuffer.length, this._port, this._host, function(e, sent) {
  if (e) {
    console.log("thriftBuffer.length: ", thriftBuffer.length)
    console.log("sent: ", sent)
    console.log(e)
  }
});

thriftBuffer.length:  37724
sent:  37724
{ Error: send EMSGSIZE localhost:6832
    at Object._errnoException (util.js:1022:11)
    at _exceptionWithHostPort (util.js:1045:20)
    at SendWrap.afterSend [as oncomplete] (dgram.js:474:11)
  code: 'EMSGSIZE',
  errno: 'EMSGSIZE',
  syscall: 'send',
  address: 'localhost',
  port: 6832 }

this._client.on('error', err => {
  console.log(`error sending span: ${err}`)
})

This handler never receives the error, so the error must come from:
https://github.com/nodejs/node/blob/master/lib/dgram.js#L480

I'm not sure why afterSend is called, but the agent does not get the data.

By the way, I'm currently using https://github.com/RisingStack/jaeger-node, whose author is @hekike, so I'm not sure whether the issue is caused by RisingStack/jaeger-node or by uber/jaeger-client-node.


Also, regarding https://github.com/uber/jaeger-client-node/blob/master/src/reporters/udp_sender.js#L29:
why is const UDP_PACKET_MAX_LENGTH = 65000?
I think 1472 would be reasonable: max MTU 1500 - IP header 20 - UDP header 8.
Please correct me if I'm wrong, thanks.
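
The arithmetic above can be sketched as follows. This is only an illustration: maxUdpPayload is a hypothetical helper, not part of jaeger-client, and it assumes a 20-byte IPv4 header without options. A payload larger than this is normally fragmented at the IP layer rather than rejected outright, so the figure is a "fits in one packet" bound, not a hard send limit.

```javascript
// Hypothetical helper (not from jaeger-client): largest UDP payload that
// fits in a single IPv4 packet for a given MTU.
// Assumes a 20-byte IPv4 header (no options) and the fixed 8-byte UDP header.
const IPV4_HEADER_BYTES = 20;
const UDP_HEADER_BYTES = 8;

function maxUdpPayload(mtu) {
  return mtu - IPV4_HEADER_BYTES - UDP_HEADER_BYTES;
}

console.log(maxUdpPayload(1500));  // 1472, typical Ethernet MTU
console.log(maxUdpPayload(65535)); // 65507, the IPv4 total-length ceiling
```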


Member

yurishkuro commented Aug 22, 2017

@zhongfox are you running the agent on the same host? UDP packets on the loopback interface can be up to 64 KB without fragmentation, which is why we have UDP_PACKET_MAX_LENGTH = 65000. If you're running the agent on another host (or perhaps through some network mapping that could limit MTU), then you're better off not using UDP. I don't think our Node client supports an HTTP sender the way the Go and Java clients do.


zhongfox commented Aug 23, 2017

@yurishkuro yes, the agent and the Node project are running on the same Mac Pro.

% ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
	options=3<RXCSUM,TXCSUM>
	inet6 ::1 prefixlen 128
	inet 127.0.0.1 netmask 0xff000000
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
	nd6 options=1<PERFORMNUD>
gif0: flags=8010<POINTOPOINT,MULTICAST> mtu 1280
stf0: flags=0<> mtu 1280
en0: flags=8823<UP,BROADCAST,SMART,SIMPLEX,MULTICAST> mtu 1500

It looks like my loopback MTU is 16384; should it be 65000?

But even if I change it to var UDP_PACKET_MAX_LENGTH = 16000;

I still get this error:

thriftBuffer.length:  9508
sent:  9508
{ Error: send EMSGSIZE 127.0.0.1:6832
    at Object._errnoException (util.js:1022:11)
    at _exceptionWithHostPort (util.js:1045:20)
    at SendWrap.afterSend [as oncomplete] (dgram.js:474:11)
  code: 'EMSGSIZE',
  errno: 'EMSGSIZE',
  syscall: 'send',
  address: '127.0.0.1',
  port: 6832 }

According to some sources, EMSGSIZE indicates that the data the application sends is bigger than the UDP socket send buffer, which can be changed with SO_SNDBUF, but it seems Node.js dgram doesn't provide such an API.


Member

yurishkuro commented Aug 23, 2017

My Mac also has mtu 16384 for loopback. But also this:

$ sysctl -a|grep gram
net.local.dgram.recvspace: 4096
net.local.dgram.maxdgram: 2048
net.inet.udp.maxdgram: 9216
net.inet.raw.maxdgram: 8192

Maybe try increasing these, or limiting the Jaeger client to these sizes.

In contrast, on a Linux host:

$ sudo sysctl -a|grep net.core | grep mem_max
[skip]
net.core.rmem_max = 212992
net.core.wmem_max = 212992
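
For the "limit the client to these sizes" option, a rough sketch of the idea is below. It is not how jaeger-client batches spans: batchBySize is a hypothetical helper that takes already-encoded buffers and a fixed per-batch overhead, whereas the real cumulative size under Thrift compact encoding is harder to compute, as noted earlier in this thread.

```javascript
// Hypothetical helper, not part of jaeger-client: group encoded items into
// batches whose combined size stays within a datagram limit.
// overheadBytes is an assumed fixed per-batch envelope size.
function batchBySize(buffers, maxBytes, overheadBytes = 0) {
  const batches = [];
  let current = [];
  let currentBytes = overheadBytes;

  for (const buf of buffers) {
    // An item that cannot fit even in an empty batch can never be sent.
    if (overheadBytes + buf.length > maxBytes) {
      throw new RangeError(
        `single item of ${buf.length} bytes exceeds limit of ${maxBytes}`
      );
    }
    // Start a new batch when adding this item would exceed the limit.
    if (currentBytes + buf.length > maxBytes && current.length > 0) {
      batches.push(current);
      current = [];
      currentBytes = overheadBytes;
    }
    current.push(buf);
    currentBytes += buf.length;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}

// Four fake "spans" against the 9216-byte macOS net.inet.udp.maxdgram value.
const spans = [4000, 4000, 4000, 500].map((n) => Buffer.alloc(n));
const batches = batchBySize(spans, 9216);
console.log(batches.map((b) => b.length)); // [ 2, 2 ]
```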

zhongfox commented Aug 23, 2017

It turns out the Mac's net.inet.udp.maxdgram is 9216:

https://stackoverflow.com/questions/9123098/set-max-packet-size-for-gcdasyncudpsocket
savoirfairelinux/opendht#135

so this code always fails on my Mac:

var dgram = require('dgram');
// net.inet.udp.maxdgram is 9216 here, so 9216 bytes or fewer is OK
var message = Buffer.alloc(9217);
var client = dgram.createSocket('udp4');
client.send(message, 0, message.length, 6832, 'localhost', function (err) {
  console.log(message.length);
  console.log(err); // EMSGSIZE error when the payload exceeds the limit
  client.close();
});

so I did this:

% sysctl net.inet.udp.maxdgram
net.inet.udp.maxdgram: 9216
% sudo sysctl net.inet.udp.maxdgram=65536
net.inet.udp.maxdgram: 9216 -> 65536
% sudo sysctl net.inet.udp.maxdgram
net.inet.udp.maxdgram: 65536

The problem is solved.
I wonder why you guys didn't encounter this problem.
Maybe you should add a note for Mac users to the README.
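
In case it helps other Mac users check the limit from Node before sending, here is a minimal sketch. parseSysctl is a hypothetical helper (nothing in jaeger-client does this), and the sample text is just the sysctl output quoted in this thread.

```javascript
// Hypothetical helper: extract one integer setting from `sysctl` output.
// Accepts both the macOS "key: value" and the Linux "key = value" formats.
function parseSysctl(output, key) {
  for (const line of output.split('\n')) {
    const match = line.match(/^\s*([\w.]+)\s*[:=]\s*(\d+)\s*$/);
    if (match && match[1] === key) {
      return Number(match[2]);
    }
  }
  return null; // key not present in the output
}

// Sample output copied from the sysctl listing earlier in this thread.
const sample = [
  'net.inet.udp.maxdgram: 9216',
  'net.inet.raw.maxdgram: 8192',
].join('\n');

const limit = parseSysctl(sample, 'net.inet.udp.maxdgram');
console.log(limit);                              // 9216
console.log(Buffer.alloc(9217).length > limit);  // true: larger than the limit
```

On a real machine the input could presumably come from child_process.execFileSync('sysctl', ['net.inet.udp.maxdgram']), but that key only exists on macOS/BSD, so treat the null return as "unknown" elsewhere.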


zhongfox commented Aug 24, 2017

@yurishkuro
I moved to CentOS to do performance testing and found another situation that can lead to EMSGSIZE.
It is more common, can occur on any OS, and leads to missing spans.
Please see: #150


Member

yurishkuro commented Nov 22, 2017

I believe this has been solved. Please reopen if not.
