-
Notifications
You must be signed in to change notification settings - Fork 372
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
No RTP/SRTP Sent #12
Comments
Could you please post the full rtpengine log output for a test call you've made, thanks. |
Is there an rtpengine specific log as separate from the kamailio log ? |
Yes, the rtpengine daemon logs separately. It logs to syslog using the "daemon" facility. |
Please find the log file output for a test call. I tried to get LOG_DEBUG to output (tried defining env variable $DBG, changing the make file to always include -D__DEBUG=1 , removing the __DEBUG section from iLog & running rtpengine with a --log-level=7) but I couldn't get any LOG_DEBUG output. If you search backwards from the bottom for OUT_DIALOG_FROM_LYNCIT then that's the first call coming in from the SRTP side. Thanks very much for your help it's much appreciated.
|
Sorry but this doesn't contain any useful information. The log you posted doesn't contain any calls. |
There are two calls in the log file as follows: First Call - Search for "OUT_DIALOG_FROM_LYNCIT" - some call details: Second Call - Search for "OUT_DIALOG_TO_LYNCIT - some call details: I assume at this point rtpengine has knowledge of both calls (the SRTP from 192.168.2.106 and the RTP from 10.1.0.119) as it has modified the sdp offer in both SIP INVITE messages. What were you looking for in the log ? What log entries were you expecting to see that weren't there ? |
Yes, those would be the lines. Search the page yourself, they don't appear in what you posted. |
I think you have to hit the Edit Comment pencil icon then do a select all and then paste to see the entire contents of the post. |
Ah yes, you're right. Silly github, truncating comments like that. Reproducing the relevant log lines here, reformatted for clarity:
|
So we have... First call, 2cb67bfe-5a85-4010-8e20-572bd9cd24f6 [SRTP] 192.168.2.106:33042 <> 30002 / 30000 <> 10.1.0.121:19110 [RTP] IOW, packets received on port 30000 are sent out on port 30002 to side A, and packets received on port 30002 are sent out on port 30000 to side B. Second call, YjFhY2Q5OTUzMGVmM2RkMDRkNDI1NzNkYzg0YTM3NzU [RTP] 10.1.0.121:19130 <> 30006 / 30004 <> 192.168.2.106:33488 [SRTP] All this looks fine. However, the final packet stats don't look so good. Both calls are terminated due to timeout:
And all packet stats are zero. This means that rtpengine has never seen any packets coming in on any of these ports. My first guess would be an iptables packet filter. Otherwise perhaps an IP/routing configuration issue? Or possibly an overlap in UDP port ranges with something else that has the same ports open (unlikely). |
Hi Richard,
Thanks for your help. Ed From: Richard Fuchs [mailto:notifications@github.com] So we have... First call, 2cb67bfe-5a85-4010-8e20-572bd9cd24f6 [SRTP] 192.168.2.106:33042 <> 30002 / 30000 <> 10.1.0.121:19110 [RTP] IOW, packets received on port 30000 are sent out on port 30002 to side A, and packets received on port 30002 are sent out on port 30000 to side B. Second call, YjFhY2Q5OTUzMGVmM2RkMDRkNDI1NzNkYzg0YTM3NzU [RTP] 10.1.0.121:19130 <> 30006 / 30004 <> 192.168.2.106:33488 [SRTP] All this looks fine. However, the final packet stats don't look so good. Both calls are terminated due to timeout: Jul 24 15:19:09 localhost rtpengine[4945]: Closing call branch due to timeout And all packet stats are zero. This means that rtpengine has never seen any packets coming in on any of these ports. My first guess would be an iptables packet filter. Otherwise perhaps an IP/routing configuration issue? Or possibly an overlap in UDP port ranges with something else that has the same ports open (unlikely). — |
Hi Richard, Offer inbound to RTPEngine: Offer outbound from RTPEngine (no lifetime or MKI): It's then saying later on in the logs that SRTP and SRTCP output required but there's no crypto suite agreed. Do you think this could be the cause ? If so where in the code can I make a change which will write these fields out into the message ? The __generate_crypto function already has a lifetime for SRTP and SRTCP (I made it use &crypto_suites[0] to make sure of this) but it doesn't come out in the message. Thanks for your help Richard, it's much appreciated. The full log: Aug 20 15:39:13 localhost rtpengine[16796]: Generating new DTLS certificate |
The message "SRTP output wanted, but no crypto suite was negotiated" in this call is actually expected. It occurs when early media is received from the RTP endpoint before the SDP answer has been sent to the SRTP endpoint. SRTP can only be sent after the SDP answer has been processed, so any RTP received before that must be discarded with this message. I'm more concerned about the "authentication failed" messages, those shouldn't be there. There might be a problem with how the MKI is handled, I haven't done any testing with the MKI due to the lack of a client using the MKI. Perhaps you could create a complete tcpdump including the RTP and SRTP packets for one such call, so I can see why the authantication fails for these packets. The fact that rtpengine doesn't use an MKI itself shouldn't be a problem. |
Hi Richard, In Wireshark go to Edit->Preferences->Protocols->SSL then Edit->New then: IP Address: 10.1.0.127 (this is the Kamailio IP) Also if you put this filter into Wireshark it will help (192.168.2.106 is Lync Client IP and 10.1.0.121 is our platform media card IP) udp && ((ip.src==192.168.2.106) || (ip.dst==192.168.2.106) || (ip.src==10.1.0.121) || (ip.dst==10.1.0.121)) Let me know if you need more traces etc.. Thanks for your help - it's much appreciated. http://www.thephoneangel.com/LyncIT_RTPEngineTrace.pcap |
Thanks, could you also please post the rtpengine log from this particular call |
Hi Richard, Thanks |
Ok, so in the log file, I see two calls, da6dcca5-a21b-4ed6-aaab-3e79f7d875aa and MjkxMWU2MTdmOTNiZDM1NGU0OGQxYjEzMGQ4ZjhkYTA. However in the pcap, I only see one call (the first one), but it looks like the problematic one is the second one. Is there a second port in the pcap which also carries encrypted SIP? |
Call da6dcca5-a21b-4ed6-aaab-3e79f7d875aa (encrypted INVITE inbound into RTPEngine) has RTP between: 10.1.0.127:30142 (RTP Engine) and 192.168.2.106:33042 (MS Lync) In wireshark if you filter 'udp.srcport == 33042' or filter 'udp.srcport == 30142' then you see all SRTP packets - so that's OK. Call MjkxMWU2MTdmOTNiZDM1NGU0OGQxYjEzMGQ4ZjhkYTA. (encrypted INVITE outbound from RTPEngine) has media between: 10.1.0.127:30144 (RTP Engine) and 192.168.2.106:33044 (MS Lync) In Wireshark if you filter 'udp.srcport == 30144' or filter 'udp.srcport == 33044' then you see all UDP packets where you'd expect SRTP packets. I'm assuming that the MjkxMWU2MTdmOTNiZDM1NGU0OGQxYjEzMGQ4ZjhkYTA. call is not showing SRTP because the initial SDP negotiation failed somehow but the SIP messages look OK to me, the only difference I can see is that the RTP Engine doesn't offer |2^31|1:1 in the crypto attribute in an outbound INVITE and doesn't answer |2^31|1:1 in the crypto attribute in a 183 to an inbound INVITE but the issue might not be down to this difference. I can send you the complete SIP message history and also my Kamailio config file if that would be of use - it could be something to do with the way I'm calling the rtpproxy_manage function but the SIP exchange does look OK to me. Thanks for your help. |
I think I see the problem. In the second call (MjkxMW...), the A side first sends an offer (RTP from port 19080, translated to SRTP port 30144), then the B side sends an answer (SRTP from port 33488, translated to 30146). All is fine so far. Then the B side sends another answer, with a different port (33044) and different SRTP parameters (different key). The port change is no problem (rtpengine translates it to 30148), but the key change doesn't work. This results in failed authentication, discarded packets and no audio. I should have a fix for that shortly. The other log message ("Error parsing RTP header: invalid header version") is actually exactly what it says. In your most recent dump for example, packet 26305 is supposed to contain RTP but seems to contain some other garbage. Not sure what those are, perhaps they'll go away after the key change problem is fixed. |
Actually I have to take that back, key change does seem to be working, it was my test script that was broken. Back to looking... |
On further investigation, it looks like the key change isn't actually a key change, but rather two endpoints (two registrations?) answering the call at the same time. The first answer was advertising encrypted RTP on port 33488 (RTCP on 33489). The second answer (different to-tag, so a different registration/endpoint?) came less than a second later and was advertising encrypted RTP on port 33044 (RTCP on 33045) with a different encryption key. The second answer overrides the first one. I've looked at an RTCP packet from your pcap, packet 2850. It comes from port 33489, so it was probably the first endpoint sending it, and indeed the authentication tag matches if you use the encryption key from the first answer. However, rtpengine expects to be talking to the second endpoint and so it wants to use the encryption key from the second answer. Authentication fails, error is logged, packet is dropped. At the same time, RTP packets come in from port 33044, which is the first endpoint (packet 2953) and are encrypted using the key from the second answer. Those should get processed correctly, but the A-side (10.1.0.121, local port 30146 or 30148) is missing from the pcap, so I can't tell what's going on there. I understand that Lync supports conferencing services, is that what's happening here? |
OK that all makes sense, if I compare the packets sent from 33488/9 (KEY1) and 33044/5 (KEY2) and received on 30144/5 they do correlate with the error messages in the log: KEY1: KEY2: The 'Discarded invalid SRTCP packet' error occurs 113 times and the 'Error parsing RTP header' error occurs 20 times (for the total number of CLASSIC_STUN packets) so that all makes sense. Looking at the final packet stats it looks like the engine still links the 30146/7 pair of translated ports to 10.1.0.121:19080/1 and it doesn't link 30148/9 translated ports to 10.1.0.121:19080/1. So maybe the engine has overwritten the key to use the key from the second answer but it hasn't updated the translated ports from the second answer to be mapped to 10.1.0.121:19080/1. Does the '[::]:0' entry indicate this in the log extract below ? Aug 26 10:35:05 localhost rtpengine[16797]: [MjkxMWU2MTdmOTNiZDM1NGU0OGQxYjEzMGQ4ZjhkYTA.] Final packet stats: Aug 26 10:35:05 localhost rtpengine[16797]: [MjkxMWU2MTdmOTNiZDM1NGU0OGQxYjEzMGQ4ZjhkYTA.] --- Tag '50ad6ca53', created 4:42 ago, in dialogue with 'd27bb023' Aug 26 10:35:05 localhost rtpengine[16797]: [MjkxMWU2MTdmOTNiZDM1NGU0OGQxYjEzMGQ4ZjhkYTA.] ------ Media #1, port 30148 <> [::]:0 , 10645 p, 1935959 b, 0 e Aug 26 10:35:05 localhost rtpengine[16797]: [MjkxMWU2MTdmOTNiZDM1NGU0OGQxYjEzMGQ4ZjhkYTA.] ------ Media #1, port 30149 <> [::]:0 (RTCP), 28 p, 1848 b, 0 e Aug 26 10:35:05 localhost rtpengine[16797]: [MjkxMWU2MTdmOTNiZDM1NGU0OGQxYjEzMGQ4ZjhkYTA.] --- Tag '156b2a6f9c', created 4:42 ago, in dialogue with 'd27bb023' Aug 26 10:35:05 localhost rtpengine[16797]: [MjkxMWU2MTdmOTNiZDM1NGU0OGQxYjEzMGQ4ZjhkYTA.] ------ Media #1, port 30146 <> 10.1.0.121:19080, 0 p, 0 b, 0 e Aug 26 10:35:05 localhost rtpengine[16797]: [MjkxMWU2MTdmOTNiZDM1NGU0OGQxYjEzMGQ4ZjhkYTA.] ------ Media #1, port 30147 <> 10.1.0.121:19081 (RTCP), 0 p, 0 b, 0 e If I do a Wireshark of 'udp.dstport == 19080' there are no packets being sent to this port on 10.1.0.121 but there are SRTP (from 33044) and SRTCP (from 33045) packets coming into 30144/5 successfully. Also as you say I think the packets from 33044/5 (KEY2) must be being decrypted by the RTPEngine otherwise we'd have error messages relating to this as well and we don't. The client we are using isn't Lync itself it's a Lync testing tool called LyncIT which we have to use to become Lync certified. It's a black box client which we have no control over. It is a single LyncIT end point which is sending both 183 answers containing different To tags, I'm sure why they're different (maybe it's mimicking forking behind the single end point) or if they have to be the same in the RFC but certainly all of the SIP messages after the second 183 contain the To tag from that second 183 so the RTPEngine dropping the first key looks correct to me. I'm not sure why LyncIT would send packets encrypted with the first key after having sent a second key, it might not matter as far as the test goes if the KEY2 packets reach the other end. Thanks for your help, if you need a pcap with 10.1.0.121 included please let me know. |
Yes, these kinds of 3-was associations aren't well supported yet. I'll try to sort it out, but I'm not sure if that will fix your problem, as I'm still not totally sure what the expected behaviour is. |
So with the following patch applied: replaying the signalling gives you the following associations:
which looks much more healthier. I still can't say if that really fixes your setup though, but it should be a step in the right direction. |
Hi Richard, FAILED TO OPEN KERNEL TABLE 0, KERNEL FORWARDING DISABLED The start up sequence I'm using is the following: service mysqld start I realise this is a pilot error issue but any help would be appreciated. Thanks |
/proc/mediaproxy/ should appear after you run the modprobe command. You can check the output of lsmod to see if it's loaded. Note that modprobe searches for the module in /lib/modules/, so if you've just compiled it somewhere and didn't install it into the system, modprobe won't find it. Use insmod to load the module for a different place manually. |
Just a FYI - my previous build still works with the same start up sequence. |
Nothing has changed in the kernel interface for a long time, so it shouldn't make a difference. |
When you say it 'searches for the module' what file is it actually looking for ? All I've done is copy the new source onto the machine and run the Makefile in the kernel-module, iptables-extension and daemon folder. Do I need to copy something into /lib/modules ? Sorry I'm not familiar with Linux - thanks for your help. |
The compiled module file is called Alternatively you can copy it (or make a symlink) into /lib/modules/, find the subdirectory matching your kernel's version (as printed by The compiled iptables module also needs to go into |
OK thanks for that - I've now got past this issue (not quite sure how) and can see the sdp being modified but I'm getting no audio on the call inbound to RTPEngine (this used to work) and also still nothing on the outbound call. I looked at the final packet stats and it looks like media port 30010 and 30004 are both mapping to 192.1682.106:33044. I think that 30004 should map to 192.168.2.106:33488 given that 30005 maps to 192.168.2.106:33489, does that sound right or should two RTPEngine ports map to the same LyncIT port ? Thanks - I'm probably driving you nuts I'm sorry about that, there's just a lot of pressure to get it working. Aug 29 12:02:52 localhost rtpengine[27489]: [M2Y5M2NjMDI1MTYwMzkwZDQxYjgwZGQ0OGRhMzYwZWM.] Final packet stats: |
Well, normally they wouldn't, but what you have here (as before) is a call with apparently 3 participants. Since rtpengine expects (and only supports) 1-to-1 associations between endpoints, one of these participants is necessarily left hanging. It looks like you have the same situation as before, that the port reserved for one endpoint receives traffic from a different endpoint (overwriting the endpoint address received from signalling with the source address of the received packets), hence the duplicate addresses, while the other port doesn't receive anything. As mentioned before, I really don't know what the expected result here is. You can try using the "asymmetric" flag, which disables learning of endpoint addresses, but other than that I don't have any other suggestions. |
OK thanks for the explanation, I didn't realize it did the learning step as I was looking at your output (which I've copied below) from an earlier post but presumably this output is purely based on the signalling. I'll give the asymmetric flag a go. rtpengine[2555]: [asdgahahadfghadf] Closing call due to timeout Talking to two of my colleagues about the expected behaviour, they think that the port latching should be initiated by the signalling (which it is being above) and then the expected behaviour would be for the rtpengine to latch onto the source port of the first packet it receives (which is what the engine is doing) and while the engine receives packets from this source port it should disregard any packets from any other source port. When it considers that it's no longer receiving packets from the first source port (based on some sort of internal engine timeout) it should then re-learn and latch onto the source port of the next packet and remain latched to that source port whilst it was receiving packets from that source port. It should be able to decrypt all the packets in both (or any number) streams. I think that this would require the engine to hold a map of source port mapped to key (so it basically remembers all the keys it's ever been sent in the multiple 183s) so it could then decrypt both streams. Extracting the source port and looking up the key from the port -> key map should be quick (and certainly better than any logic which involved just trying the keys until one worked) so wouldn't have much overhead. Do you think this is feasible to implement ? Thanks for your help, it's much appreciated. |
I changed rtpproxy_manage("froc+Sp"); to rtpproxy_manage("faoc+Sp"); to switch on the asymmetric flag but I got the same result (details below) in that it didn't keep the source port from the sip message: Sep 1 11:22:15 localhost rtpengine[2500]: [MGNlYTQ5YTNlZDRkNDE5OWM4ZTQ3YzQ0NjY1M2M0OTE.] --- Tag '8156aa191', created 1:46 ago, in dialogue with 'b96ac65f' Have I set the flags incorrectly ? Thanks |
The "a" flag should be enough to disable learning of addresses, have you used it in all instances of rtpproxy_*() calls? You can verify it by looking at the offer/answer log lines of the rtpengine log, you should see the "asymmetric" flag there. As for your suggestion, it is my understanding that this would go against what the SRTP RFC describes. RFC 3711 says that the cryptographic context (which holds the master key and session key) is determined by the destination address and port of a received SRTP packet, not the source address/port. IOW, if your testing tool sends packets to a particular port and encrypts them using a key advertised by a different endpoint (and a different port) then that's probably a bug in the testing tool. The RFC also says that a trial & error approach in finding the correct decryption key must not be used. But leaving the SRTP issues aside, I'm not even sure which packets are supposed to be forwarded to where. Assuming that these endpoints were talking to each other directly, with no RTP proxy in between (and no SRTP involved, RTP only). Who ends up talking to who and on which ports? The endpoint making the original offer gets two answers from two different endpoints. Which one does it talk to? Which one does it send to and which one does it receive from? |
Thanks for you help Richard, I think we need to go back to the testing tool people and ask them what they expect us to do. Your comments have really helped us to clarify the issue so thanks again. |
Hi Richard, Looking at the log I think this might be because the SDP is included in the 200 OK to the INVITEs (even though it's already been negotiated) and the RTPEngine is adding "a=crypto:4" to the SDP in the 200 OK to the INVITE for call id 4d6e1349-adf8-4c7c-9ab0-d9ea647de41c - search the log for "ERJ20 - 200 OK reply to INVITE from LYNCIT" for the details of this particular message. I assume this should be "a=crypto:1" and this is the only difference I can see between the SDP in the 183 and the SDP in 200 OK to the INVITE. Sep 23 09:54:22 localhost /usr/local/sbin/kamailio[2519]: INFO: <script>: OUT_DIALOG_FROM_LYNCIT |
Yes you're right, this seems to be a bug. I'll investigate. |
However, I'm unable to reproduce this. If I replay the exact same signalling messages as I see in your log, the "answer" consistently comes back with a "crypto:1 ..." tag as it should, no matter how many times I repeat the answer. Does it always come back as "crypto:4" for you, or is the number changing or random? |
Thanks Richard, I got the latest source and it gives me "crypto:1..." so thanks for your help. I believe that the RTPEngine is now working for both call directions. The LyncIT testing tool is still failing the test but I think this is due to an issue with the tool itself so I'm closing this issue. |
Document `record-call` and `ICE` flag interaction
Hi All,
I'm trying to integrate Rtpengine into our MS Lync testing estate (so there's a MS Lync SIP client on the SRTP side and our internal media card on the RTP side) and have got to the point where there are RTP/RTCP (from the media card) and SRTP/SRTCP packets (from the MS Lync SIP client) going into the proxy (both IPV4) but no packets being sent by the proxy. Some info:
Our kamailio script entries:
Loadmodule “rtpproxy-ng.so”
Modparam(“rtpproxy-ng”, “rtpproxy_sock”, “udp:127.0.0.1:2223”)
request_route
For request from RTP to SRTP with sdp body
Rtpproxy_manage(“froc+Sp”)
onreply_route
rtpproxy_answer(“-sp”)
For request from SRTP to RTP with sdp body
Rtpproxy_manage(“froc-sp”)
Onreply_route
Rtpproxy_answer(“froc+Sp”)
Commands we run at startup:
service mysqld start
service kamailio start - - for rtpproxy
modprobe xt_MEDIAPROXY
iptables –I INPUT –p udp –d 10.1.0.127 –j MEDIAPROXY - -id 0
echo ‘del 0’ > /proc/mediaproxy/control
/usr/bin/rtpengine --table=0 - - listen-ng=127.0.0.1:2223 - - tos=184 --pidfile=/var/run/rtpengine.pid - - no-fallback
Service mysqld start
Service kamailio start
Any comments, suggestions etc.. as to why there are no packets being sent by the proxy would be welcome. I suspect it might be something obvious relating to config but if not then some pointers to places in the code I could add extra debugging would be helpful.
Thanks
Ed James
The text was updated successfully, but these errors were encountered: