This issue was moved to a discussion.

Is it possible to make calls with 2-Way Audio? #107

Closed
MajorMaxdom opened this issue Feb 13, 2023 · 27 comments
Labels
question Issue is asking a question

Comments

@MajorMaxdom

Hi, I tried calling the number that I configured for the SIP account in my Fritz!Box.
The call gets answered without a problem.

Now I can't seem to find out how to receive and transmit audio.
The code snippet for recording voice in #80 does not seem to work for me.

When I try to play audio using the code from the documentation, I only get noise on the phone.
So this doesn't seem to work for me either.

Is there a working script that allows me to transfer audio in both ways simultaneously?

Thanks in advance!

@jrozhon

jrozhon commented Feb 17, 2023

Hi,
it should be pretty straightforward. For the UAS (answering) side, there is an example in the documentation that looks similar to this:

import time
import wave

from pyVoIP.VoIP import CallState, InvalidStateError, VoIPPhone

# ACCOUNT DATA:
SIP_IP = "x.x.x.x"
SIP_Port = 5060
SIP_Username = "iptel402"
SIP_Password = "pass"
myIP = "x.x.x.x"


def answer(call):
    try:
        f = wave.open("./lala.wav", "rb")
        frames = f.getnframes()
        data = f.readframes(frames)
        f.close()

        call.answer()
        call.write_audio(
            data
        )  # This writes the audio data to the transmit buffer, this must be bytes.

        stop = time.time() + (
            frames / 8000
        )  # frames/8000 is the length of the audio in seconds. 8000 is the hertz of PCMU.

        while time.time() <= stop and call.state == CallState.ANSWERED:
            time.sleep(0.1)
        call.hangup()
    except InvalidStateError:
        pass
    except Exception:
        call.hangup()


if __name__ == "__main__":
    phone = VoIPPhone(
        SIP_IP, SIP_Port, SIP_Username, SIP_Password, bind_ip=myIP, call_callback=answer
    )
    phone.start()
    input("Press enter to disable the phone")
    phone.stop()

Now you need to do the same thing on the UAC (calling) side as you do in the answer callback. The callback operates on a VoIPCall instance, so you just need to get that instance for the UAC as well.

Luckily, the call() method actually returns the call object so you can do something like this:

def send_audio(call):
    try:
        f = wave.open("./lala.wav", "rb")
        frames = f.getnframes()
        data = f.readframes(frames)
        f.close()

        call.write_audio(
            data
        )  # This writes the audio data to the transmit buffer, this must be bytes.

        stop = time.time() + (
            frames / 8000
        )  # frames/8000 is the length of the audio in seconds. 8000 is the hertz of PCMU.

        while time.time() <= stop and call.state == CallState.ANSWERED:
            time.sleep(0.1)
        call.hangup()
    except InvalidStateError:
        pass
    except Exception:
        call.hangup()


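if __name__ == "__main__":
    # Reuses the imports, account data and answer() callback from the previous example.
    phone = VoIPPhone(
        SIP_IP, SIP_Port, SIP_Username, SIP_Password, bind_ip=myIP, call_callback=answer
    )
    phone.start()
    call = phone.call("123456789")
    # Wait until the remote side answers; otherwise send_audio() hangs up right away.
    while call.state != CallState.ANSWERED:
        time.sleep(0.1)
    send_audio(call)
    input("Press enter to disable the phone")
    phone.stop()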

I haven't tested the code as I am a bit short on time right now, but it should work, perhaps with some minor tweaks.

Regards, J
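
One possible cause of the noise reported above is a WAV file in a different format than the examples expect: they treat the data as 8-bit mono audio at 8000 Hz (note the `frames / 8000` duration calculation). The following is a minimal sketch, using only the standard `wave` module, that checks a file before its frames are passed to `write_audio`. The helper name `check_wav_for_pcmu` is made up for illustration and is not part of pyVoIP.

```python
import wave


def check_wav_for_pcmu(path):
    """Return a list of format problems; an empty list means the file looks usable."""
    problems = []
    with wave.open(path, "rb") as f:
        if f.getnchannels() != 1:
            problems.append(f"expected mono, got {f.getnchannels()} channels")
        if f.getsampwidth() != 1:
            problems.append(f"expected 8-bit samples, got {8 * f.getsampwidth()}-bit")
        if f.getframerate() != 8000:
            problems.append(f"expected 8000 Hz, got {f.getframerate()} Hz")
    return problems
```

If the list is not empty, converting the file to 8-bit, 8000 Hz, mono with an audio tool before sending it is worth trying.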

@ferugit

ferugit commented Feb 22, 2023

@jrozhon that example only sends audio

@ferugit

ferugit commented Feb 22, 2023

@MajorMaxdom

I am trying something like this:

while (call.state != CallState.ENDED) and (chunk_index <= n_samples - chunk_size):
    call.write_audio(audio_array[chunk_index:chunk_index + chunk_size])
    incoming_audio = call.read_audio(length=1024, blocking=False)
    print(incoming_audio)
    # Only store buffers that contain something other than 0x80 bytes.
    if incoming_audio != b"\x80" * len(incoming_audio):
        w.writeframes(incoming_audio)

    chunk_index += chunk_size
    time.sleep(0.1)

But this is not working for me at the moment :/
I am receiving only b"\x80"
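
For context, `0x80` is the midpoint of unsigned 8-bit PCM, so a buffer made up only of `b"\x80"` bytes is pure silence, which suggests that no audio has arrived rather than that the decoding is wrong. A small sketch of two helpers for inspecting received buffers; both names are made up for illustration:

```python
SILENCE_BYTE = 0x80  # midpoint of unsigned 8-bit PCM, i.e. zero amplitude


def is_silence(frame: bytes) -> bool:
    """True if the frame is empty or consists only of 0x80 bytes."""
    return frame.count(SILENCE_BYTE) == len(frame)


def peak_amplitude(frame: bytes) -> int:
    """Largest distance of any sample from the silence midpoint (0 to 128)."""
    return max((abs(sample - SILENCE_BYTE) for sample in frame), default=0)
```

Logging `peak_amplitude` of each buffer instead of printing the raw bytes makes it easier to see whether anything other than silence is coming in.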

@jrozhon

jrozhon commented Feb 22, 2023

Hi, @ferugit. My example was meant only to send audio from both ends - UAC and UAS. You also want to do something with the incoming audio?

@ferugit

ferugit commented Feb 22, 2023

> Hi, @ferugit. My example was meant only to send audio from both ends - UAC and UAS. You also want to do something with the incoming audio?

Hello, yes I think @MajorMaxdom is asking how to send and receive audio simultaneously

@jrozhon

jrozhon commented Feb 22, 2023

Oh, ok, my bad then. I thought the goal was to

> ...that allows me to transfer audio in both ways simultaneously

What is it you want to achieve with the incoming audio? Just store it in a file?

@MajorMaxdom

Hi everyone,

sorry for not providing a more precise description.

I would like to write a script that both sends and receives audio simultaneously, like @ferugit said.
Like a normal phone call, with direct audio output through some audio device.

The background for my idea:
I bought an old German telephone (FeTap 611) and want to connect its handset to a Raspberry Pi and use a script to make calls. The microphone should be used for audio input and the speaker for audio output.
I don't want to use the old boards from the phone, just the handset.
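
A rough sketch of how such a full-duplex bridge could be structured: one thread pushes microphone frames into the call while another pulls received frames out to the speaker. `read_mic`, `play_speaker` and `run_full_duplex` are hypothetical names standing in for whatever audio library drives the handset, and `read_mic` is assumed to block for roughly one frame. Only `write_audio` and `read_audio(length, blocking)` are taken from the snippets in this thread.

```python
import threading


def run_full_duplex(call, read_mic, play_speaker, is_active, frame_bytes=160):
    """Pump audio in both directions until is_active() returns False.

    call         -- object with write_audio(bytes) and read_audio(length, blocking)
    read_mic     -- hypothetical callable returning frame_bytes of 8-bit PCM
    play_speaker -- hypothetical callable accepting one frame of bytes
    is_active    -- callable that returns True while the call is answered
    """

    def uplink():
        # Microphone -> call
        while is_active():
            call.write_audio(read_mic(frame_bytes))

    def downlink():
        # Call -> speaker
        while is_active():
            play_speaker(call.read_audio(frame_bytes, True))

    threads = [
        threading.Thread(target=uplink, daemon=True),
        threading.Thread(target=downlink, daemon=True),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
```

With a real call, `is_active` could be something like `lambda: call.state == CallState.ANSWERED`. Whether a blocking read returns promptly once the call ends is not verified here.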

@ferugit

ferugit commented Feb 22, 2023

@MajorMaxdom

I already solved this. In my case I needed to use a single socket for RTP communication. In the RTP.py file I modified the following lines:

    def start(self) -> None:
        self.sin = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        #self.sout = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self.sin.bind((self.inIP, self.inPort))
        self.sin.setblocking(False)
        self.sout = self.sin

Just configuring this, I was able to simultaneously send and receive audio. Hope it helps you!
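
The idea behind the change can be shown without pyVoIP at all. Below is a minimal loopback sketch using only the standard `socket` module: the "client" sends and receives on one bound UDP socket, so a peer that replies to the source address it observed reaches that same socket. The helper name and the `client`/`server` roles are illustrative assumptions, not library code.

```python
import socket


def make_udp_socket(bind_addr=("127.0.0.1", 0)):
    """Create a UDP socket bound to bind_addr (port 0 = let the OS pick)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(bind_addr)
    sock.settimeout(1.0)  # avoid hanging forever if a datagram is lost
    return sock


client = make_udp_socket()  # plays the role of sin == sout
server = make_udp_socket()  # plays the role of the remote RTP endpoint
client_addr = client.getsockname()

# The client sends from the same socket it listens on ...
client.sendto(b"uplink", server.getsockname())
data, source = server.recvfrom(2048)

# ... so a reply to the observed source address arrives on that same socket.
server.sendto(b"downlink", source)
reply, _ = client.recvfrom(2048)

client.close()
server.close()
```

With two separate sockets, `source` would instead be the address of the unbound sending socket, not the one the client is listening on.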

@jrozhon

jrozhon commented Feb 22, 2023

I am not sure this is a good idea, as sending might potentially block receiving. At least that is my understanding. I have not tried to store the audio yet, so I might be off here, but I know I can process DTMF, which is carried by RTP as well.

I would consider using the RTPPacketManager class and its read method for this task, or maybe even better the RTPClient read method, which effectively calls pmin.read() to get the data from the buffer. This way you would be on the safe side.

@plugnburn

plugnburn commented Feb 23, 2023

> @MajorMaxdom
>
> I already solved this. In my case I needed to use a single socket for RTP communication. In the RTP.py file I modified the following lines:
>
>     def start(self) -> None:
>         self.sin = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
>         #self.sout = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
>         self.sin.bind((self.inIP, self.inPort))
>         self.sin.setblocking(False)
>         self.sout = self.sin
>
> Just configuring this, I was able to simultaneously send and receive audio. Hope it helps you!

Confirming, helped me in my case as well, both read_audio and write_audio work correctly now. Please accept this change in upstream.

@jrozhon

jrozhon commented Feb 23, 2023

I have taken a look at the problem and tried to send and receive audio myself. To be honest, it works for me without any issue. I am using one WAV file to send audio to Asterisk and then storing the incoming audio (which is just an echo of the first) in a second WAV file, and they are the same.

So, sorry if I am missing something obvious, and you can of course use whatever you want, but in case you want to stick with the way the library works, this code works for me.

J

import time
import wave
from io import BytesIO
from threading import Thread

from pyVoIP.VoIP import CallState, InvalidStateError, VoIPPhone

# ACCOUNT DATA:
SIP_IP = "x.x.x.x"
SIP_Port = 5060
SIP_Username = "xxx"
SIP_Password = "pass"
myIP = "x.x.x.x"


def answer(call):
    try:
        f = wave.open("./lala.wav", "rb")
        frames = f.getnframes()
        data = f.readframes(frames)
        f.close()

        call.answer()
        call.write_audio(
            data
        )  # This writes the audio data to the transmit buffer, this must be bytes.

        stop = time.time() + (
            frames / 8000
        )  # frames/8000 is the length of the audio in seconds. 8000 is the hertz of PCMU.

        while time.time() <= stop and call.state == CallState.ANSWERED:
            time.sleep(0.1)
        call.hangup()
    except InvalidStateError:
        pass
    except Exception:
        call.hangup()


def send_audio(call):
    b = BytesIO()
    try:
        f = wave.open("./lala.wav", "rb")
        frames = f.getnframes()
        data = f.readframes(frames)
        f.close()

        # call.answer()
        call.write_audio(
            data
        )  # This writes the audio data to the transmit buffer, this must be bytes.
        print("y")

        stop = time.time() + (
            frames / 8000
        )  # frames/8000 is the length of the audio in seconds. 8000 is the hertz of PCMU.

        print(stop, time.time(), call.state)
        while time.time() <= stop and call.state == CallState.ANSWERED:
            print("x")
            b.write(call.read_audio())  # store the incoming data to buffer
            time.sleep(0.01)  # sleep well under the 20 ms packet interval so no packets are missed
        call.hangup()
    except InvalidStateError:
        pass
    except Exception:
        call.hangup()
    finally:
        # store the received audio
        with wave.open("test.wav", "wb") as f:
            f.setnchannels(1)  # mono
            f.setsampwidth(1)  # 8 bit
            f.setframerate(8000)
            f.writeframes(b.getvalue())


if __name__ == "__main__":
    phone = VoIPPhone(
        SIP_IP, SIP_Port, SIP_Username, SIP_Password, bind_ip=myIP, call_callback=answer
    )
    phone.start()
    call = phone.call("999")
    while call.state != CallState.ANSWERED:
        time.sleep(0.1)
    send_audio(call)

    input("Press enter to disable the phone")
    phone.stop()

@plugnburn

plugnburn commented Feb 24, 2023

> I have taken a look at the problem and tried to send and receive the audio myself. To be honest, it works for me without any issue. I am using one wav to send audio to asterisk and then store the incoming audio (which is just an echo of the first) to second wav and they are the same.
>
> So, sorry if I am missing something obvious and you can of course use what you want, but in case you want to stick with the way the library works, this code works for me.

Try implementing a simple echo test (with read_audio and write_audio in the same function) from behind a NAT and you'll see what the issue is.

@jrozhon

jrozhon commented Feb 24, 2023

Oh, I guess I FINALLY got what you are trying to achieve. Not sending the whole audio at once but rather small chunks as they come from whatever source, correct?
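
For the chunked approach, the arithmetic is simple: at 8000 samples per second and one byte per sample, a 20 ms packet interval corresponds to 160 bytes. A minimal sketch of a generator that cuts a byte string into such frames and pads the last one with silence; the names are illustrative and the 20 ms figure is the packet interval mentioned in the code comments earlier in the thread:

```python
SAMPLE_RATE = 8000  # samples per second, as used in the examples in this thread
FRAME_MS = 20       # assumed packet interval in milliseconds
FRAME_BYTES = SAMPLE_RATE * FRAME_MS // 1000  # 160 bytes of 8-bit audio


def iter_frames(data: bytes, frame_bytes: int = FRAME_BYTES, pad: int = 0x80):
    """Yield fixed-size frames, padding the last one with silence (0x80)."""
    for start in range(0, len(data), frame_bytes):
        frame = data[start:start + frame_bytes]
        if len(frame) < frame_bytes:
            frame += bytes([pad]) * (frame_bytes - len(frame))
        yield frame
```

Each frame could then be handed to `call.write_audio(frame)` as it becomes available from the audio source.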

@plugnburn

> Oh, I guess I FINALLY got what you are trying to achieve. Not sending the whole audio at once but rather small chunks as they come from whatever source, correct?

Yes but it doesn't matter in this case. What matters is creating two sockets to the same IP and port and using them at the same time. I'm not sure whose side the issue is on but this clearly can become a problem (and it was at least for 3 people here).
Is there any reason not to reuse the same socket for the actual UDP transmission?

@jrozhon

jrozhon commented Feb 24, 2023

Sure there is: blocking. I am definitely no expert here, but mixing these together is just asking for trouble. It will probably work most of the time, but every now and then you will see unexpected behavior. But take this with a grain of salt, as all this socket programming is something I never really dived into.

Another thing to keep in mind is that the behavior of TCP, or even TLS, which is on the roadmap, will be completely different.

And what is bugging me the most: it is obvious that the sockets are not the issue here, as I can send and receive audio without any problem, and in the discussion linked by the OP they used the same approach and it worked. What I think is happening here is just a sync issue: you generate audio at a different pace than you send it, which can lead to choppiness and other unwanted artifacts in the speech/sound.

Could any of you just post your code and describe the issue in greater detail? I am really curious about this and obviously not fully getting the stuff you guys see as obvious (sorry about that, btw :-)).

@plugnburn

plugnburn commented Feb 24, 2023

> Could any of you just post your code and describe the issue in greater detail? I am really curious about this and obviously not fully getting the stuff you guys see as obvious (sorry about that, btw :-)).

Could you try a simple echo test? Here's a snippet from one of my modules (simplified and stripped of unnecessary details):

# tts_to_buf, config, emptybuf and audio_buf_len are defined elsewhere in the module
call_obj.write_audio(tts_to_buf('Entering the echo test', config['tts']))
audiobuf = None
while audiobuf != emptybuf:
    audiobuf = call_obj.read_audio(audio_buf_len, False)  # nonblocking read (to flush previous data)
while call_obj.state == CallState.ANSWERED:  # main event loop
    audiobuf = call_obj.read_audio(audio_buf_len, True)  # blocking audio buffer read
    call_obj.write_audio(audiobuf)  # echo the audio

In the unpatched version, audiobuf is always empty (all 0x80 bytes) with the nonblocking read, and the blocking read never returns.
In the patched version, everything works correctly.

@jrozhon

jrozhon commented Feb 24, 2023

This is my analog:

def answer(call):
    try:

        call.answer()

        while call.state == CallState.ANSWERED:
            call.write_audio(call.read_audio())
            time.sleep(0.01)

        call.hangup()
    except InvalidStateError:
        pass
    except Exception:
        call.hangup()

And this is the waveform from the network:
[image: waveform]

Sound is ok.

For some reason, I was unable to implement it with a BytesIO buffer, as the truncation did not work, but the call is OK as is.

One "issue" is that the source port is different from the port advertised in SDP, but that happens frequently in VoIP, and the PBX/proxy should have a means to handle it.

Regards, J

@plugnburn

plugnburn commented Feb 25, 2023

> This is my analog:

In my case, read_audio just doesn't return any data unless I make it use the same socket as write_audio does (more precisely, making sout the same as sin). Same happens to @ferugit. Why is it so hard to understand?

@jrozhon

jrozhon commented Feb 25, 2023

Oh, no, I understand that it does not work for you and the other guys. I just don't understand why it works for me. Perhaps there is an issue elsewhere in your setup? Anyway, feel free to ignore me and stick to your solution if it is good enough for you. I am just curious what is going on in this case. That is it.

Regards, J

@plugnburn

plugnburn commented Feb 25, 2023

> Oh, no, I understand that it does not work for you and the other guys. I just don't understand why it works for me. Perhaps there is an issue elsewhere in your setup?

I have no issues in my setup, just a regular Linux PC behind a NAT. Using a single socket is totally fine. See this answer, for example: https://stackoverflow.com/questions/15794271/maintaining-a-bidirectional-udp-connection

Why would the SIP provider be able to send the datagram to another socket if the client, that is behind a NAT, didn't send a single packet in its direction through that socket, only through the first one? See that answer:

> Also, if the client is placed behind a NAT, it is required for the hole punching to work correctly. Even though you bind to the same IP and port on the client, you are not guaranteed to get the same mapping in the NAT. Thus, the server might not be able to reach the client.

This is exactly what's happening here.

> Anyway, feel free to ignore me and stick to your solution if it is good enough for you.

I'd like this change to be in the upstream to not have to redistribute the patched .whl of 1.6.4 along with the upcoming IVR framework I'm writing. In the current state, pyVoIP is unusable for anyone who just connects to a public SIP provider from any home or corporate network.

@jrozhon

jrozhon commented Feb 25, 2023

Look, I have a feeling that you think I am arguing that your approach is not good. This is really not the case. I am just saying that we should be cautious here, as things can go wrong easily, and that I am able to receive audio without any issue and my echo app works perfectly fine.

As for the current state, it is perfectly consistent with the RFC, as SDP only specifies the port number for receiving, see https://www.rfc-editor.org/rfc/rfc4566#section-5.14. Btw, look at the SIP implementation by Cisco just for fun.

Yes, most clients go this way, sending and receiving audio through the same socket. No issue here. But there is a ton of blocking in this library, and that can potentially break things to pieces.

No argument with you, just using caution for as long as the author doesn't say "hey, this is a good idea".
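
To illustrate the point about SDP: a media description line carries the port on which the party sending the SDP wants to receive media, and says nothing about the port it will send from. Below is a minimal parser sketch for the `m=` line format from RFC 4566 section 5.14 (`m=<media> <port> <proto> <fmt> ...`); the function name is made up for illustration.

```python
def parse_media_line(line: str) -> dict:
    """Parse an SDP media description such as 'm=audio 49170 RTP/AVP 0 8'."""
    if not line.startswith("m="):
        raise ValueError("not an SDP media line")
    media, port, proto, *formats = line[2:].split()
    # The port field may also be written as '<port>/<number of ports>'.
    port_number = int(port.split("/")[0])
    return {
        "media": media,
        "port": port_number,  # where the sender of this SDP wants to RECEIVE
        "proto": proto,
        "formats": formats,   # RTP payload types, e.g. 0 = PCMU, 8 = PCMA
    }
```

For `m=audio 49170 RTP/AVP 0 8`, the parsed port is 49170, and there is no field for a sending port anywhere in the line.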

@plugnburn

OK, let's wait for the author then. I'm just trying to point out that 1) such libraries must be tested with real VoIP providers first-hand in the most realistic scenario (you're in the home WLAN, the SIP server is out there far away on the Internet), 2) in the current state of the library, when the client is behind NAT (and this is the case, like, for the absolute majority of world's SIP devices), it will not get any audio data from the server on this separate socket it creates for reading. Why? Because this data will be sent to the socket for writing, the one that punched the UDP hole in the NAT router(s).
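
The NAT argument can be illustrated with a toy model. This is a heavily simplified sketch, not a description of any real router: it assumes the NAT forwards inbound packets only through mappings created by outbound traffic, and that the server replies to the source address it observed. All names, addresses and ports are made up.

```python
class ToyNat:
    """Very simplified NAT: inbound packets pass only through existing mappings."""

    def __init__(self, first_public_port=40000):
        self._next_port = first_public_port
        self._by_internal = {}  # (ip, port) -> public port
        self._by_public = {}    # public port -> (ip, port)

    def outbound(self, internal_addr):
        """Record an outgoing packet and return the public port it appears from."""
        if internal_addr not in self._by_internal:
            self._by_internal[internal_addr] = self._next_port
            self._by_public[self._next_port] = internal_addr
            self._next_port += 1
        return self._by_internal[internal_addr]

    def inbound(self, public_port):
        """Return the internal address a packet is forwarded to, or None if dropped."""
        return self._by_public.get(public_port)


nat = ToyNat()
recv_socket = ("192.168.1.10", 10000)  # bound and advertised in SDP, never sends
send_socket = ("192.168.1.10", 53211)  # separate socket used only for sending

seen_by_server = nat.outbound(send_socket)  # RTP leaves through the send socket
delivered_to = nat.inbound(seen_by_server)  # server replies to the observed source
dropped = nat.inbound(10000)                # nothing ever opened this port
```

In this model `delivered_to` is the send socket and `dropped` is `None`, so the receive socket never sees a packet. Real NATs differ in port allocation and filtering, and features such as an ALG can change the outcome.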

@jrozhon

jrozhon commented Feb 25, 2023

See, we just moved from not working at all to not working under some conditions. Believe me, it is very common in the telco industry to use different ports for sending and receiving. This is why I pointed to Cisco, which is the most obvious example, and yes, their devices are pretty common in providers' environments. It is just not that simple. But in general, I would also welcome it if the library used the same socket for both sending and receiving.

If you are interested in NAT and how to handle it, just take a look at STUN, TURN and ICE. Another thing that is commonly used is an ALG in your router. Having symmetric communication unfortunately does not guarantee that you will get through NAT. It just makes it a bit more likely.

One side note: this implementation is far from a complete SIP client. If you are planning to use it to this extent, I would suggest using a more advanced implementation, such as linphone or pjsip/pjsua. They have all this yummy stuff covered, but are far more complex and written in C, which is why I abandoned any further attempts at implementing a client using them. But I just need a simple call handled by an API.

Regards, J

@plugnburn

If I wanted to use PJSIP, I wouldn't be here. This pyVoIP library is wonderful, as it offers just the right level of abstraction over the protocol details: it allows keeping the IVR kernel under 250 SLOC of Python code (including in-band DTMF detection, interaction with external TTS engines, action API and so on). It's a perfect framework to create a framework. It's just this minor detail with the two UDP sockets that makes one wonder how it was tested in real conditions (connecting to a remote SIP server from a home network or a Docker container or even both at the same time).
And yes, as long as we connect to a public VoIP provider to receive PSTN-to-SIP calls from its server, we shouldn't need to use STUN/TURN at all.

@jrozhon

jrozhon commented Feb 25, 2023

Ok, it was just a suggestion. Only trying to help.

We definitely agree that this is a wonderful library.

Regards, J

@plugnburn

plugnburn commented Feb 26, 2023

Until this is solved, the package of pyVoIP 1.6.4 with the patched RTP.py is available here.

@PraveenChordia

> @MajorMaxdom
>
> I already solved this. In my case I needed to use a single socket for RTP communication. In the RTP.py file I modified the following lines:
>
>     def start(self) -> None:
>         self.sin = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
>         #self.sout = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
>         self.sin.bind((self.inIP, self.inPort))
>         self.sin.setblocking(False)
>         self.sout = self.sin
>
> Just configuring this, I was able to simultaneously send and receive audio. Hope it helps you!

This worked for me, thanks.

tayler6000 added the question label May 9, 2023
Repository owner locked and limited conversation to collaborators May 9, 2023
tayler6000 converted this issue into discussion #135 May 9, 2023


6 participants