Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Outbound connection write timeout after capturing DTMF digits #20

Open
IvanRibakov opened this issue Jul 14, 2022 · 5 comments
Open

Outbound connection write timeout after capturing DTMF digits #20

IvanRibakov opened this issue Jul 14, 2022 · 5 comments

Comments

@IvanRibakov
Copy link

Hi, thanks for sharing this neat library with the community.

I've used it to create a small PoC that generates IVR menu dynamically instead of using fixed XML files, but when I got to the part where I needed to capture callers menu response I ran into an issue with the underlying TCP connection:

time="2022-07-14T07:14:32Z" level=info msg="[1] Answer the call"
time="2022-07-14T07:14:32Z" level=info msg="[2] Start DTMF listener"
time="2022-07-14T07:14:32Z" level=info msg="[3] Play IVR menu"
time="2022-07-14T07:14:32Z" level=info msg="[5] Capture DTMF events"
time="2022-07-14T07:14:48Z" level=info msg="Received digit: 2"
time="2022-07-14T07:14:48Z" level=error msg="Error transferring the call!"
time="2022-07-14T07:14:48Z" level=error msg="write tcp 10.22.1.201:8084->10.22.1.200:52490: i/o timeout"

Here is how DTMF digit is captured:

func (ch *CallHandler) captureDTMF(ctx context.Context, conn *eslgo.Conn) (byte, error) {
	digit, err := conn.WaitForDTMF(ctx, ch.uuid)
	if err != nil {
		ch.logger.Error("Error waiting for DTMF signal")
		return 0, err
	}
	ch.logger.Info("Received digit: " + string(digit))

	return digit, nil
}

Timeout happens on the next ESL command I try to executed after receiving the digit. Not sure if it matters, but the next command is this:

func (ch *CallHandler) transferCall(ctx context.Context, conn *eslgo.Conn, digit byte) error {
	_, err := conn.SendCommand(ctx, &call.Execute{UUID: ch.uuid, AppName: "deflect", AppArgs: "sip:eureka@" + ch.outboundDestIp, Sync: true, SyncPri: true})
	if err != nil {
		ch.logger.Error("Error transferring the call!")
		ch.logger.Error(err)
		return err
	}
	return nil
}

The deflect commmand works as expected if I comment out the "capture DTMF digits" logic, which makes me think that capturing the DTMF digit somehow affects the connection.

Any ideas what might be going wrong here or if I'm missing something important?

@IvanRibakov
Copy link
Author

[Update]

I've tried commenting out these lines of code:

eslgo/connection.go

Lines 123 to 125 in 6999297

if deadline, ok := ctx.Deadline(); ok {
_ = c.conn.SetWriteDeadline(deadline)
}

and that removed the the TCP write i/o timeout error.

Obviously I'm not claiming this to be a fix, so I still wonder if there is something I have done wrong that makes the code hit the deadline.

@IvanRibakov
Copy link
Author

With some further debugging I realised that I'm reusing the same context provided to the ListenAndServe connection handler which comes with 5 second timeout. All other commands are sent within 5seconds, but since DTMF digits are likely sent after the IVR menu has played all the sounds, the next command will fail.

I do wonder though what is the semantical meaning of the context passed to handleConnection? If it's meant to be a parent context for the entire connection, then why put a 5 second deadline on it?

This is where a more detailed example could really help newcommers.

I wonder if this is also what caused the other issue #13

@winsock
Copy link
Contributor

winsock commented Jul 14, 2022

Hi @IvanRibakov

Thanks for the report! Are you able to post your entire handleConnection function? The context passed into there is indeed the running context which by default is just context.Background() but cancelable.

The 5 second timeout is for sending the connect command to FreeSWITCH once the socket is opened. If FreeSWITCH doesn't respond we'll close the connection and cancel the running context.

c.logger.Warn("Error connecting to %s error %s", c.conn.RemoteAddr().String(), err.Error())

@IvanRibakov
Copy link
Author

IvanRibakov commented Jul 14, 2022

Are you able to post your entire handleConnection function?

Sure, here it is:

handleConnection source
func (s *Server) Start() error {
	var serverOpts = eslgo.DefaultOutboundOptions
	serverOpts.Logger = log.NewEslgoLogAdapter(s.logger)

	// Start listening, this is a blocking function
	err := serverOpts.ListenAndServe(":8084", s.handleConnection)
	return err
}

func (s *Server) handleConnection(ctx context.Context, conn *eslgo.Conn, response *eslgo.RawResponse) {
	s.logger.Info("New connection!")
	s.logger.Tracef("%#v", response)

	err := conn.EnableEvents(ctx)
	if err != nil {
		s.logger.Error("Error enabling events for the outbound connection!")
		s.logger.Error(err)
	}

	var uuid = response.ChannelUUID()
	s.logger.Debugf("Channel UUID: %s", uuid)

	ch := NewCallHandler(uuid, s.inboundDestPattern, s.outboundDestIp, s.logger)
	err = ch.Handle(ctx, conn, response)
	if err != nil {
		s.logger.Error(err)
	}

	// TODO do we really need this sleep?
	time.Sleep(10 * time.Second)

	conn.ExitAndClose()
	s.logger.Info("The end")
}

The control is then passed to the CallHandler, but it does more of the same - passes the context from the h.Handle(ctx, conn, response) to the SendCommands internally.

CallHandler source
package esl

import (
	"context"
	"fmt"
	"regexp"
	"time"

	"github.com/percipia/eslgo"
	"github.com/percipia/eslgo/command/call"
	"github.com/sirupsen/logrus"
)

type CallHandler struct {
	logger             *logrus.Logger
	inboundDestPattern *regexp.Regexp
	outboundDestIp     string
	uuid               string
}

func NewCallHandler(uuid string, inboundDestPattern *regexp.Regexp, outboundDestIp string, logger *logrus.Logger) CallHandler {
	return CallHandler{
		uuid:               uuid,
		inboundDestPattern: inboundDestPattern,
		outboundDestIp:     outboundDestIp,
		logger:             logger}
}

func (ch *CallHandler) Handle(ctx context.Context, conn *eslgo.Conn, response *eslgo.RawResponse) error {
	var username = response.GetHeader("Caller-Destination-Number")
	ch.logger.Debugf("Caller-Destination-Number: %s", username)

	ch.logger.Info("[1] Answer the call")
	err := conn.AnswerCall(ctx, ch.uuid)
	if err != nil {
		ch.logger.Error("Error answering the call")
		return err
	}

	if ch.inboundDestPattern.Match([]byte(username)) {
		ch.imrDemo(ctx, conn, response)
	} else {
		ch.logger.Info("[2] Transfer the call")
		ch.transferCall(ctx, conn, "")
	}

	return nil
}

func (ch *CallHandler) imrDemo(ctx context.Context, conn *eslgo.Conn, response *eslgo.RawResponse) error {

	ch.logger.Info("[2] Start DTMF listener")
	_, err := conn.SendCommand(ctx, &call.Execute{UUID: ch.uuid, AppName: "start_dtmf"})
	if err != nil {
		ch.logger.Error("Error starting the DTMF listener")
		return err
	}

	ch.logger.Info("[3] Play IVR menu")
	ch.playIvrMenu(ctx, conn)

	ch.logger.Info("[4] Capture DTMF events")
	digit, err := ch.captureDTMF(ctx, conn)
	if err != nil {
		return err
	}

	ch.logger.Info("[5] Transfer the call")
	ch.transferCall(ctx, conn, digit)

	ch.logger.Info("[6] End of IVR demo")
	return nil
}

func (ch *CallHandler) transferCall(ctx context.Context, conn *eslgo.Conn, digit string) error {

	username := "other"
	if digit == "1" {
		username = "sales"
	} else if digit == "2" {
		username = "support"
	}

	target := fmt.Sprintf("sip:%s@%s", username, ch.outboundDestIp)
	ch.logger.Info("Transferring to " + target)

	// Since more than 5 seconds have likely passed since creation of the connection, let's create
	// a separate context with it's own deadline to avoid hitting connection context's 5sec deadline
	deadline := time.Now().Add(1500 * time.Millisecond)
	transferCtx, cancelCtx := context.WithDeadline(ctx, deadline)
	defer cancelCtx()
	_, err := conn.SendCommand(transferCtx, &call.Execute{UUID: ch.uuid, AppName: "deflect", AppArgs: target, Sync: true})
	if err != nil {
		ch.logger.Error("Error transferring the call!")
		ch.logger.Error(err)
		return err
	}
	return nil
}

func (ch *CallHandler) captureDTMF(ctx context.Context, conn *eslgo.Conn) (string, error) {
	digit, err := conn.WaitForDTMF(ctx, ch.uuid)
	if err != nil {
		ch.logger.Error("Error waiting for DTMF signal")
		return "", err
	}
	res := string(digit)
	ch.logger.Info("Received digit: " + res)

	return res, nil
}

func (ch *CallHandler) playIvrMenu(ctx context.Context, conn *eslgo.Conn) error {
	_, err := conn.Playback(ctx, ch.uuid, "ivr/8000/ivr-thank_you_for_calling.wav", 1, true)
	if err != nil {
		ch.logger.Error("Error playing back the audio!")
		return err
	}
	_, err = conn.SendCommand(ctx, &call.Execute{UUID: ch.uuid, AppName: "sleep", AppArgs: "500", Sync: true})
	if err != nil {
		ch.logger.Error("Error sleeping!")
		return err
	}
	_, err = conn.Playback(ctx, ch.uuid, "ivr/8000/ivr-please.wav", 1, true)
	if err != nil {
		ch.logger.Error("Error playing back the audio!")
		return err
	}
	_, err = conn.Playback(ctx, ch.uuid, "base256/8000/select.wav", 1, true)
	if err != nil {
		ch.logger.Error("Error playing back the audio!")
		return err
	}
	_, err = conn.SendCommand(ctx, &call.Execute{UUID: ch.uuid, AppName: "say", AppArgs: "en number pronounced 1", Sync: true})
	if err != nil {
		ch.logger.Error("Error saying number!")
		return err
	}
	_, err = conn.Playback(ctx, ch.uuid, "ivr/8000/ivr-option.wav", 1, true)
	if err != nil {
		ch.logger.Error("Error playing back the audio!")
		return err
	}
	_, err = conn.SendCommand(ctx, &call.Execute{UUID: ch.uuid, AppName: "sleep", AppArgs: "500", Sync: true})
	if err != nil {
		ch.logger.Error("Error sleeping!")
		return err
	}
	_, err = conn.Playback(ctx, ch.uuid, "ivr/8000/ivr-for.wav", 1, true)
	if err != nil {
		ch.logger.Error("Error playing back the audio!")
		return err
	}
	_, err = conn.Playback(ctx, ch.uuid, "ivr/8000/ivr-sales.wav", 1, true)
	if err != nil {
		ch.logger.Error("Error playing back the audio!")
		return err
	}
	_, err = conn.Playback(ctx, ch.uuid, "voicemail/8000/vm-press.wav", 1, true)
	if err != nil {
		ch.logger.Error("Error playing back the audio!")
		return err
	}
	_, err = conn.SendCommand(ctx, &call.Execute{UUID: ch.uuid, AppName: "say", AppArgs: "en number pronounced 1", Sync: true})
	if err != nil {
		ch.logger.Error("Error saying number!")
		return err
	}
	_, err = conn.SendCommand(ctx, &call.Execute{UUID: ch.uuid, AppName: "sleep", AppArgs: "500", Sync: true})
	if err != nil {
		ch.logger.Error("Error sleeping!")
		return err
	}
	_, err = conn.Playback(ctx, ch.uuid, "ivr/8000/ivr-for.wav", 1, true)
	if err != nil {
		ch.logger.Error("Error playing back the audio!")
		return err
	}
	_, err = conn.Playback(ctx, ch.uuid, "ivr/8000/ivr-technical_support.wav", 1, true)
	if err != nil {
		ch.logger.Error("Error playing back the audio!")
		return err
	}
	_, err = conn.Playback(ctx, ch.uuid, "voicemail/8000/vm-press.wav", 1, true)
	if err != nil {
		ch.logger.Error("Error playing back the audio!")
		return err
	}
	_, err = conn.SendCommand(ctx, &call.Execute{UUID: ch.uuid, AppName: "say", AppArgs: "en number pronounced 2", Sync: true})
	if err != nil {
		ch.logger.Error("Error saying number!")
		return err
	}
	_, err = conn.SendCommand(ctx, &call.Execute{UUID: ch.uuid, AppName: "sleep", AppArgs: "500", Sync: true})
	if err != nil {
		ch.logger.Error("Error sleeping!")
		return err
	}
	_, err = conn.Playback(ctx, ch.uuid, "ivr/8000/ivr-for_other_options.wav", 1, true)
	if err != nil {
		ch.logger.Error("Error playing back the audio!")
		return err
	}
	_, err = conn.Playback(ctx, ch.uuid, "ivr/8000/ivr-any_other_digit.wav@@2000", 1, true)
	if err != nil {
		ch.logger.Error("Error playing back the audio!")
		return err
	}

	return nil
}

The 5 second timeout is for sending the connect command to FreeSWITCH once the socket is opened. If FreeSWITCH doesn't respond we'll close the connection and cancel the running context.

I understand the intention there, but I don't know how else to explain the behaviour I've been seeing.

@winsock
Copy link
Contributor

winsock commented Jul 14, 2022

Thanks for sending over this additional information, I'll see if I can spot the issue going on. It may take me a bit to actually troubleshoot since I'm not with Percipia anymore but I do keep an eye on this library and help them out from time to time.

If you want to help test faster, you can try this branch that disables closing on failure/no-response of the connect command. If that fixes the issue then the problem is somewhere with how we are sending the initial connect command or how FreeSWITCH is processing/responding to it. In that case verbose logs from FreeSWITCH would potentially help with determining the cause. I'm thinking this may be actually related to #14 but with outbound connections.
https://github.com/winsock/eslgo/tree/test-disable-connect-failure

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants