Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RSDK-7403 - add remote rtp_passthrough support #3957

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

nicksanford
Copy link
Member

@nicksanford nicksanford commented May 15, 2024

Changes:

  1. rtp_passthrough will now be the default way video streams will be sent between remote parts. Even if the camera in the remote doesn't support passthrough, the main part will be able to do passthrough on the RTP packets from the remote, even if the remote is generating those packets by calling GetImage & encoding H264
  2. Change SubscribeRTP and Unsubscribe in the camera client to use the component's SDP encoded short name if talking to a main viam-sever (so for rdk:component:camera/remote:cam-1 the SDP encoded short name is remote+cam-1)
  3. Change SubscribeRTP and Unsubscribe in the camera client to use the component's SDP encoded short name with one level of remote poped off if talking to a remote viam-sever from a main viam-server (so for rdk:component:camera/remote:cam-1 name used is cam-1 asremote doesn't know that it's name is remote from the perspective of the main part)
  4. Add Tracker interface support to ReconfigurableClientConn so that peer connections from viam-server main & remote parts can stream WebRTC tracks to / from each other
  5. Change the Tracker interface to take strings rather than resource.Names as keys for the track names as that is the lowest common denominator for how both module.Add|RemoveStream and stream.Add|RemoveStream endpoints work
  6. Move SDP track name formatting functions into resource package
  7. Update robot/web/stream/state/state.go to support looking up cameras in remote parts by decoding the SDP encoded short name back into a short name
  8. fix tests now that the name of the camera is important

Follow up ticket:
https://viam.atlassian.net/browse/RSDK-7668

Manual Test Plan:

  • this change viam-server -> module compiled with this change of RDK
  • this change viam-server -> module compiled with v0.27.1 of RDK
  • v0.27.1 viam-server -> module compiled with this change of RDK
  • main -> remote-1 -> remote-2 > rtsp module -> cam-1 (remote-2 terminates & comes back after a minute): video stream heals (assuming main.remote-1:remote-2:cam-1's client isn't closed)
  • main -> remote-1 -> remote-2 > rtsp module -> cam-1 (remote-2 terminates, main unsubscribes from video, remote-2 & comes back after a minute): video stream is not playing & starts playing when main requests it (assuming main.remote-1:remote-2:cam-1's client isn't closed)
  • main -> remote-1 -> remote-2 > rtsp module -> cam-1 (remote-2 terminates, main unsubscribes from video & resubscribes, remote-2 & comes back after a minute): starts playing automatically requests it (assuming main.remote-1:remote-2:cam-1's client isn't closed)
  • main -> remote-1 -> remote-2 > rtsp module -> cam-1 (remote-2 reconfigures to not have cam-1 ): main.remote-1:remote-2:cam-1's client is closed and the video stops playing and the card is removed from the RC card.
  • main -> remote-1 -> remote-2 > rtsp module -> cam-1 (remote-2 looses connection with remote-1, then reconfigures to not have cam-1 , then reconnects after a minute to remote-1): main.remote-1:remote-2:cam-1's client is closed the video stops playing, and the card is removed from the RC card.

Note:
I tested this manually with the following topology:

sequenceDiagram
web browser->>main part: stream
main part->>remote part 1: stream
remote part 1->>remote part 2: stream
remote part 2->>remote part 3: stream
remote part 3->> module: stream
module->>rtsp server: stream

At every hop, the frequency of video pauses would increase. I believe this is due to the fact that there is variable latency on the office wifi network, where I see ping times swing between 5 ms and 200 ms per hop when there are a lot of people on the network. When the congestion is bad, the topology above does have noticeable pauses / starts. Future work could mitigate this by adding RTP packet buffering to help prevent pauses / starts at the cost of showing a video feed that is a few seconds stale.

Details:
This diagram shows the inputs to AddStream and the name of the track used in AddTrack & the name format convention used in the case of:

  1. A client talking to a main viam-server
  2. A main part talking to a remote viam-server
  3. A viam-server talking to a module

The fact that modules use the fully qualified component name as the input to AddStream and RemoveStream is a mistake I made in an earlier implementation. This change keeps that status quo to not break backward compatibility with existing modules.

sequenceDiagram
Note right of client: (remote1:remote2:cam-1).SubscribeRTP
client->>viam-server main: AddStream(remote1+remote2+cam-1)
Note right of viam-server main: (remote1:remote2:cam-1).SubscribeRTP
viam-server main->>viam-server remote1: AddStream(remote2+cam-1)
Note right of viam-server remote1: (remote2:cam-1).SubscribeRTP
viam-server remote1->>viam-server remote2: AddStream(cam-1)
Note right of viam-server remote2: (cam-1).SubscribeRTP
viam-server remote2->>module cam-1: AddStream(rdk:component:camera/cam-1)
Note right of module cam-1: (cam-1).SubscribeRTP
loop 
    module cam-1->>module cam-1: Stream RTP Packets
end
module cam-1-->>viam-server remote2: AddTrack(rdk:component:camera/cam-1)
viam-server remote2-->>viam-server remote1: AddTrack(cam-1)
viam-server remote1-->>viam-server main: AddTrack(remote2:cam-1)
viam-server main-->>client: AddTrack(remote1+remote2+cam-1)

@viambot viambot added the safe to test This pull request is marked safe to test from a trusted zone label May 15, 2024
Copy link
Contributor

Warning your change may break code samples. If your change modifies any of the following functions please contact @viamrobotics/fleet-management. Thanks!

component function
base IsMoving
board GPIOPinByName
camera Properties
encoder Properties
motor IsMoving
sensor Readings
servo Position
arm EndPosition
audio MediaProperties
gantry Lengths
gripper IsMoving
input_controller Controls
movement_sensor LinearAcceleration
power_sensor Power
pose_tracker Poses
motion GetPose
vision ClassificationsFromCamera

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels May 15, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels May 15, 2024
robot/web/web.go Outdated Show resolved Hide resolved
@@ -452,7 +456,8 @@ func (ss *StreamState) streamH264Passthrough(ctx context.Context) error {
}

func (ss *StreamState) unsubscribeH264Passthrough(ctx context.Context, id rtppassthrough.SubscriptionID) error {
cam, err := camera.FromRobot(ss.robot, ss.Stream.Name())
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to turn the SDPTrackName back into a short name for the FromRobot lookup to work if the camera is in a remote part

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels May 15, 2024
grpc/conn.go Outdated Show resolved Hide resolved
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels May 15, 2024
@@ -292,13 +292,13 @@ func (svc *webService) refreshVideoSources() {
if err != nil {
continue
}
existing, ok := svc.videoSources[validSDPTrackName(name)]
existing, ok := svc.videoSources[cam.Name().SDPTrackName()]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed from using private function to new public method on resource.Name

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no behavior change, other than requiring injected cameras actually have names in test setup

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels May 15, 2024
@@ -0,0 +1,148 @@
package resource
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added 2 new functions to this package so I added some tests add added tests for the most common ways resource.Name s are created

@viambot viambot removed the safe to test This pull request is marked safe to test from a trusted zone label May 15, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Jun 17, 2024
@@ -294,6 +295,8 @@ func (c *Camera) SubscribeRTP(
bufferSize int,
packetsCB rtppassthrough.PacketCallback,
) (rtppassthrough.Subscription, error) {
golog.Global().Warnf("SubscribeRTP FAKE START %s", c.Name().String())
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

Comment on lines +75 to +76
golog.Global().Warnf("OnTrack START %s pc: %p", trackRemote.StreamID(), pc)
defer golog.Global().Warnf("OnTrack END %s pc: %p", trackRemote.StreamID(), pc)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

@@ -81,8 +80,8 @@ type SharedConn struct {
// set to nil before this channel is closed.
peerConnFailed chan struct{}

resOnTrackMu sync.Mutex
resOnTrackCBs map[resource.Name]OnTrackCB
onTrackCBByTrackNameMu sync.Mutex
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pull out the name changes into a separate PR

Comment on lines +125 to +126
ss.logger.Warnf("AddStream START %s", req.Name)
defer ss.logger.Warnf("AddStream END %s", req.Name)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

@@ -204,12 +225,14 @@ func (ss *Server) AddStream(ctx context.Context, req *streampb.AddStreamRequest)

guard := rutils.NewGuard(func() {
for _, sender := range ps.senders {
golog.Global().Infof("calling RemoveTrack on %s pc: %p", sender.Track().StreamID(), pc)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

Comment on lines +271 to +272
ss.logger.Warnf("RemoveStream START %s", req.Name)
defer ss.logger.Warnf("RemoveStream END %s", req.Name)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove


if _, ok := ss.activePeerStreams[pc][req.Name]; !ok {
return nil, errors.New("stream already inactive")
}

var errs error
for _, sender := range ss.activePeerStreams[pc][req.Name].senders {
// golog.Global().Infof("calling RemoveTrack on %s pc: %p", sender.Track().StreamID(), pc)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

Comment on lines +385 to +386
ss.logger.Info("monitorStreamAvailable loop START")
defer ss.logger.Info("monitorStreamAvailable loop END")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

Comment on lines +329 to +363
// if ss.streamSource != streamSourceUnknown {
// return fmt.Errorf("unexpected stream %s source %s", ss.Stream.Name(), ss.streamSource)
// }
// this is the first subscription, attempt passthrough
// ss.logger.CDebugw(ctx, "attempting to subscribe to rtp_passthrough", "name", ss.Stream.Name())
// err := ss.streamH264Passthrough(ctx)
// if err != nil {
// ss.logger.CDebugw(ctx, "rtp_passthrough not possible, falling back to GoStream", "err", err.Error(), "name", ss.Stream.Name())
// // if passthrough failed, fall back to gostream based approach
// ss.Stream.Start()
// ss.streamSource = streamSourceGoStream
// }
// ss.activeClients++
// return nil
// }

// switch ss.streamSource {
// case streamSourcePassthrough:
// ss.logger.Debugw("continuing using rtp_passthrough", "name", ss.Stream.Name())
// // noop as we are already subscribed
// case streamSourceGoStream:
// ss.logger.Debugw("currently using gostream, trying upgrade to rtp_passthrough", "name", ss.Stream.Name())
// // attempt to cut over to passthrough
// err := ss.streamH264Passthrough(ctx)
// if err != nil {
// ss.logger.Debugw("rtp_passthrough not possible, continuing with gostream", "err", err.Error(), "name", ss.Stream.Name())
// }
// case streamSourceUnknown:
// fallthrough
// default:
// err := fmt.Errorf("%s streamSource in unexpected state %s", ss.Stream.Name(), ss.streamSource)
// ss.logger.Error(err.Error())
// return err
// }
// ss.activeClients++
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

Comment on lines +389 to +397
// case err != nil && ss.activeClients > 0:
// ss.logger.Warn("camera no longer exists in resource graph, stopping subscriptions and setting active clients to 0")
// // stop stream if the camera no longer exists
// // noop if there is no stream source
// ss.stopBasedOnSub()
// // nil out subsription & stream source even if stopBasedOnSub didn't
// ss.streamSourceSub = rtppassthrough.NilSubscription
// ss.streamSource = streamSourceUnknown
// ss.activeClients = 0
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

Comment on lines 375 to 377
c.logger.Warn("Close START")
debug.PrintStack()
defer c.logger.Warn("Close END")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

Comment on lines +425 to +426
c.redLog(fmt.Sprintf("SubscribeRTP START %s client: %p, pc: %p", c.Name().String(), c, c.conn.PeerConn()))
defer c.redLog(fmt.Sprintf("SubscribeRTP END %s client: %p, pc: %p", c.Name().String(), c, c.conn.PeerConn()))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

c.logger.CDebugw(ctx, "SubscribeRTP AddStream hit error", "subID", sub.ID.String(), "name", c.Name(), "err", err)
defer sc.RemoveOnTrackSub(c.trackName())

c.redLog(fmt.Sprintf("SubscribeRTP AddStream CALL %s client: %p, pc: %p", c.trackName(), c, c.conn.PeerConn()))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

return rtppassthrough.NilSubscription, err
}
c.redLog(fmt.Sprintf("SubscribeRTP AddStream RETURN %s client: %p, pc: %p", c.trackName(), c, c.conn.PeerConn()))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

@@ -469,18 +524,20 @@ func (c *client) SubscribeRTP(

// To prevent that failure mode, we exit with an error if a track is not received within
// the SubscribeRTP context.
c.redLog(fmt.Sprintf("SubscribeRTP waiting for track %s client: %p, pc: %p", c.trackName(), c, c.conn.PeerConn()))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

c.logger.Error(err)
return rtppassthrough.NilSubscription, err
case <-trackReceived:
c.logger.Debug("received track")
c.redLog(fmt.Sprintf("received track %s client: %p, pc: %p", c.trackName(), c, c.conn.PeerConn()))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

c.logger.Debugw("SubscribeRTP: camera client", "name ", c.Name(), "parentID", parentID.String(),
"OnTrack callback terminating as the client is closing")
close(trackClosed)
return
default:
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

}
// END TestWhyMustTimeoutOnReadRTP
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

Comment on lines +733 to +754

// BEGIN TestWhyMustCallUnsubscribe
// if we are talking to a remote viam-sever we need to call RemoveStream so that
// the remote (which still has a healthy peer connection with the main part)
// knows to stop the stream. If we don't do this there is a risk that
// the remote will still leave the stream open, causing a newly reconfigured
// main part client's AddStream call to receive an error that there is already
// a live stream.

// We don't need to do this in the case of modules as resource manager of the main part
// will call `Close()` on them in the cases when the client would have Close() calld on it
// which will terminate the stream.

// if _, isRemote := c.conn.(*grpc.ReconfigurableClientConn); isRemote {
// // NOTE: This is the ctx from `camera/client.Close()` which is called by resource manager. I don't know
// // if it has a timeout. If it doesn't then this might block forever.
// c.logger.CDebugw(ctx, "unsubscribeAll calling RemoveStream", "trackName", c.trackName())
// if _, err := c.streamClient.RemoveStream(ctx, &streampb.RemoveStreamRequest{Name: c.trackName()}); err != nil {
// c.logger.CWarnw(ctx, "unsubscribeAll RemoveStream returned err", "trackName", c.trackName(), "err", err)
// }
// }
// END TestWhyMustCallUnsubscribe
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Jun 18, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Jun 18, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Jun 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
safe to test This pull request is marked safe to test from a trusted zone
Projects
None yet
4 participants