Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RSDK-10100, RSDK-10111] - Add utils to lazy encoded image to decode and return errors instead of panic-ing always #4826

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

hexbabe
Copy link
Member

@hexbabe hexbabe commented Mar 4, 2025

PR Guide

Essential changes

  • Change signature of camerautils VideoSourceFromCamera to return an error in order to report the lazy encoding failure to the stream server safely
  • Made the decode and decodeConfig methods on lazy encoded images public so we can call them and check for errors instead of doing it in the image.Image methods Bounds At and ColorModel methods which give no ways to return errors and forces a panic when decoding fails.
  • Added notes to use these util methods and check for errors before using image.Image methods.

Note

This is no longer a breaking change because it is an additive change to the LazyEncodedImage struct.

It is an alternative to un-Golang-ly directly recovering from panics.

Manual testing

Tested with machine config with Rand's malformed camera that returns bad lazy encoded bytes. It would previously (on main) panic when adding the malformed stream. With this change, it returns this error:

3/4/2025, 12:04:11 PM error rdk.networking   utils@v0.1.130/logger.go:161   finished unary call with code Unknown   error rpc error: code = Unknown desc = failed to sample frame size: failed to get frame after 5 attempts: lazy encoded image error(s): image: unknown format; image: unknown format  grpc.code Unknown  grpc.start_time 2025-03-04T17:04:10.760Z  span.kind server  system grpc  grpc.service proto.stream.v1.StreamService  grpc.method GetStreamOptions

3/4/2025, 12:04:11 PM error rdk   stream/backoff.go:72   error getting media   streamName malformed-1  error lazy encoded image error(s): image: unknown format; image: unknown format

@viambot viambot added the safe to test This pull request is marked safe to test from a trusted zone label Mar 4, 2025
Copy link
Contributor

github-actions bot commented Mar 4, 2025

Warning your change may break code samples. If your change modifies any of the following functions please contact @viamrobotics/fleet-management. Thanks!

component function
base IsMoving
board GPIOPinByName
camera Properties
encoder Properties
motor IsMoving
sensor Readings
servo Position
arm EndPosition
audio MediaProperties
gantry Lengths
gripper IsMoving
input_controller Controls
movement_sensor LinearAcceleration
power_sensor Power
pose_tracker Poses
motion GetPose
vision GetProperties

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Mar 4, 2025
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Mar 4, 2025
@hexbabe hexbabe marked this pull request as draft March 4, 2025 17:34
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Mar 4, 2025
@hexbabe hexbabe changed the title [RSDK-10100] - Add stored error states to lazy encoded images [RSDK-10100] - Add utils to lazy encoded image to decode and catch errors instead of panic-ing always Mar 4, 2025
@hexbabe hexbabe changed the title [RSDK-10100] - Add utils to lazy encoded image to decode and catch errors instead of panic-ing always [RSDK-10100] - Add utils to lazy encoded image to decode and return errors instead of panic-ing always Mar 4, 2025
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Mar 4, 2025
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Mar 4, 2025
@hexbabe hexbabe marked this pull request as ready for review March 5, 2025 16:28
@hexbabe hexbabe requested review from randhid and seanavery March 5, 2025 16:51
@@ -27,7 +29,7 @@ func Camera(robot robot.Robot, stream gostream.Stream) (camera.Camera, error) {

// VideoSourceFromCamera converts a camera resource into a gostream VideoSource.
// This is useful for streaming video from a camera resource.
func VideoSourceFromCamera(ctx context.Context, cam camera.Camera) gostream.VideoSource {
func VideoSourceFromCamera(ctx context.Context, cam camera.Camera) (gostream.VideoSource, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a test case for when this returns an error?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hexbabe you can make an injected camera with bad bytes to mimic y malformed camera, in the test file for this package.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah can do. Incorporating that with the test Nick pointed out was missing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added one

@@ -103,14 +106,18 @@ func (sc *videoSource) Stream(ctx context.Context, errHandlers ...gostream.Error
// and still implement a camera resource.
// We prefer this methodology over passing Image bytes because each transform desires a image.Image over
// a raw byte slice. To use Image would be to wastefully encode and decode the frame multiple times.
func videoSourceFromCamera(ctx context.Context, cam camera.Camera) camera.VideoSource {
func videoSourceFromCamera(ctx context.Context, cam camera.Camera) (camera.VideoSource, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a test case for when this returns an error?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added one

"image"
"image/color"
"net/http"
"strings"
"sync"

"go.uber.org/multierr"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend using errors.Join from the standard library rather than multierr

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

decodeErr interface{}
decodedImage image.Image
decodeOnce sync.Once
decodeImageErr interface{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does this need to be an empty interface? why can't it just be an error?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know 🫠

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we try changing it to an error?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not right now if that's okay - there's also a second error that is an interface in this struct - unclear why - and I'm unsure how much we want to go down the error refactoring rabbit hole.

func (lei *LazyEncodedImage) decodeConfig() {
// DecodeConfig decodes the image configuration. Returns nil if no errors occurred.
// This method is idempotent.
func (lei *LazyEncodedImage) DecodeConfig() error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when would a caller want to decode the config but not the image or vice versa? Wouldn't you always want to decode both or neither?

Copy link
Member Author

@hexbabe hexbabe Mar 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the stream server is a good example. IIUC, we don't actually "do anything" with the image e.g. transform it, crop it, re-encode it etc.

I checked for usages of DecodedImage or ConvertImage (which calls DecodedImage if its a lazy encoded image) and found none for handling camera streams.

We basically check the bounds of the image (which only requires the decoded config not image contents) and then extract the raw bytes here.


imgLazy = NewLazyEncodedImage([]byte{1, 2, 3}, "weeeee")

test.That(t, imgLazy.(*LazyEncodedImage).MIMEType(), test.ShouldEqual, "weeeee")
test.That(t, func() { imgLazy.Bounds() }, test.ShouldPanic)
test.That(t, func() { imgLazy.ColorModel() }, test.ShouldPanic)
test.That(t, imgLazy.(*LazyEncodedImage).DecodeAll(), test.ShouldNotBeNil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specific error check please.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added one that checks for substring since to my knowledge there is no error type we can assert and check for.

Comment on lines 43 to 45
test.That(t, func() { imgLazy.ColorModel() }, test.ShouldPanicWith, image.ErrFormat)
test.That(t, func() { NewColorFromColor(imgLazy.At(0, 0)) }, test.ShouldPanicWith, image.ErrFormat)
test.That(t, func() { NewColorFromColor(imgLazy.At(4, 4)) }, test.ShouldPanicWith, image.ErrFormat)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should keep the old tests and only add our new ones please.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry, the exact format of the panic catching got lost through my many iterations of this solution's design

Copy link
Member

@randhid randhid Mar 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine - please look through this file and add back any tests that were deleted.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines +37 to +38
// NOTE: It is recommended to call one of the Decode* methods and check for errors before using
// image.Image methods (Bounds, ColorModel, or At) to avoid panics.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heads up @bhaney - this should not change the current flow of image - but you will now be able to check if the image has been successfully Decoded from bytes into something that has bounds.

decodeErr interface{}
decodedImage image.Image
decodeOnce sync.Once
decodeImageErr interface{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not right now if that's okay - there's also a second error that is an interface in this struct - unclear why - and I'm unsure how much we want to go down the error refactoring rabbit hole.

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Mar 5, 2025
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Mar 5, 2025
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Mar 5, 2025
@randhid
Copy link
Member

randhid commented Mar 5, 2025

Okay for this PR I'd like an in-person manual test to verify the following:

  • Additional streams after adding a malformed camera is possible.
  • Removing streams after removing a malformed camera is alright.
  • Having a malformed camera does not break resizing/resetting streams.

… control; Add "safe" versions of all the image.Image methods and tests for that
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Mar 5, 2025
@hexbabe
Copy link
Member Author

hexbabe commented Mar 6, 2025

@hexbabe hexbabe changed the title [RSDK-10100] - Add utils to lazy encoded image to decode and return errors instead of panic-ing always [RSDK-10100, RSDK-10111] - Add utils to lazy encoded image to decode and return errors instead of panic-ing always Mar 10, 2025
Comment on lines +40 to +52
func TestVideoSourceFromCameraFailure(t *testing.T) {
malformedCam := &inject.Camera{
ImageFunc: func(ctx context.Context, mimeType string, extra map[string]interface{}) ([]byte, camera.ImageMetadata, error) {
return []byte("not a valid image"), camera.ImageMetadata{MimeType: utils.MimeTypePNG}, nil
},
}

vs, err := camerautils.VideoSourceFromCamera(context.Background(), malformedCam)
test.That(t, err, test.ShouldNotBeNil)
expectedErrPrefix := "failed to decode lazy encoded image: "
test.That(t, err.Error(), test.ShouldStartWith, expectedErrPrefix)
test.That(t, vs, test.ShouldBeNil)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a valid image but nonsense mimetype test?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Screenshot 2025-03-10 at 1 37 52 PM

I added a unit test, but did not see an error or a panic.

I confirmed with a manual test when I rebuilt rand fake modules with the above diff in malformed.go

3/10/2025, 1:36:56 PM warn rdk   rimage/lazy_encoded.go:43   NewLazyEncodedImage resolving to non image mime_type: rand

3/10/2025, 1:36:56 PM warn rdk   rimage/lazy_encoded.go:43   NewLazyEncodedImage resolving to non image mime_type: rand

3/10/2025, 1:36:56 PM info rdk   gostream/stream.go:301   detected new image bounds

3/10/2025, 1:36:56 PM warn rdk   rimage/lazy_encoded.go:43   NewLazyEncodedImage resolving to non image mime_type: rand

3/10/2025, 1:36:56 PM warn rdk.malformed-1   state/state.go:296   tick: rtp_passthrough not possible, falling back to GoStream   err SubscribeRTP failed: rpc error: code = Unknown desc = unknown stream for resource

3/10/2025, 1:36:56 PM warn     module/module.go:871 unknown stream for resource   name map[API:map[Type:map[namespace:rdk type:component] subtype:camera] Name:malformed-1 Remote:]  streamSourceByName map[resource.Name]rtppassthrough.Source{}  log_ts 2025-03-10T17:36:56.566Z

3/10/2025, 1:36:56 PM info rdk.modmanager   camera/client.go:446   SubscribeRTP is creating a video track   client 0x140017a5380  peerConnection 0x14000276f08

3/10/2025, 1:36:56 PM info rdk.malformed-1   state/state.go:293   attempting to subscribe to rtp_passthrough

3/10/2025, 1:36:56 PM warn rdk   stream/server.go:284   AddStream END malformed-1

3/10/2025, 1:36:56 PM info rdk   stream/server.go:146   Adding video stream   name malformed-1  peerConn &{PeerConnection-1741628216368952000 {{0 0} 0 0 {{} 0} {{} 0}} { 2987189465956219544 1741628216 } 0x14000b1ce10 {[{[stun:global.stun.twilio.com:3478] <nil> 0}] 0 1 2 [{0x14000b1da40 0x14000de8108 certificate-1741628216369729000}] 0 0} 0x14000f69780 <nil> 0x14000edd7a0 <nil> 1 {3} {3} <nil> 0x14001c57928 false 0x14001604e70 0x14001604ee0 0x14001c5792c 0x14001c57980 v=0 o=- 2987189465956219544 1741628216 IN IP4 0.0.0.0 s=- t=0 0 a=msid-semantic:WMS* a=fingerprint:sha-256 49:88:E1:26:84:26:C1:73:C9:A5:94:59:FA:89:0A:AE:55:9F:75:CE:60:5F:32:F0:0B:D5:FE:2B:EB:EA:B7:C8 a=extmap-allow-mixed a=group:BUNDLE 0 m=application 9 UDP/DTLS/SCTP webrtc-datachannel c=IN IP4 0.0.0.0 a=setup:active a=mid:0 a=sendrecv a=sctp-port:5000 a=ice-ufrag:SkwlHABgazpCCehZ a=ice-pwd:kFqrNnttiiNnZIHuGIRjXMEOfHaqKrgr -1 [] {<nil>} <nil> {0x104e30930} {<nil>} <nil> <nil> {0x104e3e5f0} 0x14000b146c0 0x14000e3afd0 0x14000f7cb00 0x14000e3b080 0x14000b1d800 0x14000a424c0 0x104bd7f40}

3/10/2025, 1:36:56 PM info rdk.networking   rpc/wrtc_base_channel.go:144   Connection establishment succeeded   conn_id PeerConnection-1741628216368952000  conn_local_candidates [{2025-03-10 17:36:56.419000064 +0000 UTC host 100.115.36.141} {2025-03-10 17:36:56.419000064 +0000 UTC host 127.0.0.1} {2025-03-10 17:36:56.419000064 +0000 UTC host 10.1.12.8}]  conn_remote_candidates [{2025-03-10 17:36:56.419000064 +0000 UTC peer-reflexive 127.0.0.1}]  candidate_pair (local) udp4 host 127.0.0.1:54347 <-> (remote) udp4 prflx 127.0.0.1:53668 related :0

3/10/2025, 1:36:24 PM info rdk   impl/local_robot.go:1334   Robot (re)configured

3/10/2025, 1:36:24 PM warn rdk   rimage/lazy_encoded.go:43   NewLazyEncodedImage resolving to non image mime_type: rand

3/10/2025, 1:36:24 PM info rdk.modmanager.local-rand-mod   modmanager/manager.go:553   Adding resource to module   resource malformed-1  module local-rand-mod

3/10/2025, 1:36:24 PM info rdk.resource_manager   impl/resource_manager.go:725   Now configuring resource   resource rdk:component:camera/malformed-1

3/10/2025, 1:36:24 PM info rdk.modmanager   modmanager/manager.go:323   Modules successfully added   modules [local-rand-mod]

3/10/2025, 1:36:24 PM info rdk.modmanager.local-rand-mod   modmanager/manager.go:413   Module successfully added   module local-rand-mod

Observed the above logs showing that the stream was initialized properly and also observed black frames coming through the wire. Looks like the mimetype being unknown is not an error/panic cause.

Confirmed this by looking at the code, if you trace the code from DecodeImage, you can see we sniff the bytes instead of using the mimetype, leading to no error.

Screenshot 2025-03-10 at 1 46 18 PM

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

func TestVideoSourceFromCameraWithNonsenseMimeType(t *testing.T) {
	sourceImg := image.NewRGBA(image.Rect(0, 0, 3, 3))

	camWithNonsenseMimeType := &inject.Camera{
		ImageFunc: func(ctx context.Context, mimeType string, extra map[string]interface{}) ([]byte, camera.ImageMetadata, error) {
			imgBytes, err := rimage.EncodeImage(ctx, sourceImg, utils.MimeTypePNG)
			test.That(t, err, test.ShouldBeNil)
			return imgBytes, camera.ImageMetadata{MimeType: "nonsense/mimetype"}, nil
		},
	}

	// Should log warning though
	vs, err := camerautils.VideoSourceFromCamera(context.Background(), camWithNonsenseMimeType)
	test.That(t, err, test.ShouldBeNil)
	test.That(t, vs, test.ShouldNotBeNil)

	stream, _ := vs.Stream(context.Background())
	img, _, err := stream.Next(context.Background())
	test.That(t, err, test.ShouldBeNil)
	test.That(t, img, test.ShouldNotBeNil)
}

Would you like me to add this test still?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Mar 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
safe to test This pull request is marked safe to test from a trusted zone
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants