Skip to content

[RSDK-10100, RSDK-10111] - Add utils to lazy encoded image to decode and return errors instead of panic-ing always #4826

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Mar 12, 2025

Conversation

hexbabe
Copy link
Member

@hexbabe hexbabe commented Mar 4, 2025

PR Guide

Essential changes

  • Change signature of camerautils VideoSourceFromCamera to return an error in order to report the lazy encoding failure to the stream server safely
  • Made the decode and decodeConfig methods on lazy encoded images public so we can call them and check for errors instead of doing it in the image.Image methods Bounds At and ColorModel methods which give no ways to return errors and forces a panic when decoding fails.
  • Added notes to use these util methods and check for errors before using image.Image methods.

Note

This is no longer a breaking change because it is an additive change to the LazyEncodedImage struct.

It is an alternative to un-Golang-ly directly recovering from panics.

Manual testing

Tested with machine config with Rand's malformed camera that returns bad lazy encoded bytes. It would previously (on main) panic when adding the malformed stream. With this change, it returns this error:

3/4/2025, 12:04:11 PM error rdk.networking   utils@v0.1.130/logger.go:161   finished unary call with code Unknown   error rpc error: code = Unknown desc = failed to sample frame size: failed to get frame after 5 attempts: lazy encoded image error(s): image: unknown format; image: unknown format  grpc.code Unknown  grpc.start_time 2025-03-04T17:04:10.760Z  span.kind server  system grpc  grpc.service proto.stream.v1.StreamService  grpc.method GetStreamOptions

3/4/2025, 12:04:11 PM error rdk   stream/backoff.go:72   error getting media   streamName malformed-1  error lazy encoded image error(s): image: unknown format; image: unknown format

@viambot viambot added the safe to test This pull request is marked safe to test from a trusted zone label Mar 4, 2025
Copy link
Contributor

github-actions bot commented Mar 4, 2025

Warning your change may break code samples. If your change modifies any of the following functions please contact @viamrobotics/fleet-management. Thanks!

component function
base IsMoving
board GPIOPinByName
camera Properties
encoder Properties
motor IsMoving
sensor Readings
servo Position
arm EndPosition
audio MediaProperties
gantry Lengths
gripper IsMoving
input_controller Controls
movement_sensor LinearAcceleration
power_sensor Power
pose_tracker Poses
motion GetPose
vision GetProperties

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Mar 4, 2025
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Mar 4, 2025
@hexbabe hexbabe marked this pull request as draft March 4, 2025 17:34
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Mar 4, 2025
@hexbabe hexbabe changed the title [RSDK-10100] - Add stored error states to lazy encoded images [RSDK-10100] - Add utils to lazy encoded image to decode and catch errors instead of panic-ing always Mar 4, 2025
@hexbabe hexbabe changed the title [RSDK-10100] - Add utils to lazy encoded image to decode and catch errors instead of panic-ing always [RSDK-10100] - Add utils to lazy encoded image to decode and return errors instead of panic-ing always Mar 4, 2025
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Mar 4, 2025
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Mar 4, 2025
@hexbabe hexbabe marked this pull request as ready for review March 5, 2025 16:28
@hexbabe hexbabe requested review from randhid and seanavery March 5, 2025 16:51

imgLazy = NewLazyEncodedImage([]byte{1, 2, 3}, "weeeee")

test.That(t, imgLazy.(*LazyEncodedImage).MIMEType(), test.ShouldEqual, "weeeee")
test.That(t, func() { imgLazy.Bounds() }, test.ShouldPanic)
test.That(t, func() { imgLazy.ColorModel() }, test.ShouldPanic)
test.That(t, imgLazy.(*LazyEncodedImage).DecodeAll(), test.ShouldNotBeNil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specific error check please.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added one that checks for substring since to my knowledge there is no error type we can assert and check for.

Comment on lines 43 to 45
test.That(t, func() { imgLazy.ColorModel() }, test.ShouldPanicWith, image.ErrFormat)
test.That(t, func() { NewColorFromColor(imgLazy.At(0, 0)) }, test.ShouldPanicWith, image.ErrFormat)
test.That(t, func() { NewColorFromColor(imgLazy.At(4, 4)) }, test.ShouldPanicWith, image.ErrFormat)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should keep the old tests and only add our new ones please.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry, the exact format of the panic catching got lost through my many iterations of this solution's design

Copy link
Member

@randhid randhid Mar 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine - please look through this file and add back any tests that were deleted.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines +37 to +38
// NOTE: It is recommended to call one of the Decode* methods and check for errors before using
// image.Image methods (Bounds, ColorModel, or At) to avoid panics.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heads up @bhaney - this should not change the current flow of image - but you will now be able to check if the image has been successfully Decoded from bytes into something that has bounds.

@viambot viambot added the safe to test This pull request is marked safe to test from a trusted zone label Mar 5, 2025
@randhid
Copy link
Member

randhid commented Mar 5, 2025

Okay for this PR I'd like an in-person manual test to verify the following:

  • Additional streams after adding a malformed camera is possible.
  • Removing streams after removing a malformed camera is alright.
  • Having a malformed camera does not break resizing/resetting streams.

… control; Add "safe" versions of all the image.Image methods and tests for that
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Mar 5, 2025
@hexbabe
Copy link
Member Author

hexbabe commented Mar 6, 2025

@hexbabe hexbabe changed the title [RSDK-10100] - Add utils to lazy encoded image to decode and return errors instead of panic-ing always [RSDK-10100, RSDK-10111] - Add utils to lazy encoded image to decode and return errors instead of panic-ing always Mar 10, 2025
Comment on lines +40 to +52
func TestVideoSourceFromCameraFailure(t *testing.T) {
malformedCam := &inject.Camera{
ImageFunc: func(ctx context.Context, mimeType string, extra map[string]interface{}) ([]byte, camera.ImageMetadata, error) {
return []byte("not a valid image"), camera.ImageMetadata{MimeType: utils.MimeTypePNG}, nil
},
}

vs, err := camerautils.VideoSourceFromCamera(context.Background(), malformedCam)
test.That(t, err, test.ShouldNotBeNil)
expectedErrPrefix := "failed to decode lazy encoded image: "
test.That(t, err.Error(), test.ShouldStartWith, expectedErrPrefix)
test.That(t, vs, test.ShouldBeNil)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a valid image but nonsense mimetype test?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Screenshot 2025-03-10 at 1 37 52 PM

I added a unit test, but did not see an error or a panic.

I confirmed with a manual test when I rebuilt rand fake modules with the above diff in malformed.go

3/10/2025, 1:36:56 PM warn rdk   rimage/lazy_encoded.go:43   NewLazyEncodedImage resolving to non image mime_type: rand

3/10/2025, 1:36:56 PM warn rdk   rimage/lazy_encoded.go:43   NewLazyEncodedImage resolving to non image mime_type: rand

3/10/2025, 1:36:56 PM info rdk   gostream/stream.go:301   detected new image bounds

3/10/2025, 1:36:56 PM warn rdk   rimage/lazy_encoded.go:43   NewLazyEncodedImage resolving to non image mime_type: rand

3/10/2025, 1:36:56 PM warn rdk.malformed-1   state/state.go:296   tick: rtp_passthrough not possible, falling back to GoStream   err SubscribeRTP failed: rpc error: code = Unknown desc = unknown stream for resource

3/10/2025, 1:36:56 PM warn     module/module.go:871 unknown stream for resource   name map[API:map[Type:map[namespace:rdk type:component] subtype:camera] Name:malformed-1 Remote:]  streamSourceByName map[resource.Name]rtppassthrough.Source{}  log_ts 2025-03-10T17:36:56.566Z

3/10/2025, 1:36:56 PM info rdk.modmanager   camera/client.go:446   SubscribeRTP is creating a video track   client 0x140017a5380  peerConnection 0x14000276f08

3/10/2025, 1:36:56 PM info rdk.malformed-1   state/state.go:293   attempting to subscribe to rtp_passthrough

3/10/2025, 1:36:56 PM warn rdk   stream/server.go:284   AddStream END malformed-1

3/10/2025, 1:36:56 PM info rdk   stream/server.go:146   Adding video stream   name malformed-1  peerConn &{PeerConnection-1741628216368952000 {{0 0} 0 0 {{} 0} {{} 0}} { 2987189465956219544 1741628216 } 0x14000b1ce10 {[{[stun:global.stun.twilio.com:3478] <nil> 0}] 0 1 2 [{0x14000b1da40 0x14000de8108 certificate-1741628216369729000}] 0 0} 0x14000f69780 <nil> 0x14000edd7a0 <nil> 1 {3} {3} <nil> 0x14001c57928 false 0x14001604e70 0x14001604ee0 0x14001c5792c 0x14001c57980 v=0 o=- 2987189465956219544 1741628216 IN IP4 0.0.0.0 s=- t=0 0 a=msid-semantic:WMS* a=fingerprint:sha-256 49:88:E1:26:84:26:C1:73:C9:A5:94:59:FA:89:0A:AE:55:9F:75:CE:60:5F:32:F0:0B:D5:FE:2B:EB:EA:B7:C8 a=extmap-allow-mixed a=group:BUNDLE 0 m=application 9 UDP/DTLS/SCTP webrtc-datachannel c=IN IP4 0.0.0.0 a=setup:active a=mid:0 a=sendrecv a=sctp-port:5000 a=ice-ufrag:SkwlHABgazpCCehZ a=ice-pwd:kFqrNnttiiNnZIHuGIRjXMEOfHaqKrgr -1 [] {<nil>} <nil> {0x104e30930} {<nil>} <nil> <nil> {0x104e3e5f0} 0x14000b146c0 0x14000e3afd0 0x14000f7cb00 0x14000e3b080 0x14000b1d800 0x14000a424c0 0x104bd7f40}

3/10/2025, 1:36:56 PM info rdk.networking   rpc/wrtc_base_channel.go:144   Connection establishment succeeded   conn_id PeerConnection-1741628216368952000  conn_local_candidates [{2025-03-10 17:36:56.419000064 +0000 UTC host 100.115.36.141} {2025-03-10 17:36:56.419000064 +0000 UTC host 127.0.0.1} {2025-03-10 17:36:56.419000064 +0000 UTC host 10.1.12.8}]  conn_remote_candidates [{2025-03-10 17:36:56.419000064 +0000 UTC peer-reflexive 127.0.0.1}]  candidate_pair (local) udp4 host 127.0.0.1:54347 <-> (remote) udp4 prflx 127.0.0.1:53668 related :0

3/10/2025, 1:36:24 PM info rdk   impl/local_robot.go:1334   Robot (re)configured

3/10/2025, 1:36:24 PM warn rdk   rimage/lazy_encoded.go:43   NewLazyEncodedImage resolving to non image mime_type: rand

3/10/2025, 1:36:24 PM info rdk.modmanager.local-rand-mod   modmanager/manager.go:553   Adding resource to module   resource malformed-1  module local-rand-mod

3/10/2025, 1:36:24 PM info rdk.resource_manager   impl/resource_manager.go:725   Now configuring resource   resource rdk:component:camera/malformed-1

3/10/2025, 1:36:24 PM info rdk.modmanager   modmanager/manager.go:323   Modules successfully added   modules [local-rand-mod]

3/10/2025, 1:36:24 PM info rdk.modmanager.local-rand-mod   modmanager/manager.go:413   Module successfully added   module local-rand-mod

Observed the above logs showing that the stream was initialized properly and also observed black frames coming through the wire. Looks like the mimetype being unknown is not an error/panic cause.

Confirmed this by looking at the code, if you trace the code from DecodeImage, you can see we sniff the bytes instead of using the mimetype, leading to no error.

Screenshot 2025-03-10 at 1 46 18 PM

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

func TestVideoSourceFromCameraWithNonsenseMimeType(t *testing.T) {
	sourceImg := image.NewRGBA(image.Rect(0, 0, 3, 3))

	camWithNonsenseMimeType := &inject.Camera{
		ImageFunc: func(ctx context.Context, mimeType string, extra map[string]interface{}) ([]byte, camera.ImageMetadata, error) {
			imgBytes, err := rimage.EncodeImage(ctx, sourceImg, utils.MimeTypePNG)
			test.That(t, err, test.ShouldBeNil)
			return imgBytes, camera.ImageMetadata{MimeType: "nonsense/mimetype"}, nil
		},
	}

	// Should log warning though
	vs, err := camerautils.VideoSourceFromCamera(context.Background(), camWithNonsenseMimeType)
	test.That(t, err, test.ShouldBeNil)
	test.That(t, vs, test.ShouldNotBeNil)

	stream, _ := vs.Stream(context.Background())
	img, _, err := stream.Next(context.Background())
	test.That(t, err, test.ShouldBeNil)
	test.That(t, img, test.ShouldNotBeNil)
}

Would you like me to add this test still?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Mar 10, 2025
Comment on lines 200 to 202
err := safeCall(func() {
bounds = lei.Bounds()
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend doiung this the exact opposite way.

Bounds() should be implemented in terms of BoundsSafe, no the other way around.

Same goes for all the other safe methods.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh ya that's nice

…vocation of -safe suffixed methods; Assert on err not substr
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Mar 12, 2025
@hexbabe
Copy link
Member Author

hexbabe commented Mar 12, 2025

@nicksanford this commit should address all your comments

… tests to not use substr for err checks at all; Add ConvertImageSafe to rimage
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Mar 12, 2025
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Mar 12, 2025
@hexbabe hexbabe merged commit e0bd017 into viamrobotics:main Mar 12, 2025
16 checks passed
@hexbabe hexbabe deleted the RSDK-10100 branch March 12, 2025 19:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
safe to test This pull request is marked safe to test from a trusted zone
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants