This repository has been archived by the owner on Oct 27, 2020. It is now read-only.

Probable bug with app lifecycle / USB intents / ESD #133

Closed
Windwoes opened this issue Nov 15, 2019 · 6 comments

Comments

@Windwoes
Member

I was testing our new drivetrain a couple weeks ago, and ran into some extremely strange behavior. The drivetrain was being driven around the field for a while, when it touched the metal part of the SkyBridge and there was a nice ESD zap. This induced a USB transfer error. That's to be expected, given I didn't have any USB surge protectors or chokes installed yet. However, it's what happened in addition to the USB transfer error that's concerning: app lifecycle changes.

The logs indicate the program was running fine for over 2 minutes (notice the difference in timestamps between the first and second lines below), when at 11:33:10.027 the ESD event occurs.

11-02 11:30:28.822 31292 31408 V Robocol : sending CMD_NOTIFY_RUN_OP_MODE(444), attempt: 0
11-02 11:33:10.027 31292 31436 E BulkPacketInWorker: DQ16RP7R: bulkTransfer() error: -1
11-02 11:33:10.040 31292 31292 V FtDeviceManager: ACTION_USB_DEVICE_DETACHED: /dev/bus/usb/001/002

Then approximately 300ms later, the FTDI comes back online:

11-02 11:33:10.384 31292 31292 V RCActivity: ACTION_USB_DEVICE_ATTACHED: /dev/bus/usb/001/003
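As a quick sanity check on the timings quoted above, here is a minimal sketch in plain Java (not part of the SDK) that computes the gaps between the logcat timestamps. The class name `LogDelta` and the helper `millisBetween` are made up for illustration; the timestamps are copied from the log lines in this comment, and the sketch assumes both timestamps fall on the same day.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration only; not an FTC SDK class.
class LogDelta {
    // logcat prints the time of day as hours:minutes:seconds.milliseconds
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("HH:mm:ss.SSS");

    /** Milliseconds elapsed from start to end (same day assumed). */
    static long millisBetween(String start, String end) {
        LocalTime a = LocalTime.parse(start, FMT);
        LocalTime b = LocalTime.parse(end, FMT);
        return Duration.between(a, b).toMillis();
    }

    public static void main(String[] args) {
        // Op mode start -> bulkTransfer() error
        System.out.println(millisBetween("11:30:28.822", "11:33:10.027")); // 161205
        // USB detach -> USB attach
        System.out.println(millisBetween("11:33:10.040", "11:33:10.384")); // 344
        // USB attach -> onStart()
        System.out.println(millisBetween("11:33:10.384", "11:33:10.385")); // 1
    }
}
```

That works out to roughly 2 min 41 s of normal running before the error, about 344 ms between the detach and the re-attach, and 1 ms between the attach intent and `onStart()`.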

But then, right after the FTDI comes online, this happens:

11-02 11:33:10.385 31292 31292 V RCActivity: onStart()
11-02 11:33:10.385 31292 31292 V Robocol : EventLoopManager.shutdown()
11-02 11:33:10.386 31292 31484 V RobotCore: thread: ...terminating 'LinearOpMode main'

There is a lifecycle call to onStart() and the Robot Controller restarts the event loop!

The robot continued moving in the last direction for several seconds (if I had to guess, it was the 5000ms USB_WAIT while the event loop was restarting).

I'm inclined to believe this is NOT a fluke, because I've seen other bizarre issues that involve lifecycle events happening when they shouldn't be, such as 703 and 702. Additionally, although it's possible that it's unrelated, other people have seen strange issues with USB intents / lifecycle as well, for instance this.

I'm not entirely sure this is an SDK bug (it might be a low-level Android API bug), but there's definitely something fishy going on here.

@EddieDL

EddieDL commented Nov 15, 2019

I do not understand all of the lifecycle or other areas, but I can confirm that many of the issues we have had seem to be ESD related. In a recent match, during auto, we touched our partner's robot and the robot controller crashed and the robot continued to move forward for several seconds. Then the OpMode resumed and continued moving forward.

Since then we have spent extensive time working to reduce ESD. We have always had some issues with ESD, but most were resolved when we switched to REV modules. This year seems like it is worse.

@slylockfox

At an event on Saturday where I was FTA, a team reported their robot switched from teleop to autonomous during the match. I was skeptical this was possible, but now I wonder. Would such a behavior be a "lifecycle change" like @FROGbots-4634 reported?

@cmacfarl
Member

@slylockfox The app lifecycle is distinctly different than the opmode lifecycle. It's unlikely, bordering on not possible, that a problem with the app lifecycle is going to cause an automatic switch to autonomous in the middle of an active teleop run. Robot logs from both the RC and DS could shed some light if claims of this happen again.

@Windwoes
Member Author

So, after doing some more testing, I think I have a better idea of what happened. I typically run during testing with the screen off to save battery power, which puts the activity to sleep. However, when the USB comms interruption occurred and the FTDI disappeared and then reappeared in quick succession, Android sent the USB attach intent to the activity, waking it back up (while the screen was still off), which called onStart() and thus triggered a restart of the event loop. This, I think, is another good reason to remove the event loop restart from onStart() as mentioned in #11 - the activity should just be a front-end for the service; activity lifecycle events should not impact the service.
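To make the proposal concrete, here is a rough sketch of the difference in plain Java, with no Android dependencies. Every name in it (`EventLoopService`, `RestartingActivity`, `FrontEndActivity`, `ensureRunning`) is a hypothetical stand-in and not the actual SDK code; it only models "onStart() restarts the event loop" versus "onStart() attaches to a service that is already running".

```java
// Hypothetical stand-in for the service that owns the event loop.
class EventLoopService {
    private boolean running = false;
    private int startCount = 0; // how many times the loop has been (re)started

    /** Idempotent: starts the loop only if it is not already running. */
    synchronized void ensureRunning() {
        if (!running) {
            running = true;
            startCount++;
        }
    }

    /** Unconditional restart: shuts down whatever is running and starts again. */
    synchronized void restart() {
        running = true; // shutdown + start collapsed into one step for the sketch
        startCount++;
    }

    synchronized int getStartCount() {
        return startCount;
    }
}

// Models the behavior described above: every onStart() restarts the loop.
class RestartingActivity {
    private final EventLoopService service;

    RestartingActivity(EventLoopService service) {
        this.service = service;
    }

    void onStart() {
        service.restart();
    }
}

// Models the proposal: the activity is only a front-end for the service.
class FrontEndActivity {
    private final EventLoopService service;

    FrontEndActivity(EventLoopService service) {
        this.service = service;
    }

    void onStart() {
        service.ensureRunning();
    }
}

class LifecycleDemo {
    public static void main(String[] args) {
        EventLoopService a = new EventLoopService();
        RestartingActivity restarting = new RestartingActivity(a);
        restarting.onStart(); // normal app launch
        restarting.onStart(); // USB attach intent wakes the activity
        System.out.println(a.getStartCount()); // 2: the running op mode was interrupted

        EventLoopService b = new EventLoopService();
        FrontEndActivity frontEnd = new FrontEndActivity(b);
        frontEnd.onStart(); // normal app launch
        frontEnd.onStart(); // USB attach intent wakes the activity
        System.out.println(b.getStartCount()); // 1: the loop was left alone
    }
}
```

In the second variant a spurious `onStart()` is harmless, because the activity never tears down a loop that is already running.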

@gearsincorg

gearsincorg commented Nov 19, 2019 via email

@Windwoes
Member Author

I'll go ahead and close this since it's an artifact of #11
