Save blackbox file from dataflash freezes mid-download in recent versions of configurator #411

Closed
mlopyrev opened this issue Jan 29, 2017 · 29 comments

@mlopyrev

mlopyrev commented Jan 29, 2017

Target: X-RacerSPI (tried two boards with different configurations)
Betaflight version: 3.1.1; also tried earlier ones with same result
Configurator version: 1.9 (freezes); tried 1.8.4 (freezes), 1.7 (works), 1.3 (works)

Other symptoms: if aircraft is plugged in to power, at a random point during the dataflash download, there is a restart / ESC re-init - at that point, the download progress bar stops, but I can still see Cycle Time, CPU Load numbers move.

I'm able to use v1.3 to download the files.

@mlopyrev mlopyrev changed the title Save blackbox file from dataflash freezes in recent versions of configurator Save blackbox file from dataflash freezes mid-download in recent versions of configurator Jan 29, 2017
@mikeller
Member

Can you check if you get any error messages in the console (right-click in the configurator, click 'Inspect'; the console is at the bottom of the window that opens) when you get the freeze / reboot?

@mikeller
Member

Also, can you try with a different / shorter USB cable?

@mlopyrev
Author

hi @mikeller - sorry I haven't had the chance to get the console output. I've ended up wiring up both quads with OpenLog loggers instead (which is fantastic!).

As for USB cables - tried regular coiled and 1' length, to no avail. Also tried on two different laptops and a fresh reinstall of the VCP drivers. It only occurred to me to roll back the configurator after I've eliminated all other factors.

I'll update this when I get some console dump.

@mikeller
Member

@mlopyrev: Fair enough. Fyi, the XRacerSPI does not use VCP, it requires the CP2102 USB to serial drivers.

@mlopyrev
Author

ok cool. I'll try a fresh install of the drivers as well. Thanks for checking in on this.

@mlopyrev
Author

mlopyrev commented Feb 1, 2017

Ok, here is the console dump.

(attaching a file)
console_dump.txt

@etracer65
Member

I can confirm this problem. I have a friend using an X-Racer F303 v2.1 (SPRACINGF3 target) and he was having the configurator freeze while downloading blackbox logs. I did some testing with a board of my own (same X-Racer v2.1) and was able to replicate the problem. Using latest BF 3.1.3 (all defaults), BF configurator 1.9.1. Downloading the logs will sometimes freeze - about 50% of the time for me and 100% of the time for my friend. Took a look at the debug console and got the following:

onboard_logging.js:442 Dataflash dump file path: ~/Desktop/BLACKBOX_LOG_20170202_223507.bfl
msp.js:199 MSP data request timed-out: 71
msp.js:199 MSP data request timed-out: 116
msp.js:199 MSP data request timed-out: 150
msp.js:199 MSP data request timed-out: 110
msp.js:116 code: 71 - crc failed
main.html#:1 Error in event handler for serial.onReceive: TypeError: Cannot read property 'byteLength' of null
    at MspHelper.process_data (chrome-extension://koahkmddpgmmglfiggpmiliidhnoebfk/js/msp/MSPHelper.js:960:82)
    at chrome-extension://koahkmddpgmmglfiggpmiliidhnoebfk/js/msp.js:136:13
    at Object.notify (chrome-extension://koahkmddpgmmglfiggpmiliidhnoebfk/js/msp.js:135:24)
    at Object.read (chrome-extension://koahkmddpgmmglfiggpmiliidhnoebfk/js/msp.js:124:26)
    at read_serial (chrome-extension://koahkmddpgmmglfiggpmiliidhnoebfk/js/serial_backend.js:331:13)

This is with the default 115,200 baud rate. Computer is a MacBook Pro with latest OS X 10.12.3.

Only the configurator appears to hang: force-killing and reopening it allows reconnection to the flight controller, with no reboot needed.

I haven't been able to replicate with VCP targets.
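
The `TypeError` in the trace above is consistent with a handler reading `byteLength` from a payload that is null after the CRC failure. A minimal sketch of a defensive guard follows; `handleDataflashChunk` and its callbacks are hypothetical names for illustration, not the configurator's actual functions.

```javascript
// Hypothetical handler, for illustration only.
// `payload` is expected to be a DataView, or null when the frame was rejected.
function handleDataflashChunk(payload, onChunk, onError) {
    if (payload === null || payload === undefined) {
        // Report the failure instead of dereferencing a missing payload.
        onError(new Error("dataflash chunk missing (possible CRC failure)"));
        return false;
    }
    onChunk(payload.byteLength);
    return true;
}

// Usage: a rejected frame is reported, a good frame is passed on.
handleDataflashChunk(null, () => {}, (e) => console.log(e.message));
handleDataflashChunk(new DataView(new ArrayBuffer(8)), (n) => console.log(n), () => {});
```

With a guard like this the download code gets a chance to react to the bad block rather than stopping on an uncaught exception.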

@etracer65
Member

etracer65 commented Feb 3, 2017

@mikeller I tried reinstalling the latest CP210x drivers and the problem still persists. Tested on another computer with OS X 10.10 and the same issue so it's not related to an OSX 10.12 problem.

The timeout is random in that it occurs at different points in the download - doesn't seem to be a pattern. The only commonality is that the debugging reports a CRC error and then the other errors. It looks like the code doesn't recover properly from the CRC error and retry the packet.

Whether or not the CRC error is real and actually corrupted serial data is unknown. I've tried many different baud rates and USB cables and it doesn't seem to matter.

I believe this problem is related to the jumbo packets implementation. Back when that was originally implemented I did some testing and on a couple of occasions I saw the same problem. I couldn't ever replicate it then and it seemed to stop happening so I never reported it.
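
For reference, the check that is failing is a simple one. MSP v1 frames carry a one-byte checksum that is the XOR of the size byte, the command byte, and each payload byte. The sketch below assumes a regular (non-jumbo) frame and assumes that code 71 from the log is the dataflash read command; `mspV1Checksum` is an illustrative name.

```javascript
// MSP v1 checksum for a non-jumbo frame: XOR of size, command, and payload bytes.
function mspV1Checksum(command, payload) {
    let crc = (payload.length & 0xff) ^ (command & 0xff);
    for (const byte of payload) {
        crc ^= byte;
    }
    return crc & 0xff;
}

const DATAFLASH_READ_CODE = 71; // code reported in the console log (assumed meaning)
const sentPayload = Uint8Array.from([0x00, 0x10, 0x00, 0x00]);
const sentChecksum = mspV1Checksum(DATAFLASH_READ_CODE, sentPayload);

// Simulate a single bit flipped in transit.
const receivedPayload = Uint8Array.from(sentPayload);
receivedPayload[1] ^= 0x01;
const checksumMatches = mspV1Checksum(DATAFLASH_READ_CODE, receivedPayload) === sentChecksum;
console.log(checksumMatches);
```

Any single flipped bit changes the result, so a mismatch points to a damaged frame. Two flips in the same bit position of different bytes cancel out, which is one limitation of an XOR checksum.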

@etracer65
Member

I was just able to replicate this with a VCP target (Revo) on BF 3.1.3. The behavior is the same: a CRC error followed by JavaScript errors. It doesn't look like the problem is limited to CP210x targets. The debug console from the VCP failure:

Connecting to: /dev/tty.usbmodemFA131
serial.js:116 SERIAL: Connection opened with ID: 5, Baud: 115200
three.min.js:536 THREE.WebGLRenderer 72
onboard_logging.js:442 Dataflash dump file path: ~/Desktop/BLACKBOX_LOG_20170203_102123.bfl
msp.js:199 MSP data request timed-out: 71
msp.js:199 MSP data request timed-out: 116
msp.js:199 MSP data request timed-out: 150
msp.js:199 MSP data request timed-out: 110
msp.js:116 code: 71 - crc failed
main.html#:1 Error in event handler for serial.onReceive: TypeError: Cannot read property 'byteLength' of null
    at MspHelper.process_data (chrome-extension://koahkmddpgmmglfiggpmiliidhnoebfk/js/msp/MSPHelper.js:960:82)
    at chrome-extension://koahkmddpgmmglfiggpmiliidhnoebfk/js/msp.js:136:13
    at Object.notify (chrome-extension://koahkmddpgmmglfiggpmiliidhnoebfk/js/msp.js:135:24)
    at Object.read (chrome-extension://koahkmddpgmmglfiggpmiliidhnoebfk/js/msp.js:124:26)
    at read_serial (chrome-extension://koahkmddpgmmglfiggpmiliidhnoebfk/js/serial_backend.js:331:13)

@mikeller
Member

mikeller commented Feb 4, 2017

We've confirmed the problem with the XRACERSPI and are looking into reducing the block size (and speed) for this target. I've not had any other reports of download problems with REVO, so this makes me suspect that the susceptibility to these problems might be affected by the hardware that the configurator is running on.

@etracer65
Member

etracer65 commented Feb 4, 2017

@mikeller I can consistently get the SPRACERF3 target (on an X-Racer F303 v2) to fail. I tried lowering the block size in onboard_logging.js all the way down to 512 and still get occasional CRC errors and aborts (and much slower downloads). Tried multiple computers, multiple USB cables, etc.

Some thoughts: Shouldn't a CRC error be recoverable? Discard the block and re-request it? Right now the code crashes.

The most common point that the failure occurs is right at the very end of the download when the progress bar fills. This is not always the case, but probably about 75% of the time for me.

Maybe other MSP traffic interleaved with blackbox frames is causing problems and corrupting the CRC?

Since I can easily replicate on both CP210x and VCP (Revo) targets I'm happy to test any debugging you want to try.
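
The "discard the block and re-request it" idea can be sketched as a bounded retry. This is an illustration under assumed names: `readBlock` stands in for whatever issues the dataflash read, and the `{ crcError, data }` result shape is hypothetical.

```javascript
// Hypothetical sketch: re-request a block when it arrives with a CRC error.
// readBlock(address, callback) is assumed to call back with { crcError, data }.
function readBlockWithRetry(readBlock, address, maxRetries, done) {
    let attempts = 0;
    function attempt() {
        attempts += 1;
        readBlock(address, (result) => {
            if (!result.crcError) {
                done(null, result.data, attempts);
            } else if (attempts <= maxRetries) {
                attempt(); // discard the bad block and ask for the same address again
            } else {
                done(new Error("block at " + address + " failed CRC " + attempts + " times"));
            }
        });
    }
    attempt();
}

// Simulated link that corrupts the first two reads, then succeeds.
let simulatedReads = 0;
function flakyReadBlock(address, callback) {
    simulatedReads += 1;
    callback(simulatedReads <= 2
        ? { crcError: true, data: null }
        : { crcError: false, data: new Uint8Array([1, 2, 3]) });
}

let retryOutcome = null;
readBlockWithRetry(flakyReadBlock, 0, 5, (err, data, attempts) => {
    retryOutcome = { err, data, attempts };
});
console.log(retryOutcome.attempts);
```

Capping the retries keeps a persistently bad link from looping forever; the download can then stop with a clear error instead of hanging.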

@etracer65
Member

etracer65 commented Feb 4, 2017

@mikeller I've been debugging all morning and haven't come up with a reason for the CRC errors, but it looks like they're real. So instead I shifted focus to how to retry the packet and came up with a really messy fix. From the MSP processing structure it appears that it would be really difficult to generically handle retries for CRC errors, so I focused on the dataflash read messages.

Since the data structure and callback mechanism are the same for all message types, I had to come up with a way to pass back an indicator that a CRC error occurred to inform the caller to retry the block. Since we have the (currently) unused data compression type byte I used that as a flag (set to 255 on a CRC error). Patch files are attached for js/msp.js and js/msp/MSPHelper.js.

I've tested many times for both VCP and CP210x targets and have 100% success - even with high baud rates. The CRC errors are still occasionally happening (as evidenced in the console log) but the block gets retried and the download succeeds. Verified the downloaded logs exactly match between cases where they download successfully with no CRC errors and cases where the blocks have to be retried.

I realize these patches are "messy" but the goal is to suggest a solution that can be integrated in a better way.

msp.js.txt
MSPHelper.js.txt
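
A rough sketch of the in-band flag idea described above, not the attached patch itself. The response object shape and the function names are assumptions made for illustration; only the value 255 comes from the description.

```javascript
const CRC_ERROR_FLAG = 255; // flag value from the description above

// Hypothetical dispatcher step: always call back, but mark frames that failed the CRC.
function deliverDataflashFrame(frame, callback) {
    if (frame.crcValid) {
        callback({ compressionType: frame.compressionType, data: frame.data });
    } else {
        // Nothing in a corrupt frame can be trusted, so pass no data and flag it.
        callback({ compressionType: CRC_ERROR_FLAG, data: null });
    }
}

// Hypothetical caller step: advance the read address only for intact blocks.
function nextReadAddress(response, currentAddress) {
    if (response.compressionType === CRC_ERROR_FLAG) {
        return currentAddress; // same address again, i.e. retry the block
    }
    return currentAddress + response.data.byteLength;
}
```

Reusing an existing field keeps the callback signature unchanged for every other message type, at the cost of overloading the meaning of that field.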

@waltr1

waltr1 commented Feb 5, 2017

Just another data point:
I consistently get the SPRACERF3 target (on a DODO FC) to always fail right from the beginning, no movement of the status bar. This is configurator 1.9.1 without the patches.

Added the patches: copied the two files into the 1.9.1 folders, renamed the existing files by adding '.old', and deleted the '.txt' from the patch files. The configurator now does not show the opening screen and does not connect.
Renaming the files back gets the configurator working again.
Did I do this correctly?

@etracer65
Member

@waltr1 No, these patches are not the complete files - just the differences between the old and patched versions. You can't use these files directly. I added a second post on rcgroups with a link to a patched configurator version.

https://www.rcgroups.com/forums/showpost.php?p=36811734&postcount=44503

If you want to use this you'll have to download and unzip the directory and then in Chrome go to preferences -> extensions and enable developer mode (checkbox in top-right). Then click the "Load unpacked extension…" button and locate the unzipped directory.

@etracer65
Member

Created pull request 418 with the patches.

#418

@jamesarm97

Thank you. I have been unable to download blackbox logs either. It usually stops near 0% and shows an I2C error in the lower-left status line. I will try the patches.

@etracer65
Member

@jamesarm97 The patches listed here have been superseded and the final version is merged into code. So just download the current patched version from github - no need to patch yourself.

@peteoz

peteoz commented Feb 8, 2017

On my KIWI.F4 boards I have the same issue as described above.
BF3.1.3, running latest Mac OS, download freezes at a random point during download - sometimes I get 5%, other times I get to 40% or anywhere in between; I don't think I was ever able to download past 40%.
Once this happens the whole configurator freezes, clicking "cancel" has no effect, and I need to force quit and try again.
tried with 2 different KIWI.F4 boards, identical problem on both.

@peteoz

peteoz commented Feb 8, 2017

the patched version posted above solved the issue on my KIWI.F4 boards, I still get packet errors during download but the download continues and completes (previously it would freeze on first packet error).
thanks for the fix.
is the fix going to be included in the next release of BF configurator ?

@mikeller
Member

mikeller commented Feb 8, 2017

Yes, #420 is already merged into master.

Also, thanks for the feedback.

@VolcanoG

Hi, I'm having the same issue on an OmnibusF4 target, running 3.1.5.
When I try to save the blackbox log I immediately get an error packet and the progress bar freezes.

@mikeller
Member

@VolcanoG: Yes, sounds like the same problem as described above (are you using a Mac?). A workaround will be included in the next release of the configurator.

@VolcanoG

@mikeller Yes I'm on Mac OSX, fantastic looking forward to it!

@jamesarm97

The beta version now gets further than it did before (it used to stay on 0%), but now moves slowly and gets hundreds of bad packets. I still haven't been able to download a complete log (using beta as of a week ago).

@mikeller
Member

@jamesarm97: Are you using a Mac? If so, try downloading on a PC. There is a bug in the MacOS driver that seems to be the root cause of the high incidence of errors on Mac.

@VolcanoG

@mikeller Yeah I tried to download the log now on a virtual machine running windows and it downloaded the file with no issue, thanks for the help man

@peteoz

peteoz commented Feb 17, 2017

same issue affects Revolt v2 running on BF when trying to download blackbox logs,
the patched BF configurator posted above fixed the problem.

@stale

stale bot commented May 6, 2018

This issue / pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs within a week.

@stale stale bot added the Inactive label May 6, 2018