Problem with serial communication #11

Closed
alek-b opened this issue Jun 30, 2013 · 2 comments

alek-b commented Jun 30, 2013

Hi,
we are experiencing problems with the serial communication on our Mastermind and with the sensor fusion module: rx timeouts in asctec_hl_interface at the maximum baudrate, and NaN errors in the sensor fusion. We have already patched the kernel, and we use the following config file for the hl_interface:

serial_port: /dev/ttyS2
baudrate: 921600
frame_id: fcu
packet_rate_imu: 0.0
packet_rate_rc: 20.0
packet_rate_gps: 5.0
packet_rate_ssdk_debug: 10.0
packet_rate_ekf_state: 300.0

and the default one for the sensor fusion.

With this configuration no problems occurred for around 20 minutes, during which packet_rate_ekf_state was dynamically reconfigured to 700 Hz and 1 kHz without any problem.

Then we reran asctec_hl_interface, switched packet_rate_ekf_state to 1 kHz, and started the sensor fusion. After around 2-3 minutes the filter diverged, giving:

=== ERROR === before prediction p,v,q: NAN at index 0
=== ERROR === before prediction p: NAN at index 0
=== ERROR === prediction p: NAN at index 0
=== ERROR === prediction done P: NAN at index 0
=== ERROR === update: NAN at index 0

So we restarted both the hl_interface and the filter with the same config as in the first experiment (921600 baud and 300 Hz), and shortly afterwards the hl_interface gave this error:

rx timeout
[ WARN] [1372327067.024112282]: No new valid packets within the last 1.000000 s
[ WARN] [1372327068.024801337]: No new valid packets within the last 1.000000 s
[ WARN] [1372327069.024673660]: No new valid packets within the last 1.000000 s
[ WARN] [1372327070.024373841]: No new valid packets within the last 1.000000 s
[ WARN] [1372327071.023930559]: No new valid packets within the last 1.000000 s
[FATAL] [1372327071.023997467]: No valid packets within the last 5.000000 s, aborting !

Now if we rerun the hl_interface we get:

waiting for acknowledged packet timed out
rx timeout
waiting for acknowledged packet timed out
waiting for acknowledged packet timed out
rx timeout
waiting for acknowledged packet timed out
waiting for acknowledged packet timed out
[ERROR] [1372327365.126108421]: failed
[ERROR] [1372327365.126243285]: unable to connect
rx timeout
rx timeout
[fcu-1] process has died [pid 7152, exit code 255, cmd /home/asctec/ros_workspace/asctec_mav_framework/asctec_hl_interface/bin/hl_node __name:=fcu __log:=/home/asctec/.ros/log/b983e8ec-df0b-11e2-9ba7-000e8e313de3/fcu-1.log].
log file: /home/asctec/.ros/log/b983e8ec-df0b-11e2-9ba7-000e8e313de3/fcu-1*.log
all processes on machine have died, roslaunch will exit

The only way to recover from this is to turn the autopilot off and on again. But when we then reran the hl_interface (921600 baud and 300 Hz) and the sensor fusion, we again got the following on the hl_interface after a short time:

rx timeout
[ WARN] [1372327067.024112282]: No new valid packets within the last 1.000000 s

Doing the same again (turning the autopilot off and on, then rerunning the hl_interface at 921600 baud and 300 Hz together with the sensor fusion) showed no problems for around 5 minutes, after which packet_rate_ekf_state was dynamically reconfigured to 1 kHz and no problem occurred for around 7 minutes.

So we again stopped the sensor fusion and the hl_interface, reran asctec_hl_interface, and switched packet_rate_ekf_state to 1 kHz. Immediately after initializing the filter we got NaN errors.

We did the same again, and after five minutes we got:

[ WARN] [1372329184.260457270]: fuzzy tracking triggered: 0.699022 limit: 0.1

[ WARN] [1372329184.290052371]: Negative scale detected: -0.254625. Correcting to 0.1
=== ERROR === prediction done P: INF at index 78
=== ERROR === prediction done P: NAN at index 0
=== ERROR === prediction done P: NAN at index 0
=== ERROR === prediction done P: NAN at index 0
=== ERROR === prediction done P: NAN at index 0
=== ERROR === prediction done P: NAN at index 0
=== ERROR === prediction done P: NAN at index 0
=== ERROR === prediction done P: NAN at index 0
=== ERROR === prediction done P: NAN at index 0
=== ERROR === prediction done P: NAN at index 0
=== ERROR === prediction done P: NAN at index 0
=== ERROR === prediction done P: INF at index 130
=== ERROR === prediction done P: NAN at index 0
=== ERROR === update: NAN at index 0
=== ERROR === before prediction p: NAN at index 0
=== ERROR === prediction p: NAN at index 0
=== ERROR === prediction done P: NAN at index 0
=== ERROR === update: NAN at index 0
=== ERROR === before prediction p,v,q: NAN at index 0
=== ERROR === before prediction p: NAN at index 0
=== ERROR === prediction p: NAN at index 0
=== ERROR === prediction done P: NAN at index 0
=== ERROR === update: NAN at index 0

After this, if we rerun the hl_interface and the sensor fusion with a baudrate of 460800 and a data rate of 300 Hz, the system looks rock solid. However, packet_rate_ekf_state cannot be raised to 350 Hz, for example, or NaN errors occur shortly after the filter is initialized.
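As a rough sanity check on the baudrate and packet rate combinations tried above, the sketch below computes how many bytes the serial link can carry per packet interval. It assumes 8N1 framing (10 bits on the wire per byte), which is not stated in this thread, and the function names are made up for illustration. It does not know the actual size of the EKF state packet or of the other packets sharing the link, so it only gives an upper bound to compare against, not a diagnosis of the timeouts.

```python
def serial_bytes_per_second(baudrate, bits_per_byte=10):
    """Bytes per second the link can carry.

    bits_per_byte=10 assumes 8N1 framing (1 start + 8 data + 1 stop bit);
    this is an assumption, not something confirmed in the thread.
    """
    return baudrate / bits_per_byte


def bytes_per_packet_interval(baudrate, packet_rate_hz, bits_per_byte=10):
    """Upper bound on bytes available on the link during one packet interval."""
    return serial_bytes_per_second(baudrate, bits_per_byte) / packet_rate_hz


# Baudrates and packet_rate_ekf_state values mentioned in this issue.
for baud in (921600, 460800):
    for rate_hz in (300.0, 350.0, 1000.0):
        budget = bytes_per_packet_interval(baud, rate_hz)
        print(f"{baud} baud, {rate_hz:.0f} Hz: {budget:.1f} bytes per interval")
```

The budget per interval has to cover every packet type enabled in the config (rc, gps, ssdk_debug, ekf_state) plus any traffic in the other direction, so the real headroom is smaller than the printed numbers.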

The drone was always stationary, Mastermind was never restarted during these experiments, and rosrun ssf_core plot_relevant was always started after the sensor fusion to check for strange behaviors.

What do you think about these problems?

Have you ever tested this on the AscTec Mastermind?

Thanks in advance.

Best regards.

ghost assigned markusachtelik Jun 30, 2013
@markusachtelik
Contributor

Hi,

we tested this on the mastermind, yes. Does uname -r reveal the correct kernel version that you compiled? But overall, there is no need to go significantly higher than 100 Hz with the ekf state packet rate. This is only for covariance propagation, which we assume is somewhat linear during 10 ms. State prediction is still executed at 1 kHz on the HLP, regardless of packet_rate_ekf_state. That setup worked well for speeds of 4 m/s. Do you need more, or what was the motivation for 1 kHz?
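The relation between packet_rate_ekf_state and the 10 ms linearity assumption mentioned above can be worked through in a few lines. This is only an illustrative calculation: the 1 kHz prediction rate and the 100 Hz suggestion come from this reply, while the function and constant names are hypothetical and not part of asctec_hl_interface or the sensor fusion code.

```python
HLP_PREDICTION_RATE_HZ = 1000.0  # from the reply: state prediction runs at 1 kHz on the HLP


def covariance_interval_ms(packet_rate_ekf_state_hz):
    """Time between two EKF state packets, i.e. the span over which the
    covariance propagation is treated as roughly linear."""
    return 1000.0 / packet_rate_ekf_state_hz


def predictions_per_packet(packet_rate_ekf_state_hz,
                           prediction_rate_hz=HLP_PREDICTION_RATE_HZ):
    """Number of on-board state prediction steps covered by one packet."""
    return prediction_rate_hz / packet_rate_ekf_state_hz


# Rates discussed in this issue.
for rate_hz in (100.0, 300.0, 1000.0):
    print(f"{rate_hz:.0f} Hz: {covariance_interval_ms(rate_hz):.2f} ms between packets, "
          f"{predictions_per_packet(rate_hz):.1f} prediction steps per packet")
```

At 100 Hz each packet spans the 10 ms window referred to above, and raising the packet rate only shortens that window; it does not change how often the state prediction itself runs.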

@alek-b
Author

alek-b commented Jul 3, 2013

Hi,
yes, uname -r reveals the correct kernel version. Actually we don't need more, it was only for testing. With the values that you suggested everything works fine! Thanks! ;)
