-
Notifications
You must be signed in to change notification settings - Fork 118
Potential fix for current robot state incorrect #407
Conversation
…e monitor anyway)
Fixes the issue on my system. +1 @sachinchitta , I owe you a beer. |
This change will unfortunately also trigger updating the state many times a second, depending on how often /joint_state comes in. This means that if move_group is publishing to the rviz plugin, you will also get diffs. If an octomap is being maintained, the planning scene diff will include the octomap. We do not support diffs for the octomap, so we will end up pushing it over the wire very often. This will probably cause issues, depending on computer & network speed. The nice thing would be to actually implement diffs for the octomap, but I think we should throttle state updates in any case. I will take a look at this soon. |
Updating the current state properly is important - what was happening before was not correct. Throttling it is fine but not updating is not. The other option is to have a delay in current state monitor (maybe 1s ) before updating is stopped even if the state is not changing. That way we can be sure that the last few state updates have gone through. |
Fixes it for me as well (though I don't use octomaps). @sachinchitta: just out of curiosity though: what is the difference between plan then execute, and just hitting plan & execute that causes the latter to 'work' without this change? Update of the robot state in the RViz window is much smoother in that case as well (with just execute I only see the second-to-last and the last state update). |
Another solution would be to create a timer callback which fires every dt_state_update_ sec (i.e. every 0.1 sec) and checks whether there is an unprocessed state update. The callback would check for any update that has arrived more than dt_state_update_ sec earlier and not been processed, and if so process it. This would mean storing (a copy of) any state update that arrives and is not processed. Perhaps an even better way of doing it is to store any state update that arrives, and then in the timer callback process the most recent one (if any). |
@acornacorn I agree. I actually started on a fix along these lines. I added an optional period for the addUpdateCallback() (defaults to 0). Then when we get updates, we just cache the shared ptr to the message, so copy is not really an issue. The only thing is that while we have an unprocessed update we have a busy wait loop until the last updater (longest period) has received its state (in the thread that calls the callbacks). If we have an updater that expects updates every second, this is not nice. You could wait the minimum duration too, but if that is 0, we end up with the busy wait loop. What I think is the better approach is to create an updater thread for every requested update callback. The only thing to keep track of then is making sure all threads got the new joint_state. It seems we will need a counter for that. How does this sound to you? I can work on it this weekend unless somebody else fixes it :) |
@isucan I'm not following the reason we need to busy wait -- where is that needed? |
@acornacorn I am not sure I have the right design in mind yet :) Maybe we do not. Here is what I think the version with one update thread is:
|
OK, so we are using a condition variable to trigger updates in the update thread. My suggestion is:
With this scheme:
Does that make sense? |
Discussed with Sachin. A much simpler solution is to just use the ros callback queue like we are doing now. It would work like this:
This means that the timer is running all the time. But the callback will be very quick if no updates are pending. And this is a much simpler change than creating a whole new mechanism with condition variables. |
Hey @acornacorn, I really like your first suggestion. |
Hmm. That timeout 0 thing affects all suggestions so far I think :( I was looking at the timer execution thread: http://docs.ros.org/hydro/api/roscpp/html/timer_8cpp_source.html The main issue is that we want to not run at the minimum dt_update_time_, because that could be 0, and only trigger logic when a message comes in. |
My first suggested solution should work fine with a dt_update_time_ of 0. When dt_update_time_ is 0 the incoming message will always trigger the execution right away and the "timeout" variable will never be set (it will always be 1000sec or whatever we set it to at the start of the loop). My second suggested solution should also work fine. If dt_update_time_ is 0 then don't start the timer at all (or start it and set it to a very long period like 1000 sec). The updates will always be processed immediately when they arrive and the timer callback will never be needed. If there are several update messages and they all have different dt_update_time_ values then start the timer and set its period to the smallest nonzero dt_update_time_ value. Again, messages with a dt_update_time_ of 0 will always be processed when they are first seen and never need the timer callback. The second solution seems to be closer to the way it is implemented now. Am I missing something? Where are the condition variables now? Regarding multiple threads: either of these suggestions should work fine with multiple threads. With multiple threads each message type would need its own mutex to ensure 2 threads do not merge or execute the same type of update at the same time. |
@acornacorn that sounds good. I think I understood something incorrectly from your first comment. Any of these solutions is fine by me! Currently we use condition variables in robot_interaction, the rviz plugin, planning_scene_monitor, trajectory_execution, octomap updaters and pick & place. |
I just tested Sachin's fix with a real robot that did not start at the home position (i.e. not all zeros like moveit and the ROS-I simulator assumes). The Scene robot does not match the current robot state on startup. Pressing the update start state to current state does not update the scene robot as one would expect. There is a work around that fixes the issue:
|
This seems related - it does look like an initialization error where initial updates are either not getting through or being missed completely. I do think Acorn's proposed fix should handle this initialization issue as well - we will try and get it done quickly this week. |
@shaun-edwards - can you try with my hacked update to see if the initialization issue also gets fixed on your robot. I must admit that I thought this would not work for a robot that starts with all zeros but would work for a robot that does not start there but your problem seems to be the other way around. |
@sachinchitta - I tried your latest change. I'm still having the same issue. Just to be sure, I verified that your latest code change was there and added a log message (it appears when I run the robot). The "scene robot" and "start state" are still stuck at the zero position. Perhaps this is caused by a different issue. What is odd, is that the state does NOT update eventually, even if it took a few seconds or so. I'm happy to test, but let me know if you want be to look into this further. I certainly could try to narrow down the problem better. |
I can see where it would fail to update with the simulator - current_state_monitor initializes robot_state to default values and then only does an update if the values have changed. If the simulator publishes default (zero) values on startup, the initial update won't get through until the robot moves. Not sure why this would happen with a robot where the initial state is different from the default state though - that update should just go through. Will keep digging. |
Ok, I think I have both issues figured out and a potential fix attached to this pull request now - its not the fix that @acornacorn is talking about above but it at least highlights the issues. There are two issues here: (1) some updates were not getting through - this is something that we have discussed above, my solution for not throttling state updates will fix this but it might be better to use Acorn's approach @shaun-edwards, @k-okada let me know if the two fixes I have in here at least temporarily fix your issues. |
It has fixed the problem on my robot. |
@sachinchitta, any ETA on when the patch will be released into debians (I realize @acornacorn is working on a more permanent fix) |
I have an alternative fix in pull request #408 This works as I described above #407 (comment) Basically a timer ensures that state updates that arrive and are not processed right away are eventually (within at most dt_state_update_*2) processed (e.g. if no other state updates arrive for a long time). Note: the fix does not include a fix for the "Rviz starts late and misses the initial planning scene message" problem. That will be fixed in a separate pull request... |
I propose we close this pull request without merging in favour of #408. |
I have an additional fix for the planning scene initial state problem. Pull request #409 modifies PlanningSceneDisplay so that it queries the full state of the planning scene when Rviz starts (actually when the PlanningSceneDisplay is enabled and the robot model is loaded). Pull request #409 includes the changes in #408 so if we merge #409 then we can abandon #407 and #408 . I have confirmed that this fixes both problems on my system. @shaun-edwards can you confirm that #408 fixes the problems you were seeing? |
Closing this as its now handled by #409. |
updates should always happen here (they are throttled in current state monitor anyway).
This should fix #405 - the issue was the following:
(1) current_state_monitor will not pass updates on if none of the joints have changed values. industrial simulator publishes updates at 10 Hz.
(2) planning_scene_monitor will not process a state update if they are not spaced apart by at least dt_state_update_ which defaults to 0.1s. So, the last state update never gets processed when being used with industrial robot simulator.
(3) Changing the dt_state_update is not right either since planning_scene_monitor can still miss some updates
Solution is to not use dt_state_update but do an update in planning_scene_monitor everytime current_state_monitor updates - note that this could potentially cause issues if current_state updates come in too fast.