Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enable to cleanly quit from the program when audio is enabled #61

Closed
wants to merge 2 commits into from

Conversation

k-okada
Copy link
Member

@k-okada k-okada commented Jan 29, 2016

see discussion #57

@k-okada k-okada changed the title enable to cleanry quit from the program when audio is enabled enable to cleanly quit from the program when audio is enabled Jan 29, 2016
@k-okada
Copy link
Member Author

k-okada commented Jan 29, 2016

without this naoqi_driver could not shutdown with C-c

@suryaambrose
Copy link
Member

Hi

Does this really fix your issue ? I am a little skeptical.
I tried to reproduce yesterday, but here my problem did not come from audio, but from tactile and bumper. I am testing with a Nao, and the bridge actually never returns properly.
But without tactile or bumper, I have
^C[I] 27594 qi.Application: Sending the stop command... ^\zsh: quit (core dumped) rosrun naoqi_driver naoqi_driver_node --qi-url tcp://169.254.113.113:9559, regardless of audio status. The camera subscribers are properly cleaned so I cannot have your issue (front_cameraMight be a NAOqi problem. Try to restart the ALVideoDevice. front_cameraCamera Handle is empty - cannot retrieve image)

When I enable tactile and/or bumper, the bridge ends up like this:
^C[I] 2616 qi.Application: Sending the stop command... Audio Extractor: Stop terminate called after throwing an instance of 'qi::FutureUserException' what(): ALMemory::getData Data not found ROS-Driver-FrontTactilTouched

And after 6 or 7 tries, I get your issue, because the camera subscribers are not properly cleaned.

@k-okada
Copy link
Member Author

k-okada commented Jan 29, 2016

No,we also just realized this does not fixes the issue #57, but without this, we could not kill the naoqi_driver

speech reset
resetting service robot config service
robot config service is resetting

^C[pepper_robot/pose/pose_manager-4] killing on exit
[pepper_robot/pose/pose_controller-3] killing on exit
[INFO] [WallTime: 1454053775.601248] Stopping pose_controller
[pepper_robot/naoqi_driver-2] killing on exit
[INFO] [WallTime: 1454053775.601636] pose_controller stopped
[I] 16089 qi.Application: Sending the stop command...
[INFO] [WallTime: 1454053775.750736] nao pose_controller stopped.



^Z
[2]+  Stopped                 roslaunch pepper_bringup pepper_full.launch nao_ip:=133.11.216.190
k-okada@kokada-t440s:~/catkin_ws/ws_naoqi/src/naoqi_driver/src$ bg
[2]+ roslaunch pepper_bringup pepper_full.launch nao_ip:=133.11.216.190 &
k-okada@kokada-t440s:~/catkin_ws/ws_naoqi/src/naoqi_driver/src$ fg
emacs -nw naoqi_driver.cpp

[1]+  Stopped                 emacs -nw naoqi_driver.cpp
k-okada@kokada-t440s:~/catkin_ws/ws_naoqi/src/naoqi_driver/src$ fg
emacs -nw naoqi_driver.cpp

[1]+  Stopped                 emacs -nw naoqi_driver.cpp
k-okada@kokada-t440s:~/catkin_ws/ws_naoqi/src/naoqi_driver/src$ [pepper_robot/naoqi_driver-2] escalating to SIGKILL
[rosout-1] killing on exit
[master] killing on exit
Shutdown errors:
 * process[pepper_robot/naoqi_driver-2, pid 15654]: required SIGKILL. May still be running.
shutting down processing monitor...
... shutting down processing monitor complete
done

[2]-  Done                    roslaunch pepper_bringup pepper_full.launch nao_ip:=133.11.216.190
k-okada@kokada-t440s:~/catkin_ws/ws_naoqi/src/naoqi_driver/src$ 

@k-okada
Copy link
Member Author

k-okada commented Jan 29, 2016

and you're correct, even we have this, we still have some error on termination process,

^C[pepper_robot/pose/pose_manager-4] killing on exit
[pepper_robot/pose/pose_controller-3] killing on exit
[pepper_robot/naoqi_driver-2] killing on exit
[INFO] [WallTime: 1454053868.971738] Stopping pose_controller
[I] 17999 qi.Application: Sending the stop command...
[INFO] [WallTime: 1454053868.972319] pose_controller stopped
Audio Extractor: Stop
terminate called after throwing an instance of 'qi::FutureUserException'
  what():   ALMemory::getData
    Data not found ROS-Driver-RightBumperPressed
[INFO] [WallTime: 1454053869.099025] nao pose_controller stopped.
[rosout-1] killing on exit
[master] killing on exit
shutting down processing monitor...
... shutting down processing monitor complete

@k-okada
Copy link
Member Author

k-okada commented Jan 29, 2016

k-okada@1f7efa0 may fix the bumper/tactile terminatiion problem.

@suryaambrose
Copy link
Member

Yes your last commit should fix it !

About the audio I am not sure this works. Can you check it is still working ? Looking at the code, calling subscribe and then setClientPreferences should stop the audio transmission.

@k-okada
Copy link
Member Author

k-okada commented Jan 29, 2016

yes, we need to call subscribe first to prevent the error...

@k-okada
Copy link
Member Author

k-okada commented Jan 29, 2016

we can see following message on the exit, and which was front_cmaera_1 in several attempt before. so at some point the suffix number is increasing

Unsubscribe camera handle front_camera_2
Unsubscribe camera handle bottom_camera_2
Unsubscribe camera handle depth_camera_2
Unsubscribe camera handle infrared_camera_2
naoqi driver is shutting down..

@suryaambrose
Copy link
Member

And do you still receive audio buffer on /naoqi_driver_node/audio ?

@k-okada
Copy link
Member Author

k-okada commented Jan 29, 2016

ooh not it did not publish /audio, that's why it seems working, so any idea?

@suryaambrose
Copy link
Member

Not yet :)
But I'm looking into it

@suryaambrose
Copy link
Member

BTW which version of naoqi do you use ?

@k-okada
Copy link
Member Author

k-okada commented Jan 29, 2016

I'm using default version installed by ros-indigo

k-okada@kokada-t440s:~/catkin_ws/ws_naoqi/src/naoqi_driver/src$ rosversion  naoqi_libqi
2.3.0
k-okada@kokada-t440s:~/catkin_ws/ws_naoqi/src/naoqi_driver/src$ rosversion  naoqi_libqicore
2.3.1

@k-okada
Copy link
Member Author

k-okada commented Jan 29, 2016

just for memo, following code seems better than before...

@@ -162,6 +162,7 @@ void AudioEventRegister::unregisterCallback()

 void AudioEventRegister::processRemote(int nbOfChannels, int samplesByChannel, qi::AnyValue altimestamp, qi::AnyValue buffer)
 {
+  boost::mutex::scoped_lock callback_lock(mutex_);
   naoqi_bridge_msgs::AudioBuffer msg = naoqi_bridge_msgs::AudioBuffer();
   msg.header.stamp = ros::Time::now();
   msg.frequency = 48000;
@@ -174,7 +175,6 @@ void AudioEventRegister::processRemote(int nbOfChannels, int samplesByChannel, q
   msg.data = std::vector<int16_t>(remoteBuffer, remoteBuffer+bufferSize);

   std::vector<message_actions::MessageAction> actions;
-  boost::mutex::scoped_lock callback_lock(mutex_);
   if (isStarted_) {
     // CHECK FOR PUBLISH
     if ( isPublishing_ && publisher_->isSubscribed() )

@suryaambrose
Copy link
Member

Hum I'll study this change.

And the naoqi version on the robot (for audio it matters) ?

@k-okada
Copy link
Member Author

k-okada commented Jan 30, 2016

naoqi version on the robot is 2.4.2.26

@k-okada
Copy link
Member Author

k-okada commented Jan 30, 2016

void AudioEventRegister::stopProcess()
{
  std::cerr << __PRETTY_FUNCTION__ << " " << serviceId << std::endl;
  //boost::mutex::scoped_lock stop_lock(mutex_);                                                                                                                                                             
  std::cerr << __PRETTY_FUNCTION__ << " : lock " << isStarted_ << std::endl;
  if (isStarted_)
  {

may be much better. I suspect it is the problem of mutex is failing to unlock when C-C is pressed.

@suryaambrose
Copy link
Member

Something like that you are right.
I managed to reproduce and gdb the program. It is stuck in two functions:
processRemote, trying to get the lock but not getting it
stopProcess, waiting for its "unsubscribe" call to return.

I guess AudioDevice waits for the callback to return before unsubscribing naoqi_driver, leading to this deadlock.

Removing the mutex in the stopProcess will indeed solve the issue but I'm afraid it might induce crashes of naoqi if it is not enough protected. I'll check that on Monday. If it is not safe then we will probably need something more complex.

@suryaambrose
Copy link
Member

@k-okada if #62 is ok for you, we can merge #62 and close #61

@k-okada k-okada closed this Feb 3, 2016
@k-okada
Copy link
Member Author

k-okada commented Feb 3, 2016

#62 is ok for me, thanks

◉ Kei Okada

On Wed, Feb 3, 2016 at 4:41 AM, Surya Ambrose notifications@github.com
wrote:

@k-okada https://github.com/k-okada if #62
#62 is ok for you, we can
merge #62 #62 and close
#61 #61


Reply to this email directly or view it on GitHub
#61 (comment)
.

@k-okada k-okada deleted the fix_57 branch February 4, 2016 00:21
@k-okada
Copy link
Member Author

k-okada commented Feb 4, 2016

And I've just released new version as 0.5.7

@vrabaud
Copy link
Contributor

vrabaud commented Feb 4, 2016

thx ! that should be a pretty solid version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants