
MIDCA Baxter demo


We are working toward cooperative interaction between humans and machines by starting with a small instructional problem for the robot to accomplish. The human asks the robot to pick up a colored block. The robot needs to understand what the human wants and create a plan to achieve it. The MIDCA architecture provides the reasoning to process the command and execute the right actions.

  • Speech to text: The audio signal is translated to a text string.
  • Infer goal: The text is mapped to the user's intent.
  • Create plan: The SHOP2 planner creates a plan to achieve the goal.
  • Execute plan: Actions are sent to the API.

## MIDCA ROS Interface

We added an application programming interface (API) to MIDCA_1.4 to communicate with ROS and a Baxter humanoid robot. It is responsible for sending messages to ROS as requested by MIDCA, and for placing received messages in appropriate queues for MIDCA to process. We created other ROS nodes which are responsible for specific actions, such as moving Baxter's arms and building object representations; these communicate with MIDCA through the API. The API specifies the types of ingoing and outgoing messages on each ROS topic and their meaning. As messages are received asynchronously, a set of MIDCA handlers places them in appropriate buffers within a partition of MIDCA memory. During the Perceive phase, these messages are accessed and stored in MIDCA's main memory.
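The sketch below illustrates this buffering idea, assuming rospy and String messages; the class name, its methods, and the internal queue are illustrative, not the actual handlers defined in rosrun.py.

```python
# Minimal sketch of an incoming-message handler: subscribe to one ROS topic and
# buffer messages until MIDCA's Perceive phase reads them. Illustrative only.
from queue import Queue

import rospy
from std_msgs.msg import String


class IncomingStringHandler(object):
    """Buffers String messages from a single topic for the Perceive phase."""

    def __init__(self, topic):
        self.buffer = Queue()
        rospy.Subscriber(topic, String, self._callback)

    def _callback(self, msg):
        # Runs asynchronously in a ROS callback thread: only enqueue here.
        self.buffer.put(msg.data)

    def drain(self):
        # Called once per Perceive phase: return everything received so far.
        msgs = []
        while not self.buffer.empty():
            msgs.append(self.buffer.get())
        return msgs
```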

## MIDCA Phases

The human asks the robot to pick up a colored block. An external voice recognition node runs constantly and publishes utterances as string messages on UTTERANCE_TOPIC. Once MIDCA receives a message on this topic, it puts the message on an appropriate queue to be processed later in the perceive phase. In the perceive phase, MIDCA reads messages from all the queues, processes them, and stores the processed data in MIDCA's main memory. In the interpret phase, MIDCA checks whether it has received any instruction. If it detects the message 'get the red block', it creates the goal 'holding the red block' (a minimal sketch of this mapping appears below). Once the goal is created, it is stored in the goal graph. In the plan phase, after MIDCA creates a high-level plan for the selected goal, it operationalizes each action using a mapping between high-level actions and directly executable methods for the robot. For example, the high-level action reach(object) is instantiated in a method which sends out a ROS message to the node which operates the arm, then repeatedly checks for feedback indicating success or failure. Once all actions in a plan are complete, the plan itself is considered complete. If any action fails, the plan is considered failed. In the act phase, MIDCA loads a plan for the current goal from memory. In each cycle, one action starts; if it completes within a certain amount of time, the next action starts in the next cycle. If any action fails, the remaining actions do not start and the plan fails.

We created other ROS nodes which are responsible for specific actions, such as moving Baxter's arms, and for producing object representations. These communicate with MIDCA through the API: they listen on different ROS topics for MIDCA commands and act on them appropriately. As mentioned earlier, the act phase contains modules which, when an action is chosen, publish the correct command on the topic that the corresponding low-level process is listening to.
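As a rough illustration of the interpret step, the mapping from a recognized utterance to a goal can be as simple as the sketch below. The Goal constructor arguments mirror the 'holding' goal used in this demo, but the helper function, its keyword matching, and the import path are assumptions rather than the demo's actual module code.

```python
# Sketch of the utterance-to-goal mapping used in the interpret phase.
from MIDCA.goals import Goal  # assumed import path


def goal_from_utterance(utterance):
    """Map a command such as 'get the red block' to a 'holding' goal."""
    words = utterance.lower().split()
    colors = ["red", "green", "blue", "yellow"]  # illustrative color set
    color = next((w for w in words if w in colors), None)
    if "get" in words and color is not None:
        return Goal(objective="holding",
                    subject="self",
                    directObject=color + " block",
                    indirectObject="observer")
    return None  # no instruction recognized, so no goal is generated
```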

### Topic Names

obj_pos: the topic on which the list of objects with their locations is published.

  • The message format is: object_1:x,y,z;object_2:x,y,z; (a parsing sketch appears after this topic list)
  • OD.py (baxter_srv/experiment/OD.py) publishes the list on this topic
  • must be changed in the MIDCA run script (baxter_run.py) and in the object detection node (e.g. baxter_srv/experiment/OD.py)

UTTERANCE_TOPIC: the topic on which utterances (e.g. commands) Baxter hears are published.

  • must be changed in the MIDCA run script (baxter_run.py) and in the utterance listener node

LOC_TOPIC: the topic on which point commands from MIDCA are published.

  • must be changed in asynch.py and in the point effector node (grabbing.py reads asynch's value)

GRAB_TOPIC, RELEASE_TOPIC, RAISE_TOPIC: the topics on which grabbing, releasing and raising commands from MIDCA are published.

  • must be changed in asynch.py and in the point effector node (grabbing.py reads asynch's value)
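The obj_pos format above can be unpacked with a few lines of string handling. The helper below is a sketch under the assumption that object names contain no ':' or ';' characters; the real parsing lives in MIDCA's perception code.

```python
# Sketch: parse an obj_pos message of the form "object_1:x,y,z;object_2:x,y,z;"
# into a dict mapping object name -> (x, y, z). Function name is illustrative.
def parse_obj_pos(msg):
    objects = {}
    for entry in msg.strip().split(";"):
        if not entry:
            continue  # skip the empty trailing field after the final ';'
        name, coords = entry.split(":")
        x, y, z = (float(c) for c in coords.split(","))
        objects[name] = (x, y, z)
    return objects


# Example (made-up values):
# parse_obj_pos("red block:0.62,0.10,-0.05;green block:0.60,-0.20,-0.05;")
# -> {'red block': (0.62, 0.1, -0.05), 'green block': (0.6, -0.2, -0.05)}
```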

### External steps (sensors and effectors)

These can be started in any order, but must all be started for the demo to work.

  1. Start an external object detection ROS node which publishes object locations as string messages on the obj_pos topic. Implementation in the baxter_srv repository: baxter_srv/experiment/OD.py. This node finds the locations of known objects in the image received from Baxter's right hand camera. It uses OpenCV libraries and detects objects by their color; each color is specified as an HSV range, and more color ranges can be added to detect more objects.

  2. Start an external voice recognition node which publishes utterances as string messages on UTTERANCE_TOPIC. Implementation is currently in the baxter_cog repository, to be added to MIDCA.

  3. Start an external node which listens for point commands on LOC_TOPIC, GRAB_TOPIC, RELEASE_TOPIC and RAISE_TOPIC. A point command is a String message encoding a dictionary containing x, y, z coordinates. rosrun.py contains methods for transforming between String and dict; see examples/_baxter/grabbing.py for an implementation. This node should also publish feedback when it encounters an error or completes its task; this too is implemented in grabbing.py. A minimal effector-node sketch follows this list.
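The sketch below shows the general shape of such an effector node under several assumptions: the topic name strings, the feedback topic, the use of ast.literal_eval for the String-to-dict transform, and the point_arm_at placeholder are all illustrative, not the actual grabbing.py or rosrun.py code.

```python
# Sketch of an effector node: listen for point commands on LOC_TOPIC, move the
# arm (omitted), and publish success/failure feedback. Illustrative only.
from ast import literal_eval

import rospy
from std_msgs.msg import String

LOC_TOPIC = "loc_cmd"            # assumed topic name
FEEDBACK_TOPIC = "cmd_feedback"  # assumed topic name


def point_arm_at(x, y, z):
    pass  # arm control omitted in this sketch


def main():
    rospy.init_node("point_effector_sketch")
    feedback_pub = rospy.Publisher(FEEDBACK_TOPIC, String, queue_size=10)

    def on_point_cmd(msg):
        # e.g. msg.data == "{'x': 0.62, 'y': 0.10, 'z': -0.05, 'cmd_id': 3}"
        cmd = literal_eval(msg.data)
        try:
            point_arm_at(cmd["x"], cmd["y"], cmd["z"])
            feedback_pub.publish(String(data="complete " + str(cmd["cmd_id"])))
        except Exception:
            feedback_pub.publish(String(data="failed " + str(cmd["cmd_id"])))

    rospy.Subscriber(LOC_TOPIC, String, on_point_cmd)
    rospy.spin()


if __name__ == "__main__":
    main()
```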

### MIDCA setup (all steps from baxter_run_OD.py)

  1. Create a new MIDCA object and add robot domain-specific modules to it.

  2. Create a RosMidca object. This object is responsible for sending messages to ROS as requested by MIDCA, and for placing messages received in appropriate queues for MIDCA to process. At present, all topics which will be used for incoming or outgoing messages must be specified at creation.

2.1) As arguments to the RosMidca constructor, pass in handlers for incoming and outgoing messages. In this demo, MIDCA uses three incomingMsgHandlers: (a) an ObjectsLocationHandler, which receives information about the locations of objects; (b) an UtteranceHandler, which receives utterances as Strings; and (c) a FeedbackHandler, which receives MIDCA Feedback objects reporting the success or failure of requested actions. It also uses one outgoingMsgHandler: a handler which sends out String messages to command other ROS nodes (to move the arm, grab, etc.).

  3. Call ros_connect() on the RosMidca object. Note that the ROS master node must already have been started or this method will fail.

  4. Call run_midca() on the RosMidca object. This runs MIDCA asynchronously. If a certain rate (phases/second) is desired, it can be passed as the cycleRate argument of this method (default 10). A condensed setup sketch follows this list.
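Putting steps 1–4 together, a run script has roughly the shape sketched below. Only ros_connect(), run_midca(), the cycleRate argument, and the handler class names come from the description above; the import paths, constructor keyword names, and topic strings are assumptions.

```python
# Condensed sketch of the MIDCA setup steps, in the spirit of baxter_run_OD.py.
from MIDCA import base, rosrun  # assumed module layout

# 1. Create a MIDCA object and add the robot domain-specific phase modules.
myMidca = base.MIDCA(world=None)
# ... append Perceive/Interpret/Eval/Intend/Plan/Act modules here ...

# 2. Create a RosMidca object with handlers for every topic that will be used.
rosMidca = rosrun.RosMidca(
    myMidca,
    incomingMsgHandlers=[
        rosrun.ObjectsLocationHandler("obj_pos", myMidca),
        rosrun.UtteranceHandler("utterance_topic", myMidca),
        rosrun.FeedbackHandler("feedback_topic", myMidca),
    ],
    outgoingMsgHandlers=[
        rosrun.OutgoingMsgHandler("loc_cmd"),  # assumed outgoing handler/topic
    ],
)

# 3. Connect to ROS; the ROS master must already be running.
rosMidca.ros_connect()

# 4. Run MIDCA asynchronously at roughly 10 phases per second.
rosMidca.run_midca(cycleRate=10)
```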

### What MIDCA does while running

  1. Asynchronously to MIDCA's cyclical phase behavior, RosMidca's handlers listen for incoming messages. As messages are received, the handlers place them into appropriate queues in a partition of MIDCA's memory that could be thought of as the subconscious, or perhaps preconscious.

    -Note: if external perception changes its output style or capabilities, the handlers - defined in rosrun.py - are responsible for adjusting to process the new input. Specifically, for each new input type or format, a new handler should be created.

  2. In the perceive phase, MIDCA reads messages from all queues, processes them as necessary, adds a time stamp to indicate when each message was received, and stores the processed data in MIDCA's main memory. Note that only the perceive phase accesses the incoming message queues.

  3. In the interpret phase, MIDCA checks to see if it has received any verbal instructions. If it gets the message 'get the red object', it will create the goal: Goal(objective = "holding", subject = "self", directObject = "red object", indirectObject = "observer"). Once a goal is created it will be stored in the goal graph. In this demo, since all goals are identical and identical goals are only stored once, there will never be multiple goals in the goal graph, though the same goal may be added again after it is achieved and removed.

  4. In the Eval phase, MIDCA checks to see if its current plan is complete. If it is, it declares the goal of that plan completed and removes it and the plan from memory.

  5. In the intend phase, MIDCA selects all goals of maximal priority from the goal graph. In this demo, there is never more than one goal, so MIDCA will select that goal if it exists in the graph.

  6. In the planning phase, MIDCA checks to see if an old plan exists for the current goal. If not, it creates a high level plan by using the pyhop planner, then transforms it into an actionable plan using a mapping between high-level actions and methods to carry them out. For example, the high level action point_to(object) is instantiated in a method which sends out ROS messages to the pointing effector node, then repeatedly checks for feedback indicating success or failure. Once all actions in a plan are complete, the plan itself is considered complete. If any action fails, the plan is considered failed.

  7. In the Act phase, MIDCA attempts to load a plan for the current goal from memory. If there is one, it follows this pattern:

```python
currentAction = plan.firstAction
while currentAction != None:
    if currentAction.complete:
        currentAction = plan.nextAction()
        continue
    elif currentAction.not_started:
        currentAction.start()
    if currentAction.failed or currentAction.isBlocking:
        break
```

In other words, actions are begun successively until either one fails or a blocking action is reached. Actions are assumed to be running asynchronously from when they are begun to when they are declared completed.

  8. Lower-level details of planning and action
  • planning methods and operators for this demo are in the _planning/asynch folder.

  • low-level methods - see the point_to example in 6) - are defined in planning/asynch/asynch.py

  • Each Asynch[ronous]Action - a low-level method - defines an isComplete python function and an executeFunc function, which are passed into the constructor as arguments (a minimal sketch follows this list). These two functions fully define the behavior of the action. So, for example, the do_point() AsynchAction's isComplete function checks MIDCA's memory for feedback indicating the action's completion or failure, then updates its status as appropriate. The execute function searches memory for the last known location of the object given as an argument (from the high-level plan), then creates a ROS message containing that location and a command id - for later feedback - and requests that RosMidca broadcast the message.

  • This setup means that if external effectors change their input requirements, MIDCA's high-level planning can stay the same, but the interface between the two defined in asynch.py must change. Specifically, a new AsynchAction must be created for each new behavior type, though this process could be automated to some degree.

  • As an aside, the mirror of the last point with respect to perception rather than action is also true. See the note after 1).
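The AsynchAction pattern described above can be pictured with the sketch below. The constructor signature, attribute names, and status strings are assumptions; see planning/asynch/asynch.py for the real classes.

```python
# Sketch of the AsynchAction pattern: the two functions passed to the
# constructor fully define the action's behavior. Illustrative only.
class AsynchAction(object):
    def __init__(self, midca, goals, executeFunc, isComplete, isBlocking=True):
        self.midca = midca
        self.goals = goals
        self.executeFunc = executeFunc  # called once when the action starts
        self.isComplete = isComplete    # polled to update the action's status
        self.isBlocking = isBlocking
        self.status = "not started"

    def start(self):
        # e.g. for do_point: look up the object's last known location in memory,
        # build a ROS message with that location and a command id, and ask
        # RosMidca to broadcast it.
        self.status = "in progress"
        self.executeFunc(self.midca, self.goals)

    def check_complete(self):
        # e.g. for do_point: scan memory for feedback matching this action's
        # command id and mark the action complete or failed accordingly.
        if self.isComplete(self.midca, self.goals):
            self.status = "complete"
```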

## Camera Calibration

We use Baxter's right hand camera to observe the table and use the color of each object to find where it is in the image; camera calibration then lets us map that image location to a location on the table plane. A ROS service called "Right_hand_camera" retrieves images from Baxter's right hand camera. It subscribes to the topic /cameras/right_hand_camera/image, where the images coming from the camera are published, and upon being called it stores and returns the last image published.

To get the coordinates of a point on the table plane from the coordinates of an object in the image, we perform a calibration step before running MIDCA. We mark four points on the table and sample each one by moving Baxter's left arm to it and recording the position of the left arm's end-effector. We then identify the corresponding pixels by clicking on those same points in the image. From these point pairs we calculate a matrix H, called a homography, which represents a linear transform between points on the table plane and points in the image. Using the inverse of H, given a pixel in the image we can calculate the coordinates of an object on the table plane.

We use the OpenCV library to recognize an object by its color. For each color of interest, we filter the pixels whose values fall within a certain range in HSV color space and transform the image into black and white: if red is the current color, every red pixel becomes white and every other pixel becomes black. We then find the largest contour in the resulting image, which represents the object, take the center of that contour as the selected pixel, and use the inverse of H to find the object's position on the table plane. For each defined HSV range, this algorithm finds the location of that colored object on the table plane and adds it to a list. MIDCA thus receives a list of the objects visible in the current scene, with information on their color and location. A minimal OpenCV sketch of this pipeline follows.
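The sketch below follows the same pipeline with OpenCV. The four table points, the clicked pixel coordinates, and the HSV range are made-up placeholders, and the function names are illustrative; the real implementation is in baxter_srv/experiment/OD.py.

```python
# Sketch of the calibration + color-detection pipeline described above.
import cv2
import numpy as np

# Calibration: four table-plane points (x, y) measured with Baxter's left arm,
# and the pixels clicked at those same points in the right-hand-camera image.
table_pts = np.array([[0.50, -0.30], [0.50, 0.30], [0.80, 0.30], [0.80, -0.30]],
                     dtype=np.float32)
image_pts = np.array([[102, 415], [538, 410], [520, 80], [120, 85]],
                     dtype=np.float32)
H, _ = cv2.findHomography(table_pts, image_pts)  # table plane -> image
H_inv = np.linalg.inv(H)                         # image -> table plane


def locate_color(image_bgr, hsv_lo, hsv_hi):
    """Return (x, y) table coordinates of the largest blob in an HSV range."""
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, hsv_lo, hsv_hi)       # matching pixels become white
    # [-2] picks the contour list for both OpenCV 3.x and 4.x return signatures.
    contours = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
    if not contours:
        return None
    c = max(contours, key=cv2.contourArea)        # largest contour = the object
    m = cv2.moments(c)
    if m["m00"] == 0:
        return None
    u, v = m["m10"] / m["m00"], m["m01"] / m["m00"]  # contour center pixel
    p = H_inv.dot(np.array([u, v, 1.0]))             # back-project through H^-1
    return p[0] / p[2], p[1] / p[2]                  # normalize homogeneous coords


# Example HSV range (placeholder values for a reddish object):
# locate_color(image, np.array([0, 120, 70]), np.array([10, 255, 255]))
```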