Johannes Kummert edited this page Jul 17, 2018 · 18 revisions

This is a collection of descriptions of software packages, components, and libraries that may be used in RoboCup@Home.


Team software overview

If needed, new columns can be added

| Team | World Modelling | Mapping | Face recognition | People Tracking / Following | Navigation | Pose recognition | Action Planning | Speech recognition | Object recognition | Text-based interface | Sound source localization | Speech Synthesis | Manipulation planning |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Homer | homer_semantic_knowledgebase (Graph based knowledge representation) | homer_mapping (Hyperparticle SLAM) | homer_face_rec (HyperFace / Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks / Deep Metric Learning) | homer_people_following (Joint Operator Detection and Tracking for Person Following from Mobile Platforms) | homer_navigation | homer_openpose (Convolutional Pose Machines) | homer_actions | Nuance VoCon / homer_deepspeech | homer_home_net, homer_mask_rcnn, homer_yolo, homer_object_recognition (feature based) | homer_gui, homer_webinterface | homer_matrix_creator / Hark | homer_tts (Wrappers for Mary TTS, festival, pico) | |
| PUMAS | | gmapping | | | PUMAS navigation (A* for path planning, obstacle avoidance using the Kinect sensor, and path tracking with non-linear control) | | | | | | | | |
| TechUnited | ED | | OpenFace | | CB_base_navigation with ROS move_base | openpose_ros | action_server | Dragonfly | Retrained inceptionV3 | Telegram_ros + conversation_engine + grammar_parser, using fCGFs | Matrix_creator_ros | | MoveIt! + ed_moveit |
| SocRob | rosplan semantic knowledge base | gmapping and cartographer | b-it-bots face classification | HumanAwareness, bayes_people_tracker, spencer_people_tracking, bayestracking | ROS move base | openpose | custom ISR planning framework | Nuance vocon | MCR OR mean circle and darknet yolo | Custom (closed source) Natural Language Understanding | HARK | espeak | simple pregrasp planner and moveit |
| ToBI | knowledge base | gmapping | | | ROS move base | openpose ros wrapper | Bonsai | Pocketsphinx | Object Rec Pipeline using tensorflow | | Naoqi | Naoqi | |
| Walking Machine | | | | | | | | | | | | | |
| Happy Robot | | | | | | | | | | | | | |


Robot Control

  • Fawkes is a component-based software framework for real-time robotic applications on various platforms and in various domains. It has been developed and used for over two years by the AllemaniACs RoboCup Team for cognitive robotics real-time applications such as soccer and service robotics. It supports fast information exchange and efficient combination and coordination of different components to suit the needs of mobile robots operating in uncertain environments. It is massively multi-threaded, uses hybrid blackboard/messaging data exchange, and features a Lua-based behavior engine.
  • CARMEN is an open-source collection of software for mobile robot control. CARMEN is modular software designed to provide basic navigation primitives, including base and sensor control, logging, obstacle avoidance, localization, path planning, and mapping.
  • Player provides a network interface to a variety of robot and sensor hardware. Player's client/server model allows robot control programs to be written in any programming language and to run on any computer with a network connection to the robot. Player supports multiple concurrent client connections to devices, creating new possibilities for distributed and collaborative sensing and control. Released under the GNU General Public License, all code from the Player/Stage project is free to use, distribute, and modify. Player is developed by an international team of robotics researchers and used at labs around the world. For more information, see this short tutorial. (Markovito Team - INAOE, Mexico)
  • The Mobile Robot Programming Toolkit (MRPT) is a cross-platform C++ library for robotics researchers provided by The Perception and Robotics Research Group at the University of Malaga. It features algorithms in the fields of Simultaneous Localization and Mapping, computer vision, and motion planning. MRPT is free software released under the GPL.

Pepper Robot

  • The AUPAIR package is a Perception-Action-Learning System (PALs), a multi-modular system that incorporates state-of-the-art deep learning techniques.


General frameworks

  • The Orocos Project provides Open Robot Control Software in C++ in the form of four libraries for advanced machine and robot control.



See also: List of Vision libraries

  • OpenCV (Open Source Computer Vision) is a library of programming functions mainly aimed at real time computer vision. Example applications are Human-Computer Interaction (HCI); Object Identification, Segmentation and Recognition; Face Recognition; Gesture Recognition; Motion Tracking, Ego Motion, Motion Understanding; Structure From Motion (SFM); and Mobile Robotics.
  • Video4Linux or V4L is a video capture API for Linux. Several USB webcams, TV tuners, and other devices are supported. Video4Linux is closely integrated with the Linux kernel. Video4Linux was named after Video for Windows (which is sometimes abbreviated "V4W"), but is not technically related to it. V4L is in its second version. The original V4L was introduced late in the 2.1.X development cycle of the Linux kernel. Video4Linux2 fixed some design bugs and started appearing in the 2.5.X kernels. Video4Linux2 drivers include a compatibility mode for Video4Linux1 applications, though in practice the support can be incomplete, and it is recommended to use V4L2 devices in V4L2 mode.
  • Randomized Trees is an experimental face recognition prototype in C++. The system grows random trees for face recognition and face learning (learning of new identities on the fly) and was implemented by the AllemaniACs team at RWTH-Aachen. It was tested at the German Open 2008 and the RoboCup World Cup in Suzhou. It might be possible to extend or refactor the code for general object recognition as well. If the code is used in any way, please cite the corresponding paper (see also the corresponding entry at Publications#Vision). This code is currently embedded in the Fawkes framework, which the AllemaniACs have scheduled for release by the end of 2008.
  • The MOPED Framework is a real-time Object Recognition and Pose Estimation system. It recognizes objects from point-based features (e.g. SIFT, SURF) and their geometric relationships extracted from rigid 3D models of objects. During the training stage, several images are acquired of each object that is to be recognized, and features are extracted from each image. Structure from Motion is applied to arrange all the features from all the images and obtain a 3D model of each object. In the recognition phase, features are obtained from the visual scene and, with the 3D models at hand, hypotheses are made in an iterative manner using Iterative Clustering Estimation. Finally, Projective Clustering is used to group similar hypotheses and provide one for each of the objects being observed.
  • Multi-Resolution Surfel Maps is a library for real-time registration of RGB-D images, 3D object modelling using SLAM, real-time object pose tracking, and indoor SLAM. Core to the approach is a compact multi-resolution representation of RGB-D images in 3D. The 3D volume is discretized at multiple resolutions using octrees. Each voxel stores the color and shape distribution of the points falling into it. The maximum resolution in the map adapts to the measurement noise characteristics of the sensor. Maps can be efficiently aggregated from 640x480 resolution images within a few milliseconds. An efficient and accurate registration method aligns two maps and is used to implement visual odometry, object tracking, and SLAM.
  • CUDA Random Forests for Image Labeling (CURFIL) is an open source implementation with NVIDIA CUDA that accelerates random forest training and prediction for image labeling by using the massively parallel computing power offered by GPUs.
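
The per-voxel statistics described in the Multi-Resolution Surfel Maps entry above can be illustrated with a minimal sketch. This is not the library's API: the names `VoxelStats` and `build_multires_map` are hypothetical, a dictionary keyed by integer voxel index stands in for each octree level, and the adaptation of the maximum resolution to sensor noise is left out.

```python
import math
from collections import defaultdict


class VoxelStats:
    """Running sums for the points that fall into one voxel (hypothetical helper)."""

    def __init__(self):
        self.n = 0
        self.sum_xyz = [0.0, 0.0, 0.0]
        self.sum_outer = [[0.0] * 3 for _ in range(3)]  # sum of p * p^T
        self.sum_rgb = [0.0, 0.0, 0.0]

    def add(self, xyz, rgb):
        self.n += 1
        for i in range(3):
            self.sum_xyz[i] += xyz[i]
            self.sum_rgb[i] += rgb[i]
            for j in range(3):
                self.sum_outer[i][j] += xyz[i] * xyz[j]

    def mean(self):
        return [s / self.n for s in self.sum_xyz]

    def mean_color(self):
        return [s / self.n for s in self.sum_rgb]

    def covariance(self):
        """Population covariance of the point positions (the voxel's shape)."""
        m = self.mean()
        return [[self.sum_outer[i][j] / self.n - m[i] * m[j] for j in range(3)]
                for i in range(3)]


def build_multires_map(points, resolutions=(0.4, 0.2, 0.1)):
    """Aggregate (xyz, rgb) points into voxel grids at several edge lengths.

    Assumption: one dict per resolution replaces a real octree level.
    """
    levels = {res: defaultdict(VoxelStats) for res in resolutions}
    for xyz, rgb in points:
        for res, grid in levels.items():
            key = tuple(math.floor(c / res) for c in xyz)
            grid[key].add(xyz, rgb)
    return levels
```

Because only sums are stored, adding a point is constant-time per resolution, which is one plausible reason such maps can be aggregated quickly; the actual library may organize its data differently.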


  • The JACK Audio Connection Kit is a system that can connect the audio coming from the "output" of one piece of software to the "input" of another. In addition, it provides great flexibility when different audio agents want to use the same audio data at the same time, as it creates an abstraction layer over the native audio system that is accessible to multiple agents simultaneously. It does all of this while keeping latency low with real-time access.
  • By taking advantage of the robot's microphones' relative positions to one another, Direction-of-Arrival estimators can identify the direction from which a user is talking to the robot. A good example and brief survey of techniques to achieve this are described in the article "Robotic Orientation towards Speaker in Human-Robot Interaction", listed in the Human-Robot Interaction section of Publications.
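
As a rough illustration of the Direction-of-Arrival idea above, the following sketch estimates the angle of a sound source from the time delay between two microphones. It is one simple technique (cross-correlation peak plus a far-field assumption), not the method of the cited article; the function names, the two-microphone setup, and the speed-of-sound constant are assumptions made for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def estimate_delay(sig_a, sig_b, max_lag):
    """Return the lag (in samples) by which sig_b trails sig_a.

    Picks the lag in [-max_lag, max_lag] that maximizes the cross-correlation.
    """
    best_lag, best_score = 0, float("-inf")
    n = len(sig_a)
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += sig_a[i] * sig_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def direction_of_arrival(sig_a, sig_b, mic_distance, sample_rate):
    """Angle in degrees from the broadside of a two-microphone pair.

    Positive angles mean the source is closer to microphone A.
    Assumes a far-field source, so the wavefront is treated as planar.
    """
    # No physical delay can exceed the time sound needs to travel between the mics.
    max_lag = int(math.ceil(mic_distance / SPEED_OF_SOUND * sample_rate))
    lag = estimate_delay(sig_a, sig_b, max_lag)
    tau = lag / sample_rate
    # Clamp to the valid range before taking the arcsine.
    s = max(-1.0, min(1.0, SPEED_OF_SOUND * tau / mic_distance))
    return math.degrees(math.asin(s))
```

With more than two microphones, pairwise estimates like this one can be combined to resolve the front/back ambiguity that a single pair cannot.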


  • We (RH3-Y) are currently using Microsoft SAPI4.4 and Audiolab, in the context of MS Windows and Borland C++ Builder, as convenient low-level packages for vocal communication on RAH-type robots; the job is then complemented within our Piaget environment: command and dialogue management.
  • To get a good overview of the topic of speech-based human-robot interaction, please have a look at the paper "Advanced Speech-based HRI for a Mobile Manipulator" in the publications section.
  • By an appropriate positioning of microphones, noise cancellation can be achieved by simple subtractive methods, such as the Boll Spectral Subtraction, which assumes a certain spectral signature of the environment noise and subtracts it from each audio data window fed to the rest of the Recognition process.
  • CMUSphinx is a speaker-independent large vocabulary continuous speech recognizer released under a BSD style license. It is also a collection of open source tools and resources that allows researchers and developers to build speech recognition systems.
  • CSLU-Toolkit is a software library comprising a comprehensive suite of tools that enable exploration, learning, and research into speech and human-computer interaction. It includes tools for speaker-independent speech recognition (for a number of languages), text-to-speech (Festival), animated faces, and a Rapid Application Designer for quickly creating applications that use all the tools of the toolkit. There is no Linux version yet.
  • Microsoft Speech SDK. Use the Win32 Speech API (SAPI) to develop speech applications. The SDK includes freely distributable text-to-speech engines and speech recognition engines. It has proved to work robustly for trained speakers even in noisy environments (such as an exhibition hall during competition). For more details (such as setup and system design), see the paper "Advanced Speech-based HRI for a Mobile Manipulator" in the publications section.
  • Festival offers a general framework for building speech synthesis systems and includes examples of various modules. It offers full text-to-speech through a number of APIs.
  • Flite (festival-lite) is a small, fast run-time synthesis engine developed at CMU and primarily designed for small embedded machines and/or large servers. Flite is designed as an alternative synthesis engine to Festival for voices built using the FestVox suite of voice building tools.
  • Mac Voice Alex (shipped with Mac OS X Leopard) produces natural and very intelligible speech output. It is closed source, but is easy to use by invoking the system command say.
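
The spectral subtraction mentioned in the noise cancellation entry above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not a full implementation of Boll's method: it uses a naive DFT from the standard library instead of an FFT, processes a single frame without windowing or overlap, and the function names are hypothetical.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform, O(n^2), for illustration only."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def noise_profile(noise_frames):
    """Average magnitude spectrum over frames that contain only noise."""
    spectra = [[abs(c) for c in dft(f)] for f in noise_frames]
    return [sum(col) / len(spectra) for col in zip(*spectra)]


def spectral_subtract(frame, noise_mag, floor=0.0):
    """Subtract the noise magnitude from one frame, keeping the noisy phase."""
    out = []
    for c, n_mag in zip(dft(frame), noise_mag):
        mag = max(abs(c) - n_mag, floor)  # never let a magnitude go negative
        out.append(cmath.rect(mag, cmath.phase(c)))
    return idft(out)
```

A typical use would be to call `noise_profile` on a few frames recorded while nobody is speaking, then pass every later frame through `spectral_subtract` before handing it to the recognizer.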

Robotic Arm Control

  • Energid's Actin-SE software package coordinates the action of joints and base motors to achieve an appropriate movement of the arm. It does this via an optimization approach that simultaneously avoids joint limits, singularities, and collisions while minimizing kinetic energy.

Planning and plan execution

  • Petri Net Plans: a library for defining and executing Petri Net Plans (PNP), including both ROS and NAOqi bridges.
  • ROSPlan task planning framework: a framework to bridge PDDL planning and real robots. An example PDDL (GPSR) domain can be found here. You will need to expose your robot actions through actionlib for this software to work. You can find a video with more details here.
  • ISR task planning framework: similar to ROSPlan but modular. This flexibility allows you to reuse components at will and to implement your own planning strategies. This framework is used by the SocRob team to perform the GPSR test as an alternative to state machines for decision making. For more details, you can check a draft of the paper here.
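
To give a feel for how a plan can be expressed as a Petri net, here is a minimal place/transition net executor. It only illustrates the general token-firing semantics of Petri nets; it is not the API of the Petri Net Plans library, and the class, the example plan, and the place and transition names are all hypothetical.

```python
class PetriNet:
    """Minimal place/transition net with integer token counts."""

    def __init__(self, marking):
        self.marking = dict(marking)   # place name -> number of tokens
        self.transitions = {}          # name -> (input places, output places)

    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (list(inputs), list(outputs))

    def enabled(self, name):
        """A transition is enabled when every input place holds enough tokens."""
        inputs, _ = self.transitions[name]
        return all(self.marking.get(p, 0) >= inputs.count(p) for p in inputs)

    def fire(self, name):
        """Consume one token per input place and produce one per output place."""
        if not self.enabled(name):
            raise ValueError(f"transition {name!r} is not enabled")
        inputs, outputs = self.transitions[name]
        for p in inputs:
            self.marking[p] -= 1
        for p in outputs:
            self.marking[p] = self.marking.get(p, 0) + 1

    def run(self, actions, max_steps=100):
        """Fire enabled transitions in insertion order until none is left.

        `actions` maps transition names to callables (the robot behaviours).
        `max_steps` guards against nets that never terminate.
        """
        fired = []
        for _ in range(max_steps):
            ready = [t for t in self.transitions if self.enabled(t)]
            if not ready:
                break
            name = ready[0]
            self.fire(name)
            actions.get(name, lambda: None)()
            fired.append(name)
        return fired


# Hypothetical two-step fetch plan: drive to the kitchen, then grasp an object.
plan = PetriNet({"start": 1})
plan.add_transition("goto_kitchen", ["start"], ["at_kitchen"])
plan.add_transition("grasp", ["at_kitchen"], ["holding"])
log = []
executed = plan.run({"goto_kitchen": lambda: log.append("driving"),
                     "grasp": lambda: log.append("grasping")})
```

After the run, the single token has moved from `start` to `holding`, which is how the net records that the plan has finished.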


  • The Microsoft Robotics Studio is a Windows-based environment for academic, hobbyist and commercial developers to easily create robotics applications across a wide variety of hardware.
  • The Webots mobile robotics simulation software provides you with a rapid prototyping environment for modelling, programming, and simulating mobile robots. The included robot libraries enable you to transfer your control programs to many commercially available real mobile robots.
  • URBI is a universal platform designed specifically for robotic applications: it implements parallelism, event-based programming, and modularity, and is compatible with C++ and Webots. A GUI (URBI Studio) is being developed to edit and create behaviours. (UChile Homebreakers)
  • STAGE simulates a population of mobile robots moving in and sensing a two-dimensional bitmapped environment. Various sensor models are provided, including sonar, scanning laser range finder, pan-tilt-zoom camera with color blob detection, and odometry. Stage devices present a standard Player interface, so few or no changes are required to move between simulation and hardware. Many controllers designed in Stage have been demonstrated to work on real robots. For more information, see this short tutorial. (Markovito Team - INAOE, Mexico)
  • ISR Gazebo Simulated apartment provides an apartment-like simulation environment using Gazebo 7 and Ubuntu 16.04. It has proper collision models and inertia, and some objects borrowed from IPA such as pizza, applejuice, etc. It also includes a standing person model so that you can perform face detection and similar tasks. Unfortunately, the actor is static and can only move in Gazebo 8, so this feature is not yet available. The simulator is provided by the Institute for Systems and Robotics (ISR) Lisboa.

Base Systems

Fedora Robotics SIG
The SIG is an effort to make Fedora capable of running out of the box on a variety of robots and to ship it with many software packages relevant to robotics. An integrated simulation environment is one of the first goals. The SIG is led by Tim Niemueller of the AllemaniACs RoboCup Team.

Cognitive Architectures

The Interaction Oriented Cognitive Architecture (IOCA) provides a framework to control and design the behaviour of a service robot, as well as to describe the task that needs to be accomplished. Dialogue Models are the center of IOCA and have been the research focus of the Golem group. Dialogue Models perceive and interact with the world in an abstract manner. This paradigm provides great flexibility in software development, as the Recognition processes and Actuators become modular and replaceable while the task description remains intact. This means that developing for a different task does not require a complete rewrite of the internal software.

Directories / Other Sources
