
Installation

Selene Baez edited this page Mar 29, 2019 · 8 revisions

Overview

The CLTL/Pepper software is written in Python 2.7 and builds on many existing components, so as not to reinvent the wheel. This does, however, mean that a number of dependencies have to be installed manually. We have tried our best to make this installation guide extensive and comprehensive. Still, if you get stuck at some part of the installation, please don't hesitate to open an issue and we'll try to clarify it.

Most of the dependencies used are open source. Alongside these, we also make use of a few commercial APIs; these, however, are free to use or offer free evaluation periods for individual or academic use.

We've tested and are running the code on Windows 10 Pro (Windows with Hyper-V) and macOS machines. It should, however, also run on Linux-based distributions.

Installation

1. Python 2.7 (Required) & Naoqi Python SDK (Optional)

This project builds on the Naoqi Python SDK by SoftBank Robotics, which allows you to run our code on Pepper / Nao robots. Support for the Naoqi backend is included in this repository. However, Naoqi is not a requirement for other backends (for instance, your own PC).

Please refer to the Naoqi Python SDK install guide for more information. The SDK, and by extension this repository, is only available for 32-bit Python 2.7 on Windows and 64-bit Python 2.7 on Linux and OSX, so make sure you grab the right Python version if you intend to run code on Pepper / Nao robots.

  • Please make sure the PYTHONPATH environment variable reflects the location of the SDK and that you're using Python 2.7.
  • Please make sure you install Python as a Framework on OSX, or pynaoqi will complain.
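To sanity-check the first bullet, you can scan PYTHONPATH for the SDK yourself. This is just a convenience sketch, not part of the project; the `"naoqi"` substring is a heuristic assumption, so adjust it to match your actual SDK folder name.

```python
import os

def naoqi_on_pythonpath():
    """Rough sanity check: does any PYTHONPATH entry look like the pynaoqi SDK?
    (Heuristic sketch; adjust the substring to match your SDK folder name.)"""
    entries = os.environ.get("PYTHONPATH", "").split(os.pathsep)
    return any("naoqi" in entry.lower() for entry in entries)
```

If this returns False, the `import naoqi` in the Naoqi backend will fail with an ImportError.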

2. Google Cloud Services

This project makes use of Google Cloud services for Speech Recognition & Generation. This is a commercial service, although a free one-year trial is available. After following the instructions on their website to create a cloud project, download your key as google_cloud_key.json and place it in the root of this repo; it is then automatically added as an environment variable at runtime.
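The "automatically added as an environment variable" step boils down to pointing GOOGLE_APPLICATION_CREDENTIALS (the variable the Google Cloud client libraries read) at the key file. A minimal sketch of what the repo does for you at startup:

```python
import os

def register_google_key(root="."):
    """Point GOOGLE_APPLICATION_CREDENTIALS at google_cloud_key.json in the repo
    root, mirroring what the repo does automatically at runtime (sketch)."""
    key = os.path.join(root, "google_cloud_key.json")
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = key
    return key
```

You should not need to call anything like this yourself; it is shown only so you know where the key is expected to live.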

Google Cloud Speech-to-Text

The Google Cloud Speech-to-Text API is used as the Speech Recognition solution for this project. Please refer to their website for licensing and installation instructions. It is of course possible to use a different Automatic Speech Recognition (ASR) solution if you wish, but we found it difficult to find a well-performing open source alternative. Run pip install google-cloud-speech to install the required Python libraries.

Google Cloud Text-To-Speech

The Google Cloud Text-To-Speech API is used as the Text-to-Speech solution when running applications on your PC instead of on the robot. Please, again, refer to their website for licensing and installation instructions. Run pip install google-cloud-texttospeech playsound to install the required Python libraries.

  • If your platform happens to be Mac and the playsound installation complains about the cairo dependency, try running pip install -U PyObjC.
  • If your platform happens to be Mac and the installation complains about 'portaudio.h' file not found, try running brew install portaudio before pip install pyaudio.

3. OpenFace & Docker

Face recognition in this project is done using the open source OpenFace project (Site, Git), which is, to put it mildly, difficult to install; probably even impossible on Windows. Luckily, the developers were kind enough to provide a Docker image, which can be obtained using docker pull bamos/openface. Once Docker is installed and the bamos/openface image is pulled, this repo will automatically interface with it. If you're planning to run Docker on a Windows machine, make sure you have 64-bit Windows 10 Pro, Enterprise or Education, since Docker relies on Microsoft Hyper-V, which is only available on these editions of Windows. A possible workaround involves Docker Toolbox, which works on lower Windows versions and is able to run containers; as of now, we have not tested this.
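If you prefer to start the container by hand (for instance to debug it), the invocation the repo issues on your behalf looks roughly like the one assembled below. The container name `openface` matches the `docker stop openface` command used later in this guide, but the port number is an illustrative assumption.

```python
def openface_run_command(name="openface", port=8989):
    """Assemble a `docker run` call that starts the OpenFace container in the
    background. Name and port are illustrative; the repo issues an equivalent
    command for you on first run."""
    return ["docker", "run", "-d", "--rm", "--name", name,
            "-p", "{0}:{0}".format(port), "bamos/openface"]
```

Passing the list to `subprocess.check_call` would launch the container; `docker stop openface` shuts it down again.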

4. Object Recognition & Pepper Tensorflow

In order to use the Inception and COCO models natively within this project, you need to clone Pepper's sister project: pepper_tensorflow, which includes the Tensorflow services within Python 3.6. These services need to run as a separate process, either locally or remotely, next to the main application. A socket interface bridges the Python 2 / 3 gap: since Tensorflow requires Python 3, this is our way of using Tensorflow in an otherwise Python 2.7 repo.
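Such a bridge can be sketched as a length-prefixed JSON protocol over a plain socket, which works identically under Python 2.7 and 3.x. The framing, field names and toy server below are illustrative assumptions, not the project's actual wire format:

```python
import json
import socket
import struct

def send_msg(sock, obj):
    """Serialize obj as JSON and send it with a 4-byte big-endian length prefix."""
    payload = json.dumps(obj).encode("utf-8")
    sock.sendall(struct.pack(">I", len(payload)) + payload)

def recv_msg(sock):
    """Receive one length-prefixed JSON message and decode it."""
    (length,) = struct.unpack(">I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length).decode("utf-8"))

def _recv_exact(sock, n):
    """Read exactly n bytes from the socket, or raise if it closes early."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise EOFError("socket closed")
        buf += chunk
    return buf

def serve_once(listener, handler):
    """Answer a single request on an already-listening socket (toy server)."""
    conn, _ = listener.accept()
    try:
        send_msg(conn, handler(recv_msg(conn)))
    finally:
        conn.close()
```

The Python 3 side would run a loop around `serve_once` with a handler that feeds the decoded image into Tensorflow; the Python 2.7 side connects, calls `send_msg`, then `recv_msg`.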

We recommend creating two virtual environments: one Python 2.7 environment for the main repo (pepper) and a Python 3.6 environment for the Tensorflow repo (pepper_tensorflow).

The main script to run here is object_recognition.py, which finds objects in images and annotates them with bounding boxes. In fact, any model from the Tensorflow Detection Model Zoo can be used here! object_recognition.py gives hints on how to do this. To include other object detection models, you'll have to download the data from the Model Zoo yourself first!

5. NLTK for Natural Language Understanding

For analyzing utterances, this project relies on NLTK. Its data dependencies are automatically installed once you launch the program for the first time. For Named Entity Recognition, you will need to have Java installed on your machine. Please make sure it is callable from the command line (and fix your environment variables if not).
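To verify that Java is callable from the command line before launching the program, you can scan PATH yourself. This is merely a convenience sketch, not part of the project:

```python
import os

def java_on_path():
    """Check whether `java` is callable from the command line by scanning PATH.
    (Sketch; on Windows the PATHEXT extensions are tried as well.)"""
    exts = [""] + os.environ.get("PATHEXT", "").split(os.pathsep)
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        for ext in exts:
            candidate = os.path.join(directory, "java" + ext)
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                return True
    return False
```

If this returns False, install a JRE/JDK and add its `bin` directory to your PATH environment variable.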

6. GraphDB for Knowledge Representation

In this project, knowledge is represented in the form of triples and stored in a triple store. To install the necessary Python dependencies, run pip install rdflib iribaker SPARQLWrapper. Additionally, you have to install GraphDB; please follow the instructions to download it. The free version will suffice.

GraphDB's UI can be accessed through http://localhost:7200/. From the UI, you will need to set up a new repository called leolani. Don't forget to connect to the repository when you start GraphDB.
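Once the leolani repository exists, clients talk to it over GraphDB's standard REST layout, where each repository gets its own SPARQL endpoint under /repositories/. A small helper to build that URL (the host and port match the defaults above; the path pattern follows GraphDB's documented REST interface):

```python
def graphdb_sparql_endpoint(repository="leolani", host="localhost", port=7200):
    """SPARQL query endpoint for a GraphDB repository (standard GraphDB layout:
    http://<host>:<port>/repositories/<repository>)."""
    return "http://%s:%d/repositories/%s" % (host, port, repository)
```

This is the URL you would hand to SPARQLWrapper to query the store.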

7. Wolfram Alpha

This project makes use of the Wolfram Alpha Spoken Results API. A free (academic) licence is available that allows 2000 queries a month (which is plenty, in our experience). Please create an account and place a file tokens.json with the following content in the root of this project (right next to google_cloud_key.json):

{
  "wolfram": <YOUR KEY>
}

This will enable Wolfram Alpha in your robot applications.
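Under the hood, a Spoken Results request is a single GET with your key and the question. The sketch below reads the key from tokens.json and builds such a URL; the endpoint shape follows Wolfram's public Spoken Results API, but the helper itself is illustrative and not part of this repo:

```python
import json

try:
    from urllib import quote_plus            # Python 2
except ImportError:
    from urllib.parse import quote_plus      # Python 3

def spoken_result_url(question, tokens_path="tokens.json"):
    """Build a Wolfram Alpha Spoken Results API request URL from the key
    stored in tokens.json (file layout as described in this guide)."""
    with open(tokens_path) as f:
        key = json.load(f)["wolfram"]
    return "https://api.wolframalpha.com/v1/spoken?appid=%s&i=%s" % (
        key, quote_plus(question))
```

Fetching that URL returns a short plain-text answer suitable for speech output.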

8. Other Python Dependencies

The project requirements are listed in requirements.txt and can be easily installed via pip.

On Windows, an OpenCV binary needs to be downloaded. pip install opencv-python will be sufficient on OSX.

Setup

Running

To run an application, you'll have to run a few services and take a few considerations first.

1: Docker

Please run Docker (and make sure you've pulled bamos/openface; see Installation for details). The Docker image will be started automatically, and kept active, when you run an application for the first time. When you're done running applications and no longer need the OpenFace service, please open a terminal and run docker stop openface. This can also help when you experience issues with Docker, which sometimes occur when it starts automatically with the OS. In that case, the classic reboot-x-and-try-again scheme, where x is Docker, works wonders.

2: COCO Client

In order to perform object detection, we make use of a COCO model within Tensorflow. A COCO service has been implemented in, and can be run from, the pepper_tensorflow project. Simply run pepper_tensorflow/object_recognition.py (using Python 3) and the service will boot.

3: Face Data

In order to make face recognition possible, the people directory needs to be populated with face data. The files should be named <Name of Person>.bin and contain one or more 128-dimensional vectors of the person's face. These vectors are provided with the on_face event. Due to privacy concerns, faces from our department have not been included in the Git repo. :)
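Assuming the .bin files are raw little-endian float32 data (a plausible layout for 128-dimensional OpenFace embeddings, but an assumption, not something this guide specifies), they can be read back with nothing but the standard library:

```python
import struct

def load_face_vectors(path, dim=128):
    """Read one or more dim-dimensional float32 vectors from a <Name>.bin file.
    The raw little-endian float32 layout is an assumption about the format."""
    with open(path, "rb") as f:
        data = f.read()
    vector_size = 4 * dim  # 4 bytes per float32
    count = len(data) // vector_size
    return [struct.unpack("<%df" % dim, data[i * vector_size:(i + 1) * vector_size])
            for i in range(count)]
```

Each returned tuple is one observation of the person's face; multiple observations per file improve recognition.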

Consider making two subdirectories in pepper/people with the following layout:

pepper/
  ..
  people/
    friends/
      friend1.bin
      friend2.bin
    new/
      new1.bin
  pepper/
  ..

The friends folder is where friends of the robot should reside, whereas new people are put in the new folder automatically after the robot meets a person.

4: Config

The global config file can be found under pepper/config.py. Please modify this to your (performance) needs!

5: Running a (test) application

Running test/app/verbose_app.py will print out which events fire when, and with what data. If you manage to run this without any errors, your dependencies are most likely installed correctly and other apps should work too!

How To Boot an Application

  1. Start GraphDB Free

  2. Start Docker

  3. Start COCO (pepper_tensorflow -> object_recognition.py)

  4. Start any Application (from the apps/examples directory for example)

Enjoy (& Check settings in pepper/config.py)!
