Machinetalk functional overview

Michael Haberler edited this page Jul 11, 2014 · 29 revisions

Michael Haberler, 06/2014

Background

Splitting the realtime environment from everything else - interpreter, task, UI - is my top priority for the machinekit effort. Machinetalk - the new communications infrastructure - is a key step in this direction.

A fundamental drawback of HAL is the complete lack of a remote API. This forces the complete LinuxCNC application to run on the same host as the realtime environment.

The current communications middleware, NML, is not used outside LinuxCNC and is - in my opinion - hopeless as a basis for future remote operations. For a discussion of the drawbacks, see this thread.

This merge brings in the key building blocks for future remote operations on the realtime-to-userprocess as well as the user/userprocess level. I call this sum of all new building blocks machinetalk, so that the child has a name.

As such, this is an enabling step, not a feature branch. While work has progressed considerably towards a remote HAL API, work on a remote interface to the interpreter, task, and task/motion interaction remains to be done. However, that work will use the building blocks introduced in this merge, and would not realistically be possible without them. The merge brings together all prerequisites and build support, while (hopefully) not breaking existing functionality.

Intended audience and expectations

This document, like the features this merge brings, is aimed at developers who are familiar with C/C++ and Python. Few end-user visible enhancements come with this merge.

Reference Architecture

To understand the new concepts and API’s introduced here, see this first.

New HAL Concepts and API’s

This merge adds several new concepts and API’s. These are discussed in turn.

The motorctrl example in the machinetalk/demos/motorctrl directory helps to explain some of the concepts discussed here.

HAL: remote components

UI’s like gladevcp or pyVCP are HAL components proper and create/change/monitor pins in the HAL shared memory space. To support such components in a remote scenario, HAL remote components have been introduced. This specification lays out the concepts behind HAL remote components.

In a nutshell, a remote component is a HAL object which looks and feels like a normal component but has no thread function, so it does nothing by itself until it is serviced by a userland process which 'adopts' this pro-forma component. Remote components differ from normal components in that the phase of defining name and pins is decoupled from the actual execution of the thread function, which happens in the server process once that starts. One can define a remote component at the halcmd level, define its pins, and make it ready; then link other pins to it and commence thread operation.

In the case of remote UI’s, the UI process connects to the haltalk server and asks it to adopt and 'bind' a matching remote component. Once this succeeds, haltalk and the remote UI exchange update messages to notify each other about pin changes in both directions.

Support for remote components has been added at the HAL C API, HAL Python API, and halcmd levels, as well as the new server process haltalk.

Remote components can be used to bridge different HAL instances via pins. The code for this feature is currently incomplete.

Remote components are not limited to UI applications - they can serve as a general remote read/write API into HAL.

HAL groups

Some applications require the monitoring of sets of HAL values belonging together as a group. An example of this is progress display: the preview window paints the tool at the current machine position. To do so, the UI needs a stream of updates for that current machine position.

The concepts for HAL groups are described in the Status Tracking Protocol document. It also gives a usage example for a HAL Group.

HAL group support in haltalk and halcmd is feature complete. No code uses it yet; the first user will likely be remote progress display support.

HAL: named ringbuffers

Besides interaction with HAL scalars, there is a requirement for queued command/response interaction between user processes and the realtime environment. These interactions work on aggregate commands - for example, typical motion commands like 'move to position', 'arc to position' etc. Those commands consist of several scalars collapsed into a struct, and must be transmitted to and from the realtime environment in an atomic fashion - piecemeal transmission via individual HAL scalars is not a useful abstraction for this purpose.

The legacy code base has a special-purpose, single-use solution for transmitting commands from task to motion and vice versa. The lack of an API has led to the motion component becoming the kitchen sink of everything requiring queued commands, making it next to unmaintainable.

The current machinekit code base already contains code for lock-free queues handling message streams, and character-based interaction between RT/RT and RT/userland.

HAL named ringbuffers provide a generalized messaging API to/from RT.

For the API, see hal/lib/hal_ring.h. Usage examples are in machinetalk/msgcomponents - see ringload.comp, ringread.comp and ringwrite.comp.

There is a rudimentary Python API to HAL named ringbuffers, see the Python examples in this directory. This API needs to be moved to cython.

To make end-to-end communication via zeromq and ringbuffers seamless, the multiframe ringbuffer code was added. This matches the zeroMQ messaging model - a multipart message may consist of several frames, each one with a freely usable flag word, typically used to tag the meaning of the frame contents.

This multiframe code introduces the capability to route messages between userland and RT components, a key feature completely absent from the task/motion API which is only bilateral. This is the basis for the messagebus routing server.
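
The multipart model described above can be illustrated with a small, self-contained Python sketch. It is not the code in rtapi/multiframe.h: the record layout (a 32-bit flag word and a 32-bit length in front of each payload), the flag constants and the function names are all invented for this example. It only shows the idea that a message is a sequence of frames, each tagged by a freely usable flag word.

```python
import struct

# Hypothetical flag values tagging what a frame contains.
FLAG_DESTINATION = 1
FLAG_PAYLOAD = 2

# Invented layout for this sketch: (flag word, payload length), little-endian.
_HEADER = struct.Struct("<II")


def pack_multipart(frames):
    """Serialise [(flag, bytes), ...] into one contiguous record."""
    out = bytearray()
    for flag, payload in frames:
        out += _HEADER.pack(flag, len(payload))
        out += payload
    return bytes(out)


def unpack_multipart(record):
    """Inverse of pack_multipart: recover the list of (flag, bytes) frames."""
    frames = []
    offset = 0
    while offset < len(record):
        flag, length = _HEADER.unpack_from(record, offset)
        offset += _HEADER.size
        frames.append((flag, record[offset:offset + length]))
        offset += length
    return frames


message = [(FLAG_DESTINATION, b"motion"), (FLAG_PAYLOAD, b"move-to 1.0 2.0")]
record = pack_multipart(message)
assert unpack_multipart(record) == message
```

In such a scheme a router would only need to look at the frame flagged as destination and could pass the remaining frames on untouched, which is the kind of routing a purely bilateral channel cannot offer.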

HAL API: epsilon values

Both the remote component and group support code in the HAL library support automatic change detection for a set of pin (remote comp) or signal (group) values. While this is easy with integral types (bool, signed, unsigned), for float values an epsilon value is needed to determine when a float is considered 'changed'.

To generalize this, epsilon values have been introduced at the halcmd and HAL C API level. The idea is to provide a limited set of numbered epsilon values which can be set at the halcmd level (currently 5). A float pin in a remote component, or a float member in a HAL group is tagged with an epsilon index.

See machinetalk/demos/motorctrl/motorctrl.hal for a usage example - the float pins in the motorctrl component are considered changed if their value changes by more than 0.001 from the last detected change.

Setting different epsilon values can have a significant impact on update message rate, and hence performance. There is a tradeoff between accuracy of tracking, and load.
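
The change-detection rule can be sketched in a few lines of Python. This is an illustration only, not the HAL library code: the table `EPSILON`, the class `TrackedFloat` and its method names are hypothetical, and all table entries except 0.001 are made up for the example.

```python
# Illustrative model of epsilon-based change detection (not the HAL implementation).

# Hypothetical table of numbered epsilon values (the text mentions 5 slots).
EPSILON = [0.00001, 0.001, 0.01, 0.1, 1.0]


class TrackedFloat:
    """A float value tagged with an index into the epsilon table."""

    def __init__(self, value, eps_index):
        self.eps_index = eps_index
        self.last_reported = value   # value at the last detected change

    def changed(self, current):
        """Return True and remember `current` if it differs by more than
        epsilon from the last detected change; otherwise return False."""
        if abs(current - self.last_reported) > EPSILON[self.eps_index]:
            self.last_reported = current
            return True
        return False


pin = TrackedFloat(0.0, eps_index=1)               # epsilon 0.001
samples = [0.0004, 0.0008, 0.0012, 0.0015, 0.0030]
reports = [v for v in samples if pin.changed(v)]
print(reports)
```

Only two of the five samples would trigger an update here. Because each sample is compared against the last reported value rather than the previous sample, slow drift is still reported once it accumulates beyond epsilon; a larger epsilon trades tracking accuracy for fewer update messages.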

New Messaging Concepts and API’s

So far, NML was the only game in town. To get rid of NML, a replacement messaging stack was introduced, consisting of:

  • the zeroMQ messaging library which makes it very easy to implement command/response and publish/subscribe patterns. Machinetalk uses the czmq C API to talk to zeroMQ, which has turned out to be the most productive interface. Cryptographic support for authentication and encryption (libsodium) is available at the library level but currently not used in the code.

  • the Google protobuf serialization library and tools, to describe, encode and decode messages in an architecture independent way (this supports C++ and Python bindings)

  • the Nanopb protobuf bindings - a protobuf support library which is compatible with realtime and kernel environments - this makes RT components understand, and generate protobuf messages as needed without the need for translation at the RT/userland boundary (note that C++ is incompatible with the RT environment)

  • zeroconf-based discovery using avahi (technically multicast DNS), which eases the plumbing of diverse processes to provide a common service, and significantly reduces the configuration effort required

  • UUIDs (universally unique identifiers, see http://linux.die.net/man/3/uuid) are used in several places to designate a set of distributed processes as belonging to a particular machinekit 'instance'

These provide the core connection framework, and will be used to replace NML.

To include options for Web-based interaction, the following features have been added in the form of an optional server process (webtalk):

  • a websockets/JSON based service fully compatible with the messaging stack described above. This uses the jansson and libwebsockets libraries.

  • optionally, a minimal web server to support loading of a few basic html pages and Javascript libraries.

Message formats: the Protobuf infrastructure

All messages used in userland and RT communications are defined in protobuf definitions. Those live in machinetalk/proto/proto/*.proto.

For all .proto files, the build support creates:

  • Python bindings - these go into lib/python/Message_pb2.py

  • a C++ library with all protobuf descriptors in the Google C++ format: lib/liblinuxcnc-pb2++.so.0

  • a C library with all protobuf descriptors in Nanopb format: liblinuxcnc-npb.so.0

  • a realtime component exporting the Nanopb descriptors to RT components: machinetalk/msgcomponents/pbmsgs.c - this component needs to be loaded if RT components are to generate/parse protobuf messages

All protobuf definitions actually live in a separate repository, currently https://github.com/mhaberler/machinetalk-protobuf/ . This repo is merged in with 'git subtree'; the reason is that external projects like QtQuickVCP just need the protobuf message formats, but do not share code. The protobuf repository is also a convenient licensing boundary.

The libmtalk support library

Several functions are used throughout the code and do not fit into the linuxcnchal library since they may be used on a host which does not run HAL. These functions have been collapsed into lib/libmtalk.so and the source lives in machinetalk/lib and machinetalk/include.

Functionality available to developers

HAL C API

The new capabilities of the HAL C API are described in the header files:

  • Remote components: hal/lib/hal_rcomp.h

  • Groups: hal/lib/hal_group.h

  • named rings: hal/lib/hal_ring.h

To use the frame- or character oriented functions of ringbuffers, include rtapi/ringbuffer.h. The multiframe ring operations are in rtapi/multiframe.h.

zeroMQ-based services:

  • rtapi_msgd, the logger, now distributes log messages in protobuf format, using a publish socket. This gets rid of the 'missing log message' defect in LinuxCNC, as log messages are published to all subscribers, not just the first consumer.

  • haltalk serves HAL remote components, and HAL groups. See the code at machinetalk/haltalk, and the motorctrl demo application. This is the counterpart for any remote HAL interaction, including Alex’s QtQuickVCP.

  • webtalk, the JSON/websockets bridge into zeroMQ/protobuf services. There is currently no good example beyond sending log messages to a Web browser, as my Javascript fu is extremely low.

  • GladeVCP has been extended for remote operations (HAL only; the linuxcnc module and Stat/Action widgets are not supported yet). Again, this can be explored with the motorctrl demo. Local operation is as before.

  • messagebus is in proof-of-concept stage and not production-ready. It lives at machinetalk/messagebus.
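
The 'missing log message' remark in the rtapi_msgd item above comes down to the difference between a queue that is drained by whoever reads first and a publisher that hands every subscriber its own copy. The following toy model shows that difference with plain Python containers. It is not zeroMQ code, and the class names are hypothetical.

```python
from collections import deque


class SharedQueue:
    """Single queue read by several consumers: each message reaches only one of them."""

    def __init__(self):
        self._messages = deque()

    def put(self, msg):
        self._messages.append(msg)

    def get(self):
        return self._messages.popleft() if self._messages else None


class Publisher:
    """Publish/subscribe: every subscriber gets its own copy of each message."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self):
        inbox = deque()
        self._subscribers.append(inbox)
        return inbox

    def publish(self, msg):
        for inbox in self._subscribers:
            inbox.append(msg)


queue = SharedQueue()
queue.put("joint 0 following error")
first_reader = queue.get()
second_reader = queue.get()      # None: the first reader already consumed it

pub = Publisher()
ui_log, file_log = pub.subscribe(), pub.subscribe()
pub.publish("joint 0 following error")
```

With the shared queue the second reader never sees the message; with the publisher both `ui_log` and `file_log` hold it.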

Other changes

Some work which should have gone under the 'new RTOS' label comes in only now, because only now the infrastructure to do so is available:

RT Daemon

The rtapi_app application is now used throughout all thread flavors. This has the advantage of simplifying halcmd, and eventually remote interaction with RTAPI. In the case of userland threads, rtapi_app acts pretty much as before; with kernel thread flavors, rtapi_app takes on the task of module loading and unloading.

rtapi_app exports a zeroMQ/protobuf interface over IPC sockets (local only for now), which is used by halcmd to load and unload RT modules. This replaces the arcane character-based method used before.

Direct RT thread creation

Since rtapi_app, the RT daemon, is available for all flavors, the creation of threads has now become an rtapi_app primitive which is used by halcmd. This will eventually deprecate the threads RT module since it is no longer needed, and its 'three threads only' limit is gone.

halcmd

halcmd has considerable extensions for supporting the new HAL C API functions, like remote components, groups, and ringbuffers.

Example of the above-mentioned thread create and delete primitives in halcmd (see also machinetalk/demos/motorctrl/motorctrl.hal):

# syntax: newthread <name> <periodns> [fp|nofp] [cpu=<int>]
# default is nofp
newthread servo-thread 1000000 fp

....

delthread servo-thread
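
Since the period is given in nanoseconds, it can be convenient to derive it from a thread frequency when generating HAL files from a script. The helper below is a hypothetical convenience function, not part of halcmd or its Python bindings; it only formats a line following the syntax quoted in the comment above.

```python
def newthread_command(name, frequency_hz, use_fp=False, cpu=None):
    """Build a 'newthread' line: newthread <name> <periodns> [fp|nofp] [cpu=<int>]."""
    period_ns = int(round(1e9 / frequency_hz))      # 1 kHz -> 1000000 ns
    parts = ["newthread", name, str(period_ns), "fp" if use_fp else "nofp"]
    if cpu is not None:
        parts.append("cpu=%d" % cpu)
    return " ".join(parts)


print(newthread_command("servo-thread", 1000, use_fp=True))
# prints: newthread servo-thread 1000000 fp
```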

Status of the code base

This branch builds and runs all regression tests, except two for xenomai-kernel which are of known cause and harmless.

It should NOT break any existing application. If it does, please file a bug.

Shakeout phase

I plan to handle the preview and merge phase like so:

  • I will post an initial pull request against the machinekit repository, to exercise buildbot builds, but to be merged only after shakeout.

  • as improvements are made, I will rebase this branch as needed, resulting in a revised pull request.

  • once we are confident the result will not sink the ship, we can go ahead with a merge.

Configuration changes

There is now a global INI file, /etc/linuxcnc/machinekit.ini which carries several important parameters which apply to all programs using machinetalk.

This file comes with reasonable defaults, but with only local operation enabled. Eventually the unique UUID will be set by a package postinstall script.

It is reproduced in full here since it’s key:

# this file specifies options which apply globally to several programs, and all services

# having these options in one place avoids repeated ini file changes.
# it sits at a well-known place ($EMC2_HOME//etc/linuxcnc/machinekit.ini), thus
# is accessible to all programs.

[MACHINEKIT]

# -------------- Unique UUID of a Machinekit instance  -----------------
# All network-accessible services of a running Machinekit instance are
# identified by a unique id; see 'man uuidgen'
# Clients browsing zeroconf for services belonging to a particular instance
# use this MKUUID value as a unique key.
#
# All MKUUID's must be different, so if there are several Machinekit instances
# running on a LAN, there might be collisions
# hence,  change this UUID by using the output of 'uuidgen':
MKUUID=a42c8c6b-4025-4f83-ba28-dad21114744a

# -------------- enabling remote operation -----------------

# enable remote service access - defaults to local; set to 1 for enabling remote operation
# regardless of INTERFACES section below
# REMOTE=1 means: zeroMQ sockets will use TCP on the preferred interface as per INTERFACES
# REMOTE=0 means: zeroMQ will use IPC sockets in RUNDIR/<rtapi_instance>.<service>.<uuid>
#
# REMOTE=0 also means that zeroconf announcements are disabled
REMOTE=0

# -------------- network setup -----------------

# by default, services are bound to Unix IPC sockets, meaning the service cannot
# be reached from outside.

# if services should be remotely accessible, a primary interface can be chosen
# by giving a list of preferred interfaces or interface prefices:
# the first IPv4 address of the first matching interface is used to bind(2):

# bind to the first Ethernet interface
# prefer to bind to ethX; then wlanX if eth* is not present; then usbX:

INTERFACES = eth wlan usb

# localhost only - for testing TCP, but within a single machine
#INTERFACES = lo

What is missing

  • The examples are in bad shape. I plan to improve those once the initial pull request is out.

  • the HAL bindings are incomplete and need work.

  • some of the commit messages need way more detail.

  • Manpages for the HAL C API’s are lacking.

  • The work-in-progress code towards replacing NML is not included here yet; this will be a second step.

Building this branch

Platforms: Ubuntu 10.04 is hopeless - do not waste time with it.

These instructions are targeted for Debian wheezy, the recommended build platform.

First, add John’s debian archive containing the prerequisite packages ready to go, and wheezy-backports, which we need for cython (the version in the wheezy repo is too old; it is the only package needed from backports):

 sudo sh -c "echo '# Machinekit package archive tracking the master branch\n# From the Dovetail Automata LLC Buildbot\ndeb http://deb.dovetail-automata.com wheezy main\ndeb-src http://deb.dovetail-automata.com wheezy main' >> /etc/apt/sources.list.d/machinekit.list"

 sudo sh -c "echo '# Machinekit needs a recent cython from wheezy-backports:\ndeb http://ftp.us.debian.org/debian wheezy-backports main' >> /etc/apt/sources.list.d/wheezy-backports.list"

Second, install build prerequisite packages:

 sudo apt-get update 
 sudo apt-get install automake1.11 libtool liburiparser-dev libssl-dev  openssl python-setuptools  libusb-1.0-0-dev libudev-dev  uuid-dev uuid-runtime libavahi-client-dev libavahi-compat-libdnssd-dev  avahi-daemon libprotobuf-dev protobuf-compiler python-protobuf libprotoc-dev uuid-runtime python-avahi python-netifaces libxenomai-dev
 sudo apt-get install git build-essential libglib2.0-dev libgtk2.0-dev tcl8.5-dev tk8.5-dev bwidget libreadline6-dev python-tk libboost-python-dev mesa-common-dev libglu1-mesa-dev libxmu-headers libxaw7-dev
 sudo apt-get install libmodbus-dev libsodium-dev libzmq4-dev libczmq-dev  libjansson-dev  libwebsockets-dev python-zmq liburiparser-dev
 sudo apt-get install -t wheezy-backports cython

Recommended for viewing zeroconf service announcements (not needed for build):

 sudo apt-get install avahi-discover

For playing with machinetalk/webtalk, this helps to run the Python websockets test script; not needed otherwise:

 sudo pip install websocket-client

Now verify all package prerequisites are in place - cd to the 'machinekit' toplevel directory (not the 'src' subdirectory!) and do:

debian/configure -x -p -r 
dpkg-checkbuilddeps

If there’s no output from this command, you’re all set to build as usual:

cd src
sh autogen.sh
# on amd64/i386:
./configure
# on beaglebone:
./configure --with-platform=bb --with-posix --with-xenomai --with-rt-preempt
# on raspberry:
./configure --with-platform=rpi --with-posix --with-xenomai --with-rt-preempt
make
sudo make setuid

Running the motorctrl demo, and QtQuickVCP

Have a look at machinetalk/demos/motorctrl and the README therein.

A provisional Android ARM7 binary can be downloaded from http://static.mah.priv.at/public/QtApp-debug.apk - just download on your phone/tablet, install and run it; it should find the running motorctrl HAL instance and show it as a menu option.
