This is a patch release that includes two very important new features. First, we've released a preview version of the Python Connector API. This allows developers to build sources and sinks without the need to worry about Wallaroo’s internal protocol. We also have a better resilience story: we now use an algorithm based on the Chandy-Lamport snapshotting algorithm that minimizes the impact of checkpointing on processing in-flight messages. Read on for more details about each feature and other fixes that happened with this release.
What is Wallaroo
Wallaroo is a modern, extensible framework that makes it simple to get stateful streaming data and event-driven applications to production fast, regardless of scale.
If you are interested in installing Wallaroo, our installation documentation provides the various ways you can get up and running.
Feel free to use the table of contents below to help you navigate to sections you might find relevant.
Table of Contents
- License Changes
- Upgrading Wallaroo
License Changes
Starting with this release (version 0.5.3), Wallaroo is licensed entirely under the Apache2 license. If you aren’t familiar with the Apache2 license you can find it here.
Python Connector API
The Python Connector API provides developers with a way to quickly and easily connect their data streams as sources and sinks to Wallaroo with a minimal amount of code. Python connectors are processes that run outside of Wallaroo and act as a bridge between Wallaroo and the systems that store data. The API is written in Python, so developers can use the same language for creating connectors and Wallaroo applications.
In addition to the API, we have created connectors for Kafka, Redis, S3, RabbitMQ, Kinesis, Postgres, and UDP. Developers can use these connectors directly or they can use them as a base for building connectors that fit their specific needs.
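To make the "bridge" idea concrete, here is a minimal sketch of what a source connector process does conceptually: it pulls records from an external system and forwards them over a connection. This is not the Connector API itself. The function names (`encode_frame`, `decode_frames`, `run_source_connector`) and the 4-byte length-prefixed framing are assumptions made for illustration, and an in-memory buffer stands in for the network connection. The actual API is described in the documentation.

```python
import io
import struct


def encode_frame(payload: bytes) -> bytes:
    """Prefix a payload with its 4-byte big-endian length (assumed framing)."""
    return struct.pack(">I", len(payload)) + payload


def decode_frames(stream):
    """Yield payloads from a stream of length-prefixed frames."""
    while True:
        header = stream.read(4)
        if len(header) < 4:
            return  # end of stream
        (size,) = struct.unpack(">I", header)
        yield stream.read(size)


def run_source_connector(records, out_stream):
    """Hypothetical bridge loop: read from an external system, forward each record."""
    sent = 0
    for record in records:
        out_stream.write(encode_frame(record.encode("utf-8")))
        sent += 1
    return sent


# A list stands in for the external system; BytesIO stands in for the connection.
external_records = ["alpha", "beta", "gamma"]
connection = io.BytesIO()
sent_count = run_source_connector(external_records, connection)

# Read the frames back to show what the receiving side would see.
connection.seek(0)
received = [payload.decode("utf-8") for payload in decode_frames(connection)]
print(sent_count, received)
```

A connector API exists so that application developers only write the "read from the external system" part and never touch the framing shown here.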
For more information, please refer to the documentation.
This is a preview release of the connector API and may change based on feedback. Please share your thoughts at firstname.lastname@example.org.
Checkpointing and Recovery from Checkpoints
We've redesigned and improved our resilience strategy from the ground up. We now use an algorithm based on the Chandy-Lamport snapshotting algorithm that minimizes the impact of checkpointing on processing in-flight messages. A checkpoint represents a consistent recovery line. This means that when a failed worker recovers, we can roll back the cluster to the last checkpoint and begin processing again with the guarantee that all state in the system is valid. The interval between checkpoints is configurable.
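The following toy simulation illustrates the general idea of barrier-based checkpointing and rollback. It is a sketch of the technique, not Wallaroo's implementation: the `Barrier` and `Step` classes are hypothetical, the pipeline is a simple two-step chain, and the sketch assumes the source re-sends the messages that followed the barrier after a rollback.

```python
from collections import deque


class Barrier:
    """Marker injected into the stream to delimit a checkpoint."""
    def __init__(self, checkpoint_id):
        self.checkpoint_id = checkpoint_id


class Step:
    """A stateful step that keeps a running total of the integers it receives."""
    def __init__(self, name, downstream=None):
        self.name = name
        self.downstream = downstream
        self.inbox = deque()
        self.total = 0
        self.snapshots = {}  # checkpoint id -> saved total

    def run_once(self):
        """Handle one queued item; return False if the inbox was empty."""
        if not self.inbox:
            return False
        item = self.inbox.popleft()
        if isinstance(item, Barrier):
            # Snapshot local state when the barrier arrives; other steps keep running.
            self.snapshots[item.checkpoint_id] = self.total
        else:
            self.total += item
        if self.downstream is not None:
            self.downstream.inbox.append(item)  # barriers travel in-band with data
        return True

    def rollback(self, checkpoint_id):
        """Discard in-flight work and restore the state saved at the checkpoint."""
        self.inbox.clear()
        self.total = self.snapshots[checkpoint_id]


def drain(steps):
    """Run all steps until every inbox is empty."""
    while any([step.run_once() for step in steps]):
        pass


second = Step("second")
first = Step("first", downstream=second)
steps = [first, second]

# Messages 1 and 2 precede the barrier; 3 and 4 follow it.
first.inbox.extend([1, 2, Barrier(1), 3, 4])
for _ in range(5):
    first.run_once()   # first has processed everything
for _ in range(4):
    second.run_once()  # second has processed 1, 2, the barrier, and 3
totals_before_failure = (first.total, second.total)

# Simulated failure: roll every step back to checkpoint 1.
for step in steps:
    step.rollback(1)
totals_after_rollback = (first.total, second.total)

# Assumption: the source replays the messages that followed the barrier.
first.inbox.extend([3, 4])
drain(steps)
totals_after_replay = (first.total, second.total)

print(totals_before_failure, totals_after_rollback, totals_after_replay)
```

At the moment of failure the two steps disagree about how much of the stream they have seen, but their snapshots both reflect exactly the messages sent before the barrier, which is what makes the checkpoint a consistent recovery line.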
One pleasant side effect of this work is that we can now use barriers to determine when all in-flight messages are done processing, which is useful for scenarios like growing and shrinking the running cluster size. This replaces our earlier watermark-based strategy that required acks to be propagated from the sinks back up through the entire upstream chain.
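As a rough illustration of how a barrier can signal that in-flight messages are done, consider a sink fed by two ordered inputs. Once the sink has received the barrier on every input, everything sent before the barrier must already have been processed, with no acks flowing back upstream. The `Sink` class and its methods are hypothetical names for this sketch and are not part of Wallaroo.

```python
from collections import deque

BARRIER = object()  # sentinel marking "nothing earlier is still in flight"


class Sink:
    """A sink that receives ordered (FIFO) streams from several named inputs."""
    def __init__(self, input_names):
        self.pending = {name: deque() for name in input_names}
        self.processed = []
        self.barrier_seen = set()

    def deliver(self, input_name, item):
        """Queue an item arriving on one input."""
        self.pending[input_name].append(item)

    def step(self, input_name):
        """Process one queued item from the given input, if any."""
        queue = self.pending[input_name]
        if not queue:
            return
        item = queue.popleft()
        if item is BARRIER:
            self.barrier_seen.add(input_name)
        else:
            self.processed.append(item)

    def drained(self):
        """True once the barrier has arrived on every input."""
        return self.barrier_seen == set(self.pending)


sink = Sink(["a", "b"])
for item in ["a1", "a2", BARRIER]:
    sink.deliver("a", item)
for item in ["b1", BARRIER]:
    sink.deliver("b", item)

# Input "a" is fully processed, but "b" still has work in flight.
for _ in range(3):
    sink.step("a")
drained_midway = sink.drained()

# Once "b" delivers its barrier, all pre-barrier messages are done.
for _ in range(2):
    sink.step("b")
drained_at_end = sink.drained()
processed = sorted(sink.processed)

print(drained_midway, drained_at_end, processed)
```

The check relies only on each channel preserving order, which is why the barrier can replace per-message acknowledgements in this kind of scenario.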
Replicated Recovery Data
This release adds a foundation for building a Wallaroo cluster that can recover from catastrophic file system data loss. One cause of such catastrophic data loss could be the accidental destruction of an Amazon AWS/Google GCE/Azure cloud server instance by the administrator.
Command line arguments are now available to add I/O journalling (i.e., a write-ahead log to a remote file service) to all Wallaroo data written to the
Wallaroo Up additional distributions
Wallaroo Up, our shell script that automates the from-source install of Wallaroo on multiple Linux distributions, now officially supports more distributions (Fedora 26/27, Amazon Linux 2, Oracle Linux, Ubuntu Artful, and Debian Jessie/Buster).
Wallaroo Up now officially supports and has been tested on:
- Amazon Linux 2
- Oracle Linux 7
- Debian Buster (Testing)
Additionally, Wallaroo Up hasn't been tested on but should work on:
- Red Hat Enterprise Linux 7
Upgrading Wallaroo
Below are instructions for upgrading from Wallaroo 0.5.1. They cover four cases: Wallaroo compiled from source, Wallaroo installed via Wallaroo Up, Wallaroo in Docker, and Wallaroo in Vagrant.
Upgrading Wallaroo when compiled from source
Starting with Wallaroo 0.5.2, Wallaroo is installed into a version-specific directory. New versions are installed next to existing versions.
Upgrading Wallaroo when installed via Wallaroo Up
Wallaroo Up installs Wallaroo into a version-specific directory. New versions are installed next to existing versions. You can then port any changes you’ve made over to the new version as you see fit.
Upgrading the Wallaroo Docker image
To upgrade the Wallaroo Docker image, run the following command to get the latest image. If you don't allow a non-root user to run Docker commands, you'll need to add sudo to the front of the command.
docker pull wallaroo-labs-docker-wallaroolabs.bintray.io/release/wallaroo:0.5.3
Upgrading Wallaroo Source Code
If you mounted the Wallaroo source code to your local machine using the directory recommended in setup, in /tmp/wallaroo-docker (UNIX & MacOS users) or c:/wallaroo-docker (Windows users), then you will need to move the existing directory in order to get the latest source code. The latest Wallaroo source code will be copied to this directory automatically when a new container is started with the latest Docker image.
UNIX & MacOS Users
For UNIX and MacOS users, you can move the directory with the following command:
mv /tmp/wallaroo-docker/wallaroo-src/ /tmp/wallaroo-docker/wallaroo-0.5.1-src/
For Windows users, you can move the directory with the following command:
move c:/wallaroo-docker/wallaroo-src/ c:/wallaroo-docker/wallaroo-0.5.1-src
Once done moving, you can re-create the wallaroo-src directory with the following command:
Upgrading Wallaroo in Vagrant
Following the normal instructions for installing Wallaroo in Vagrant will install new versions next to existing versions.
If you have modified your old Vagrant VM in any way that you intend to persist, you’ll need to carry those changes over now. For example, copy any edited or new files from the old Vagrant VM to the new one. When you’ve completed that, it’s a good idea to clean up your old Vagrant box by running:
cd ~/wallaroo-tutorial/wallaroo/vagrant-0.5.1
vagrant destroy
[0.5.3] - 2018-09-28
- Python's argparse and other libraries that require properly initialized Python arguments should no longer fail in certain cases in machida
- Added support for Fedora 26/27, Amazon Linux 2, Oracle Linux, Ubuntu Artful, and Debian Jessie/Buster Linux distributions via Wallaroo Up
- Checkpointing protocol for regular checkpointing of application state
- Support for rollback to most recent checkpoint on recovery
- Support for replicated recovery data
- Added preview release of the Python Connectors API
- Added connector scripts for Kafka, Kinesis, RabbitMQ, Redis, S3, and UDP
- Added a template for a Postgres connector script