Mirror of Apache Toree (Incubating)

Join the chat at https://gitter.im/apache/toree

Apache Toree

The main goal of Toree is to provide the foundation for interactive applications to connect to and use Apache Spark.

Overview

Toree provides an interface that allows clients to interact with a Spark cluster. Clients can send libraries and snippets of code that are interpreted and run against a preconfigured Spark context. These snippets can do a variety of things:

  1. Define and run Spark jobs of all kinds
  2. Collect results from Spark and push them to the client
  3. Load necessary dependencies for the running code
  4. Start and monitor a stream
  5. ...

The main supported language is Scala, but Toree can also process Python and R. It implements the latest Jupyter message protocol (5.0), so it plugs easily into recent releases of Jupyter/IPython (3.2.x+ and 4.x+) for quick, interactive data exploration.
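To make the protocol reference concrete, here is a minimal sketch of the kind of `execute_request` message a Jupyter client sends to a kernel such as Toree. The field layout follows the Jupyter messaging specification for protocol 5.0. The helper name `make_execute_request` is hypothetical, and the Scala snippet assumes the preconfigured Spark context is bound to `sc`. The sketch builds only the message dictionary; it does not cover signing or the ZeroMQ wire format.

```python
import json
import uuid
from datetime import datetime, timezone


def make_execute_request(code, session_id, username="user"):
    """Build a protocol 5.0 execute_request message as a plain dict.

    Hypothetical helper for illustration; real clients (for example
    jupyter_client) also sign the message and send it over ZeroMQ.
    """
    return {
        "header": {
            "msg_id": str(uuid.uuid4()),
            "username": username,
            "session": session_id,
            "date": datetime.now(timezone.utc).isoformat(),
            "msg_type": "execute_request",
            "version": "5.0",
        },
        "parent_header": {},
        "metadata": {},
        "content": {
            "code": code,
            "silent": False,
            "store_history": True,
            "user_expressions": {},
            "allow_stdin": False,
            "stop_on_error": True,
        },
    }


# Assumption: `sc` is the Spark context that the kernel has preconfigured.
snippet = "sc.parallelize(1 to 100).sum()"
message = make_execute_request(snippet, session_id=str(uuid.uuid4()))
print(json.dumps(message["content"], indent=2))
```

The kernel interprets the `code` field and replies on the same session, which is what lets any protocol-compliant front end drive Spark through Toree.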

Try It

A version of Toree is deployed as part of the Try Jupyter! site. Select Apache Toree - Scala under the New dropdown. Note that this version only supports Scala.

Develop

This project uses make as the entry point for build, test, and packaging. It supports two modes, local and vagrant. Local is the default: all commands (e.g. sbt) run on your machine, so you need to install sbt, jupyter/ipython, and the other development requirements locally. The second mode uses Vagrant to simplify the development experience: all commands are sent to a Vagrant box that has the necessary dependencies pre-installed. To use vagrant mode, run export USE_VAGRANT=true.

To build and interact with Toree using Jupyter, run

make dev

This will start a Jupyter notebook server. Depending on your mode, it will be accessible at http://localhost:8888 or http://192.168.44.44:8888. From here you can create notebooks that use Toree configured for Spark local mode.
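As a small illustration of how the mode determines where the notebook server appears, the sketch below maps the `USE_VAGRANT` environment variable to the two addresses given above. The function name `notebook_url` is hypothetical, and treating only the exact value `true` as enabling vagrant mode is an assumption; the Makefile may interpret the variable differently.

```python
import os


def notebook_url(env=None):
    """Return the notebook address for the current development mode.

    Assumption: only USE_VAGRANT=true (case-insensitive) selects vagrant mode.
    """
    env = os.environ if env is None else env
    use_vagrant = env.get("USE_VAGRANT", "").lower() == "true"
    host = "192.168.44.44" if use_vagrant else "localhost"
    return f"http://{host}:8888"


print(notebook_url())
```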

To run the tests, use make test.

NOTE: Do not use sbt directly.

Build & Package

To build and package up Toree, run

make release

This results in 2 packages.

  • ./dist/toree-<VERSION>-binary-release.tar.gz is a simple package that contains the JAR and an executable.
  • ./dist/toree-<VERSION>.tar.gz is a pip-installable package that adds Toree as a Jupyter kernel.

NOTE: make release uses Docker. Please refer to the Docker installation instructions for your system. USE_VAGRANT is not supported by this make target.
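If you script around the release step, the two package names can be derived from the version string. The sketch below only reproduces the naming pattern listed above; the function name `release_artifacts` is hypothetical, and the example version is taken from the pip package name used in the Install section.

```python
from pathlib import PurePosixPath


def release_artifacts(version):
    """Return the two package paths that `make release` is described as producing."""
    dist = PurePosixPath("dist")
    return {
        "binary": dist / f"toree-{version}-binary-release.tar.gz",
        "pip": dist / f"toree-{version}.tar.gz",
    }


for kind, path in release_artifacts("0.2.0.dev1").items():
    print(kind, path)
```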

Run Examples

To play with the example notebooks, run

make jupyter

A notebook server will be launched in a Docker container with Toree and some other dependencies installed. Refer to your Docker setup for the IP address. The notebook will be at http://<ip>:8888/.

Install

Dev snapshots of Toree are located at https://dist.apache.org/repos/dist/dev/incubator/toree. To install using one of those packages, you can use the following:

pip install <PIP_RELEASE_URL>
jupyter toree install

where PIP_RELEASE_URL is the URL of one of the pip packages. For example:

pip install https://dist.apache.org/repos/dist/dev/incubator/toree/0.2.0/snapshots/dev1/toree-pip/toree-0.2.0.dev1.tar.gz
jupyter toree install
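The same two steps can be scripted, for example when provisioning several machines. This is a minimal sketch: the helper names `install_commands` and `run_install` are hypothetical, and it assumes that `pip` is available for the current interpreter and that `jupyter` is on the PATH once the package is installed.

```python
import subprocess
import sys


def install_commands(pip_release_url):
    """Return the two install steps as argument lists, in order."""
    return [
        [sys.executable, "-m", "pip", "install", pip_release_url],
        ["jupyter", "toree", "install"],
    ]


def run_install(pip_release_url):
    """Run each step, stopping at the first failure."""
    for command in install_commands(pip_release_url):
        subprocess.run(command, check=True)


# Usage (not executed here, because it modifies the local environment):
# run_install("<PIP_RELEASE_URL>")
```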

Reporting Issues

Refer to existing issues, and open new ones, in the project's issue tracker.

Communication

You can reach us through Gitter or our mailing list.

Version

We are working on publishing binary releases of Toree soon. As part of our move into the Apache Incubator, Toree will begin a new version sequence at 0.1.

Our goal is to keep master up to date with the latest version of Spark. When a new version of Spark requires specific code changes to Toree, we will move support for the older Spark version to its own branch.

As it stands, we maintain several branches for legacy versions of Spark. The table below shows what is available now.

Branch    Apache Spark Version
master    2.x.x
0.1.x     1.6+

Please note that new features will mainly be added to the master branch.
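The branch table above can be expressed as a small lookup, sketched here for anyone automating a checkout. The function name `toree_branch_for` is hypothetical, and reading "1.6+" as "1.6 or any later 1.x release" is an assumption.

```python
def toree_branch_for(spark_version):
    """Map an Apache Spark version string to the Toree branch that targets it."""
    parts = spark_version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    if major == 2:
        return "master"
    if major == 1 and minor >= 6:
        # Assumption: "1.6+" covers 1.6 and later 1.x releases only.
        return "0.1.x"
    raise ValueError(f"No Toree branch listed for Spark {spark_version}")


print(toree_branch_for("2.3.1"))
print(toree_branch_for("1.6.3"))
```

Versions outside the table raise a `ValueError` rather than guessing at a branch.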

Resources

We are currently enhancing our documentation, which is available on our website.