Jupyter notebook spark-kernel with spark 1.4 and Cassandra support

@slowenthal slowenthal released this 28 Sep 21:25
· 41 commits to cassandra-1.4 since this release

First release of the IPython notebook spark-kernel with Cassandra support

To get Jupyter notebook:

You need Python installed. Then install the jupyter package:

pip install jupyter

To set it up:

Unpack the zip file attached just below.

Create the directory

~/.ipython/kernels/spark

Create the file

~/.ipython/kernels/spark/kernel.json

and paste in the following contents. Note that you need to update the path to sparkkernel so that it points to where you unpacked the zip file.

{
    "display_name": "Spark-Cassandra (Scala 2.10.4)",
    "language": "scala",
    "argv": [
        "/<path>/<to>/spark-kernel/bin/sparkkernel",
        "--profile",
        "{connection_file}"
    ],
    "codemirror_mode": "scala"
}
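
The directory and file steps above can also be scripted. The following is only a sketch, not part of the release: the function name `write_kernel_spec` and its parameters are made up for illustration, and it assumes the kernel spec contents are exactly those shown above. Writing the file through Python's json module avoids syntax slips such as a stray trailing comma.

```python
import json
from pathlib import Path


def write_kernel_spec(kernel_dir, sparkkernel_path):
    """Create kernel_dir if needed and write kernel.json into it.

    Hypothetical helper: kernel_dir is the kernel directory from the
    steps above, sparkkernel_path is the sparkkernel launcher inside
    the unpacked zip.
    """
    spec = {
        "display_name": "Spark-Cassandra (Scala 2.10.4)",
        "language": "scala",
        "argv": [
            str(sparkkernel_path),
            "--profile",
            "{connection_file}",  # left as a literal placeholder in the file
        ],
        "codemirror_mode": "scala",
    }
    kernel_dir = Path(kernel_dir).expanduser()
    kernel_dir.mkdir(parents=True, exist_ok=True)
    spec_file = kernel_dir / "kernel.json"
    # json.dumps always emits valid JSON
    spec_file.write_text(json.dumps(spec, indent=4) + "\n")
    return spec_file
```

For example, calling write_kernel_spec("~/.ipython/kernels/spark", "/<path>/<to>/spark-kernel/bin/sparkkernel") with the placeholder replaced by your real path should produce the same file as the manual steps.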

If you need to override the Cassandra connection host, add these two entries to the end of the argv array above, with a comma after "{connection_file}"

   "--spark-configuration",
   "spark.cassandra.connection.host=127.0.0.1"

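If you would rather not edit the JSON by hand, the host override can be added with a few lines of Python. This is a sketch under assumptions: `set_cassandra_host` is a hypothetical helper, not something shipped with the kernel, and it assumes kernel.json already exists with an argv array like the one above.

```python
import json
from pathlib import Path


def set_cassandra_host(spec_file, host):
    """Add or update the Cassandra host override in a kernel.json file.

    Hypothetical helper: spec_file is the path to kernel.json,
    host is the address Spark should use to reach Cassandra.
    """
    spec_file = Path(spec_file).expanduser()
    spec = json.loads(spec_file.read_text())
    argv = spec["argv"]
    prefix = "spark.cassandra.connection.host="
    for i, arg in enumerate(argv):
        if arg.startswith(prefix):
            # An override is already present: update it in place
            argv[i] = prefix + host
            break
    else:
        # No override yet: append the flag and its value
        argv += ["--spark-configuration", prefix + host]
    spec_file.write_text(json.dumps(spec, indent=4) + "\n")
    return argv
```

Calling set_cassandra_host("~/.ipython/kernels/spark/kernel.json", "127.0.0.1") should leave argv ending with the two entries shown above. Calling it again with a different host replaces the value instead of adding a second flag.
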
To run it:

jupyter notebook

In the browser, create a new Spark notebook

... and spark away

If you don't get any output, try appending .toString to the expression. There seems to be a bug rendering some types.
0.1.4-cassandra
Fixed output formatting issue

Running CQL Statements from within the notebook

Simply prefix a cell containing a CQL statement with %%Cql

%%Cql select * from system.local
