Skip to content

OSC/bc_osc_jupyter_spark

Repository files navigation

Batch Connect - OSC Jupyter Notebook + Spark

GitHub Release GitHub License

An interactive app designed for OSC OnDemand that launches a Jupyter Notebook server and an Apache Spark cluster within an Owens batch job.

Prerequisites

This Batch Connect app requires the following software be installed on the compute nodes that the batch job is intended to run on (NOT the OnDemand node):

  • Lmod 6.0.1+ or any other module purge and module load <modules> based CLI used to load appropriate environments within the batch job before launching the Jupyter Notebook server.
  • Jupyter Notebook 4.2.3+ (earlier versions are untested but may work for you)
  • OpenSSL 1.0.1+ (used to hash the Jupyter Notebook server password)
  • Apache Spark 2.1.0+

Install

Use Git to clone this app and checkout the desired branch/version you want to use:

scl enable git19 -- git clone <repo>
cd <dir>
scl enable git19 -- git checkout <tag/branch>

You will not need to do anything beyond this as all necessary assets are installed. You will also not need to restart this app as it isn't a Passenger app.

To update the app you would:

cd <dir>
scl enable git19 -- git fetch
scl enable git19 -- git checkout <tag/branch>

Again, you do not need to restart the app as it isn't a Passenger app.

Contributing

  1. Fork it ( https://github.com/OSC/bc_osc_jupyter_spark/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request

License

  • Documentation, website content, and logo is licensed under CC-BY-4.0
  • Code is licensed under MIT (see LICENSE.txt)
  • The Jupyter logo is a trademark of NumFOCUS foundation.
  • Apache Spark, Spark and the Spark logo are trademarks of the Apache Software Foundation (ASF).