# Jupyter environment

The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations, and explanatory text. Uses include data cleaning and transformation, numerical simulation, statistical modeling, machine learning, and much more.

[Website](http://jupyter.org/) [Project repository](https://github.com/jupyter)

We will be using a Jupyter server as the primary web interface for this workshop. Several notebooks have been provided in advance to guide you through the workshop. After the workshop, you may use the [Agave Jupyter image](https://hub.docker.com/r/agaveplatform/jupyter-notebook/) to recreate the notebook server and repeat the workshop, or continue with your own work at your leisure.

The Agave image has several customizations to facilitate use of the platform and ease much of the heavy lifting done behind the scenes in this tutorial.


### Custom Kernels

Your Jupyter server has multiple kernels available for use right away. We have preconfigured them with several useful libraries and tools to help you get up and running with common tasks more easily. Additionally, we have bundled the Agave CLI into the Bash kernel and the Python SDK into the Python 2 and Python 3 kernels. These kernels are pre-authenticated with valid Agave auth tokens that you can use to begin interacting with the Agave Platform right away.


### Shared file system

Your `$HOME/work` directory on the Jupyter server is shared with your sandbox, so you can safely copy data between the two environments quickly and easily.


### Web console  

Jupyter contains a web terminal that can be used to access your sandbox environment or interact with the Jupyter container itself. To log in to your sandbox from the Jupyter web terminal, run the following command:

```
ssh -p 10022 $VM_IPADDRESS
```  

### Tutorial notebooks  

This tutorial is presented as a series of Jupyter notebooks. If you are attending this tutorial in person, you will download the notebooks into the home directory of your notebook server. If you are following along after the fact, you should download the notebooks from the GitHub repository into your Jupyter workspace.

```
git clone --depth 1 https://github.com/UH-CI/agave_workshop_20180419
``` 

### API access  

The tutorial walks you through the process of obtaining a set of API keys and authenticating to the Agave Platform. Once this is done, you no longer need to authenticate to follow the tutorial. Both the Agave CLI and Python SDK will pick up your authorization cache and automatically refresh it as needed.
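
If a later step reports an authentication problem, it can help to confirm that the authorization cache exists at all. The following is a minimal sketch, assuming the cache is a file named `current` inside the directory given by `AGAVE_CACHE_DIR` (falling back to `$HOME/.agave`); that file location and the function name `auth_cache_status` are assumptions for illustration, not something this tutorial states.

```shell
# Report whether an Agave auth cache file is present and non-empty.
# ASSUMPTION: the cache lives at "$AGAVE_CACHE_DIR/current",
# falling back to "$HOME/.agave/current" when the variable is unset.
auth_cache_status() {
  cache_file="${AGAVE_CACHE_DIR:-$HOME/.agave}/current"
  if [ -s "$cache_file" ]; then
    echo "auth cache found: $cache_file"
  else
    echo "no auth cache at $cache_file" >&2
    return 1
  fi
}

# Example use from the Jupyter web terminal:
#   auth_cache_status || echo "re-run the authentication notebook"
```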

### Extras

Inside of the `examples` directory, you will find several notebooks to help you learn more about the Agave platform, containers, and SciOps. We leave these for you to follow after the tutorial.


<hr>

# Sandbox environment

The tutorial sandbox is a full Ubuntu 16.04 server running as a Docker container on a VM dedicated for your use in this tutorial. The sandbox has a standard HPC build environment with OpenMPI, Python 2, Python 3, build-essential, gfortran, openssl, git, jq, vim, and a host of other utilities. 

### Container runtimes  

Docker and Singularity are both pre-installed in your sandbox. All images used in this tutorial are available from the public Agave Docker Hub and Singularity Hub accounts. You may also use your own private registry accounts, but you will need to log in to the respective registries on your own.


### Funwave example code  

The sample code for this project is already present in `$HOME/FUNWAVE-TVD`.


### Shared file system

Your `$HOME/work` directory on the Jupyter server is shared with your sandbox, so you can safely copy data between the two environments quickly and easily.


### Accessibility  

To log in to the sandbox from outside the Jupyter server, use the host IP address. You will find the public IP address of your sandbox in the `$VM_IPADDRESS` environment variable. Valid ssh keys are available in the `~/work/.ssh` directory of your Jupyter server. Alternatively, you can append your own public key to the `$HOME/work/.ssh/authorized_keys` file.

```
ssh -i /path/to/private/key.pem -p 10022 jovyan@$VM_IPADDRESS
```
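
Appending a key by hand is easy to get slightly wrong (duplicate entries, a missing trailing newline, loose permissions). The following is a minimal sketch of one way to do it; the function name `add_authorized_key` is hypothetical, and the default target is simply the `authorized_keys` path named above.

```shell
# Append a public key to an authorized_keys file unless it is already listed.
# Arguments: path to the public key file, then (optionally) the
# authorized_keys file, which defaults to the path named in the text above.
add_authorized_key() {
  pubkey_file=$1
  auth_file=${2:-$HOME/work/.ssh/authorized_keys}

  [ -r "$pubkey_file" ] || { echo "cannot read $pubkey_file" >&2; return 1; }

  mkdir -p "$(dirname "$auth_file")"
  touch "$auth_file"
  chmod 600 "$auth_file"

  # -F fixed strings, -x whole-line match, -q quiet, -f patterns from the key file
  if grep -qxFf "$pubkey_file" "$auth_file"; then
    echo "key already present"
    return 0
  fi

  # Make sure the existing file ends with a newline before appending.
  if [ -s "$auth_file" ] && [ -n "$(tail -c 1 "$auth_file")" ]; then
    echo >> "$auth_file"
  fi
  cat "$pubkey_file" >> "$auth_file"
  echo "key added"
}

# Example (the key path is illustrative):
#   add_authorized_key "$HOME/.ssh/id_rsa.pub"
```

Running it a second time with the same key leaves the file unchanged.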

### Persistence

Your VM will remain available for 1-2 days following the tutorial. During that time, your data will remain available.  After that, the VM and any data saved with it will be destroyed. If you need to persist your data, it is recommended that you move it to another host, or [create your own account](https://public.agaveapi.co/create_account) in the Agave public tenant and save your data in the free cloud storage provided to you by default there.  
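
One simple way to move your data to another host is to bundle it into a single archive first. The following is a minimal sketch, assuming the data you care about lives under `$HOME/work`; the function name `archive_work_dir` and the backup file naming are hypothetical choices made for this example.

```shell
# Bundle a directory into a timestamped tarball that can be copied elsewhere.
# Arguments: directory to archive (default: $HOME/work),
#            directory to write the archive into (default: $HOME).
archive_work_dir() {
  src=${1:-$HOME/work}
  out_dir=${2:-$HOME}
  [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }

  archive="$out_dir/work-backup-$(date +%Y%m%d-%H%M%S).tar.gz"
  # -C switches to the parent directory so the archive stores relative paths.
  tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
  echo "$archive"
}

# Example: create the archive, then copy it to a host you control
# (user@example.org is a placeholder, not a real destination):
#   scp "$(archive_work_dir)" user@example.org:
```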



<hr>

# Logging In

We have already configured resources for you to use in this tutorial. 

### Virtual Machine

Each of you has a dedicated VM provided by [Jetstream](https://jetstream-cloud.org). You will use this VM for the duration of the tutorial.

### Training Account

A training account on the University of Hawaii Agave Platform's public tenant has also been allocated to you.


### Login

Your Jupyter server is available at `<username>.hawaii.training.agaveplatform.org`. 

Usernames run from `training001` to `training100`. We will count off to determine which instance each attendee uses.
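
The mapping from a count-off number to a hostname can be sketched as follows. The hostname pattern and the three-digit username format come from the text above; the function name `jupyter_host` is hypothetical.

```shell
# Print the Jupyter hostname for a given seat number (1-100).
jupyter_host() {
  seat=$1
  # Strip leading zeros so input such as "007" is not treated as octal.
  seat=${seat#"${seat%%[!0]*}"}
  seat=${seat:-0}
  case $seat in
    *[!0-9]*) echo "seat must be a number between 1 and 100" >&2; return 1 ;;
  esac
  if [ "$seat" -lt 1 ] || [ "$seat" -gt 100 ]; then
    echo "seat must be a number between 1 and 100" >&2
    return 1
  fi
  printf 'training%03d.hawaii.training.agaveplatform.org\n' "$seat"
}

jupyter_host 7   # prints training007.hawaii.training.agaveplatform.org
```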

When you first log in, you will find the workspace empty, save for a notebook named [INSTALL.ipynb](../INSTALL.ipynb). Open this notebook by clicking on the notebook name, then click the *"run"* button. This will fetch all the tutorial notebooks from the tutorial's git repository and add them to your workspace.

<hr>

# Following along at home

If you are following along with this tutorial at home, you can recreate the tutorial Jupyter server and sandbox environments by running the containers on your own laptop or server using the following Docker Compose file. Save it in a file named `docker-compose.yml`.
<div style='border-style: solid'>

```
version: '2'

volumes:
  training-volume:
  ssh-keygen-volume:
  jenkins-home-volume:

services:

  # traefik reverse proxy to expose the jupyter and jenkins servers over ssl via a common hostname
  traefik:
    image: traefik:latest
    mem_limit: 512m
    command: --debug=True --docker --docker.watch --web --web.address=:28443 --entryPoints='Name:http Address::80' --entryPoints='Name:https Address::443' --defaultEntryPoints='http,https'
    ports:
      - '443:443'
      - '80:80'
      - '28443:28443'
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock

  # This is a single shot container that creates a set of ssh keys per instance
  # and deploys them to a persistent volume shared between the sandbox and
  # jupyter container. By doing this, we don't need to ship keys with the image
  # or source.
  ssh-keygen:
    image: agaveplatform/jupyter-notebook:5.2
    entrypoint: /bin/bash
    command: /usr/local/bin/keygen.sh
    user: jovyan
    env_file:
      - training.env
    volumes:
      - ssh-keygen-volume:/home/jovyan/.ssh
      - ./docker/ssh-keygen/keygen.sh:/usr/local/bin/keygen.sh

  # Jupyter server customized for the tenant and user specified by the AGAVE_USERNAME,
  # AGAVE_PASSWORD, and AGAVE_TENANT environment variables defined in the compose command
  jupyter:
    image: agaveplatform/jupyter-notebook:5.2
    command: start-notebook.sh --NotebookApp.token=''
    mem_limit: 2048m
    restart: on-failure
    ports:
      - '8888:8888'
    depends_on:
      - ssh-keygen
    env_file:
      - training.env
    environment:
      - GRANT_SUDO=yes
      - VM_MACHINE=training-node-${AGAVE_USERNAME}
      - VM_HOSTNAME=localhost
      - USE_TUNNEL=True
      - ENVIRONMENT=training
      - SCRATCH_DIR=/home/jovyan
      - MACHINE_USERNAME=jovyan
      - MACHINE_NAME=sandbox
      - DOCKERHUB_NAME=stevenrbrandt
      - AGAVE_APP_DEPLOYMENT_PATH=agave-deployment
      - AGAVE_CACHE_DIR=/home/jovyan/work/.${AGAVE_TENANT}
      - AGAVE_JSON_PARSER=jq
      - AGAVE_SYSTEM_SITE_DOMAIN=localhost
      - AGAVE_STORAGE_WORK_DIR=/home/jovyan
      - AGAVE_STORAGE_HOME_DIR=/home/jovyan
      - AGAVE_APP_NAME=funwave-tvd-hawaii-${AGAVE_USERNAME}
      - AGAVE_STORAGE_SYSTEM_ID=${AGAVE_TENANT}-storage-${AGAVE_USERNAME}
      - AGAVE_EXECUTION_SYSTEM_ID=${AGAVE_TENANT}-exec-${AGAVE_USERNAME}
    volumes:
      - ssh-keygen-volume:/home/jovyan/.ssh:ro
      - training-volume:/home/jovyan/work
      - .:/home/jovyan/notebooks
    labels:
      - "traefik.port=8888"
      - "traefik.protocol=http"
      - "traefik.tags=${AGAVE_USERNAME},jupyter"
      - "traefik.backend=${AGAVE_USERNAME}-training"
  # sandbox ubuntu server with build tools, OpenMP, and the sample code
  sandbox:
    image: agaveplatform/training-sandbox:latest
    mem_limit: 2048m
    privileged: True
    restart: on-failure
    ports:
      - '10022:22'
    depends_on:
      - ssh-keygen
    env_file:
      - training.env
    environment:
      - VM_MACHINE=training-node-${AGAVE_USERNAME}
    volumes:
      - ssh-keygen-volume:/home/jovyan/.ssh:ro
      - training-volume:/home/jovyan/work
      - /var/run/docker.sock:/var/run/docker.sock
    labels:
      - traefik.enable=false

  # Jenkins CI server for automated builds.
  jenkins:
    image: agaveplatform/jenkins:sc18
    mem_limit: 2048m
    privileged: True
    restart: on-failure
    ports:
      - '8080:8080'
      - '8443:8443'
    depends_on:
      - ssh-keygen
    env_file:
      - training.env
    environment:
      - AGAVE_CACHE_DIR=/var/jenkins_home/.${AGAVE_TENANT}
    volumes:
      - ssh-keygen-volume:/var/jenkins_home/.ssh:ro
    labels:
      - "traefik.port=8080"
      - "traefik.protocol=http"
      - "traefik.tags=jenkins"
      - "traefik.backend=jenkins"
      - "traefik.frontend.rule=PathPrefix:/jenkins"
      - "traefik.frontend.passHostHeader=true"

```
</div>

One more file is required to ensure your sandbox environment is set up correctly: the `training.env` file.
<div style='border-style: solid'>

```
# standard Agave Platform auth config variables used to
# bootstrap your notebook environment with valid token
# and api keys to interact with the Science APIs.
AGAVE_TENANTS_API_BASEURL=https://uhhpctenant.its.hawaii.edu/tenants
AGAVE_USERNAME=
AGAVE_PASSWORD=
AGAVE_CACHE_DIR=/home/jovyan/work/.$AGAVE_TENANT

# If running on a host without a public, static ip address,
# you will need to start a reverse tunnel to a server that
# Agave can communicate with in order to leverage the
# sandbox container and follow the data and compute tutorials.
USE_TUNNEL=True
NGROK_TOKEN=

# Some metadata used to generate unique names of systems, etc
# in the tutorials.
ENVIRONMENT=training

# Should the ssh keys be rotated every time the stack starts up?
# By default, yes. Comment this out to reuse the keys you have
# between restarts.
# ROTATE=yes
```
</div>
> To launch your Jupyter and sandbox environment, you first need to set the environment variables `AGAVE_USERNAME`, `AGAVE_PASSWORD`, and `NGROK_TOKEN` in the `training.env` file. The first two should be your University of Hawaii username and password. The ngrok token should be obtained from [ngrok](https://ngrok.com).

> Ngrok will provide tunnelling for you so that Agave can ssh into your laptop or desktop machine. It will do this by setting the `VM_IPADDRESS`, `VM_HOSTNAME`, and `VM_SSH_PORT` variables for you.
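
A quick pre-flight check can catch an empty value in `training.env` before the stack starts. The following is a minimal sketch, not part of the tutorial materials: the function name `check_training_env` is hypothetical, and it only verifies that the three variables named above have non-empty values, not that the credentials are valid.

```shell
# Report any required variables that are missing or empty in an env file.
# Argument: path to the env file (default: training.env in the current directory).
check_training_env() {
  env_file=${1:-training.env}
  [ -r "$env_file" ] || { echo "cannot read $env_file" >&2; return 1; }

  missing=0
  for name in AGAVE_USERNAME AGAVE_PASSWORD NGROK_TOKEN; do
    # Match "NAME=" followed by at least one character on the same line.
    if ! grep -Eq "^${name}=.+" "$env_file"; then
      echo "$name is not set in $env_file"
      missing=1
    fi
  done
  return "$missing"
}

# Example: run from the directory that holds training.env
#   check_training_env && echo "training.env looks complete"
```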

> Once you have these things set up in your `training.env` and have the `docker-compose.yml` file in the same location, you should be able to run:
```
AGAVE_USERNAME=myuhusername AGAVE_TENANT=hawaii docker-compose up -d
```

Note: run this command from the same directory in which you created your `docker-compose.yml` and `training.env` files. You should then be able to use your browser to connect to the tutorial setup on port 8888 of your local machine (http://localhost:8888).



### Hands On

- Everyone who has Docker installed should attempt to set this up on their laptop. We will take 10 minutes for everyone to start the process.
- Pulling the containers down will take several minutes (dependent on internet bandwidth), so we will move on to other content during this process and come back to this later.