***
# Control Board for Different Data Analytics Stacks

1. [Vanilla-Jupyter-Datascience-Notebook](#Vanilla-Jupyter-Datascience-Notebook)
2. [Elastic Stack (formerly ELK-Stack)](#Elastic-Stack-(formerly-ELK-Stack))
3. [Neo4j](#Neo4j)
4. [Superset](#Superset)

***
## Vanilla Jupyter Datascience-Notebook

#### Summary

The "standard" [Jupyter Datascience Notebook](https://github.com/jupyter/docker-stacks/tree/master/datascience-notebook) image, with all packages updated.

#### Pin some variables first

Specify the directory you want to mount in order to persist data:

In [None]:
homedir = r"C:\Users\kat\Documents\Test-Project"

# Convert the Windows path for Docker (Linux-based): forward slashes, no drive colon
homedir = homedir.replace("\\", "/").replace(":", "")
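
The string replacement above works for a plain drive-letter path. As a minimal sketch of the same conversion with some input checking, the helper below uses `pathlib.PureWindowsPath`, which parses Windows paths on any operating system. The function name `to_docker_mount_path` is hypothetical and not part of this notebook.

```python
from pathlib import PureWindowsPath


def to_docker_mount_path(windows_path: str) -> str:
    """Turn r"C:\\Users\\kat" into "C/Users/kat" (hypothetical helper).

    The result has the same shape as the `homedir` value produced by the
    replace() calls above, so it can be used in `-v //$homedir:/home/jovyan`.
    """
    path = PureWindowsPath(windows_path)
    if not path.drive:
        # Relative paths have no drive letter and cannot be mounted this way
        raise ValueError(f"Expected an absolute Windows path, got {windows_path!r}")
    drive = path.drive.rstrip(":")  # "C:" -> "C"
    # path.parts[0] is the anchor ("C:\\"); the remaining parts are folder names
    return "/".join([drive, *path.parts[1:]])
```

With this helper, the cell above could read `homedir = to_docker_mount_path(r"C:\Users\kat\Documents\Test-Project")`.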

Set the Docker image version ("tag") of the Jupyter Datascience-Notebook you want to launch (see the [AWK Repo on Dockerhub](https://hub.docker.com/repository/docker/awkgroupag/datascience-notebook/tags)):

In [None]:
notebook_version = "e56df3a"

# Hopefully you won't need to use another notebook ;-)
notebook = "awkgroupag/datascience-notebook"

#### Start the container (be sure to run the cells above first)

In [None]:
! sudo docker container run -d -p 8888:8888 \
    -e JUPYTER_ENABLE_LAB=yes \
    -v //$homedir:/home/jovyan \
    --name jupyter \
    $notebook:$notebook_version 
! echo && echo Waiting for 5 seconds for the container to spin up

from IPython.display import Markdown as md
import time

time.sleep(5)
log = ! sudo docker logs jupyter
url = 'http://127.0.0.1:8888'
for line in log:
    if url in line:
        break
else:
    print(log)
    raise RuntimeError('Did not find URL in the log above')
url = url + line.split(url, 1)[1]
md(f"**Your Jupyterlab URL is** {url}")
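
The cell above waits a fixed 5 seconds, which may be too short on a slow machine. A possible alternative, sketched here under assumptions, is to poll the log until the URL shows up. The names `find_jupyter_url` and `wait_for_url` are hypothetical, and `fetch_log` stands for any callable that returns the current log lines.

```python
import re
import time
from typing import Callable, Iterable, Optional


def find_jupyter_url(log_lines: Iterable[str],
                     base_url: str = "http://127.0.0.1:8888") -> Optional[str]:
    """Return the first URL in the log that starts with base_url, or None."""
    # \S* grabs everything up to the next whitespace, i.e. path and token
    pattern = re.compile(re.escape(base_url) + r"\S*")
    for line in log_lines:
        match = pattern.search(line)
        if match:
            return match.group(0)
    return None


def wait_for_url(fetch_log: Callable[[], Iterable[str]],
                 attempts: int = 10, delay: float = 1.0) -> str:
    """Call fetch_log() repeatedly until a Jupyter URL appears in its output."""
    for _ in range(attempts):
        url = find_jupyter_url(fetch_log())
        if url is not None:
            return url
        time.sleep(delay)
    raise RuntimeError(f"Did not find a Jupyter URL after {attempts} attempts")
```

Inside this notebook, `fetch_log` could be something like `lambda: get_ipython().getoutput("sudo docker logs jupyter")`, which is the function IPython uses behind `log = ! ...`.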

#### Cleaning up
Stop the Vanilla Jupyter Notebook container. The container itself is not deleted.

In [None]:
! sudo docker stop jupyter

Remove the container

In [None]:
! sudo docker rm jupyter

***
## Elastic Stack (formerly ELK-Stack)

#### Summary

Elasticsearch, Kibana, Beats, and Logstash. Take data from any source, in any format, then search, analyze, and visualize it in real time.

* **Elasticsearch** is a distributed, RESTful search and analytics engine. As the heart of the Elastic Stack, it centrally stores your data for lightning fast search, fine‑tuned relevancy, and powerful analytics that scale with ease.
* **Kibana** lets you visualize your Elasticsearch data and navigate the Elastic Stack so you can do anything from tracking query load to understanding the way requests flow through your apps.
* **Logstash** is a server-side data processing pipeline that ingests data from a multitude of sources simultaneously, transforms it, and then sends it to your favorite "stash."
* **Beats** is the platform for single-purpose data shippers. They send data from hundreds or thousands of machines and systems to Logstash or Elasticsearch.

Note that Beats (e.g. Metricbeat or Filebeat) are not included in this stack.

#### Connections once the stack has been started
* Kibana browser access: [http://localhost:5601](http://localhost:5601)
* Elasticsearch access, e.g. through a Jupyter notebook: [http://localhost:9200](http://localhost:9200)
* Logstash access: 

#### Create a volume to persist all data

In [None]:
! sudo docker volume create elasticsearch_data

#### Start the stack
Once the images have been pulled and the containers are running, startup can still take 1-2 minutes!

In [None]:
! sudo docker-compose -f "./elk/docker-compose.yml" -p "elk" up -d
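
Because the stack needs a while to come up, it can help to check from Python whether the services already answer. The sketch below uses only the standard library; the function names `is_reachable` and `wait_until_reachable` are hypothetical. It assumes the ports listed under "Connections" (`http://localhost:9200` for Elasticsearch, `http://localhost:5601` for Kibana).

```python
import http.client
import time
import urllib.error
import urllib.request


def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers at url, whatever the status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status
        return True
    except (OSError, http.client.HTTPException):
        # Refused connections, timeouts, and broken responses all end up here
        return False


def wait_until_reachable(url: str, attempts: int = 24, delay: float = 5.0) -> bool:
    """Poll url until it answers; the defaults cover roughly two minutes."""
    for _ in range(attempts):
        if is_reachable(url):
            return True
        time.sleep(delay)
    return False
```

For example, `wait_until_reachable("http://localhost:9200")` returns `True` as soon as Elasticsearch responds and `False` if it has not answered after all attempts.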

#### Stop and remove the stack (Elasticsearch and Kibana data will be retained)

In [None]:
! sudo docker-compose -f "./elk/docker-compose.yml" -p "elk" down

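#### Delete all Elasticsearch and Kibana data
Stop and remove the stack first: Docker refuses to remove a volume that is still in use by a container.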

In [None]:
! sudo docker volume rm elasticsearch_data

***
## Neo4j

## (under construction)

***
## Superset

## (under construction)