# Part 1
## Jupyter and SWAN

## How to install Jupyter notebook

* If you use conda, you can install it with:

`conda install -c conda-forge jupyterlab`

* If you use pip, you can install it with:

`pip install jupyterlab`


## Prerequisite: Python

While Jupyter runs code in many programming languages, Python (Python 3.3 or greater, or Python 2.7) is required to install JupyterLab or the classic Jupyter Notebook.
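
Before installing, it can help to check which Python interpreter you are running and whether the Jupyter packages are already importable. The following is a minimal sketch using only the standard library; the helper name `jupyter_status` is made up for this example, and it assumes the packages are importable as `notebook` and `jupyterlab`.

```python
import importlib.util
import sys


def jupyter_status(packages=("notebook", "jupyterlab")):
    """Return {package name: True if the package can be imported}."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


# Show the interpreter version, then the status of each package.
print("Python version:", ".".join(str(part) for part in sys.version_info[:3]))
for name, installed in jupyter_status().items():
    print(f"{name}: {'installed' if installed else 'not found'}")
```

If a package is reported as not found, install it with one of the commands above, using the same Python environment.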

I hope that by now you have already installed Jupyter Notebook, so let's start!

*More information for advanced users on how to install Jupyter Notebook is available here:*
https://jupyter.org/install

## How to start

Run in your command line:

`jupyter notebook`

or

`jupyter-notebook`

and you will see output similar to this:

`[I 09:00:31.499 NotebookApp] JupyterLab extension loaded from /usr/lib/python3.7/site-packages/jupyterlab
[I 09:00:31.499 NotebookApp] JupyterLab application directory is /usr/share/jupyter/lab
[I 09:00:31.657 NotebookApp] Serving notebooks from local directory: /home/oksana/CERN_sources/carpentries-root-training
[I 09:00:31.657 NotebookApp] The Jupyter Notebook is running at:
[I 09:00:31.657 NotebookApp] http://localhost:8888/?token=a37410108bcdb101286b232f05f45d2a2c99dc894a8ea6f6
[I 09:00:31.657 NotebookApp]  or http://127.0.0.1:8888/?token=a37410108bcdb101286b232f05f45d2a2c99dc894a8ea6f6
[I 09:00:31.657 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).`
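
The URL lines in this log contain everything needed to reach the server: host, port and access token. As an illustration only, the sketch below pulls those pieces out of one of the log lines shown above using the standard library; `parse_server_url` is a hypothetical helper, not part of Jupyter.

```python
import re
from urllib.parse import parse_qs, urlsplit

# One line copied from the startup log shown above.
LOG_LINE = (
    "[I 09:00:31.657 NotebookApp] "
    "http://localhost:8888/?token=a37410108bcdb101286b232f05f45d2a2c99dc894a8ea6f6"
)


def parse_server_url(line):
    """Extract (host, port, token) from a log line, or None if it has no URL."""
    match = re.search(r"https?://\S+", line)
    if match is None:
        return None
    parts = urlsplit(match.group(0))
    # parse_qs maps each query key to a list of values; take the first token.
    token = parse_qs(parts.query).get("token", [None])[0]
    return parts.hostname, parts.port, token


host, port, token = parse_server_url(LOG_LINE)
print(host, port, token)
```

The token works like a password for the server, so avoid pasting real log output into public places.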


When you first start the notebook server, your browser will open to the notebook dashboard. 
* The dashboard serves as a home page for the notebook. 
* Its main purpose is to display the notebooks and files in the current directory.

For example, here is a screenshot of the dashboard page of our training directory:

<img src="images/jup.png">

The top of the notebook list displays clickable breadcrumbs of the current directory. By clicking on these breadcrumbs or on sub-directories in the notebook list, you can navigate your file system.
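
What the dashboard lists can be roughly approximated in plain Python. This sketch assumes we only care about sub-directories and `.ipynb` files directly inside one directory; the function name `list_dashboard_entries` is invented for the example, and the real dashboard shows other files as well.

```python
from pathlib import Path


def list_dashboard_entries(directory="."):
    """Return (sub-directories, notebooks) found directly inside `directory`."""
    root = Path(directory)
    folders = sorted(p.name for p in root.iterdir() if p.is_dir())
    notebooks = sorted(p.name for p in root.glob("*.ipynb"))
    return folders, notebooks


folders, notebooks = list_dashboard_entries(".")
print("Folders:  ", folders)
print("Notebooks:", notebooks)
```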

* To create a new notebook, click on the “New” button at the top of the list and select a kernel from the dropdown (as seen below). 
* Which kernels are listed depends on what’s installed on the server.
* Some of the kernels in the screenshot below may not be available to you.

<img src="images/jup.png">

* Notebooks and files can be uploaded to the current directory by dragging a notebook file onto the notebook list or by the “click here” text above the list.

* The notebook list shows green “Running” text and a green notebook icon next to running notebooks (as seen below).

* **Notebooks remain running until you explicitly shut them down; closing the notebook’s page is not sufficient.**


<img src="images/jup2.png">

To shut down, delete, duplicate, or rename a notebook, check the checkbox next to it; an array of controls will appear at the top of the notebook list (as seen below). The same operations can be applied to directories and files where applicable.
<img src="images/jup3.png">

To see all of your running notebooks along with their directories, click on the “Running” tab:
<img src="images/jup4.png">

Much of this material is taken from: https://jupyter-notebook.readthedocs.io/en/stable/

## Use cases of Jupyter notebooks

* Programming and Computer Science
* Statistics, Machine Learning and Data Science
* Mathematics, Physics, Chemistry, Biology
* Earth Science and Geo-Spatial data
* Linguistics and Text Mining
* Signal Processing
* Engineering Education
* and even accessing and programming an IBM quantum computer via notebooks

Check out some nice examples:
https://github.com/jupyter/jupyter/wiki/A-gallery-of-interesting-Jupyter-Notebooks

## SWAN (Service for Web-based ANalysis)
https://swan.cern.ch

### What is SWAN?

**SWAN (Service for Web-based ANalysis)** is a CERN service that allows users to perform interactive data analysis in the cloud, following a "software as a service" model. It is built upon the widely used Jupyter notebooks, which allow users to write and run their data analyses using only a web browser. Moreover, by connecting to SWAN, users have immediate access to the CERN storage, software and computing resources they need for their analyses.

What you can do with SWAN:
* Analyse data without the need to install any software
* Use the Jupyter notebook interface as well as shell access from the browser
* Use CERNBox as your home directory and synchronise your local user storage with the cloud
* Access experiment and user data in the **CERN cloud (EOS)**
* Share your work with your colleagues thanks to **CERNBox support**
* Document and preserve science: create catalogues of analyses to encourage **reproducible studies** and learning by example
* Submit your jobs to **CERN Spark Clusters**


More interesting features:

* Support for LCG dev software stacks (many useful packages used by LHC experiments, such as ROOT, Geant4, etc.)


### Lots of interesting SWAN examples: just click and run!

https://swan.web.cern.ch/content/basic-examples


<img src="images/jup15.png">

## How to start a SWAN session

* Go to https://swan.cern.ch

or

* Let's try to open the materials for our session in SWAN:

* Check https://github.com/oshadura/carpentries-root-training
* Click on the **Open in SWAN** icon
* Wait a bit for the magic to start! *(you will probably need to enter your CERN credentials)*
* Voila! (the materials are now cloned into your SWAN home directory, ready to be used)


<img src="images/jup11.png">

In the configuration form you have the following options:

* **Software stack:** LCG release that will be used to configure your environment. From your session, you will have access to all the software packages included in the LCG release that you selected.
    * Our choice is: **96python3**
* **Platform:** GCC compiler version.
    * Our choice is: **Centos7(gcc8)**
* **Environment script:** path to a script with extra environment configuration (see more below).
    * Our choice is: **nothing**
* **Number of cores** allocated to your session.
    * Our choice is: **default**
* **Memory (in GB)** allocated to your session.
    * Our choice is: **default**
* **Spark cluster** that you want to connect to your session.
    * Our choice is: **nothing**

The environment script is a bash shell script that you can write to define your environment variables or to perform other configuration actions. You can store this script in your CERNBox and refer to it through the `CERNBOX_HOME` environment variable, which automatically resolves to your home directory in SWAN, i.e. your CERNBox.
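
To illustrate how `CERNBOX_HOME` can be used to locate such a script, here is a small Python sketch. The file name `swan_env.sh` is hypothetical, and the fallback to the local home directory is an assumption added so that the snippet also runs outside SWAN, where the variable is normally not set.

```python
import os
from pathlib import Path


def environment_script_path(script_name="swan_env.sh"):
    """Build the path of an environment script stored in CERNBox.

    `swan_env.sh` is a hypothetical file name. If CERNBOX_HOME is not set
    (for example outside SWAN), fall back to the user's home directory.
    """
    cernbox_home = os.environ.get("CERNBOX_HOME", str(Path.home()))
    return Path(cernbox_home) / script_name


print(environment_script_path())
```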

Easy? **After selecting the values that you want, click on Start my Session to begin working with SWAN!**

<img src="images/swan1.png">

Let's click on any of the tutorials. Does it work?
<img src="images/swan2.png">

## Nice features in SWAN

* Command line support
* Easily change the Jupyter kernel (available: C++, Python 2/3, Octave 3, ROOT C++)
* Share your project with your colleagues
* Easily change the SWAN configuration for your notebook
* Check the SWAN **Help** section


### Easy command line support

`!echo $LD_LIBRARY_PATH`
or
`!pip install --user --upgrade coffea`
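
The `!` prefix is notebook syntax and only works inside a Jupyter/IPython cell. For comparison, a rough plain-Python equivalent is sketched below with the standard `os` and `subprocess` modules; the helper names `library_search_path` and `run` are made up for this example.

```python
import os
import subprocess
import sys


def library_search_path():
    """Return the entries of LD_LIBRARY_PATH as a list (empty if unset)."""
    raw = os.environ.get("LD_LIBRARY_PATH", "")
    return [entry for entry in raw.split(os.pathsep) if entry]


def run(command):
    """Run a command given as a list of arguments and return its standard output."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout.strip()


print(library_search_path())
# sys.executable is the interpreter running this code, so passing
# "-m", "pip", ... to it would target the same Python environment.
print(run([sys.executable, "--version"]))
```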


<img src="images/swancommand.png">

<img src="images/swan3.png" style="width: 300px;"/>

<img src="images/swan4.png" style="width: 300px;"/>

<img src="images/swan5.png" style="width: 300px;"/>

<img src="images/swan6.png" style="width: 300px;"/>

<img src="images/swan7.png" style="width: 300px;"/>

# Part 2
## A deeper look at the possibilities of Jupyter

* Nice features of Jupyter notebooks
* Other products from the Jupyter project
* How can we use them at CERN?

A Jupyter Notebook can be turned into a slide presentation, much like one made in Microsoft PowerPoint, except that you can run the slides’ code live! It’s really amazing how well it works.

<img src="images/jup5.png">

### RISE Jupyter plugin
*(just to show you an extra feature; you can try it later on your laptop)*

Reveal.js – Jupyter/IPython Slideshow Extension (RISE) is a plugin that uses *reveal.js* to make the slideshow run live. What that means is that you will now be able to run your code in the slideshow without exiting the slideshow. 
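
Under the hood, a notebook is a JSON file, and the slide role of each cell is stored in that cell's metadata under `slideshow.slide_type`. The sketch below builds a tiny three-cell notebook as a Python dictionary to show where this setting lives; the cell contents and the helper `make_cell` are invented for illustration, and in practice you would set slide types from the notebook interface rather than by hand.

```python
import json


def make_cell(cell_type, source, slide_type):
    """Build one notebook cell tagged with a slideshow slide type."""
    cell = {
        "cell_type": cell_type,
        "metadata": {"slideshow": {"slide_type": slide_type}},
        "source": source,
    }
    if cell_type == "code":
        # Code cells additionally carry an execution counter and outputs.
        cell.update({"execution_count": None, "outputs": []})
    return cell


notebook = {
    "cells": [
        make_cell("markdown", "# Title slide", "slide"),
        make_cell("code", "print('runs live during the talk')", "fragment"),
        make_cell("markdown", "Speaker-only remarks", "notes"),
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

print(json.dumps(notebook, indent=1))
```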

* How to install RISE on your local computer: https://rise.readthedocs.io/en/maint-5.6/

<img src="images/presenthub.png">

* In SWAN, the RISE plugin is available by default!


## Jupyter and Binder

A Binder (also called a Binder-ready repository) is a code repository that contains at least two things:

* Code or content that you’d like people to run. This might be a Jupyter Notebook that explains an idea, or a physics analysis script with histograms/visualisations/results.
* Configuration files for your environment. These files are used by Binder to build the environment needed to run your code. 

How to access: https://mybinder.org/


In order to prepare your repository for use with the BinderHub at mybinder.org, all you need to do is ensure that the following conditions are met:

* The repository is in a public location online (e.g., on GitHub, GitLab or Bitbucket)
* The repository does not require any personal or sensitive information (such as passwords)
* The repository has configuration files that specify its environment (see below for an example)


In our case, we have two important ingredients:

* Content files: our Jupyter Notebooks (`.ipynb`).
* An environment configuration file: `environment.yml` is a standard file that specifies a conda environment.
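
To give an idea of what such a configuration file looks like, the sketch below writes a minimal `environment.yml` from Python. The environment name and the package list are hypothetical and are not the actual file used in this training; the helper `write_environment_file` is also made up for the example.

```python
import tempfile
from pathlib import Path

# Hypothetical, minimal conda environment; name and packages are illustrative.
ENVIRONMENT_YML = """\
name: example-training
channels:
  - conda-forge
dependencies:
  - python=3
  - notebook
"""


def write_environment_file(directory):
    """Write environment.yml into `directory` and return its path."""
    path = Path(directory) / "environment.yml"
    path.write_text(ENVIRONMENT_YML)
    return path


# Write into a scratch directory so that no real repository is modified.
with tempfile.TemporaryDirectory() as scratch:
    written = write_environment_file(scratch)
    print(written.read_text())
```

In a real repository the file would sit at the top level, next to the notebooks.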


### Why might we need Binder?

* For example, you may want to share results with someone who doesn't have a CERN account (and therefore no access to SWAN)
    * Check https://github.com/oshadura/carpentries-root-training
    * I have also prepared our training course for Binder: click on the **Open in Binder** button and enjoy it!
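
A Binder launch button is essentially a link to mybinder.org that encodes the repository location. The sketch below shows how such a link can be assembled for a public GitHub repository; the helper `binder_url` is made up for this example, and using `HEAD` as the Git reference is an assumption (a branch name, tag or commit hash could be used instead).

```python
def binder_url(owner, repo, ref="HEAD"):
    """Build a mybinder.org launch link for a public GitHub repository."""
    return f"https://mybinder.org/v2/gh/{owner}/{repo}/{ref}"


print(binder_url("oshadura", "carpentries-root-training"))
# prints https://mybinder.org/v2/gh/oshadura/carpentries-root-training/HEAD
```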


## JupyterHub

* JupyterHub is the best way to serve Jupyter notebooks to multiple users.
* It can be used for a class of students, a corporate data science group or a scientific research group.
* It is a multi-user Hub that spawns, manages, and proxies multiple instances of the single-user Jupyter notebook server.



<img src="images/jh.png">

### Use cases

* CERN

* Academic Institutions, Research Labs, and Supercomputer Centers

* Private companies

https://jupyterhub.readthedocs.io/en/stable/gallery-jhub-deployments.html

## CERN JupyterHub

https://hub.cern.ch/

### Together with a CERN Binder for CERN JupyterHub
You can try generating your own Docker image to be used for the notebook server:
https://binder.cern.ch/

<img src="images/hub0.png">

<center>**...or even on a GPU!**</center>
<img src="images/hub1.png" style="width: 500px;"/>

### Example of Basic ML classification: Classify images of clothing

The example is taken from: https://www.tensorflow.org/tutorials/keras/classification

*(I just tried it for fun, you can try it later!)*

<img src="images/hub2.png">

<img src="images/hub6.png" style="width: 500px;"/>

<img src="images/hub4.png" style="width: 500px;"/>

<img src="images/hub5.png" style="width: 500px;"/>

<center>If you were wondering about the results on a CPU-only CERN Binder instance: yes, it was slower!</center>
<img src="images/hub7.png">

<center>Our goal here is to achieve **reproducibility** of our results!</center>

<img src="images/funnyjup.jpg" style="width: 500px;"/>

<center>And not to spend ages setting up software!</center>
<img src="images/funny.jpg" style="width: 500px;"/>