# Working with JupyterHub and Jupyter notebooks

This class will involve a lot of work with data using Python 3. To ensure that we're all using the same Python environment, we will be using JupyterHub to interact with a shared Virtual Machine (VM). You can choose whether to work on assignments and exercises in JupyterHub or on your computer, but I will distribute class materials through this service and you will submit your assignments here.

## What is JupyterHub?

JupyterHub allows multiple users to run Jupyter notebooks using shared resources. In our case, that means we're all sharing processors and RAM, and it also means that we're all running the same installation of Python (via Anaconda).

JupyterHub uses a browser to provide a GUI* for interacting with the VM that it is running on.  
\**This is the same GUI that Jupyter Notebook uses if you run notebooks on your local machine.*
![GUI_image](./resources/GUI_image.png)


Using JupyterHub, we can edit file paths and directory structure (i.e., rename items, create new folders, and move items around). 
![GUI_file_edit_image](./resources/GUI_file_edit_image.png)
We can also upload new files and create new plaintext files (which can be used to make CSV, JSON, and other plain-text file types encoded in UTF-8). If we need to, we can also open a terminal and use normal Linux bash commands.
![GUI_operations_image](./resources/GUI_operations_image.png)
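
Plaintext files like these can also be written from Python rather than typed by hand. The following is a minimal sketch, not a required workflow: the file names, the temporary folder, and the example rows are all hypothetical stand-ins. It writes the same data as a CSV and as a JSON file, both explicitly encoded in UTF-8.

```python
import csv
import json
import tempfile
from pathlib import Path

# Hypothetical example rows; the accented names show that UTF-8 handles them.
rows = [{"name": "Zoë", "score": 90}, {"name": "José", "score": 85}]

# A temporary folder (an assumption for this sketch) avoids cluttering your workspace.
folder = Path(tempfile.mkdtemp())
csv_path = folder / "scores.csv"
json_path = folder / "scores.json"

# newline="" lets the csv module control line endings itself.
with open(csv_path, "w", encoding="utf-8", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "score"])
    writer.writeheader()
    writer.writerows(rows)

# ensure_ascii=False keeps accented characters readable in the file.
with open(json_path, "w", encoding="utf-8") as f:
    json.dump(rows, f, ensure_ascii=False, indent=2)

print(csv_path.read_text(encoding="utf-8"))
print(json_path.read_text(encoding="utf-8"))
```

Passing `encoding="utf-8"` every time you open a text file means the result does not depend on the default encoding of whatever machine runs the code.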

 

## Jupyter Notebooks

The most common thing we will do with JupyterHub, though, is create and edit Python 3 notebooks.
![GUI_create_notebook_image](./resources/GUI_create_notebook_image.png)

The document you are reading now is a Jupyter notebook running a Python 3 kernel.
By using notebooks instead of some other coding environment (like an interactive Python session), we can mix text (in Markdown syntax) with Python code and the output of Python code.
Because we are using JupyterHub to run notebooks instead of running them on our local machines, we don't have to worry about installation and setup. If you want to install and set up Jupyter Notebook on your computer, though, you can do so via Anaconda or pip (see [this blog post](https://medium.com/codingthesmartway-com-blog/getting-started-with-jupyter-notebook-for-python-4e7082bd5d46) for more instructions).

### Cells
Notebooks organize their contents into cells. Each cell can contain multiple lines, but all the lines in a cell must be in the same format (i.e., a cell must be only Markdown or only Python code). We have to specify the format of each cell. 
By default, new cells are code cells (Python, in our case).  
![Code_format](./resources/Code_format.png)  


In [1]:
print("I can write Python code in this notebook, too!")

I can write Python code in this notebook, too!


To switch to Markdown, we need to select "Markdown" instead of "Code" in the toolbar at the top of the window.  
![change_to_markdown](./resources/change_to_markdown.png)

We can also run some bash commands in the notebook by prefixing each command in a code cell with `!`.

In [None]:
!pwd
!ls -lh
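
If you would rather stay in Python, the standard library can report similar information. This is a rough sketch, not an exact replacement for the bash commands above: it prints the current folder (like `pwd`) and then one line per entry with its size in bytes (a simplified version of what `ls -lh` shows).

```python
from pathlib import Path

# Same information as !pwd: the folder the notebook is running in.
current_folder = Path.cwd()
print(current_folder)

# Roughly what !ls -lh shows: one line per entry, with its size in bytes.
# lstat() is used so that a broken symbolic link does not raise an error.
for entry in sorted(current_folder.iterdir()):
    kind = "dir " if entry.is_dir() else "file"
    print(kind, entry.lstat().st_size, entry.name)
```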

Python in Jupyter notebooks has autocomplete, which helps a lot with coding! To pull up options for autocomplete, press `Tab` at any time while typing in a code cell.  
Autocomplete works (at least in a limited fashion) with bash commands in the notebook, too, but it can be a bit confusing.

Any "code" cell will have an indicator on the side that lets you know whether the cell has been run. Cells that have not been run yet will have empty brackets, and cells that have been run will have a number inside the brackets. That number tells you the order in which run cells were executed.

![code_run_pre](./resources/code_run_pre.png)
![code_run_post](./resources/code_run_post.png)

### Running Python

When we are writing and running Python code in notebooks, we can think of cells as each containing a group of lines of code that are logically connected. In some cases, we might only write one line of code in a cell, but more often we'll write multiple lines.  
As you've probably noticed by now, we don't run individual lines in the notebook. Instead, we run cells. That means that if there is more than one line of Python code in a cell, all of those lines of code will be executed when we run that cell. 

In [None]:
addition = "Adding numbers together"
seven = 2 + 5

In [None]:
print(addition)
print(seven)
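
The second cell above can use `addition` and `seven` because every cell in a notebook runs in the same kernel and shares its variables. The sketch below is only an illustrative model of that behavior, not how Jupyter is actually implemented: each "cell" is a string, and the dictionary `kernel_namespace` (a hypothetical name) stands in for the kernel's memory.

```python
# A dictionary standing in for the variables the kernel remembers.
kernel_namespace = {}

cell_1 = 'addition = "Adding numbers together"\nseven = 2 + 5'
cell_2 = 'result = f"{addition}: {seven}"'

# Running the cells in order works: cell_2 sees what cell_1 defined.
exec(cell_1, kernel_namespace)
exec(cell_2, kernel_namespace)
print(kernel_namespace["result"])

# Running cell_2 where cell_1 has never been run fails, because the
# variables it needs do not exist yet.
try:
    exec(cell_2, {})
except NameError as error:
    print("NameError:", error)
```

This is why the execution numbers beside each cell matter: if you run cells out of order, or restart the kernel, a cell may refer to a variable that has not been defined yet.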

You probably know that running some code produces output even without an explicit print statement. For example, run the next cell.

In [None]:
2+5

You might also know that if a cell contains more than one line that would produce output without an explicit print statement, only the last line's output is actually displayed. Remind yourself with the next cell.

In [None]:
2 + 5
4 * 2
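
One way to picture what happens in the cell above: every line runs, but only the value of the final expression is handed back for display. The function `run_cell` below is a hypothetical, simplified model written for illustration. It is not Jupyter's actual code, and it uses some Python features beyond what this class requires.

```python
import ast

def run_cell(source):
    """Run a cell's source and return the value a notebook would display."""
    tree = ast.parse(source)
    namespace = {}
    if not tree.body:
        return None
    *earlier, last = tree.body
    # Every statement before the last one runs, but its value is thrown away.
    exec(compile(ast.Module(body=earlier, type_ignores=[]), "<cell>", "exec"), namespace)
    if isinstance(last, ast.Expr):
        # A bare expression on the last line is evaluated and its value returned.
        return eval(compile(ast.Expression(body=last.value), "<cell>", "eval"), namespace)
    # Anything else (an assignment, for example) runs without producing a value.
    exec(compile(ast.Module(body=[last], type_ignores=[]), "<cell>", "exec"), namespace)
    return None

print(run_cell("2 + 5\n4 * 2"))   # prints 8: the value of 2 + 5 is discarded
print(run_cell("seven = 2 + 5"))  # prints None: an assignment has no value to show
```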

It's a good idea to include an explicit print statement any time we want the notebook to generate output.

In [None]:
print(2+5)
print(4*2)
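
Explicit print statements also give you a chance to label each value, which makes the output easier to read once a cell prints several things. A small sketch (the variable names here are just for illustration):

```python
total = 2 + 5
product = 4 * 2

# f-strings place the value of each variable inside the printed text.
print(f"2 + 5 = {total}")
print(f"4 * 2 = {product}")
```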

One of the great things about notebooks is that they can render visualizations generated in Python in the same window. In-line visualizations can be incredibly convenient, particularly as you are writing code and revising the visualization.

In [None]:
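import pandas as pd
import numpy as np

# Each dictionary becomes one row; "Even" records whether the number is even.
data1 = [{"Number": 1, "Even": "False"}, {"Number": 2, "Even": "True"},
        {"Number": 3, "Even": "False"}, {"Number": 4, "Even": "True"},
        {"Number": 5, "Even": "False"}, {"Number": 6, "Even": "True"},
        {"Number": 423, "Even": "False"}]
data1 = pd.DataFrame(data1)
# Put the columns in a fixed order.
data1 = data1[["Number", "Even"]]
# Count how many rows have each "Even" value, then draw the counts as a pie chart.
data1["Even"].value_counts().plot.pie()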
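
The pie chart above is drawn from the counts of each value in the "Even" column. To check what the chart should show, the same tally can be computed with only the standard library. This is a sketch for verification, not a replacement for pandas, and it derives the labels from the numbers instead of typing them by hand.

```python
from collections import Counter

numbers = [1, 2, 3, 4, 5, 6, 423]

# Label each number "True" or "False" depending on whether it is even.
labels = [str(n % 2 == 0) for n in numbers]

# Counter tallies how many times each label appears.
counts = Counter(labels)
total = sum(counts.values())

for label, count in counts.most_common():
    print(f"{label}: {count} of {total} ({count / total:.0%})")
```

With these seven numbers, the "False" slice should be slightly larger than the "True" slice (4 rows versus 3).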

## Transferring Files

Sometimes, you might find that you need to transfer files between your computer and JupyterHub. Jupyter makes that really easy.

To download a file, simply click on the box beside the file. A list of possible file actions will appear above the list of files. One of those possibilities is "download." If you click download, the file will be downloaded to your machine. Unfortunately, you can't select multiple files at once to download.

![download_files](./resources/download_image.png)
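
One possible workaround for the one-file-at-a-time limit is to bundle several files into a single zip archive and download that instead. The sketch below assumes the files live together in one folder. The folder name `my_results` and the files inside it are hypothetical stand-ins created just for this example.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical folder standing in for the files you want to download.
folder = Path(tempfile.mkdtemp()) / "my_results"
folder.mkdir()
(folder / "notes.txt").write_text("first file", encoding="utf-8")
(folder / "data.csv").write_text("a,b\n1,2\n", encoding="utf-8")

# make_archive adds the ".zip" extension itself and returns the archive's path.
# The archive is created beside the folder, not inside it.
archive_path = shutil.make_archive(str(folder), "zip", root_dir=folder)
print(archive_path)
```

After running something like this on a real folder, the resulting `.zip` file appears in the file list and can be downloaded like any other single file.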

Uploading files is just as easy. Once you're in the folder you want to upload the file to, just click "upload" in the top right, then follow the normal file upload process. 

![upload_image](./resources/upload_image.png)

Once you've selected the file you want to upload, Jupyter will give you a chance to change the name before it actually does the upload. **Make sure you click upload once you see a screen like this or your file won't actually be uploaded.**
![upload_part2_image](./resources/upload_part2_image.png)