<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px">

# AWS: setting up an ec2 instance with anaconda and jupyter 

---

Here are the steps to connect a jupyter notebook to an active ec2 instance. First, launch the ec2 instance and ssh in as before.

## Installing anaconda on ec2

We follow the instructions from Chris Albon, using Python 3.

### Install Anaconda

In the ec2 shell, type:
```bash
sudo yum update
```

Download the Python 3 installer:

```bash
wget http://repo.continuum.io/archive/Anaconda3-5.1.0-Linux-x86_64.sh
```

Once it has downloaded, type:

```bash 
bash Anaconda3-5.1.0-Linux-x86_64.sh
```

- Accept the anaconda license terms.
- When asked about confirming the installation location, say yes.
- When asked about adding anaconda to path, say yes.
- Type

```bash 
source .bashrc
```

Check if the default python is now anaconda:

```bash 
which python
```

If the returned path contains anaconda, you can stop here and go to the next step. If not, type:

```bash
vim .bashrc
```

You are now in the vim editor, so be careful what you type. Follow these steps:

- Type `i` to enter insert mode.
- Paste the following line into the section of the file that says "User specific aliases and functions":

  `export PATH="/home/ec2-user/anaconda3/bin:$PATH"`

- Leave insert mode by pressing the `esc` key.
- Save and quit by typing `:wq`. You have now left vim.
- Type

```bash 
source .bashrc
```

Test by typing

```bash
which python
```

and anaconda should be returned.
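The same check can be made from inside Python. The snippet below is a minimal sketch: `looks_like_anaconda` is a hypothetical helper name, and it only assumes that the install directory contains the word "anaconda", as it does with the default location used above.

```python
import sys


def looks_like_anaconda(executable_path):
    """Return True if the interpreter path sits inside an anaconda directory."""
    return "anaconda" in executable_path.lower()


# sys.executable is the full path of the interpreter running this code.
# When started with plain `python`, it should match what `which python` shows.
print(sys.executable, looks_like_anaconda(sys.executable))
```

If the second value printed is `False`, the shell is probably still picking up the system python, and the `.bashrc` edit above is worth rechecking.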

Now you have anaconda fully working. You can run python scripts and import the packages provided by anaconda.

## Launching a jupyter notebook on ec2

To access a jupyter notebook running on your ec2 instance we need a few more steps, most of them to do with security.

Go into python by typing

```bash
python
```
Once you are in python, type

```python
from IPython.lib import passwd
passwd()
> Enter password:
> Verify password:
```
- Type an easy password. 
- You will get back a hashed key. Save both the password and the hashed key in a text file somewhere; you will need them.
- Exit the python interpreter with `quit()`.
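To see why the hashed key is safe to put in a config file while the password itself is not, here is a rough sketch of a salted hash. It assumes the `algorithm:salt:digest` layout that `passwd()` returned in IPython versions of this era; `make_hash` and `check_hash` are hypothetical names for illustration and are not part of IPython.

```python
import hashlib
import secrets


def make_hash(password, algorithm="sha1", salt_len=12):
    """Return 'algorithm:salt:digest' for the given password (illustrative only)."""
    salt = secrets.token_hex(salt_len // 2)  # random hex string of salt_len characters
    digest = hashlib.new(algorithm, (password + salt).encode("utf-8")).hexdigest()
    return ":".join((algorithm, salt, digest))


def check_hash(hashed, password):
    """Recompute the digest with the stored salt and compare it to the stored one."""
    algorithm, salt, digest = hashed.split(":", 2)
    candidate = hashlib.new(algorithm, (password + salt).encode("utf-8")).hexdigest()
    return secrets.compare_digest(candidate, digest)


hashed = make_hash("my easy password")
print(hashed)
print(check_hash(hashed, "my easy password"))
```

Because the salt is random, hashing the same password twice gives two different strings, which is why you need to keep the exact hashed key you were given.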

Make a new directory and create a self-signed certificate in it:
```bash
mkdir certificates
cd certificates
sudo openssl req -x509 -nodes -days 365 -newkey rsa:1024 -keyout mycert.pem -out mycert.pem
```

This will ask some slightly personal questions! You can answer them or leave them blank. For example, I gave country (UK), city (London), and first name, but not email address.

Type:

```bash 
cd ..
```

You should be back in your main directory. Now type:

```bash
jupyter notebook --generate-config
cd .jupyter/
vim jupyter_notebook_config.py
```

We are back in vim; follow the steps:
- type `i`
- paste:

```python
c = get_config()
# Notebook config: this is where you saved your pem cert
c.NotebookApp.certfile = u'/home/ec2-user/certificates/mycert.pem'
c.NotebookApp.password = u'that_hashed_password_you_returned_earlier'
# Run on all IP addresses of your instance
c.NotebookApp.ip = '*'
# Don't open browser by default
c.NotebookApp.open_browser = False
# Fix port to 8888
c.NotebookApp.port = 8888
```

- Press the `esc` key and type `:wq`
    
To be able to access jupyter notebooks launched from the ec2 instance in the browser, we have to modify some further security settings:

Go to your AWS console and check the page that summarises your ec2 instances currently running (which includes your public DNS).

Look down the sidebar on the left. Under "Network & Security", the first link is "Security Groups". Click that.

Click on the group (launch-wizard-1 in this example) and click "Actions".

Select "Edit inbound rules". A popup appears in which you can add the following rule to launch-wizard-1 (add it alongside the existing rule; do not replace it):

```Custom TCP Rule | TCP | 8888 | Custom 0.0.0.0/0```

and save.
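For reference, the same rule can be written as a Python dictionary in the shape the AWS API expects. This is only a sketch: `jupyter_ingress_rule` is a hypothetical helper, and the commented-out call assumes that `boto3` is installed, that AWS credentials are configured, and that you substitute your own security group id for the placeholder.

```python
def jupyter_ingress_rule(port=8888, cidr="0.0.0.0/0"):
    """Build one inbound TCP rule: a single port, open to the given CIDR range."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": cidr}],
    }


rule = jupyter_ingress_rule()
print(rule)

# Applying it programmatically (assumption: boto3 and credentials are available):
#   import boto3
#   boto3.client("ec2").authorize_security_group_ingress(
#       GroupId="sg-xxxxxxxx",  # placeholder, replace with your group id
#       IpPermissions=[rule],
#   )
```

`FromPort` and `ToPort` are equal because the rule opens one port only, matching the single 8888 entry in the console.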

Note that if you run a jupyter notebook, shut it down, and then run another one, it may consider port 8888 to be still occupied and switch to 8889. You could add further rules to be able to access further ports, but that is optional. When you launch the notebook it will tell you which port it is running on.
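If you want to know in advance whether port 8888 is free, you can try to bind to it from Python. The sketch below uses only the standard library; `port_is_free` and `first_free_port` are hypothetical helper names, and the sequential search only mirrors the idea of moving on to 8889, not jupyter's exact retry logic.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(start=8888, tries=10):
    """Return the first free port in start..start+tries-1, or None if all are taken."""
    for port in range(start, min(start + tries, 65536)):
        if port_is_free(port):
            return port
    return None


print(first_free_port(8888))
```

Run on the ec2 instance, this tells you which port a new notebook is likely to get, and therefore which port needs an inbound rule.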

Back in the ec2 shell, first make sure you are in your home directory:

```bash
cd ..
pwd
> /home/ec2-user/
```

Launch the notebook:

```bash
jupyter notebook
```

It should give output stating that the notebook is running on `https://[all_ip_addresses_on_your_system]:8888/`

Open Chrome (NOT Safari) and enter the address:
`https://replace_with_your_public_DNS:8888`

You should get a warning that the connection is unsafe. Tell it to proceed and you should be prompted for the password (the unhashed version) you entered before.

Type that in and you should now be in your familiar notebook environment. If you save files (e.g. write to a csv), they will be saved to the ec2 directory from which you launched the jupyter notebook, here the home directory. The notebook itself will be saved there too.
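The reason files land there is that relative paths are resolved against the current working directory of the notebook process. A small sketch to confirm where a file will end up (`save_rows` is a hypothetical helper and `results.csv` is a made-up file name):

```python
import csv
import os


def save_rows(filename, rows):
    """Write rows to a csv file and return the absolute path it was saved to."""
    with open(filename, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)
    return os.path.abspath(filename)


# The directory that relative paths are resolved against:
print(os.getcwd())

# Example call (writes a small file into that directory):
#   save_rows("results.csv", [["a", "b"], [1, 2]])
```

Passing an absolute path instead, such as one under `/home/ec2-user/`, avoids any doubt about where the file goes.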

Owing to processes not fully closing, your port 8888 may be occupied. To deal with this you can type in your ec2 shell:

```bash 
ps aux | grep -i notebook
```

which will return a list of processes (if you have any). Those that reference "/home/ec2-user/anaconda3/bin/jupyter-notebook" are notebook processes still running. Each line has the format "ec2-user some_number some_other_text", where some_number is the process ID (probably four or five digits), the second field on the line after "ec2-user". You can kill this process with:

```bash
kill -9 some_number
```

Once you have done this for every notebook still running, you can start jupyter notebook again and it should run on 8888 (since this port is no longer occupied).
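Picking the process IDs out of the `ps aux` listing by eye gets tedious, so here is a sketch that extracts them from the text. `notebook_pids` is a hypothetical helper, and the sample listing is made up for illustration; real output will have different numbers and columns of varying width.

```python
def notebook_pids(ps_output, marker="jupyter-notebook"):
    """Return the PIDs (second field) of lines in `ps aux` output containing marker."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        if marker in line and len(fields) > 1 and fields[1].isdigit():
            pids.append(int(fields[1]))
    return pids


# Made-up sample of what `ps aux | grep -i notebook` might print.
sample = (
    "ec2-user  3012  0.5  4.1 123456 42000 pts/0 Sl 10:01 0:02 "
    "/home/ec2-user/anaconda3/bin/python /home/ec2-user/anaconda3/bin/jupyter-notebook\n"
    "ec2-user  3100  0.0  0.1 110512  2100 pts/0 S+ 10:05 0:00 grep --color=auto -i notebook\n"
)
print(notebook_pids(sample))
```

The `grep` line is skipped because it contains "notebook" but not "jupyter-notebook". Each returned number is what you would pass to `kill -9`.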

## An aside on running processes with screen

Let's say you want to leave a script running for a long time, and you want it to be independent of your local computer. Currently if you run processes in your ec2 shell and then close that terminal window, the process will stop. The same goes for a jupyter notebook; it will stop if the terminal window is closed.

If you run a python script, you can leave it running in the background by launching it with

```bash
python pythonscript.py &
```

Alternatively, in the ec2 shell, type:

```bash
screen
```

You are now in a screened process. If you run your script and type "ctrl-a d" (i.e. press "ctrl" and "a" at the same time, then immediately afterwards press "d"), you should be back in your standard window. Now you can close the terminal window and your process will still run in the background (which was not true before). To return to a screened process, type:

```bash
screen -r
```

Note that this will work even if you close the original terminal window and ssh back in to the ec2 instance. To check the screens running:

```bash 
screen -ls
```

and you can kill any screens left running with

```bash
screen -X -S full_name_of_screen_returned_by_screen_ls_command quit
```
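The full screen name needed by the quit command is the `number.name` entry in the `screen -ls` listing. The sketch below pulls those names out of the text; `list_screens` is a hypothetical helper, and the sample listing is made up, on the assumption that each session appears on an indented line ending in `(Attached)` or `(Detached)`.

```python
import re

# One session per indented line: "<pid>.<name>", optional extra columns, then the state.
SESSION_LINE = re.compile(
    r"^[ \t]+(\d+\.\S+)[ \t].*\((Attached|Detached)\)[ \t]*$", re.MULTILINE
)


def list_screens(screen_ls_output):
    """Return (full_name, state) pairs found in the text printed by `screen -ls`."""
    return SESSION_LINE.findall(screen_ls_output)


# Made-up sample of what `screen -ls` might print.
sample_listing = (
    "There are screens on:\n"
    "\t4521.pts-0.ip-172-31-5-10\t(Detached)\n"
    "\t4890.training\t(Attached)\n"
    "2 Sockets in /var/run/screen/S-ec2-user.\n"
)
for name, state in list_screens(sample_listing):
    print(name, state)
```

Each name printed is what goes in place of `full_name_of_screen_returned_by_screen_ls_command`.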

See also [Screening](https://uisapp2.iu.edu/confluence-prd/pages/viewpage.action?pageId=115540034).

> **Test:** See what happens if you close the browser while a notebook is running. Can you re-access the notebook, and will it have continued working?