# Using Virtual Environments

- Difference between virtual environments (virtualenv or venv), containers, and virtual machines
- Creating a virtual environment
- Activating and deactivating virtualenvs
- Installing packages in a virtual environment
- Using requirements.txt files for reproducibility
- Using wheels for faster installations
- Cleaning up virtualenvs

# Virtual Machines, Containers, and Virtualenvs

## Virtual Machine

- Isolated image ("guest") of a computer running its own operating system
- Can have different OS than the host
- Examples: VMware, VirtualBox, Amazon EC2

## Container

- Partially isolated environment that shares the operating system with the host
- Same OS kernel as host, but with potentially different users/libraries/networks/etc. and system limits
- Examples: Docker, Heroku, Amazon ECS

## Virtualenv

- Partially isolated **Python** environment running in the same OS as the 'host'
- **Only** Python packages and environment are isolated:  basically a "private copy of Python"
- No _security_ isolation from host OS: a program running inside a virtualenv can do whatever a program running outside a virtualenv can do
- Similar to the local `node_modules` subfolder in a project for JavaScript developers
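
One way to see the "private copy of Python" idea from inside Python is to compare `sys.prefix` with `sys.base_prefix`. The sketch below assumes an environment created by `venv`; the helper name `in_virtualenv` is made up for illustration.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when this interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment folder, while
    # sys.base_prefix still points at the Python installation it was built from.
    return sys.prefix != sys.base_prefix


print("prefix:     ", sys.prefix)
print("base prefix:", sys.base_prefix)
print("in a virtualenv?", in_virtualenv())
```

Run with the system Python, the two prefixes should match; run with a virtualenv's Python, they should differ.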

# Why Virtualenvs?

- **Dependency management** - If two different Python applications require two different versions of the same package, running each app in its own virtualenv allows both versions to be available to their respective applications
- **Keeps your system Python pristine** - Many OSes use Python to implement some of the OS tooling (Red Hat in particular). This often results in an older version of Python, or particular versions of Python packages installed globally that you *should not modify* if you want your system tools to keep working.
- **Helps with reproducibility** - Virtualenvs allow you to record the versions of all packages installed in your venv so you can recreate the virtualenv on another machine. This helps you avoid the "works on my machine" problem.

# Creating a virtualenv


## Installing virtualenv

Since Python 3.3, Python has included a tool to create virtual environments called `venv` in the standard library. 

On Debian and Ubuntu, however, `venv` is packaged separately, so you may need to install it with `sudo apt-get install python3-venv`.



## Creating the virtualenv

To create a virtual environment, you invoke the `venv` module with the name of the folder that should hold the virtualenv:

```shell
$ python -m venv env-folder
```

This command

- creates a folder named `env-folder`
- places a copy of (or, by default on Linux and macOS, a symlink to) the Python you used to invoke `venv` into that folder
- creates a couple of helper scripts inside `env-folder` to activate/deactivate the virtualenv
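
The same steps can also be driven from Python with the standard library's `venv.create`. This is a minimal sketch, not a replacement for the command above: the helper name `create_env` is hypothetical, the environment goes into a temporary directory, and `with_pip=False` is chosen only to keep the example fast (the command-line form installs `pip` by default).

```python
import tempfile
import venv
from pathlib import Path


def create_env(env_dir: Path) -> Path:
    """Create a virtual environment at env_dir and return its path."""
    # with_pip=False skips bootstrapping pip, so nothing is installed here.
    venv.create(env_dir, with_pip=False)
    return env_dir


with tempfile.TemporaryDirectory() as tmp:
    env_dir = create_env(Path(tmp) / "env-folder")
    # pyvenv.cfg records which Python installation the environment is based on.
    print((env_dir / "pyvenv.cfg").read_text())
    print(sorted(p.name for p in env_dir.iterdir()))
```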

In [None]:
!/usr/bin/python3 -m venv data/env-folder

We can inspect the directory structure that `venv` created with the `tree` command. If you don't have `tree`, you can install it on a Mac using Homebrew:

```bash
$ brew install tree
```

In [None]:
%%bash 
tree -d data/env-folder

In [None]:
!ls -l data/env-folder/bin

### Windows note

If you are using Windows, there should be a `Scripts` folder under the environment folder instead of `bin`, and it should contain an `activate.bat` file.

You can invoke the Python in your new virtualenv by specifying the full path:

In [None]:
!data/env-folder/bin/python --version

In [None]:
!data/env-folder/bin/python -c 'import sys; print(sys.executable)'

In [None]:
!data/env-folder/bin/python -c 'import sys; print(sys.path)'
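
The same check can be scripted. The sketch below builds a throwaway environment in a temporary directory and asks its interpreter for `sys.prefix` through `subprocess`. The helper names `env_python` and `env_prefix` are made up for illustration, and the `Scripts`/`bin` split follows the Windows note above.

```python
import os
import subprocess
import tempfile
import venv
from pathlib import Path


def env_python(env_dir: Path) -> Path:
    """Return the path of the interpreter inside a virtual environment."""
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def env_prefix(env_dir: Path) -> str:
    """Ask the environment's own interpreter for its sys.prefix."""
    result = subprocess.run(
        [str(env_python(env_dir)), "-c", "import sys; print(sys.prefix)"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "env-folder"
    venv.create(env_dir, with_pip=False)
    # The reported prefix should be the environment folder itself.
    print(env_prefix(env_dir))
```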

## Activating virtual environments

More commonly, we will *activate* the virtualenv for our current shell by `source`-ing the `activate` script:

### Linux and macOS

```shell
$ source env-folder/bin/activate
(env-folder) $
```

or

```shell
$ . env-folder/bin/activate
(env-folder) $
```

### Windows

```shell
c:\...> env-folder\Scripts\activate.bat
(env-folder) c:\...>
```

Activating the virtualenv does a few things to your *current shell/terminal window only*:

- Puts the virtualenv's executable folder (`bin` or `Scripts`) at the beginning of your `PATH` so the virtualenv's Python is picked up automatically
- Changes your prompt so you see that you are in the virtualenv
- Makes a `deactivate` command available to undo the changes
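
The environment changes listed above can be modelled as a plain function over an environment mapping. This is a rough sketch rather than what `activate` literally runs: `activated_environ` is a hypothetical helper, the prompt change is left out, and the `PYTHONHOME` removal mirrors what the stock `activate` script does.

```python
import os
from pathlib import Path


def activated_environ(env_dir, environ):
    """Return a copy of environ changed roughly the way `activate` changes a shell."""
    env_path = Path(env_dir).resolve()
    bin_dir = env_path / ("Scripts" if os.name == "nt" else "bin")
    new_env = dict(environ)
    new_env["VIRTUAL_ENV"] = str(env_path)
    # Prepend the environment's executable folder so its python is found first.
    old_path = environ.get("PATH", "")
    new_env["PATH"] = os.pathsep.join(p for p in (str(bin_dir), old_path) if p)
    # PYTHONHOME would override the environment, so it is dropped.
    new_env.pop("PYTHONHOME", None)
    return new_env


env = activated_environ("env-folder", os.environ)
print(env["PATH"].split(os.pathsep)[0])
```

Passing the returned mapping as `env=` to `subprocess.run` gives a child process the "activated" view without touching the current shell.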

## Deactivating virtual environments

### Linux and macOS

```shell
(env-folder) $ deactivate
$
```

### Windows

```shell
(env-folder) c:\...> deactivate
c:\...> 
```


In [None]:
%%bash
echo "ACTIVATE"
source data/env-folder/bin/activate
which python
echo My prompt is now $PS1
python -c 'import sys; print(sys.executable)'
echo "DEACTIVATE"
deactivate
which python
python -c 'import sys; print(sys.executable)'

# Installing packages in virtual environments

When the virtual environment is activated, or when you invoke the version of Python in the virtualenv, you can install third-party packages into the virtualenv without modifying your system Python:

In [None]:
%%bash
set -e
source data/env-folder/bin/activate
which python
pip install -U pip
pip install numpy
python -c 'import numpy; print(numpy)'
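
After an install, it can be useful to confirm which copy of a package Python would actually import. The sketch below uses `importlib.util.find_spec`; the helper name `module_location` is made up for illustration, and it only handles top-level module names.

```python
import importlib.util


def module_location(name):
    """Return the file a top-level module would be imported from, or None if it is not installed."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


for name in ("json", "numpy"):
    print(name, "->", module_location(name))
```

Run with the virtualenv's Python, a third-party package such as `numpy` should resolve to a path under the environment folder's `site-packages`, while standard-library modules still come from the base installation.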

# Using requirements.txt for reproducibility

Once your app is running in its virtualenv, you may need to reproduce that virtualenv on another machine.
`pip` has a `freeze` command that outputs the exact versions of all packages installed in a virtualenv:

In [None]:
%%bash
set -e
source data/env-folder/bin/activate
data/env-folder/bin/pip freeze
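
To get a feel for where this information comes from, the standard library's `importlib.metadata` can produce a similar listing. This is only an approximation of `pip freeze`: the helper name `freeze_lines` is hypothetical, and `pip` applies extra rules (for example, it omits some packaging tools by default and handles editable installs) that this sketch ignores.

```python
from importlib import metadata


def freeze_lines():
    """Return sorted 'name==version' lines for distributions visible to this interpreter."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries with missing metadata
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


for line in freeze_lines():
    print(line)
```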

Normally, we'll put this into a file `requirements.txt` that we check into source control and distribute with our project:

In [None]:
%%bash
set -e
source data/env-folder/bin/activate
pip freeze > data/requirements.txt

In [None]:
cat data/requirements.txt

Once we have the requirements.txt file, we can create a new virtualenv and install all the same versions of packages into it:

In [None]:
%%bash
set -e
python -m venv data/env-folder-2
source data/env-folder-2/bin/activate
python -m pip install -r data/requirements.txt
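
Because `pip freeze` writes simple `name==version` lines, the file is easy to inspect programmatically, for instance to compare pins between two machines. The parser below is a minimal sketch that assumes only that simple form; `parse_pins` is a hypothetical helper, the sample contents are invented, and the full requirements file syntax (options, URLs, extras) is deliberately skipped.

```python
def parse_pins(text):
    """Map package name -> pinned version for simple `name==version` lines."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0]          # drop comments
        line = line.split(";", 1)[0].strip()  # drop environment markers
        if not line or line.startswith("-") or "==" not in line:
            continue  # skip blanks, options such as -r/-e, and unpinned entries
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


# Hypothetical file contents, for illustration only.
sample = """\
# produced by pip freeze
numpy==1.26.4
simplejson==3.19.2

"""
print(parse_pins(sample))
```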

# Using wheels for faster installations

While `pip` tries to cache as much data as possible, we can do even better by using "wheels." 

Wheels are pre-built Python packages: any compiled extensions have already been built for a particular platform, so installing a wheel skips the build step and is much faster than installing from source.

If you're moving to a new machine (for instance, when deploying to production) it can also be useful to have the wheels cached locally so `pip` doesn't try to download the packages from the Python Package Index.
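
The target a wheel was built for is encoded in its filename, which follows the pattern `{distribution}-{version}(-{build})-{python tag}-{abi tag}-{platform tag}.whl`. The sketch below splits a filename into those parts; `parse_wheel_filename` and the `example_pkg` filenames are made up for illustration.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelName:
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        dist, version, py, abi, plat = parts
        build = None
    elif len(parts) == 6:  # the optional build tag is present
        dist, version, build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    return WheelName(dist, version, build, py, abi, plat)


for name in (
    "example_pkg-1.0-py3-none-any.whl",                       # pure Python
    "example_pkg-1.0-cp312-cp312-manylinux_2_17_x86_64.whl",  # compiled
):
    print(parse_wheel_filename(name))
```

A platform tag of `any` marks a pure-Python wheel that installs anywhere, while a tag such as `manylinux_2_17_x86_64` ties the wheel to one operating system and CPU architecture.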

In [None]:
%%bash
set -e
source data/env-folder/bin/activate
pip install scipy scikit-learn jupyter simplejson pymongo wheel
pip freeze > data/requirements.txt

In [None]:
cat data/requirements.txt

In [None]:
%%bash
set -e
source data/env-folder/bin/activate
pip wheel -w data/wheelhouse -r data/requirements.txt

In [None]:
ls data/wheelhouse

Now we can distribute the `data/wheelhouse` directory with our project and install everything from the wheelhouse instead of fetching it from PyPI:

In [None]:
%%bash
set -e
source data/env-folder-2/bin/activate
pip install --no-index -f data/wheelhouse -r data/requirements.txt

# Cleaning up virtualenvs

Since a virtualenv is just a directory, we can 'clean it up' by removing the directory:

In [None]:
!rm -r data/env-folder data/env-folder-2 data/wheelhouse data/requirements.txt
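
When cleanup is scripted, a small guard can reduce the risk of deleting the wrong directory. The sketch below assumes that a `pyvenv.cfg` file at the top of a folder is a good enough sign that it is a virtualenv; `remove_env` is a hypothetical helper.

```python
import shutil
import tempfile
import venv
from pathlib import Path


def remove_env(env_dir: Path) -> None:
    """Delete env_dir, but only if it looks like a virtual environment."""
    if not (env_dir / "pyvenv.cfg").is_file():
        raise ValueError(f"{env_dir} does not look like a virtual environment")
    shutil.rmtree(env_dir)


with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "env-folder"
    venv.create(env_dir, with_pip=False)
    remove_env(env_dir)
    print("still exists?", env_dir.exists())
```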

In [None]:
!python -m venv --help

# Lab

Open [virtualenv lab][virtualenv-lab]

[virtualenv-lab]: ./virtualenv-lab.ipynb