# Using Virtual Environments

- Difference between virtual environments (virtualenv or venv), containers, and virtual machines
- Creating a virtual environment
- Activating and deactivating virtualenvs
- Installing packages in a virtual environment
- Using requirements.txt files for reproducibility
- Using wheels for faster installations
- Cleaning up virtualenvs

# Virtual Machines, Containers, and Virtualenvs

## Virtual Machine

- Isolated image ("guest") of a computer running its own operating system
- Can have different OS than the host
- Examples: VMware, VirtualBox, Amazon EC2

## Container

- Partially isolated environment that shares the operating system with the host
- Same OS kernel as host, but with potentially different users/libraries/networks/etc. and system limits
- Examples: Docker, Heroku, Amazon ECS

## Virtualenv

- Partially isolated **Python** environment running in the same OS as the 'host'
- **Only** Python packages and the Python environment are isolated: basically a "private copy of Python"
- No _security_ isolation from the host OS: a program running inside a virtualenv can do whatever a program running outside a virtualenv can do
- Similar to the local `node_modules` subfolder in a project, for JavaScript developers
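
Because a virtualenv is only a different Python prefix, a program can tell whether it is running in one by comparing two attributes of `sys`. The following is a minimal sketch, assuming an environment created by `venv` (or a recent `virtualenv`); the function name `in_virtualenv` is made up for illustration.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv-style environment."""
    # In a venv, sys.prefix points at the environment folder, while
    # sys.base_prefix still points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a virtualenv:", in_virtualenv())
```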

# Why Virtualenvs?

- **Dependency management** - If two different Python applications require two different versions of the same package, running each app in its own virtualenv allows both versions to be available to their respective applications
- **Keeps your system Python pristine** - Many OSes use Python to implement some of the OS tooling (RedHat in particular). This often results in an older version of Python, or particular versions of Python packages, being installed globally. You *should not modify* these if you want your system tools to keep working.
- **Helps with reproducibility** - Virtualenvs allow you to record the versions of all packages installed in your venv so that you can recreate the virtualenv on another machine. This prevents the "Works on My Machine" certification.

# Creating a virtualenv


## Installing virtualenv

Since Python 3.3, Python has included a tool to create virtual environments called `venv` in the standard library. 

If, however, you are developing on Ubuntu, the `venv` module is packaged separately, so you must install it yourself with `sudo apt-get install python3-venv`.



## Creating the virtualenv

To create a virtual environment, you invoke the `venv` module with the name of the folder that will hold the virtualenv:

```shell
$ python -m venv env-folder
```

This command

- creates a folder named `env-folder`
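- copies (or, on some platforms, symlinks) the Python you used to invoke `venv` into that folder
- installs `pip` into the new environment
- creates helper scripts inside `env-folder` to activate the virtualenv (activating it also provides a `deactivate` command)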
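
The same steps can be driven from Python through the standard library's `venv` module. This is a rough sketch rather than a replacement for the command line: the helper name `create_env` is hypothetical, and it passes `with_pip=False` so that it runs quickly and offline, whereas `python -m venv` installs `pip` by default.

```python
import tempfile
import venv
from pathlib import Path


def create_env(env_dir):
    """Create a virtual environment and return the names of its top-level entries."""
    env_dir = Path(env_dir)
    # with_pip=False skips bootstrapping pip (an assumption made to keep the sketch fast).
    venv.create(env_dir, with_pip=False)
    return sorted(p.name for p in env_dir.iterdir())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Expect a pyvenv.cfg file plus bin/ (or Scripts/ on Windows), include/ and lib/.
        print(create_env(Path(tmp) / "env-folder"))
```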

In [1]:
!which python

/Users/rick446/.virtualenvs/productionalizing-notebooks/bin/python


In [2]:
!python -m venv data/env-folder

We can see the directory structure of the new virtual environment with the `tree` command. If you don't have `tree`, you can install it on a Mac using Homebrew:

```bash
$ brew install tree
```

In [3]:
!tree -d data/env-folder

data/env-folder
├── bin
├── include
└── lib
    └── python3.7
        └── site-packages
            ├── __pycache__
            ├── pip
            │   ├── __pycache__
            │   ├── _internal
            │   │   ├── __pycache__
            │   │   ├── cli
            │   │   │   └── __pycache__
            │   │   ├── commands
            │   │   │   └── __pycache__
            │   │   ├── models
            │   │   │   └── __pycache__
            │   │   ├── operations
            │   │   │   └── __pycache__
            │   │   ├── req
            │   │   │   └── __pycache__
            │   │   ├── utils
            │   │   │   └── __pycache__
            │   │   └── vcs
            │   │       └── __pycache__
            │   └── _vendor
            │       ├── __pycache__
            │       ├── cachecontrol
            │       │   ├── __pycache__
            │       │   └── caches
            │       │       └── __pycache__
            │       ├─

In [4]:
ls data/env-folder/bin

activate          easy_install*     pip3*             python3@
activate.csh      easy_install-3.7* pip3.7*
activate.fish     pip*              python@


### Windows note

If you are using Windows, there should be a `Scripts` folder under the environment folder instead of `bin`, and it should contain an `activate.bat` file.

You can invoke the Python in your new virtualenv by specifying the full path:

In [5]:
!data/env-folder/bin/python --version

Python 3.7.2


In [6]:
!data/env-folder/bin/python -c 'import sys; print(sys.executable)'

/Users/rick446/src/arborian-classes/data/env-folder/bin/python


## Activating virtual environments

More commonly, we will *activate* the virtualenv for our current shell by `source`-ing the `activate` script:

### Linux and macOS

```shell
$ source env-folder/bin/activate
(env-folder) $
```

or

```shell
$ . env-folder/bin/activate
(env-folder) $
```

### Windows

```shell
c:\...> env-folder\Scripts\activate.bat
(env-folder) c:\...>
```

Activating the virtualenv does a few things to your *current shell/terminal window only*:

- Puts the virtualenv's executable folder (`bin` or `Scripts`) at the beginning of your `PATH` so the virtualenv's `python` is picked up automatically
- Changes your prompt so you see that you are in the virtualenv
- Makes a `deactivate` command available to undo the changes
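
Since activation only edits environment variables, its effect can be approximated for a child process without any shell. The sketch below is an illustration under assumptions, not a copy of the real `activate` script: `activated_environ` is a hypothetical helper, and it ignores the prompt change entirely.

```python
import os
from pathlib import Path


def activated_environ(env_dir, base_environ):
    """Return a copy of base_environ changed roughly the way `activate` changes a shell."""
    env_dir = Path(env_dir).resolve()
    # `bin` on Linux/macOS, `Scripts` on Windows.
    scripts = env_dir / ("Scripts" if os.name == "nt" else "bin")
    environ = dict(base_environ)
    environ["VIRTUAL_ENV"] = str(env_dir)
    # Prepending to PATH is what makes a bare `python` resolve to the virtualenv's copy.
    environ["PATH"] = str(scripts) + os.pathsep + environ.get("PATH", "")
    # The activate scripts also unset PYTHONHOME so it cannot redirect the interpreter.
    environ.pop("PYTHONHOME", None)
    return environ


if __name__ == "__main__":
    env = activated_environ("env-folder", os.environ)
    print(env["PATH"].split(os.pathsep)[0])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` to run a command as if the virtualenv were active, leaving the current process untouched.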

## Deactivating virtual environments

### Linux and macOS

```shell
(env-folder) $ deactivate
$
```

### Windows

```shell
(env-folder) c:\...> deactivate
c:\...> 
```


In [7]:
%%bash
source data/env-folder/bin/activate
which python
echo My prompt is now $PS1
python -c 'import sys; print(sys.executable)'
deactivate
which python
python -c 'import sys; print(sys.executable)'

/Users/rick446/src/arborian-classes/src/data/env-folder/bin/python
My prompt is now (env-folder)
/Users/rick446/src/arborian-classes/data/env-folder/bin/python
/Users/rick446/.virtualenvs/productionalizing-notebooks/bin/python
/Users/rick446/.virtualenvs/productionalizing-notebooks/bin/python


# Installing packages in virtual environments

When the virtual environment is activated, or when you invoke the version of Python in the virtualenv, you can install third-party packages into the virtualenv without modifying your system Python:

In [8]:
%%bash
set -e
source data/env-folder/bin/activate
which python
python -m pip install -U pip
python -m pip install numpy
python -c 'import numpy; print(numpy)'

/Users/rick446/src/arborian-classes/src/data/env-folder/bin/python
Collecting pip
  Using cached https://files.pythonhosted.org/packages/d7/41/34dd96bd33958e52cb4da2f1bf0818e396514fd4f4725a79199564cd0c20/pip-19.0.2-py2.py3-none-any.whl
Installing collected packages: pip
  Found existing installation: pip 18.1
    Uninstalling pip-18.1:
      Successfully uninstalled pip-18.1
Successfully installed pip-19.0.2
Collecting numpy
  Using cached https://files.pythonhosted.org/packages/46/e4/4a0cc770e4bfb34b4e10843805fef67b9a94027e59162a586c776f35c5bb/numpy-1.16.1-cp37-cp37m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl
Installing collected packages: numpy
Successfully installed numpy-1.16.1
<module 'numpy' from '/Users/rick446/src/arborian-classes/data/env-folder/lib/python3.7/site-packages/numpy/__init__.py'>
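
Printing the module, as above, shows where it was imported from. A small sketch of the same check that does not import the package, using the standard library's `importlib.util.find_spec`; the helper name `module_location` is made up for illustration.

```python
import importlib.util


def module_location(name):
    """Return the file a top-level module would be loaded from, or None if it is not installed."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # origin is the path of the module file (or a marker such as "built-in").
    return spec.origin


if __name__ == "__main__":
    for name in ("json", "numpy"):
        print(name, "->", module_location(name))
```

Inside an activated virtualenv, third-party packages should resolve to a path under the environment's `site-packages` folder.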


# Using requirements.txt for reproducibility

Once your app is running in its virtualenv, you may need to reproduce the virtualenv on another machine.
`pip` has a command `freeze` which outputs the exact versions of all packages installed in a virtualenv:

In [9]:
%%bash
set -e
source data/env-folder/bin/activate
pip freeze

numpy==1.16.1
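
To get a feel for what `freeze` reports, here is a rough approximation built on the standard library's `importlib.metadata` (Python 3.8+). It is only a sketch: the function name `frozen_requirements` is hypothetical, and it applies none of pip's own filtering or its special handling of editable installs.

```python
from importlib import metadata


def frozen_requirements():
    """Return sorted `name==version` strings for distributions visible to this interpreter."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip broken installs that have no readable metadata
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    print("\n".join(frozen_requirements()))
```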


Normally, we'll put this into a file `requirements.txt` that we check into source control and distribute with our project:

In [10]:
%%bash
set -e
source data/env-folder/bin/activate
pip freeze > data/requirements.txt

Once we have the requirements.txt file, we can create a new virtualenv and install all the same versions of packages into it:

In [11]:
%%bash
set -e
python -m venv data/env-folder-2
source data/env-folder-2/bin/activate
pip install -r data/requirements.txt

Collecting numpy==1.16.1 (from -r data/requirements.txt (line 1))
  Using cached https://files.pythonhosted.org/packages/46/e4/4a0cc770e4bfb34b4e10843805fef67b9a94027e59162a586c776f35c5bb/numpy-1.16.1-cp37-cp37m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl
Installing collected packages: numpy
Successfully installed numpy-1.16.1


You are using pip version 18.1, however version 19.0.2 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
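
One way to confirm that the new environment really matches the pins is to compare the file against what is installed. The sketch below assumes a requirements file made only of `name==version` lines, as `pip freeze` produced above; `parse_pins` and `mismatches` are hypothetical helpers, and options, URLs, and environment markers are not handled.

```python
from importlib import metadata


def parse_pins(text):
    """Map package name to pinned version for each `name==version` line."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line or "==" not in line:
            continue  # this sketch ignores anything that is not a simple pin
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def mismatches(pins):
    """Return {name: (pinned, installed)} for pins the current environment does not satisfy."""
    problems = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != pinned:
            problems[name] = (pinned, installed)
    return problems


if __name__ == "__main__":
    print(mismatches(parse_pins("numpy==1.16.1\n")))
```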


# Using wheels for faster installations

While `pip` tries to cache as much data as possible, we can do even better by using "wheels." 

Wheels are pre-built Python packages: any compiled extensions have already been built for a particular target platform, so installing a wheel is mostly a matter of unpacking files, which is much faster than building from source.
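
The target a wheel was built for is encoded in its filename as a series of hyphen-separated tags (name, version, an optional build tag, Python tag, ABI tag, platform tag). As a minimal sketch, the hypothetical helper `parse_wheel_filename` below splits a filename into those parts; it does no validation beyond counting the fields.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its hyphen-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return {
        "name": name,
        "version": version,
        "build": build,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
        # ABI "none" and platform "any" mean nothing platform-specific is inside.
        "pure_python": abi_tag == "none" and platform_tag == "any",
    }


if __name__ == "__main__":
    for fn in ("boto3-1.9.94-py2.py3-none-any.whl",
               "pymongo-3.7.2-cp37-cp37m-macosx_10_9_x86_64.whl"):
        print(parse_wheel_filename(fn))
```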

If you're moving to a new machine (for instance, when deploying to production) it can also be useful to have the wheels cached locally so `pip` doesn't try to download the packages from the Python Package Index.

Note that the `sklearn` name installed below is a legacy alias on PyPI; the canonical name of that package is `scikit-learn`, which is the name to use in new projects.

In [12]:
%%bash
set -e
source data/env-folder/bin/activate
pip install scipy sklearn jupyter simplejson pymongo boto3 wheel
pip freeze > data/requirements.txt

Collecting scipy
  Using cached https://files.pythonhosted.org/packages/dd/6c/ccf7403d14f0ab0f20ce611696921f204f4ffce99a4fd383c892a6a7e9eb/scipy-1.2.1-cp37-cp37m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl
Collecting sklearn
Collecting jupyter
  Using cached https://files.pythonhosted.org/packages/83/df/0f5dd132200728a86190397e1ea87cd76244e42d39ec5e88efd25b2abd7e/jupyter-1.0.0-py2.py3-none-any.whl
Collecting simplejson
Collecting pymongo
  Using cached https://files.pythonhosted.org/packages/d7/ac/d2e324c1f9bcf653fa106785371a16b4709506a35b04948655de8b961a85/pymongo-3.7.2-cp37-cp37m-macosx_10_9_x86_64.whl
Collecting boto3
  Using cached https://files.pythonhosted.org/packages/95/bd/8dc8a8484c49a4c75f3ae0b67e503397e8b932cc3278bd8fa1584aa568b2/boto3-1.9.94-py2.py3-none-any.whl
Collecting wheel
  Using cached https://files.pythonhosted.org/packages/7c/d7/20bd3c501f53fdb0b7387e75c03bd1fce748a1c3dd342fc53744e28e3de1/wheel-0.33.0-py2.py3-n

In [13]:
%%bash
set -e
source data/env-folder/bin/activate
pip wheel -w data/wheelhouse -r data/requirements.txt

Collecting appnope==0.1.0 (from -r data/requirements.txt (line 1))
  Using cached https://files.pythonhosted.org/packages/87/a9/7985e6a53402f294c8f0e8eff3151a83f1fb901fa92909bb3ff29b4d22af/appnope-0.1.0-py2.py3-none-any.whl
  Saved /Users/rick446/src/arborian-classes/data/wheelhouse/appnope-0.1.0-py2.py3-none-any.whl
Collecting backcall==0.1.0 (from -r data/requirements.txt (line 2))
  Saved /Users/rick446/src/arborian-classes/data/wheelhouse/backcall-0.1.0-cp37-none-any.whl
Collecting bleach==3.1.0 (from -r data/requirements.txt (line 3))
  Using cached https://files.pythonhosted.org/packages/ab/05/27e1466475e816d3001efb6e0a85a819be17411420494a1e602c36f8299d/bleach-3.1.0-py2.py3-none-any.whl
  Saved /Users/rick446/src/arborian-classes/data/wheelhouse/bleach-3.1.0-py2.py3-none-any.whl
Collecting boto3==1.9.94 (from -r data/requirements.txt (line 4))
  Using cached https://files.pythonhosted.org/packages/95/bd/8dc8a8484c49a4c75f3ae0b67e503397e8b932cc3278bd8fa1584aa568b2/boto3-1.9.94-py2

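Now we can distribute the `data/wheelhouse` directory with our project and install everything from the wheelhouse without fetching anything from PyPI. Here `--no-index` tells `pip` to ignore the package index, and `-f` (`--find-links`) points it at the local directory: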

In [14]:
%%bash
set -e
source data/env-folder-2/bin/activate
pip install --no-index -f data/wheelhouse -r data/requirements.txt

Looking in links: data/wheelhouse
Collecting appnope==0.1.0 (from -r data/requirements.txt (line 1))
Collecting backcall==0.1.0 (from -r data/requirements.txt (line 2))
Collecting bleach==3.1.0 (from -r data/requirements.txt (line 3))
Collecting boto3==1.9.94 (from -r data/requirements.txt (line 4))
Collecting botocore==1.12.94 (from -r data/requirements.txt (line 5))
Collecting decorator==4.3.2 (from -r data/requirements.txt (line 6))
Collecting defusedxml==0.5.0 (from -r data/requirements.txt (line 7))
Collecting docutils==0.14 (from -r data/requirements.txt (line 8))
Collecting entrypoints==0.3 (from -r data/requirements.txt (line 9))
Collecting ipykernel==5.1.0 (from -r data/requirements.txt (line 10))
Collecting ipython==7.2.0 (from -r data/requirements.txt (line 11))
Collecting ipython-genutils==0.2.0 (from -r data/requirements.txt (line 12))
Collecting ipywidgets==7.4.2 (from -r data/requirements.txt (line 13))
Collecting jedi==0.13.2 (from -r data/requirements.txt (line 14))
Co

# Cleaning up virtualenvs

Since a virtualenv is just a directory, we can 'clean it up' by removing the directory:

In [15]:
!rm -r data/env-folder data/env-folder-2 data/wheelhouse data/requirements.txt
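
Because `rm -r` will delete whatever path it is given, a scripted cleanup may want a sanity check first. The sketch below relies on the `pyvenv.cfg` file that `venv` writes at the top of each environment; the helper name `remove_virtualenv` is hypothetical.

```python
import shutil
from pathlib import Path


def remove_virtualenv(env_dir):
    """Delete env_dir, but only if it looks like a venv-created environment."""
    env_dir = Path(env_dir)
    # Refuse to delete a folder that lacks the marker file written by venv.
    if not (env_dir / "pyvenv.cfg").is_file():
        raise ValueError(f"{env_dir} does not look like a virtualenv (no pyvenv.cfg)")
    shutil.rmtree(env_dir)


if __name__ == "__main__":
    import tempfile
    import venv

    tmp = Path(tempfile.mkdtemp()) / "env-folder"
    venv.create(tmp, with_pip=False)
    remove_virtualenv(tmp)
    print("still exists:", tmp.exists())
```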

# Lab

Open [virtualenv lab][virtualenv-lab]

[virtualenv-lab]: ./virtualenv-lab.ipynb