# Using Virtual Environments

- Difference between virtual environments (virtualenv or venv), containers, and virtual machines
- Creating a virtual environment
- Activating and deactivating virtualenvs
- Installing packages in a virtual environment
- Using requirements.txt files for reproducibility
- Using wheels for faster installations
- Cleaning up virtualenvs

# Virtual Machines, Containers, and Virtualenvs

## Virtual Machine

- Isolated image ("guest") of a computer running its own operating system
- Can have different OS than the host
- Examples: VMware, VirtualBox, Amazon EC2

## Container

- Partially isolated environment that shares the operating system with the host
- Same OS kernel as host, but with potentially different users/libraries/networks/etc. and system limits
- Examples: Docker, Heroku, Amazon ECS

## Virtualenv

- Partially isolated **Python** environment running in the same OS as the 'host'
- **Only** Python packages and environment are isolated:  basically a "private copy of Python"
- No _security_ isolation from host OS: a program running inside a virtualenv can do whatever a program running outside a virtualenv can do
- Similar to the local `node_modules` subfolder in a project, for JavaScript developers

# Why Virtualenvs?

- **Dependency management** - If two different Python applications require two different versions of the same package, running each app in its own virtualenv allows both versions to be available to their respective applications
- **Keeps your system Python pristine** - Many OSes use Python to implement some of the OS tooling (Red Hat in particular). This often results in an older version of Python, or particular versions of Python packages, installed globally that you *should not modify* if you want your system tools to keep working.
- **Helps with reproducibility** - Virtualenvs let you record the versions of all packages installed in your venv so you can recreate the virtualenv on another machine. This helps you avoid the "Works on My Machine" problem.

# Creating a virtualenv


## Installing virtualenv

Since Python 3.3, Python has included a tool to create virtual environments called `venv` in the standard library.

On Debian and Ubuntu, however, `venv` is packaged separately from the base Python, so you may need to install it with `apt-get install python3-venv`.



## Creating the virtualenv

To create a virtual environment, you invoke the `venv` module with the virtualenv name:

```shell
$ python -m venv env-folder
```

This command

- creates a folder named `env-folder`
- copies (or, on most Unix-like systems, symlinks) the Python you used to invoke `venv` into that folder
- creates a few helper scripts inside `env-folder` to activate/deactivate the virtualenv
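
The same steps can also be driven from Python through the standard-library `venv` module. The sketch below is only an illustration of that: the helper name `create_env` is made up for this example, and `with_pip=False` is chosen just to keep the demo fast (the command-line tool installs `pip` by default).

```python
import os
import tempfile
import venv
from pathlib import Path


def create_env(env_dir):
    """Create a bare virtualenv at env_dir and return its executable folder.

    Hypothetical helper for illustration; with_pip=False skips installing pip.
    """
    venv.create(env_dir, with_pip=False)
    # Windows uses "Scripts"; other platforms use "bin".
    bin_name = "Scripts" if os.name == "nt" else "bin"
    return Path(env_dir) / bin_name


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        bin_dir = create_env(Path(tmp) / "env-folder")
        # List what venv placed in the executable folder.
        print(sorted(p.name for p in bin_dir.iterdir()))
```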

In [1]:
!/usr/bin/python -m venv data/env-folder

We can see the directory structure that the virtual environment created with the `tree` command. If you don't have `tree`, you can install it on a Mac using homebrew:

```bash
$ brew install tree
```

In [2]:
!tree -d data/env-folder

data/env-folder
├── bin
├── include
├── lib
│   └── python3.8
│       └── site-packages
│           ├── __pycache__
│           ├── pip
│           │   ├── __pycache__
│           │   ├── _internal
│           │   │   ├── __pycache__
│           │   │   ├── cli
│           │   │   │   └── __pycache__
│           │   │   ├── commands
│           │   │   │   └── __pycache__
│           │   │   ├── distributions
│           │   │   │   └── __pycache__
│           │   │   ├── index
│           │   │   │   └── __pycache__
│           │   │   ├── models
│           │   │   │   └── __pycache__
│           │   │   ├── network
│           │   │   │   └── __pycache__
│           

In [3]:
!ls -l data/env-folder/bin

total 44
-rw-r--r-- 1 rick446 rick446 8834 Jan 21 13:11 Activate.ps1
-rw-r--r-- 1 rick446 rick446 2246 Jan 21 13:11 activate
-rw-r--r-- 1 rick446 rick446 1298 Jan 21 13:11 activate.csh
-rw-r--r-- 1 rick446 rick446 2450 Jan 21 13:11 activate.fish
-rwxr-xr-x 1 rick446 rick446  279 Jan 21 13:11 easy_install
-rwxr-xr-x 1 rick446 rick446  279 Jan 21 13:11 easy_install-3.8
-rwxr-xr-x 1 rick446 rick446  270 Jan 21 13:11 pip
-rwxr-xr-x 1 rick446 rick446  270 Jan 21 13:11 pip3
-rwxr-xr-x 1 rick446 rick446  270 Jan 21 13:11 pip3.8
lrwxrwxrwx 1 rick446 rick446   15 Jan 21 13:11 python -> /usr/bin/python
lrwxrwxrwx 1 rick446 rick446    6 Jan 21 13:11 python3 -> python


### (Windows note)

If you are using Windows, there should be a `Scripts` folder under the environment folder instead of `bin`, and it should contain an `activate.bat` file.

You can invoke the Python in your new virtualenv by specifying the full path:

In [4]:
!data/env-folder/bin/python --version

Python 3.8.5


In [5]:
!data/env-folder/bin/python -c 'import sys; print(sys.executable)'

/home/rick446/src/arborian-classes/src/data/env-folder/bin/python


In [6]:
!data/env-folder/bin/python -c 'import sys; print(sys.path)'

['', '/usr/lib/python38.zip', '/usr/lib/python3.8', '/usr/lib/python3.8/lib-dynload', '/home/rick446/src/arborian-classes/src/data/env-folder/lib/python3.8/site-packages']
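
A running interpreter can also tell whether it lives in a virtual environment. A minimal sketch, relying on the fact that `venv` leaves `sys.base_prefix` pointing at the original installation while `sys.prefix` points at the environment folder (the function name `in_virtualenv` is just for this example):

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a venv."""
    # Outside a virtual environment the two prefixes are identical.
    return sys.prefix != sys.base_prefix


print("prefix:     ", sys.prefix)
print("base prefix:", sys.base_prefix)
print("in a virtualenv?", in_virtualenv())
```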


## Activating virtual environments

More commonly, we will *activate* the virtualenv for our current shell by `source`-ing the `activate` script:

### Linux

```shell
$ source env-folder/bin/activate
(env-folder) $
```

or

```shell
$ . env-folder/bin/activate
(env-folder) $
```

### Windows

```shell
c:\...> env-folder\Scripts\activate.bat
(env-folder) c:\...>
```

Activating the virtualenv does a few things to your *current shell/terminal window only*:

- Puts the virtualenv's executable folder (`bin` or `Scripts`) at the beginning of your path so the virtualenv python will be picked up automatically
- Changes your prompt so you see that you are in the virtualenv
- Makes a `deactivate` command available to undo the changes
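
To make the path change concrete, here is a rough sketch of the environment-variable edits that activation performs, written as a function that returns a modified copy of the environment rather than changing the current shell. The helper `activated_environ` is hypothetical (it is not part of `venv`), and it leaves out the prompt change entirely.

```python
import os
from pathlib import Path


def activated_environ(env_dir, base_env=None):
    """Return a copy of the environment with env_dir 'activated'.

    Illustrative helper, not part of venv: it mimics the PATH and
    VIRTUAL_ENV changes made by the activate script.
    """
    env = dict(os.environ if base_env is None else base_env)
    env_dir = Path(os.path.abspath(env_dir))
    bin_dir = env_dir / ("Scripts" if os.name == "nt" else "bin")
    env["VIRTUAL_ENV"] = str(env_dir)
    # Prepend so the virtualenv's python and pip are found first.
    env["PATH"] = str(bin_dir) + os.pathsep + env.get("PATH", "")
    # The real activate script also unsets PYTHONHOME.
    env.pop("PYTHONHOME", None)
    return env


env = activated_environ("env-folder", {"PATH": "/usr/bin"})
print(env["PATH"].split(os.pathsep)[0])
```

The resulting dictionary can be passed as `env=` to `subprocess.run` so that a child process sees the same `PATH` an activated shell would.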

## Deactivating virtual environments

### Linux

```shell
(env-folder) $ deactivate
$
```

### Windows

```shell
(env-folder) c:\...> deactivate
c:\...> 
```


In [7]:
%%bash
echo "ACTIVATE"
source data/env-folder/bin/activate
which python
echo My prompt is now $PS1
python -c 'import sys; print(sys.executable)'
echo "DEACTIVATE"
deactivate
which python
python -c 'import sys; print(sys.executable)'

ACTIVATE
/home/rick446/src/arborian-classes/src/data/env-folder/bin/python
My prompt is now (env-folder)
/home/rick446/src/arborian-classes/src/data/env-folder/bin/python
DEACTIVATE
/home/rick446/.virtualenvs/classes/bin/python
/home/rick446/.virtualenvs/classes/bin/python


# Installing packages in virtual environments

When the virtual environment is activated, or when you invoke the version of Python in the virtualenv, you can install third-party packages into the virtualenv without modifying your system Python:

In [8]:
%%bash
set -e
source data/env-folder/bin/activate
which python
pip install -U pip
pip install numpy
python -c 'import numpy; print(numpy)'

/home/rick446/src/arborian-classes/src/data/env-folder/bin/python
Looking in links: /home/rick446/src/wheelhouse
Collecting pip
  Using cached pip-20.3.3-py2.py3-none-any.whl (1.5 MB)
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 20.0.2
    Uninstalling pip-20.0.2:
      Successfully uninstalled pip-20.0.2
Successfully installed pip-20.3.3
Looking in links: /home/rick446/src/wheelhouse
Collecting numpy
  Downloading numpy-1.19.5-cp38-cp38-manylinux2010_x86_64.whl (14.9 MB)
Installing collected packages: numpy
Successfully installed numpy-1.19.5
<module 'numpy' from '/home/rick446/src/arborian-classes/src/data/env-folder/lib/python3.8/site-packages/numpy/__init__.py'>
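
Activation is a convenience, not a requirement: invoking the virtualenv's own interpreter with `-m pip` installs into that virtualenv as well. The sketch below only builds the command, assuming the standard `venv` folder layout; `venv_python` and `pip_install_command` are names invented for this example.

```python
import os
from pathlib import Path


def venv_python(env_dir):
    """Path to the interpreter inside env_dir (assumes the standard venv layout)."""
    if os.name == "nt":
        return Path(env_dir) / "Scripts" / "python.exe"
    return Path(env_dir) / "bin" / "python"


def pip_install_command(env_dir, *packages):
    """Build the command that installs packages into env_dir without activating it."""
    return [str(venv_python(env_dir)), "-m", "pip", "install", *packages]


cmd = pip_install_command("data/env-folder", "numpy")
print(cmd)
# To actually run it (needs an existing virtualenv and network access):
# import subprocess; subprocess.run(cmd, check=True)
```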


# Using requirements.txt for reproducibility

Once you have your app in your virtualenv running, you may need to reproduce the virtualenv on another machine. 
`pip` has a command `freeze` which outputs the exact versions of all packages installed in a virtualenv:

In [10]:
%%bash
set -e
source data/env-folder/bin/activate
data/env-folder/bin/pip freeze

-f /home/rick446/src/wheelhouse
numpy==1.19.5
pkg-resources==0.0.0


DEPRECATION: --find-links option in pip freeze is deprecated. pip 21.2 will remove support for this functionality. You can find discussion regarding this at https://github.com/pypa/pip/issues/9069.
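
For intuition about what `freeze` reports, the standard library (Python 3.8+) can enumerate installed distributions itself. This is only a rough approximation of `pip freeze`, not how pip implements it: the function name `freeze_lines` is made up, and the sketch ignores editable installs and option lines such as `-f`.

```python
from importlib import metadata


def freeze_lines():
    """Return 'name==version' lines for packages visible to this interpreter.

    Rough approximation of `pip freeze` for illustration only.
    """
    pins = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries whose metadata has no name
            pins[name] = dist.version
    return sorted((f"{name}=={version}" for name, version in pins.items()),
                  key=str.lower)


for line in freeze_lines():
    print(line)
```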


Normally, we'll put this into a file `requirements.txt` that we check into source control and distribute with our project:

In [11]:
%%bash
set -e
source data/env-folder/bin/activate
pip freeze > data/requirements.txt

DEPRECATION: --find-links option in pip freeze is deprecated. pip 21.2 will remove support for this functionality. You can find discussion regarding this at https://github.com/pypa/pip/issues/9069.


In [12]:
cat data/requirements.txt

-f /home/rick446/src/wheelhouse
numpy==1.19.5
pkg-resources==0.0.0
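
The file format is simple enough to inspect by hand or with a few lines of code. The sketch below is a deliberately simplified reader (the name `parse_pins` is invented here): it understands only exact `name==version` pins and skips option lines such as `-f ...`, whereas real requirements files allow much more.

```python
def parse_pins(text):
    """Map package name -> pinned version for 'name==version' lines.

    Simplified on purpose: option lines (such as '-f ...'), comments, and
    anything that is not an exact '==' pin are skipped.
    """
    pins = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "-")):
            continue
        name, sep, version = line.partition("==")
        if sep:
            pins[name.strip()] = version.strip()
    return pins


example = """\
-f /home/rick446/src/wheelhouse
numpy==1.19.5
pkg-resources==0.0.0
"""
print(parse_pins(example))
# {'numpy': '1.19.5', 'pkg-resources': '0.0.0'}
```

The `pkg-resources==0.0.0` entry is commonly reported as an artifact of Debian/Ubuntu-packaged Python rather than a real release on PyPI; if it causes an install failure on another machine, removing that line is the usual fix.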


Once we have the requirements.txt file, we can create a new virtualenv and install all the same versions of packages into it:

In [13]:
%%bash
set -e
python -m venv data/env-folder-2
source data/env-folder-2/bin/activate
python -m pip install -r data/requirements.txt

Looking in links: /home/rick446/src/wheelhouse, /home/rick446/src/wheelhouse
Collecting numpy==1.19.5
  Using cached numpy-1.19.5-cp38-cp38-manylinux2010_x86_64.whl (14.9 MB)
Installing collected packages: numpy
Successfully installed numpy-1.19.5


# Using wheels for faster installations

While `pip` tries to cache as much data as possible, we can do even better by using "wheels." 

Wheels are Python packages that have been compiled (if necessary) for a particular target architecture and are thus much faster to install. 

If you're moving to a new machine (for instance, when deploying to production) it can also be useful to have the wheels cached locally so `pip` doesn't try to download the packages from the Python Package Index.
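
A wheel's file name records what it was built for. As a sketch, the parser below splits a name into the dash-separated fields used by the wheel naming convention (name, version, optional build tag, Python tag, ABI tag, platform tag); `parse_wheel_filename` and `WheelName` are names chosen for this example, and the two sample file names are taken from the `pip` output elsewhere in this notebook.

```python
from collections import namedtuple

WheelName = namedtuple("WheelName", "name version build python abi platform")


def parse_wheel_filename(filename):
    """Split a wheel file name into its dash-separated tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename!r}")
    parts = filename[:-len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python, abi, platform = parts
        build = None
    elif len(parts) == 6:
        name, version, build, python, abi, platform = parts
    else:
        raise ValueError(f"unexpected wheel name: {filename!r}")
    return WheelName(name, version, build, python, abi, platform)


compiled = parse_wheel_filename("numpy-1.19.5-cp38-cp38-manylinux2010_x86_64.whl")
pure = parse_wheel_filename("six-1.15.0-py2.py3-none-any.whl")
print(compiled.platform)  # manylinux2010_x86_64
print(pure.platform)      # any
```

A platform tag of `any` marks a pure-Python wheel that works everywhere; a tag such as `manylinux2010_x86_64` marks one containing code compiled for that platform.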

In [14]:
%%bash
set -e
source data/env-folder/bin/activate
pip install scipy scikit-learn jupyter simplejson pymongo boto3 wheel
pip freeze > data/requirements.txt

Looking in links: /home/rick446/src/wheelhouse
Collecting boto3
  Downloading boto3-1.16.58-py2.py3-none-any.whl (130 kB)
Collecting botocore<1.20.0,>=1.19.58
  Downloading botocore-1.19.58-py2.py3-none-any.whl (7.2 MB)
Collecting jmespath<1.0.0,>=0.7.1
  Using cached jmespath-0.10.0-py2.py3-none-any.whl (24 kB)
Collecting python-dateutil<3.0.0,>=2.1
  Using cached python_dateutil-2.8.1-py2.py3-none-any.whl (227 kB)
Collecting s3transfer<0.4.0,>=0.3.0
  Downloading s3transfer-0.3.4-py2.py3-none-any.whl (69 kB)
Collecting six>=1.5
  Using cached six-1.15.0-py2.py3-none-any.whl (10 kB)
Collecting urllib3<1.27,>=1.25.4
  Using cached urllib3-1.26.2-py2.py3-none-any.whl (136 kB)
Collecting jupyter
  Using cached jupyter-1.0.0-py2.py3-none-any.whl (2.7 kB)
Collecting pymongo
  Using cached pymongo-3.11.2-cp38-cp38-manylinux2014_x86_64.whl (531 kB)
Collecting scikit-learn
  Downloading scikit_learn-0.24.1-cp38-cp38-manylinux2010_x86_64.whl (24.9 MB)
Collecting scipy
  Downloading scipy-1.6.0

DEPRECATION: --find-links option in pip freeze is deprecated. pip 21.2 will remove support for this functionality. You can find discussion regarding this at https://github.com/pypa/pip/issues/9069.


In [15]:
cat data/requirements.txt

-f /home/rick446/src/wheelhouse
argon2-cffi==20.1.0
async-generator==1.10
attrs==20.3.0
backcall==0.2.0
bleach==3.2.2
boto3==1.16.58
botocore==1.19.58
cffi==1.14.4
decorator==4.4.2
defusedxml==0.6.0
entrypoints==0.3
ipykernel==5.4.3
ipython==7.19.0
ipython-genutils==0.2.0
ipywidgets==7.6.3
jedi==0.18.0
Jinja2==2.11.2
jmespath==0.10.0
joblib==1.0.0
jsonschema==3.2.0
jupyter==1.0.0
jupyter-client==6.1.11
jupyter-console==6.2.0
jupyter-core==4.7.0
jupyterlab-pygments==0.1.2
jupyterlab-widgets==1.0.0
MarkupSafe==1.1.1
mistune==0.8.4
nbclient==0.5.1
nbconvert==6.0.7
nbformat==5.1.2
nest-asyncio==1.4.3
notebook==6.2.0
numpy==1.19.5
packaging==20.8
pandocfilters==1.4.3
parso==0.8.1
pexpect==4.8.0
pickleshare==0.7.5
pkg-resources==0.0.0
prometheus-client==0.9.0
prompt-toolkit==3.0.11
ptyprocess==0.7.0
pycparser==2.20
Pygments==2.7.4
pymongo==3.11.2
pyparsing==2.4.7
pyrsistent==0.17.3
python-dateutil==2.8.1
pyzmq==21.0.1
qtconsole==5.0.2
QtPy=

In [17]:
%%bash
set -e
source data/env-folder/bin/activate
pip wheel -w data/wheelhouse -r data/requirements.txt

Looking in links: /home/rick446/src/wheelhouse, /home/rick446/src/wheelhouse
Processing /home/rick446/src/wheelhouse/argon2_cffi-20.1.0-cp37-abi3-linux_x86_64.whl
Collecting async-generator==1.10
  Using cached async_generator-1.10-py3-none-any.whl (18 kB)
Collecting attrs==20.3.0
  Using cached attrs-20.3.0-py2.py3-none-any.whl (49 kB)
Collecting backcall==0.2.0
  Using cached backcall-0.2.0-py2.py3-none-any.whl (11 kB)
Collecting bleach==3.2.2
  Using cached bleach-3.2.2-py2.py3-none-any.whl (146 kB)
Collecting boto3==1.16.58
  Using cached boto3-1.16.58-py2.py3-none-any.whl (130 kB)
Collecting botocore==1.19.58
  Using cached botocore-1.19.58-py2.py3-none-any.whl (7.2 MB)
Collecting cffi==1.14.4
  Using cached cffi-1.14.4-cp38-cp38-manylinux1_x86_64.whl (411 kB)
Collecting decorator==4.4.2
  Using cached decorator-4.4.2-py2.py3-none-any.whl (9.2 kB)
Collecting defusedxml==0.6.0
  Using cached defusedxml-0.6.0-py2.py3-none-any.whl (23 kB)
Collecting entrypoints==0.3
  Using cached en

In [18]:
ls data/wheelhouse

Jinja2-2.11.2-py2.py3-none-any.whl
MarkupSafe-1.1.1-cp38-cp38-manylinux1_x86_64.whl
Pygments-2.7.4-py3-none-any.whl
QtPy-1.9.0-py2.py3-none-any.whl
Send2Trash-1.5.0-py3-none-any.whl
argon2_cffi-20.1.0-cp37-abi3-linux_x86_64.whl
async_generator-1.10-py3-none-any.whl
attrs-20.3.0-py2.py3-none-any.whl
backcall-0.2.0-py2.py3-none-any.whl
bleach-3.2.2-py2.py3-none-any.whl
boto3-1.16.58-py2.py3-none-any.whl
botocore-1.19.58-py2.py3-none-any.whl
cffi-1.14.4-cp38-cp38-manylinux1_x86_64.whl
decorator-4.4.2-py2.py3-none-any.whl
defusedxml-0.6.0-py2.py3-none-any.whl
entrypoints-0.3-py2.py3-none-any.whl
ipykernel-5.4.3-py3-none-any.whl
ipython-7.19.0-py3-none-any.whl
ipython_genutils-0.2.0-py2.py3-none-any.whl
ipywidgets-7.6.3-py2.py3-none-any.whl
jedi-0.18.0-py2.py3-none-any.whl
jmespath-0.10.0-py2.py3-none-any.whl
joblib-1.0.0-py3-none-any.whl
jsonschema-3.2.0-py2.py3-none-any.whl
jupyter-1.0.0-py2.py3-none-any.whl
jupyter_client-6.1.11-py3-none-any.whl
jupyter_console-

Now we can distribute the `data/wheelhouse` directory with our project and install everything from the wheelhouse without fetching anything from PyPI:

In [19]:
%%bash
set -e
source data/env-folder-2/bin/activate
pip install --no-index -f data/wheelhouse -r data/requirements.txt

Looking in links: /home/rick446/src/wheelhouse, data/wheelhouse, /home/rick446/src/wheelhouse
Processing /home/rick446/src/wheelhouse/argon2_cffi-20.1.0-cp37-abi3-linux_x86_64.whl
Processing /home/rick446/src/wheelhouse/async_generator-1.10-py3-none-any.whl
Processing /home/rick446/src/wheelhouse/attrs-20.3.0-py2.py3-none-any.whl
Processing /home/rick446/src/wheelhouse/backcall-0.2.0-py2.py3-none-any.whl
Processing /home/rick446/src/arborian-classes/data/wheelhouse/bleach-3.2.2-py2.py3-none-any.whl
Processing /home/rick446/src/arborian-classes/data/wheelhouse/boto3-1.16.58-py2.py3-none-any.whl
Processing /home/rick446/src/arborian-classes/data/wheelhouse/botocore-1.19.58-py2.py3-none-any.whl
Processing /home/rick446/src/arborian-classes/data/wheelhouse/cffi-1.14.4-cp38-cp38-manylinux1_x86_64.whl
Processing /home/rick446/src/wheelhouse/decorator-4.4.2-py2.py3-none-any.whl
Processing /home/rick446/src/wheelhouse/defusedxml-0.6.0-py2.py3-none-any.whl
Processing /home/rick446/src/wheelhous
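
Because `--no-index` fails if any pinned package is absent from the wheelhouse, it can help to check coverage before shipping. The sketch below is one possible check, not a pip feature: `normalize` and `missing_wheels` are names invented here, and the comparison looks only at project name and version, ignoring platform tags.

```python
import re
import tempfile
from pathlib import Path


def normalize(name):
    """Normalize a project name so 'jupyter_client' matches 'jupyter-client'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def missing_wheels(requirements_text, wheelhouse):
    """Return pinned requirements that have no matching wheel in wheelhouse.

    Simplified check: compares only name and version, not platform tags.
    """
    available = set()
    for wheel in Path(wheelhouse).glob("*.whl"):
        name, version = wheel.name.split("-")[:2]
        available.add((normalize(name), version))
    missing = []
    for raw in requirements_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "-")) or "==" not in line:
            continue
        name, version = (part.strip() for part in line.split("==", 1))
        if (normalize(name), version) not in available:
            missing.append(line)
    return missing


with tempfile.TemporaryDirectory() as tmp:
    # Empty placeholder file standing in for a real wheel.
    (Path(tmp) / "jupyter_client-6.1.11-py3-none-any.whl").touch()
    reqs = "jupyter-client==6.1.11\nscipy==1.6.0\n"
    print(missing_wheels(reqs, tmp))  # ['scipy==1.6.0']
```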

# Cleaning up virtualenvs

Since a virtualenv is just a directory, we can 'clean it up' by removing the directory:

In [20]:
!rm -r data/env-folder data/env-folder-2 data/wheelhouse data/requirements.txt
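
The same cleanup can be done from Python with `shutil.rmtree`. In the sketch below, the `pyvenv.cfg` check is an extra safety net added for this example (so a mistyped path does not delete an unrelated folder), and `remove_env` is a made-up helper name.

```python
import shutil
import tempfile
from pathlib import Path


def remove_env(env_dir):
    """Delete env_dir, but only if it looks like a virtualenv.

    The pyvenv.cfg check is a precaution added for this example.
    """
    env_dir = Path(env_dir)
    if not (env_dir / "pyvenv.cfg").is_file():
        raise ValueError(f"{env_dir} does not look like a virtualenv")
    shutil.rmtree(env_dir)


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a real virtualenv: a folder containing pyvenv.cfg.
    fake_env = Path(tmp) / "env-folder"
    fake_env.mkdir()
    (fake_env / "pyvenv.cfg").write_text("home = /usr/bin\n")
    remove_env(fake_env)
    print(fake_env.exists())  # False
```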

In [21]:
!python -m venv --help

usage: venv [-h] [--system-site-packages] [--symlinks | --copies] [--clear]
            [--upgrade] [--without-pip] [--prompt PROMPT]
            ENV_DIR [ENV_DIR ...]

Creates virtual Python environments in one or more target directories.

positional arguments:
  ENV_DIR               A directory to create the environment in.

optional arguments:
  -h, --help            show this help message and exit
  --system-site-packages
                        Give the virtual environment access to the system
                        site-packages dir.
  --symlinks            Try to use symlinks rather than copies, when symlinks
                        are not the default for the platform.
  --copies              Try to use copies rather than symlinks, even when
                        symlinks are the default for the platform.
  --clear               Delete the contents of the environment directory if it
                        already exists, before environment creation.
  -
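
These command-line options correspond to keyword arguments of `venv.EnvBuilder` in the standard library. As a small sketch (the helper `read_pyvenv_cfg` is invented for this example), the code below creates an environment with system site-packages enabled and reads the setting back from the `pyvenv.cfg` file that `venv` writes into the environment folder.

```python
import tempfile
import venv
from pathlib import Path


def read_pyvenv_cfg(env_dir):
    """Parse the 'key = value' lines of env_dir/pyvenv.cfg into a dict."""
    cfg = {}
    for line in (Path(env_dir) / "pyvenv.cfg").read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep:
            cfg[key.strip()] = value.strip()
    return cfg


with tempfile.TemporaryDirectory() as tmp:
    # Similar in spirit to:
    #   python -m venv --system-site-packages --without-pip <tmp>
    builder = venv.EnvBuilder(system_site_packages=True, with_pip=False)
    builder.create(tmp)
    print(read_pyvenv_cfg(tmp)["include-system-site-packages"])
```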

# Lab

Open [virtualenv lab][virtualenv-lab]

[virtualenv-lab]: ./virtualenv-lab.ipynb