# Virtual Environments and Dependency Management

Python applications will often use packages and modules that don't come as part of the standard library.

A single Python installation cannot always meet the requirements of every application that uses it: one application might need version 1 of package A while another needs version 2 of the same package, and Python does not support having both versions installed side by side in the same installation.

These common situations can be easily solved by creating a **virtual environment** &mdash; a self-contained directory tree that contains a Python installation for a particular version of Python, and a number of packages that are available for the applications running on it.

In this notebook we will describe how to create and manage Python virtual environments with different tools:
+ Using **conda** (preferred for less experienced Python users).
+ Using **venv**, the built-in Python tool (preferred for more experienced developers).

## Section 1 &mdash; Virtual Environments using `conda`

This section deals with the creation and management of virtual environments using **conda** and **miniconda**.

With a few simple commands, you will be able to streamline the activities related to Python environment creation.

| NOTE: |
| :---- |
| This section assumes you have `miniconda` installed in your system. If not, please refer to [00: First steps: Installing Python and intro to virtual environments](00_install-first-steps.ipynb). |

### Creating an environment

To create an environment type:

```bash
conda create --name env-name \
  python=x.y.z \
  {pkg_1} {pkg_2} {pkg3}
```

You can also create an environment with a particular version of a package using the variant:

```bash
conda create --name env-name \
  python=x.y.z \
  pkg_1=1.23.4 pkg_2 pkg3
```

| NOTE: |
| :---- |
| Not all Python versions may be available in `conda`. You can search for the ones you can use by typing `conda search "^python$"`. |

For example, you can create an environment with the corresponding tools that you need to interact with AWS:

```bash
(base) $ conda create --name aws \
  python=3.11.3
```

### Switching to a different environment

You can list the available environments in your system doing:

```bash
$ conda env list
# conda environments:
#
base                  *  /home/ubuntu/miniconda3
aws                      /home/ubuntu/miniconda3/envs/aws
```


Alternatively, you can also do:
```bash
$ conda info --envs
# conda environments:
#
base                  *  /home/ubuntu/miniconda3
aws                      /home/ubuntu/miniconda3/envs/aws
```

You can switch to a different conda environment by doing:

```bash
(base) $ conda deactivate
$ conda activate aws
(aws) $
```

### Listing the packages available in a specific environment

Use `conda list` to see the list of packages in the current environment:

```bash
(aws) $ conda list
# packages in environment at /home/ubuntu/miniconda3/envs/aws:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main
_openmp_mutex             5.1                       1_gnu
botocore                  1.29.140                 pypi_0    pypi
bzip2                     1.0.8                h7b6447c_0
ca-certificates           2023.01.10           h06a4308_0
git-remote-codecommit     1.16                     pypi_0    pypi
jmespath                  1.0.1                    pypi_0    pypi
ld_impl_linux-64          2.38                 h1181459_1
libffi                    3.4.4                h6a678d5_0
libgcc-ng                 11.2.0               h1234567_1
libgomp                   11.2.0               h1234567_1
libstdcxx-ng              11.2.0               h1234567_1
libuuid                   1.41.5               h5eee18b_0
ncurses                   6.4                  h6a678d5_0
openssl                   1.1.1t               h7f8727e_0
pip                       23.1.2                   pypi_0    pypi
python                    3.11.3               h7a1cb2a_0
python-dateutil           2.8.2                    pypi_0    pypi
readline                  8.2                  h5eee18b_0
setuptools                66.0.0          py311h06a4308_0
six                       1.16.0                   pypi_0    pypi
sqlite                    3.41.2               h5eee18b_0
tk                        8.6.12               h1ccaba5_0
tzdata                    2023c                h04d1e81_0
urllib3                   1.26.16                  pypi_0    pypi
wheel                     0.38.4          py311h06a4308_0
xz                        5.4.2                h5eee18b_0
zlib                      1.2.13               h5eee18b_0
```

### Installing packages in a specific environment

To install packages in a particular environment you can use `conda install`:

```bash
conda install --name <your-env> <package>
```

For example, to install `pip` in your `aws` environment you should do:

```bash
conda install --name aws pip
```

If you omit the environment name, the package will be installed into the current environment:

```bash
(aws) $ conda install pip
```

You can specify the desired version of the package with the syntax:

```bash
(aws) $ conda install pkg_name=x.y.z
```

You can search for packages from your browser in the URL: https://anaconda.org/

In the results, you will see the package owner, as well as the *channel* in which it is published.

For example, `cudatoolkit` is a package published in the `conda-forge` channel.

![Anaconda package search](pics/anaconda-package-search.png)

| NOTE: |
| :---- |
| Conda channels are the locations where packages are stored. Conda packages are downloaded from remote channels, which are URLs to directories containing conda packages. |

In order to install a package from a particular channel, you need to do:

```bash
conda install -c conda-forge cudatoolkit
```

You can remove packages doing:

```bash
conda remove package-name
```

You can remove all of the packages in an environment, and the environment itself, using:

```bash
conda remove --name env-name --all
```

| NOTE: |
| :---- |
| The command to remove an environment is the same you use to remove packages. |


### Removing/Deleting a previously created environment

You can use `conda remove` to delete a conda environment from your system.

For example, to remove the `aws` environment recently created do:

```bash
(base) $ conda remove --name aws --all
```

### Associating certain environment variables to a conda environment

`conda` provides the *config API* to associate environment variables to a particular environment.

To list the variables defined in the current environment do:

```bash
conda env config vars list
```

To set an environment variable use:

```bash
(aws) $ conda env config vars set DEFAULT_REGION=us-east-1
To make your changes take effect please reactivate your environment

# Reactivate your environment and then...
(aws)$ conda env config vars list
DEFAULT_REGION = us-east-1
```

To unset a previously set variable use:
```bash
(aws) $ conda env config vars unset DEFAULT_REGION
To make your changes take effect please reactivate your environment

# Reactivate your environment and then...
(aws)$ conda env config vars list
(aws)$
```
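Once the environment has been reactivated, a variable set this way is visible to any process started from it. The following is a minimal sketch of reading it from Python, assuming the `DEFAULT_REGION` variable used in the examples above. The helper name `get_region` and the fallback value `eu-west-1` are hypothetical and only serve the illustration:

```python
import os


def get_region(default="eu-west-1"):
    """Return DEFAULT_REGION from the process environment, or a fallback."""
    # Variables attached to the active conda environment are exported on
    # activation, so they show up in os.environ like any other variable.
    return os.environ.get("DEFAULT_REGION", default)


print(get_region())
```

Reading the value with a fallback keeps the script usable when the variable has been unset or the environment has not been reactivated yet.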

### Using `pip` in a conda environment

Many packages (git-remote-codecommit, TensorFlow, ...) are not published in conda repositories, and therefore cannot be installed doing `conda install`.

In such cases you need to install `pip` and then install those packages using `pip` instead of `conda`.

Let's imagine that we need to install the `git-remote-codecommit` package in our `aws` conda environment:

```bash
(base) $ conda install --name aws pip
(base) $ conda deactivate
$ conda activate aws
(aws) $ python -m pip install git-remote-codecommit
```

Because `pip` and `conda` are both package management tools, issues may arise when they are used together. The recommendation is:

+ use an isolated environment for `pip`
+ install as many packages as possible with `conda`, then use `pip` to install the remaining software.

    Using `conda` after having used `pip` is discouraged. In those cases, it is better to create a new environment in which you run all your conda installations first, and then the pip ones.

+ Make use of text files to pass package requirements and automate the installation of package dependencies.

    + in `conda`, pass package dependencies in a text file using the `--file` argument.
    + in `pip`, pass package requirements with the standard `--requirement requirements.txt` (or `-r requirements.txt`).

### Exporting conda environments to an `environment.yml`

You can export the configuration of a certain conda environment doing:

```bash
(aws) $ conda env export > environment.yml
```

You can then recreate environments doing:

```bash
(base) $ conda env create -f environment.yml
```

### Disabling default conda activation

By default, the base conda environment is activated every time a new terminal window is opened, so your prompt and Python version are set automatically:

```bash
(base) $ python --version
Python 3.10.10
```

To prevent this automatic initialization of the base conda environment you can do:

```bash
(base) $ conda config --set auto_activate_base false
```

Right after that, the base conda environment won't be automatically initialized on new terminal sessions.

## Section 2 &mdash; Virtual environments using `venv`

While `conda` provides a streamlined way to manage virtual environments, it is not the built-in way to do it.

Conversely, `venv` is the standard-library module for creating and managing separate virtual environments for your Python projects. Because it ships with Python, it is the one to use for packaged applications, as it requires no additional dependencies.

As this section is targeted at more experienced Python developers, it covers not only how to use `venv` to create environments, but also additional details about virtual environments, so that by the end of this section you should be able to:
+ Create and activate a Python virtual environment
+ Explain why it is important to isolate external dependencies
+ Visualize what Python does when you create a virtual environment
+ Customize your virtual environment
+ Deactivate and remove virtual environments

### Hello, `venv`: the official virtual environment manager

`venv` is the official and recommended way to create virtual environments. It's part of the standard library, where it has been available since Python 3.3 and the recommended tool since Python 3.5.

| NOTE: |
| :---- |
| `venv` and `conda` are alternative ways of creating virtual environments. |

### Creating and activating a virtual environment

It is recommended to create a new virtual environment each time that you need to work with a Python project that requires external dependencies.

To create a virtual environment run the commands below:

```bash
# Create a new dir to host our Python project and cd into it
(base) $ mkdir my-new-prj
(base) $ cd my-new-prj

# Create the virtual environment (you need a valid Python runtime)
# Using the convention `.venv` as the virtual environment name
(base) $ python -m venv .venv

# Once the virtual environment is created we can deactivate the conda environment...
(base) $ conda deactivate

# ...and activate .venv
$ source .venv/bin/activate
(.venv) $

# Now we can install packages
(.venv) $ python -m pip install numpy
```

| NOTE: |
| :---- |
| Make sure to include your virtual environment name in your `.gitignore`. |
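The same environment can also be created from Python through the standard library `venv` module, which is what `python -m venv` uses under the hood. The sketch below is only an illustration: the helper name `create_env` and the `demo` prompt are hypothetical, and `with_pip=False` is chosen to keep the example fast, which means `python -m pip` will not be available inside the resulting environment.

```python
import os
import tempfile
import venv
from pathlib import Path


def create_env(env_dir, prompt=None):
    """Create a virtual environment at env_dir, roughly like `python -m venv`."""
    venv.create(
        env_dir,
        with_pip=False,              # skip the pip bootstrap to keep the sketch fast
        symlinks=(os.name != "nt"),  # symlink the interpreter on POSIX, copy on Windows
        prompt=prompt,
    )
    return Path(env_dir)


# Demo in a throwaway directory so that nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    env = create_env(Path(tmp) / ".venv", prompt="demo")
    print((env / "pyvenv.cfg").read_text())
```

Creating environments programmatically can be handy in setup scripts, but for day-to-day work the command line shown above is usually enough.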

### Deactivating a virtual environment

Once you're done with your virtual environment, you can do:

```bash
(.venv) $ deactivate
$
```

You can reactivate an existing virtual environment doing:

```bash
$ source .venv/bin/activate
(.venv) $
```

### Reasons for virtual environments

Python isn't great at managing dependencies. By default, `pip` places every external package that you install in a folder called `site-packages/` inside your base Python installation. That is a bad idea when the base installation is the one your OS relies on.

Python distinguishes two install locations for external packages, which often resolve to the same `site-packages/` directory:
+ `purelib` for the modules written only in Python
+ `platlib` for the platform-specific modules (compiled extensions, not pure Python)

The location of those libraries can be found with the following script:

```python
import sysconfig
print(f"purelib: {sysconfig.get_path('purelib')}")
print(f"platlib: {sysconfig.get_path('platlib')}")
```

Depending on the Python runtime you're using when executing that script you will get very different results:

+ `/usr/lib/python3.8/site-packages` for both when using the OS provided Python runtime.
+ `/home/ubuntu/miniconda3/lib/python3.10/site-packages` for both when using the conda base virtual environment.
+ `/home/ubuntu/.../.venv/lib/python3.10/site-packages` when using the `.venv` virtual environment recently created.

Using the default location for the `site-packages/` will create a lot of side effects as you create more projects, because you won't be able to work with two different versions of the same library.

However, when using virtual environments, that location will be local to your environment.

#### Lab: Validating you can have different versions of a library

Create two virtual environments for two different versions of your app: *app-v1* and *app-v2*. Change the prompt to be able to easily identify them using `--prompt="[label]"` when creating the environment.

Install Django 2.2.26 in *app-v1* and Django 4.0.3 in *app-v2*.

Use `python -m pip list` to obtain a report of the libraries used and validate that those are completely different.


You need to run the following commands:

For app-v1
```bash
mkdir app-v1
cd app-v1/
python -m venv .venv --prompt="app-v1"
conda deactivate
source .venv/bin/activate
python -m pip install django==2.2.26

python -m pip list
Package    Version
---------- -------
Django     2.2.26
pip        22.3.1
pytz       2023.3
setuptools 65.5.0
sqlparse   0.4.4
```

Similarly, for app-v2:
```bash
mkdir app-v2
cd app-v2/
python -m venv .venv --prompt="app-v2"
conda deactivate
source .venv/bin/activate
python -m pip install django==4.0.3

python -m pip list
Package           Version
----------------- -------
asgiref           3.7.2
Django            4.0.3
pip               22.3.1
setuptools        65.5.0
sqlparse          0.4.4
typing_extensions 4.6.2
```
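The check can also be done from Python itself. The following sketch uses the standard library `importlib.metadata` module (Python 3.8+) to report the installed version of a package; the helper name `installed_version` is hypothetical. Running it with the interpreter of each environment should report a different Django version:

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Run this with each environment's interpreter and compare the results
print("Django:", installed_version("Django"))
```

Because the lookup happens in the environment of the running interpreter, the result depends on which virtual environment is active when the script is executed.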

### What is a virtual environment

A Python virtual environment is a folder structure that provides a lightweight, isolated, Python environment for your projects.

It should be somewhat similar to the following (it may vary depending on the Python version):


```
venv/
│
├── bin/                   # executables of your virtual env 
│   ├── Activate.ps1
│   ├── activate
│   ├── activate.csh
│   ├── activate.fish
│   ├── pip
│   ├── pip3
│   ├── pip3.10
│   ├── python
│   ├── python3
│   └── python3.10
│
├── include/               # header files for C extensions
│
├── lib/                   # site-packages by Python version
│   │
│   └── python3.10/
│       │
│       └── site-packages/
├── lib64/                 # site-packages by Python version
│   │
│   └── python3.10/
│       │
│       └── site-packages/
└── pyvenv.cfg             # virtual env configuration
```

The three important takeaways from this structure are:
+ You get a copy (or a symlink) of the Python binary.
+ The environment is configured in the `pyvenv.cfg` file.
+ Packages are installed in the `site-packages/` directory relative to the virtual environment.

The `pyvenv.cfg` file is a simple key-value pair file with the information about the environment:

```bash
(app-v1) $ cat .venv/pyvenv.cfg
home = /home/ubuntu/miniconda3/bin
include-system-site-packages = false
version = 3.10.10
prompt = 'app-v1'
```

| WARNING: |
| :---- |
| The `home` key contains an absolute path to the `bin/` directory of the Python installation the environment was created from. As a result, copying or moving a virtual environment into another directory is not a good idea, as it will still reference the original paths. |
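Because `pyvenv.cfg` is a plain key-value file, it is easy to inspect from a script. The sketch below parses the sample contents shown above into a dictionary; the function name `parse_pyvenv_cfg` is hypothetical and the parser is deliberately simplistic (values are kept as raw strings, including any quotes):

```python
def parse_pyvenv_cfg(text):
    """Parse the `key = value` lines of a pyvenv.cfg file into a dict."""
    config = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that do not contain an '='
            config[key.strip()] = value.strip()
    return config


sample = """\
home = /home/ubuntu/miniconda3/bin
include-system-site-packages = false
version = 3.10.10
prompt = 'app-v1'
"""

cfg = parse_pyvenv_cfg(sample)
print(cfg["home"])
print(cfg["include-system-site-packages"])
```

For a real environment, the text would come from reading `.venv/pyvenv.cfg` instead of the inline sample.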

Note that `python -m pip list` does not show the Python standard library packages, but those will be available to your application.

You can configure the virtual environment to get access to the base installation's site packages doing:

```bash
$ python -m venv .venv --system-site-packages
```

When doing so, the `include-system-site-packages` variable will be set to true.

### How does the virtual environment work?

Python is aware of virtual environments, which means that when doing the resolution of dependencies, Python will check for the existence of the `pyvenv.cfg` file and act accordingly.

For example, the following interactive session prints the paths that Python scans when looking for libraries inside a virtual environment:

```python
$ python
Python 3.10.10 (main, Mar 21 2023, 18:45:11) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> from pprint import pp
>>> pp(sys.path)
['',
 '/home/ubuntu/miniconda3/lib/python310.zip',
 '/home/ubuntu/miniconda3/lib/python3.10',
 '/home/ubuntu/miniconda3/lib/python3.10/lib-dynload',
 '/home/ubuntu/Development/git-repos/delete-me/app-v1/.venv/lib/python3.10/site-packages']
>>>
```
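A script can tell whether it is running inside a `venv`-style environment by comparing `sys.prefix` with `sys.base_prefix`. The sketch below shows the idea; the function name `in_virtual_environment` is hypothetical. Note that this check targets environments created with `venv` or `virtualenv`; a conda environment is a full installation and is not expected to be detected this way.

```python
import sys


def in_virtual_environment():
    """True when the running interpreter belongs to a venv-style environment."""
    # Inside a virtual environment sys.prefix points at the environment,
    # while sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


print("prefix:     ", sys.prefix)
print("base prefix:", sys.base_prefix)
print("in venv:    ", in_virtual_environment())
```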

Additionally, when activating your environment:
+ The `PATH` is updated so that the `bin/` directory of the virtual environment comes first, which makes `python` and `pip` resolve to the environment's executables.
+ The prompt is updated to reflect the name of the active environment.

| NOTE: |
| :---- |
| To deactivate an environment type `deactivate` within a virtual environment. The PATH and the command prompt will be restored to their initial values. |
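The effect of activation on `PATH` can be pictured with a few lines of Python. This is only a simplified model, not what the `activate` script literally runs: the real script also sets `VIRTUAL_ENV`, adjusts the prompt, and remembers the old values so that `deactivate` can restore them. The function name `activated_path` and the sample paths are hypothetical:

```python
import os
from pathlib import Path


def activated_path(env_dir, current_path):
    """Return the PATH value as it would look with the environment's bin dir first."""
    bin_dir = Path(env_dir) / ("Scripts" if os.name == "nt" else "bin")
    # Prepending means the shell finds the environment's python and pip first
    return os.pathsep.join([str(bin_dir), current_path])


print(activated_path("/home/ubuntu/app-v1/.venv", "/usr/local/bin:/usr/bin"))
```

Since the shell searches `PATH` from left to right, putting the environment's directory first is enough for `python` to resolve to the environment's interpreter.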



### Activation vs. Absolute Paths

When working with your code, you will typically activate an environment using:

```bash
source .venv/bin/activate
```

However, there are cases in which you might need to rely on the absolute path of the virtual environment's Python interpreter. You will find such cases when dealing with tools that require you to use a single command, for example, `cron`.

In those cases, you can use:

```bash
/home/path/to/venv-name/.venv/bin/python
```

That will allow you to use the virtual environment without having to deal with the activation.
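The same approach works when one Python program needs to launch another one inside a given virtual environment. The following is a minimal sketch, assuming the usual `bin/python` layout on Linux and macOS (`Scripts\python.exe` on Windows); the helper names `venv_python` and `run_in_venv` are hypothetical:

```python
import os
import subprocess
from pathlib import Path


def venv_python(env_dir):
    """Return the path of the interpreter inside the environment at env_dir."""
    env_dir = Path(env_dir)
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def run_in_venv(env_dir, *args):
    """Run `python <args>` with the environment's interpreter, without activation."""
    return subprocess.run(
        [str(venv_python(env_dir)), *args],
        check=True,           # raise if the command fails
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,
    )
```

For example, `run_in_venv("/home/path/to/venv-name/.venv", "-m", "pip", "list").stdout` would return the package report of that environment, provided the environment exists at that path.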

### Customizing a virtual environment

#### Adjusting your environment name

We've been using `.venv` as the name for your virtual environment.

It's a convention to name your virtual environment as:
+ `venv`
+ `env`
+ `.venv` (preferred)

Sticking to conventions helps you get a better configuration for your `.gitignore`, so choose one and go with it.

| NOTE: |
| :---- |
| Make sure to include your virtual environment name (e.g., `.venv`) in your `.gitignore`. |

#### Adjusting your prompt with `--prompt`

By default, your terminal prompt will reflect the environment name. Under certain circumstances, you might want to adjust your prompt for easier identification. In those cases do:

```bash
$ python -m venv .venv --prompt "app-v1"
$ source .venv/bin/activate
(app-v1) $
```

The effect of the previous command is changing the `.venv/pyvenv.cfg` to add a prompt key:

```bash
$ cat .venv/pyvenv.cfg
home = /home/ubuntu/miniconda3/bin
include-system-site-packages = false
version = 3.10.10
prompt = 'app-v1'
```

#### Resetting existing virtual environment with `--clear`

The argument `--clear` resets the contents of an existing environment. By default, invoking `python -m venv .venv` on an existing environment has no consequence.

However, you can reset the environment by running:

```bash
(app-v1) $ deactivate
$ python -m venv .venv --clear
$ source .venv/bin/activate
(.venv) $
```

Note, for example, how the prompt customization has been reverted. The same applies to the installed libraries.

##### Lab: using `--clear`

Create a virtual environment in which you configure the prompt, install NumPy, and update pip.

Then deactivate the environment, clear it, and check that everything has been reverted to the *factory defaults*.

| NOTE: |
| :---- |
| Using `--clear` might leave your virtual environment in a state different from the one it had when you first created it. That is, the factory default might differ from the initial state.<br>This is especially noticeable when you created your environment from a conda environment. |

#### Upgrading the version of the initial deps with `--upgrade-deps`

You can ensure that the dependencies initially installed in your virtual environment are up-to-date using:

```bash
python -m venv .venv --upgrade-deps
```

This is useful to get rid of the frustrating message telling you that `pip` is outdated the first time you install new packages in a new virtual environment.

#### Avoiding pip installation with `--without-pip`

If you don't need to install `pip` (for whatever reason, e.g., when running CI/CD) you can do:

```bash
python -m venv .venv --without-pip
```

#### Enabling access to system wide packages with `--system-site-packages`

You can enable access to your global, system-wide, `site-packages/` directory doing:

```bash
python -m venv .venv --system-site-packages
```

#### Upgrading Python to match system's Python

It might happen that you upgrade your OS, and a new Python version gets installed, but your virtual environment's Python is still referencing the old version you no longer have.

In those cases, you can do:

```bash
python -m venv .venv --upgrade
```

#### Using certain Python versions in your virtual environment

`venv` does not provide out-of-the-box support for selecting a particular Python version for your virtual environment.

However, there are some tricks you can use to do so. For example, when you create a virtual environment from a certain conda environment, the virtual environment you create will inherit the Python version of the conda environment.

##### Lab: Virtual environment with a specific Python version

Create a venv-based virtual environment with a specific Python version (e.g., 3.11.3).



The following trick can be used to create a virtual environment with a specific Python version using `conda` and `venv`.

Start by creating a conda environment and setting up the specific Python version.

```bash
(base) $ conda search "^python$"
Loading channels: done
# Name                       Version           Build  Channel
python                        2.7.13     hac47a24_15  pkgs/main
python                        2.7.13     heccc3f1_16  pkgs/main
...
python                        3.11.3      h7a1cb2a_0  pkgs/main

(base) $ conda create --name parent \
  python=3.11.3

(base) $ conda deactivate

$ conda activate parent
(parent) $ python --version
Python 3.11.3
```

Now, use this environment to create a virtual environment:

```bash
(parent) $ mkdir my-new-prj
(parent) $ cd my-new-prj/
(parent) $ python -m venv .venv
(parent) $ conda deactivate
$ source .venv/bin/activate
(.venv) $ python --version
Python 3.11.3
```
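A project that depends on a specific Python version can also verify it at startup, which catches the case where the virtual environment was created from the wrong interpreter. The following is a minimal sketch; the `REQUIRED` value and the function name `meets_requirement` are hypothetical and chosen to match the 3.11 example above:

```python
import sys

REQUIRED = (3, 11)  # hypothetical minimum (major, minor) for the project


def meets_requirement(version_info=sys.version_info, required=REQUIRED):
    """True when the interpreter's (major, minor) is at least `required`."""
    return tuple(version_info[:2]) >= required


if not meets_requirement():
    print(
        f"Warning: Python {REQUIRED[0]}.{REQUIRED[1]}+ expected, "
        f"found {sys.version_info.major}.{sys.version_info.minor}"
    )
```

Comparing tuples avoids the pitfalls of comparing version strings, where `"3.9"` would sort after `"3.11"`.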

### Pinning your dependencies

To make your virtual environments reproducible, you can pin your dependencies by creating a `requirements.txt` file while your virtual environment is active:

```bash
(.venv) $ python -m pip freeze > requirements.txt
```

When doing so, anyone will be able to recreate the environment by just doing:

```bash
$ python -m venv .venv
$ source .venv/bin/activate
(.venv) $ python -m pip install -r requirements.txt
```

Note that pinning your dependencies doesn't make your project 100% deterministic: a `requirements.txt` is not a true *lockfile*, so the dependencies that end up installed on another machine might still differ from the ones you intended.
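A pinned `requirements.txt` is plain text, so it can be inspected with a short script, for example to check which versions are pinned before recreating an environment. The sketch below is a deliberately simplistic parser (it only understands `name==version` lines and ignores everything else, including environment markers and URLs); the function name `parse_pinned` is hypothetical:

```python
def parse_pinned(text):
    """Map package name -> version for `name==version` lines; skip the rest."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # blank, unpinned, or unsupported line
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


sample = """\
asgiref==3.7.2
Django==4.0.3
sqlparse==0.4.4
typing_extensions==4.6.2
"""

pins = parse_pinned(sample)
print(pins["django"])
```

Package names are lowercased here so that lookups do not depend on the capitalization used in the file.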

| NOTE: |
| :---- |
| Make sure to include your virtual environment name in your `.gitignore`. |

### Recommendations on virtual environments

+ Treat them as disposable

    Do not place them under source control management. Instead, make sure you include a `requirements.txt` that you can use to re-hydrate the environment from scratch when you clone the project.

+ Understand that virtual environments are not self-sufficient Python installations.

    Distributing or copying your project source code along with the associated virtual environment doesn't make it portable. If you need to distribute your code to 3rd parties, publish your package in a repository (see https://realpython.com/pypi-publish-python-package/ or https://realpython.com/pyinstaller-python/).

+ Pin your dependencies

### `virtualenv`: an alternative to `venv`

`virtualenv` is a superset of `venv` that includes some useful additional capabilities, but it is not part of the core Python packages.

The following sections lay down some details on `virtualenv`.

#### Creating and activating environments

You can create and activate environments with `virtualenv` doing:

```bash
# you might need to install it first
$ sudo apt update
$ sudo apt install python3-virtualenv

$ virtualenv {env-name}
$ source {env-name}/bin/activate
({env-name}) $
```

#### Creating a virtualenv with a particular Python version

Let's suppose that you want to create a virtual environment with Python 3.11, but your Linux environment is still running 3.8.

You can install Python 3.11 and configure your virtual environment with `virtualenv` to use it by typing:

```bash
# Install Python through an APT repository
$ sudo add-apt-repository ppa:deadsnakes/ppa

# update and install the desired Python version
$ sudo apt update

# python3.11 distutils might be required by virtualenv
$ sudo apt install python3.11 python3.11-distutils
```

If everything went well, Python will be installed in `/usr/bin/python3.11`.

Now you can use `virtualenv` to use that configuration:

```bash
$ virtualenv -p /usr/bin/python3.11 .venv
$ source .venv/bin/activate
(.venv) $ python --version
Python 3.11.0
```

### References

+ https://realpython.com/installing-python/#how-to-install-python-on-linux
+ https://realpython.com/python-virtual-environments-a-primer/


## ToDo

- [ ] Understand other options beyond `conda` and `venv` should be moved to a separate section

- create cheatsheet of commands:
  + pip list: obtain a report of dependencies
  + pip freeze: obtain a list of the current dependent packages and their versions for the current active environment.
