# Managing Dependencies


## Specifying Dependencies

### requirements.txt

Probably the best-known and most widely used way of specifying and installing dependencies in Python is a `requirements.txt` file. This is a text file listing the names of the packages your code relies on, for example:

```text
geopy
imageio
matplotlib
numpy
requests
```

To install dependencies from a `requirements.txt` file, run:

```bash
pip install -r requirements.txt
```

`requirements.txt` files are not the only way of specifying dependencies. We'll refer to some others, and the differences between them, here and later in this module.


### Pinning versions

Different versions of libraries may have different features, behaviour, and interfaces. To ensure our code is reproducible and other users (and ourselves in the future) get the same results from running the code, it's a good idea to specify the version of each dependency that should be installed.

To pin dependencies to specific versions, include them in `requirements.txt` like this:

```text
geopy==2.2.0
imageio==2.19.3
matplotlib==3.5.2
numpy==1.23.0
requests==2.28.1
```

To automatically generate a `requirements.txt` file like this, containing the versions of all the libraries installed in your current Python environment, you can run:

```bash
pip freeze > requirements.txt
```

However, note that `pip freeze` outputs not only your direct dependencies but also:

- the dependencies of your dependencies
- the dependencies of the dependencies of your dependencies
- ...

It may be better to specify only your direct dependencies and let the maintainers of those libraries deal with their own dependencies (although that can also lead to problems and incompatibilities in the future in some cases).
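
As a rough sketch of pinning only direct dependencies, the standard library's `importlib.metadata` can look up the installed version of each package you name. The function name `pin_direct_dependencies` is hypothetical, and the package names are the ones from the example above:

```python
from importlib import metadata


def pin_direct_dependencies(names):
    """Return (pinned, missing) for the given direct dependency names.

    pinned  -- "name==version" lines for packages installed in this environment
    missing -- names that are not installed here
    """
    pinned, missing = [], []
    for name in names:
        try:
            pinned.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            missing.append(name)
    return pinned, missing


if __name__ == "__main__":
    # Only the packages we depend on directly, not their own dependencies.
    pinned, missing = pin_direct_dependencies(
        ["geopy", "imageio", "matplotlib", "numpy", "requests"]
    )
    print("\n".join(pinned))
    if missing:
        print("Not installed:", ", ".join(missing))
```

Unlike `pip freeze`, this only reports the packages you list, so the output stays limited to your direct dependencies.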

### Version ranges

You don't have to specify an exact version. You can also use comparison operators like `<=`, `!=`, and `>=` to give ranges of package versions that are compatible with your code (see the [PEP 440 version specifiers](https://peps.python.org/pep-0440/#version-specifiers)).

An interesting one is `~=`, or "approximately equal to". For example, if we specified the numpy dependency as:

```text
numpy~=1.23.0
```

it allows `pip` to install any (newer) `1.23.x` version of numpy (e.g. `1.23.1` or `1.23.5`), but not versions `1.24.0` or later (which may introduce changes that are incompatible with `1.23.0`).
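
To make the rule concrete, here is a simplified sketch of the `~=` check. It assumes plain dotted numeric versions such as `1.23.5` and ignores pre-releases, post-releases and other forms that `pip` itself handles; the function names are hypothetical:

```python
def parse(version):
    """Turn a simple dotted version such as "1.23.5" into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_compatible_release(candidate, spec):
    """Simplified check of whether `candidate` satisfies `~=spec`."""
    spec_parts = parse(spec)
    if len(spec_parts) < 2:
        raise ValueError("~= needs at least two version components")
    cand_parts = parse(candidate)

    # Pad with zeros so that "1.23" and "1.23.0" compare as equal.
    width = max(len(cand_parts), len(spec_parts))
    cand_parts += (0,) * (width - len(cand_parts))
    spec_parts += (0,) * (width - len(spec_parts))

    # ~=1.23.0 means ">=1.23.0 and ==1.23.*": everything except the last
    # component of the spec must match exactly.
    prefix = parse(spec)[:-1]
    return cand_parts >= spec_parts and cand_parts[: len(prefix)] == prefix


if __name__ == "__main__":
    for candidate in ["1.22.9", "1.23.0", "1.23.5", "1.24.0"]:
        print(candidate, is_compatible_release(candidate, "1.23.0"))
```

With the spec `1.23.0`, only `1.23.0` and `1.23.5` pass the check: `1.22.9` is too old and `1.24.0` changes the second component.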


### (How) should you pin dependency versions?

There are potential caveats and pitfalls with all approaches. At the extremes you have:

- **Not specifying a version**:
  - Dependencies are likely to introduce breaking changes in the future that will cause your code to fail or give different results.

- **Pinning an exact version**:
  - Specific versions may not be available on all platforms. You won't get bug and security fixes in new versions.

Generally, using a range or "approximately equal" specification as described above may be a good approach, combined with a strategy for updating dependencies regularly where needed (see below).


### Updating dependencies

Running

```bash
pip list --outdated
```

will show a list of installed packages that have newer versions available. You can upgrade to the latest version by running:

```bash
pip install --upgrade PACKAGE_NAME
```

(and then update `requirements.txt` to reflect the new version you're using, if needed).

This is quite a manual approach, and other tools have more streamlined ways of handling the upgrade process. See [Poetry](https://python-poetry.org/), for example.

There are also automated tools like [Dependabot](https://github.blog/2020-06-01-keep-all-your-packages-up-to-date-with-dependabot/) that can look at the dependencies in your GitHub repo and suggest changes to avoid security vulnerabilities.
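
If you want to script the manual check yourself, `pip list --outdated` can also emit JSON via `--format=json`. The sketch below is one possible way to read that report from Python; the function names are hypothetical, and `list_outdated` needs network access because pip has to query the package index:

```python
import json
import subprocess
import sys


def parse_outdated(pip_json):
    """Turn pip's JSON report into (name, installed, latest) tuples."""
    return [
        (pkg["name"], pkg["version"], pkg["latest_version"])
        for pkg in json.loads(pip_json)
    ]


def list_outdated():
    """Ask pip (for the current interpreter) which packages are outdated."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "list", "--outdated", "--format=json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_outdated(result.stdout)
```

Calling `list_outdated()` returns one tuple per outdated package, which you could then compare against the versions pinned in `requirements.txt`.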


## Virtual Environments

Specifying dependency versions may not always be enough to give you a working (and future-proof) setup for yourself and other users of your code. For example, you may have:

- Different projects on your system requiring different versions of a library, or libraries that are incompatible with each other.
- Libraries that are only available on some platforms (e.g. Linux only) or have different behaviour on other platforms.
- Projects requiring different versions of Python itself.
- ...



### venv 

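See the [Python Packaging User Guide](https://packaging.python.org/en/latest/guides/installing-using-pip-and-virtual-environments) for a full guide to using `pip` with virtual environments.

Create a virtual environment called `myenv`:

```bash
python -m venv myenv
```

This creates a `myenv` directory containing the environment:

```bash
ls myenv/
```

```text
bin
include
lib
pyvenv.cfg
```

Before the environment is activated, `python` refers to the interpreter that was already on the path:

```bash
which python
```

```text
/Users/jroberts/opt/anaconda3/envs/rse-course/bin/python
```

After activating the environment, `python` refers to the interpreter inside `myenv`:

```bash
source myenv/bin/activate
which python
```

```text
/Users/jroberts/GitHub/rse-course/module06_software_projects/myenv/bin/python
```

With the environment activated, install the dependencies into it:

```bash
source myenv/bin/activate
pip install -r requirements.txt
```

When you have finished working in the environment, `deactivate` returns you to the original interpreter:

```bash
source myenv/bin/activate

# work in the environment...

deactivate

which python
```

```text
/Users/jroberts/opt/anaconda3/envs/rse-course/bin/python
```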
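
The same steps can also be driven from Python with the standard library's `venv` module. This is a minimal sketch rather than a replacement for the commands above: the helper names are hypothetical, and `with_pip=False` is used only to keep the example fast (`python -m venv` installs `pip` into the environment by default).

```python
import sys
import tempfile
import venv
from pathlib import Path


def in_virtual_environment():
    """True when the running interpreter belongs to a venv-style environment."""
    return sys.prefix != sys.base_prefix


def create_environment(env_dir):
    """Create a virtual environment and return the names of its top-level entries."""
    # with_pip=False skips installing pip, which keeps this sketch quick to run.
    venv.create(env_dir, with_pip=False)
    return sorted(entry.name for entry in Path(env_dir).iterdir())


if __name__ == "__main__":
    # Use a temporary directory so the example cleans up after itself.
    with tempfile.TemporaryDirectory() as tmp:
        print(create_environment(Path(tmp) / "myenv"))
    print("Inside a virtual environment:", in_virtual_environment())
```

The listing should include `pyvenv.cfg`, the file that marks a directory as a virtual environment. The other entries vary by platform and Python version.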


### conda

conda environments are specified in an `environment.yml` file, for example:

```yaml
name: myenv

dependencies:
  - python=3.9
  - geopy=2.2.0
  - imageio=2.19.3
  - matplotlib=3.5.2
  - numpy=1.23.0
  - requests=2.28.1
```

Create the environment from this file with:

```bash
conda env create -f environment.yml
```

Activate the environment to work in it, and deactivate it when you have finished:

```bash
conda activate myenv

# work in the environment

conda deactivate
```


### Docker



## Which to choose?

- Other tools: pyenv? Poetry?
- Dependency files: `requirements.txt`, `environment.yml`, `pyproject.toml`/`poetry.lock`?
