# Conda - a python package and environment manager

![conda_logo](../images/conda_logo.png)

from https://conda.io/projects/conda/en/latest/index.html

> Conda is an open-source package management system and environment management 
> system that runs on Windows, macOS, and Linux. Conda quickly installs, runs, 
> and updates packages and their dependencies. Conda easily creates, saves, 
> loads, and switches between environments on your local computer. It was 
> created for Python programs but it can package and distribute software for 
> any language.

For this course we recommend `miniconda` (https://docs.conda.io/en/latest/miniconda.html) for installing Python and other open source packages. 🐍

<br />

**Content**

* [Why conda?](#motivation)
* [Installation of conda](#install)
* [Environments](#basics)
* [Package installation with conda](#packages)
* [Using pip and mamba](#pip)
* [Create workshop environment](#workshop)

<br />


## Why conda?  <a class="anchor" id="motivation"></a>

🤔

One benefit of the Python world is its huge package ecosystem. Many great tools exist that any scientist can install. Part of a Python coder's work is to collect and arrange all the tools they plan to use. `conda` is a solution that helps us organize and manage this quickly developing ecosystem.

**User-manageable installations** 💪

With `conda`, we have the power to install packages without the need to ask a system admin to do it.

**Flexible and free access to open source packages** 💪

As it is well documented how packages should be hosted for conda, we have control and access to a large registry of open source packages.

**Environment manager** 💪

`conda` also manages software *environments* on the user side, allowing us to keep different versions of packages installed side by side.

**Dependency checks**

Conda does *dependency checks* for new packages against *all* preinstalled packages at the same time in order to ensure that they are compatible. `conda` also installs all additional packages required by the target package.


## Concepts <a class="anchor" id="concepts"></a>

- `miniconda`:

Miniconda is a minimal distribution of `conda`: its installer ships only `conda`, Python, and the few packages they depend on. The larger `anaconda` distribution also installs many additional libraries, so it sums up to about 3GB.

- Channels 📺

Channels are the names of the places where packages are hosted. Each channel corresponds to a *URL* from which packages are downloaded. A channel usually hosts more than one package, but this is not mandatory. Channels differ in the *standards* by which packages are built and tested, and in their *licenses*. We highly recommend using the channel `conda-forge`, a community channel with high-quality tests.

- Environments 🖼️

An environment can be thought of as a Python world completely encapsulated within itself. Several different environments can be created, for example to have different versions of the same software available on the computer at the same time.

- Packages

Packages are compressed tarballs of files (which the user does not have to deal with). They contain *precompiled* programs so that conda does not require compilers on the user's system.

## Installation of conda <a class="anchor" id="install"></a>

In a notebook on JupyterHub, `conda` is available within the *Python3* kernel.

If you work in a shell on one of DKRZ's HPC systems such as Levante, you can load a so-called `module` that gives you access to a `conda` installation:

`module load python3/2022.01`

Afterwards, you are able to use all conda commands.

On a local PC, the first thing to do is to download the `miniconda` installer for `Python 3.x` that matches your operating system.

- download miniconda installer from https://docs.conda.io/en/latest/miniconda.html
- run the installer
    + either `double-click on installer file`
    + or run `bash <installer-name>` in a terminal window
- follow the instructions
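
The terminal route can be sketched as follows for Linux and macOS. This is a minimal sketch and not an official procedure: the helper names `miniconda_installer_name` and `install_miniconda` and the target directory `$HOME/miniconda3` are illustrative assumptions. The file names follow the pattern used on the Miniconda download page, and `-b` (batch mode) and `-p` (install prefix) are options of the installer script.

```bash
# Map operating system and CPU architecture to an installer file name.
# Both arguments default to what `uname` reports on this machine.
miniconda_installer_name() {
    local os="${1:-$(uname -s)}"
    local arch="${2:-$(uname -m)}"
    case "$os" in
        Linux)  os="Linux" ;;
        Darwin) os="MacOSX" ;;
        *) echo "unsupported operating system: $os" >&2; return 1 ;;
    esac
    echo "Miniconda3-latest-${os}-${arch}.sh"
}

# Download the installer and run it non-interactively.
# Hypothetical helper: nothing is installed until you call it explicitly.
install_miniconda() {
    local prefix="${1:-$HOME/miniconda3}"
    local installer
    installer="$(miniconda_installer_name)" || return 1
    curl -fsSLO "https://repo.anaconda.com/miniconda/${installer}" || return 1
    bash "$installer" -b -p "$prefix"
}

# Usage (commented out on purpose):
# install_miniconda "$HOME/miniconda3"
```

Wrapping the steps in functions keeps the cell free of side effects until `install_miniconda` is called on purpose.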

<div class="alert alert-info">
    <b>Note:</b> Run `conda` commands inside `bash` cells by specifying the magic `%%bash` in the first line of the cell
</div>


On a local PC, as a first step, we update the packages of the fresh `miniconda` installation with
`conda update --all`

In [None]:
%%bash
#conda update --all -c conda-forge

## Creating an environment `helloworld`  <a class="anchor" id="basics"></a>

As a starting point, we run the `conda info` command, which gives us an overview of the conda configuration. We can see what the `active environment` is and where it is installed:

In [None]:
%%bash
conda info

`conda env` groups the commands that are specific to managing *env*ironments. You can show all available subcommands with `conda env --help`.

`conda env list` lists all environments while `conda list` lists all packages installed for the active environment. Let us compare these commands:

In [None]:
%%bash
conda env list

In [None]:
%%bash
conda list

It is good practice to create one environment for each project you work on. This keeps version conflicts in one project from affecting the others. So let us create a `helloworld` environment with `conda create -n helloworld`:

In [None]:
%%bash
conda create -n helloworld

We can activate the environment with `conda activate helloworld` or `source activate helloworld`:

In [None]:
%%bash
source activate helloworld
conda info

<div class="alert alert-info">
    <b>Note:</b> `source activate` activates the environment for the shell you are working in. If you run this in a Jupyter notebook, the shell closes at the end of the cell, so the activation does not carry over to the next cell.
</div>
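
One way to work with an environment across cells, sketched below, is `conda run`, which executes a single command inside a named environment without activating it. The wrapper name `run_in_env` is a hypothetical helper for this example, and the usage line assumes that the `helloworld` environment from above exists.

```bash
# Run a command inside a conda environment without activating it.
# Usage: run_in_env <environment_name> <command> [arguments...]
run_in_env() {
    local env_name="$1"
    shift
    if ! command -v conda >/dev/null 2>&1; then
        echo "conda is not available in this shell" >&2
        return 127
    fi
    conda run -n "$env_name" "$@"
}

# Example (commented out on purpose); this should report the
# environment in which the command was executed:
# run_in_env helloworld printenv CONDA_DEFAULT_ENV
```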

## Install your first package <a class="anchor" id="packages"></a>

We are moving towards the installation of our first package. You can search for packages by running `conda search <package_name>`. You can specify **channels** for the search by using the parameter `-c <channel_name>` or `--channel <channel_name>`.
<h4 style="color:red"> Exercise: Let's search for `xarray` in the channel `conda-forge`</h4>

In [None]:
%%bash
conda search -c conda-forge xarray

Conda always tries to find and install the most recent version of a package, which as of March 2022 is `2022.3.0`. Since our `helloworld` environment is empty, we will be able to install this version without problems. Otherwise, `conda` would first check which version of `xarray` is compatible with our environment.

Software packages can be easily installed using the `conda install <package-name>` command. Without any additional parameter, the package is installed into the **activated** environment from the **default channel**. We can specify both with the *parameters* `-c <channel_name> -n <environment_name>`.

<h4 style="color:red"> Exercise: Install `xarray` from channel `conda-forge` into your `helloworld` environment</h4>

In [None]:
%%bash
conda install -n helloworld -c conda-forge xarray fsspec

# which is the same as
# source activate helloworld
# conda install -c conda-forge xarray fsspec

You can also install packages **at the same time** with the creation of environments. `conda create` allows you to specify packages to *install* into the new environment similar to `conda install`:
- Note that you are not limited to one package: you can pass a space-separated list of packages to install.
- Note that you can specify a specific version of a package by just adding a suffix `<package_name>==<version>`

This results in:

`conda create -n <environment_name> -c <channel_name> <package_name> [<second_package_name>]`.

<h4 style="color:red"> Exercise: Install `xarray`  version 0.18.0 and the package `cartopy` from channel `conda-forge` into a new environment `helloworld2`</h4>

In [None]:
%%bash
conda create -n helloworld2 -c conda-forge xarray==0.18.0 cartopy

<div class="alert alert-info">
    <b>Note:</b> We recommend setting `conda-forge` as a default channel with: `conda config --add channels conda-forge`
</div>
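
The channel configuration can be inspected and adjusted from the command line. The following is a minimal sketch using standard `conda config` options. Setting `channel_priority` to `strict` is an optional suggestion that goes beyond the note above, not a requirement of this course.

```bash
if command -v conda >/dev/null 2>&1; then
    # Put conda-forge at the top of the channel list in ~/.condarc.
    conda config --add channels conda-forge
    # Optional: always prefer packages from higher-priority channels.
    conda config --set channel_priority strict
    # Print the resulting channel list.
    conda config --show channels
else
    echo "conda is not available in this shell" >&2
fi
```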


## `pip`  and `mamba` <a class="anchor" id="pip"></a>

Some software packages are not available for `conda` but only for `pip`, another package installer (https://pypi.org/project/pip/). First, we install pip with conda. Then we activate the environment and use pip to install the software into it.

```bash
conda install -n <environment-name> -c conda-forge pip
# activate the environment so that its own pip is used
conda activate <environment-name>
pip install <package-name>
```

[Mamba](https://github.com/mamba-org/mamba) is a reimplementation of the conda package manager in C++. It offers:

- parallel downloading
- libsolv for much faster dependency solving
- a core implemented mainly in C++ for maximum efficiency

Mamba is compatible with conda and can be used in the same way. Install it with:

`conda install mamba -c conda-forge`
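
Since `mamba` accepts the same subcommands as `conda` for installing packages, a common pattern is to use it when it is available and fall back to `conda` otherwise. The sketch below illustrates that idea. The variable name `pkg_manager` is an arbitrary choice for this example.

```bash
# Pick mamba if it is installed, otherwise fall back to conda.
if command -v mamba >/dev/null 2>&1; then
    pkg_manager="mamba"
else
    pkg_manager="conda"
fi
echo "Using ${pkg_manager} for package installation"

# Example (commented out so nothing is installed unintentionally):
# "$pkg_manager" install -n helloworld -c conda-forge xarray
```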

## Clean up and memory management <a class="anchor" id="clean"></a>

Conda can easily exceed a user's disk space because

- its default **package cache** keeps downloaded packages
- environments can become very large

You can prevent disk space problems by keeping your *conda* installation clean:

**Remove environments**

You can **uninstall** packages with `conda uninstall` and **remove** environments with `conda env remove -n <env_name>`.

<h4 style="color:red"> Exercise: Remove the environments `helloworld` and `helloworld2`</h4>

In [None]:
%%bash
conda env remove -n helloworld
conda env remove -n helloworld2

In [None]:
%%bash
conda env list

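**Clean the package cache**

`conda clean --all` removes cached package tarballs, unused packages, and index caches:

In [None]: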
%%bash
conda clean --all
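
Before cleaning up, it can help to see where the space goes. The sketch below relies on `conda info --base`, which prints the root directory of the conda installation. It assumes that the package cache and the environments live in the default `pkgs` and `envs` subdirectories, which may not hold if `pkgs_dirs` or `envs_dirs` are configured differently, as is common on HPC systems.

```bash
if command -v conda >/dev/null 2>&1; then
    conda_base="$(conda info --base)"
    # Size of the package cache and of all environments (default locations).
    du -sh "${conda_base}/pkgs" "${conda_base}/envs" 2>/dev/null || true
    # Show what `conda clean` would delete without deleting anything.
    conda clean --all --dry-run
else
    echo "conda is not available in this shell" >&2
fi
```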

## Create workshop environment <a class="anchor" id="workshop"></a>

To simplify the installation of an entire environment, projects like this Python workshop collect all required packages in a file `environment.yml` in their git repositories. This file can be parsed by `conda env create` so that all packages listed in it are installed with one command. The corresponding command is `conda env create -f <file_name>`.

<h4 style="color:red"> Exercise: Create the Python workshop's environment</h4>

In [None]:
%%bash
#conda env create -f ../environment.yml
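
For orientation, an environment file typically has the shape shown below. This is a hypothetical example written to a temporary directory, not the workshop's actual `environment.yml`. The environment name `example-env` and the package list are assumptions made for illustration.

```bash
# Write a small example environment file to a temporary directory.
example_dir="$(mktemp -d)"
env_file="${example_dir}/environment.yml"

cat > "$env_file" <<'EOF'
name: example-env
channels:
  - conda-forge
dependencies:
  - python=3.10
  - xarray
  - cartopy
  - pip
EOF

# Print the file so its structure can be inspected.
cat "$env_file"

# Create the environment from the file (commented out on purpose):
# conda env create -f "$env_file"
```

The `name` entry sets the environment name, `channels` lists where packages are searched, and `dependencies` lists the packages to install.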

## Bonus: Create kernel for notebooks <a class="anchor" id="kernel"></a>

If you want to create a **kernel** based on an environment, you have to install `ipykernel` into the environment.

In [None]:
%%bash
conda create -n helloworld -c conda-forge ipykernel

Afterwards, you have to

- ensure that you have a *kernels* directory in your *home* directory
- activate the environment
- install a new kernel with *ipykernel*. You can specify names for the kernel.

In [None]:
%%bash
mkdir -p /home/k/k204210/kernels
source activate /home/k/k204210/conda-envs/helloworld
python -m ipykernel install --user --name pythoncourse_kernel --display-name="pythoncourse_kernel"
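
To check that the kernel was registered, the kernels known to Jupyter can be listed. This is a minimal sketch that assumes the `jupyter` command is available in the shell. The variable `kernel_name` simply repeats the name chosen in the cell above.

```bash
kernel_name="pythoncourse_kernel"
if command -v jupyter >/dev/null 2>&1; then
    # List all kernels that Jupyter can find, including newly installed ones.
    jupyter kernelspec list || echo "could not list kernels" >&2
    # To remove the kernel again (commented out on purpose):
    # jupyter kernelspec remove "$kernel_name"
else
    echo "jupyter is not available in this shell" >&2
fi
```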