# Conda

by Landung Setiawan (landungs@uw.edu)

We will start this tutorial by looking at a picture of the perfect Python environment.
![Perfect Python Environment](https://geohackweek.github.io/datasharing/assets/img/conda/conda-env.jpeg)

**Definition of python virtual environment: a self-contained directory tree that contains a Python installation for a particular version of Python, plus a number of additional packages.**

---
## 0. Python Environment Discussion

What are the benefits of having a well-defined Python environment?

*Brainstorming: [https://etherpad.wikimedia.org/p/geohack2019-conda](https://etherpad.wikimedia.org/p/geohack2019-conda)*

<details><summary>Solution</summary>
<p>

1. Avoids future breakage when any dependency changes
2. Allows better collaboration within a team
3. Improves reliability

</p>
</details>

---
## 1. Ways to create environments and manage packages in Python

1. The classic approach is `pip` as the Python package manager,
along with `virtualenv` as the Python environment manager.
2. The modern classic is `pipenv`, which acts as both a Python package manager and an environment manager.
3. What we'll learn here: `conda`.
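
As a point of comparison for the rest of the tutorial, here is a minimal sketch of the classic workflow. It uses the standard-library `venv` module as a stand-in for `virtualenv` (an assumption; the two behave similarly for this purpose), and it skips installing `pip` so that the sketch also runs offline. The temporary directory is arbitrary.

```shell
# Create a throwaway virtual environment in a temporary directory.
# --without-pip keeps this sketch offline-friendly; drop it for real work.
workdir="$(mktemp -d)"
python3 -m venv --without-pip "$workdir/venv"

# Activate it: the shell now resolves `python` inside the environment.
. "$workdir/venv/bin/activate"
venv_prefix="$(python -c 'import sys; print(sys.prefix)')"
echo "Active environment: $venv_prefix"

# Leave the environment again.
deactivate
```

The environment is just a directory tree, which matches the definition at the top of this tutorial. Deleting the directory removes the environment.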

---
## 2. Review from Preliminary

### What is conda?
- An open source **package** and **environment** management system, 
used by the scientific software community
- From their website: *Package, dependency and environment management for any language—Python, R, Ruby, Lua, Scala, Java, JavaScript, C/ C++, FORTRAN*

### Flavors of conda
- Miniconda:
   
   A **lightweight** distribution of conda; it contains only conda, Python, and the packages they depend on.

- Anaconda:
 
   A **data science platform** distribution of conda; it comes with a lot of scientific Python packages.

---
## 3. Exercise: Let's try creating a python environment

### Scenario 1:
Bob is a post-doc. He has been programming in Python for a few years now,
and he is very comfortable managing his own Python environment.
He previously used `pip` and `virtualenv`, but now he's with the "cool" kids using `pipenv`.
Recently, his studies have shifted towards a geospatial focus, and he will need Python libraries such as gdal, fiona, and netcdf. Let's see what happens.

<script 
class="scenario-vid" 
id="asciicast-EhWPMkBn6I8jVHEvJrkwfVMAg" 
src="https://asciinema.org/a/EhWPMkBn6I8jVHEvJrkwfVMAg.js" 
data-speed=5 async></script>

**Bob got an error ...**

Bob looked at https://pypi.org/project/GDAL/, but it's still really confusing to set this up... HELP!

### Scenario 2:
On the other side of the world, we meet Sandy. She is an advanced undergrad who has attended one of the hackweeks at UW eScience. She has just started to really program in Python. Her senior thesis project requires her to analyze a geospatial entity. Like Bob, she knows that she will need to use gdal, fiona, and netcdf. Having learned about `conda` at the hackweek, she follows the `conda` workflow to create a new project. Let's see what happens.

<script id="asciicast-2Kz7LgkK7BJ1gg7gi1wtJZorn" 
src="https://asciinema.org/a/2Kz7LgkK7BJ1gg7gi1wtJZorn.js" 
data-speed=5 async></script>

**Sandy succeeded!**

## So, what's going on? Why didn't pipenv work?

- `pipenv` is essentially a convenient wrapper that uses `pip` and `virtualenv` under the hood
- `pip` is simply a Python package manager
- `pip` does not handle library dependencies outside of Python packages as well as it handles the Python packages themselves
- `pip` wheels can solve some of the lower-level dependency problems that we ran into in Bob's case, but the GDAL developers did not include these dependencies in the wheels, so users have to set them up themselves!

*NOTE: Conda can manage pip packages, but pip cannot manage conda packages*
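
To make the note above concrete, here is a minimal sketch of an environment file in which conda installs the geospatial libraries and then hands one package over to `pip`. The file name `geo-environment.yml`, the environment name `geo-env`, and the pip package name are all placeholders, and `netcdf4` is assumed to be the conda-forge package that corresponds to the "netcdf" library mentioned in the scenarios.

```shell
# Write an environment file that mixes conda packages with a pip-only package.
# "geo-env" and "some-pip-only-package" are placeholders, not real requirements.
cat > geo-environment.yml <<'EOF'
name: geo-env
channels:
  - conda-forge
dependencies:
  - python
  - gdal
  - fiona
  - netcdf4
  - pip
  - pip:
      - some-pip-only-package
EOF

# After replacing the placeholder with a real package name, build the environment:
#   conda env create --file geo-environment.yml
cat geo-environment.yml
```

Listing `pip` itself as a conda dependency means the `pip:` entries are installed by the `pip` that lives inside the new environment.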

---
## Concepts of conda

```
# Make sure conda is installed: print its version, then the details of the installation.
conda --version
conda info
```

<details><summary>Conda Help and Manual</summary>
<p>

To see the full documentation for any command, type the command followed by `--help`. For example, to learn about the `conda update` command:
```
conda update --help
```

</p>
</details>

### 1. Environments
![conda environments](https://geohackweek.github.io/datasharing/assets/img/conda/conda-env2.jpeg)

What is a conda environment?

- Similar to python virtual environments (venv)
- A set of isolated packages in a directory
- Able to be shared via environment files

```
# List out available environments
conda env list # The starred (*) environment is the currently active environment

# Create conda environment from command line (Not Best Practice)
conda create --name myenv --channel conda-forge python=3.6

# Activate conda environment
conda activate myenv

# Deactivate conda environment
conda deactivate

# Create conda environment from environment file (Recommended Best Practice)
conda env create --file environment.yml

# Removing conda environments
conda env remove --yes --name myenv
```

Sample of `environment.yml`
```
name: tutorial-env
channels:
- conda-forge
dependencies:
- python=3.7
- numpy
- matplotlib
- pandas
- bokeh
- rise
- nb_conda_kernels
- ipykernel
```

<details><summary><strong>Best practice to share environments</strong></summary>
<p>


1. When starting a new environment, always generate it from an environment file rather than the command line.
2. As you add packages to the environment, be sure to update the environment file.
3. Unless you have to (e.g. in production environments), avoid specifying the version of each package. This lets conda pick the most up-to-date versions that work together on each platform.

If you follow these guidelines, you should be able to give your environment file to anyone, and they will be able to install your packages with no problem.

</p>
</details>

### 2. Channels
![conda channels](https://geohackweek.github.io/datasharing/assets/img/conda/conda-channels.jpeg)

What is a conda channel?

- Similar to a Linux package repository (or an app store)
- The hosting service is provided for free by Anaconda Cloud

```
# List your channels in priority order
conda config --get channels

# If you have a few trusted channels that you prefer to use, you can
# pre-configure them so that you won't need to declare the channel
# explicitly every time you create an environment.
conda config --add channels conda-forge

# With strict priority and conda-forge at the top, all of your packages
# will come from conda-forge unless they only exist on defaults.
conda config --set channel_priority strict
conda config --set show_channel_urls True
```

**NOTE: With strict channel priority, a package is installed from the highest-priority channel that provides it, even if another channel has a newer version!**
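
To check which channel order and priority mode are actually in effect, the configuration can be read back. This is a small sketch; the `if` guard is only there so the snippet does nothing harmful on a machine without conda.

```shell
if command -v conda >/dev/null 2>&1; then
    # Channels are listed from highest to lowest priority.
    conda config --show channels
    # Possible values include "strict", "flexible", and "disabled".
    conda config --show channel_priority
    config_status="shown"
else
    echo "conda is not installed; no configuration to show"
    config_status="skipped"
fi
```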

---
#### Conda Forge (https://anaconda.org/conda-forge)

Conda Forge is a community-led collection of recipes, build infrastructure, and distributions for the conda package manager.

For more info about how to put your packages into the conda-forge channel, watch the PyCon talk by Filipe, one of the conda-forge lead developers: https://www.youtube.com/watch?v=qJFkIuzD6tI

---

### 3. Packages

What is a conda package?

- A compiled software package; installing it also installs all of its dependencies, even the lower-level ones
- Cross platform
- Made from **recipes**


You can search for conda packages at https://anaconda.org/ or from the terminal, as shown below.

```
# Look at the packages installed in the active environment
conda list

# Search for gdal conda packages
conda search gdal

# Install a single conda package into the active environment
conda install -c conda-forge gdal

# Or install multiple packages at once
conda install -c conda-forge gdal fiona

# Remove a conda package from the myenv environment
conda remove -n myenv gdal
```
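
After an install (and before any removal), it can be reassuring to confirm that the package really works inside the environment. The sketch below assumes the environment is called `myenv`, as in the earlier examples, and that gdal's Python bindings were installed with it; `conda run` executes a command inside an environment without activating it.

```shell
env_name="myenv"  # assumption: the environment created earlier in this tutorial

if command -v conda >/dev/null 2>&1 && conda env list | grep -q "^${env_name} "; then
    # List installed packages whose names match "gdal", with version and channel.
    conda list --name "$env_name" gdal
    # Import the Python bindings inside the environment without activating it.
    conda run --name "$env_name" python -c \
        'from osgeo import gdal; print(gdal.VersionInfo("RELEASE_NAME"))' \
        || echo "gdal is not importable in ${env_name}"
    check_status="checked"
else
    echo "conda or the ${env_name} environment is missing; skipping the check"
    check_status="skipped"
fi
```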

### 4. Recipes
A recipe contains the instructions for building a conda package, along with the package's metadata.

```
package:
  name: pandas
  version: 
source:
  url: https://github.com/pydata/pandas/archive/v.tar.gz
  sha256: d9f67bb17f334ad395e01b2339c3756f3e0d0240cb94c094ef711bbfc5c56c80
build:
  number: 0
  script: python setup.py install --single-version-externally-managed --record=record.txt
about:
  home: http://pandas.pydata.org
  license: BSD 3-clause
  summary: 'High-performance, easy-to-use data structures and data analysis tools.'
extra:
  recipe-maintainers:
    - jreback
    - jorisvandenbossche
    - TomAugspurger
```
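
A recipe like this conventionally lives in a file named `meta.yaml` inside its own directory, and it is turned into a package with the `conda build` command from the separate `conda-build` package. The sketch below is only an outline: `./pandas-recipe` is a hypothetical directory, and the sample recipe above would need its blank version fields filled in before it could build.

```shell
recipe_dir="./pandas-recipe"  # hypothetical directory that contains meta.yaml

if command -v conda-build >/dev/null 2>&1 && [ -f "$recipe_dir/meta.yaml" ]; then
    # Build the package described by the recipe in recipe_dir.
    conda build "$recipe_dir"
    build_status="built"
else
    echo "conda-build or ${recipe_dir}/meta.yaml is missing; skipping the build"
    build_status="skipped"
fi
```

If `conda-build` is missing, `conda install conda-build` in the base environment provides it.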

**For the official walkthrough, go to https://bit.ly/tryconda**

**For a conda cheat sheet, go to https://tinyurl.com/y49fjnoj**