# Conda

We will start this tutorial by looking at a picture of the perfect Python environment.
![Perfect Python Environment](https://geohackweek.github.io/datasharing/assets/img/conda/conda-env.jpeg)

**Definition of a Python virtual environment: a self-contained directory tree that contains a Python installation for a particular version of Python, plus a number of additional packages.**

---
## 0. Python Environment Discussion

What are the benefits of having a well-defined Python environment?

*Brainstorming: [https://etherpad.wikimedia.org/p/geohack-conda](https://etherpad.wikimedia.org/p/geohack-conda)*

<details><summary>Solution</summary>
<p>

1. Avoids future breakage when a dependency changes
2. Allows better collaboration within a team
3. Improves reliability

</p>
</details>

---
## 1. Ways to create environments and manage packages in Python

1. The classic approach: `pip` as the Python package manager,
along with `virtualenv` as the Python environment manager.
2. The modern classic: `pipenv`, which manages both Python packages and environments.
3. What we'll learn here: `conda`.
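
As a rough sketch of how the three options differ in practice, the functions below collect the typical commands for each one. The environment names (`venv`, `myenv`) and the `requests` package are placeholders chosen for illustration, and the functions are only defined here, not called, since each of them would download packages.

```shell
# Option 1: pip + virtualenv (two separate tools)
setup_with_virtualenv() {
    pip install virtualenv          # install the environment manager
    virtualenv venv                 # create an environment in ./venv
    . venv/bin/activate             # activate it in the current shell
    pip install requests            # install a package into it
}

# Option 2: pipenv (one tool for both jobs)
setup_with_pipenv() {
    pip install pipenv
    pipenv install requests         # creates the environment and a Pipfile
}

# Option 3: conda (one tool for both jobs, not limited to Python packages)
setup_with_conda() {
    conda create --yes --name myenv python requests
}
```

Calling one of these functions, for example `setup_with_conda`, would run that workflow in the current directory.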

---
## 2. Review from Preliminary

### What is conda?
- An open source **package** and **environment** management system, 
used by the scientific software community
- From their website: *Package, dependency and environment management for any language—Python, R, Ruby, Lua, Scala, Java, JavaScript, C/ C++, FORTRAN*

### Flavors of conda
- Miniconda:

   A **lightweight** distribution of conda; it contains only conda, Python, and the packages they depend on.

- Anaconda:

   A **data science platform** distribution of conda; it comes with a large collection of scientific Python packages preinstalled.
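
One quick way to see what is installed on a machine is to ask conda itself. This is a minimal sketch that assumes nothing beyond `conda` possibly being on the `PATH`; the function name `describe_conda` is made up for this example.

```shell
describe_conda() {
    if command -v conda >/dev/null 2>&1; then
        conda --version             # the conda release in use
        conda info --base           # directory of the base installation
    else
        echo "conda not found on PATH"
    fi
}

describe_conda
```

With the default installer settings the base directory is usually named after the flavor (for example a path ending in `miniconda3` or `anaconda3`), but that is only a convention and can be changed at install time.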

---
## 3. Exercise: Let's try creating a Python environment

### Scenario 1:
Bob is a postdoc. He has been programming in Python for a few years now,
and he is very comfortable managing his own Python environment.
He used to rely on `pip` and `virtualenv`, but now he's with the "cool" kids using `pipenv`.
Recently his studies have been shifting towards a geospatial focus, so he will need Python libraries such as gdal, fiona, and netcdf. Let's see what happens.

In [5]:
%%bash

# First Bob installs pipenv
pip install pipenv

Collecting pipenv
  Downloading https://files.pythonhosted.org/packages/13/b4/3ffa55f77161cff9a5220f162670f7c5eb00df52e00939e203f601b0f579/pipenv-2018.11.26-py3-none-any.whl (5.2MB)
Collecting virtualenv (from pipenv)
  Downloading https://files.pythonhosted.org/packages/db/9e/df208b2baad146fe3fbe750eacadd6e49bcf2f2c3c1117b7192a7b28aec4/virtualenv-16.7.2-py2.py3-none-any.whl (3.3MB)
Collecting virtualenv-clone>=0.2.5 (from pipenv)
  Downloading https://files.pythonhosted.org/packages/ba/f8/50c2b7dbc99e05fce5e5b9d9a31f37c988c99acd4e8dedd720b7b8d4011d/virtualenv_clone-0.5.3-py2.py3-none-any.whl
Installing collected packages: virtualenv, virtualenv-clone, pipenv
Successfully installed pipenv-2018.11.26 virtualenv-16.7.2 virtualenv-clone-0.5.3


In [24]:
%%bash

# Next he creates a folder for his new geoproject
mkdir geoproject

In [35]:
%%bash

# Next Bob creates a requirements.txt so he can share this later
# In requirements.txt
# ipython
# requests
# fiona
# gdal
# netCDF4
# (> overwrites the file, so rerunning this cell does not duplicate entries)
cat <<EOT > geoproject/requirements.txt
ipython
requests
fiona
gdal
netCDF4
EOT

In [36]:
%%bash

# Now he installs those packages
pipenv install -r geoproject/requirements.txt

Requirements file provided! Importing into Pipfile…
Pipfile.lock not found, creating…
Locking [dev-packages] dependencies…
Locking [packages] dependencies…
⠼ Locking..✘ Locking Failed! 
Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.6/site-packages/pipenv/resolver.py", line 126, in <module>
    main()
  File "/srv/conda/envs/notebook/lib/python3.6/site-packages/pipenv/resolver.py", line 119, in main
    parsed.requirements_dir, parsed.packages)
  File "/srv/conda/envs/notebook/lib/python3.6/site-packages/pipenv/resolver.py", line 85, in _main
    requirements_dir=requirements_dir,
  File "/srv/conda/envs/notebook/lib/python3.6/site-packages/pipenv/resolver.py", line 69, in resolve
    req_dir=requirements_dir
  File "/srv/conda/envs/notebook/lib/python3.6/site-packages/pipenv/utils.py", line 726, in resolve_deps
    req_dir=req_dir,
  File "/srv/conda/envs/notebook/lib/python3.6/site-packages/pipenv/utils.py", line 480, in actually_resolve_deps
    reso

CalledProcessError: Command 'b'\n# Now he installs those packages\npipenv install -r geoproject/requirements.txt\n'' returned non-zero exit status 1.

**Bob got an error ...**

Bob looked at https://pypi.org/project/GDAL/, but it's still really confusing to set this up... HELP!

In [28]:
%%bash

# clean up before scenario 2
rm -rf geoproject

### Scenario 2:
On the other side of the world, we meet Sandy. She is an advanced undergrad who has attended one of the hackweeks at UW eScience. She has only just started to really program in Python. Her senior thesis project requires her to analyze a geospatial entity. Like Bob, she knows that she will need gdal, fiona, and netcdf. Having learned about `conda` at the hackweek, she follows the `conda` workflow to create her new project. Let's see what happens.

In [29]:
%%bash

# Sandy has already installed conda on her Linux machine, so her first step is to make a new directory for the project
mkdir geoproject

In [30]:
%%bash

# Next Sandy creates an environment.yml so she can share this later
# In environment.yml
# name: geoproj
# channels:  
#   - conda-forge
# dependencies:
#   - python=3.6
#   - ipython
#   - requests
#   - fiona
#   - gdal
#   - netCDF4
# (> overwrites the file, so rerunning this cell does not duplicate entries)
cd geoproject
cat <<EOT > environment.yml
name: geoproj
channels:  
  - conda-forge
dependencies:
  - python=3.6
  - ipython
  - requests
  - fiona
  - gdal
  - netCDF4
EOT

In [37]:
%%bash

# Now she creates the environment, which installs those packages
conda env create -f geoproject/environment.yml

Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... done

Downloading and Extracting Packages
kealib-1.4.10        | 173 KB    | ########## | 100% 
xorg-libx11-1.6.8    | 907 KB    | ########## | 100% 
attrs-19.1.0         | 32 KB     | ########## | 100% 
openjpeg-2.3.1       | 470 KB    | ########## | 100% 
poppler-0.67.0       | 8.9 MB    | ########## | 100% 
lz4-c-1.8.3          | 187 KB    | ########## | 100% 
xorg-libxext-1.3.4   | 51 KB     | ########## | 100% 
numpy-1.17.0         | 5.2 MB    | ########## | 100% 
pygments-2.4.2       | 661 KB    | ########## | 100% 
wcwidth-0.1.7        | 17 KB     | ########## | 100% 
cligj-0.5.0          | 8 KB      | ########## | 100% 
xorg-libsm-1.2.3     | 25 KB     | ########## | 100% 
chardet-3.0.4        | 190 KB    | ########## | 100% 
ipython-7.7.0        | 1.1 MB    | ########## | 100% 
libxcb-1.13          | 396 KB    | ########## | 100% 
urllib3-1.25.3       | 187 KB    | ########## | 1



  current version: 4.7.10
  latest version: 4.7.11

Please update conda by running

    $ conda update -n base conda




**Sandy succeeded: the install finished after a few minutes.**

In [39]:
%%bash

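# Sandy activates her new geoproj Python environment and checks whether gdal works.
# (In an interactive shell set up with `conda init`, `conda activate geoproj` is the recommended form.)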
source activate geoproj
gdalinfo --version
ogr2ogr --version

GDAL 2.4.2, released 2019/06/28
GDAL 2.4.2, released 2019/06/28
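
The command-line tools are only half of the story, since Sandy will mostly use these libraries from Python. A minimal sketch of that check follows, assuming the `geoproj` environment is active so that `python` resolves to it. The helper name `check_imports` is made up for this example; `osgeo` is the module name under which the GDAL Python bindings are imported.

```shell
# Report, for each module name given, whether `python` can import it
check_imports() {
    for module in "$@"; do
        if python -c "import ${module}" >/dev/null 2>&1; then
            echo "${module}: OK"
        else
            echo "${module}: MISSING"
        fi
    done
}

check_imports osgeo fiona netCDF4
```

Run outside the environment, the same function is a quick way to see which of the three libraries the default `python` is missing.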


## So, what's going on? Why didn't pipenv work?

- `pipenv` is basically a nice wrapper that uses `pip` and `virtualenv` under the hood
- `pip` is simply a Python package manager
- `pip` manages Python packages only; it does not manage the non-Python libraries (such as the GDAL C/C++ library) that those packages depend on
- `pip` wheels can solve some of the lower-level dependency problems that Bob ran into, because a wheel can bundle compiled libraries, but the GDAL developers did not bundle these dependencies into wheels, so users have to set them up themselves!
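
For the GDAL case specifically, the Python package on PyPI is built against a GDAL library that must already be on the machine, and `gdal-config` is the helper program that ships with that library. The sketch below only reports whether it is present; the function name `report_gdal_library` is made up for this example.

```shell
report_gdal_library() {
    if command -v gdal-config >/dev/null 2>&1; then
        echo "GDAL library found, version $(gdal-config --version)"
    else
        echo "gdal-config not found: no system GDAL library for pip to build against"
    fi
}

report_gdal_library
```

On a machine like Bob's the second message is the likely outcome. Inside Sandy's conda environment the library is found, because conda installed it alongside the Python bindings.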

*NOTE: Conda can manage pip packages, but pip cannot manage conda packages*
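
To illustrate the note, here is a sketch of an `environment.yml` that mixes conda packages with a `pip:` subsection, written with the same heredoc approach used in the exercise. The file goes into a scratch directory created by `mktemp`, and `some-pip-only-package` is a placeholder rather than a real package name, so it would need to be swapped for a real one before creating the environment.

```shell
# Write the example into a scratch directory so nothing existing is overwritten
workdir=$(mktemp -d)

cat <<EOT > "${workdir}/environment.yml"
name: geoproj
channels:
  - conda-forge
dependencies:
  - python=3.6
  - gdal
  - pip
  - pip:
    - some-pip-only-package   # placeholder, not a real package name
EOT

# Show the file that was written
cat "${workdir}/environment.yml"
```

Listing `pip` itself as a conda dependency ensures that the packages under `pip:` are installed by the environment's own pip. The environment would then be created with the same `conda env create -f` command as before.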