# Package management in Python and beyond

## File based system organization
>In the beginning, there was the command line, the compiler and the file system. And it was good, until it wasn't.

In the early days of *nix, most programs were written in C and compiled at the command line. It quickly became obvious that organizing files into a predictable directory structure made everyone's life a little easier. This led to the "standard" (as long as folks generally follow it) of separating the binaries that run from the libraries those binaries call upon. In "classic" Unix, these directories were often placed in the "user" directory, /usr:

```
/usr/
    ├── bin
    ├── home
    ├── lib
```

/bin contained runnable programs. /home contained user directories and files. /lib contained the libraries needed to run the programs in /bin and/or programs written by users. You can see remnants of this file structure in all modern *nix-based OSs, including macOS and all variants of Linux. Of course, each distribution does it a little differently, but the basic structure is still there. There is an exhaustive history on [Wikipedia](https://en.wikipedia.org/wiki/Unix_filesystem).

## Package managers

Why should we care? Once Linux truly took off around the turn of the century, different ways of arranging the code split off into separate distributions. Red Hat, Debian, SUSE and others all had their preferred arrangement, often related to where their maintainers had come from (Solaris, BSD, AIX, etc.). The idea was the same but the names were different. This became a problem when your program was looking for a basic library of functions like *libgcc.a* (or *libgfortran.a*, etc.). If you had some special software installed from a vendor, did those libraries end up in /usr/lib or /usr/superSpecial1.2/lib?

This problem was eventually solved by using package managers at the system level, most prominently *yum* (Red Hat based distros) or *apt* (Debian based distros).

Then Python arrived and started getting used a lot for system code as well as everyday programming tasks. Lots and lots of additional libraries proliferated and it was, for a while, a *real* mess, particularly with Red Hat based distributions, where the system Python was also used for managing the system. If a user wanted a different, more up-to-date version, all kinds of problems could happen when the system needed python1.5 and you were on python2.4.

![XKCD strikes again!](https://imgs.xkcd.com/comics/python_environment_2x.png "Python Environment")
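
A quick way to find out which Python you are actually running (and whether it is new enough for what you want to do) is to ask the interpreter itself. The snippet below is a minimal sketch using only the standard library; the helper names `interpreter_summary` and `meets_minimum` are made up for this illustration.

```python
import sys


def interpreter_summary():
    """Return the path, version, and install prefix of the running Python."""
    return {
        "executable": sys.executable,
        "version": ".".join(str(part) for part in sys.version_info[:3]),
        "prefix": sys.prefix,
    }


def meets_minimum(required, current=None):
    """True if `current` (default: the running interpreter) is at least `required`."""
    if current is None:
        current = tuple(sys.version_info[:3])
    # Tuples compare element by element, so (2, 4, 0) >= (1, 5) is True.
    return tuple(current) >= tuple(required)


for key, value in interpreter_summary().items():
    print(f"{key}: {value}")

# The old system-vs-user mismatch described above, expressed as version tuples.
print("python2.4 is new enough for code that needs 1.5:", meets_minimum((1, 5), (2, 4, 0)))
print("python1.5 is new enough for code that needs 2.4:", meets_minimum((2, 4), (1, 5, 2)))
```

If `executable` points somewhere you did not expect (for example the system Python rather than your environment), that usually explains a surprising import error.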

## Resources to learn more

* [Blog Post on python package managers](https://dublog.net/blog/so-many-python-package-managers/)
* [micromamba: https://mamba.readthedocs.io/en/latest/user_guide/micromamba.html](https://mamba.readthedocs.io/en/latest/user_guide/micromamba.html)
* [miniforge: https://github.com/conda-forge/miniforge](https://github.com/conda-forge/miniforge)
* [pixi: https://prefix.dev/](https://prefix.dev/)

## Ocean Hack Week Tutorials and Videos
* [General OHW Resources Page](https://oceanhackweek.org/resources/prep/)
* [Jupyter Lab for OHW](https://oceanhackweek.org/resources/prep/jupyterhub.html)
* [Software Installation Survival Guide (conda, etc)](https://www.youtube.com/embed/pIJXHyLcxjY?si=faKLciP2fuQSmIql)


# >>>>>LIVE DEMO TIME<<<<<

## 1) Look at your local directory; it should match the Jupyter Lab file list

In [3]:
!ls

simple_kml_example.ipynb


## 2) Open the simple kml example
Click on the notebook in this directory called "simple_kml_example"

## 3) Run the first cell
This cell should fail when it tries to import simplekml, as that library is not installed in OHW's Jupyter Hub.

```
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[2], line 3
      1 import pandas as pd
      2 import xarray as xr
----> 3 import simplekml
      5 #dependencies for range ring
      6 from functools import partial

ModuleNotFoundError: No module named 'simplekml'
```
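
If you would rather find out what is missing before a cell blows up, you can ask Python whether it can locate each module without importing it. This is a minimal sketch using the standard library's `importlib.util.find_spec`; the helper name `missing_modules` is made up for this example, and it is only meant for top-level module names like the ones in the notebook's first cell.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The three imports from the traceback above.
needed = ["pandas", "xarray", "simplekml"]
missing = missing_modules(needed)

if missing:
    print("Not installed in this environment:", ", ".join(missing))
else:
    print("All of these imports should succeed.")
```

Note that the import name and the package name you install are not always the same, so treat the output as a hint about what to go looking for.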

## 4) Use pixi to install local dependency for this directory

First, open a Jupyter Hub terminal (+ on the upper left, then select Terminal). To use pixi, we first need to initialize the directory with pixi (comparable to `conda create --name [ENV_NAME]`).

### List the files in your home directory and create a working directory

```
jovyan@jupyter-cpsarason:~$ ls
ohw-tutorials  shared  shared-readwrite
jovyan@jupyter-cpsarason:~$ mkdir package-mgmt-test
jovyan@jupyter-cpsarason:~$ cd package-mgmt-test
```

### View installed packages in the hub using pixi
```
jovyan@jupyter-cpsarason:~/package-mgmt-test$ pixi list
```

### grep the output for simplekml
```
jovyan@jupyter-cpsarason:~/package-mgmt-test$ pixi list | grep simplekml
jovyan@jupyter-cpsarason:~/package-mgmt-test$
```

### Create a local pixi installation that contains simplekml
```
jovyan@jupyter-cpsarason:~/package-mgmt-test$ pixi init
Created /home/jovyan/package-mgmt-test/pixi.toml
jovyan@jupyter-cpsarason:~/package-mgmt-test$ pixi add simplekml --manifest-path ./pixi.toml
Added simplekml >=1.3.6,<2
```
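
The `>=1.3.6,<2` that pixi printed is a version range: any simplekml release from 1.3.6 up to (but not including) 2 is acceptable. The sketch below spells out what that range means for plain numeric versions. It is a simplified illustration, not how pixi or conda actually match versions (real specs also cover pre-releases, wildcards, and more), and the helper names `parse_version` and `satisfies` are made up for this example.

```python
def parse_version(text):
    """Turn '1.3.6' into (1, 3, 6). Only plain numeric versions are handled."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, spec):
    """Check a version against a comma-separated spec such as '>=1.3.6,<2'."""
    current = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        # Try two-character operators before one-character ones.
        for op in (">=", "<=", "==", ">", "<"):
            if clause.startswith(op):
                bound = parse_version(clause[len(op):].strip())
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
        # Pad with zeros so that (2,) compares like (2, 0, 0).
        width = max(len(current), len(bound))
        a = current + (0,) * (width - len(current))
        b = bound + (0,) * (width - len(bound))
        ok = {">=": a >= b, "<=": a <= b, "==": a == b, ">": a > b, "<": a < b}[op]
        if not ok:
            return False
    return True


print(satisfies("1.3.6", ">=1.3.6,<2"))  # True
print(satisfies("2.0.0", ">=1.3.6,<2"))  # False
```

The upper bound is what keeps a future, possibly incompatible, major release from sneaking into your environment.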

## 5) Open up a "pixi jupyter" tab and import simplekml to test
You will likely run into an error:
```
To run the Pixi - Python 3 (ipykernel) kernel, you need to add the ipykernel package to your project dependencies.
You can do this by running 'pixi add ipykernel' in your project directory and restarting your kernel.
```


### Add ipykernel to the local pixi install
```
jovyan@jupyter-cpsarason:~/package-mgmt-test$ pixi add ipykernel --manifest-path ./pixi.toml
Added ipykernel >=6.29.5,<7
```

### Restart kernel and try simple_kml_example again
What happens?

This will probably fail: the Jupyter Hub kernel is now running from the environment defined by your local pixi.toml file, and it won't find the notebook's other dependencies (numpy, pandas, etc.).

```
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[3], line 1
----> 1 import pandas as pd
      2 import xarray as xr
      3 import simplekml

ModuleNotFoundError: No module named 'pandas'
```
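
If you are not sure which environment the kernel is running from, you can check from inside a notebook cell. The sketch below assumes that pixi keeps a project's environment in a `.pixi` folder inside the project directory (that is my understanding of its default layout, so treat it as an assumption); the helper name `looks_like_pixi_env` is made up for this example.

```python
import sys
from pathlib import Path


def looks_like_pixi_env(prefix):
    """Guess whether an environment prefix lives inside a project's .pixi folder."""
    # Assumption: pixi puts project environments under <project>/.pixi/
    return ".pixi" in Path(prefix).parts


print("kernel prefix:", sys.prefix)
print("looks like a pixi environment:", looks_like_pixi_env(sys.prefix))
```

If the prefix is inside your project directory, the kernel only sees the packages you have added with `pixi add`, which is why pandas went missing here.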


### Add other dependencies with pixi, using local manifest path
```
jovyan@jupyter-cpsarason:~/package-mgmt-test$ pixi add pandas xarray pyproj shapely --manifest-path=./pixi.toml
Added pandas >=2.2.2,<3
Added xarray >=2024.7.0,<2025
Added pyproj >=3.6.1,<4
Added shapely >=2.0.6,<3
```

### Try running the example notebook again
SUCCESS!


## Pixi Resources
* https://prefix.dev/blog/pixi_a_fast_conda_alternative
* https://pixi.sh/v0.24.2/tutorials/python/
* https://pixi.sh/dev/switching_from/conda/

## Thanks
Many thanks to @ocepaf (Filipe Fernandes) for the excellent package management introduction he presented at OHW23 (linked above) and to @abkfenris (Alex Kerney) for the introduction and walkthrough on how this stuff should work on the Jupyter Hub. All mistakes are mine; hit me with an issue on GitHub or email me at cbps@uw.edu.

Last Update Sun Aug 25 13:25:08 PDT 2024, cbps@uw.edu
