# Portable Jupyter Notebooks and libraries

Gregor von Laszewski, laszewski@gmail.com

In this notebook we demonstrate a portable way to develop large Jupyter notebooks.

## Editor

Although Colab provides an online editor, it has limited features and
we recommend avoiding it for development. The same applies to Jupyter Notebook
and its successor JupyterLab. They are good for interactive experiments
with notebooks, and for notebook features that more powerful editors do not support.

Two other editors are available: VSCode and PyCharm. We recommend PyCharm as it
has many code checking and formatting features that are well suited for developing
large Jupyter scripts.

### PyCharm configuration

It is important to create a Python venv and register it with PyCharm so that
the editor uses the same environment as the command line. We name the
environment ENV3. To create it, use

```bash
python -m venv ~/ENV3
```

To activate it in the command line you say

```bash
source ~/ENV3/bin/activate
```

On Windows, in a bash-like shell such as Git Bash, you say

```bash
source ~/ENV3/Scripts/activate
```
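
To confirm from within Python that the intended environment is active, a small check such as the following can help. It is a minimal sketch using only the standard library; the function name `in_virtualenv` is our own and not part of any library mentioned here.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```

If ENV3 is active, the interpreter path printed should lie inside the `~/ENV3` directory.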

You can then install modules in this environment with pip or a requirements.txt
file after you activate it. However, if you use Google Colab, it
is easier to embed such activities in your program. This makes them portable
across Colab and your local machine.
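
When the same notebook runs both locally and on Colab, it can be useful to know which of the two it is running on. The sketch below uses a commonly seen idiom and assumes that Colab has loaded the `google.colab` package into its kernel; the function name `running_in_colab` is hypothetical.

```python
import sys


def running_in_colab() -> bool:
    """Return True when the google.colab package has been loaded."""
    # Assumption: a Colab kernel has google.colab in sys.modules, while a
    # local Python session normally does not.
    return "google.colab" in sys.modules


if running_in_colab():
    print("running on Google Colab")
else:
    print("running on a local machine or another environment")
```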

### Installing Libraries

Here we give an example of how to install cloudmesh, a library with many
useful add-ons that make managing parts of a large script much easier.
To install the library, use

In [None]:
import os
os.system("pip install -U cloudmesh.common")
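
A possible variation is to call pip only when the module is not yet importable, and to run pip through the interpreter that executes the notebook so that the package lands in the same environment. This is a sketch using the standard library; the helper names `is_importable` and `ensure_package` are our own.

```python
import importlib
import importlib.util
import subprocess
import sys


def is_importable(module_name: str) -> bool:
    """Return True if module_name can be found by the import system."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised for dotted names when the parent package is missing.
        return False


def ensure_package(module_name: str, pip_name: str) -> bool:
    """Install pip_name if module_name is missing. Return True if pip was run."""
    if is_importable(module_name):
        return False
    # sys.executable is the interpreter running this code, so the package
    # is installed into the environment the notebook actually uses.
    subprocess.check_call([sys.executable, "-m", "pip", "install", "-U", pip_name])
    importlib.invalidate_caches()
    return True


print(is_importable("json"))
```

For the library used here, the call would be `ensure_package("cloudmesh.common", "cloudmesh.common")`.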

Now you can use the many useful features of cloudmesh. One is the easy availability
of a banner that makes print statements more visible by putting a frame around them.

In [5]:
from cloudmesh.common.util import banner
banner("hallo")


# ----------------------------------------------------------------------
# hallo
# ----------------------------------------------------------------------
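
To illustrate the idea of framing a message, here is a plain-Python stand-in. It is not the cloudmesh implementation; the name `simple_banner` and the default width are assumptions chosen to resemble the output above.

```python
def simple_banner(text: str, width: int = 70, prefix: str = "# ") -> str:
    """Return text framed by a dashed line above and below."""
    line = prefix + "-" * width
    return "\n".join([line, prefix + text, line])


print(simple_banner("hallo"))
```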



One additional function is an enhanced command to run command-line programs from
within Jupyter notebooks as well as regular Python programs. This method is
preferred as it allows portable program execution across a variety of operating
systems, Google Colab, and other environments. Thus we **DO NOT** recommend
that you use `!` or `!!` for command execution, as it will potentially make it
more difficult to later move portions of the notebook into a Python library.
Instead, just use the `Shell` class from cloudmesh,
which contains many useful methods that are portable between operating systems. Such
methods include, for example, `pwd`, `mkdir`, `run`, `grep`, `cm_grep` (a grep
portable across Windows, Linux, and Mac), and so on. Please refer to
[Shell.py](https://github.com/cloudmesh/cloudmesh-common/blob/main/cloudmesh/common/Shell.py).
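
The reason `!` hinders reuse is that it is notebook syntax, not Python, so a cell using it cannot be moved into a module unchanged. The sketch below shows the same principle with the standard library `subprocess` module. It is not how `Shell` is implemented, and the helper name `run_command` is our own.

```python
import subprocess
import sys


def run_command(args: list) -> str:
    """Run a program without a shell and return its standard output as text."""
    # check=True raises CalledProcessError if the program exits with an error.
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout


# Use the current interpreter as the program so the example runs on any
# operating system.
output = run_command([sys.executable, "-c", "print('hello')"])
print(output.strip())
```

Because this is regular Python, the function works the same way in a notebook cell and in a library module.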

In [3]:
from cloudmesh.common.Shell import Shell

In [6]:
basedir = Shell.pwd()
print(basedir)

/Users/grey/Desktop/github/dsc-spidal/dl-hec


In [None]:
datadir = f"{basedir}/data"

In [37]:
content = Shell.mkdir(datadir)
os.system(f"echo >> {datadir}/gregor.txt")
content = Shell.run(f"ls {datadir}")
print(content)

gregor.txt
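
The cell above builds the path with a `/` in an f-string and relies on the `echo` and `ls` programs being available. As a hedged alternative for environments where they are not, the same steps can be sketched with the standard library `pathlib` module. The helper name `create_data_file` is hypothetical, and a temporary directory stands in for `basedir`.

```python
import tempfile
from pathlib import Path


def create_data_file(basedir: Path, filename: str = "gregor.txt") -> list:
    """Create basedir/data, add an empty file, and return the directory listing."""
    datadir = basedir / "data"  # the / operator joins path components portably
    datadir.mkdir(parents=True, exist_ok=True)
    (datadir / filename).touch()
    return sorted(p.name for p in datadir.iterdir())


with tempfile.TemporaryDirectory() as tmp:
    print(create_data_file(Path(tmp)))
```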



## Installing libraries directly from github

When developing large libraries in a team, it is best practice to use a code
repository such as GitHub. Installing libraries into the venv on your local
machine is simple and is usually done with a requirements.txt file. However,
you can also install a library directly from a checkout of its code by running
`pip install -e .` in the checked-out directory (the dot is important).
This makes modifications on your local machine directly available to you
without reinstalling the library after each change.
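
To check which copy of a library Python actually imports, for example the checked-out directory after an editable install, you can inspect the file the module was loaded from. This is a small sketch using the standard library; the helper name `module_location` is our own, and `json` is used only because it is always available.

```python
import importlib
from pathlib import Path


def module_location(module_name: str) -> Path:
    """Return the file from which module_name is loaded."""
    module = importlib.import_module(module_name)
    # Note: namespace packages have no __file__; this sketch assumes a
    # regular module or package.
    return Path(module.__file__).resolve()


print(module_location("json"))
```

For a library installed in editable mode, the printed path is expected to point into your checkout and not into the `site-packages` directory of the venv.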

However, on Google Colab you may need to do this directly from the notebook.
Note that if you make a change, the notebook kernel needs to be
stopped or interrupted and restarted before you load the new library. It is
not sufficient to just go to the cell and rerun it. This means you have to
start the notebook from the beginning. For this reason it is best to do all
development first on the local computer before you move to Colab. It will
simplify your development cycle, especially when developing libraries.
Thus notebooks are not really suitable for best practices in software
engineering, but are great for interactive exploration of principles
and experiments.
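
The reason rerunning a cell is not enough is that Python keeps imported modules in `sys.modules`, so a second `import` returns the already loaded code. The following self-contained sketch demonstrates this with a throwaway module written to a temporary directory; the module name `demo_cached_module` is hypothetical. `importlib.reload` re-executes a single module, but it does not update objects that were already imported from it, so it is not a general substitute for restarting the kernel.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_import_cache() -> tuple:
    """Return the value seen at first import, after re-import, and after reload."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "demo_cached_module.py"
        source.write_text("VERSION = 1\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("demo_cached_module")
            first = module.VERSION

            # Simulate installing a newer version of the library.
            source.write_text("VERSION = 22\n")

            # A second import is served from sys.modules: still the old code.
            module = importlib.import_module("demo_cached_module")
            second = module.VERSION

            # reload re-executes the module source and picks up the change.
            importlib.invalidate_caches()
            reloaded = importlib.reload(module).VERSION
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("demo_cached_module", None)
    return first, second, reloaded


print(demo_import_cache())
```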

To include a library that is hosted on GitHub, we can simply install the latest
version of the code directly in Google Colab with the following command. Do not
forget to interrupt the kernel and restart it before running this command, as
otherwise an earlier version may still be available.

In [7]:
print(Shell.run("pip install -U git+https://github.com/DSC-SPIDAL/dl-hec.git"))

Collecting git+https://github.com/DSC-SPIDAL/dl-hec.git
  Cloning https://github.com/DSC-SPIDAL/dl-hec.git to /private/var/folders/q5/s8_pcggn5f73xnz11zjqrhlw0000gp/T/pip-req-build-ly081agt
  Running command git clone --filter=blob:none --quiet https://github.com/DSC-SPIDAL/dl-hec.git /private/var/folders/q5/s8_pcggn5f73xnz11zjqrhlw0000gp/T/pip-req-build-ly081agt
  Resolved https://github.com/DSC-SPIDAL/dl-hec.git to commit f7eec66ce78d4ff7baaad0be503cbc760ee95f96
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Building wheels for collected packages: hec
  Building wheel for hec (setup.py): started
  Building wheel for hec (setup.py): finished with status 'done'
  Created wheel for hec: filename=hec-1.0-py3-none-any.whl size=141946 sha256=83bd2ef18c5b388136a50e5e3fad3d13ce941b4865f79c92c8f9a399ce6a640c
  Stored in directory: /private/var/folders/q5/s8_pcggn5f73xnz11zjqrhlw0000gp/T/pip-ephem-wheel-cache-0s65ymuk/wheels/5e/75/dc/9f2c8

Now you can use the functions defined in `hec.util`.

In [8]:
from hec.util import timenow

In [9]:
t = timenow()
print(t)

01/03/2023, 08:31:13 UTC


In [10]:
from hec.util import NaN
print(NaN)

nan
