# Running `ws3` and `libcbm` as a two-stage sequential pipeline

We run `ws3` and `libcbm` as a two-stage sequential software pipeline. This is the _de facto_ standard way to run CBM models: a forest estate model runs first, and its output becomes the input to the CBM. The pipeline stages can be _soft-linked_ (i.e., output from the stage-1 model is exported to disk in a specific format, then read and imported into the stage-2 model), or _hard-linked_ (i.e., output from the stage-1 model is passed directly to the stage-2 model at runtime, with no intermediate disk-based data drop). Either way, the result from running the pipeline should be the same.
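
The difference between the two linking styles can be sketched in a few lines of Python. This is only an illustration: `run_stage1` and `run_stage2` are hypothetical stand-ins (they are not `ws3` or `libcbm` functions), the CSV layout is made up, and the "carbon model" is a placeholder sum. The point is that both routes feed the same data to stage 2, so both should give the same answer.

```python
import csv
import tempfile
from pathlib import Path


def run_stage1():
    """Hypothetical stand-in for a forest estate model: harvest area by period."""
    return [
        {"period": 1, "harvest_area_ha": 120.0},
        {"period": 2, "harvest_area_ha": 95.5},
    ]


def run_stage2(schedule):
    """Hypothetical stand-in for a carbon model (placeholder logic: total area)."""
    return sum(row["harvest_area_ha"] for row in schedule)


def hard_link():
    # Stage-1 output is handed to stage 2 in memory, with no file on disk.
    return run_stage2(run_stage1())


def soft_link(path):
    # Stage-1 output is written to disk in an agreed format (CSV here) ...
    rows = run_stage1()
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["period", "harvest_area_ha"])
        writer.writeheader()
        writer.writerows(rows)
    # ... then read back, converted to the right types, and passed to stage 2.
    with open(path, newline="") as f:
        schedule = [
            {"period": int(r["period"]), "harvest_area_ha": float(r["harvest_area_ha"])}
            for r in csv.DictReader(f)
        ]
    return run_stage2(schedule)


with tempfile.TemporaryDirectory() as tmp:
    soft_result = soft_link(Path(tmp) / "stage1_output.csv")
hard_result = hard_link()
print(soft_result == hard_result)
```

Note that the soft-linked route has to restore data types after reading from disk, which is one place where the two routes can silently diverge in practice.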

Note that the soft-linked version of this pipeline can be implemented using almost any combination of forest estate model (e.g., ws3, Patchworks, Woodstock, FPS Atlas, Woodlot) and CBM (e.g., libcbm, CBM-CFS3, GCBM, spadesCBM). However, an intermediary _data munging_ module might be needed in the middle of the pipeline to link the two main stages if the forest estate model you are using does not include built-in soft-link data export functions that are compatible with the CBM implementation you are using. The data munging module could be a human manually reformatting raw stage-1 output data using a spreadsheet (simple, but yuck), a Jupyter or R markdown notebook that semi-automates (and documents) the data munging process, or a fully automated software module.
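
As a rough idea of what a small, fully automated data munging module might look like, here is a minimal sketch. Everything about the data is assumed for illustration: the stage-1 field names (`stratum`, `period`, `area`), the stage-2 field names (`stand_type`, `year`, `area_ha`), the 10-year period length, and the rule that places each event in the first year of its period. None of these reflect the actual export or import formats of any of the models named above.

```python
PERIOD_LENGTH = 10  # years per planning period (assumed for illustration)


def munge(stage1_rows, period_length=PERIOD_LENGTH):
    """Reshape hypothetical stage-1 rows into hypothetical stage-2 event rows."""
    events = []
    for row in stage1_rows:
        events.append(
            {
                # Normalize the stratum label to the (assumed) stage-2 naming style.
                "stand_type": row["stratum"].lower(),
                # Map a 1-based planning period to the first year of that period.
                "year": (int(row["period"]) - 1) * period_length + 1,
                # Raw exports often carry numbers as text, so convert explicitly.
                "area_ha": float(row["area"]),
            }
        )
    # Sort so the output order does not depend on the stage-1 export order.
    return sorted(events, key=lambda e: (e["year"], e["stand_type"]))


stage1_rows = [
    {"stratum": "Softwood_A", "period": 2, "area": "95.5"},
    {"stratum": "Hardwood_B", "period": 1, "area": "120"},
]
events = munge(stage1_rows)
for event in events:
    print(event)
```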

`ws3` does not currently include built-in functions to export data for soft-link to CBM, so we implement some custom data munging code in this notebook (with examples of both soft- and hard-link approaches). One obvious advantage of selecting `ws3` and `libcbm` as the software modules for this two-stage pipeline is that they are both Python packages, which makes it easy to hard-link them with a bit of custom data munging Python code. 

Note that we plan to extend `ws3` at some point to include built-in `libcbm` soft-link and hard-link functions (similar to those implemented in this notebook). 

## Install `ws3` and `libcbm` packages

First, make sure we have the correct versions of `ws3` and `libcbm` installed. Both packages are relatively new and under active development, so it is best to stick to known-working versions of each package from their respective GitHub repos.
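
Branch names such as `dev` and `main` move over time, so one way to stick to a specific version is to pin each install to an exact commit. The sketch below builds pip requirement strings of the form `git+<repo>@<commit>`. The commit hashes are the ones pip reports in the "Resolved ... to commit ..." lines of the install output recorded in this notebook; whether those are still the right commits to pin for your setup is an assumption you should check. The helper name `pinned_requirement` is made up for this example.

```python
# Commit hashes copied from the "Resolved ... to commit ..." lines of the
# pip output recorded in this notebook (assumed to be known-working).
PINNED = {
    "https://github.com/gparadis/ws3": "95a94cc5bf3a97dca1bc1a4ccc21d75d897d6aa1",
    "https://github.com/cat-cfs/libcbm_py.git": "f186f31e6986b917d898e4244fe42df99ecec97f",
}


def pinned_requirement(repo_url, commit):
    """Build a pip VCS requirement string that targets one exact commit."""
    return f"git+{repo_url}@{commit}"


requirements = [pinned_requirement(url, sha) for url, sha in PINNED.items()]
for req in requirements:
    # Print the command rather than running it, so nothing is installed here.
    print(f"%pip install {req}")
```

Each printed line can be pasted into a notebook cell in place of the branch-based install commands.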


> We _strongly recommend_ that you run this notebook in a venv-sandboxed Python kernel (see the `venv_python_kernel_setup` notebook for an example of how to do this). This will ensure that you are working from a fresh Python package environment, and not wasting time debugging random interactions between this notebook and whatever mishmash of packages you have installed on your system in various parts of your Python path. You have been warned.
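
A quick way to confirm that the kernel really is running inside a virtual environment is to compare `sys.prefix` with `sys.base_prefix`. This is a minimal sketch using only the standard library; the function name `in_virtualenv` is made up for this example.

```python
import sys


def in_virtualenv():
    """Return True if the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```

If this reports `False`, the notebook is using the system-wide (or user-level) Python environment rather than a sandboxed one.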


Install the `dev` branch of the `ws3` package (from the GitHub repo) into user space.

In [2]:
%pip install -U git+https://github.com/gparadis/ws3@dev

Collecting git+https://github.com/gparadis/ws3@dev
  Cloning https://github.com/gparadis/ws3 (to revision dev) to /media/data/tmp/pip-req-build-rhgcwx97
  Running command git clone --filter=blob:none --quiet https://github.com/gparadis/ws3 /media/data/tmp/pip-req-build-rhgcwx97
  Resolved https://github.com/gparadis/ws3 to commit 95a94cc5bf3a97dca1bc1a4ccc21d75d897d6aa1
  Preparing metadata (setup.py) ... done
Collecting fiona
  Using cached Fiona-1.9.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (16.0 MB)
Collecting matplotlib
  Downloading matplotlib-3.7.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.6/11.6 MB 25.2 MB/s eta 0:00:00
Collecting numpy
  Downloading numpy-1.24.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 17.3/17.3 MB

In [5]:
import ws3
ws3.__path__

['/media/data/home/gparadis/.virtualenvs/foo/lib/python3.10/site-packages/ws3']

Install the `main` branch of the `libcbm_py` package (from the GitHub repo) into user space.

In [3]:
%pip install -U git+https://github.com/cat-cfs/libcbm_py.git@main

Collecting git+https://github.com/cat-cfs/libcbm_py.git@main
  Cloning https://github.com/cat-cfs/libcbm_py.git (to revision main) to /media/data/tmp/pip-req-build-xu8t6rtv
  Running command git clone --filter=blob:none --quiet https://github.com/cat-cfs/libcbm_py.git /media/data/tmp/pip-req-build-xu8t6rtv
  Resolved https://github.com/cat-cfs/libcbm_py.git to commit f186f31e6986b917d898e4244fe42df99ecec97f
  Preparing metadata (setup.py) ... done
Collecting mock
  Using cached mock-5.0.1-py3-none-any.whl (30 kB)
Collecting numba
  Using cached numba-0.56.4-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (3.5 MB)
Collecting numexpr
  Using cached numexpr-2.8.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (381 kB)
Collecting openpyxl
  Using cached openpyxl-3.1.1-py2.py3-none-any.whl (249 kB)
Collecting pyyaml
  Downloading PyYAML-6.0-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (682 kB)

In [6]:
import libcbm
libcbm.__path__

['/media/data/home/gparadis/.virtualenvs/foo/lib/python3.10/site-packages/libcbm']
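
The two `__path__` checks above can be wrapped into a small helper that also reports whether each package lives inside the active environment. This is a sketch using only the standard library; the helper names `package_location` and `inside_active_env` are made up for this example, and it locates packages without importing them.

```python
import importlib.util
import sys
from pathlib import Path


def package_location(name):
    """Return the directory a top-level package would load from, or None."""
    spec = importlib.util.find_spec(name)
    # find_spec returns None for missing names; plain modules (not packages)
    # have no submodule search locations, so they are reported as None too.
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(list(spec.submodule_search_locations)[0]).resolve()


def inside_active_env(path):
    """Return True if path lies under the active environment (sys.prefix)."""
    return Path(sys.prefix).resolve() in path.parents


for name in ("ws3", "libcbm"):
    location = package_location(name)
    if location is None:
        print(f"{name}: not installed")
    else:
        print(f"{name}: {location} (inside active env: {inside_active_env(location)})")
```

A package that reports `inside active env: False` is being picked up from somewhere else on the Python path, which is the kind of interaction the sandboxed kernel is meant to avoid.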