# Initial Probings: Installing ShuTu SWC generation on Colab

This document is a verbose blow-by-blow of getting ShuTu to run on Colab on 2019-10-12, following the instructions in the ShuTu tutorial. This proves that ShuTu can be used on Colab. 

Having an archive of the chore is good info to keep, but for actually using ShuTu on the Allen Institute's Brightfield challenge dataset, a much terser and easier-to-use set of instructions has been generated. For that, see [install_shutu_on_colab.ipynb](https://colab.research.google.com/drive/1wRnt5ceTs2Oau4g_BiYv09ZOLbv3cyWI#scrollTo=Q1eCY1mvHbCr)


For context and further explanation, see [brightfield neuron reconstruction challenge](https://colab.research.google.com/drive/1qvwT-SxHpZSLQ88VeIOkR296pbZfQqTK#scrollTo=UHFoENqqYV65). The context includes a tool to view the generated SWC skeleton file and compare it to a manually generated "gold standard" skeleton. This notebook just generates the SWC files, on Colab.

## Status
- Sat 2019-10-12:
  - Actually started on code. 
  - Tested if deployable on Colab: yup, with a bit of finagling.
    - ShuTu SWC generator works on Colab.
    - ShuTu UI is not Jupyter-based.
  - Stopped developing this. Keeping it as archive of what happened.
- Fri 2019-10-11:
  - Inited this file.
  - Started as an empty Python 3 notebook.
  


## Install ShuTu on Colab

This is a log of a (successful) attempt to install ShuTu on Colab, following the instructions in [the ShuTu tutorial](http://personal.psu.edu/dzj2/ShuTu/).

In [0]:
# See ShuTu tutorial, "Installation" section, which has 
# text "Download Linux installer ShuTuUbuntu" which
# links to the following zip file.

# Get the Linux distro of ShuTu (took less than a minute to DL)
!wget https://psu.box.com/shared/static/wszpq3r6hwgoxo8eq0vb1k82fwrabycl.zip

--2019-10-13 18:22:09--  https://psu.box.com/shared/static/wszpq3r6hwgoxo8eq0vb1k82fwrabycl.zip
Resolving psu.box.com (psu.box.com)... 107.152.27.197, 107.152.26.197
Connecting to psu.box.com (psu.box.com)|107.152.27.197|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: /public/static/wszpq3r6hwgoxo8eq0vb1k82fwrabycl.zip [following]
--2019-10-13 18:22:09--  https://psu.box.com/public/static/wszpq3r6hwgoxo8eq0vb1k82fwrabycl.zip
Reusing existing connection to psu.box.com:443.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://psu.app.box.com/public/static/wszpq3r6hwgoxo8eq0vb1k82fwrabycl.zip [following]
--2019-10-13 18:22:10--  https://psu.app.box.com/public/static/wszpq3r6hwgoxo8eq0vb1k82fwrabycl.zip
Resolving psu.app.box.com (psu.app.box.com)... 107.152.25.199
Connecting to psu.app.box.com (psu.app.box.com)|107.152.25.199|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://public.bo

In [0]:
%%shell

# There should now be a zip file on the Colab VM:
pwd
ls
unzip wszpq3r6hwgoxo8eq0vb1k82fwrabycl.zip

/content
sample_data  wszpq3r6hwgoxo8eq0vb1k82fwrabycl.zip
Archive:  wszpq3r6hwgoxo8eq0vb1k82fwrabycl.zip
   creating: ShuTuUbuntu/
  inflating: ShuTuUbuntu/ShuTu.Parameters.granule63XTest.dat  
  inflating: ShuTuUbuntu/processImages.c  
  inflating: ShuTuUbuntu/makefile    
  inflating: ShuTuUbuntu/tinydir.h   
  inflating: ShuTuUbuntu/ShuTu.Parameters.100X.dat  
  inflating: ShuTuUbuntu/scaleSWC.c  
  inflating: ShuTuUbuntu/ShuTu.Parameters.flyMotorNeuron.dat  
   creating: ShuTuUbuntu/llist/
  inflating: ShuTuUbuntu/llist/setup.py  
   creating: ShuTuUbuntu/llist/src/
  inflating: ShuTuUbuntu/llist/src/llist.c  
  inflating: ShuTuUbuntu/llist/src/dllist.c  
  inflating: ShuTuUbuntu/llist/src/py23macros.h  
  inflating: ShuTuUbuntu/llist/src/sllist.c  
  inflating: ShuTuUbuntu/llist/src/sllist.h  
  inflating: ShuTuUbuntu/llist/src/dllist.h  
  inflating: ShuTuUbuntu/llist/MANIFEST.in  
  inflating: ShuTuUbuntu/llist/LICENSE  
   creating: ShuTuUbuntu/llist/docs/
  inflating: ShuTuUb



In [0]:
# Now there should be one new directory: ShuTuUbuntu/
!ls


sample_data  ShuTuUbuntu  wszpq3r6hwgoxo8eq0vb1k82fwrabycl.zip


## Building

Now some heavy building, which needs to be redone only after Colab tears down the VM and throws everything away (except this notebook file itself), which happens after at most 12 hours.


In [0]:
%%shell
set -m # needed to avoid error: bash: no job control in this shell
cd ShuTuUbuntu
./build.sh

gcc  -o generator generator.c
./generator <cdf.p >cdf.c
gcc -c  -fPIC cdf.c
./generator <fct.min.p >fct.min.c
gcc -c  -fPIC fct.min.c
./generator <filters.p >filters.c
gcc -c  -fPIC filters.c
./generator <linear.algebra.p >linear.algebra.c
gcc -c  -fPIC linear.algebra.c
./generator <array.p >array.c
gcc -c  -fPIC array.c
./generator <connectivity.p >connectivity.c
gcc -c  -fPIC connectivity.c
./generator <region.p >region.c
gcc -c  -fPIC region.c
./generator <fct.root.p >fct.root.c
gcc -c  -fPIC fct.root.c
./generator <image.p >image.c
gcc -c  -fPIC image.c
./generator <utilities.p >utilities.c
gcc -c  -fPIC utilities.c
./generator <snake.p >snake.c
gcc -c  -fPIC snake.c
./generator <hash.p >hash.c
gcc -c  -fPIC hash.c
./generator <level.set.p >level.set.c
gcc -c  -fPIC level.set.c
./generator <water.shed.p >water.shed.c
gcc -c  -fPIC water.shed.c
./generator <fft.p >fft.c
gcc -c  -fPIC fft.c
./generator <draw.p >draw.c
gcc -c  -fPIC draw.c
./generator <histogram.p >histogram.c
gcc -c 

## Debugging

For the record, build.sh crapped out at:
```
make[1]: Leaving directory '/content/ShuTuUbuntu/mylib/MY_FFT'
ar cr libmylib.a cdf.o fct.min.o filters.o linear.algebra.o array.o connectivity.o region.o fct.root.o image.o utilities.o snake.o  hash.o level.set.o water.shed.o fft.o draw.o histogram.o swc.o paths.o mylib.o svg.o MY_FFT/myfft.o MY_TIFF/mytiff.o
ranlib libmylib.a
make: Nothing to be done for 'all'.
bash: cannot set terminal process group (626): Inappropriate ioctl for device
bash: no job control in this shell
root@23b2ea86b11a:/content/ShuTuUbuntu# 
```

So, an error message occurred. But it doesn't seem to actually cause any real problems. I.e. the "bash: cannot set terminal process group" error message can be ignored.



While running, build.sh errored, but it nonetheless seemed to complete its last task: writing out the process.sh file. The file even has the correct permissions, as set with _chmod_, which is the penultimate command. (Also, as will become clear later, the process.sh file isn't even used in this notebook.)



In [0]:
# Debugging. What's in that build.sh file that is causing error messages?
!cat ShuTuUbuntu/build.sh

#!/bin/bash

set -e

if ! [ -x "$(command -v mpicc)" ]
then
  echo "Cannot find mpicc. Installing openmpi ..."
  if [ ! -d Downloads ]
  then
    mkdir Downloads
  fi
  cd Downloads
  downloadDir=$PWD
  MPI_RUN=$downloadDir/local/bin/mpirun
  if [ ! -f $MPI_RUN ]
  then
    if [ ! -d $downloadDir/openmpi-4.0.1 ]
    then
      package=openmpi-4.0.1.tar.gz
      host=https://download.open-mpi.org/release/open-mpi/v4.0
      if [ -x "$(command -v wget)" ]
      then
        wget $host/$package
      else 
        if [ -x "$(command -v curl)" ]
        then
          curl -c - -O $host/$package
        else
          echo 'Failed to get $package: No download tool found.'
          exit 1
        fi
      fi
      gunzip -c $package | tar xf -
      cd openmpi-4.0.1
      ./configure --prefix=$downloadDir/local --disable-mpi-fortran
      make all install
      cd ..
    fi
    echo "export PATH=$downloadDir/local/bin:\$PATH" >> ~/.bashrc
  fi
  cd ..
else
  MPI_RUN=mpirun
fi

export PATH=

In [0]:
!pwd
!ls -l ShuTuUbuntu/process.sh
!cat ShuTuUbuntu/process.sh

## Continuing onward

Ignoring the (harmless?) error messages from build.sh, the next stage in the ShuTu tutorial is described in section, ["Image processing and automated reconstruction"](http://personal.psu.edu/dzj2/ShuTu/):

>Download the image files from [Granule Cell 63X Start](https://psu.box.com/shared/static/f4gs13fs0y9ckrjjo78oid7hkv61k2wq.zip). Unzip in the home directory, which should create a directory granuleCell63XStart.

Specifically, the link behind "Granule Cell 63X Start" is: [https://psu.box.com/shared/static/f4gs13fs0y9ckrjjo78oid7hkv61k2wq.zip](https://psu.box.com/shared/static/f4gs13fs0y9ckrjjo78oid7hkv61k2wq.zip)

In [0]:
# This is a 1.3GB image stack for use in skeletonization (make a SWC file). 
# Downloads in less than 2 minutes
!wget https://psu.box.com/shared/static/f4gs13fs0y9ckrjjo78oid7hkv61k2wq.zip

--2019-10-12 20:54:27--  https://psu.box.com/shared/static/f4gs13fs0y9ckrjjo78oid7hkv61k2wq.zip
Resolving psu.box.com (psu.box.com)... 107.152.24.197
Connecting to psu.box.com (psu.box.com)|107.152.24.197|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: /public/static/f4gs13fs0y9ckrjjo78oid7hkv61k2wq.zip [following]
--2019-10-12 20:54:27--  https://psu.box.com/public/static/f4gs13fs0y9ckrjjo78oid7hkv61k2wq.zip
Reusing existing connection to psu.box.com:443.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://psu.app.box.com/public/static/f4gs13fs0y9ckrjjo78oid7hkv61k2wq.zip [following]
--2019-10-12 20:54:27--  https://psu.app.box.com/public/static/f4gs13fs0y9ckrjjo78oid7hkv61k2wq.zip
Resolving psu.app.box.com (psu.app.box.com)... 107.152.24.199
Connecting to psu.app.box.com (psu.app.box.com)|107.152.24.199|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://public.boxcloud.com/d/1/b

In [0]:
!pwd && ls -l

/content
total 1645644
-rw-r--r-- 1 root root 1389488667 Oct 12 20:55 f4gs13fs0y9ckrjjo78oid7hkv61k2wq.zip
drwxr-xr-x 1 root root       4096 Aug 27 16:17 sample_data
drwxrwxr-x 5 root root       4096 Oct 12 19:14 ShuTuUbuntu
-rw-r--r-- 1 root root  295628950 Oct 12 19:04 wszpq3r6hwgoxo8eq0vb1k82fwrabycl.zip


In [0]:
!unzip f4gs13fs0y9ckrjjo78oid7hkv61k2wq.zip

Archive:  f4gs13fs0y9ckrjjo78oid7hkv61k2wq.zip
   creating: granuleCell63XStart/
  inflating: granuleCell63XStart/DG-151213-S1_63X_20percentOverlap_100617_redo_meta.xml  
  inflating: granuleCell63XStart/DG-151213-S1_63X_20percentOverlap_100617_redo_info.xml  
   creating: granuleCell63XStart/slides/
  inflating: granuleCell63XStart/slides/DG-151213-S1_63X_20percentOverlap_100617_redo_z103m3_ORG.tif  
  inflating: granuleCell63XStart/slides/DG-151213-S1_63X_20percentOverlap_100617_redo_z024m1_ORG.tif  
  inflating: granuleCell63XStart/slides/DG-151213-S1_63X_20percentOverlap_100617_redo_z070m1_ORG.tif  
  inflating: granuleCell63XStart/slides/DG-151213-S1_63X_20percentOverlap_100617_redo_z121m4_ORG.tif  
  inflating: granuleCell63XStart/slides/DG-151213-S1_63X_20percentOverlap_100617_redo_z135m2_ORG.tif  
  inflating: granuleCell63XStart/slides/DG-151213-S1_63X_20percentOverlap_100617_redo_z051m4_ORG.tif  
  inflating: granuleCell63XStart/slides/DG-151213-S1_63X_20percentOverlap_100617

So, that puts the image stack in directory `/content/granuleCell63XStart/`

In [0]:
!pwd && ls -l

/content
total 1645648
-rw-r--r-- 1 root root 1389488667 Oct 12 20:55 f4gs13fs0y9ckrjjo78oid7hkv61k2wq.zip
drwxrwxr-x 3 root root       4096 Oct 26  2017 granuleCell63XStart
drwxr-xr-x 1 root root       4096 Aug 27 16:17 sample_data
drwxrwxr-x 5 root root       4096 Oct 12 19:14 ShuTuUbuntu
-rw-r--r-- 1 root root  295628950 Oct 12 19:04 wszpq3r6hwgoxo8eq0vb1k82fwrabycl.zip
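
Before building stacks, it can help to see how the slice files are organized. The following is a minimal sketch, assuming the file names follow a `_z<plane>m<tile>_ORG.tif` pattern. That pattern is only inferred from the unzip listing above and is not documented by ShuTu here. The helper name `summarize_slices` is made up for illustration.

```python
import re
from collections import defaultdict

# Assumed naming pattern, inferred from the unzip listing: "..._z<plane>m<tile>_ORG.tif"
SLICE_RE = re.compile(r"_z(\d+)m(\d+)_ORG\.tif$")

def summarize_slices(filenames):
    """Group slice file names by tile and return {tile: sorted list of z indices}."""
    planes_by_tile = defaultdict(list)
    for name in filenames:
        match = SLICE_RE.search(name)
        if match:
            z_index, tile = int(match.group(1)), int(match.group(2))
            planes_by_tile[tile].append(z_index)
    return {tile: sorted(zs) for tile, zs in planes_by_tile.items()}

# A few names copied from the unzip output above.
sample = [
    "DG-151213-S1_63X_20percentOverlap_100617_redo_z103m3_ORG.tif",
    "DG-151213-S1_63X_20percentOverlap_100617_redo_z024m1_ORG.tif",
    "DG-151213-S1_63X_20percentOverlap_100617_redo_z070m1_ORG.tif",
    "DG-151213-S1_63X_20percentOverlap_100617_redo_z121m4_ORG.tif",
]
summary = summarize_slices(sample)
for tile in sorted(summary):
    print(f"tile {tile}: {len(summary[tile])} slices, z = {summary[tile]}")
```

On Colab, the same function could be fed `os.listdir("/content/granuleCell63XStart/slides")` to get per-tile slice counts for the whole stack.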


In [0]:
# How many cores do we have to play with?
# See https://colab.research.google.com/drive/151805XTDg--dgHb3-AXJCpnWaqRhop_2#scrollTo=01i75DIjKA5D


# Number of sockets, i.e. available slots for physical processors
!lscpu | grep 'Socket(s):'

# Number of cores per socket
!lscpu | grep 'Core(s) per socket:'


Socket(s):           1
Core(s) per socket:  1
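
The same question can be asked from Python, which is handy for choosing the `-n` value for `mpirun` programmatically. This is a small sketch, not part of ShuTu. The helper name `mpi_process_count` is hypothetical. Note that `os.cpu_count()` reports logical CPUs, so it may not match the physical core count that `lscpu` prints.

```python
import os

def mpi_process_count(requested=4):
    """Return how many MPI processes to use: never more than the visible CPUs."""
    available = os.cpu_count() or 1  # cpu_count() can return None
    return max(1, min(requested, available))

# The ShuTu tutorial uses 4 processes; cap that at what this machine offers.
n_procs = mpi_process_count(4)
print(f"mpirun -n {n_procs}")
```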


## Beginning of main action

So far in the story a bunch of code and data ZIP files were downloaded and "installed." Now it's time to start doing things.

At this point, the next step is, quoting the ShuTu tutorial:
```
[1] Create tiff stacks:
mpirun -n 4 ./createTiffStacksZeiss ~/granuleCell63XStart/ granule
[here granule is filenameCommon given to the created tiff stacks]
```

The example has `-n 4` processors but it seems Colab is giving us 1. Now that we know there is only one core available, the next task is a `mpirun -n 1` where the default invocation would be:
```
mpirun -n 1 ./createTiffStacksZeiss ~/granuleCell63XStart/ granule
```
Wisely, that causes an error message warning the user not to run as root, but we can ignore that here on Colab. As the warning message says, the trick to force running as root is to add `--allow-run-as-root`.

So that comes out as the following (I'm also using absolute paths to be more explicit):

In [0]:
%%shell
cd ShuTuUbuntu

mpirun --allow-run-as-root -n 1 ./createTiffStacksZeiss /content/granuleCell63XStart/ granule

Creating tiff stacks from images and xml files in /content/granuleCell63XStart/ with finameCommon=granule
Number of tasks= 1 My rank= 0 Running on 23b2ea86b11a
Finding image slices with finames ending with _ORG.tif 
/content/granuleCell63XStart/
filenameSliceCommon=/content/granuleCell63XStart/slides/DG-151213-S1_63X_20percentOverlap_100617_redo_
nTiles=4 nz=130 zstart=6 zend=135
/content/granuleCell63XStart/granule-1.tif
/content/granuleCell63XStart/granule-2.tif
/content/granuleCell63XStart/granule-3.tif
/content/granuleCell63XStart/granule-4.tif
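
The adjustments described above (one process, `--allow-run-as-root`, absolute paths) recur in every later step, so they can be captured in a small helper. This is only a sketch. The function name `colab_mpirun` is invented for illustration, and the code only builds and prints the command rather than running it.

```python
import shlex

def colab_mpirun(program, *args, n_procs=1):
    """Build an mpirun argument list suitable for a root shell such as Colab's."""
    return ["mpirun", "--allow-run-as-root", "-n", str(n_procs), program, *args]

cmd = colab_mpirun("./createTiffStacksZeiss", "/content/granuleCell63XStart/", "granule")
print(shlex.join(cmd))
```

To actually execute the command from Python instead of a `%%shell` cell, the list could be passed to `subprocess.run(cmd, cwd="ShuTuUbuntu", check=True)`.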




Next the ShuTu tutorial suggests:
```
[2] Process images:
mpirun -n 4 processImages ~/granuleCell63XStart/
```
So, tweak that for Colab here and it gets to:

In [0]:
%%shell
cd ShuTuUbuntu

mpirun --allow-run-as-root -n 1 processImages /content/granuleCell63XStart/

Bright field images
Number of tasks= 1 My rank= 0 Running on 23b2ea86b11a
ntiff=4 filenameCommon=/content/granuleCell63XStart/granule-
Image dimension=(1040 1388 129) type=0 number of channels=3
Color stack. Save projection to /content/granuleCell63XStart/granule-1.Proj.tif
Image dimension=(1040 1388 129) type=0 number of channels=3
Color stack. Save projection to /content/granuleCell63XStart/granule-4.Proj.tif
Image dimension=(1040 1388 129) type=0 number of channels=3
Color stack. Save projection to /content/granuleCell63XStart/granule-3.Proj.tif
Image dimension=(1040 1388 129) type=0 number of channels=3
Color stack. Save projection to /content/granuleCell63XStart/granule-2.Proj.tif




Next in the ShuTu tutorial is #3:
```
[3] Stitch images:
mpirun -n 4 ./stitchTiles ~/granuleCell63XStart/
```


In [0]:
%%shell
cd ShuTuUbuntu
mpirun --allow-run-as-root -n 1 ./stitchTiles /content/granuleCell63XStart/

Number of tasks= 1 My rank= 0 Running on 23b2ea86b11a
Image dimension=(1040 1388 129) type=0 number of channels=3
ntiff=4 filenameCommon=/content/granuleCell63XStart/granule-
Parsing /content/granuleCell63XStart/DG-151213-S1_63X_20percentOverlap_100617_redo_info.xml for tile positions 
numTiles=4 nx=1388 ny=1040 nz=129
ID=2 X=1112 Y=0
ID=3 X=1112 Y=833
ID=4 X=0 Y=833
ID=1 X=1 Y=0
Computing offsets between 2 3 offset =75.096153 direction =down
 Stitching /content/granuleCell63XStart/granule-2.tif and /content/granuleCell63XStart/granule-3.tif. Image dimension=(1040 1388 129) type=0 number of channels=3
Color stack. Smoothing in each plane... Bright field image. normalize maximum intensity to 1...Image dimension=(1040 1388 129) type=0 number of channels=3
Color stack. Smoothing in each plane... Bright field image. normalize maximum intensity to 1...Dimensions of overlapping stack nxx=512 nyy=2048 nzz=256. tcmalloc: large alloc 1073741824 bytes == 0x556731ac8000 @  0x7fedff66e1e7 0x5566d4



## The main show

ShuTuAutoTrace is the main show and here is where that comes into play.

Next in the ShuTu tutorial is #4, quoting:
```
[4] Auto reconstruction:
mpirun -n 4 ./ShuTuAutoTrace ~/granuleCell63XStart/ ShuTu.Parameters.granule63XTest.dat
```

So, for Colab set # of processors to 1 and add the `--allow-run-as-root`.

Eventually, after many messages, the output of the following cell should end with "Creating swc file... Saving SWC to /content/granuleCell63XStart/granule-.auto.swc There 48 segments 4 trees".

That SWC file is the whole goal of this notebook.



In [0]:
%%shell
cd ShuTuUbuntu
mpirun --allow-run-as-root -n 1 ./ShuTuAutoTrace /content/granuleCell63XStart/ ShuTu.Parameters.granule63XTest.dat


Number of tasks= 1 My rank= 0 Running on 23b2ea86b11a
Tracing neuron from tif stacks in /content/granuleCell63XStart/
Reading parameters from file ShuTu.Parameters.granule63XTest.dat
The important parameters are:
1. xyDist		= 0.103 micron
2. zDist		= 0.500 micron
3. nSplit		=1
4. sparse		= 0.100
5. levelSetIter		=500
6. smallLen		=9 pixel
7. zJumpFact		= 5.000
8. distFactConn		= 2.000
9. minDistConn		=29.126 pixels
10. angle		=60.000 degree
11. searchMax		=48.544
12. factSigmaDD		=20.000
13. factExpandConn	= 1.000
14. sigmaSmoothCurve	= 1.942 pixel
15. sigma		= 0.030
16. factSigmaThreshold	= 0.300
17. factSigmaThreStrict	= 2.000
18. zOcc		=10 plane
19. minNumPointsBr	=5
20. minLenBrIso		=20.000000 micron
21. somaSparseThr	=0.010000
22. factSigmaThresholdZ	= 1.000

Seldom changed parameters are:
23. maxFracTotPoints	= 0.500
24. zext		=6 plane
25. sigmaFilter		= 0.971 pixel
26. sigmaBack		=19.417 pixel
27. smallArea		=94 pixel^2
28. lambdaRatioThr	= 1.000
29. levelSetMu		= 0.100
30. fact



In [0]:
# Let's download that SWC before it gets thrown away after a max of 12 hours.
from google.colab import files
files.download('/content/granuleCell63XStart/granule-.auto.swc') 
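
As a sanity check on the generated file, the SWC can be summarized in a few lines of Python. This sketch assumes ShuTu writes the standard seven-column SWC layout (id, type, x, y, z, radius, parent, with parent `-1` marking a root). That has not been verified against ShuTu's output here. The sample data below is made up and is not ShuTu output.

```python
def summarize_swc(lines):
    """Count nodes and root nodes (parent == -1) in SWC-formatted text lines."""
    n_nodes = 0
    n_roots = 0
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        # Standard SWC columns: id, type, x, y, z, radius, parent
        parent = int(fields[6])
        n_nodes += 1
        if parent == -1:
            n_roots += 1
    return n_nodes, n_roots

# Tiny made-up sample (NOT ShuTu output): two trees, four nodes.
sample_swc = """\
# id type x y z radius parent
1 1 0.0 0.0 0.0 1.0 -1
2 3 1.0 0.0 0.0 0.5 1
3 1 9.0 9.0 0.0 1.0 -1
4 3 9.0 10.0 0.0 0.5 3
""".splitlines()

nodes, roots = summarize_swc(sample_swc)
print(f"{nodes} nodes, {roots} trees")
```

On Colab, the real file could be checked with `summarize_swc(open("/content/granuleCell63XStart/granule-.auto.swc"))`. The root count is the number one would compare against the tree count in the log. The log's "segments" are presumably not the same thing as nodes.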


Next in the ShuTu tutorial is an explanation of that process.sh file:
> These commands can be combined into a single script [see script process.sh].

That is the `process.sh` written earlier by `build.sh`. It is pretty much exactly equivalent to the code just worked through in this notebook, but as a shell script rather than a Jupyter notebook:
```
#!/bin/bash
set -e
if [ $# -lt 4 ]; then
  echo "Usage: ./process.sh <data_dir> <common_name> <param_file> <num_proc>"
  exit 1
fi
dataDir=$1
commonName=$2
paramFile=$3
numproc=$4
mpirun -n $numproc ./createTiffStacksZeiss $dataDir $commonName
mpirun -n $numproc ./processImages $dataDir
mpirun -n $numproc ./stitchTiles $dataDir
mpirun -n $numproc ./ShuTuAutoTrace $dataDir $paramFile
```



For Colab set num_proc to 1.
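
Note that `process.sh` as shown does not pass `--allow-run-as-root`, so running it unchanged as root would presumably hit the same warning described earlier. A Python equivalent that adds the flag might look like the following sketch. The function name `shutu_pipeline` and its parameters are hypothetical, and the code only prints the four commands rather than running them.

```python
def shutu_pipeline(data_dir, common_name, param_file, n_procs=1, as_root=True):
    """Return the four ShuTu commands, in order, as argument lists."""
    prefix = ["mpirun"]
    if as_root:
        prefix.append("--allow-run-as-root")  # needed when running as root, e.g. on Colab
    prefix += ["-n", str(n_procs)]
    return [
        prefix + ["./createTiffStacksZeiss", data_dir, common_name],
        prefix + ["./processImages", data_dir],
        prefix + ["./stitchTiles", data_dir],
        prefix + ["./ShuTuAutoTrace", data_dir, param_file],
    ]

steps = shutu_pipeline(
    "/content/granuleCell63XStart/",
    "granule",
    "ShuTu.Parameters.granule63XTest.dat",
)
for step in steps:
    print(" ".join(step))
```

Each list could then be handed to `subprocess.run(step, cwd="ShuTuUbuntu", check=True)`, which stops at the first failing stage much like `set -e` does in the shell script.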

In [0]:
!ls /content/granuleCell63XStart/granule-.auto.swc

## Clipped from other docs
## build.sh becomes a Colab code cell

The above `wget` and `unzip` should run uneventfully. 

### The out-of-the-box way 
The next big thing is to run `build.sh`, which will hang on an error, but the error is harmless. The script is done when it hangs at:
```
mpicc -o scaleSWC scaleSWC.o libShuTu.o -Lmylib -lmylib -lm
bash: cannot set terminal process group (118): Inappropriate ioctl for device
bash: no job control in this shell
root@55ff33362eac:/content/ShuTuUbuntu# 
```
But the last thing `build.sh` does is write a file called `process.sh`, and that file gets written, so that's how to detect if it worked.
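
That check can be automated. The following is a minimal sketch with a hypothetical helper name, `build_finished`. It tests only that `process.sh` exists and is executable, which is a proxy for success rather than proof that every binary was built. The demonstration uses a throwaway directory so it runs anywhere.

```python
import os
import stat
import tempfile

def build_finished(shutu_dir):
    """Return True if process.sh exists in shutu_dir and is executable."""
    script = os.path.join(shutu_dir, "process.sh")
    return os.path.isfile(script) and os.access(script, os.X_OK)

# Self-contained demonstration using a temporary directory instead of ShuTuUbuntu/.
with tempfile.TemporaryDirectory() as tmp:
    before = build_finished(tmp)  # no process.sh yet
    path = os.path.join(tmp, "process.sh")
    with open(path, "w") as fh:
        fh.write("#!/bin/bash\n")
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)  # mimic build.sh's chmod
    after = build_finished(tmp)

print(before, after)
```

On Colab the call would be `build_finished("/content/ShuTuUbuntu")`.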




In [0]:
%%shell

set -m # needed to avoid error: bash: no job control in this shell
cd ShuTuUbuntu

# This next line will work OK, but will hang on a harmless error message
./build.sh

## Features

This is just a dumping ground for random ideas. Nothing herein is yet planned for development.

- [Document how to set up ShuTu desktop viewer](https://github.com/reconstrue/brightfield_neuron_reconstruction/issues/2)




This is the main notebook for processing Allen Institute's BioImage challenge data using ShuTu to generate the SWC file.

The code in this Jupyter notebook has been tested to run on Google's Colab site. 






### Cut?
This project runs ShuTu's SWC generator on Colab. Optionally, one can download the ShuTu-generated SWC and view it in ShuTu's SWC editor GUI application running on localhost. ShuTu's desktop application is very easy to install on Linux and Mac; Windows looks to be a bit more work but seemingly can be done (dunno, haven't tested).
