# <center> BeakerX </center>

## <center> Volker Bäcker </center>

### <center> 27.01.2020 </center>

# <center> BeakerX features </center>

1. polyglot
1. EasyForm widgets
1. interactive plots
1. interactive tables
1. spark support
1. instant publication
1. java support

# <center> BeakerX history </center>

![beakerx](https://miro.medium.com/max/1096/1*BPo471kiXGeeF9WSln5c6g.png)

* beaker-notebook
> * independent notebook implementation

* BeakerX
> * re-implementation as jupyter-notebook extensions

# <center> Polyglot - introduction </center>

* different cells in the same notebook can execute different languages
* data can be exchanged between languages

Data can be exchanged between languages by attaching it to the ``beakerx`` object.

In a ``groovy``-cell we attach the number 6 to the ``beakerx`` object with the name ``myValue``.

In [1]:
beakerx.myValue = 6

6

We get the value back in a ``java``-cell and modify it on the ``beakerx`` object.

In [2]:
%%java
Integer a = (Integer)NamespaceClient.getBeakerX().get("myValue");
System.out.println(a);
NamespaceClient.getBeakerX().set("myValue", a+1);

6


null

We get the current value in a ``kotlin``-cell. ``kotlin`` is a Java-like programming language preferred for ``android`` development.

In [3]:
%%kotlin
beakerx["myValue"]

7

In python it would work as well, but the OMERO-python kernel here is not configured to use beakerx <big>&#128577;</big>.

In [None]:
%%python
from beakerx.object import beakerx
beakerx.myValue

# <center> Polyglot example </center>

* use python to download an image from omero
* use groovy to run an ImageJ-macro on it

We create a connection dialog using ``ipywidgets``. In ``beakerx`` we would use ``EasyForms``, which are easier to use.

Notes:
* We create widgets for the omero-host, the port, the username, the password, a connect button and a status text.
* The connect button has a function associated, which is executed when the button is pressed.
* The status text does not update correctly: the event loop does not get the time to refresh it while the callback runs, so the new status only appears after you execute the next cell. There are solutions for this, but no time for them now. You can fix this as homework <big>&#x1F642;</big>.

In [3]:
%%python
import ipywidgets as widgets
from omero.gateway import BlitzGateway
hosts = ["workshop.openmicroscopy.org", "omero.mri.cnrs.fr"]
port = 4064
CD_host = widgets.Dropdown(
       options=hosts,
       value=hosts[0],
       description='host:')
CD_port = widgets.IntText(
    value=port,
    description='port:',
) 
CD_top = widgets.HBox([CD_host, CD_port])
CD_user = widgets.Text(
    value = 'user',
    description = 'user:'
)
CD_password = widgets.Password(
    value = '',
    description = 'password:'
)
CD_button = widgets.Button(
    description = 'Connect',
)
CD_status = widgets.Label(
    description = "Status:",
    value = "Status: Not connected"
)
def on_connect_button_clicked(b):
    global conn
    CD_status.value = "Status: Connecting"
    conn = BlitzGateway(CD_user.value,
                        CD_password.value,
                        host=CD_host.value, port=CD_port.value)
    result = conn.connect()
    if result:
        CD_status.value = "Status: Connected"
    else:
        CD_status.value = "Status: Not connected"
    
CD_button.on_click(on_connect_button_clicked)
CD_layout = widgets.Layout(display='flex',
                flex_flow='column',
                align_items='center',
                width='50%')
CD = widgets.VBox([CD_top, widgets.HBox([CD_user, CD_password]), widgets.HBox([CD_button],  layout=CD_layout), widgets.HBox([CD_status], layout=CD_layout)])
display(CD)

VBox(children=(HBox(children=(Dropdown(description='host:', options=('workshop.openmicroscopy.org', 'omero.mri…

We first get the list of projects of the connected user from OMERO. We then create a dialog that lets the user select a project, a dataset and an image, and that displays the thumbnail of the selected image.

* The list of datasets in a project can be accessed by the method ``listChildren`` of the ``project``-object.
* The list of images in a dataset can be accessed by the method ``listChildren`` of the ``dataset``-object.
* The thumbnail of an image can be accessed via the method ``getThumbnail`` of the ``image``-object.
* We use the ``Python Imaging Library (PIL)`` to get the ``bytes`` of the thumbnail and ``html`` to display the bytes as an image (one of many ways to display an image).

In [6]:
%%python
from IPython.display import Image
from PIL import Image as pImage
from io import BytesIO, StringIO
from base64 import b64encode

def on_project_change(change):
    global datasets
    datasets = list(projects[change['new']].listChildren())    
    BD_dataset.options = [(dataset.getName(), dataset.getId()) for dataset in datasets]
    BD_dataset.value =  BD_dataset.options[0][1]

def on_dataset_change(change):
    global images
    images = list(datasets[change['new']].listChildren())
    BD_image.options = [(image.getName(), image.getId()) for image in images] 
    BD_image.value = BD_image.options[0][1]

def on_image_change(change):
    global selectedImageIndex 
    selectedImageIndex = change['new']
    image = images[change['new']]
    img_data = image.getThumbnail()
    pil_im = pImage.open(BytesIO(img_data))
    img_bytes = BytesIO()  
    pil_im.save(img_bytes, format='png')
    BD_thumb.value = "<img src='data:image/png;base64,{0}' width='200' height='200'/>".format(b64encode(img_bytes.getvalue()).decode('utf-8'))

projects = list(conn.getObjects("Project",
                               opts={'owner': conn.getUser().getId(),
                                     'group': conn.getEventContext().groupId,
                                     'order_by': 'lower(obj.name)'}))
projectTupels = [(project.getName(), project.getId()) for project in projects]

BD_project = widgets.Dropdown(
       continuous_update=True,
       options=projectTupels,
       value=projectTupels[0][1],
       description='project:')

BD_dataset = widgets.Dropdown(
       description='dataset:')

BD_image = widgets.Select(
       description='image:') 

BD_thumb = widgets.HTML("") 

on_project_change({'new':0})
on_dataset_change({'new':0})
on_image_change({'new':0})

BD_left = widgets.VBox([BD_project, BD_image])
BD_right = widgets.VBox([BD_dataset, BD_thumb])  # CD_layout is a Layout, not a displayable child widget
BD_right.layout.align_items = 'flex-end'
BD_all = widgets.HBox([BD_left, BD_right])
BD_project.observe(on_project_change, names="index")
BD_dataset.observe(on_dataset_change, names="index")
BD_image.observe(on_image_change, names="index")
display(BD_all)


HBox(children=(VBox(children=(Dropdown(description='project:', options=(('idr00021', 976), ('matlab-project', …
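
The HTML-based thumbnail display used in ``on_image_change`` can be isolated into a small helper. The following is only a sketch using the Python standard library: the function name ``png_bytes_to_img_tag`` is made up for illustration, and the placeholder bytes are just the PNG signature, not a real thumbnail from OMERO.

```python
from base64 import b64encode


def png_bytes_to_img_tag(png_bytes, width=200, height=200):
    """Embed raw PNG bytes in an HTML <img> tag as a base64 data URI."""
    encoded = b64encode(png_bytes).decode("ascii")
    return "<img src='data:image/png;base64,{0}' width='{1}' height='{2}'/>".format(
        encoded, width, height)


# Placeholder bytes (assumption): only the 8-byte PNG signature, not a full image.
fake_png = b"\x89PNG\r\n\x1a\n"
img_tag = png_bytes_to_img_tag(fake_png)
print(img_tag)
```

In the notebook, a string built this way would be assigned to the ``value`` of the ``widgets.HTML`` object, as is done with ``BD_thumb``.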

# <center> Polyglot example: download the images of the selected dataset </center>

We first delete the ``tmp``-folder and its content if it exists and then create it again. We use shell commands to do this. When a line starts with an exclamation mark (bang), it is not interpreted as python but executed as a shell command. We use the PIL-library again, this time to save the images to the local ``tmp``-folder.

In [7]:
%%python
!rm -rf ./tmp
!mkdir -p ./tmp

DD_download_button = widgets.Button(
    description = 'Download dataset' 
)
DD_status = widgets.Label(
    description = "status",
    value = "Status: waiting..."
)

def on_download_button_clicked(b): 
    counter = 1
    for image in images:
        newStatus = "Status: downloading image " + str(counter) + " of " + str(len(images))
        DD_status.value = newStatus
        pixels = image.getPrimaryPixels()
        plane = pixels.getPlane()
        nucleiImage = pImage.fromarray(plane)
        nucleiImage.save('./tmp/'+image.getName(), save_all=True)
        counter = counter + 1
    DD_status.value = "Status: successfully downloaded " + str(counter-1) + " images."
DD_download_button.on_click(on_download_button_clicked)

DD = widgets.VBox([DD_download_button, DD_status])
display(DD)

VBox(children=(Button(description='Download dataset', style=ButtonStyle()), Label(value='Status: waiting...', …
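
The two shell commands at the top of the cell could also be written in pure Python, which may be useful outside of a notebook. This is a minimal sketch using only the standard library; the helper name ``recreate_folder`` is hypothetical, and the demonstration runs in a throw-away directory rather than in the notebook's ``./tmp``.

```python
import shutil
import tempfile
from pathlib import Path


def recreate_folder(path):
    """Delete the folder and its content if it exists, then create it empty."""
    folder = Path(path)
    shutil.rmtree(folder, ignore_errors=True)   # similar to: rm -rf path
    folder.mkdir(parents=True, exist_ok=True)   # similar to: mkdir -p path
    return folder


# Demonstration in a temporary directory with one stale file inside.
base = Path(tempfile.mkdtemp())
tmp_folder = base / "tmp"
tmp_folder.mkdir()
(tmp_folder / "old.TIF").write_bytes(b"stale data")

recreate_folder(tmp_folder)
print(list(tmp_folder.iterdir()))
```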

We check the content of the ``tmp``-folder.

The ``ls`` shell command lists the content of a folder. The options ``alh`` stand for ``all``, ``long list format`` and ``human readable``. With ``h``, the file sizes are displayed with unit suffixes such as K, M and G instead of as plain byte counts.

In [11]:
%%python
!ls -alh ./tmp

total 13M
drwxr-xr-x 2 jovyan users 4.0K Jan 29 23:17 .
drwxr-xr-x 4 jovyan users 4.0K Jan 29 23:41 ..
-rw-r--r-- 1 jovyan users 2.1M Jan 29 23:16 gwlmabrp875gwlv1_w3Hoechst.TIF
-rw-r--r-- 1 jovyan users 2.1M Jan 29 23:16 gwlmabrp875gwlv2_w3Hoechst.TIF
-rw-r--r-- 1 jovyan users 2.1M Jan 29 23:16 gwlmabrp875gwlv3_w3Hoechst.TIF
-rw-r--r-- 1 jovyan users 2.1M Jan 29 23:16 gwlmabrp875gwlv4_w3Hoechst.TIF
-rw-r--r-- 1 jovyan users 2.1M Jan 29 23:16 gwlmabrp875gwlv5_w3Hoechst.TIF
-rw-r--r-- 1 jovyan users 2.1M Jan 29 23:17 gwlmabrp875gwlv_w3Hoechst.TIF
-rw-r--r-- 1 jovyan users 1.3K Jan 29 23:17 macro.ijm
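
To illustrate what the ``h`` option does, here is a rough Python sketch that lists a folder with human-readable sizes. The function names are made up for this example, and the rounding is a simplification: ``ls`` may round differently, so the numbers can deviate slightly from its output.

```python
import os
import tempfile


def human_readable(size_in_bytes):
    """Format a byte count with binary unit suffixes (K, M, G, T)."""
    size = float(size_in_bytes)
    for unit in ["", "K", "M", "G"]:
        if size < 1024:
            if unit == "":
                return "{0:.0f}".format(size)
            return "{0:.1f}{1}".format(size, unit)
        size /= 1024
    return "{0:.1f}T".format(size)


def list_folder(path):
    """Return (name, formatted size) pairs for the entries of a folder."""
    entries = sorted(os.scandir(path), key=lambda entry: entry.name)
    return [(entry.name, human_readable(entry.stat().st_size)) for entry in entries]


# Demonstration with a 1300-byte file in a temporary folder.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "macro.ijm"), "wb") as handle:
    handle.write(b"x" * 1300)

listing = list_folder(demo_dir)
for name, size in listing:
    print(size, name)
```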


Here is the ImageJ-macro that will do the analysis. It works as follows:

* scale the image down
* threshold it and convert it to a mask
* fill the holes in the mask and apply a binary watershed
* scale the mask up to the original size
* analyze the particles
* save the control image and the results table to the ``out``-subfolder of the ``tmp``-folder

The ``%%writefile``-cell-magic writes the content of the cell to a file. We will later read it in from that file and execute it.

In [9]:
%%python
%%writefile /home/jovyan/notebooks/volker/tmp/macro.ijm
SCALE_FACTOR = 5;
arg = getArgument();
parts = split(arg, ",");

for(i=0; i<parts.length; i++) {
	nameAndValue = split(parts[i], "=");
	if (indexOf(nameAndValue[0], "input")>-1) inputPath=nameAndValue[1];
	if (indexOf(nameAndValue[0], "scale_factor")>-1) SCALE_FACTOR=nameAndValue[1];
}
open(inputPath);
title = getTitle();
run("Set Measurements...", "area mean modal min centroid perimeter bounding fit shape feret's integrated median skewness kurtosis area_fraction display redirect="+title+" decimal=9");
width = getWidth();
height = getHeight();
run("Scale...", "x="+(1.0/SCALE_FACTOR)+" y="+(1.0/SCALE_FACTOR)+" interpolation=Bilinear create title=small_tmp");
setAutoThreshold("Huang dark");
run("Convert to Mask");
run("Fill Holes");
run("Watershed");
run("Scale...", "x=- y=- width="+width+" height="+height+" interpolation=Bilinear create title=big_tmp");
setAutoThreshold("Huang dark");
roiManager("Reset");
run("Analyze Particles...", "size=0-Infinity circularity=0.00-1.00 show=Overlay exclude display in_situ");
Overlay.copy;
close();
close();
Overlay.paste;
if (!File.exists(File.directory + "/out")) File.makeDirectory(File.directory + "/out");
save(File.directory + "/out/" + File.name);
run("Close All");
saveAs("results", File.directory + "/out/results.csv");
print("Done!");

Writing /home/jovyan/notebooks/volker/tmp/macro.ijm
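
The first lines of the macro parse an argument string of the form ``input=<path>, scale_factor=<number>``. For readers more familiar with Python, the same logic could be sketched as follows. This is an illustration, not part of the workflow: the function name and the example path ``/data/nuclei.TIF`` are hypothetical.

```python
def parse_macro_arguments(arg, default_scale_factor=5):
    """Parse a string such as 'input=/path/a.TIF, scale_factor=5'."""
    input_path = None
    scale_factor = default_scale_factor
    for part in arg.split(","):
        name, _, value = part.partition("=")
        # Like the macro's indexOf test, match on a substring of the name,
        # so the blank after the comma does not matter.
        if "input" in name:
            input_path = value
        if "scale_factor" in name:
            scale_factor = float(value)
    return input_path, scale_factor


parsed = parse_macro_arguments("input=/data/nuclei.TIF, scale_factor=4")
print(parsed)
```

As in the macro, a path that contains a comma would break this simple parsing scheme.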


We will now switch from ``python`` to ``groovy``, which allows us to access the ImageJ-library to execute a macro. 

To be able to use the ImageJ-library we add the FIJI-jars and the jars of the plugins to the classpath of the java-virtual-machine. The ``%classpath magic`` does the job.

In [10]:
//Add dependencies to the classpath
%classpath add jar /opt/java-apps/Fiji.app/jars/*
%classpath add jar /opt/java-apps/Fiji.app/plugins/*

We will now read the macro from the file and run it.

* We import the IJ-class from the package ij. 
* We just run the macro on one image now. We will do it in a loop on all images in the ``tmp``-folder later.
* We pass the path to the image and the scale-factor that will be used by the macro as parameters when running the macro.
* The content of the results-table will be displayed as output of the cell. However, the macro also writes it to a file.

In [12]:
import ij.IJ
import static groovy.io.FileType.FILES
scaleFactor = 5

macro = new File('/home/jovyan/notebooks/volker/tmp/macro.ijm').text

def dir = new File("/home/jovyan/notebooks/volker/tmp");
def files = [];
dir.traverse(type: FILES, maxDepth: 0) { if (it.getName().contains(".TIF")) files.add(it) };
file = files[0]
args = "input="+file.getPath()+", scale_factor="+scaleFactor
IJ.runMacro('run("Clear Results");', "")
answer = IJ.runMacro(macro, args)

 	Label	Area	Mean	Mode	Min	Max	X	Y	Perim.	BX	BY	Width	Height	Major	Minor	Angle	Circ.	Feret	IntDen	Median	Skew	Kurt	%Area	RawIntDen	FeretX	FeretY	FeretAngle	MinFeret	AR	Round	Solidity
1	gwlmabrp875gwlv5_w3Hoechst.TIF	10202	10353.091550676	9419	4331	16380	337.330915507	106.035287199	394.232539419	283	38	107	132	143.838261445	90.306916288	56.924610040	0.824879002	147.299015611	105622240	10180	-0.112429016	-0.168288041	100	105622240	295	165	55.231043445	93.400000000	1.592771267	0.627836539	0.961772331
2	gwlmabrp875gwlv5_w3Hoechst.TIF	13999	7954.662118723	8315	4214	11074	701.671512251	150.813736695	461.303607231	643	73	117	157	163.217578166	109.204416504	112.544186534	0.826671902	167.026943934	111357315	7994	-0.282783709	0.479401828	100	111357315	670	73	109.953799220	112.697826066	1.494606019	0.669072644	0.970400665
3	gwlmabrp875gwlv5_w3Hoechst.TIF	71	6062.774647887	4590	4590	7554	181.514084507	99.077464789	30.970562748	178	93	7	12	12.054813108	7.499079983	89.201968071	0.930186472	12.369316

null
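
The file selection and the argument string from the Groovy cell above can be mirrored in Python. The sketch below uses only the standard library and a temporary folder with empty placeholder files; the helper names and file names are assumptions made for the example. Unlike the Groovy ``traverse`` call, it sorts the files by name.

```python
import tempfile
from pathlib import Path


def find_tif_files(folder):
    """Return the files directly inside folder whose name contains '.TIF'."""
    return sorted(path for path in Path(folder).iterdir()
                  if path.is_file() and ".TIF" in path.name)


def build_macro_arguments(image_path, scale_factor):
    """Build the 'name=value, name=value' string that the macro parses."""
    return "input=" + str(image_path) + ", scale_factor=" + str(scale_factor)


# Placeholder folder content: two empty image files, the macro and an out folder.
demo_dir = Path(tempfile.mkdtemp())
for name in ["a_w3Hoechst.TIF", "b_w3Hoechst.TIF", "macro.ijm"]:
    (demo_dir / name).touch()
(demo_dir / "out").mkdir()

tif_files = find_tif_files(demo_dir)
macro_args = build_macro_arguments(tif_files[0], 5)
print(macro_args)
```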

We check the content of the output folder. It should contain the control-output-image with the ROIs of the detected nuclei as overlay and the results table as a csv-file. You can use the jupyter-lab filebrowser to check the content of the files.

In [13]:
%%python
!ls ./tmp/out

gwlmabrp875gwlv5_w3Hoechst.tif	results.csv


We read in the csv-file and display it as an interactive table. 

Now let's play with the table a bit.

* This is done using the ``Python Data Analysis Library (pandas)`` library under the hood, however BeakerX facilitates the usage of pandas and makes the table interactive.
* Put the ``IntDen``-column next to the ``Mean``-column, by using drag-and-drop!
* Sort the table by ``Area``, by clicking on the label of the area-column!
* Display only the columns ``index, area, mean`` and ``IntDen``, by using the menu that opens when clicking in the upper-left corner!
* Resize the table (as you would do with a window) and the width of the columns!
* Display a heatmap of the area-values via the menu from the upper-left corner of the area-label-cell!
* Display data-bars in the IntDen-column!
* Filter the data by the occurrence of a substring in a column or by expressions. You can use the column names and $ for the current column in expressions!
* Export the table to excel!

In [14]:
csv = new CSV().read("/home/jovyan/notebooks/volker/tmp/out/results.csv")

Access a value in the table

In [15]:
csv[3]["Area"]

13578

Access a row in the table

In [16]:
csv[3]

Access a column in the table

In [17]:
csv["Area"]

[10202, 13999, 71, 13578, 16236, 10978, 6850, 13574, 10602, 14550, 12371, 10896]
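
The three access patterns (value, row, column) can be reproduced with Python's standard ``csv`` module. This is only a sketch of the idea and not how BeakerX implements its ``CSV`` class; the sample table is an illustrative excerpt that reuses the first four ``Area`` values from the output above, and the helper ``column`` is made up.

```python
import csv
import io

# Illustrative excerpt (assumption): index column plus the first four Area values.
sample = io.StringIO(" ,Area\n1,10202\n2,13999\n3,71\n4,13578\n")
rows = list(csv.DictReader(sample))


def column(table, name):
    """Collect one column of the table as a list of integers."""
    return [int(row[name]) for row in table]


value = int(rows[3]["Area"])    # a single value
row = rows[3]                   # a whole row, as a dictionary
areas = column(rows, "Area")    # a whole column
print(value)
print(row)
print(areas)
```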

# Plots and histograms

To make it more interesting we first create more data, by applying the macro to all images in the dataset.

In [18]:
import ij.IJ
import static groovy.io.FileType.FILES
scaleFactor = 5

macro = new File('/home/jovyan/notebooks/volker/tmp/macro.ijm').text

def dir = new File("/home/jovyan/notebooks/volker/tmp");
def files = [];
dir.traverse(type: FILES, maxDepth: 0) { if (it.getName().contains(".TIF")) files.add(it) };
IJ.runMacro('run("Clear Results");', "")
files.each {
    file = it
    args = "input="+file.getPath()+", scale_factor="+scaleFactor
    answer = IJ.runMacro(macro, args)
}

 	Label	Area	Mean	Mode	Min	Max	X	Y	Perim.	BX	BY	Width	Height	Major	Minor	Angle	Circ.	Feret	IntDen	Median	Skew	Kurt	%Area	RawIntDen	FeretX	FeretY	FeretAngle	MinFeret	AR	Round	Solidity
1	gwlmabrp875gwlv5_w3Hoechst.TIF	10202	10353.091550676	9419	4331	16380	337.330915507	106.035287199	394.232539419	283	38	107	132	143.838261445	90.306916288	56.924610040	0.824879002	147.299015611	105622240	10180	-0.112429016	-0.168288041	100	105622240	295	165	55.231043445	93.400000000	1.592771267	0.627836539	0.961772331
2	gwlmabrp875gwlv5_w3Hoechst.TIF	13999	7954.662118723	8315	4214	11074	701.671512251	150.813736695	461.303607231	643	73	117	157	163.217578166	109.204416504	112.544186534	0.826671902	167.026943934	111357315	7994	-0.282783709	0.479401828	100	111357315	670	73	109.953799220	112.697826066	1.494606019	0.669072644	0.970400665
3	gwlmabrp875gwlv5_w3Hoechst.TIF	71	6062.774647887	4590	4590	7554	181.514084507	99.077464789	30.970562748	178	93	7	12	12.054813108	7.499079983	89.201968071	0.930186472	12.369316

[/home/jovyan/notebooks/volker/tmp/gwlmabrp875gwlv5_w3Hoechst.TIF, /home/jovyan/notebooks/volker/tmp/gwlmabrp875gwlv2_w3Hoechst.TIF, /home/jovyan/notebooks/volker/tmp/gwlmabrp875gwlv1_w3Hoechst.TIF, /home/jovyan/notebooks/volker/tmp/gwlmabrp875gwlv3_w3Hoechst.TIF, /home/jovyan/notebooks/volker/tmp/gwlmabrp875gwlv4_w3Hoechst.TIF, /home/jovyan/notebooks/volker/tmp/gwlmabrp875gwlv_w3Hoechst.TIF]

We read in and display the newly created csv-file.

* You can change the number of rows that are displayed.

In [19]:
csv = new CSV().read("/home/jovyan/notebooks/volker/tmp/out/results.csv")

## Creating histograms and plots

We get the values from the ``area``-column and display a histogram of the areas.

* The histogram and plots are done by BeakerX. 
* Like the table, they are interactive.
* It is possible to:
 * resize the plot
 * pan - left-click and drag
 * zoom in and out - right-click and drag (box-zoom) or use the mouse-wheel
 * reset the zoom - double-click
 * pin the values at a point to the plot - left click
* You can export a plot as an image

In [20]:
areas = csv["Area"]
new Histogram(data: areas, binCount: 10);
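
To make clear what a histogram with ``binCount: 10`` computes, here is a plain Python sketch that counts values in ten equal-width bins. It assumes equal-width bins between the minimum and the maximum; the binning that BeakerX uses internally may differ in detail. The data are the ``Area`` values printed earlier for the single-image table, not the values of the full dataset.

```python
def histogram_counts(values, bin_count):
    """Count values in bin_count equal-width bins between min and max."""
    low, high = min(values), max(values)
    width = (high - low) / bin_count  # assumes that not all values are equal
    counts = [0] * bin_count
    for value in values:
        # The maximum would fall just outside the last bin, so clamp the index.
        index = min(int((value - low) / width), bin_count - 1)
        counts[index] += 1
    return counts


# Area values from the single-image results table shown earlier.
areas = [10202, 13999, 71, 13578, 16236, 10978,
         6850, 13574, 10602, 14550, 12371, 10896]
counts = histogram_counts(areas, 10)
print(counts)
```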

We create a scatterplot of area vs mean-intensity and area vs max-intensity. 

* You can select and unselect which data is displayed.
* Display different pairs of data in the scatterplot.

In [30]:
new Plot() << new Points(x: csv['Area'], y: csv['Mean'], displayName: "area vs mean") \
           << new Points(x: csv['Area'], y: csv['Max'], displayName: "area vs max") 

# More...

There is much more in BeakerX:
* initialization cells that are automatically executed when the notebook is loaded
* automatic timing of the cell execution
* display the progress while a cell is executed
* execute other cells from a cell
* EasyForm widgets
* TableSaw - descriptive statistics, k-means-clustering, ... on tables
* Smile - machine learning
* DataVec - Deep learning
* more types of plots

Check out the documentation:
* [beakerx](http://beakerx.com/)
* [beakerx notebooks](https://nbviewer.jupyter.org/github/twosigma/beakerx/blob/master/StartHere.ipynb)

# The end

or almost...
* homework: 
 * upload the csv-file as an attachment to the dataset in OMERO

In [None]:
%%python
conn.close()