# QuPath for Python Programmers 🐍
**Alan O'Callaghan, Léo Leplat, Peter Bankhead, Fiona Inglis, Laura Nicolás Sáenz**

# Presenting QuBaLab

This is a Python package for exploring quantitative bioimage analysis... *especially* (but not exclusively) in combination with QuPath (https://qupath.github.io/).

The name comes from **Quantitative Bioimage Analysis Laboratory**. This is chosen to be reminiscent of QuPath (*Quantitative Pathology*), but recognizes that neither is really restricted to pathology.

## Why use QuBaLab?

QuBaLab isn't QuPath - they're just good friends.

* **QuPath** is a user-friendly Java application for bioimage analysis, which has some especially nice features for handling whole slide and highly-multiplexed images. But lots of bioimage analysis research is done in Python, and is hard to integrate with QuPath.
* **QuBaLab**'s main aim is to help with this, by providing tools to help exchange data between QuPath and Python *without any direct dependency on QuPath and Java*. It therefore doesn't require QuPath to be installed, and can be used entirely from Python.

QuBaLab doesn't share code with QuPath, but it uses many of the same conventions for accessing images and representing objects in a GeoJSON-compatible way. By using the same custom fields for things like measurements and classifications, exchanging data is much easier.
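
As a rough illustration of what "GeoJSON-compatible" means here, an annotation can be written as a standard GeoJSON `Feature` whose `properties` carry the QuPath-specific fields. The property names below (`objectType`, `classification`, `measurements`) are our understanding of what recent QuPath versions export; the values are invented for illustration, and the exact layout may differ between versions.

```python
import json

# A hypothetical annotation written as a plain GeoJSON Feature.
# Property names are assumed from QuPath's GeoJSON export; values are made up.
feature = {
    "type": "Feature",
    "geometry": {
        "type": "Polygon",
        # One closed ring of (x, y) pixel coordinates
        "coordinates": [[[0, 0], [100, 0], [100, 50], [0, 50], [0, 0]]],
    },
    "properties": {
        "objectType": "annotation",
        "classification": {"name": "Tumor", "color": [200, 0, 0]},
        "measurements": {"Area px^2": 5000.0},
    },
}

# Because this is plain JSON, it survives a round trip through text unchanged
restored = json.loads(json.dumps(feature))
print(restored["properties"]["classification"]["name"])
```

Any tool that reads GeoJSON can use the geometry, while tools that know about the extra properties can also recover the classification and measurements.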


### How does QuBaLab compare to paquo?

[paquo](https://paquo.readthedocs.io/) is an existing library linking Python and QuPath that provides a Pythonic interface to QuPath.

_We think paquo is great - we don't want to replace it!_

Here are the 3 main differences as we see them:

1. **Target audience**
    - paquo is written mostly for Python programmers who need to work with QuPath data
    - QuBaLab is written mostly for QuPath users who want to dip into Python
2. **Convenience vs. Efficiency**
    - paquo is based on [JPype](http://jpype.readthedocs.io/) to provide full & efficient access to Java from Python
    - QuBaLab is based on [Py4J](https://www.py4j.org) to exchange data between Java & Python - preferring convenience to efficiency
3. **Pixel access**
    - paquo is for working with QuPath projects and objects - accessing pixels is beyond its scope (at least for now)
    - QuBaLab enables requesting pixels as numpy or dask arrays, and provides functions to convert between thresholded images & QuPath objects

So if you're a Python programmer who needs an intuitive and efficient way to work with QuPath data, use paquo.

But if you're a QuPath user who wants to switch to Python for some tasks, including image processing, you might want to give QuBaLab a try.


## Getting started

You can find the documentation on https://qupath.github.io/qubalab-docs/.

We'll go through some examples that are adapted from the notebooks hosted there. We hope to add more workflows soon!

## Downloading sample images

If you downloaded the [sample project for this notebook](https://github.com/qupath/i2k-qupath-for-python-programmers/releases/download/v0.1.0/i2k-qupath-python-project.zip) and extracted it to the notebook directory, the code below will simply find the images already there. Otherwise, it will download the two images we'll use.

In [None]:
cache_folder = "./i2k-qupath-python-project/images"

# Define a utility function to find or download an image

from pathlib import Path
import urllib.request

if cache_folder != "":
    Path(cache_folder).mkdir(parents=True, exist_ok=True)

def get_image(image_name, image_url):
    """Return a local path to the image, downloading it if it is not cached."""
    # With no cache folder, urlretrieve() saves to a temporary file instead
    filename = Path(cache_folder) / image_name if cache_folder != "" else None

    if filename is None or not filename.exists():
        print(f"Downloading {image_name}...")
        path, _ = urllib.request.urlretrieve(image_url, filename=filename)
        print(f'{image_name} saved to {path}')
    else:
        path = filename
        print(f'{image_name} found in {path}')

    return path

In [None]:
# Download or get image
cmu_path = get_image("CMU-1.svs", "https://openslide.cs.cmu.edu/download/openslide-testdata/Aperio/CMU-1.svs")

In [None]:
# Download or get image
fluoro_path = get_image("Patient_test_1.ome.tiff", "https://ftp.ebi.ac.uk/biostudies/fire/S-BIAD/463/S-BIAD463/Files/my_submission/Validation_raw/DCIS/Patient_test_1.ome.tiff")

## ImageServer

QuBaLab provides a number of ImageServers that allow you to read images just as QuPath would.


OpenSlide for RGB images:

In [None]:
from qubalab.images.openslide_server import OpenSlideServer

print(cmu_path)
openslide_server = OpenSlideServer(cmu_path)

AICSImageIO for general purpose images (this may soon be replaced with BioIO):

In [None]:
from qubalab.images.aicsimageio_server import AICSImageIoServer

print(fluoro_path)
aicsimageio_server = AICSImageIoServer(fluoro_path)

ICC profile servers that wrap other servers:

In [None]:
from qubalab.images.icc_profile_server import IccProfileServer

icc_profile_server = IccProfileServer(openslide_server)

Since we're using an RGB image, we'll use OpenSlide.

In [None]:
server = openslide_server

## ImageServer operations

We can easily query image metadata using an ImageServer.

In [None]:
metadata = server.metadata

print(f'Image name: {metadata.name}')
print('Levels:')
for level, shape in enumerate(metadata.shapes):
    print(f'Shape of level {level}: {shape}')
print('Pixel calibration:')
print(f'Pixel length on x-axis: {metadata.pixel_calibration.length_x}')
print(f'Pixel length on y-axis: {metadata.pixel_calibration.length_y}\n')
print(f'Pixel type: {metadata.dtype}')
print(f'Downsamples: {metadata.downsamples}\n')
print('Channels:')
for channel in metadata.channels:
    print(channel)

## Reading images

ImageServer provides a simple API for reading whole images:

In [None]:
highest_downsample = server.metadata.downsamples[-1]
lowest_resolution = server.read_region(highest_downsample)

print(f'Image shape: {lowest_resolution.shape}')

from qubalab.display.plot import plotImage
import matplotlib.pyplot as plt
_, ax = plt.subplots()
plotImage(ax, lowest_resolution)
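
The cell above asks for the coarsest stored level by taking the last entry of `metadata.downsamples`. When the downsample you want falls between stored levels, a common strategy with image pyramids is to read the stored level with the largest downsample that does not exceed the request, then shrink the result further. The sketch below shows that selection step only; `level_for_downsample` is a hypothetical helper, the downsample values are invented, and we are not claiming this is exactly how QuBaLab chooses levels internally.

```python
def level_for_downsample(downsamples, requested):
    """Index of the coarsest level whose downsample does not exceed `requested`.

    Assumes `downsamples` is sorted in ascending order, with level 0 at full resolution.
    """
    best = 0
    for level, downsample in enumerate(downsamples):
        if downsample <= requested:
            best = level
    return best


# Hypothetical pyramid with three stored levels
example_downsamples = [1.0, 4.0, 16.0]
for requested in (1, 5, 20):
    level = level_for_downsample(example_downsamples, requested)
    print(f"Requested downsample {requested}: read level {level}")
```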

## Reading regions

Similarly, we can request just part of an image:

In [None]:
downsample = 1
x = 13000
y = 15000
width = 2000
height = 1000
tile = server.read_region(downsample, x=x, y=y, width=width, height=height)

print(f'Tile shape: {tile.shape}')

_, ax = plt.subplots()
plotImage(ax, tile)
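
It can help to predict the size of a tile before requesting it. The sketch below assumes that `x`, `y`, `width` and `height` are given in full-resolution pixels (as in QuPath) and that the returned array is shrunk by the downsample, with channels first. The helper name and the rounding rule are our assumptions, so the real shape may differ by a pixel.

```python
def expected_tile_shape(n_channels, width, height, downsample):
    """Approximate (c, y, x) shape of a region read at the given downsample.

    `width` and `height` are in full-resolution pixels; rounding is an assumption.
    """
    return (n_channels, round(height / downsample), round(width / downsample))


# The region used above, at full resolution and at a 20x downsample
print(expected_tile_shape(3, width=2000, height=1000, downsample=1))
print(expected_tile_shape(3, width=2000, height=1000, downsample=20))
```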

## Reading images lazily

Images can also be read as dask arrays, whose pixel values are only computed on demand:

In [None]:
last_level = server.metadata.n_resolutions - 1
lowest_resolution = server.level_to_dask(last_level)

# Pixel values are not read yet, but you can get the shape of the image
print(f'Image shape: {lowest_resolution.shape}')

# Compute array. This will read the pixel values
lowest_resolution = lowest_resolution.compute()

_, ax = plt.subplots()
plotImage(ax, lowest_resolution)

## Reading tiles lazily

We can also read tiles into dask arrays:

In [None]:
highest_resolution = server.level_to_dask(0)
print(f'Full resolution image shape: {highest_resolution.shape}')

x = 13000
y = 15000
width = 2000
height = 1000
tile = highest_resolution[:, y:y+height, x:x+width]
print(f'Tile shape: {tile.shape}')

tile = tile.compute() #  This will only read the pixel values of the tile
_, ax = plt.subplots()
plotImage(ax, tile)
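
The slicing above relies on the `(c, y, x)` axis order: channels first, then rows, then columns. The sketch below uses a small in-memory numpy array as a stand-in, since dask arrays follow the same indexing rules; the array size and coordinates are invented for illustration.

```python
import numpy as np

# Stand-in for a small (c, y, x) image: 3 channels, 40 rows, 60 columns
demo_image = np.zeros((3, 40, 60), dtype=np.uint8)

# Rows are indexed with y and height, columns with x and width
demo_x, demo_y, demo_width, demo_height = 10, 5, 20, 8
demo_tile = demo_image[:, demo_y:demo_y + demo_height, demo_x:demo_x + demo_width]
print(demo_tile.shape)

# A slice that runs past the image border is clipped rather than raising an error
edge_tile = demo_image[:, 35:35 + demo_height, 50:50 + demo_width]
print(edge_tile.shape)
```

Because out-of-range slices are silently clipped, it is worth checking the tile shape when working near the edge of an image.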

## Interacting with QuPath

So far, we've been working just in the Python world. However, QuBaLab also provides an easy-to-use connection to QuPath.

## Setting up a gateway

The main method of interacting with QuPath is through a *gateway*, which uses Py4J to communicate with QuPath over a socket connection:

In [None]:
from qubalab.qupath import qupath_gateway

token = None
port = 25333
gateway = qupath_gateway.create_gateway(auth_token=token, port=port)

gateway
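
If creating the gateway fails, a quick first check is whether anything is listening on the port at all. The sketch below uses only the standard library; `is_port_open` is a hypothetical helper that is not part of QuBaLab, and it only tells you that a TCP port accepts connections, not that QuPath is the program behind it.

```python
import socket


def is_port_open(host="localhost", port=25333, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if is_port_open():
    print("Something is listening on port 25333.")
else:
    print("Nothing is listening on port 25333; is the QuPath gateway running?")
```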

## QuPath setup

This notebook assumes you've opened the `HE_Hamamatsu.tiff` image in the example project, which is a [CC0 TMA image](https://dataverse.harvard.edu/file.xhtml?persistentId=doi:10.7910/DVN/GG0D7G/VCRA28&version=1.0). We can check this by taking a snapshot of the QuPath GUI.

In [None]:
import matplotlib.pyplot as plt

plt.imshow(qupath_gateway.create_snapshot())
plt.axis(False)
plt.show()

## Basic gateway operations

A gateway provides us with an *entry point*, which allows us to call Java methods from Python. For anybody well-versed in the QuPath Groovy API, this is an instance of the `QPEx` class.


In [None]:
print(f"Extension version: {gateway.entry_point.getExtensionVersion()}")

In [None]:
print(f"Current image name: {gateway.entry_point.getCurrentImageName()}")

## Mixing 🐍 and ☕

Since the entry point exposes the QuPath scripting interface, you can do lots of basic scripting operations in Python by calling object methods.

In [None]:
cores = gateway.entry_point.getTMACoreList()

positive_cores = [core for core in cores if core.getClassification() == "Positive"]

[(core.getName(), core.getChildObjects().size()) for core in positive_cores]

## Downsides

However, it's not ideal to write Java code in Python, and we wouldn't necessarily encourage it.

For one, we don't provide wrappers for Java objects, so finding the right methods can be tricky.

Furthermore, it'll be very difficult to maintain complex scripts, especially if and when implementation details change in QuPath.

If you just want to script QuPath, Groovy will remain the best bet.

## Objects

We can request objects from QuPath and get references to Java objects.

In [None]:
from qubalab.objects.object_type import ObjectType

object_type = ObjectType.ANNOTATION    # could be DETECTION, TILE, CELL, TMA_CORE

In [None]:
annotations = qupath_gateway.get_objects(object_type=object_type)
annotation = annotations[0]

annotation.setName("Hello from Python")
print(annotation)

In [None]:
qupath_gateway.refresh_qupath()

plt.imshow(qupath_gateway.create_snapshot())
plt.axis(False)
plt.show()

## Converting objects

We can also specify a converter when requesting objects, meaning we actually retrieve simple Python objects. This makes it easy to write proper Python code to process objects.

In [None]:
annotations = qupath_gateway.get_objects(object_type=object_type, converter='geojson')

print(type(annotations[0]))

In [None]:
from shapely.geometry import shape

shape(annotations[0].geometry)
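
Once an object's geometry is plain GeoJSON, ordinary Python is enough to measure it. The sketch below computes the area of a `Polygon` with the shoelace formula, using invented coordinates and two hypothetical helpers; with shapely, `shape(geometry).area` should give the same number.

```python
def ring_area(ring):
    """Area enclosed by a closed ring of (x, y) points (shoelace formula)."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def polygon_area(geometry):
    """Area of a GeoJSON Polygon: the exterior ring minus any holes."""
    exterior, *holes = geometry["coordinates"]
    return ring_area(exterior) - sum(ring_area(hole) for hole in holes)


# Hypothetical 100 x 50 rectangle with a 10 x 10 hole
demo_geometry = {
    "type": "Polygon",
    "coordinates": [
        [[0, 0], [100, 0], [100, 50], [0, 50], [0, 0]],
        [[10, 10], [20, 10], [20, 20], [10, 20], [10, 10]],
    ],
}
print(polygon_area(demo_geometry))
```

The result is in squared pixels; multiplying by the pixel width and height from the image metadata would give a physical area.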

## Adding and deleting objects

We can modify properties of these Python objects, and then add them back to QuPath!

In [None]:
import random

for annotation in annotations:
    annotation.color = (random.randint(0, 255), random.randint(0, 255), random.randint(0, 255))

qupath_gateway.add_objects(annotations)

## Building a workflow using ImageServer and python

Now that we've shown the individual pieces, let's walk through a short workflow that puts them together!

### Reading the image currently open

Let's use a `QuPathServer` to read pixels from the currently-open image in QuPath.

**Note: this will be slower than reading from disk.**

In [None]:
from qubalab.images.qupath_server import QuPathServer

qupath_server = QuPathServer(gateway) # use image currently opened in QuPath
downsample = 20 # reduce size by 20x
image = qupath_server.read_region(downsample=downsample)

### Converting to grayscale and filtering

With the image in hand, we can first convert it to grayscale and apply a simple Gaussian filter to remove some high-frequency content.

In [None]:
import numpy as np

from skimage.filters import gaussian
from skimage.color import rgb2gray

# If the image is RGB, we convert it to grayscale
# read_region() returns an image with the (c, y, x) shape.
# To use rgb2gray, we need to move the channel axis so that
# the shape becomes (y, x, c)
image = np.moveaxis(image, 0, -1)
image = rgb2gray(image)

# Apply a gaussian filter
image = gaussian(image, 2.0)
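
To make the axis handling concrete, the sketch below repeats the conversion with numpy alone on a tiny invented array. It uses the luminance weights given in the scikit-image documentation for `rgb2gray` and rescales 8-bit values to the range 0 to 1, which is why the threshold in the next step is a fraction rather than a value out of 255. This is an illustration, not a replacement for `rgb2gray`.

```python
import numpy as np

# Stand-in for read_region() output: (c, y, x), 8-bit RGB, pure red everywhere
demo_rgb = np.zeros((3, 4, 6), dtype=np.uint8)
demo_rgb[0] = 255

# Move channels last: (c, y, x) -> (y, x, c)
channels_last = np.moveaxis(demo_rgb, 0, -1)

# Weighted sum of R, G and B after rescaling 8-bit values to [0, 1]
luminance_weights = np.array([0.2125, 0.7154, 0.0721])
demo_gray = (channels_last / 255.0) @ luminance_weights

print(channels_last.shape, demo_gray.shape)
print(demo_gray[0, 0])
```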

### Identifying and applying a threshold

Now we can compute a threshold using the Otsu method, and apply it to the filtered image to make a mask.

In [None]:
from skimage.filters import threshold_otsu

threshold = threshold_otsu(image)

mask = image < threshold
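
For readers curious about what the Otsu method does, here is a compact numpy sketch of the idea: build a histogram, then pick the split that maximizes the variance between the two resulting classes. It is meant as an illustration on synthetic data, it assumes the input contains at least two distinct values, and `threshold_otsu` remains the function to use in practice.

```python
import numpy as np


def otsu_sketch(values, n_bins=256):
    """Bin center that maximizes the between-class variance of a histogram."""
    counts, edges = np.histogram(np.ravel(values), bins=n_bins)
    centers = (edges[:-1] + edges[1:]) / 2
    counts = counts.astype(float)

    # Pixel counts and mean intensities from the low end and from the high end
    weight_low = np.cumsum(counts)
    weight_high = np.cumsum(counts[::-1])[::-1]
    mean_low = np.cumsum(counts * centers) / weight_low
    mean_high = (np.cumsum((counts * centers)[::-1]) / weight_high[::-1])[::-1]

    # Between-class variance for a split after each bin except the last
    variance = weight_low[:-1] * weight_high[1:] * (mean_low[:-1] - mean_high[1:]) ** 2
    return centers[np.argmax(variance)]


# Synthetic data: a dark population and a bright population of equal size
rng = np.random.default_rng(0)
demo_values = np.concatenate([
    rng.normal(0.2, 0.03, size=1000),
    rng.normal(0.8, 0.03, size=1000),
])
demo_threshold = otsu_sketch(demo_values)
demo_mask = demo_values < demo_threshold
print(f"threshold={demo_threshold:.2f}, fraction below={demo_mask.mean():.2f}")
```

As in the cell above, `<` selects the darker class, which for an H&E image on a bright background corresponds to tissue.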

### Converting a mask to an ImageFeature

We can use a QuBaLab method to convert a labelled image (here, our binary mask) into Python `ImageFeature` objects, which are similar to the QuPath objects we worked with earlier.

In [None]:
from qubalab.objects.image_feature import ImageFeature

mask_annotation = ImageFeature.create_from_label_image(
    mask,   
    scale=downsample,   # mask is 20 times smaller than the QuPath image, so we scale
                        # the annotations to fit the QuPath image
    classification_names="Otsu",  # set a single classification to the detected annotations
)

# Add the result back to QuPath
qupath_gateway.add_objects(mask_annotation)
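
The `scale` argument matters because the mask was computed on an image 20 times smaller than the one open in QuPath. Conceptually, every coordinate traced on the mask has to be multiplied by the downsample before it lines up with the full-resolution image. The sketch below shows that arithmetic on an invented outline; `scale_ring` is a hypothetical helper and not how QuBaLab necessarily implements it.

```python
def scale_ring(ring, scale):
    """Multiply every (x, y) point of a ring by `scale`."""
    return [[x * scale, y * scale] for x, y in ring]


# Hypothetical outline traced on a mask that is 20 times smaller than the full image
mask_ring = [[10, 5], [30, 5], [30, 15], [10, 15], [10, 5]]
full_ring = scale_ring(mask_ring, 20)
print(full_ring[:2])
```

Forgetting this step would place the objects in the top-left corner of the image at a twentieth of their proper size.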

### Visualising our mask

Before showing what it looks like in QuPath, let's visualise it in Python:

In [None]:
import matplotlib.pyplot as plt

plt.imshow(mask)
plt.title(f'Otsu (threshold={threshold:.2f})')
plt.axis(False)
plt.show()

## Other workflow ideas

- Object classifiers using scikit-learn
- Object clustering using graph clustering
- Assessing classifier feature importance
- Your ideas or requests...?