# Pixasonics: An Image Sonification Toolbox for Python

## Introduction

Pixasonics is a library for interactive audiovisual image analysis and exploration through image sonification. That is, it uses real-time audio and visualization to let you listen to image data by mapping image features to acoustic parameters. This can be handy when you need to work with a large number of images, image stacks, or hyperspectral images (involving many color channels), where visualization alone becomes limiting, challenging, and potentially overwhelming.

With pixasonics, you can launch a little web application (running in Jupyter Notebooks), where you can load images, probe their data with various feature extraction methods, and map the extracted features to parameters of synths, devices that make sound. You can do all this in real-time using a visual interface, remote-control the interface programmatically, and record sound either in real-time or in non-real-time with a custom script.

### If you are in a hurry...

In [None]:
# !pip install pixasonics

# quick workflow with a simple example
from pixasonics.core import App, Mapper
from pixasonics.features import MeanChannelValue
from pixasonics.synths import Theremin

# create a new app
app = App() # by default 500x500 pixels

# load an image from file
app.load_image_file("images/test.jpg")

# create a Feature that will report the mean value of the red channel
mean_red = MeanChannelValue(filter_channels=0, name="MeanRed")
# attach the feature to the app
app.attach_feature(mean_red) # this adds it to the pipeline, so that it is updated every frame (whenever the Probe moves)

# create a Theremin, a simple sine wave synth that we will use to sonify the mean pixel value
theremin = Theremin(name="MySine")
# attach the Theremin to the app
app.attach_synth(theremin) # this adds it to the pipeline, so that its audio output is patched into the global audio graph

# create a Mapper that will map the mean red pixel value (within the Probe) to the frequency of the Theremin
red2freq = Mapper(mean_red, theremin["frequency"], exponent=2, name="Red2Freq") # quadratic mapping curve (exponent=2) for a more "linear" feel of frequency changes
# attach the Mapper to the app
app.attach_mapper(red2freq) # this adds it to the pipeline, so that it is updated every frame (whenever the Probe moves)

## Toolbox Structure

Pixasonics (at the moment) is expected to run in a Jupyter notebook environment. (Nothing stops you from using it in the terminal, but it is not optimized for that yet.)

At the center of pixasonics is the `App` class. This represents a template pipeline where all your image data, feature extractors, synths and mappers will live. The App also comes with a graphical user interface (UI). At the moment it is expected that you only create one `App` at a time, which will control the global real-time audio server. (And every time you create an `App` it will reset the audio graph.)

When you have your app, you load an image (either from a file, or from a numpy array) which will be displayed in the `App` canvas. Note that the height and width dimensions of your image data (the first two axes) will be resized to the `App`'s `image_size` creation argument, which is a tuple of `(500, 500)` pixels by default.

Then you can explore the image data with a Probe (represented by the yellow rectangle on the canvas) using your mouse or trackpad. The Probe is your "stethoscope" on the image. More technically, it is the sub-matrix of the image under the Probe that is passed to all `Feature` objects in the pipeline.

Speaking of which, you can extract visual features using the `Feature` base class, or any of its convenience abstractions (e.g. `MeanChannelValue`). Currently only basic statistical reductions are supported, such as mean, median, min, max, sum, std (standard deviation) and var (variance). `Feature` objects also come with a UI that shows their current values and global/running min and max. There can be any number of different `Feature`s attached to the app, and all of them will get the same Probe matrix as input.
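To make the idea of a Probe sub-matrix and its statistical reductions concrete, here is a minimal NumPy-only sketch. It does not use pixasonics and is not how the library is implemented internally. The random stand-in image, the variable names, and the choice to treat the Probe position as its top-left corner are all assumptions made for illustration.

```python
import numpy as np

# Stand-in image data: height x width x channels (RGB), values 0..255.
rng = np.random.default_rng(42)
image = rng.integers(0, 256, size=(500, 500, 3)).astype(float)

# Hypothetical Probe geometry. Treating (x, y) as the top-left corner
# is an assumption made for this sketch only.
probe_x, probe_y = 233, 252
probe_w, probe_h = 25, 25

# The sub-matrix "under" the Probe: rows follow y, columns follow x.
probe = image[probe_y:probe_y + probe_h, probe_x:probe_x + probe_w, :]

# Basic statistical reductions over the red channel (index 0) of the Probe.
red = probe[..., 0]
stats = {
    "mean": red.mean(),
    "median": np.median(red),
    "min": red.min(),
    "max": red.max(),
    "sum": red.sum(),
    "std": red.std(),
    "var": red.var(),
}

print(probe.shape)  # (25, 25, 3)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Every reduction collapses the same 25x25 slice into a single number, which is what makes it usable as a control value for a synth parameter.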

Image features are to be mapped to synthesis parameters, that is, to the settings of sound-making gadgets. (This technique is called "Parameter Mapping Sonification" in the literature.) All synths (and audio) in pixasonics are based on the fantastic [signalflow library](https://signalflow.dev/). For now, there are four synth classes that you can use (and many more are on the way): `Theremin`, `Oscillator`, `FilteredNoise`, and `SimpleFM`. Each synth comes with a UI, where you can tweak the parameters (or see them being modulated by `Mapper`s) in real-time.

What connects the output of a `Feature` and the input parameter of a Synth is a `Mapper` object. There can be multiple `Mapper`s reading from the same `Feature` buffer and a Synth can have multiple `Mapper`s modulating its different parameters.
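As a rough illustration of what parameter mapping with an exponent can look like, the sketch below scales a feature value from an input range to an output range along a power curve. This is a generic formulation, not the actual `Mapper` implementation: the function name `map_value`, the clamping step, and the 60 to 4000 Hz frequency range are hypothetical.

```python
def map_value(value, in_low, in_high, out_low, out_high, exponent=1.0):
    """Scale value from [in_low, in_high] to [out_low, out_high] along a power curve."""
    if in_high == in_low:
        # Degenerate input range: fall back to the lower output bound.
        return out_low
    # Normalize to 0..1 and clamp values that fall outside the input range.
    norm = (value - in_low) / (in_high - in_low)
    norm = min(max(norm, 0.0), 1.0)
    # Apply the curve, then rescale to the output range.
    return out_low + (norm ** exponent) * (out_high - out_low)


# Hypothetical ranges: an 8-bit mean pixel value mapped to 60..4000 Hz.
for pixel_mean in (0, 64, 128, 255):
    linear = map_value(pixel_mean, 0, 255, 60, 4000, exponent=1)
    curved = map_value(pixel_mean, 0, 255, 60, 4000, exponent=2)
    print(pixel_mean, round(linear, 1), round(curved, 1))
```

With `exponent=2` the low end of the input range is compressed, so equal steps in pixel value produce smaller frequency steps at low frequencies, where pitch perception is more sensitive to a given change in Hz.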

## The App

The `App` class is at the core of the pixasonics workflow. It is where you load your image data and move the Probe. It also controls the real-time audio server, and it represents the pipeline of `Feature`s connected to Synths with `Mapper`s.

To showcase the `App` and its functionality, let's create a basic scene, using an image from the [CELLULAR open dataset](https://zenodo.org/records/8315423) and map the mean red channel value to the frequency of a Theremin (a simple sine wave generator).

In [None]:
from pixasonics.core import App, Mapper
from pixasonics.features import MeanChannelValue
from pixasonics.synths import Theremin

# create a new app
app = App() # by default 500x500 pixels

# load an image from file
app.load_image_file("images/cellular_dataset/merged_8bit/Timepoint_001_220518-ST_C03_s1.jpg")

# create a Feature that will report the mean value of the red channel
mean_red = MeanChannelValue(filter_channels=0, name="MeanRed")
# attach the feature to the app
app.attach_feature(mean_red) # this adds it to the pipeline, so that it is updated every frame (whenever the Probe moves)

# create a Theremin, a simple sine wave synth that we will use to sonify the mean pixel value
theremin = Theremin()
# attach the Theremin to the app
app.attach_synth(theremin) # this adds it to the pipeline, so that its audio output is patched into the global audio graph

# create a Mapper that will map the mean red pixel value (within the Probe) to the frequency of the Theremin
red2freq = Mapper(mean_red, theremin["frequency"], exponent=2, name="Red2Freq") # quadratic mapping curve (exponent=2) for a more "linear" feel of frequency changes
# attach the Mapper to the app
app.attach_mapper(red2freq) # this adds it to the pipeline, so that it is updated every frame (whenever the Probe moves)

A few things just happened. The app created (or restarted) the global audio graph and the UI popped up: a canvas with the image on the left, and various settings panes on the right. The size of the canvas is determined by the `image_size` argument at the creation of the `App` object and cannot be changed afterwards. It is a tuple corresponding to `(height, width)`, and it is `(500, 500)` by default. All images that you load into the app will be __resized__ to this height and width! This means that, at least currently, the app does not respect the aspect ratio of your input image: it will stretch or shrink it to whatever `image_size` your `App` has. Luckily the image we loaded is 2048x2048 with a square aspect ratio, so the default `image_size` was fine.
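To see what resizing without respecting the aspect ratio means in practice, here is a small NumPy-only sketch using nearest-neighbor sampling. The resampling method pixasonics actually uses is not stated here, so treat `resize_nearest` as a hypothetical helper that only demonstrates the stretch-or-shrink effect.

```python
import numpy as np


def resize_nearest(img, size):
    """Resize the first two axes of img to size=(height, width) by nearest-neighbor sampling."""
    new_h, new_w = size
    h, w = img.shape[:2]
    # For each target row and column, pick the source index it falls on.
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    # Broadcasting the row and column indices selects a (new_h, new_w) grid.
    return img[rows[:, None], cols]


# A non-square 300x600 RGB image is stretched vertically and squeezed
# horizontally to fit 500x500, so its aspect ratio is not preserved.
wide = np.random.default_rng(0).integers(0, 256, size=(300, 600, 3), dtype=np.uint8)
resized = resize_nearest(wide, (500, 500))
print(wide.shape, "->", resized.shape)  # (300, 600, 3) -> (500, 500, 3)
```

If the proportions of your image matter, one option is to pass an `image_size` with the same height-to-width ratio as your data when creating the `App`.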

Now let's look to the right where the various settings panes live. These can be opened and closed, one at a time. Here is a brief summary of what you can find in them.

### Audio Settings

Click on "Audio Settings" on the top right to open this pane. Here you can control the _global_ audio settings of the app.

#### The Audio switch

There is an audio switch at the top left, and a volume slider next to it. To have any sound produced by the app, you need to turn on audio, either on the UI, or by evaluating:

In [3]:
app.audio = True

Still no sound from the app, huh? This is because we haven't interacted with the canvas yet. Try clicking on a few of the cells. Then try to click and drag to hear a continuous change in frequency.

#### The Master Volume Slider

Next to the audio switch there is the Master Volume slider where you can set the loudness of the app's sound output in decibels (dB). You can either control it via the UI or the `master_volume` property:

In [4]:
app.master_volume = -24 # check the slider after running this cell
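For a sense of scale, decibel values relate to linear amplitude factors through the standard formula amplitude = 10^(dB/20). The helper names below are hypothetical, and the sketch assumes the slider follows this common amplitude convention.

```python
import math


def db_to_amplitude(db):
    """Convert a level in decibels to a linear amplitude factor."""
    return 10 ** (db / 20)


def amplitude_to_db(amplitude):
    """Convert a linear amplitude factor (> 0) back to decibels."""
    return 20 * math.log10(amplitude)


for db in (0, -6, -24):
    print(db, round(db_to_amplitude(db), 3))
# 0 1.0
# -6 0.501
# -24 0.063
```

Under that assumption, -24 dB corresponds to roughly 6% of full-scale amplitude, and every additional -6 dB roughly halves the amplitude.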

#### Real-time recording to file

Below the audio switch and the master slider, there is a pair of widgets that let you record the real-time output of the app. Leave the file name at the default "recording.wav", hit the Record button, click and drag around the image, then click on the record button again to stop recording.

Let's listen to what you've done:

In [None]:
from IPython.display import Audio, display
display(Audio("recording.wav"))

The global recording is meant to be a quick-and-easy way to record your results. If you want to be more precise (or quicker), there are methods to render your results in non-real-time, which we will discuss later.

#### The Master Envelope

Below the recording section you find the Master Envelope, which controls how fast the sound fades in and out when you interact with the canvas. It is a traditional Attack-Decay-Sustain-Release (ADSR) curve that is applied to the volume of the global audio output. Here is how it works in a nutshell:
- Attack: the time for the sound to fade in (in seconds),
- Decay: the time to fade to the Sustain level (in seconds),
- Sustain: the level (amplitude) to sustain indefinitely, until we release the mouse button (or deactivate the Probe),
- Release: the time for the sound to fade out after we release the mouse button or deactivate the Probe (in seconds).

There are some fields to set these parameters, and a little drawing that visualizes the proportions of the different time segments. On the bottom, there is also a "Duration" value that shows the total duration of the envelope (that is Attack time + Decay time + Release time).
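The shape of such an envelope can be sketched with NumPy as four concatenated segments. This is only an illustration: the linear segment shapes, the `hold` argument (how long the mouse button stays down), and the `points_per_second` resolution are assumptions, and the real Master Envelope may use different curves.

```python
import numpy as np


def adsr_curve(attack, decay, sustain, release, hold=0.5, points_per_second=1000):
    """Build a linear-segment ADSR curve as an array of amplitude values.

    attack, decay, release and hold are in seconds; sustain is a level in 0..1.
    """
    n = lambda seconds: int(round(seconds * points_per_second))
    a = np.linspace(0.0, 1.0, n(attack), endpoint=False)    # fade in
    d = np.linspace(1.0, sustain, n(decay), endpoint=False)  # fall to sustain level
    s = np.full(n(hold), sustain)                            # held while active
    r = np.linspace(sustain, 0.0, n(release))                # fade out
    return np.concatenate([a, d, s, r])


# A slow envelope: half a second to fade in, one and a half to fade out.
env = adsr_curve(attack=0.5, decay=0.01, sustain=1.0, release=1.5)

# The "Duration" value excludes the hold time: Attack + Decay + Release.
duration = 0.5 + 0.01 + 1.5
print(len(env), duration)  # 2510 2.01
```

The hold segment has no fixed length in practice, since the sustain level lasts for as long as the Probe stays active, which is why it is not part of the displayed duration.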

Let's set a somewhat slower envelope to illustrate how it works:

In [6]:
app.master_envelope.attack = 0.5
app.master_envelope.release = 1.5

We can also set a more percussive envelope:

In [7]:
app.master_envelope.attack = 0.01
app.master_envelope.decay = 0.01
app.master_envelope.sustain = 0.1
app.master_envelope.release = 0.8

For now, let's reset it to a more neutral default:

In [8]:
app.master_envelope.attack = 0.1
app.master_envelope.decay = 0.01
app.master_envelope.sustain = 1
app.master_envelope.release = 0.1

In the future Envelopes will have more functionality in pixasonics, but for now there is only a global Master Envelope. Now let's close the Audio Settings pane and move on to Display Settings.

### Display Settings

We are starting to stray a bit far from the cell where our app UI lives. Luckily, we can get a synced copy down here if we evaluate:

In [None]:
app.ui

Aaah, much better. Let's open the Display Settings. As the name suggests, this is where you can find settings related to how the image is displayed in the app canvas. These only affect the display, though: the underlying image data (that may be HDR, or have many color channels or image layers) is left untouched.

#### Normalization

On the top of the pane you can find two checkboxes: "Normalize display" and "Global normalization". More often than not we need to normalize images to understand (or even, simply, to see!) the image content better. Traditionally, the normalization is performed individually on all color channels (Red, Green, and Blue in "normal" images). Keeping it channel-wise can help remove imbalances between the channels, like the green tint here. Check "Normalize display" in the UI, or evaluate:

In [10]:
app.normalize_display = True

You probably noticed that the background became much less green-tinted. Let's try also checking "Global normalization":

In [11]:
app.normalize_display_global = True

...aaand now the green tint is back. It makes sense, because when normalization is set to global, the algorithm will scale all pixels according to the minimum and maximum pixel value found in __any__ of the channels. Since in the original image the green tint was the result of generally much higher green pixel values, applying global normalization brought the tint back. Let's turn it off for now:

In [12]:
app.normalize_display_global = False
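The difference between channel-wise and global normalization can be reproduced on a toy image with a few lines of NumPy. The sketch assumes plain min-max scaling to the 0..1 range; the `normalize` helper and the synthetic green-tinted image are hypothetical and only mirror the behavior described above.

```python
import numpy as np


def normalize(img, per_channel=True):
    """Min-max scale img to 0..1, per channel or with one global min and max."""
    img = img.astype(float)
    # Reduce over height and width only (per channel), or over everything (global).
    axes = (0, 1) if per_channel else None
    lo = img.min(axis=axes, keepdims=True)
    hi = img.max(axis=axes, keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid dividing by zero on flat data
    return (img - lo) / span


# Toy image with a green tint: green lives in 100..200, red and blue in 0..50.
rng = np.random.default_rng(1)
img = np.stack(
    [
        rng.uniform(0, 50, (64, 64)),     # red
        rng.uniform(100, 200, (64, 64)),  # green
        rng.uniform(0, 50, (64, 64)),     # blue
    ],
    axis=-1,
)

channel_wise = normalize(img, per_channel=True)
global_norm = normalize(img, per_channel=False)

print(channel_wise.mean(axis=(0, 1)))  # all three channel means are similar
print(global_norm.mean(axis=(0, 1)))   # green mean stays far above red and blue
```

Channel-wise scaling stretches each channel to the full range independently, which removes the imbalance, while global scaling preserves the relative levels between channels and with them the tint.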

#### Channel Offset and Layer Offset

As you see, under the checkboxes we have two (currently disabled) sliders, labelled "Channel Offset" and "Layer Offset". We will use these when we read numpy arrays instead of single image files (more on this later). For now, let's move on to the Probe Settings.

### Probe Settings

This is where you set everything related to the Probe (remember, our image "stethoscope") and how you interact with it.

#### The Probe Width and Height

The Probe is a rectangle, whose width and height can vary from 1x1 pixel up to the app's `image_size`. Try moving the sliders and observe how the shape of the yellow rectangle on the canvas changes. There is not really a "right" setting for this, it is highly dependent on the content of your image, and the kind of sonification you want to create. At an extreme, you can create horizontal or vertical scan lines for the whole image like this:

In [13]:
# full-height scan line (1 pixel wide), for sweeping horizontally across the image
app.probe_width = 1
app.probe_height = app.image_size[0] # remember that image size is given as (height, width)

In [14]:
# full-width scan line (1 pixel high), for sweeping vertically across the image
app.probe_width = app.image_size[1]
app.probe_height = 1

Most of the time, you probably want to set the Probe dimensions to something smaller, to fit the content of the image. Here, let's set it to fit that nice, bright cell in the middle:

In [16]:
app.probe_width = 25
app.probe_height = 25
app.probe_x = 233
app.probe_y = 252

Yes, you guessed it: we can programmatically move the probe around using the `probe_x` and `probe_y` properties of our app. Setting the dimensions and the position of the probe like this will be the basis of scripting custom paths and rendering them in non-real-time (more on that later).
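As a small preview of what a scripted path could look like, the sketch below generates evenly spaced positions along a straight line and assigns them one by one to `probe_x` and `probe_y`. To stay runnable on its own it uses a stand-in object instead of a real `App`; the `line_path` helper and `app_like` are hypothetical, and how the real app responds to rapid successive updates (timing, audio output) is not covered here.

```python
from types import SimpleNamespace

import numpy as np


def line_path(start, end, steps):
    """Return integer (x, y) positions evenly spaced on a line from start to end."""
    xs = np.linspace(start[0], end[0], steps).round().astype(int)
    ys = np.linspace(start[1], end[1], steps).round().astype(int)
    return list(zip(xs.tolist(), ys.tolist()))


# Stand-in with the two properties used below; in the notebook you would
# use your own `app` object instead.
app_like = SimpleNamespace(probe_x=0, probe_y=0)

# Sweep from the left edge to x=475 along the row y=250 in 20 steps.
path = line_path(start=(0, 250), end=(475, 250), steps=20)
for x, y in path:
    app_like.probe_x = x
    app_like.probe_y = y

print(path[0], path[-1])  # (0, 250) (475, 250)
```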

#### Interaction modes

Now something more fun: interaction. So far we have used the "Hold" mode, that is, sound will be activated (and all existing mappers evaluated in the pipeline) while the mouse button is held down. This mode is comfortable when you want to quickly inspect (and listen to) the various parts of your image.

But sometimes you may want to activate the Probe, and then leave it on while you experiment with, let's say, changing synthesis parameters. Or you just want to free your mouse to use it somewhere else without stopping the sound output. This is what "Toggle" mode is for. You can change it on the UI by clicking on the "Toggle" button, or programmatically like so:

In [17]:
app.interaction_mode = "toggle"

Now you can double-click on the probe to activate it. Double-click again to deactivate it. It can sometimes be convenient to activate the Probe and then just click on various parts of the image (here, on different cells) to get an abrupt comparison of how they sound.

Try to activate the toggle and then click on the different cells to find out which one produces the highest pitch! Since we mapped the mean red channel value to the theremin frequency, and since in this image red fluorescent protein indicates the level of autophagy in the cell, we can intuitively find the cell with the most active autophagy by simply clicking around and listening.

Finally, there is a checkbox for having the Probe __always__ follow your mouse, even when you don't hold the mouse button. Beyond this being simply comfortable for your hand if you are making sound from the image continuously, it can also be handy when you don't want sound output at the moment, just want to read the value of a Feature. Let's check "Probe follows idle mouse" like so:

In [18]:
app.probe_follows_idle_mouse = True

Now, if you hover your mouse over the app canvas, you can see that the Probe constantly follows it. You can also see indicators of the Probe's horizontal and vertical coordinates, labelled "Probe X" and "Probe Y" respectively, which are controlled by the `probe_x` and `probe_y` properties (as shown above).

Let's move on to the Features pane.

### Features

The Features pane is where all your `Feature` objects (that you attached to the app!) will show up. Right now you should see a card labelled "MeanRed" there, our Feature that reports the mean red pixel value of the Probe's slice of the image. Since we just checked the option for the Probe to follow our idle (hovering) mouse, let's move it around a bit (without making sound), and look at the values we get on the card.

To avoid scrolling back-and-forth too much, let's have another copy of our app here:

In [None]:
app.ui

### The images (that we explore)

### Features

### Synths

### Mappers

## More in depth...

### The anatomy of the App

### Loading images and matrices

### Features

### Synths

### Mapping

### Recording and Non-Real-Time (NRT) rendering