# Labeled Point Data

In this notebook, we implement a custom tile generator to serve custom data.

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
import h5py
import numpy as np
import pandas as pd

### Load data

In this example, we use an image-based spatial transcriptomics dataset by [Wang et al., 2018](https://doi.org/10.1038/s41598-018-22297-7).

In [None]:
df = pd.read_csv('data/spatial_gene_expression.csv.gz')

print('Number of points: {}'.format(len(df)))
df.head()

### Custom tileset API

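To generate and serve custom tiles, we need two functions: `tileset_info()`, which describes the data extent and the available zoom levels, and `tiles()`, which returns the formatted data for a list of tile IDs. Both are demonstrated below.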

In [None]:
from higlass.tilesets import Tileset
from clodius.tiles.format import format_dense_tile
from clodius.tiles.utils import tile_bounds

def dfdensity(df, channel_col=None, channel=None, uuid=None, max_zoom=10):
    
    # [Optional] Restrict the data to a single channel (e.g., one gene)
    if channel_col and channel:
        data = df[df[channel_col] == channel].reindex(columns=['x', 'y']).values
    else:
        data = df.reindex(columns=['x', 'y']).values

    tileset_info = {
        'min_pos': [df['x'].min(), df['y'].min()],
        'max_pos': [df['x'].max(), df['y'].max()],
        'max_width': max(
            df['x'].max() - df['x'].min(),
            df['y'].max() - df['y'].min()
        ),
        'max_zoom': max_zoom,
        'mirror_tiles': False
    }
            
    def get_tile(z, x, y):
        extent = tile_bounds(tileset_info, z, x, y)
        
        # Get all the points within the extent. The lower bounds are
        # inclusive so that points lying exactly on a tile edge are kept.
        points = data[
            (data[:, 0] >= extent[0]) &
            (data[:, 0] < extent[2]) &
            (data[:, 1] >= extent[1]) &
            (data[:, 1] < extent[3])
        ]
        
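        # Generate a 2D histogram whose bins are aligned with the tile bounds.
        # Without `range`, the bins would span only the selected points.
        hist, _, _ = np.histogram2d(
            points[:, 0],
            points[:, 1],
            bins=256,
            range=[[extent[0], extent[2]], [extent[1], extent[3]]]
        )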
        
        # Set empty bins to `nan` to make them transparent
        hist[hist == 0.] = np.nan
        
        return hist.T
    
    def tiles(tile_ids):
        formatted_tiles = []
        
        for tile_id in tile_ids:
            _, z, x, y = tile_id.split('.')
            # `format_dense_tile()` converts the ndarray to base64 encoding
            # and adds min/max values
            tile_data = format_dense_tile(get_tile(int(z), int(x), int(y)))
            formatted_tiles.append((tile_id, tile_data))
    
        return formatted_tiles

    return Tileset(
        uuid=uuid,
        tileset_info=lambda: tileset_info,
        tiles=tiles
    )
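
To make the tile addressing more concrete, here is a minimal sketch of how a tile position `(z, x, y)` can be mapped to data coordinates. `sketch_tile_bounds` is a hypothetical helper written for this illustration. It assumes that each zoom level halves the tile width and that bounds are returned as `[x_start, y_start, x_end, y_end]`, which matches how `extent` is indexed above, but it is not the clodius implementation. The extent and points are made up.

```python
import numpy as np

def sketch_tile_bounds(min_pos, max_pos, z, x, y):
    """Return [x_start, y_start, x_end, y_end] for tile (z, x, y).

    Assumption: the square data extent is split into 2**z tiles per axis.
    """
    max_width = max(max_pos[0] - min_pos[0], max_pos[1] - min_pos[1])
    tile_width = max_width / 2 ** z
    x_start = min_pos[0] + x * tile_width
    y_start = min_pos[1] + y * tile_width
    return [x_start, y_start, x_start + tile_width, y_start + tile_width]

# Hypothetical extent: a 1024 x 1024 square starting at the origin
bounds_z0 = sketch_tile_bounds([0, 0], [1024, 1024], 0, 0, 0)
bounds_z2 = sketch_tile_bounds([0, 0], [1024, 1024], 2, 1, 3)

# Bin a few made-up points into a 4 x 4 grid aligned with the tile bounds
points = np.array([[300.0, 800.0], [310.0, 1000.0], [500.0, 900.0]])
hist, _, _ = np.histogram2d(
    points[:, 0],
    points[:, 1],
    bins=4,
    range=[[bounds_z2[0], bounds_z2[2]], [bounds_z2[1], bounds_z2[3]]],
)

print(bounds_z0)
print(bounds_z2)
print(hist)
```

Passing an explicit `range` to `np.histogram2d` keeps the bin edges tied to the tile bounds rather than to the extent of whichever points happen to fall inside the tile.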

Let's visualize the data as usual.

In [None]:
import higlass
from higlass.client import Track, View

tileset = dfdensity(df, uuid='density')

display, server, viewconf = higlass.display([
    View([
        Track('heatmap',
            tileset=tileset,
            position='center',
            height=600,
            options={
                'colorRange': ['rgba(245,166,35,1.0)', 'rgba(208,2,27,1.0)', 'black'],
                'backgroundColor': 'white',
                'name': 'Wang et al. Spatial Gene Transcription',
            }
        ),
    ]),
])

display

Because we implemented a custom tileset API, we can add any kind of customization, such as picking which channels (i.e., genes) we want to visualize.

In [None]:
display, server, viewconf = higlass.display([
    View(
        x=0,
        y=0,
        width=4,
        height=12,
        tracks=[
            Track('heatmap',
                tileset=dfdensity(df, uuid='density'),
                position='center',
                height=400,
                options={
                    'colorRange': ['rgba(245,166,35,1.0)', 'rgba(208,2,27,1.0)', 'black'],
                    'backgroundColor': 'white',
                    'name': 'Wang et al.: All',
                }),
        ]
    ),
    View(
        x=4,
        y=0,
        width=4,
        height=12,
        tracks=[
            Track('heatmap',
                tileset=dfdensity(df, channel_col='gene', channel='ENO1', uuid='density_eno1'),
                position='center',
                height=400,
                options={
                    'colorRange': ['rgba(128,217,255,1.0)', 'rgba(0,180,255,1.0)', 'rgba(0,90,128,1.0)', 'black'],
                    'backgroundColor': 'white',
                    'name': 'Wang et al.: ENO1',
                }
            ),
        ]
    ),
    View(
        x=8,
        y=0,
        width=4,
        height=12,
        tracks=[
            Track('heatmap',
                tileset=dfdensity(df, channel_col='gene', channel='THBS1', uuid='density_thbs1'),
                position='center',
                height=400,
                options={
                    'colorRange': ['rgba(128,217,255,1.0)', 'rgba(0,180,255,1.0)', 'rgba(0,90,128,1.0)', 'black'],
                    'backgroundColor': 'white',
                    'name': 'Wang et al.: THBS1',
                }
            ),
        ]
    ),
])

display
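
The per-channel views rely on the filtering step inside `dfdensity()`. As a rough illustration, the same selection can be tried on a toy DataFrame. The values below are made up, and only the column names `x`, `y`, and `gene` are taken from the calls above.

```python
import pandas as pd

# Made-up points; only the column names mirror the dataset used above
toy = pd.DataFrame({
    'x': [1.0, 2.0, 3.0, 4.0],
    'y': [5.0, 6.0, 7.0, 8.0],
    'gene': ['ENO1', 'THBS1', 'ENO1', 'ENO1'],
})

# Same selection as in `dfdensity()`: keep one channel, then take x/y as an ndarray
eno1_xy = toy[toy['gene'] == 'ENO1'].reindex(columns=['x', 'y']).values

# Counting points per gene helps decide which channels are worth plotting
counts = toy['gene'].value_counts()

print(eno1_xy.shape)
print(counts)
```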

### Tileset Response

Most HiGlass tracks expect the data as a base64-encoded numpy array (`dense`), together with its data type (`dtype`), its length (`size`), and precalculated minimum and maximum values (`min_value` and `max_value`) for efficient value scaling.

In [None]:
response = server.tiles(tileset.uuid, 0, 0, 0)
print(
    'dtype:', response['dtype'],
    'size:', response['size'],
    'min_value:', response['min_value'],
    'max_value:', response['max_value'],
    'dense:', '{}...'.format(response['dense'][:20])
)
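
To see what a base64-encoded array looks like in practice, the sketch below encodes and decodes a small made-up tile by hand. The `payload` dictionary is a stand-in built for this illustration, not the output of `format_dense_tile()`. It assumes a `float32` array flattened to raw bytes and treats `size` as the number of elements, so the real response may differ in detail.

```python
import base64
import numpy as np

# Made-up 2 x 2 tile; NaN marks an empty bin
tile = np.array([[1.0, np.nan], [3.0, 4.0]], dtype=np.float32)

# Hand-built stand-in for a tile response (assumed layout, see the text above)
payload = {
    'dense': base64.b64encode(tile.ravel().tobytes()).decode('ascii'),
    'dtype': 'float32',
    'size': tile.size,
    'min_value': float(np.nanmin(tile)),
    'max_value': float(np.nanmax(tile)),
}

# Decode: base64 -> raw bytes -> flat array -> original shape
decoded = np.frombuffer(
    base64.b64decode(payload['dense']),
    dtype=payload['dtype'],
).reshape(tile.shape)

print(payload['dense'])
print(decoded)
```

Precomputing `min_value` and `max_value` with the NaN-aware reductions means a client can scale colors without decoding and scanning the whole array first.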

### View config

To visualize the data, the HiGlass viewer requires a view configuration that tells it what data to show, how to render it, and where to place it.

In [None]:
viewconf