# Transformation of tables of pixel coordinates to world coordinates (for STED)

For the correct alignment of datasets consisting of multiple images based on keypoints (e.g., fiducial beads), we need to transform keypoint coordinates in pixels to world coordinates (here: µm) that take stage position / scan field of the individual images into account.

STED images have both a **stage/coarse offset** (stage position and objective focus position) and a **scan/fine offset** (position of the recorded FOV relative to the stage position, and z position of the Piezo stage). Z-stacks are acquired via the Piezo stage by default, going from low positions to high, so deeper parts of the sample (away from the coverslip) are imaged first. Focusing by moving the objective works the other way round: moving the objective up goes from the coverslip deeper into the sample. The solution used here is to **consider the image direction as the reference**, so a higher objective focus position counts as a negative offset.

The world coordinate of a spot we calculate here will be:
$(z_{pixel}, y_{pixel}, x_{pixel}) \cdot \text{pixel size} + \text{offset}$
with $\text{offset} = (- z_{stage}, y_{stage}, x_{stage}) + (z_{scan}, y_{scan}, x_{scan}) - \frac{1}{2} \text{FOV size}$ (both the stage and scan offsets are the sum of local and global offsets, and indicate the middle of the imaged FOV). Imspector uses coordinates in meters; for more microscopy-friendly units, we convert everything to µm at the end.

This way, the combined world z position will be negative most of the time, but it will be consistent across images and make subsequent alignment easier, as we do not have to consider extra z-flips of the image.
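
As a rough illustration of the formula above, the following minimal sketch computes world coordinates with NumPy. The function name `pixel_to_world_um`, its parameter names and all numeric values are assumptions made up for this example; the notebook itself relies on `world_coords_for_pixel_spots` from `utils.transform_helpers`, whose internals are not shown here and may differ.

```python
import numpy as np

def pixel_to_world_um(pixel_zyx, pixel_size_m, stage_offset_m, scan_offset_m, fov_size_m):
    """Convert (z, y, x) pixel coordinates to world coordinates in µm.

    All *_m arguments are (z, y, x) triples in meters (hypothetical names).
    """
    pixel_zyx = np.asarray(pixel_zyx, dtype=float)
    pixel_size_m = np.asarray(pixel_size_m, dtype=float)
    # the z component of the stage offset enters with a negative sign
    stage = np.asarray(stage_offset_m, dtype=float) * np.array([-1.0, 1.0, 1.0])
    scan = np.asarray(scan_offset_m, dtype=float)
    fov = np.asarray(fov_size_m, dtype=float)
    # offsets point to the middle of the FOV, so shift by half the FOV size
    offset = stage + scan - fov / 2
    # 1e6 converts meters to µm
    return (pixel_zyx * pixel_size_m + offset) * 1e6

# made-up example values, all in meters and in (z, y, x) order
spots = np.array([[10, 100, 100]])
world = pixel_to_world_um(
    spots,
    pixel_size_m=(200e-9, 50e-9, 50e-9),
    stage_offset_m=(10e-6, 100e-6, 200e-6),
    scan_offset_m=(2e-6, 1e-6, -1e-6),
    fov_size_m=(4e-6, 10e-6, 10e-6),
)
print(world)  # approximately [[-8., 101., 199.]] (µm)
```

With these made-up numbers the z coordinate comes out negative, which matches the convention described above.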

## Workflow of this notebook

### Input
- base path of sample
- relative path to raw `.msr`/`.h5` data
- relative path (may contain wildcard for multiple files) to csv file(s) containing spot coordinates **in pixels** and names of images from which the spot has been derived in the format `/path/to/imagename_chX.tif`
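
The image names matter because the raw file and the channel index are recovered from them. A minimal sketch of that parsing for the `.msr` case, using the same regular expression as the processing code in this notebook; the example path `sample1_field_0_sted_1_ch2.tif` is a hypothetical name chosen only for illustration:

```python
import re
from pathlib import Path

# hypothetical image path following the imagename_chX.tif convention
img_path = "/path/to/sample1_field_0_sted_1_ch2.tif"

# raw string so that \. is passed to the regex engine unchanged
channel_idx_pattern = r'_ch([0-9]+)\.tif'

name = Path(img_path).name
# channel index is the number after "_ch"
channel_idx = int(re.findall(channel_idx_pattern, name)[0])
# the raw file shares the image name without the channel suffix
msr_name = re.sub(channel_idx_pattern, '.msr', name)

print(channel_idx, msr_name)
```

If the image names in your table do not end in `_chX.tif`, `re.findall` returns an empty list and the index lookup fails, so it is worth checking the naming before running the full workflow.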


### Output

- a corrected csv for each original, named `{filename}_global_coords.csv`, with global coordinates **in µm** added as extra columns, located in the specified output folder
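
To make the output layout concrete, here is a small sketch of the naming scheme and of the resulting columns. The file name `spots.csv` and all coordinate values are placeholders invented for this example; the column names follow the defaults set in the configuration cell of this notebook.

```python
from pathlib import Path
import pandas as pd

# hypothetical input file
csv_file = Path("detections/spots.csv")
out_name = f"{csv_file.stem}_global_coords.csv"

# one made-up spot in pixel coordinates
df = pd.DataFrame({
    "z": [10], "y": [100], "x": [100],
    "img": ["/path/to/imagename_ch0.tif"],
})

# placeholder world coordinates in µm, appended as new columns
df_out = df.assign(z_global_um=[-8.0], y_global_um=[101.0], x_global_um=[199.0])

print(out_name)
print(list(df_out.columns))
```

The original pixel columns are kept, so the output can still be matched against the images the spots were detected in.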

In [None]:
# path to upper level directory
in_path = "/home/stumberger/ep2024/RNA_DNA_FISH_spot_detection/example/DNAFISH/"

# subpaths
raw_subpath = "raw"

# NOTE: can be path to a single .csv file, but can also include wildcard (*) to correct multiple files
spots_subpath = "detections"

# where to save results
out_subpath = "detections"

# the naming of xyz variables for coordinates in pixels in your df
coordinate_column_names = ['z', 'y', 'x']

# how to call columns of world coordinates
global_coordinate_column_names = ['z_global_um', 'y_global_um', 'x_global_um']

# the name of the image file column
image_file_column_name = 'img'

# whether to use h5 raw data (True) to get metadata or msr raw data (False)
# NOTE: should not make any difference, except when you don't have one of the two raw types
# NOTE: h5 seems a bit faster
use_h5_for_metadata = False

# whether to drop underscores from acquisition name (field_X_sted_Y) in h5
# in autosted prior to v2 we used keys like fieldX_stedY
drop_underscores_for_h5 = False

In [None]:
import re
from pathlib import Path
import pandas as pd

from utils.transform_helpers import get_scan_field_metadata, get_scan_field_metadata_h5, world_coords_for_pixel_spots

# make out directory (including missing parents) if necessary
(Path(in_path) / out_subpath).mkdir(parents=True, exist_ok=True)

# go through all csv files specified
for csv_file in Path(in_path).glob(spots_subpath):

    df = pd.read_csv(csv_file)
    sub_dfs = []

    # go through each image in table and get metadata & global coords
    for img_path, dfi in df.groupby(image_file_column_name):

        if use_h5_for_metadata:

            # parse base name (random prefix of files)
            h5_base_name = re.match('(.*?)(_.*?_[0-9]+)+_ch.*', Path(img_path).name).group(1)
            
            # get acquisition index (field_X_sted_Y), drop underscores not present in H5
            if drop_underscores_for_h5:
                h5_acquisition_name = '_'.join(map(lambda s: s.replace('_', ''), re.findall('_.*?_[0-9]+', Path(img_path).name)))
            else:
                h5_acquisition_name = '_'.join(map(lambda s: s[1:], re.findall('_.*?_[0-9]+', Path(img_path).name)))

            # load metadata for current acquisition from corresponding h5 file
            h5_file = h5_base_name + '.h5'
            h5_file = Path(in_path) / raw_subpath / h5_file

            meta = get_scan_field_metadata_h5(h5_file, h5_acquisition_name)

        else:

            # get corresponding msr file
            # raw string avoids an invalid-escape warning for \.
            channel_idx_pattern = r'_ch([0-9]+)\.tif'
            msr_file = re.sub(channel_idx_pattern, '.msr', Path(img_path).name)
            msr_file = msr_file.replace("_aligned", "")
            msr_file = Path(in_path) / raw_subpath / msr_file
            # get channel idx, we can use the same pattern
            channel_idx = int(re.findall(channel_idx_pattern, Path(img_path).name)[0])

            # get scan field metadata
            meta = get_scan_field_metadata(msr_file, channel_idx)

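        # work on a copy of the group so that adding columns does not trigger SettingWithCopyWarning
        dfi = dfi.copy()

        # correct pixel coords to global coords
        # NOTE: 1e6 factor takes us from meters to µm
        dfi[global_coordinate_column_names] = world_coords_for_pixel_spots(dfi[coordinate_column_names], meta) * 1e6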

        sub_dfs.append(dfi)
    
    df_corrected = pd.concat(sub_dfs)
    df_corrected.to_csv(Path(in_path) / out_subpath / csv_file.name.replace('.csv', '_global_coords.csv'), index=False)