# ImSpiRE: <ins>Im</ins>age-aided <ins>Sp</ins>at<ins>i</ins>al <ins>R</ins>esolution <ins>E</ins>nhancement

## This is a tutorial written using Jupyter Notebook.

### Step 1. Install ImSpiRE by following the [tutorial](https://github.com/Yizhi-Zhang/ImSpiRE).

### Step 2. Input preparation

ImSpiRE takes three inputs: a count file in tab-delimited or hierarchical data format (HDF5 or H5), an image file in TIFF format, and a file containing the spot coordinates.

We provide a small [test dataset](https://github.com/Yizhi-Zhang/ImSpiRE/tree/master/test/test_data) containing the raw count matrix, the image and the spot coordinates. The test dataset also includes a CellProfiler pipeline that can be used if required.
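
Before running ImSpiRE it can help to confirm that the input files are where the parameters will point. The sketch below is not part of ImSpiRE: `missing_inputs` is a hypothetical helper, and the file names are simply the ones used later in this tutorial.

```python
from pathlib import Path

def missing_inputs(paths):
    """Return the subset of `paths` that do not exist as regular files."""
    return [str(p) for p in map(Path, paths) if not p.is_file()]

# File names as used in this tutorial; adjust them to your own data.
expected = [
    "test_data/count_matrix.tsv",
    "test_data/image.tif",
    "test_data/Cellprofiler_Pipeline_HE.cppipe",
]

missing = missing_inputs(expected)
if missing:
    print("Missing input files:", ", ".join(missing))
else:
    print("All input files found.")
```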

### Step 3. Operation of ImSpiRE

### 3.1 Load required packages

In [1]:
import imspire_object as imspire
import pandas as pd
import numpy as np
import scanpy as sc

### 3.2 Custom parameters

In [2]:
imspire_param=imspire.ImSpiRE_Parameters()

In [3]:
imspire_param.BasicParam_InputCountFile="test_data/count_matrix.tsv"
imspire_param.BasicParam_InputDir="test_data/"
imspire_param.BasicParam_InputImageFile="test_data/image.tif"
imspire_param.BasicParam_PlatForm="ST"
imspire_param.BasicParam_Mode=2
imspire_param.BasicParam_OutputName="test_output"
imspire_param.BasicParam_Overwriting=True
imspire_param.CellProfilerParam_Pipeline="test_data/Cellprofiler_Pipeline_HE.cppipe"
imspire_param.BasicParam_Verbose=True
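
Only some parameters are set explicitly above; others, such as `BasicParam_OutputDir` used in section 3.3.2, keep their defaults. Assuming the parameter object stores its settings as ordinary instance attributes (an assumption about ImSpiRE's internals that this tutorial does not confirm), a generic way to list them is Python's built-in `vars()`. The sketch uses a hypothetical stand-in class, `DemoParameters`, so that it runs without ImSpiRE installed; its default values are illustrative only.

```python
class DemoParameters:
    """Hypothetical stand-in for a parameter container (not ImSpiRE's class)."""
    def __init__(self):
        self.BasicParam_OutputDir = "./"        # illustrative default
        self.BasicParam_OutputName = "output"   # illustrative default
        self.BasicParam_Verbose = False         # illustrative default

def describe(params, prefix=""):
    """Return 'name = value' lines for attributes whose name starts with `prefix`."""
    return [f"{name} = {value!r}"
            for name, value in sorted(vars(params).items())
            if name.startswith(prefix)]

demo = DemoParameters()
demo.BasicParam_OutputName = "test_output"
for line in describe(demo, prefix="BasicParam_"):
    print(line)
```

If the assumption holds, `describe(imspire_param)` could be tried in the same way to review every setting before a run.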

### 3.3 Run ImSpiRE

#### 3.3.1 One step

In [4]:
imspire_run=imspire.ImSpiRE(imspire_param)

In [5]:
imspire_run.run()

Generating spot images...
Generating patch images...
Extracting features of spot images...


Experiment-wide values for mean threshold, etc calculated by MeasureImageQuality may be incorrect if the run is split into subsets of images.
Experiment-wide values for mean threshold, etc calculated by MeasureImageQuality may be incorrect if the run is split into subsets of images.
Times reported are CPU and Wall-clock times for each module
Fri Feb 10 15:13:34 2023: Image # 89, module Images # 1: CPU_time = 0.00 secs, Wall_time = 0.00 secs
Fri Feb 10 15:13:34 2023: Image # 89, module Metadata # 2: CPU_time = 0.00 secs, Wall_time = 0.00 secs
...
Fri Feb 10 15:13:50 2023: Image # 11, module ExportToSpreadsheet # 8: CPU_time = 0.00 secs, Wall_time = 0.00 secs
Fri Feb 10 15:13:50 2023: Image # 11, module CreateBatchFiles # 9: CPU_time = 0.00 secs, Wall_time = 0.00 secs


Extracting features of patch images...


Experiment-wide values for mean threshold, etc calculated by MeasureImageQuality may be incorrect if the run is split into subsets of images.
Given the large number of image sets, you may want to consider using ExportToDatabase as opposed to ExportToSpreadsheet.
Experiment-wide values for mean threshold, etc calculated by MeasureImageQuality may be incorrect if the run is split into subsets of images.
Times reported are CPU and Wall-clock times for each module
Fri Feb 10 15:14:03 2023: Image # 208, module Images # 1: CPU_time = 0.00 secs, Wall_time = 0.00 secs
Fri Feb 10 15:14:03 2023: Image # 208, module Metadata # 2: CPU_time = 0.00 secs, Wall_time = 0.00 secs
...
Fri Feb 10 15:18:07 2023: Image # 1242, module ExportToSpreadsheet # 8: CPU_time = 0.00 secs, Wall_time = 0.00 secs
Fri Feb 10 15:18:07 2023: Image # 1242, module CreateBatchFiles # 9: CPU_time = 0.00 secs, Wall_time = 0.00 secs


Solving OT...
It.  |Loss        |Relative loss|Absolute loss
------------------------------------------------
    0|1.902401e-01|0.000000e+00|0.000000e+00
    1|5.629286e-02|2.379472e+00|1.339473e-01
    2|5.363270e-02|4.959944e-02|2.660152e-03
    3|5.336251e-02|5.063377e-03|2.701945e-04
    4|5.332444e-02|7.139977e-04|3.807352e-05
    5|5.331593e-02|1.595670e-04|8.507464e-06
    6|5.330993e-02|1.124530e-04|5.994862e-06
    7|5.330467e-02|9.876406e-05|5.264586e-06
    8|5.330102e-02|6.844175e-05|3.648015e-06
    9|5.329882e-02|4.128897e-05|2.200654e-06
   10|5.329751e-02|2.451187e-05|1.306422e-06
Computing high resolution expression profiles...
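
The table printed under `Solving OT...` can be checked by hand: in the rows shown, the absolute loss is the change in loss from the previous iteration, and the relative loss is that change divided by the current loss. The snippet below recomputes both columns from the first three loss values of this run. It only repeats the arithmetic of the log and does not call the solver.

```python
# Loss values copied from iterations 0-2 of the log above.
losses = [1.902401e-01, 5.629286e-02, 5.363270e-02]

def loss_changes(values):
    """Yield (iteration, absolute change, relative change) for each step."""
    for i in range(1, len(values)):
        absolute = abs(values[i] - values[i - 1])
        relative = absolute / abs(values[i])
        yield i, absolute, relative

for it, absolute, relative in loss_changes(losses):
    print(f"{it:5d}|{losses[it]:.6e}|{relative:.6e}|{absolute:.6e}")
```

The recomputed values agree with the logged ones up to rounding of the printed losses. In the later rows both changes shrink to around 1e-5 and 1e-6, so the loss has almost stopped decreasing by iteration 10.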


#### 3.3.2 Step by step

In [6]:
imspire.create_folder(imspire_param.BasicParam_OutputDir,
                      imspire_param.BasicParam_OutputName,
                      imspire_param.BasicParam_Overwriting)
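
The cells that follow build the same output paths repeatedly with f-strings. As a convenience, the folder layout they use can be collected in one place. `output_layout` below is a hypothetical helper, not an ImSpiRE function. The sub-folder names are taken from the paths that appear in this section, and the arguments in the example call are sample values.

```python
from pathlib import Path

def output_layout(output_dir, output_name):
    """Map short labels to the result folders used in this tutorial."""
    root = Path(output_dir) / output_name
    return {
        "root": root,
        "spot_image": root / "ImageResults" / "SpotImage",
        "patch_image": root / "ImageResults" / "PatchImage",
        "spot_feature": root / "FeatureResults" / "SpotFeature",
        "patch_feature": root / "FeatureResults" / "PatchFeature",
        "supplementary": root / "SupplementaryResults",
    }

# Sample values; in the notebook these would come from imspire_param.
layout = output_layout("./", "test_output")
for label, path in layout.items():
    print(f"{label:14s} {path}")
```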

In [7]:
imdata=imspire.ImSpiRE_Data()
imdata.read_ST(imspire_param.BasicParam_InputDir, count_file=imspire_param.BasicParam_InputCountFile)

In [8]:
## Optional: filter the data using the count, mitochondrial-percentage and spot-number thresholds
imdata.preprocess(min_counts=imspire_param.Threshold_MinCounts, 
                  max_counts=imspire_param.Threshold_MaxCounts, 
                  pct_counts_mt=imspire_param.Threshold_MitoPercent, 
                  min_cells=imspire_param.Threshold_MinSpot)

In [9]:
im=imspire.ImSpiRE_HE_Image(imspire_param.BasicParam_InputImageFile,
                            imspire_param.BasicParam_PlatForm,
                            imspire_param.BasicParam_OutputDir,
                            imspire_param.BasicParam_OutputName,
                            imspire_param.FeatureParam_IterCount)

In [10]:
spot_image_output_path=f"{imspire_param.BasicParam_OutputDir}/{imspire_param.BasicParam_OutputName}/ImageResults/SpotImage"
im.segment_spot_image(pos_in_tissue_filter=imdata.pos_in_tissue_filter,
                      output_path=spot_image_output_path,
                      crop_size=imspire_param.ImageParam_CropSize)

In [11]:
patch_image_output_path=f"{imspire_param.BasicParam_OutputDir}/{imspire_param.BasicParam_OutputName}/ImageResults/PatchImage"
im.generate_patch_locations_2(pos_in_tissue=imdata.pos_in_tissue_filter,
                              dist=imspire_param.ImageParam_PatchDist)
im.segment_patch_image(patch_in_tissue=im.patch_in_tissue, 
                       output_path=patch_image_output_path, 
                       crop_size=imspire_param.ImageParam_CropSize)
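
To give a feel for how a spacing parameter such as `dist` controls the number of patches, the toy sketch below lays patch centres on a regular grid over the bounding box of a few spot centres. This illustrates the general idea only. It is not ImSpiRE's `generate_patch_locations_2`, whose actual placement and in-tissue filtering rules are not described in this tutorial, and `grid_locations` is a hypothetical name.

```python
def grid_locations(points, dist):
    """Regular grid of (row, col) centres covering the bounding box of `points`."""
    rows = [r for r, _ in points]
    cols = [c for _, c in points]
    grid = []
    r = min(rows)
    while r <= max(rows):
        c = min(cols)
        while c <= max(cols):
            grid.append((r, c))
            c += dist
        r += dist
    return grid

spots = [(0, 0), (0, 100), (100, 0), (100, 100)]   # toy spot centres in pixels
patches = grid_locations(spots, dist=50)
print(len(spots), "spots ->", len(patches), "patch centres")
```

Halving `dist` roughly quadruples the number of grid points, which is the kind of trade-off between resolution and running time to keep in mind when choosing the patch distance.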

In [12]:
spot_cp=imspire.ImSpiRE_HE_CellProfiler(image_file_path=f"{spot_image_output_path}", 
                                        cellprofiler_pipeline=imspire_param.CellProfilerParam_Pipeline)
patch_cp=imspire.ImSpiRE_HE_CellProfiler(image_file_path=f"{patch_image_output_path}", 
                                         cellprofiler_pipeline=imspire_param.CellProfilerParam_Pipeline)

In [13]:
spot_feature_output_path=f"{imspire_param.BasicParam_OutputDir}/{imspire_param.BasicParam_OutputName}/FeatureResults/SpotFeature"
spot_cp.compute_image_features(output_path=spot_feature_output_path,
                               number_of_kernels=imspire_param.CellProfilerParam_KernelNumber)
spot_cp.filter_image_features(output_path=spot_feature_output_path)

Experiment-wide values for mean threshold, etc calculated by MeasureImageQuality may be incorrect if the run is split into subsets of images.
ExportToSpreadsheet is configured to refrain from overwriting files and the following file(s) already exist: .//test_output/FeatureResults/SpotFeature/Image_Features_Image.txt
Experiment-wide values for mean threshold, etc calculated by MeasureImageQuality may be incorrect if the run is split into subsets of images.
Times reported are CPU and Wall-clock times for each module
Fri Feb 10 15:20:48 2023: Image # 100, module Images # 1: CPU_time = 0.00 secs, Wall_time = 0.00 secs
Fri Feb 10 15:20:48 2023: Image # 100, module Metadata # 2: CPU_time = 0.00 secs, Wall_time = 0.00 secs
...
Fri Feb 10 15:21:05 2023: Image # 99, module ExportToSpreadsheet # 8: CPU_time = 0.01 secs, Wall_time = 0.00 secs
Fri Feb 10 15:21:05 2023: Image # 99, module CreateBatchFiles # 9: CPU_time = 0.00 secs, Wall_time = 0.00 secs


In [14]:
patch_feature_output_path=f"{imspire_param.BasicParam_OutputDir}/{imspire_param.BasicParam_OutputName}/FeatureResults/PatchFeature"
patch_cp.compute_image_features(output_path=patch_feature_output_path,
                                number_of_kernels=imspire_param.CellProfilerParam_KernelNumber)
patch_cp.filter_image_features(output_path=patch_feature_output_path)

Experiment-wide values for mean threshold, etc calculated by MeasureImageQuality may be incorrect if the run is split into subsets of images.
Given the large number of image sets, you may want to consider using ExportToDatabase as opposed to ExportToSpreadsheet.
Experiment-wide values for mean threshold, etc calculated by MeasureImageQuality may be incorrect if the run is split into subsets of images.
Times reported are CPU and Wall-clock times for each module
Fri Feb 10 15:21:18 2023: Image # 415, module Images # 1: CPU_time = 0.00 secs, Wall_time = 0.00 secs
Fri Feb 10 15:21:18 2023: Image # 415, module Metadata # 2: CPU_time = 0.00 secs, Wall_time = 0.00 secs
...
Fri Feb 10 15:25:13 2023: Image # 1449, module ExportToSpreadsheet # 8: CPU_time = 0.01 secs, Wall_time = 0.00 secs
Fri Feb 10 15:25:13 2023: Image # 1449, module CreateBatchFiles # 9: CPU_time = 0.00 secs, Wall_time = 0.00 secs


In [15]:
spot_features=spot_cp.image_features.loc[imdata.pos_in_tissue_filter.index,]

patch_cp.image_features.index=patch_cp.image_features.index.astype('int')
patch_features=patch_cp.image_features.sort_index()
patch_features.index=list(range(patch_features.shape[0]))

spot_features.to_csv(f"{spot_feature_output_path}/Image_Features_Image_filter.txt", sep = "\t")
patch_features.to_csv(f"{patch_feature_output_path}/Image_Features_Image_filter.txt", sep = "\t")

In [16]:
spot_locations=np.array(imdata.pos_in_tissue_filter.loc[:,["pxl_row_in_fullres","pxl_col_in_fullres"]])
patch_locations=np.array(im.patch_in_tissue.loc[:,["pxl_row","pxl_col"]])

In [17]:
spot_feature=pd.read_csv(f"{spot_feature_output_path}/Image_Features_Image_filter.txt",sep="\t",index_col=0)
patch_feature=pd.read_csv(f"{patch_feature_output_path}/Image_Features_Image_filter.txt",sep="\t",index_col=0)

## make sure that spot features and patch features share the same feature columns
spot_feature=spot_feature.dropna(axis=1)
patch_feature=patch_feature.dropna(axis=1)
common_feature=list(set(spot_feature.columns).intersection(set(patch_feature.columns)))
spot_feature = spot_feature.loc[:,common_feature]
patch_feature = patch_feature.loc[:,common_feature]

spot_feature=np.array(spot_feature.loc[imdata.pos_in_tissue_filter.index,])
patch_feature=np.array(patch_feature.sort_index())
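
The cell above intersects the two sets of feature columns with Python's `set`, whose iteration order for strings is not guaranteed to be the same from one session to the next. Both tables are indexed with the same list, so they stay aligned with each other. If a reproducible column order is also wanted, an order-preserving intersection is a small change. The sketch uses plain lists and made-up feature names for illustration.

```python
def ordered_intersection(first, second):
    """Items of `first` that also occur in `second`, in `first`'s order."""
    second_set = set(second)
    return [item for item in first if item in second_set]

spot_columns = ["Granularity_1", "Texture_Contrast", "Intensity_Mean"]    # made-up names
patch_columns = ["Intensity_Mean", "Granularity_1", "ImageQuality_Blur"]  # made-up names

common = ordered_intersection(spot_columns, patch_columns)
print(common)
```

In the notebook, the same helper could be applied to `spot_feature.columns` and `patch_feature.columns` before the `.loc` selection.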

In [18]:
exp_data=imdata.adata.to_df()
exp_data=exp_data.loc[imdata.pos_in_tissue_filter.index,]

In [19]:
ot_solver=imspire.ImSpiRE_OT_Solver(spot_locations,patch_locations,
                                    spot_image_features=spot_feature,
                                    patch_image_features=patch_feature,
                                    spot_gene_expression=exp_data,
                                    random_state=imspire_param.BasicParam_RandomState)

In [20]:
ot_solver.setup_cost_matrices(alpha=imspire_param.OptimalTransportParam_Alpha,
                              num_neighbors=imspire_param.OptimalTransportParam_NumNeighbors)
ot_solver.solve_OT(beta=imspire_param.OptimalTransportParam_Beta, 
                   epsilon=imspire_param.OptimalTransportParam_Epsilon,
                   numItermax=imspire_param.OptimalTransportParam_NumIterMax,
                   verbose=imspire_param.BasicParam_Verbose)

In [21]:
exp_data_hr=imspire.compute_high_resolution_expression_profiles(exp_data,ot_solver.T)
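
The call above turns the spot-level expression table and the transport plan `ot_solver.T` into patch-level profiles. How ImSpiRE does this internally is not shown in this tutorial. One common way to use a transport plan for this purpose, given here purely as a sketch under stated assumptions, is to treat each patch's column of the plan as weights over spots and take the weighted average of the spot expression. The function name `project_expression`, the toy matrices and the spots-by-patches orientation of the plan are all assumptions.

```python
import numpy as np

def project_expression(expression, plan):
    """Weighted average of spot expression for each patch.

    expression: (n_spots, n_genes); plan: (n_spots, n_patches), non-negative,
    with every column having a positive sum.
    """
    weights = plan / plan.sum(axis=0, keepdims=True)  # each column sums to 1
    return weights.T @ expression                      # (n_patches, n_genes)

expression = np.array([[10.0, 0.0],
                       [0.0, 4.0]])      # 2 spots x 2 genes (toy values)
plan = np.array([[0.5, 0.25, 0.0],
                 [0.0, 0.25, 0.5]])      # 2 spots x 3 patches (toy values)

print(project_expression(expression, plan))
```

In this toy case the middle patch receives equal mass from both spots, so its profile is the average of the two spot profiles.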

In [22]:
## output results: patch locations, high-resolution expression and the OT matrices (M, C1, C2, T)
im.patch_in_tissue.index=list(range(im.patch_in_tissue.shape[0]))
im.patch_in_tissue.to_csv(f"{imspire_param.BasicParam_OutputDir}/{imspire_param.BasicParam_OutputName}/{imspire_param.BasicParam_OutputName}_PatchLocations.txt", sep = "\t")

adata_hr=sc.AnnData(exp_data_hr)
adata_hr.write_h5ad(f"{imspire_param.BasicParam_OutputDir}/{imspire_param.BasicParam_OutputName}/{imspire_param.BasicParam_OutputName}_ResolutionEnhancementResult.h5ad")

np.save(f"{imspire_param.BasicParam_OutputDir}/{imspire_param.BasicParam_OutputName}/SupplementaryResults/ot_M_alpha{imspire_param.OptimalTransportParam_Alpha}_beta{imspire_param.OptimalTransportParam_Beta}_epsilon{imspire_param.OptimalTransportParam_Epsilon}.npy",ot_solver.M)
np.save(f"{imspire_param.BasicParam_OutputDir}/{imspire_param.BasicParam_OutputName}/SupplementaryResults/ot_C1_alpha{imspire_param.OptimalTransportParam_Alpha}_beta{imspire_param.OptimalTransportParam_Beta}_epsilon{imspire_param.OptimalTransportParam_Epsilon}.npy",ot_solver.C1)
np.save(f"{imspire_param.BasicParam_OutputDir}/{imspire_param.BasicParam_OutputName}/SupplementaryResults/ot_C2_alpha{imspire_param.OptimalTransportParam_Alpha}_beta{imspire_param.OptimalTransportParam_Beta}_epsilon{imspire_param.OptimalTransportParam_Epsilon}.npy",ot_solver.C2)
np.save(f"{imspire_param.BasicParam_OutputDir}/{imspire_param.BasicParam_OutputName}/SupplementaryResults/ot_T_alpha{imspire_param.OptimalTransportParam_Alpha}_beta{imspire_param.OptimalTransportParam_Beta}_epsilon{imspire_param.OptimalTransportParam_Epsilon}.npy",ot_solver.T)
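
The four `np.save` calls above share one naming pattern that encodes the OT parameters in the file name. A small helper makes the pattern explicit and easier to reuse when loading the matrices later. `supplementary_filename` is a hypothetical name, and the parameter values in the example are placeholders rather than ImSpiRE's defaults.

```python
def supplementary_filename(kind, alpha, beta, epsilon):
    """File name for one saved matrix, following the pattern used above."""
    return f"ot_{kind}_alpha{alpha}_beta{beta}_epsilon{epsilon}.npy"

# Placeholder parameter values; the real ones come from imspire_param.
for kind in ("M", "C1", "C2", "T"):
    print(supplementary_filename(kind, alpha=0.5, beta=0.5, epsilon=0.001))
```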