## AESTETIK with simulated spatial transcriptomics data

In this notebook, we apply AESTETIK to simulated spatial transcriptomics data.

You can find the code for generating the simulated spatial transcriptomics data [here](https://github.com/ratschlab/simulate_spatial_transcriptomics_tool).

In summary:

> We adapted the simulation approach suggested in [5] by introducing spatial structure in the experiment. Briefly, relying on simulated ground truth labels, we simulate transcriptomics and morphology modalities, allowing partial observation of true clusters within each modality individually. However, combining both modalities enables the identification of all clusters. Spatial coordinates are incorporated by sorting the ground truth in spatial space.
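
To make the idea in the summary concrete, here is a minimal NumPy sketch of the principle, not the actual simulation tool linked above. All names (`simulate_modality`, the grid size, the number of clusters, the noise level) are hypothetical choices for illustration. Each modality collapses a different pair of ground-truth clusters, so neither modality alone separates all clusters, while the concatenation of both does.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: 4 ground-truth clusters on a 20 x 20 grid of spots.
n_side, n_clusters, n_features = 20, 4, 15
n_spots = n_side * n_side

# Draw ground-truth labels, then sort them so that spots with the same label
# are contiguous on the grid (a crude way to introduce spatial structure).
ground_truth = np.sort(rng.integers(0, n_clusters, size=n_spots))
x_array, y_array = np.divmod(np.arange(n_spots), n_side)

def simulate_modality(labels, merged, n_clusters, n_features, rng, noise=0.5):
    """Simulate features in which the clusters listed in `merged` share one mean."""
    observed = labels.copy()
    observed[np.isin(labels, merged)] = merged[0]  # collapse the merged clusters
    means = rng.normal(0.0, 3.0, size=(n_clusters, n_features))
    return means[observed] + rng.normal(0.0, noise, size=(labels.size, n_features))

# Assumption: transcriptomics cannot separate clusters 0 and 1,
# and morphology cannot separate clusters 2 and 3.
transcriptomics = simulate_modality(ground_truth, [0, 1], n_clusters, n_features, rng)
morphology = simulate_modality(ground_truth, [2, 3], n_clusters, n_features, rng)

# Together, the two modalities carry enough signal to distinguish all 4 clusters.
combined = np.concatenate((transcriptomics, morphology), axis=1)
print(combined.shape)  # (400, 30)
```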

Please refer to our original publication for more information and examples.

Now, we will load the data using scanpy, perform clustering and visualize the results.

In [None]:
import os
os.chdir('../')

In [None]:
from sklearn.cluster import KMeans
from aestetik import AESTETIK
import squidpy as sq
import scanpy as sc
import numpy as np

In [None]:
adata = sc.read("test_data/A.h5ad")
adata

In [None]:
adata.obsm["X_pca"].shape

In [None]:
adata.obsm["image"].shape

In [None]:
adata.obsm["combined"] = np.concatenate((adata.obsm["X_pca"], adata.obsm["image"]), axis=1)
adata.obsm["combined"].shape

Data explanation:
- `X_pca` contains the top 15 PCs computed on the "expression" matrix
- `image` contains the top 15 PCs computed on the "image features"
- `x_array`, `y_array`, `x_pixel`, `y_pixel` are the coordinates in array and pixel space.
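
For readers unfamiliar with how such a 15-dimensional representation arises, the following is a minimal sketch of projecting a feature matrix onto its top principal components with NumPy. The helper `top_pcs` and the toy matrix are hypothetical; the exact preprocessing used to produce `X_pca` and `image` in `test_data/A.h5ad` is not described in this notebook.

```python
import numpy as np

def top_pcs(features, n_components=15):
    """Project the rows of `features` onto their top principal components."""
    centered = features - features.mean(axis=0)
    # Rows of `vt` are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

rng = np.random.default_rng(0)
toy_expression = rng.normal(size=(200, 50))  # hypothetical spots x genes matrix
toy_pcs = top_pcs(toy_expression)
print(toy_pcs.shape)  # (200, 15)
```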

### Clustering

In [None]:
# UMAP and k-means based only on the transcriptomics modality
sc.pp.neighbors(adata, use_rep="X_pca")
sc.tl.umap(adata)

adata.obs["transcriptomics_kmeans"] = KMeans(n_clusters=5).fit_predict(adata.obsm["X_pca"]).astype(str)
sc.pl.umap(adata, color=["ground_truth", 
                         "transcriptomics_kmeans"])

In [None]:
# UMAP and k-means based only on the morphology modality
sc.pp.neighbors(adata, use_rep="image")
sc.tl.umap(adata)

adata.obs["morphology_kmeans"] = KMeans(n_clusters=5).fit_predict(adata.obsm["image"]).astype(str)
sc.pl.umap(adata, color=["ground_truth", 
                         "morphology_kmeans"])

In [None]:
# UMAP and k-means based on the combined (concatenated) representation
sc.pp.neighbors(adata, use_rep="combined")
sc.tl.umap(adata)

adata.obs["combined_kmeans"] = KMeans(n_clusters=5).fit_predict(adata.obsm["combined"]).astype(str)
sc.pl.umap(adata, color=["ground_truth", 
                         "combined_kmeans"])

In [None]:
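# Note: sc.tl.umap overwrites the stored embedding on each call, so this plot
# uses the most recent UMAP (computed on the combined representation).
sc.pl.umap(adata, color=["ground_truth", 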
                         "transcriptomics_kmeans",
                         "morphology_kmeans",
                         "combined_kmeans"])

In [None]:
sq.pl.spatial_scatter(adata, color=["ground_truth",
                                    "transcriptomics_kmeans",
                                    "morphology_kmeans",
                                    "combined_kmeans"],
                      size=0.5)

### Applying AESTETIK

Now, we apply AESTETIK. Please find more information about AESTETIK [here](https://github.com/ratschlab/aestetik).

In [None]:
# we set the transcriptomics modality
adata.obsm["X_pca_transcriptomics"] = adata.obsm["X_pca"][:,0:15]
# we set the morphology modality
adata.obsm["X_pca_morphology"] = adata.obsm["image"][:,0:15]

In [None]:
parameters =    {'morphology_weight': 1.5,
                 'refine_cluster': 1,
                 'window_size': 3,
                 'clustering_method': "kmeans"
                }
parameters

In [None]:
model = AESTETIK(nCluster=adata.obs.ground_truth.unique().size,
                 **parameters)

In [None]:
model.fit_predict(X=adata, cluster=True)

In [None]:
# UMAP based on the AESTETIK representation
sc.pp.neighbors(adata, use_rep="AESTETIK")
sc.tl.umap(adata)

sc.pl.umap(adata, color=["ground_truth", 
                         "AESTETIK_cluster"])

In [None]:
sq.pl.spatial_scatter(adata, color=["ground_truth",
                                    "combined_kmeans",
                                    "AESTETIK_cluster"],
                      ncols=5,
                      wspace=0,
                      dpi=150,
                      size=0.5)

In [None]:
sq.pl.spatial_scatter(adata, color=["ground_truth",
                                    "transcriptomics_kmeans",
                                    "morphology_kmeans",
                                    "combined_kmeans",
                                    "AESTETIK_cluster"],
                      ncols=5,
                      wspace=0,
                      dpi=150,
                      size=0.5,
                      save="AESTETIK_clustering.png")
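
Beyond visual inspection, the agreement between each clustering and the ground truth can be quantified. The sketch below implements the adjusted Rand index (ARI) in plain NumPy; it is an illustrative addition rather than part of the original analysis, and the helper `adjusted_rand_index` and the toy labels are hypothetical. The ARI ignores how clusters are named, which matters here because k-means assigns arbitrary cluster ids.

```python
import numpy as np

def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings (1.0 means identical partitions)."""
    _, true_idx = np.unique(np.asarray(labels_true).ravel(), return_inverse=True)
    _, pred_idx = np.unique(np.asarray(labels_pred).ravel(), return_inverse=True)

    # Contingency table: entry (i, j) counts spots in true cluster i and predicted cluster j.
    contingency = np.zeros((true_idx.max() + 1, pred_idx.max() + 1), dtype=np.int64)
    np.add.at(contingency, (true_idx, pred_idx), 1)

    def pairs(counts):
        # Number of unordered pairs that can be formed within each count.
        return (counts * (counts - 1) // 2).sum()

    same_both = pairs(contingency)
    same_true = pairs(contingency.sum(axis=1))
    same_pred = pairs(contingency.sum(axis=0))
    expected = same_true * same_pred / pairs(np.array([true_idx.size]))
    maximum = (same_true + same_pred) / 2
    if maximum == expected:  # degenerate case, e.g. a single cluster in both labelings
        return 1.0
    return (same_both - expected) / (maximum - expected)

# Toy labels: a pure relabeling scores 1.0, while merging two clusters scores lower.
truth = ["0", "0", "1", "1", "2", "2"]
relabeled = ["b", "b", "a", "a", "c", "c"]
merged = ["0", "0", "0", "0", "1", "1"]
ari_relabeled = adjusted_rand_index(truth, relabeled)
ari_merged = adjusted_rand_index(truth, merged)
print(ari_relabeled, ari_merged)
```

In this notebook, the same function could be called as `adjusted_rand_index(adata.obs["ground_truth"], adata.obs["AESTETIK_cluster"])`, and likewise for the k-means columns. scikit-learn's `sklearn.metrics.adjusted_rand_score` computes the same quantity.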