
#### Load Required Extensions and Libraries
- `%autoreload 2` reloads imported modules before each cell runs, so edits to their source files take effect without restarting the kernel.
- Import necessary functions from `insitupy` and `scanpy`.


In [None]:

%load_ext autoreload
%autoreload 2
from insitupy import read_xenium
import scanpy as sc

- We load the Xenium data using the `read_xenium` function from `insitupy`.
- The path points to the Xenium output folder; replace it with the dataset location on your system.
- The `load_cells()` method loads the cell data from the specified dataset.

In [24]:
# Load the Xenium data from the folder
xd = read_xenium(r'c:\Users\Aitana\A_BRC_LM_Project\Original_Data\output-XETG00050__0018891__3414-15-1B__20240614__101940')
xd.load_cells()

Loading cells...
	No alternative cells found...


#### Filter Cells and Genes
- **Filter Cells**: We filter out cells that have fewer than 10 detected genes using `sc.pp.filter_cells`.
- **Filter Genes**: We filter out genes that are not detected in at least 20 cells using `sc.pp.filter_genes`.



In [25]:
# Filter out cells with fewer than 10 detected genes
sc.pp.filter_cells(xd.cells.matrix, min_genes=10)

# Filter out genes that are detected in fewer than 20 cells
sc.pp.filter_genes(xd.cells.matrix, min_cells=20)
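
To make the two thresholds concrete, here is a minimal pure-Python sketch of the same filtering logic on a toy matrix. It is an illustration only, not scanpy's implementation: the helper names `filter_cells` and `filter_genes` below are local stand-ins, the counts are made up, and the thresholds are lowered to 2 so the effect is visible on three cells.

```python
# Toy count matrix: rows are cells, columns are genes (values are made up).
counts = [
    [0, 3, 1, 0],
    [0, 0, 0, 2],
    [5, 1, 2, 0],
]


def filter_cells(matrix, min_genes):
    """Keep cells (rows) in which at least `min_genes` genes have a non-zero count."""
    return [row for row in matrix if sum(v > 0 for v in row) >= min_genes]


def filter_genes(matrix, min_cells):
    """Keep genes (columns) that have a non-zero count in at least `min_cells` cells."""
    n_genes = len(matrix[0]) if matrix else 0
    keep = [
        j for j in range(n_genes)
        if sum(row[j] > 0 for row in matrix) >= min_cells
    ]
    return [[row[j] for j in keep] for row in matrix]


# Same order as in the notebook: cells first, then genes.
cells_kept = filter_cells(counts, min_genes=2)
genes_kept = filter_genes(cells_kept, min_cells=2)
```

The order matters: once low-quality cells are removed, a gene can drop below the `min_cells` threshold, which is why the gene filter runs second here.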

#### Compare Transformations and Generate Normalization Report
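- `compare_transformations` applies five transformations to the count matrix (`log1p`, `sqrt_1`, `sqrt_2`, `pearson_residuals`, and `sctransform`) so they can be compared side by side.
- The results are saved as an HTML report at the path given in `output_path`, here `"C:/Users/Aitana/normalization_results.html"`. The log below reports the file as `normalization_results_1.html`, so the saved name may differ from the requested one.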
- The HTML report contains graphical and statistical comparisons of the transformation methods, including:
  - A **summary table** that highlights key metrics for each transformation, such as skewness, kurtosis, mean absolute deviation (MAD), coefficient of variation (CV), Shapiro-Wilk test results, and more. The best-performing metrics are highlighted in green.
  - **Histograms** showing the distribution of transformed counts for each method overlaid with a normal distribution curve.
  - **Q-Q plots** that compare the quantiles of the transformed data against a theoretical normal distribution to assess the normality of the transformed data.
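
For readers who want to see what these metrics measure, the following standard-library sketch computes skewness, excess kurtosis, mean absolute deviation, and CV for a small made-up sample, and builds the point pairs behind a Q-Q plot. The formulas are the textbook population versions and are an assumption here; the report may use different estimators. The function names `shape_summary` and `qq_points` are hypothetical and not part of `insitupy`.

```python
import math
import statistics


def shape_summary(values):
    """Return skewness, excess kurtosis, mean absolute deviation, and CV."""
    n = len(values)
    mean = statistics.fmean(values)
    # Population central moments of order 2, 3, and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "skewness": m3 / m2 ** 1.5,      # 0 for a symmetric distribution
        "kurtosis": m4 / m2 ** 2 - 3.0,  # excess kurtosis: 0 for a normal
        "mad": sum(abs(v - mean) for v in values) / n,
        "cv": math.sqrt(m2) / mean,      # requires a non-zero mean
    }


def qq_points(values):
    """Pair standard-normal quantiles with the sorted sample values."""
    n = len(values)
    std_normal = statistics.NormalDist()
    # Plotting positions (i + 0.5) / n avoid the infinite quantiles at 0 and 1.
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, sorted(values)))


# Made-up, right-skewed counts and their log1p transform.
raw_counts = [0, 1, 1, 2, 3, 5, 8, 13, 40]
log_counts = [math.log1p(v) for v in raw_counts]

raw_summary = shape_summary(raw_counts)
log_summary = shape_summary(log_counts)
qq = qq_points(log_counts)
```

On this toy sample the `log1p` transform pulls the skewness much closer to zero, which is the kind of improvement the summary table is meant to surface. Plotting the `qq` pairs would give a Q-Q plot; points close to a straight line indicate an approximately normal shape.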

In [27]:
xd.compare_transformations(
    transformation_methods=["log1p", "sqrt_1", "sqrt_2", "pearson_residuals", "sctransform"],
    output_path="C:/Users/Aitana/normalization_results.html"
)


Comparing transformations for the main modality (cells.matrix)...
Store raw counts in anndata.layers['counts']...
Applying transformation: log1p
Applying transformation: sqrt_1
Applying transformation: sqrt_2
Applying transformation: pearson_residuals
Applying transformation: sctransform
Starting SCTransform...
AnnData object saved temporarily at: C:\Users\Aitana\AppData\Local\Temp\tmpdz88txb5.h5ad


R[write to console]: Running SCTransform on assay: RNA

R[write to console]: vst.flavor='v2' set. Using model with fixed slope and excluding poisson genes.

R[write to console]: Calculating cell attributes from input UMI matrix: log_umi

R[write to console]: Variance stabilizing transformation of count matrix of size 477 by 42867

R[write to console]: Model formula is y ~ log_umi

R[write to console]: Get Negative Binomial regression parameters per gene

R[write to console]: Using 475 genes, 5000 cells

R[write to console]: Second step: Get residuals using fitted parameters for 477 genes

R[write to console]: Computing corrected count matrix for 477 genes

R[write to console]: Calculating gene attributes

R[write to console]: Wall clock passed: Time difference of 8.892223 secs

R[write to console]: Determine variable features

R[write to console]: Centering data matrix


SCTransform applied to Seurat object.
Converted Seurat object to SingleCellExperiment.
SCTransform transformation completed and returned as AnnData.
Processing log1p...




Processing sqrt_1...
Processing sqrt_2...
Processing pearson_residuals...
Processing sctransform...
HTML report created and saved as 'C:/Users/Aitana/normalization_results_1.html'


{'main':                    skewness  kurtosis        mad         cv  shapiro_stat  \
 log1p             -0.067653 -0.502083  12.857011   0.228110      0.994218   
 sqrt_1            -0.180534 -0.445408   6.574030   0.010263      0.992288   
 sqrt_2             0.127878 -0.448148  16.251417   0.216326      0.994874   
 pearson_residuals  1.082801  2.649645  40.160252  13.518975      0.948076   
 sctransform       -0.049513 -0.206769   7.017416   0.033700      0.992655   
 
                       shapiro_p  anderson_stat   ks_stat           ks_p  
 log1p              1.102331e-36      37.152233  0.023574   3.973254e-21  
 sqrt_1             3.319929e-41      58.527609  0.029390   1.336710e-32  
 sqrt_2             7.142255e-35      39.405594  0.017271   1.542915e-11  
 pearson_residuals  4.527068e-78     440.689315  0.062679  7.553077e-147  
 sctransform        2.017466e-40     164.534602  0.068015  7.257627e-173  }
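
One simple way to read this table is to rank the methods by how close a single metric is to its value under a normal distribution. The sketch below does this for skewness, using the values printed above copied into a plain dictionary. Choosing skewness as the only criterion is an assumption made for illustration, not the report's own scoring rule.

```python
# Skewness values copied from the 'main' table above.
skewness = {
    "log1p": -0.067653,
    "sqrt_1": -0.180534,
    "sqrt_2": 0.127878,
    "pearson_residuals": 1.082801,
    "sctransform": -0.049513,
}

# Rank methods by absolute skewness: closer to 0 means more symmetric.
ranked = sorted(skewness, key=lambda name: abs(skewness[name]))

for name in ranked:
    print(f"{name:<18} |skewness| = {abs(skewness[name]):.6f}")
```

By this criterion `sctransform` and `log1p` come out most symmetric and `pearson_residuals` least. Note that every Shapiro-Wilk and KS p-value in the table is far below any usual significance level; with tens of thousands of cells, such tests tend to reject normality even for small deviations, so the effect-size style metrics (skewness, kurtosis) and the Q-Q plots are likely more informative than the p-values.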