# Running Tensor-cell2cell to perform a tensor factorization

After generating a 4D-Communication Tensor, we can run Tensor-cell2cell to perform a tensor factorization and obtain the context-driven patterns of cell-cell communication (factors).

This tutorial will show you how to load a previously exported 4D-Communication Tensor and run Tensor-cell2cell on this tensor. 

## Initial Setup

Enable GPU use as discussed in [Tutorial 03](./03-Generate-Tensor.ipynb).

In [1]:
gpu_use = TRUE

if (gpu_use){
    device<-'cuda:0'
    tensorly <- reticulate::import('tensorly')
    tensorly$set_backend('pytorch')
}else{
    device<-NULL
}
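If a GPU may not be present on every machine, the device choice can be guarded. The following is a minimal sketch: `cuda_available()` and `choose_device()` are hypothetical helper names, not part of Tensor-cell2cell, and the check assumes PyTorch is installed in the Python environment that reticulate uses.

```r
# Hypothetical helper: TRUE only if PyTorch is importable and reports a CUDA device.
# Assumes reticulate points to the Python environment where torch is installed.
cuda_available <- function() {
    if (!requireNamespace("reticulate", quietly = TRUE)) return(FALSE)
    if (!reticulate::py_module_available("torch")) return(FALSE)
    torch <- reticulate::import("torch")
    isTRUE(torch$cuda$is_available())
}

# Hypothetical helper: same logic as the cell above, but falls back to NULL
# (CPU) when a GPU was requested and none is available.
choose_device <- function(gpu_use, has_cuda) {
    if (isTRUE(gpu_use) && isTRUE(has_cuda)) 'cuda:0' else NULL
}
```

For example, `device <- choose_device(gpu_use, cuda_available())` would yield `'cuda:0'` only when a GPU is actually usable, and `NULL` otherwise.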

###  Import Libraries

We will use reticulate to run Tensor-cell2cell in R.

In [2]:
library(reticulate, quietly = TRUE)
c2c <- reticulate::import(module = "cell2cell", as="c2c")

## Directories

In [3]:
output_folder = '../../data/tc2c-outputs/'

## Load Tensor

Now, we can load our 4D-Communication Tensor.

In [4]:
tensor<-reticulate::py_load_object(filename = paste0(output_folder, 'BALF-Tensor-R.pkl'))
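A quick sanity check after loading is to look at the size of each tensor dimension. The sketch below assumes the loaded object exposes an `order_names` attribute holding one vector of element names per dimension, ordered as contexts, ligand-receptor pairs, sender cells, and receiver cells. `tensor_dims()` and its labels are hypothetical, introduced here only for illustration.

```r
# Hypothetical helper: count the elements along each dimension of the tensor.
# `order_names` is assumed to be a list with one character vector per dimension.
tensor_dims <- function(order_names,
                        labels = c("Contexts", "Ligand-Receptor Pairs",
                                   "Sender Cells", "Receiver Cells")) {
    dims <- lengths(order_names)
    # Fail early if the object is not 4D as expected
    stopifnot(length(dims) == length(labels))
    stats::setNames(dims, labels)
}
```

Under these assumptions, `tensor_dims(tensor$order_names)` would return a named integer vector with the number of contexts, ligand-receptor pairs, sender cells, and receiver cells.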

## Perform Tensor Factorization

First, we have to write a wrapper function that can run Tensor-cell2cell using reticulate. To do so, we simply modify the [`liana::liana_tensor_c2c`](https://rdrr.io/github/saezlab/liana/src/R/liana_tensor.R) function to take our prebuilt tensor as input:

In [5]:
library(rlang)
library(stringr)

#' LIANA message/warning helper function to allow for verbosity
#'
#' @inheritParams base::stop
#' @param output type of output - message, warning, or stop
#' @param verbose logical for verbosity
liana_message <- function(...,
                          output = "message",
                          verbose = TRUE){
    if(verbose){
        exec(output, ...)
    }
}

#' Wrapper function to run `cell2cell_tensor`. Adapted from liana_tensor_c2c to take a prebuilt tensor as input. 
#'
#' @details This function serves as a one-liner wrapper to the tensor factorisation
#' method described in \href{https://www.nature.com/articles/s41467-022-31369-2}{tensor_cell2cell}.
#' We refer the user to the publication and \href{https://earmingol.github.io/cell2cell/tutorials/ASD/01-Tensor-Factorization-ASD/}{tensor_cell2cell tutorial page}
#' made by the authors. Logically, one should cite cell2cell's paper if their
#' method was used via LIANA.
#'
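#' @param tensor A prebuilt 4D-Communication Tensor (a cell2cell tensor object loaded via reticulate).
#'
#' @param rank Rank for the Tensor Factorization (number of factors to deconvolve the original tensor).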
#'  If NULL, then rank selection is performed using the `elbow_rank_selection` function.
#'
#' @param seed Random seed integer
#'
#' @param upper_rank Upper bound of ranks to explore with the elbow analysis.
#'
#' @param runs Number of tensor factorization performed for a given rank.
#' Each factorization varies in the seed of initialization. Consider increasing
#' the number of runs, in order to obtain a more robust rank estimate.
#'
#' @param init Initialization method for computing the Tensor Factorization.
#' {'svd', 'random'}
#'
#' @param factors_only whether to return only the factors after factorization
#'
#' @param verbose verbosity logical
#'
#' @param ... Additional keyword arguments passed to the tensor's `compute_tensor_factorization` method.
#' The function deals with `random_state` (seed) and `rank` internally.
#'
#' @returns If `factors_only` is TRUE, the factors formatted by `format_c2c_factors`
#' (plus `elbow_metric_raw` when rank selection was performed).
#' Otherwise, the input tensor (an instance of the cell2cell.tensor.BaseTensor class,
#' via reticulate) with the factorization results.
#'
#' @export
#'
decompose_tensor <- function(tensor,
                             rank=NULL,
                             seed = 1337,
                             upper_rank = 25,
                             runs = 3,
                             init = 'svd',
                             factors_only = TRUE,
                             verbose = TRUE,
                             ...){

    # Deal with rank
    rank <- if(is.null(rank)){ NULL } else {as.integer(rank)}

    reticulate::py_set_seed(seed)

    # estimate factor rank
    elbow_metric_raw <- NULL
    if(is.null(rank)){
        liana_message(str_glue("Estimating ranks..."),
                      verbose = verbose,
                      output = "message")
        py$temp <- tensor$elbow_rank_selection(upper_rank=as.integer(upper_rank),
                                               runs=as.integer(runs),
                                               init=init,
                                               automatic_elbow=TRUE,
                                               random_state=as.integer(seed))

        elbow_metric_raw <- tensor$elbow_metric_raw

        rank <- as.integer(tensor$rank)
    }

    # Compute tensor factorization
    liana_message(str_glue("Decomposing the tensor..."),
                  verbose = verbose,
                  output = "message")
    tensor$compute_tensor_factorization(rank = as.integer(rank),
                                        random_state=as.integer(seed),
                                        ...)

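    if(factors_only){
        # NOTE: format_c2c_factors() comes from LIANA (liana_tensor.R) and is not
        # defined in this notebook; make it available before using factors_only = TRUE
        res <- format_c2c_factors(tensor$factors)

        if(!is.null(elbow_metric_raw)){
            res$elbow_metric_raw <- elbow_metric_raw
        }

    } else{
        res <- tensor
    }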
    
    return(res)
}

In [6]:
tensor<-decompose_tensor(tensor, factors_only = FALSE, rank = 10)

Decomposing the tensor...



Tensor-cell2cell can be run with a simple pipeline command that executes all important steps to perform the tensor factorization.


**Key parameters are:**

- `copy_tensor` indicates whether to modify the original tensor passed as input to store the results, or to return a new copy that includes the results.


- `rank` is the number of factors or latent patterns we want to obtain from the analysis. You can either indicate a specific number or leave it as `None` (`NULL` in the R wrapper above) to obtain a suggested number through an elbow analysis.


- `tf_optimization` indicates whether to run the analysis in the `'regular'` or the `'robust'` way. In the regular mode, the tensor decomposition is run 10 times per rank evaluated in the elbow analysis and 1 time in the final decomposition. The optimization algorithm also uses fewer iterations in the regular than in the robust mode (100 vs 500) and lower precision (tolerance of 1e-7 vs 1e-8). The robust mode runs the tensor decomposition 20 times per rank evaluated in the elbow analysis and 100 times in the final decomposition. Here we could use `tf_optimization='regular'`, which is faster but generates less robust results. We recommend `tf_optimization='robust'`, which takes longer to run (more iterations and higher precision). Note that multiple runs of a tensor decomposition differ in their initialization values (regardless of using `tf_init='random'` or `tf_init='svd'`).


- `random_state` is the seed for randomization. It controls the randomization used when initializing the optimization algorithm that performs the tensor decomposition. It is useful for reproducing the same result every time that the analysis is run. If `None`, a different randomization will be used each time.


- `device` indicates whether we are using the `'cpu'` or a GPU with `'cuda'` cores. See the beginning of this notebook to enable using a GPU.


- `elbow_metric` is the metric used in the elbow analysis to obtain a suggested number of factors. Options are `'error'` and `'similarity'`, indicating respectively the error of tensor reconstruction and the similarity of tensor factorization across multiple runs for a given rank. The similarity metric may result in non-smooth curves given its high variability, especially at higher ranks. If so, we recommend using the parameter `smooth_elbow=True`.


- `upper_rank` is the maximum rank to try in the elbow analysis.


- `tf_init` is the initialization method of the tensor decomposition algorithm. Options are `'random'` and `'svd'`. The latter helps to obtain results that are closer to a global optimum of the optimization method behind the algorithm. If your tensor contains a mask (`tensor.mask`), `tf_init='random'` is used automatically.


- `tf_svd` is the method to perform the SVD to initialize the tensor factorization. This is only considered when `tf_init='svd'`.


- `output_folder` is the full path to the folder where the results will be saved. Make sure that this folder exists before passing it here.
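
To make the difference between the two `tf_optimization` modes concrete, the values described above can be collected in a small lookup. This is only a sketch: `tf_optimization_settings()` and the names of its list elements are hypothetical and do not belong to the Tensor-cell2cell API; the numbers are the ones stated in the parameter description.

```r
# Hypothetical helper: settings implied by each `tf_optimization` mode,
# taken from the parameter description above.
tf_optimization_settings <- function(tf_optimization = c("robust", "regular")) {
    tf_optimization <- match.arg(tf_optimization)
    switch(tf_optimization,
           # faster, less robust
           regular = list(elbow_runs = 10L,   # runs per rank in the elbow analysis
                          tf_runs = 1L,       # runs in the final decomposition
                          n_iter_max = 100L,  # iterations of the optimizer
                          tol = 1e-7),        # convergence tolerance
           # slower, more robust
           robust = list(elbow_runs = 20L,
                         tf_runs = 100L,
                         n_iter_max = 500L,
                         tol = 1e-8))
}
```

With the `decompose_tensor` wrapper defined earlier, the iteration settings could then be forwarded through `...`, for instance `settings <- tf_optimization_settings("robust")` followed by `decompose_tensor(tensor, rank = 10, n_iter_max = settings$n_iter_max, tol = settings$tol, factors_only = FALSE)`. This assumes that `compute_tensor_factorization` accepts `n_iter_max` and `tol` as keyword arguments.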