DTD

Digital Tissue Deconvolution (DTD) reconstructs the cellular composition of a tissue from its bulk expression profile. To increase deconvolution accuracy, DTD adapts the deconvolution model to the tissue scenario via loss-function learning.
Training is performed on in silico training mixtures, for which the cellular compositions are known.
As input, DTD requires a labelled expression matrix. The package includes functions to generate training and test mixtures, train the model, and assess its deconvolution capability via visualizations.
The theory is described in Görtler et al. (2018), "Loss-function Learning for Digital Tissue Deconvolution".

An example analysis is available at https://github.com/MarianSchoen/Exemplary-DTD-analysis

Install

Install from GitHub without the vignette:

  devtools::install_github("spang-lab/DTD")

I strongly recommend building the vignette. To do so, install from GitHub with the vignette
(building it takes about 3 minutes):

  devtools::install_github(
    "spang-lab/DTD",
    build_opts = c("--no-resave-data", "--no-manual"),
    build_vignettes = TRUE
  )
  browseVignettes("DTD")

Introduction to DTD Theory

The gene expression profile of a tissue combines the expression profiles of all cells in this tissue. Digital tissue deconvolution (DTD) addresses the following inverse problem: given the expression profile y of a tissue, what is its cellular composition c with respect to the cell types whose reference profiles form the columns of X? The cellular composition c can be estimated by

$\hat{c} = \arg\min_c \| y - Xc \|_2^2 \qquad (1)$

Görtler et al. (2019) generalized this formula by introducing a vector g:

$\hat{c}(g) = \arg\min_c \| \mathrm{diag}(g) (y - Xc) \|_2^2 \qquad (2)$
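
As a minimal sketch of what such an estimate computes, assume the g-weighted least-squares objective $\| \mathrm{diag}(g) (y - Xc) \|_2^2$, whose minimizer has the closed form $(X^T G X)^{-1} X^T G y$ with $G = \mathrm{diag}(g^2)$. The base R snippet below applies it to toy data. The function `estimate_c_sketch` and all data are hypothetical illustrations and not part of the DTD package API.

```r
# Hypothetical helper, not part of the DTD package: closed-form minimizer of
# || diag(g) (y - X c) ||_2^2, i.e. c = (X' G X)^{-1} X' G y with G = diag(g^2).
estimate_c_sketch <- function(X, y, g = rep(1, nrow(X))) {
  G <- diag(g^2, nrow = length(g))
  solve(t(X) %*% G %*% X, t(X) %*% G %*% y)
}

set.seed(1)
n_genes <- 50

# Toy reference matrix (assumed data): one row per gene, one column per cell type.
X <- matrix(rexp(n_genes * 3), nrow = n_genes,
            dimnames = list(NULL, c("B", "T", "NK")))

# Known composition and the resulting noise-free bulk profile.
c_true <- c(B = 0.5, T = 0.3, NK = 0.2)
y <- X %*% c_true

# Unweighted case (g = 1 for every gene) and an arbitrary gene weighting.
c_plain <- estimate_c_sketch(X, y)
c_weighted <- estimate_c_sketch(X, y, g = runif(n_genes, 0.5, 2))
```

On noise-free data any positive weighting recovers the same composition. The choice of g only matters once genes differ in how reliably they reflect the composition, which is the situation the learned g is meant to address.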

Every entry g[i] of g quantifies how important gene i is for the deconvolution process. The vector can either be selected via prior knowledge or learned on training data. Training data consists of artificial bulk profiles Y and the corresponding cellular compositions C. We generate this data from single-cell RNA-seq profiles.
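
To illustrate how such training data could be assembled, the sketch below draws random subsets of labelled single-cell profiles, sums each subset into an artificial bulk profile, and records the fraction of each cell type as the known composition. The function `mix_cells_sketch`, the toy matrix `sc_expr` and its labels are assumptions made for illustration. They are not the package's own mixture functions, and a real pipeline would likely add steps such as normalizing each mixture.

```r
# Hypothetical helper: build artificial bulk profiles from labelled single cells.
# sc_expr: genes x cells matrix; labels: cell type of each column of sc_expr.
mix_cells_sketch <- function(sc_expr, labels, n_mixtures = 10, n_cells = 20) {
  types <- sort(unique(labels))
  mix_names <- paste0("mix", seq_len(n_mixtures))
  Y <- matrix(0, nrow(sc_expr), n_mixtures,
              dimnames = list(rownames(sc_expr), mix_names))
  C <- matrix(0, length(types), n_mixtures,
              dimnames = list(types, mix_names))
  for (m in seq_len(n_mixtures)) {
    # Pick random cells without replacement.
    picked <- sample(ncol(sc_expr), n_cells)
    # Artificial bulk profile: sum of the picked single-cell profiles.
    Y[, m] <- rowSums(sc_expr[, picked, drop = FALSE])
    # Known composition: fraction of picked cells per cell type.
    counts <- table(factor(labels[picked], levels = types))
    C[, m] <- as.vector(counts) / n_cells
  }
  list(Y = Y, C = C)
}

set.seed(42)
# Toy single-cell data (assumed): 100 genes, 90 cells, 3 cell types.
labels <- rep(c("B", "NK", "T"), each = 30)
sc_expr <- matrix(rpois(100 * 90, lambda = 5), nrow = 100,
                  dimnames = list(paste0("gene", 1:100), NULL))
training <- mix_cells_sketch(sc_expr, labels)
```

Each column of `training$Y` is one artificial bulk profile, and the matching column of `training$C` holds its composition, so the columns of `training$C` sum to one.
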
The underlying idea of loss-function learning in DTD is to obtain the vector g by minimizing a loss function L on the training set:

$L = -\sum_j \mathrm{cor}(C_{j, .}, \widehat{C}_{j, .}(g))$

Here, $\widehat{C}(g)$ is the solution of formula (2), and $C_{j, .}$ denotes row j of C, that is, the abundances of cell type j across all training mixtures. During training, we iteratively adjust g along the negative gradient $-\nabla L$. This yields a vector g whose cellular estimates $\widehat{C}(g)$ correlate best with the known cellular compositions C.
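
The training idea can be sketched in base R as follows. The sketch assumes that formula (2) is the g-weighted least-squares objective $\| \mathrm{diag}(g) (Y - XC) \|_2^2$, approximates the gradient by finite differences, and uses plain gradient descent with step halving. These choices are made for brevity and are not claimed to be the optimizer implemented in the package. All function names and data are hypothetical.

```r
# Hypothetical helpers illustrating loss-function learning; not the DTD package API.

# Closed-form minimizer of || diag(g) (Y - X C) ||^2 for all mixtures at once.
estimate_C_sketch <- function(X, Y, g) {
  XtG <- t(X * g^2)                 # X' diag(g^2): row i of X is scaled by g[i]^2
  solve(XtG %*% X, XtG %*% Y)       # cell types x mixtures
}

# L = -sum_j cor(C[j, ], C_hat[j, ]), summed over cell types j.
loss_sketch <- function(g, X, Y, C) {
  C_hat <- estimate_C_sketch(X, Y, g)
  -sum(sapply(seq_len(nrow(C)), function(j) cor(C[j, ], C_hat[j, ])))
}

# Central finite-difference approximation of the gradient of L with respect to g.
num_grad <- function(g, X, Y, C, eps = 1e-5) {
  sapply(seq_along(g), function(i) {
    up <- g
    up[i] <- up[i] + eps
    down <- g
    down[i] <- down[i] - eps
    (loss_sketch(up, X, Y, C) - loss_sketch(down, X, Y, C)) / (2 * eps)
  })
}

set.seed(7)
n_genes <- 30
n_types <- 3
n_mix <- 40
X <- matrix(rexp(n_genes * n_types), n_genes)       # toy reference profiles
C <- matrix(runif(n_types * n_mix), n_types)
C <- sweep(C, 2, colSums(C), "/")                   # compositions sum to 1
noise_sd <- rep(c(0.05, 1), each = n_genes / 2)     # second half of genes is noisy
Y <- X %*% C + matrix(rnorm(n_genes * n_mix, sd = noise_sd), n_genes)

g <- rep(1, n_genes)
loss_start <- loss_sketch(g, X, Y, C)
step <- 0.1
for (iter in 1:50) {
  # Move against the gradient and keep every weight strictly positive.
  candidate <- pmax(g - step * num_grad(g, X, Y, C), 1e-3)
  if (loss_sketch(candidate, X, Y, C) < loss_sketch(g, X, Y, C)) {
    g <- candidate
  } else {
    step <- step / 2                # shrink the step if the loss did not improve
  }
}
loss_end <- loss_sketch(g, X, Y, C)
```

Because a step is only accepted when it lowers the loss, `loss_end` can never exceed `loss_start`. How much the loss improves, and which genes end up with small weights, depends on the data.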

License

All source code and documentation can be freely used and are available under the MIT license.
