Fix conversion function #158

Open
wants to merge 43 commits into
base: master
Commits
fb7971d
addition of seqimmucc custom deconv
Nov 15, 2023
19c045c
update on cell type mappings
Nov 16, 2023
5deffba
vignette updates
Nov 16, 2023
0fb45a3
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 17, 2023
7aa54a6
Roxygenize
LorenzoMerotto Nov 17, 2023
f461ebe
pkgdown
Nov 17, 2023
ed2fb4b
Merge branch 'fix_conversion_fuction' of https://github.com/omnidecon…
Nov 17, 2023
5a2d97f
removing one broken function
Nov 17, 2023
2b78dd5
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 17, 2023
f7f7b35
Roxygenize
LorenzoMerotto Nov 17, 2023
a496eb4
test for seqimmucc custom
Nov 17, 2023
9e3c710
Merge branch 'fix_conversion_fuction' of https://github.com/omnidecon…
Nov 17, 2023
b03d114
editing the test for seqimmucc custom
Nov 17, 2023
206c384
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 17, 2023
5f4dc80
Roxygenize
LorenzoMerotto Nov 17, 2023
e5a84a7
bug
Nov 17, 2023
231179d
Merge branch 'fix_conversion_fuction' of https://github.com/omnidecon…
Nov 17, 2023
236d67c
bug in consensustme cancer types
Nov 20, 2023
11a7ced
minor fixes in custom methods
Nov 20, 2023
a4af09c
Roxygenize
LorenzoMerotto Nov 20, 2023
0f9e114
use miniconda, change version
Nov 22, 2023
59a4491
Merge branch 'fix_conversion_fuction' of https://github.com/omnidecon…
Nov 22, 2023
6ed6516
remove miniforge
Nov 22, 2023
b1b9447
python ver
Nov 22, 2023
c8b2348
revert to miniforge
Nov 22, 2023
d798917
spect mamba
Nov 22, 2023
0d524d6
get rid of mamba
Nov 22, 2023
b376ab3
change conda versions
Nov 22, 2023
4e66c01
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 22, 2023
dcc6b1c
use a fx mniforge version
Nov 23, 2023
0d21d4a
Merge branch 'fix_conversion_fuction' of https://github.com/omnidecon…
Nov 23, 2023
3342382
use mamba
Nov 23, 2023
d50ebc8
swtch to conda
Nov 23, 2023
59a0196
create new env
Nov 23, 2023
8201f02
downgrade of conda build
Nov 23, 2023
3979132
let's see if mamba works
Nov 23, 2023
4cd8d2e
try if mamba works
Nov 23, 2023
9022ae4
revert back to conda
Nov 23, 2023
6b0ad46
fixing consensustme function
Nov 24, 2023
7a68872
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 24, 2023
f45752c
addition of test for genes conversion
Dec 13, 2023
2e25320
Merge branch 'fix_conversion_fuction' of https://github.com/omnidecon…
Dec 13, 2023
544fc49
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 13, 2023
4 changes: 2 additions & 2 deletions .conda/meta.yaml
@@ -2,7 +2,7 @@

package:
name: {{ name }}
version: "develop"
version: "develop"

source:
path: ".."
@@ -68,7 +68,7 @@ test:
- '$R -e "library(''immunedeconv'')"'

about:
home: https://github.com/icbi-lab/immunedeconv
home: https://github.com/omnideconv/immunedeconv
license: BSD_3_clause
summary: "collection of methods for immune cell deconvolution of bulk RNA-seq samples."
license_family: BSD
9 changes: 5 additions & 4 deletions .github/workflows/conda.yml
@@ -26,19 +26,20 @@ jobs:
uses: conda-incubator/setup-miniconda@v2
with:
auto-update-conda: true
miniforge-version: latest
miniforge-version: 23.3.0-0
use-mamba: true
channels: conda-forge,bioconda
channel-priority: true
python-version: 3.8
python-version: 3.9

- name: Set-up channels and install conda build
run: |
mamba install -y conda-build conda-verify boa
conda install -y conda-build=3.25.0 conda-verify boa
shell: bash

- name: build and test package
run: |
cd .conda
mamba build . --no-anaconda-upload
conda create -n myenv -y conda-build
conda run --no-capture-output -n myenv conda build . --no-anaconda-upload
shell: bash
3 changes: 2 additions & 1 deletion NAMESPACE
@@ -8,7 +8,6 @@ export(custom_deconvolution_methods)
export(deconvolute)
export(deconvolute_abis)
export(deconvolute_base_algorithm)
export(deconvolute_base_custom)
export(deconvolute_cibersort)
export(deconvolute_cibersort_custom)
export(deconvolute_consensus_tme)
@@ -22,6 +21,7 @@ export(deconvolute_mmcp_counter)
export(deconvolute_mouse)
export(deconvolute_quantiseq)
export(deconvolute_seqimmucc)
export(deconvolute_seqimmucc_custom)
export(deconvolute_timer)
export(deconvolute_xcell)
export(deconvolution_methods)
@@ -35,6 +35,7 @@ export(map_cell_types)
export(map_result_to_celltypes)
export(reduce_mouse_cell_types)
export(scale_to_million)
export(seqImmuCC_LLSR)
export(set_cibersort_binary)
export(set_cibersort_mat)
export(timer_available_cancers)
36 changes: 20 additions & 16 deletions R/custom_deconvolution_methods.R
@@ -19,7 +19,7 @@ NULL
#' List of methods that support the use of a custom signature
#'
#' The available methods are
#' `epic`, `cibersort`, `cibersort_abs`, `consensus_tme`, `base`
#' `epic`, `cibersort`, `cibersort_abs`, `consensus_tme`, `seqimmucc`
#'
#' The object is a named vector. The names correspond to the display name of the method,
#' the values to the internal name.
@@ -29,7 +29,7 @@ custom_deconvolution_methods <- c(
"EPIC" = "epic",
"CIBERSORT" = "cibersort",
"ConsensusTME" = "consensus_tme",
"BASE" = "base"
"seqImmuCC" = "seqimmucc"
Collaborator
What happened to BASE?

Collaborator
Removing a method is a breaking change that should only happen with a good reason (+ major version bump + explanation in changelog)

Collaborator Author
I'm not removing a method here, I'm just removing the possibility to use it with a custom matrix (this is due to an error in my implementation)

Collaborator
all good then

)


@@ -153,25 +153,29 @@ deconvolute_consensus_tme_custom <- function(gene_expression_matrix, signature_g
}



#' Deconvolute using BASE and a custom signature matrix
#' Deconvolute using seqImmuCC (LLSR regression) and a custom signature matrix
#'
#' @param gene_expression_matrix a m x n matrix with m genes and n samples. Data
#' should be TPM normalized and log10 scaled.
#' @param signature_matrix a m x l matrix with m genes and l cell types. Data
#' should be non normalized, as the normalization wil be done in the construction
#' of the compendium (internal structure)
#' @param n_permutations the number of permutations of each sample expression
#' to generate. These are used to normalize the results.
#' @param log10 logical. if TRUE, log10 transforms the expression matrix.
#' @param signature_matrix a m x l matrix with m genes and l cell types. The
#' matrix should contain only a subset of the genes useful for the analysis.
#' @param ... passed to the original seqImmuCC_LLSR function
#' @export
#'
deconvolute_base_custom <- function(gene_expression_matrix,
signature_matrix,
n_permutations = 100,
log10 = TRUE) {
new.cell.compendium <- create_base_compendium(signature_matrix)
results <- base_algorithm(gene_expression_matrix, new.cell.compendium, perm = n_permutations)
deconvolute_seqimmucc_custom <- function(gene_expression_matrix,
signature_matrix,
...) {
arguments <- dots_list(
signature = signature_matrix,
SampleData = gene_expression_matrix, ..., .homonyms = "last"
)

call <- rlang::call2(seqImmuCC_LLSR, !!!arguments)
results <- eval(call)


# results <- seqImmuCC_LLSR(signature_matrix, gene_expression_matrix, ..., .homonyms = "last")
results <- results[, !colnames(results) %in% c("Correlation", "RSEM")]

return(t(results))
}
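
The argument handling in `deconvolute_seqimmucc_custom()` above relies on `rlang::dots_list(..., .homonyms = "last")`, so a value supplied through `...` takes precedence over an earlier argument of the same name. The following is a minimal sketch of that pattern, assuming the `rlang` package is installed; `toy_llsr()` and `toy_wrapper()` are hypothetical stand-ins and not part of immunedeconv.

```r
library(rlang)

# Hypothetical stand-in for seqImmuCC_LLSR(); it only echoes what it receives.
toy_llsr <- function(signature, SampleData, log = TRUE) {
  list(signature = signature, SampleData = SampleData, log = log)
}

# Same pattern as the wrapper in the diff: fixed arguments first, then `...`.
# With .homonyms = "last", a duplicated name keeps only its last occurrence.
toy_wrapper <- function(gene_expression_matrix, signature_matrix, ...) {
  arguments <- dots_list(
    signature = signature_matrix,
    SampleData = gene_expression_matrix,
    ...,
    .homonyms = "last"
  )
  # Build the call from the collected arguments and evaluate it.
  eval(call2(toy_llsr, !!!arguments))
}

# No extra arguments: the defaults of toy_llsr() apply.
res_default <- toy_wrapper("expr", "sig")

# Extra arguments are forwarded; the later `signature` replaces the fixed one.
res_override <- toy_wrapper("expr", "sig", log = FALSE, signature = "other_sig")
```

In this toy setting `do.call(toy_llsr, arguments)` would give the same result; whether that simplification suits the package code is left to the authors.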
14 changes: 12 additions & 2 deletions R/immune_deconvolution_methods.R
@@ -328,12 +328,22 @@ deconvolute_consensus_tme <- function(gene_expression_matrix,
t, method
)

colnames(cur.results) <- colnames(gene_expression_matrix)[cur.samples]
list.results[[t]] <- cur.results
}

results <- Reduce(cbind, list.results)
list.results <- lapply(list.results, function(x) as.data.frame(x))

colnames(results) <- colnames(gene_expression_matrix)
if (length(tumor.types) > 1) {
results <- Reduce(
function(x, y) merge(x, y, all = TRUE),
lapply(list.results, function(x) data.frame(x, rn = row.names(x)))
)

results <- column_to_rownames(results, "rn")
} else {
results <- list.results[[1]]
}

return(results)
}
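
The per-tumour-type merge in the hunk above can be tried on toy data. This is a minimal base R sketch, assuming two hypothetical result tables (`res_a`, `res_b`) whose cell-type rows only partly overlap; base `rownames<-` is used here in place of `column_to_rownames()` so that the snippet has no dependencies.

```r
# Hypothetical per-tumour-type results: rows are cell types, columns are samples.
res_a <- data.frame(
  s1 = c(0.1, 0.2),
  s2 = c(0.3, 0.4),
  row.names = c("B_cells", "NK_cells")
)
res_b <- data.frame(
  s3 = c(0.5, 0.6),
  row.names = c("B_cells", "Macrophages")
)
list_results <- list(res_a, res_b)

# Expose the row names as a column so merge() can join on them.
with_rn <- lapply(list_results, function(x) data.frame(x, rn = row.names(x)))

# Full outer join on `rn` (the only shared column name); all = TRUE keeps
# cell types that are present for only one tumour type.
merged <- Reduce(function(x, y) merge(x, y, all = TRUE), with_rn)

# Move the key column back into the row names.
rownames(merged) <- merged$rn
merged$rn <- NULL

# Cell types missing for a tumour type show up as NA in that type's samples.
print(merged)
```

Note that the merged columns follow the order of the per-type tables rather than the order of the input samples; if the original sample order matters, the columns could be reordered afterwards.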
1 change: 1 addition & 0 deletions R/mouse_deconvolution_methods.R
@@ -271,6 +271,7 @@ convert_human_mouse_genes <- function(gene_expression_matrix, mirror = "www",
gene_expression_matrix <- as.data.frame(gene_expression_matrix)
gene_expression_matrix$gene_name <- gene.names
} else {
gene.names <- gene_expression_matrix
gene_expression_matrix <- data.frame(
"gene_name" = gene_expression_matrix,
"X" = rep(0, length(gene_expression_matrix))
2 changes: 1 addition & 1 deletion R/seqImmuCC_LLSR.R
@@ -15,7 +15,7 @@
#' @param sig.stand logical.If TRUE, standardizes the signature matrix
#' @param sample.scale logical. If TRUE, scales the sample expression
#' @param log logical. If TRUE, log transforms signature and expression data
#'
#' @export
#'
seqImmuCC_LLSR <- function(signature, SampleData, w = NA, QN = T, sig.scale = F, sig.stand = T, sample.scale = T, log = T) {
# Expression profile format standarlization
2 changes: 1 addition & 1 deletion _pkgdown.yml
@@ -40,7 +40,7 @@ reference:
- custom_deconvolution_methods
- deconvolute_cibersort_custom
- deconvolute_epic_custom
- deconvolute_base_custom
- deconvolute_seqimmucc_custom
- deconvolute_consensus_tme_custom
- title: Cell type mapping
desc: Map cell types and datasets to a controlled vocabulary.
Binary file modified inst/extdata/cell_type_mapping.xlsx
Binary file not shown.
2 changes: 1 addition & 1 deletion man/cell_type_tree.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion man/custom_deconvolution_methods.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

29 changes: 0 additions & 29 deletions man/deconvolute_base_custom.Rd

This file was deleted.

20 changes: 20 additions & 0 deletions man/deconvolute_seqimmucc_custom.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

7 changes: 4 additions & 3 deletions tests/testthat/test_custom_matrices.R
@@ -5,8 +5,9 @@ test_mat <- as.matrix(test_mat)



test_that("BASE works with a custom signature matrix", {
sign_mat <- matrix(120 * runif(1500), ncol = 10)
test_that("seqimmucc works with a custom signature matrix", {
test_mat <- dataset_racle$expr_mat
sign_mat <- matrix(120 * runif(15000), ncol = 10)
colnames(sign_mat) <- c(
"A", "B", "C", "D",
"E", "F", "G", "H",
@@ -15,7 +16,7 @@ test_that("BASE works with a custom signature matrix", {

rownames(sign_mat) <- sample(rownames(test_mat), nrow(sign_mat))

res <- deconvolute_base_custom(test_mat, sign_mat)
res <- deconvolute_seqimmucc_custom(test_mat, sign_mat)
assert("matrix dimensions consistent", ncol(res) == ncol(test_mat))
assert("matrix dimensions consistent", nrow(res) == ncol(sign_mat))
})
4 changes: 2 additions & 2 deletions tests/testthat/test_deconvolution.R
@@ -84,8 +84,8 @@ test_that("consensus_tme works", {


test_that("consensus_tme with multiple indications, ordered and unordered", {
indications_1 <- c("blca", "blca", "brca", "brca", "brca", "brca", "chol", "chol")
indications_2 <- c("brca", "brca", "blca", "chol", "chol", "brca", "brca", "blca")
indications_1 <- c("blca", "blca", "brca", "brca", "brca", "brca", "chol", "dlbc")
indications_2 <- c("brca", "brca", "blca", "chol", "dlbc", "brca", "brca", "blca")
res_1 <- deconvolute_consensus_tme(test_mat, indications = indications_1)
res_2 <- deconvolute_consensus_tme(test_mat, indications = indications_2)
assert("matrix dimensions consistent", ncol(res_1) == ncol(test_mat))
25 changes: 15 additions & 10 deletions tests/testthat/test_deconvolution_mouse.R
@@ -39,13 +39,18 @@ test_that("generic deconvolution works for all methods", {
})
})

# test_that("mouse gene names can be converted into their human orthologus, and vice versa", {
# test_mat_newGenes <- convert_human_mouse_genes(test_mat[1:1000, ], convert_to = "human")
# assert("matrix dimensions consistent", ncol(test_mat_newGenes) == ncol(test_mat))
#
# test_mat_human <- read_tsv("bulk_mat.tsv") %>%
# as.data.frame() %>%
# tibble::column_to_rownames("gene_symbol")
# test_mat_newGenes <- convert_human_mouse_genes(test_mat_human[1:1000, ], convert_to = "mouse")
# assert("matrix dimensions consistent", ncol(test_mat_newGenes) == ncol(test_mat_human))
# })
test_that("mouse gene names can be converted into their human orthologus, and vice versa", {
test_mat_newGenes <- convert_human_mouse_genes(test_mat[1:1000, ], convert_to = "human")
assert("matrix dimensions consistent", ncol(test_mat_newGenes) == ncol(test_mat))

test_mat_human <- read_tsv("bulk_mat.tsv") %>%
as.data.frame() %>%
tibble::column_to_rownames("gene_symbol")
test_mat_newGenes <- convert_human_mouse_genes(test_mat_human[1:1000, ], convert_to = "mouse")
assert("matrix dimensions consistent", ncol(test_mat_newGenes) == ncol(test_mat_human))
})

test_that("mouse gene names in vector can be converted into their human orthologus, and vice versa", {
newGenes <- convert_human_mouse_genes(rownames(test_mat[1:1000, ]), convert_to = "human")
assert("gene names are in vector form", is.vector(newGenes))
})
2 changes: 1 addition & 1 deletion vignettes/detailed_example_mouse.Rmd
@@ -66,7 +66,7 @@ res_mMCPcounter %>%

## Deconvolution with human-based methods

Human-based methods can still be used to deconvolve mouse data through the use of orthologous genes. The function `mouse_genes_to_human` does that by retrieving the correspondent gene names with `biomaRt`. Since the gene names are retrieved from the Ensembl database, it can happen that the command has to be run with different Ensembl mirrors (see the documentation)
Human-based methods can still be used to deconvolve mouse data through the use of orthologous genes. The function `convert_human_mouse_genes` does that by retrieving the correspondent gene names with `biomaRt`. Since the gene names are retrieved from the Ensembl database, it can happen that the command has to be run with different Ensembl mirrors (see the documentation)


```R
27 changes: 12 additions & 15 deletions vignettes/immunedeconv.Rmd
@@ -124,26 +124,28 @@ In addition, human-based methods can be used to deconvolute mouse data through t
to the corresponding human orthologues

```R
gene_expression_matrix <- immunedeconv::mouse_genes_to_human(gene_expression_matrix)
gene_expression_matrix <- immunedeconv::convert_human_mouse_genes(gene_expression_matrix, convert_to = "human")
Collaborator
Did I get this right that the function wasn't renamed, but just the wrong function reference from the tutorial?

Collaborator Author
The function was renamed because instead of going from mouse genes to human, now it can go both ways

Collaborator
This is also a breaking change then... Usually what is done in such cases is to keep the old function as an alias to the new one (provided that it has a compatible signature) and show a deprecation warning when it is still used.

immunedeconv::deconvolute(gene_expression_matrix, "quantiseq")
```
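
One way to follow the reviewer's suggestion of keeping the old name as a deprecated alias is sketched below. It assumes that the old `mouse_genes_to_human()` took the expression matrix as its first argument (its actual signature is not shown in this PR), and `convert_human_mouse_genes()` is replaced by a stand-in so that the snippet runs on its own.

```r
# Stand-in for the real function, present only so the sketch is runnable;
# the package version queries biomaRt instead of returning its input.
convert_human_mouse_genes <- function(gene_expression_matrix,
                                      convert_to = c("human", "mouse"),
                                      ...) {
  convert_to <- match.arg(convert_to)
  gene_expression_matrix
}

# Deprecated alias: warns, then forwards to the new function with the
# direction fixed to "human", matching the old behaviour.
mouse_genes_to_human <- function(gene_expression_matrix, ...) {
  .Deprecated("convert_human_mouse_genes")
  convert_human_mouse_genes(gene_expression_matrix, convert_to = "human", ...)
}

# Existing user code keeps working, but now emits a deprecation warning.
toy_mat <- matrix(1:4, ncol = 2, dimnames = list(c("Cd8a", "Cd4"), c("s1", "s2")))
converted <- suppressWarnings(mouse_genes_to_human(toy_mat))
```

`.Deprecated()` is part of base R, so no extra dependency is needed; the alias could be dropped in a later major release.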

## Deconvolution using a custom signature

Finally, certain methods can be used with custom signatures, consisting of either a signature matrix or signature genes
for the cell types of interest. Since the information used to deconvolute the bulk is user-provided, these functions can be
used for different tissues and organisms.
for the cell types of interest. Since the information used to deconvolute the bulk is user-provided, these functions can be used for different tissues and organisms.
The functions may require different input data formats, related to the requirements of each method. Please refer to their documentation.
The available methods are


```r
base: deconvolute_base_custom()
cibersort norm/abs: deconvolute_cibersort_custom()
epic: deconvolute_epic_custom()
consensus_tme: deconvolute_consensus_tme_custom()
seqimmucc: deconvolute_seqimmucc_custom()
```

Please note that using the `deconvolute_seqimmucc_custom()` with a custom signature matrix will automatically use the LLSR regression method.
If you wish to use the nuSVR regression method, use the `deconvolute_cibersort_custom()` function.


## Example
For this example, we use a dataset of four melanoma patients from [@EPIC2017].
@@ -241,8 +243,7 @@ with(cell_type_hierarchy, {

For each method, each cell-type is mapped to a node in the tree.
If you are curious, it's all defined in [this excel
sheet](https://github.com/grst/immunedeconv/blob/master/inst/extdata/cell_type_mapping.xlsx).

sheet](https://github.com/omnideconv/immunedeconv/blob/master/inst/extdata/cell_type_mapping.xlsx).
This tree can be used to summarize scores along the tree. For instance,
quanTIseq provides scores for regulatory and non-regulatory CD4+ T cells independently, but you are
interested in the fraction of overall CD4+ T cells. In that case you can use
@@ -300,15 +301,11 @@ a score in arbitrary units.

# FAQs
### Can I specify a custom signature matrix through immunedeconv?
No, currently not. The reason is that the methods are conceptually different.
Some are marker gene based and others deconvolution-based. CIBERSORT performs
feature-selection on the matrix while EPIC and quanTIseq don't. EPIC uses *all*
genes to estimate the inter-sample variance while quanTIseq uses marker genes
only. This is also being discussed in
[#15](https://github.com/grst/immunedeconv/issues/15).

You can, however, provide custom signatures for most individual methods (see
next question).
Currently, four methods (EPIC, CIBERSORT, ConsensusTME and seqImmuCC) allow users to provide their own signature matrix and/or gene set.
Please note that the format of the signatures might vary across methods.
In addition, since the embedded signatures were optimized for the methods, user-provided ones might have a worse performance.
If you want to know more, please see [#15](https://github.com/grst/immunedeconv/issues/15).


### I want to use a special feature of a method, but I cannot access it through the `deconvolute` function.
You can access each method individually through the `deconvolute_xxx` function.