[Feature Request] Support ENSEMBL IDs #44

Open
lcolladotor opened this issue Dec 20, 2023 · 2 comments

Comments

@lcolladotor
Member

We should add an is_gencode = TRUE default argument so that, when it is set to FALSE, matching uses ENSEMBL IDs instead of Gencode IDs. That is, it removes the trailing version suffix (.[0-9]) from the IDs.

This should include a unit test that checks the results using the same data with Gencode IDs, then manually makes them ENSEMBL IDs, and checks that with is_gencode = FALSE we get exactly the same results (might have to use set.seed() on this unit test).
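A minimal sketch of the idea above, not the package's implementation: `filter_tx()` is a hypothetical stand-in for the real function, a plain matrix stands in for the `rse_tx` object, and the expression values are made up. It only illustrates how `is_gencode = FALSE` could strip the version suffix and how the equivalence check might look.

```r
# Hypothetical stand-in for the package function: keeps the rows of `x`
# whose transcript ID appears in `sig_transcripts`.
filter_tx <- function(x, sig_transcripts, is_gencode = TRUE) {
    if (!is_gencode) {
        # Drop the trailing Gencode version,
        # e.g. "ENST00000442987.3" becomes "ENST00000442987"
        sig_transcripts <- sub("\\.[0-9]+$", "", sig_transcripts)
    }
    x[rownames(x) %in% sig_transcripts, , drop = FALSE]
}

# Toy data with Gencode-style row names (values are made up)
set.seed(20231220)
gencode_ids <- c("ENST00000442987.3", "ENST00000456328.2", "ENST00000450305.2")
x_gencode <- matrix(
    rnorm(6),
    nrow = 3,
    dimnames = list(gencode_ids, c("s1", "s2"))
)
sig <- gencode_ids[1:2]

# Same data with the version suffix removed by hand
x_ensembl <- x_gencode
rownames(x_ensembl) <- sub("\\.[0-9]+$", "", rownames(x_gencode))

res_gencode <- filter_tx(x_gencode, sig)
res_ensembl <- filter_tx(x_ensembl, sig, is_gencode = FALSE)

# The two results should agree once the row names are made comparable
rownames(res_ensembl) <- rownames(res_gencode)
stopifnot(identical(res_gencode, res_ensembl))
```

In a real test the `stopifnot()` call would presumably become an `expect_identical()` inside `test_that()`.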

@HediaTnani
Collaborator

This code has been added to the function to support ENSEMBL IDs:

```r
# Check if all rownames start with "ENST"
if (!all(grepl("^ENST", rownames(rse_tx)))) {
    stop("Error: Some rownames do not start with 'ENST'.")
}

# Check patterns and perform operations based on the patterns
if (all(grepl("^ENST.*?\\.", rownames(rse_tx)))) {
    # If all row names have the format 'ENST00000442987.3'
    rse_tx <- rse_tx[rownames(rse_tx) %in% sig_transcripts, , drop = FALSE]
} else if (all(grepl("^ENST", rownames(rse_tx)))) {
    # If all row names have the format 'ENST00000442987'
    sig_transcripts <- gsub("\\..*", "", sig_transcripts)
    rse_tx <- rse_tx[rownames(rse_tx) %in% sig_transcripts, , drop = FALSE]
} else {
    stop("Error: Row names do not match the expected patterns.")
}
```
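To see how this kind of pattern check behaves on the two ID styles, here is a small sketch on made-up vectors. `has_version()` is a hypothetical helper, and its pattern is deliberately stricter than the one in the snippet above (it requires digits on both sides of the dot), so treat it as an illustration rather than the code that was added.

```r
# Example IDs in the two styles discussed in this issue
ids_gencode <- c("ENST00000442987.3", "ENST00000456328.2")
ids_ensembl <- c("ENST00000442987", "ENST00000456328")

# Hypothetical helper: TRUE for IDs that carry a Gencode version suffix
has_version <- function(ids) {
    grepl("^ENST[0-9]+\\.[0-9]+$", ids)
}

gencode_all_versioned <- all(has_version(ids_gencode))
ensembl_any_versioned <- any(has_version(ids_ensembl))

# Stripping everything from the first dot onwards gives the ENSEMBL-style IDs
stripped <- gsub("\\..*", "", ids_gencode)
stopifnot(identical(stripped, ids_ensembl))
```

Checking each ID individually, as `has_version()` does, also makes it possible to detect row names that mix both styles.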

@HediaTnani
Collaborator

Unit tests have been implemented here:

# Strip the version suffix from the row names of covComb_tx_deg and apply getDegTx
altered_covComb_tx_deg <- covComb_tx_deg
rownames(altered_covComb_tx_deg) <- gsub("\\..*", "", rownames(covComb_tx_deg))
altered_results <- getDegTx(altered_covComb_tx_deg, sig_transcripts = select_transcripts("cell_component"))
# Restore the original (versioned) row names so the two results are comparable
rownames(altered_results) <- rownames(original_results)
# Test that the two objects are identical
expect_identical(original_results, altered_results)
})
