some more edits to pkgdown and GHA
mikelove committed Aug 26, 2023
1 parent 69cd5a4 commit 2ac093b
Showing 4 changed files with 62 additions and 3 deletions.
3 changes: 3 additions & 0 deletions .github/workflows/check-bioc.yml
@@ -73,6 +73,9 @@ jobs:
mkdir /__w/_temp/Library
echo ".libPaths('/__w/_temp/Library')" > ~/.Rprofile
- name: work around permission issue
run: git config --global --add safe.directory /__w/thelovelab/tximeta

## Most of these steps are the same as the ones in
## https://github.com/r-lib/actions/blob/master/examples/check-standard.yaml
## If they update their steps, we will also need to update ours.
44 changes: 41 additions & 3 deletions README.md
@@ -2,16 +2,54 @@

[![R build status](https://github.com/thelovelab/tximeta/actions/workflows/check-bioc.yml/badge.svg)](https://github.com/thelovelab/tximeta/actions/workflows/check-bioc.yml)

For a reference for `tximeta`:
# Automatic metadata for RNA-seq

*tximeta* provides a set of functions for conveniently working with
metadata for transcript quantification data in Bioconductor. The
`tximeta()` function imports quantification data from *Salmon* or
other quantifiers, and returns a
[SummarizedExperiment](https://bioconductor.org/packages/release/bioc/vignettes/SummarizedExperiment/inst/doc/SummarizedExperiment.html#anatomy-of-a-summarizedexperiment)
object.

If `tximeta()` recognizes the reference transcripts used
for quantification, it will automatically download relevant
information about the location of the transcripts in the correct genome.
*These actions happen in the background without requiring any extra
effort or information from the user.*

This metadata is attached to the *SummarizedExperiment* in the
`metadata()` and `rowRanges()` slots.

For a list of the reference transcriptomes supported by `tximeta()`,
see the "Pre-computed checksums" section of the vignette in the
`Get started` tab.

Downstream steps are also supported, such as `summarizeToGene()` and
`addIds()`, as well as `retrieveCDNA()` (the transcripts used for
quantification) and `retrieveDb()` (the correct *TxDb* or *EnsDb* to
match the quantification data).
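
The steps above can be sketched roughly as follows. This is a minimal illustration, not a tested workflow: the `quants/` directory layout and the sample names are placeholders, and `addIds()` with `"SYMBOL"` assumes a suitable organism annotation package is installed. The calls are guarded so that nothing is imported unless the placeholder files actually exist.

```r
# Hypothetical sample sheet: one Salmon output directory per sample
# under quants/ (paths and sample names are placeholders).
samples <- c("sample1", "sample2")
coldata <- data.frame(
  files = file.path("quants", samples, "quant.sf"),
  names = samples
)

# Only run the import when the placeholder files are really present.
if (all(file.exists(coldata$files))) {
  library(tximeta)
  library(SummarizedExperiment)

  # Import quantifications; metadata is added if the reference is recognized
  se <- tximeta(coldata)
  print(names(metadata(se)))
  print(rowRanges(se))

  # Downstream helpers mentioned above
  gse <- summarizeToGene(se)
  gse <- addIds(gse, "SYMBOL", gene = TRUE)  # assumes an org.*.eg.db package
  db <- retrieveDb(se)      # matching TxDb or EnsDb
  cdna <- retrieveCDNA(se)  # transcripts used for quantification
}
```

`tximeta()` expects the `files` and `names` columns in `coldata`; any additional columns describing the samples can be included alongside them.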

# How it works

The key idea behind *tximeta* is that *Salmon* propagates a hash value
summarizing the reference transcripts into each quantification
directory it outputs. *tximeta* can be used with other tools as long
as the
[hash of the transcripts](https://github.com/COMBINE-lab/FastaDigest)
is also included in the output directories.

![](man/figures/diagram.png)
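
To make the idea concrete, here is a small sketch of reading such a hash from an output directory and looking it up in a table of known references. It is an illustration only, not how *tximeta* is implemented: it assumes the hash is stored under the field `index_seq_hash` in `aux_info/meta_info.json` (this may differ between quantifiers and versions), it uses the *jsonlite* package, and the helper names and the lookup table are hypothetical.

```r
# Read the reference digest stored alongside the quantification output.
# Assumption: the digest lives in aux_info/meta_info.json under `field`.
read_index_digest <- function(quant_dir, field = "index_seq_hash") {
  meta_file <- file.path(quant_dir, "aux_info", "meta_info.json")
  meta <- jsonlite::fromJSON(meta_file)
  meta[[field]]
}

# Look up a digest in a table of known references.
# Returns the matching row(s), or NULL if the reference is not recognized.
lookup_reference <- function(digest, known) {
  hit <- known[known$digest == digest, , drop = FALSE]
  if (nrow(hit) == 0) NULL else hit
}

# Made-up placeholder table; real digests are long hexadecimal strings.
known <- data.frame(
  digest = c("placeholder-digest-1", "placeholder-digest-2"),
  source = c("ExampleSourceA", "ExampleSourceB")
)
```

With these helpers, `lookup_reference(read_index_digest("quants/sample1"), known)` would return the matching row when the reference is recognized and `NULL` otherwise (the directory name is again a placeholder).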

# Reference

A reference for the *tximeta* package is:

> Michael I. Love, Charlotte Soneson, Peter F. Hickey, Lisa K. Johnson,
> N. Tessa Pierce, Lori Shepherd, Martin Morgan, Rob Patro.
> "Tximeta: reference sequence checksums for provenance
> identification in RNA-seq" *PLOS Computational Biology* (2020)
> [doi: 10.1371/journal.pcbi.1007664](https://doi.org/10.1371/journal.pcbi.1007664)
![](man/figures/diagram.png)

# Feedback

We would love to hear your feedback. Please post to
Binary file added vignettes/images/assignRanges-abundant.png
18 changes: 18 additions & 0 deletions vignettes/tximeta.Rmd
@@ -312,6 +312,8 @@ gse <- summarizeToGene(se)
rowRanges(gse)
```

## Assign ranges by abundance

We also offer a new type of range assignment, based on the most
abundant isoform rather than the range from the leftmost to the
rightmost coordinate. See the `assignRanges` argument of
`?summarizeToGene`. Using the most
@@ -323,6 +325,22 @@ than the default option.
gse <- summarizeToGene(se, assignRanges="abundant")
```

For more explanation about why this may be a better choice, see the
following tutorial chapter:

<https://tidyomics.github.io/tidy-ranges-tutorial/gene-ranges-in-tximeta.html>

In the diagram below, the pink feature is the set of all exons
belonging to any isoform of the gene, such that the TSS is on the
right side of this minus strand feature. However, the blue feature is
the most abundant isoform (the brown features are the next most
abundant isoforms). The pink feature is therefore not a good
representation of the locus.

```{r echo=FALSE}
knitr::include_graphics("images/assignRanges-abundant.png")
```
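
One way to see how much the two assignments differ is to compare the widths of the gene ranges they produce. The helper below is a rough sketch, not part of *tximeta*: `compare_gene_widths()` is a hypothetical name, and it assumes `se` is a transcript-level object imported with `tximeta()` as earlier in this vignette.

```r
# Hypothetical helper: summarize the difference in gene range widths
# between the default assignment and assignRanges="abundant".
compare_gene_widths <- function(se) {
  gse_range <- tximeta::summarizeToGene(se)
  gse_abund <- tximeta::summarizeToGene(se, assignRanges = "abundant")

  # Align genes by name in case the row order differs
  idx <- match(rownames(gse_range), rownames(gse_abund))

  w_range <- BiocGenerics::width(SummarizedExperiment::rowRanges(gse_range))
  w_abund <- BiocGenerics::width(SummarizedExperiment::rowRanges(gse_abund))[idx]

  # Difference in width per gene (default minus abundant)
  summary(w_range - w_abund)
}

# Usage, given the `se` object from above:
# compare_gene_widths(se)
```

Genes with a large difference are those where the most abundant isoform covers only a small part of the full extent of the gene.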

# Add different identifiers

We would like to add support to easily map transcript or gene