Overview

This package uses the MultiAssayExperiment and curatedTCGAData packages to download the relevant data and assemble a single, pan-cancer FacileTcgaDataSet from a subset of the assays those resources make available.

Gene-level quantitation (expression, CNV, etc.) will be mapped to the Ensembl gene-level universe. Entries that cannot be mapped to an Ensembl gene will be dropped.
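
As a rough illustration of the map-then-drop rule above, here is a minimal base R sketch. The lookup table `gene_map`, the toy matrix `expr`, and the helper `map_to_ensembl()` are hypothetical names invented for this example; they are not part of this package, and a real lookup would come from an annotation resource.

```r
# Hypothetical symbol -> Ensembl gene id lookup (illustration only).
gene_map <- data.frame(
  symbol     = c("TP53", "EGFR", "KRAS"),
  ensembl_id = c("ENSG00000141510", "ENSG00000146648", "ENSG00000133703"),
  stringsAsFactors = FALSE
)

# Toy gene-by-sample matrix; "NOT_A_GENE" has no entry in the lookup.
expr <- matrix(
  c(5.1, 0.0, 2.3, 7.8,   # sample_1
    4.9, 1.2, 2.0, 8.1),  # sample_2
  nrow = 4,
  dimnames = list(
    c("TP53", "EGFR", "NOT_A_GENE", "KRAS"),
    c("sample_1", "sample_2")
  )
)

# Re-key rows by Ensembl id and drop rows that cannot be mapped.
map_to_ensembl <- function(x, map) {
  idx  <- match(rownames(x), map$symbol)  # NA where a row has no mapping
  keep <- !is.na(idx)
  out  <- x[keep, , drop = FALSE]
  rownames(out) <- map$ensembl_id[idx[keep]]
  out
}

mapped <- map_to_ensembl(expr, gene_map)
```

This sketch does not handle several symbols mapping to the same Ensembl id; a real implementation would need a rule for that case (for example, summing or keeping one row).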

Datasets

We will create a dataset across all indications using the following assays:

mRNA Abundance

The following assays can be pulled from the curatedTCGAData package:

RNASeq2GeneNorm: FPKM values (sad)
RNASeqGene: v2 FPKM values?
miRNASeqGene: miRNA-seq data

We might instead consider getting gene- and transcript-level counts from the recount2 project:

https://jhubiostatistics.shinyapps.io/recount/

They used Gencode v25 GFF3 annotations, which we can parse with the utility functions in GemomicsStudyDb.

Copy Number

GISTIC_AllByGene: real-valued copy number per gene
GISTIC_ThresholdedByGene: thresholded (duplication / deletion / normal) gene-level CNV scores

GenomicVariants

Mutation: mutation status? What does this look like?

Other

RPPAArray: Reverse Phase Protein Array (NOISY)
Methylation, Methylation_methyl27, or Methylation_methyl450

Each indication will be downloaded separately, and the results will be assembled into one final FacileTcgaDataSet.

Data Assembly

Let’s follow along with the curatedTCGAData vignette to figure out how to do this.
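
A first step could look like the sketch below, which wraps the `curatedTCGAData()` call for a single indication. The helper `fetch_indication()`, the assay selection in `assays_wanted`, and the data version `"2.0.1"` are assumptions made for illustration, not decisions this package has made; older releases of curatedTCGAData do not accept a `version` argument.

```r
# Assumed assay selection, taken from the lists above (illustration only).
assays_wanted <- c("RNASeq2GeneNorm", "GISTIC_AllByGene", "GISTIC_ThresholdedByGene")

# Assumed curatedTCGAData data version; adjust to what your installation offers.
data_version <- "2.0.1"

# Hypothetical helper: query (or download) the selected assays for one
# TCGA indication code such as "ACC" or "BLCA".
fetch_indication <- function(code, dry_run = TRUE) {
  if (!requireNamespace("curatedTCGAData", quietly = TRUE)) {
    stop("The Bioconductor package 'curatedTCGAData' is required.")
  }
  curatedTCGAData::curatedTCGAData(
    diseaseCode = code,
    assays      = assays_wanted,
    version     = data_version,
    dry.run     = dry_run  # TRUE: list matching resources; FALSE: download
  )
}

# Example usage (needs network access, so it is left commented out):
# fetch_indication("ACC")                          # list matching resources
# mae <- fetch_indication("ACC", dry_run = FALSE)  # a MultiAssayExperiment
# MultiAssayExperiment::experiments(mae)           # one entry per assay
```

With `dry_run = TRUE` the call lists the matching resources rather than downloading the data, which is a cheap way to check that the assay names exist for a given indication. Looping this helper over all indication codes with `lapply()` would give one MultiAssayExperiment per indication, ready for the assembly step.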
