An R wrapper for Gemma's RESTful API.

Gemma API


This is an R wrapper for Gemma's RESTful API.

To cite Gemma, please use: Zoubarev, A., et al., Gemma: A resource for the re-use, sharing and meta-analysis of expression profiling data. Bioinformatics, 2012.

Installation

devtools::install_github('PavlidisLab/gemmaAPI.R')

Documentation

For basic API calls, see ?endpointFunctions. These functions return mostly unaltered data from a given API endpoint.

For high-level functions, see ?highLevelFunctions. These functions return data compiled from multiple API calls.

Examples

Download data for a dataset

library(gemmaAPI)
library(magrittr) # provides the %>% pipe used in these examples

data = 
    datasetInfo('GSE107999',
                request = 'data', # ask this endpoint for the expression data; see documentation
                filter = FALSE, # the data request accepts a filter argument; FALSE returns unfiltered data
                return = TRUE, # TRUE by default in all functions; if FALSE, nothing is returned
                file = NULL # NULL by default in all functions; if specified, the output is saved to this file
    )

head(data) %>% knitr::kable(format = 'markdown')
| Probe | Sequence | GeneSymbol | GeneName | GemmaId | NCBIid | GSE107999_Biomat_9___BioAssayId=427205Name=LUHMEScells,untreated,proliferatingprecursorstaterep4 | GSE107999_Biomat_8___BioAssayId=427206Name=LUHMEScells,untreated,proliferatingprecursorstaterep3 | GSE107999_Biomat_12___BioAssayId=427207Name=LUHMEScells,untreated,proliferatingprecursorstaterep2 | GSE107999_Biomat_10___BioAssayId=427208Name=LUHMEScells,untreated,proliferatingprecursorstaterep1 | GSE107999_Biomat_5___BioAssayId=427201Name=LUHMEScells,untreated,day3ofdifferentiationrep4 | GSE107999_Biomat_4___BioAssayId=427202Name=LUHMEScells,untreated,day3ofdifferentiationrep3 | GSE107999_Biomat_7___BioAssayId=427203Name=LUHMEScells,untreated,day3ofdifferentiationrep2 | GSE107999_Biomat_6___BioAssayId=427204Name=LUHMEScells,untreated,day3ofdifferentiationrep1 | GSE107999_Biomat_11___BioAssayId=427197Name=LUHMEScells,untreated,day6ofdifferentiationrep4 | GSE107999_Biomat_2___BioAssayId=427198Name=LUHMEScells,untreated,day6ofdifferentiationrep3 | GSE107999_Biomat_1___BioAssayId=427199Name=LUHMEScells,untreated,day6ofdifferentiationrep2 | GSE107999_Biomat_3___BioAssayId=427200Name=LUHMEScells,untreated,day6ofdifferentiationrep1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1007_s_at | 1007_s_at_collapsed | DDR1 | discoidin domain receptor tyrosine kinase 1 | 16908 | 780 | 8.360044 | 8.347570 | 8.384220 | 8.631552 | 9.426037 | 9.332862 | 9.556137 | 9.571225 | 9.830016 | 9.534368 | 9.644813 | 9.638160 |
| 1053_at | 1053_at_collapsed | RFC2 | replication factor C subunit 2 | 139878 | 5982 | 8.321700 | 8.441607 | 8.538243 | 8.223463 | 6.900833 | 7.811239 | 7.362803 | 7.487110 | 6.727149 | 6.781015 | 6.871821 | 6.822983 |
| 117_at | 117_at_collapsed | HSPA7\|HSPA6 | heat shock protein family A (Hsp70) member 7\|heat shock protein family A (Hsp70) member 6 | 73442\|73420 | 3311\|3310 | 5.640347 | 4.309247 | 4.561608 | 4.412733 | 4.274228 | 4.109736 | 4.466428 | 4.262011 | 4.013711 | 4.285905 | 4.445415 | 3.929470 |
| 121_at | 121_at_collapsed | PAX8 | paired box 8 | 173107 | 7849 | 6.915072 | 7.001704 | 6.886536 | 6.995852 | 6.789746 | 6.988139 | 6.950670 | 6.897583 | 6.632473 | 6.872863 | 6.892053 | 6.845294 |
| 1255_g_at | 1255_g_at_collapsed | GUCA1A | guanylate cyclase activator 1A | 58787 | 2978 | 2.328086 | 2.683368 | 2.292127 | 2.395157 | 2.267915 | 2.371985 | 2.148122 | 2.219700 | 2.078340 | 2.243999 | 2.376379 | 2.238994 |
| 1294_at | 1294_at_collapsed | UBA7 | ubiquitin like modifier activating enzyme 7 | 165857 | 7318 | 4.436209 | 4.315595 | 4.434729 | 4.505724 | 4.182772 | 4.334539 | 4.278525 | 4.204030 | 4.105466 | 4.410392 | 4.382536 | 4.151413 |
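The returned data frame mixes gene annotation columns with one expression column per sample, so a common next step is to separate the two. The following is a minimal sketch on a toy data frame that mimics the layout shown above; the shortened sample column names (`sample_1`, `sample_2`) and the variable names are placeholders for illustration, not real Gemma output.

```r
# Toy stand-in for the data frame returned by datasetInfo(..., request = 'data').
# Sample column names are shortened placeholders (assumption for illustration).
expr <- data.frame(
    Probe = c('1007_s_at', '1053_at'),
    Sequence = c('1007_s_at_collapsed', '1053_at_collapsed'),
    GeneSymbol = c('DDR1', 'RFC2'),
    GeneName = c('discoidin domain receptor tyrosine kinase 1',
                 'replication factor C subunit 2'),
    GemmaId = c('16908', '139878'),
    NCBIid = c('780', '5982'),
    sample_1 = c(8.360044, 8.321700),
    sample_2 = c(8.347570, 8.441607),
    stringsAsFactors = FALSE
)

# Annotation columns, as they appear in the example output.
annotationColumns <- c('Probe', 'Sequence', 'GeneSymbol',
                       'GeneName', 'GemmaId', 'NCBIid')

# Gene annotations in one data frame, expression values in a numeric matrix.
geneInfo <- expr[, annotationColumns]
exprMatrix <- as.matrix(expr[, !(names(expr) %in% annotationColumns)])
rownames(exprMatrix) <- expr$Probe
```

Keeping the expression values in a numeric matrix keyed by probe makes them easier to pass to functions that expect a matrix, while `geneInfo` retains the probe-to-gene mapping.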

Get metadata for first 10 mouse studies.

mouseStudies = taxonInfo('mouse',request = 'datasets',limit = 0)
studyIDs = mouseStudies %>% purrr::map_int('id')
mouseMetadata = studyIDs[1:10] %>% lapply(compileMetadata,outputType = 'list') 
# default outputType is data.frame, which returns a single data frame with study and sample data all together.
mouseMetadata[[1]]$sampleData %>% head %>% knitr::kable(format ='markdown')
|  | id | sampleName | accession | sampleBiomaterialID | sampleAnnotCategory | sampleAnnotCategoryOntoID | sampleAnnotCategoryURI | sampleAnnotBroadCategory | sampleAnnotBroadCategoryOntoID | sampleAnnotBroadCategoryURI | sampleAnnotation | sampleAnnotationOntoID | sampleAnnotType | sampleAnnotationURI | otherCharacteristics |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Brain_C57 Wildtype_affs275-1099 | 48 | Brain_C57 Wildtype_affs275-1099 | GSM101416 | 48 | genotype | EFO_0000513 | http://www.ebi.ac.uk/efo/EFO_0000513 | genotype | EFO_0000513 | http://www.ebi.ac.uk/efo/EFO_0000513 | wild type genotype | EFO_0005168 | factor | http://www.ebi.ac.uk/efo/EFO_0005168 | total RNA\|Biotin\|C57 Wildtype Mouse #1099 Brain\|Strain: C57BL/6 Gender: female Age: 123 days Tissue: brain |
| Brain_C57 Wildtype_affs275-1100 | 47 | Brain_C57 Wildtype_affs275-1100 | GSM101417 | 47 | genotype | EFO_0000513 | http://www.ebi.ac.uk/efo/EFO_0000513 | genotype | EFO_0000513 | http://www.ebi.ac.uk/efo/EFO_0000513 | wild type genotype | EFO_0005168 | factor | http://www.ebi.ac.uk/efo/EFO_0005168 | total RNA\|C57 Wildtype Mouse #1100 Brain\|Biotin\|Strain: C57BL/6 Gender: female Age: 123 days Tissue: brain |
| Brain_Melanotransferrin Knockout_affs275-1096 | 52 | Brain_Melanotransferrin Knockout_affs275-1096 | GSM101412 | 52 | genotype;genotype | EFO_0000513;EFO_0000513 | http://www.ebi.ac.uk/efo/EFO_0000513;http://www.ebi.ac.uk/efo/EFO_0000513 | genotype | EFO_0000513 | http://www.ebi.ac.uk/efo/EFO_0000513 | Homozygous negative;Mfi2 [mouse] antigen p97 (melanoma associated) identified by monoclonal antibodies 133.2 and 96.5 | TGEMO_00001;GENE_30060 | factor | http://purl.obolibrary.org/obo/TGEMO_00001;http://purl.org/commons/record/ncbi_gene/30060 | total RNA\|brain\|Melanotransferrin Knockout Mouse #1096 Brain\|female\|Biotin\|Strain: C57BL/6 - Lucy\|Age: 123 days |
| Brain_Melanotransferrin Knockout_affs275-1097 | 51 | Brain_Melanotransferrin Knockout_affs275-1097 | GSM101413 | 51 | genotype;genotype | EFO_0000513;EFO_0000513 | http://www.ebi.ac.uk/efo/EFO_0000513;http://www.ebi.ac.uk/efo/EFO_0000513 | genotype | EFO_0000513 | http://www.ebi.ac.uk/efo/EFO_0000513 | Homozygous negative;Mfi2 [mouse] antigen p97 (melanoma associated) identified by monoclonal antibodies 133.2 and 96.5 | TGEMO_00001;GENE_30060 | factor | http://purl.obolibrary.org/obo/TGEMO_00001;http://purl.org/commons/record/ncbi_gene/30060 | total RNA\|Melanotransferrin Knockout Mouse #1097 Brain\|Biotin\|Strain: C57BL/6 - Lucy Gender: female Age: 123 days Tissue: brain |
| Brain_Melanotransferrin Knockout_affs275-1098 | 50 | Brain_Melanotransferrin Knockout_affs275-1098 | GSM101414 | 50 | genotype;genotype | EFO_0000513;EFO_0000513 | http://www.ebi.ac.uk/efo/EFO_0000513;http://www.ebi.ac.uk/efo/EFO_0000513 | genotype | EFO_0000513 | http://www.ebi.ac.uk/efo/EFO_0000513 | Homozygous negative;Mfi2 [mouse] antigen p97 (melanoma associated) identified by monoclonal antibodies 133.2 and 96.5 | TGEMO_00001;GENE_30060 | factor | http://purl.obolibrary.org/obo/TGEMO_00001;http://purl.org/commons/record/ncbi_gene/30060 | total RNA\|Biotin\|Melanotransferrin Knockout Mouse #1098 Brain\|Strain: C57BL/6 - Lucy Gender: female Age: 123 days Tissue: brain |
| Brain_Melanotransferrin Knockout_affs275-1101 | 49 | Brain_Melanotransferrin Knockout_affs275-1101 | GSM101415 | 49 | genotype;genotype | EFO_0000513;EFO_0000513 | http://www.ebi.ac.uk/efo/EFO_0000513;http://www.ebi.ac.uk/efo/EFO_0000513 | genotype | EFO_0000513 | http://www.ebi.ac.uk/efo/EFO_0000513 | Homozygous negative;Mfi2 [mouse] antigen p97 (melanoma associated) identified by monoclonal antibodies 133.2 and 96.5 | TGEMO_00001;GENE_30060 | factor | http://purl.obolibrary.org/obo/TGEMO_00001;http://purl.org/commons/record/ncbi_gene/30060 | Melanotransferrin Knockout Mouse #1101 Brain\|total RNA\|Biotin\|Strain: C57BL/6 - Lucy Gender: female Age: 123 days Tissue: brain |
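In the sample metadata shown above, the otherCharacteristics column packs several free-text values into one string separated by `|`. A minimal sketch of unpacking it with base R follows. It runs on a toy data frame built from two of the values in the example output; the variable names are illustrative assumptions, not part of the package.

```r
# Toy stand-in for mouseMetadata[[1]]$sampleData, reduced to two columns.
sampleData <- data.frame(
    sampleName = c('Brain_C57 Wildtype_affs275-1099',
                   'Brain_Melanotransferrin Knockout_affs275-1096'),
    otherCharacteristics = c(
        'total RNA|Biotin|C57 Wildtype Mouse #1099 Brain|Strain: C57BL/6 Gender: female Age: 123 days Tissue: brain',
        'total RNA|brain|Melanotransferrin Knockout Mouse #1096 Brain|female|Biotin|Strain: C57BL/6 - Lucy|Age: 123 days'
    ),
    stringsAsFactors = FALSE
)

# fixed = TRUE treats '|' as a literal character rather than regex alternation.
characteristics <- strsplit(sampleData$otherCharacteristics, '|', fixed = TRUE)
names(characteristics) <- sampleData$sampleName

# Number of separate characteristics recorded for each sample.
lengths(characteristics)
```

The same approach, with `';'` as the separator, should apply to columns such as sampleAnnotation that hold several annotations for one sample. Note that the order and grouping of values differ between samples, so the pieces cannot be assumed to line up by position.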

Download expression data for a set of studies

dir.create('data', showWarnings = FALSE) # make sure the output directory exists
studyIDs %>% sapply(function(x){datasetInfo(x,request= 'data',return= FALSE, file = paste0('data/',x))})
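When downloading many studies in one loop, a single failed request would stop the whole run. The sketch below shows one way to guard against that with `tryCatch`. `downloadAll` and `fakeDownloader` are hypothetical helpers written for this example and are not part of gemmaAPI; the fake downloader stands in for the real call so the sketch runs without network access.

```r
# Hypothetical helper (not part of gemmaAPI): call `downloader` for every ID
# and record which calls failed instead of stopping at the first error.
downloadAll <- function(ids, downloader) {
    succeeded <- vapply(ids, function(id) {
        tryCatch({
            downloader(id)
            TRUE
        }, error = function(e) {
            message('Failed for ', id, ': ', conditionMessage(e))
            FALSE
        })
    }, logical(1))
    names(succeeded) <- as.character(ids)
    succeeded
}

# Stand-in downloader that fails for one ID, to demonstrate the error path.
fakeDownloader <- function(id) {
    if (id == 2) stop('simulated failure')
    invisible(NULL)
}

status <- downloadAll(c(1, 2, 3), fakeDownloader)

# IDs whose download failed and could be retried later.
failedIDs <- names(status)[!status]
```

With the package loaded, the downloader argument could be the same anonymous function used in the example above, that is `function(x) datasetInfo(x, request = 'data', return = FALSE, file = paste0('data/', x))`.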

Changelog

17 September 2018:

  • Started writing the changelog.
  • compileMetadata now returns all quality information in geeq. Existing column names for batch effect information have been changed to better describe their contents.
  • compileMetadata now returns a list instead of a data frame for experiment-specific information if the desired output is a list.
  • Endpoint functions now work when their naming variable is NULL. In most cases this shouldn't happen, but names are meant for interactive use and should not be relied on.
  • Started using proper semantic versioning.
  • Added a table of contents to the README.