'destdir' is ignored when GPL file is downloaded for local GSE matrix file #8

Closed
biochem-fan opened this issue May 7, 2014 · 1 comment

Comments

@biochem-fan

When I try to load a (manually downloaded) GSE matrix file by

getGEO(filename="GSEXXXX_series_matrix.txt", destdir="./")

the destdir option is not passed through parseGEO() to parseGSEMatrix().
As a result, the GPL file is not looked up in destdir but is downloaded
into a temporary folder every time R is restarted.
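
The effect can be reproduced without GEOquery. The sketch below uses hypothetical stand-in functions (`parse_matrix`, `parse_dropping` and `parse_forwarding` are illustrative names, not GEOquery functions) to show how a default of `tempdir()` silently takes over when a wrapper does not forward `destdir`, and how forwarding it keeps the caller's value.

```r
# Toy stand-ins, not GEOquery code: they only mimic how the argument travels.

# Innermost function: falls back to tempdir() when destdir is not supplied,
# mirroring the default in parseGSEMatrix(..., destdir=tempdir(), ...).
parse_matrix <- function(fname, destdir = tempdir()) {
  destdir  # return the directory where the GPL file would be looked up
}

# Wrapper that receives destdir but does not pass it on (the reported behaviour).
parse_dropping <- function(fname, destdir = tempdir()) {
  parse_matrix(fname)
}

# Wrapper that forwards destdir explicitly.
parse_forwarding <- function(fname, destdir = tempdir()) {
  parse_matrix(fname, destdir = destdir)
}

dropped   <- parse_dropping("GSEXXXX_series_matrix.txt", destdir = "./")
forwarded <- parse_forwarding("GSEXXXX_series_matrix.txt", destdir = "./")

print(identical(dropped, tempdir()))  # checks whether the caller's "./" was lost
print(identical(forwarded, "./"))     # checks whether the caller's "./" was kept
```

Because `tempdir()` changes with every R session, a file cached there in one session is not found in the next, which matches the repeated downloads described above.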

@zhilongjia

getGEO currently cannot use a local GPL file directly when getGPL is TRUE, because of a bug in the function parseGSEMatrix. In this function, the parameter "destdir=tempdir()" should be "destdir=destdir", and destdir should also be added to the related function parseGEO.

I suggest adding another parameter, GPLfn. Two files (getGEO.R and parseGEO.R) would be revised to support it.
The call would look like

getGEO(filename="../GSE31747_series_matrix.txt.gz", GPLfn="../GPL8300.soft")

##################################################################
$ git diff getGEO.R
diff --git a/R/getGEO.R b/R/getGEO.R
index 06bde57..912b23c 100644
--- a/R/getGEO.R
+++ b/R/getGEO.R
@@ -3,7 +3,8 @@ getGEO <- function(GEO=NULL,
                    destdir=tempdir(),
                    GSElimits=NULL,GSEMatrix=TRUE,
                    AnnotGPL=FALSE,
-                   getGPL=TRUE) {
+                   getGPL=TRUE,
+                   GPLfn=NULL) {
   con <- NULL
   if(!is.null(GSElimits)) {
     if(length(GSElimits)!=2) {
@@ -21,6 +22,6 @@ getGEO <- function(GEO=NULL,
     }
     filename <- getGEOfile(GEO,destdir=destdir,AnnotGPL=AnnotGPL)
   }
-  ret <- parseGEO(filename,GSElimits)
+  ret <- parseGEO(filename,GSElimits, GPLfn)
   return(ret)
 }
##################################################################

$ git diff parseGEO.R
diff --git a/R/parseGEO.R b/R/parseGEO.R
index 813c996..c6b499f 100644
--- a/R/parseGEO.R
+++ b/R/parseGEO.R
@@ -1,4 +1,4 @@
-parseGEO <- function(fname,GSElimits) {
+parseGEO <- function(fname,GSElimits,GPLfn=NULL) {
   con <- fileOpen(fname)
   first.entity <- findFirstEntity(con)
   close(con)
@@ -14,7 +14,7 @@ parseGEO <- function(fname,GSElimits) {
                    parseGPL(fname)
                  },
                  "0" = {
-                   parseGSEMatrix(fname)$eset
+                   parseGSEMatrix(fname, GPLfn=GPLfn)$eset
                  },
                  )
   return(ret)
@@ -368,7 +368,7 @@ getAndParseGSEMatrices <- function(GEO,destdir,AnnotGPL,getGPL=TRUE) {
 ### Function to parse a single GSEMatrix
 ### file into an ExpressionSet
-parseGSEMatrix <- function(fname,AnnotGPL=FALSE,destdir=tempdir(),getGPL=TRUE) {
+parseGSEMatrix <- function(fname,AnnotGPL=FALSE,destdir=tempdir(),getGPL=TRUE, GPLfn=NULL) {
   dat <- readLines(fname)
   ## get the number of !Series and !Sample lines
   nseries <- sum(grepl("^!Series_", dat))
@@ -406,7 +406,12 @@ parseGSEMatrix <- function(fname,AnnotGPL=FALSE,destdir=tempdir(),getGPL=TRUE) {
   ## if getGPL is FALSE, skip this and featureData is then a data.frame with no columns
   fd = new("AnnotatedDataFrame",data=data.frame(row.names=rownames(datamat)))
   if(getGPL) {
-    gpl <- getGEO(GPL,AnnotGPL=AnnotGPL,destdir=destdir)
+    if (!is.null(GPLfn)) {
+      gpl <- getGEO(filename=GPLfn, AnnotGPL=FALSE,destdir=destdir)
+    } else {
+      gpl <- getGEO(GEO=GPL,AnnotGPL=AnnotGPL,destdir=destdir)
+    }
+    vmd <- Columns(gpl)
     dat <- Table(gpl)
     ## GEO uses "TAG" instead of "ID" for SAGE GSE/GPL entries, but it is apparently
##################################################################
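
The branch added in the last hunk amounts to a simple precedence rule: an explicitly named local GPL file wins, otherwise the platform is fetched by its accession. Below is a minimal sketch of that rule in isolation. It assumes a hypothetical helper, `choose_gpl_source` (not part of GEOquery), which only builds the argument list that would be handed to getGEO() and never calls it, so it runs without the package or network access.

```r
# Hypothetical helper, not part of GEOquery: returns the arguments that the
# patched branch would pass to getGEO(), without calling getGEO() itself.
choose_gpl_source <- function(GPL, GPLfn = NULL, AnnotGPL = FALSE,
                              destdir = tempdir()) {
  if (!is.null(GPLfn)) {
    # A local platform file was named: parse that file directly.
    list(filename = GPLfn, AnnotGPL = FALSE, destdir = destdir)
  } else {
    # No local file given: fall back to fetching by GPL accession.
    list(GEO = GPL, AnnotGPL = AnnotGPL, destdir = destdir)
  }
}

local_args  <- choose_gpl_source("GPL8300", GPLfn = "../GPL8300.soft")
remote_args <- choose_gpl_source("GPL8300", destdir = "./")

str(local_args)   # inspect the arguments chosen for the local-file case
str(remote_args)  # inspect the arguments chosen for the accession case
```

Keeping the decision separate from the download makes it easy to check which source would be used before anything is fetched.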

seandavi pushed a commit that referenced this issue May 8, 2016
Commit id: bbcc13a

    Version Bump


Commit id: 0f5d2f3

    Merge branch 'master' of github.com:seandavi/GEOquery


Commit id: d360aa8

    Fixes #8

    This fixes an issue whereby the destdir was not respected
    when a filename was specified for a GSEMatrix file.
    


git-svn-id: https://hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/GEOquery@102653 bc3139a8-67e5-0310-9ffc-ced21a209358
seandavi added a commit that referenced this issue Sep 23, 2017
Commit id: bbcc13a

    Version Bump


Commit id: 0f5d2f3

    Merge branch 'master' of github.com:seandavi/GEOquery


Commit id: d360aa8

    Fixes #8

    This fixes an issue whereby the destdir was not respected
    when a filename was specified for a GSEMatrix file.
    


git-svn-id: file:///home/git/hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/GEOquery@102653 bc3139a8-67e5-0310-9ffc-ced21a209358