TCGAquery_clinic error #8

Closed
needleworm opened this issue Jul 11, 2016 · 4 comments

Comments

@needleworm

Hi.
For the past several days, TCGAquery_clinic has not been working. It throws the error message below:

clinical_brca_data <- TCGAquery_clinic("brca","clinical_patient")
Error in fread(paste0(root, url, "/", files[grep("MANIFEST", files)]), :
Expected sep (',') but new line or EOF ends field 1 on line 33 when reading data: -->
In addition: Warning messages:
1: In fread(paste0(root, url, "/", files[grep("MANIFEST", files)]), :
Unable to find 5 lines with expected number of columns (+ middle)
2: In fread(paste0(root, url, "/", files[grep("MANIFEST", files)]), :
Unable to find 5 lines with expected number of columns (+ last)

What should I do to fix this problem?

@tiagochst
Contributor

This happens because of a server maintenance problem. Other packages will give you the same error. There is no easy way to fix our package until the server is back.

For the moment, you can use the RTCGA package (I believe it has downloaded all the data and put it into Bioconductor) or the RTCGAToolbox package (which uses GDAC Firehose as its source) to get the old data.
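
One plausible reading of the error above (an assumption, not something confirmed in this thread) is that during maintenance the server returns something other than the expected MANIFEST file, which fread then fails to parse. A minimal base-R sketch of a sanity check on downloaded lines follows. The helper name `looks_like_manifest`, the sample file name, and the assumed md5sum-style line format are all hypothetical and not part of TCGAbiolinks or data.table.

```r
# Hypothetical helper, not part of TCGAbiolinks or data.table.
# Assumption: a valid MANIFEST file is md5sum-style, i.e. every non-empty
# line is a 32-character hex checksum, whitespace, then a file name.
looks_like_manifest <- function(lines) {
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]
  if (length(lines) == 0) {
    return(FALSE)
  }
  all(grepl("^[0-9a-fA-F]{32}[[:space:]]+[^[:space:]]+$", lines))
}

# Illustrative inputs only: one well-formed line and one HTML error page.
good <- c("d41d8cd98f00b204e9800998ecf8427e  clinical_patient_brca.txt")
bad <- c("<html>", "<body>Service temporarily unavailable</body>", "</html>")

looks_like_manifest(good)
looks_like_manifest(bad)
```

In practice the lines could come from `readLines()` on the manifest URL, so that a failed check produces a clearer message than the parser error.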

@rolfhaut

Hi,

Yeah, it is really a pity that the package is currently not working, especially as I had just started to get a grasp of it (I'm actually very new to R). I contacted the NCBI site, but they weren't able to tell me when they will be back up again. However, they told me that the TCGA data portal will go away in July and the data will be available through GDC. So I guess the code will have to be modified accordingly, and I hope that you guys keep up the good work enabling inexperienced users like myself to perform analyses on TCGA data.

@labrazil
Member

Thank you for your interest in our package. Yes, we are aware of this
change and we have worked out a solution that will use GDC. Stay tuned, we
hope to have something ready in the coming days.



@tiagochst
Contributor

tiagochst commented Jul 25, 2016

Our team updated TCGAbiolinks to work with the new GDC portal. We had to rewrite almost all of the code, but we hope the rewrite will improve the tool.
Unfortunately, the changes are not yet in the Bioconductor repository, but we anticipate the code will be updated there within the coming weeks. For the moment, if you need the new package, you can install it from our GitHub repository with the following command:

devtools::install_github(repo = "BioinformaticsFMRP/TCGAbiolinks")

To get clinical data you have two options:

The first one gets the indexed clinical data, which is the same data you get from "Download clinical" in the GDC data portal. This option gives less information, but it can be retrieved in a few minutes.

clin <- GDCquery_clinic("TCGA-BRCA", type = "clinical", save.csv = TRUE)
biospecimen <- GDCquery_clinic("TCGA-BRCA", type = "biospecimen", save.csv = TRUE)

The second option is to parse the clinical XML files, which gives you all the clinical information. However, downloading all the XML files will take some time.

query <- GDCquery(project = "TCGA-BRCA", data.category = "Clinical")
GDCdownload(query)
clinical <- GDCprepare_clinic(query, clinical.info = "patient")
clinical.drug <- GDCprepare_clinic(query, clinical.info = "drug")
clinical.radiation <- GDCprepare_clinic(query, clinical.info = "radiation")
clinical.admin <- GDCprepare_clinic(query, clinical.info = "admin")
clinical.followup <- GDCprepare_clinic(query, clinical.info = "follow_up")
