Final visualisation step #10

Closed
yashsondhi opened this issue Apr 1, 2020 · 5 comments

Comments

@yashsondhi

Hi, I got the volcano plot working, but the heatmap is not showing up. I am mapping RNA-seq reads to a genome.

This is the output when I run the script directly with Rscript:
(rasflow) [yashsondhi@login4 RASflow]$ Rscript scripts/visualize.R output/test_25_march/moth/genome/dea/countGroup output/test_25_march/moth/genome/dea/DEA output/test_25_march/moth/genome/dea/visualization
Loading required package: plotscale
hash-3.0.1 provided by Decision Patterns

Loading required package: GenomicFeatures
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

anyDuplicated, append, as.data.frame, basename, cbind, colMeans,
colnames, colSums, dirname, do.call, duplicated, eval, evalq,
Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply,
lengths, Map, mapply, match, mget, order, paste, pmax, pmax.int,
pmin, pmin.int, Position, rank, rbind, Reduce, rowMeans, rownames,
rowSums, sapply, setdiff, sort, table, tapply, union, unique,
unsplit, which, which.max, which.min

Loading required package: S4Vectors
Loading required package: stats4

Attaching package: ‘S4Vectors’

The following objects are masked from ‘package:hash’:

values, values<-

The following object is masked from ‘package:base’:

expand.grid

Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: GenomicRanges
Loading required package: AnnotationDbi
Loading required package: Biobase
Welcome to Bioconductor

Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.

Attaching package: ‘AnnotationDbi’

The following objects are masked from ‘package:hash’:

keys, keys<-

Loading required package: ggplot2
Loading required package: ggrepel
Querying chunk 1
Querying chunk 2
Querying chunk 3
Querying chunk 4
Querying chunk 5
Querying chunk 6
Querying chunk 7
Querying chunk 8
Querying chunk 9
Querying chunk 10
Querying chunk 11
Querying chunk 12
Querying chunk 13
Querying chunk 14
Querying chunk 15
Querying chunk 16
Finished
Pass returnall=TRUE to return lists of duplicate or missing query terms.
Error in .request.get(mygene, paste("/query/", sep = ""), params) :
Request returned unexpected status code:
Response [http://mygene.info/v3/query/?fields=symbol]
Date: 2020-04-01 15:38
Status: 400
Content-Type: application/json; charset=UTF-8
Size: 65 B
{
"success": false,
"error": "Missing required parameters."
Calls: plot.volcano.heatmap ... query -> query -> query -> .request.get -> .request.get
Execution halted

@zhxiaokang
Owner

Hi, the error indicates that the problem occurs when calling the mygene package. It is used twice in scripts/visualize.R: on line 57, gene.symbol.dea.all <- queryMany(gene.id.dea, scopes = 'ensembl.gene', fields = 'symbol'), and on line 106, gene.symbol.norm.table <- queryMany(gene.id.norm.table, scopes = 'ensembl.gene', fields = 'symbol')$symbol. Since you said the volcano plot was produced (on line 88), the error must come from line 106.

Reading the code in scripts/visualize.R from line 90 to line 106, it's clear that the problem lies in the normalized count tables, control_gene_norm.tsv and treat_gene_norm.tsv. Could you check them, for example by comparing them with the example data data/example/dea/countGroup/oil_control_gene_norm.tsv and data/example/dea/countGroup/oil_low_gene_norm.tsv?
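One way to make that comparison less manual is to reduce every gene ID to a coarse "shape" and compare the shapes found in your table with those in the example table. The sketch below is only an illustration in Python, run outside visualize.R. It assumes the gene IDs sit in the first column of a tab-separated file with one header row, which the thread does not confirm. The function names and the sample IDs are hypothetical.

```python
import csv
import io
import re
from collections import Counter


def id_shape(gene_id):
    """Collapse letter runs to 'A' and digit runs to '9', keeping punctuation."""
    shape = re.sub(r"[A-Za-z]+", "A", gene_id)
    return re.sub(r"[0-9]+", "9", shape)


def first_column_shapes(tsv_text, has_header=True):
    """Count the ID shapes found in the first column of a TSV table.

    Assumption: the gene ID is the first field of every data row.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    rows = [row for row in reader if row]
    if has_header:
        rows = rows[1:]
    return Counter(id_shape(row[0]) for row in rows)


# Made-up sample data, not taken from the real tables.
ours = "\ts1\ts2\ngene:ABC123\t5\t7\ngene:ABC456\t1\t0\n"
reference = "\ts1\ts2\nABC123\t4\t6\n"

print(first_column_shapes(ours))       # Counter({'A:A9': 2})
print(first_column_shapes(reference))  # Counter({'A9': 1})
```

For real files, read the text with open(path).read() and pass it in. A shape that appears in your table but not in the example table points at the IDs worth inspecting.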

@yashsondhi
Author

Hi,
I compared the files and found that the word gene: has been added in front of the gene IDs. If I remove all of these, the script runs. I will work on removing them in R. I could fix it easily in Python, but if you have any suggestions for doing it in the script, that would be helpful. Thank you.
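For the Python route, a minimal sketch of the clean-up could look like the following. It assumes the gene ID is the first tab-separated field on each line and that the unwanted text is exactly the literal prefix gene: (both are assumptions about the table layout). The function name strip_id_prefix and the sample rows are hypothetical.

```python
def strip_id_prefix(tsv_text, prefix="gene:"):
    """Remove `prefix` from the first tab-separated field of every line.

    Lines whose first field does not start with the prefix (for example
    the header row) are passed through unchanged.
    """
    cleaned = []
    for line in tsv_text.splitlines():
        first, sep, rest = line.partition("\t")
        if first.startswith(prefix):
            first = first[len(prefix):]
        cleaned.append(first + sep + rest)
    # Preserve a trailing newline if the input had one.
    trailing = "\n" if tsv_text.endswith("\n") else ""
    return "\n".join(cleaned) + trailing


# Made-up sample rows, not taken from the real count tables.
sample = "id\ts1\ngene:ABC123\t5\nXYZ9\t7\n"
print(strip_id_prefix(sample), end="")
```

To apply it to a table such as control_gene_norm.tsv, read the file's text, pass it through the function, and write the result to a new file so the original stays available for comparison.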

@zhxiaokang
Owner

I see. The extra word "gene" before the gene ID confused the program. RASflow is not designed to add that extra word, so I can't figure out why it is happening without looking at your code and data. Otherwise, if a bit of manual work (removing it by hand) is OK for you, that's fine.

@yashsondhi
Author

yashsondhi commented Apr 19, 2020 via email

@zhxiaokang
Owner

Sounds great! I'll close the issue for now. Feel free to reopen it whenever necessary.
