qiime rdp classifier biom import failure #392

Closed
ghanesh opened this issue Oct 10, 2014 · 4 comments

Comments


ghanesh commented Oct 10, 2014

Hi Joey!
I commented on #272, but this issue might be worth its own thread.
Thanks for the details on solving the error in #272.
It would be great if you could help me with the following problem importing my QIIME files into phyloseq.
Thank you in advance!
My problem is similar, but it is only partly solved by that thread.
The files come from a QIIME pipeline in which we used the RDP classifier.
It was not possible to import the biom file directly. I believe this is because of the way the RDP classifier is used in QIIME. What worked so far was:

otufile = "otu_table_uclust.biom"
mapfile = "map_bac_all_corrected.txt"
trefile = "rep_set.tre"
read_biom(otufile)
biom object.
type: OTU table
matrix_type: sparse
1149 rows and 1504 columns
x = read_biom(otufile)
otumat = as(biom_data(x), "matrix")
OTU = otu_table(otumat, taxa_are_rows=TRUE)
taxmat = as.matrix(observation_metadata(x), rownames.force=TRUE, byrows=FALSE)

But here the problems start. The last command produces a list with only one column:
row.names V1
1 denovo0 Bacteria
2 denovo1 Bacteria
3 denovo2 c("Bacteria", "Proteobacteria", "Alphaproteobacteria", "Rickettsiales", "SAR11", "Pelagibacter")
4 denovo3 c("Bacteria", "Bacteroidetes", "Flavobacteria", "Flavobacteriales", "Flavobacteriaceae")
5 denovo4 c("Bacteria", "Proteobacteria")

Therefore the command

TAX = tax_table(taxmat)
produces the following:

Error in validObject(.Object) : invalid class “taxonomyTable” object:
Non-character matrix provided as Taxonomy Table.

Taxonomy is expected to be characters.
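
The error is consistent with the one-column list shown above: tax_table() wants a character matrix with one column per rank, while each entry here is a lineage vector of a different length. A minimal sketch of one possible workaround follows, assuming the metadata can be treated as a named list of character vectors. The `lineages` object and the `Rank1`..`RankN` column names are illustrative stand-ins, not something taken from the actual file.

```r
# Toy stand-in for the per-OTU lineages printed above (assumption: a named
# list of character vectors of unequal length).
lineages <- list(
  denovo0 = "Bacteria",
  denovo2 = c("Bacteria", "Proteobacteria", "Alphaproteobacteria",
              "Rickettsiales", "SAR11", "Pelagibacter"),
  denovo4 = c("Bacteria", "Proteobacteria")
)

# Deepest classification found in any lineage.
n_ranks <- max(lengths(lineages))

# Pad every lineage with NA up to n_ranks, then stack one row per OTU.
taxmat <- t(vapply(
  lineages,
  function(v) c(as.character(v), rep(NA_character_, n_ranks - length(v))),
  character(n_ranks)
))
colnames(taxmat) <- paste0("Rank", seq_len(n_ranks))

print(taxmat)

# Not run here (requires phyloseq): a character matrix like this is the
# kind of input tax_table() accepts.
# TAX <- phyloseq::tax_table(taxmat)
```

Padding with NA keeps shallow classifications such as "Bacteria" only, instead of dropping those OTUs.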

I believe this might be the reason why I got stuck in the first place when using:
MyExp <- import_qiime(otufile, mapfile, trefile)
which fails with:

Error in fread(input = paste0(x, collapse = "\n"), sep = "\t", header = TRUE, :
'skip' must be a length 1 vector of type numeric or integer >=-1, or single character search string
In addition: Warning messages:
1: In readLines(file) :
incomplete final line found on 'otu_table_uclust.biom'
2: In max(which(substr(x[1:25L], 1, 1) == "#")) :
no non-missing arguments to max; returning -Inf

3: running command 'C:\Windows\system32\cmd.exe /c ({"id": "None","format": "Biological Observation Matrix 1.0.0","format_url": "http://biom-format.org","type": "OTU table","generated_by": "QIIME 1.7.0-dev","date": "2014-10-09T16:21:41.267897","matrix_type": "sparse","matrix_element_type": "int","shape": [1149, 1504],"data":
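
The warnings above suggest that import_qiime() tried to read the file as a tab-delimited table with "#" header lines, while the content echoed in the third warning is JSON (biom format 1.0.0). A small sketch for checking what kind of file is on disk before choosing an importer is given below. `guess_otu_format()` is a hypothetical helper written for this illustration, not part of phyloseq or the biom package, and the three labels it returns are made up here.

```r
# Hypothetical helper: guess the flavour of an OTU table file from its
# first bytes. Returns one of "biom-hdf5", "biom-json", "qiime-tabular",
# or "unknown".
guess_otu_format <- function(path) {
  # HDF5 files (biom format 2.x) start with a fixed 8-byte signature.
  head_bytes <- readBin(path, what = "raw", n = 8L)
  hdf5_magic <- as.raw(c(0x89, 0x48, 0x44, 0x46, 0x0d, 0x0a, 0x1a, 0x0a))
  if (length(head_bytes) == 8L && all(head_bytes == hdf5_magic)) {
    return("biom-hdf5")
  }

  first_line <- readLines(path, n = 1L, warn = FALSE)
  if (length(first_line) == 1L && grepl("^\\s*\\{", first_line)) {
    return("biom-json")      # JSON object, as in biom format 1.x
  }
  if (length(first_line) == 1L && startsWith(first_line, "#")) {
    return("qiime-tabular")  # legacy tab-delimited table with "#" header
  }
  "unknown"
}

# Usage on a throwaway file that mimics the start of a JSON biom file.
json_like <- tempfile(fileext = ".biom")
writeLines('{"id": "None", "format": "Biological Observation Matrix 1.0.0"}',
           json_like)
print(guess_otu_format(json_like))
```

If the result is one of the biom labels, import_biom() is likely the better fit than import_qiime().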

Any ideas?
Cheers!
Alexander

Author

ghanesh commented Oct 10, 2014

Maybe this problem is also linked to #357.

Author

ghanesh commented Oct 15, 2014

Does anyone have an idea?

Owner

joey711 commented Jul 13, 2015

@ghanesh / Alexander, did you solve this issue yet? If so, what did it?

What version of biom-format was this file? Version 1 (JSON) or Version 2 (HDF5)?

Thanks and sorry for the delay.

joey

Author

ghanesh commented Jul 23, 2015

@joey711 Hello,
Well, I was able to work around the problem with:
otufile = "otu_table_w_tax.biom"
mapfile = "Map_final.txt"
trefile = "arc_rep_set.tre"
envir=import_qiime_sample_data(mapfile)
myData = import_biom(otufile, trefile)
myData = merge_phyloseq(myData,envir)

which gave me this:
phyloseq-class experiment-level object
otu_table() OTU Table: [ 1949 taxa and 118 samples ]
sample_data() Sample Data: [ 118 samples by 28 sample variables ]
tax_table() Taxonomy Table: [ 1949 taxa by 7 taxonomic ranks ]
phy_tree() Phylogenetic Tree: [ 1949 tips and 1947 internal nodes ]

with the following:
Taxonomy Table: [6 taxa by 7 taxonomic ranks]:
Rank1 Rank2 Rank3 Rank4 Rank5 Rank6
3354502 "k__Archaea" "p__Crenarchaeota" "c__Thaumarchaeota" "o__Cenarchaeales" "
Rank7
3354502 "s__"

therefore
rank_names(myData)
results in:

[1] "Rank1" "Rank2" "Rank3" "Rank4" "Rank5" "Rank6" "Rank7"

I then solved the headers issue with:
colnames(tax_table(myData)) = c( "Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species")
as you suggested in another reply (issue 162, if I'm not mistaken).
What I have not managed to get rid of are the taxonomic rank prefixes and the "__".
I know it should be possible to tweak this, but at the moment I don't really know how.
Do you have a suggestion for me?

Thank you for your support!
Alex
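
For the prefix question above, a minimal base-R sketch: it strips a leading single-letter rank tag followed by "__" from every cell of a character matrix and turns the cells left empty into NA. The toy matrix only reuses values visible in the printed table (the ranks truncated in that output are left out), and the pattern `^[a-z]__` is an assumption about how the prefixes look. phyloseq also provides parse_taxonomy_greengenes(), which can be passed as the parseFunction argument of import_biom() and may make this cleanup unnecessary.

```r
# Toy taxonomy matrix built from the values visible in the output above.
taxmat <- matrix(
  c("k__Archaea", "p__Crenarchaeota", "c__Thaumarchaeota",
    "o__Cenarchaeales", "s__"),
  nrow = 1,
  dimnames = list("3354502",
                  c("Kingdom", "Phylum", "Class", "Order", "Species"))
)

# Assigning into taxmat[] keeps the dimensions and dimnames; sub() alone
# would return a plain character vector.
taxmat[] <- sub("^[a-z]__", "", taxmat)

# Cells that held only a prefix (for example "s__") are now empty strings.
taxmat[which(taxmat == "")] <- NA_character_

print(taxmat)

# Not run here (requires phyloseq and the myData object from the post):
# tm <- as(tax_table(myData), "matrix")
# tm[] <- sub("^[a-z]__", "", tm)
# tm[which(tm == "")] <- NA_character_
# tax_table(myData) <- tax_table(tm)
```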

@joey711 joey711 closed this as completed May 6, 2016