Error in Import_biom #674

Closed
Imee929 opened this issue Oct 9, 2016 · 4 comments

Comments

@Imee929

Imee929 commented Oct 9, 2016

Hi Joey,
I am having trouble importing biom files into phyloseq. I tried parsing with both the qiime and greengenes parse functions and got two different errors. I have read and followed what the forums suggested, but I am still unable to import the files into phyloseq. Can you help me figure out the issue?

file: table.from_txt_json.biom

import_biom(a, parseFunction=parse_taxonomy_qiime)
Error in strsplit(char.vec, ";", TRUE) : non-character argument
import_biom(a, parseFunction=parse_taxonomy_greengenes)
Error in `colnames<-`(`*tmp*`, value = c("ta1", "ta0")) :
  length of 'dimnames' [2] not equal to array extent

Thanks,
Imee

@fconstancias

Hello,
It seems like you successfully converted your biom table into JSON, right?

biom convert -i otu_table.biom -o otu_table_json.biom --table-type="OTU table" --to-json

Could you try this command?
otutable <- import_biom(BIOMfilename = 'table.from_txt_json.biom')

Did you convert to text and then to json? If so, you might have lost the taxonomic information in your table. Try to convert directly to json.
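One way to check whether the taxonomy survived the conversion is to look at the JSON file directly. The following is a minimal sketch, assuming the file follows the BIOM 1.0 JSON layout (a top-level "rows" list whose entries carry an "id" and an optional "metadata" object). The function name `count_taxonomy_rows` and the tiny example table are hypothetical and only for illustration; this is not part of phyloseq or the biom tool.

```python
import json


def count_taxonomy_rows(biom_json_text):
    """Return (rows_with_taxonomy, total_rows) for a BIOM 1.0 JSON document."""
    table = json.loads(biom_json_text)
    rows = table.get("rows", [])
    with_taxonomy = 0
    for row in rows:
        # "metadata" may be null in the JSON, which json.loads turns into None.
        metadata = row.get("metadata") or {}
        # Count only rows whose "taxonomy" entry is present and non-empty.
        if metadata.get("taxonomy"):
            with_taxonomy += 1
    return with_taxonomy, len(rows)


# Hypothetical two-row table: one observation kept its taxonomy, one lost it.
example = json.dumps({
    "rows": [
        {"id": "OTU_1", "metadata": {"taxonomy": ["k__Bacteria", "p__Firmicutes"]}},
        {"id": "OTU_2", "metadata": None},
    ],
    "columns": [{"id": "Sample_A", "metadata": None}],
})

print(count_taxonomy_rows(example))
```

For a real file, read its contents with `open(path).read()` and pass the text to the function. If the first number is zero, the taxonomy was probably dropped somewhere in the conversion.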

Hopefully that’s helpful
Cheers

@Imee929
Author

Imee929 commented Oct 11, 2016

I am still getting an error when importing the biom file.

@joey711
Owner

joey711 commented Nov 7, 2016

That error is likely to occur when one of the tables (e.g. the taxonomy table or OTU table) does not have IDs or, most often, when the IDs between the tables do not match. It might also be that one of the tables is empty. It's hard to know without a reproducible example that generates this error.

To avoid surprises later in analysis, phyloseq requires that IDs along the same dimensions match exactly. During import or object construction, only IDs that match across data components are retained. When there are no IDs that match, you can get weird errors about one of the components having length or dim zero. If I can find out where in the data import process this is happening in your case, I can improve the error messages to be more informative about the problem.
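The matching rule described above can be illustrated with a small sketch. This is not phyloseq's implementation, only a model of the behavior under the assumption that "retained" means the plain intersection of ID sets; the helper `shared_ids` and all IDs are made up for the example.

```python
def shared_ids(otu_ids, tax_ids):
    """Return the IDs present in both components, in the OTU table's order."""
    tax_set = set(tax_ids)
    return [otu_id for otu_id in otu_ids if otu_id in tax_set]


otu_ids = ["OTU_1", "OTU_2", "OTU_3"]
tax_ids_ok = ["OTU_2", "OTU_3", "OTU_4"]   # partial overlap
tax_ids_bad = ["otu1", "otu2", "otu3"]     # different naming: nothing matches

print(shared_ids(otu_ids, tax_ids_ok))
print(shared_ids(otu_ids, tax_ids_bad))
```

In the second case nothing is left after matching, which corresponds to the situation where a component ends up with length or dim zero and downstream steps fail with less obvious messages.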

@Imee929 can you share some more information for diagnosis? What are the dimensions of the data you expect in those files? Is there a publicly available or shareable file you can post that reproduces this same error?

I will close for now, but if you can provide this I will re-open and we can try to figure out what is going on. Rest assured that properly formatted BIOM files with properly formatted IDs are successfully imported by phyloseq without an error.

joey711 closed this as completed Nov 7, 2016

@morien

morien commented Dec 5, 2017

@joey711 I've got this same problem with an OTU table. I started with a large meta-dataset, which I split using phyloseq into several component datasets. I wrote the OTU tables as .txt files with the taxonomy in the rightmost column. These files look fine; there are no special characters or carriage returns hiding in them. The taxonomies are delimited, but that shouldn't affect the output, as they are the same in the large dataset. I use the 'biom convert' command of QIIME to convert these to JSON-formatted .biom files with taxonomy preserved, but when I import them with import_biom in phyloseq, I get the same error the OP has above. I've done this 8 ways now, checking for any irregularities, and I still get the same error. I'm attaching the file that reproduces the error so you can help me debug this. I know I can import plain text with phyloseq, but some of our lab aren't as computer savvy and prefer to follow set instructions, which involve importing .biom files.

TEST.biom.zip

This is my import code:
rawdata <- import_biom(file.path("TEST.biom"), parallel = TRUE, trim_ws = TRUE)
and here is the error output:
Error in dimnames(x) <- dn : length of 'dimnames' [2] not equal to array extent
In addition: There were 50 or more warnings (use warnings() to see the first 50)
50: In parseFunction(i$metadata$taxonomy) : Empty taxonomy vector encountered.

I know that the taxonomy vectors aren't actually empty, and the issue is with how they're processed by import_biom, because I can take this file and convert it to .txt with this command, and the taxonomy strings will be right there in the last column:
biom convert -i TEST.biom -o TEST.txt --to-tsv --process-obs-metadata taxonomy --header-key Taxonomy
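To see how the taxonomy is actually stored in a JSON .biom file, it may help to tally the type of each row's taxonomy value. The sketch below assumes the BIOM 1.0 JSON layout ("rows" entries with an optional "metadata" object holding a "taxonomy" key). The function name `taxonomy_kinds` and the example table are hypothetical, and the sketch makes no claim about which form import_biom requires; it only reports what is in the file.

```python
import json
from collections import Counter


def taxonomy_kinds(biom_json_text):
    """Tally how each row stores its taxonomy: list, string, empty, missing, other."""
    table = json.loads(biom_json_text)
    kinds = Counter()
    for row in table.get("rows", []):
        metadata = row.get("metadata") or {}   # metadata may be null
        value = metadata.get("taxonomy")
        if value is None:
            kinds["missing"] += 1
        elif value == "" or value == []:
            kinds["empty"] += 1
        elif isinstance(value, str):
            kinds["string"] += 1               # one delimited string
        elif isinstance(value, list):
            kinds["list"] += 1                 # one entry per rank
        else:
            kinds["other"] += 1
    return dict(kinds)


# Hypothetical table mixing the three common cases.
example = json.dumps({
    "rows": [
        {"id": "OTU_1", "metadata": {"taxonomy": ["k__Bacteria", "p__Firmicutes"]}},
        {"id": "OTU_2", "metadata": {"taxonomy": "k__Bacteria; p__Firmicutes"}},
        {"id": "OTU_3", "metadata": None},
    ],
})

print(taxonomy_kinds(example))
```

A file whose rows are all "missing" or "empty" would be consistent with the "Empty taxonomy vector encountered" warnings, while a mix of kinds suggests the conversion handled some rows differently from others.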
