AttributeError: 'str' object has no attribute 'decode' #5

joelnitta · 2021-06-24T09:18:55Z

Hello!

I was wondering if I could get some help with the error AttributeError: 'str' object has no attribute 'decode', which occurs when running tetrad on an hdf5 file from ipyrad:

(tetrad) root@29ea72f56332:/wd/intermediates/ipyrad/hymeno-v2_outfiles# tetrad -i hymeno-v2.snps.hdf5 -o outdir -n test

-------------------------------------------------------
tetrad [v.0.9.13]
Quartet inference from phylogenetic invariants
-------------------------------------------------------
tetrad instance: test
Traceback (most recent call last):
  File "/opt/conda/envs/tetrad/bin/tetrad", line 11, in <module>
    sys.exit(main())
  File "/opt/conda/envs/tetrad/lib/python3.7/site-packages/tetrad/__main__.py", line 287, in main
    CLI()
  File "/opt/conda/envs/tetrad/lib/python3.7/site-packages/tetrad/__main__.py", line 159, in __init__
    self.get_data()
  File "/opt/conda/envs/tetrad/lib/python3.7/site-packages/tetrad/__main__.py", line 233, in get_data
    load=self.load,
  File "/opt/conda/envs/tetrad/lib/python3.7/site-packages/tetrad/tetrad.py", line 178, in __init__
    self._init_seqarray()
  File "/opt/conda/envs/tetrad/lib/python3.7/site-packages/tetrad/tetrad.py", line 336, in _init_seqarray
    names = [i.decode() for i in io5["snps"].attrs["names"]]
  File "/opt/conda/envs/tetrad/lib/python3.7/site-packages/tetrad/tetrad.py", line 336, in <listcomp>
    names = [i.decode() for i in io5["snps"].attrs["names"]]
AttributeError: 'str' object has no attribute 'decode'

Here is a link to the hdf5 file on Dropbox if it helps: https://www.dropbox.com/s/be0cdol47f4renx/hymeno-v2.snps.hdf5?dl=0

The snps.hdf5 file was generated with ipyrad v.0.9.65.

Thanks!

isaacovercast · 2021-07-01T15:14:37Z

The decode error comes from the difference between bytes and text (str) string representations: in Python 3, only bytes objects have a .decode() method. When I have seen this in the past, it was caused by running the ipyrad assembly with Python 2 and then running downstream tools (ipyrad/tetrad) with Python 3. Can you please try re-running step 7 with ipyrad after verifying that your Python version is 3.7 or greater? If you are going to re-run step 7, I would also suggest updating to the newest version of ipyrad (it never hurts). Good luck.
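
To illustrate the distinction (a minimal sketch, not tetrad code; the sample name is made up): in Python 3, bytes objects have a .decode() method and str objects do not, so calling .decode() on a name that is already text raises the same kind of error as in the traceback in the original report.

```python
# Hypothetical sample name, used only for illustration.
raw_name = b"sample_A"   # bytes: what the list comprehension in tetrad.py expects
text_name = "sample_A"   # str: what the reported error implies was actually read

# Decoding bytes yields a str.
decoded = raw_name.decode()

# A str has no .decode() method in Python 3, so this raises AttributeError.
try:
    text_name.decode()
    error_message = None
except AttributeError as err:
    error_message = str(err)

print(decoded)
print(error_message)
```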

joelnitta · 2021-07-02T08:16:12Z

Thanks for the suggestions.

I ran ipyrad in a Docker container (tag 0.9.65--pyh3252c3a_0), which is based on the bioconda package (recipe here).

Running these commands in the container indicates it is using python3:

bash-4.2# which python
/usr/local/bin/python
bash-4.2# python --version
Python 3.7.9

I tried steps 2-7 with the most recent version of ipyrad (tag 0.9.81--pyh5e36f6f_0), but got the same error from tetrad.

isaacovercast · 2021-07-02T16:20:05Z

Ok. I downloaded and ran this Docker image, and it seems fine. The Docker container doesn't include tetrad, so how are you installing and running it?

isaacovercast · 2021-07-02T17:00:15Z

Ok, well I fixed it but I don't have permissions on this repository. Here is the diff for the working version:

diff --git a/tetrad/tetrad.py b/tetrad/tetrad.py
index a0c70c3..9ed2e9a 100644
--- a/tetrad/tetrad.py
+++ b/tetrad/tetrad.py
@@ -316,7 +316,10 @@ class Tetrad(object):
         # reloading info from hdf5
         assert ".snps.hdf5" in self.files.data, "data file is not .snps.hdf5"
         io5 = h5py.File(self.files.data, 'r')
-        names = [i.decode() for i in io5["snps"].attrs["names"]]
+        try:
+            names = [i.decode() for i in io5["snps"].attrs["names"]]
+        except AttributeError:
+            names = [i for i in io5["snps"].attrs["names"]]
         self.samples = names
 
 
@@ -333,7 +336,10 @@ class Tetrad(object):
         # get data shape from io5 input file       
         assert ".snps.hdf5" in self.files.data, "data file is not .snps.hdf5"
         io5 = h5py.File(self.files.data, 'r')
-        names = [i.decode() for i in io5["snps"].attrs["names"]]
+        try:
+            names = [i.decode() for i in io5["snps"].attrs["names"]]
+        except AttributeError:
+            names = [i for i in io5["snps"].attrs["names"]]
         self.samples = names
         ntaxa = len(names)
         nsnps = io5["snps"].shape[1]

You can clone the repo, apply the diff, and then run pip install -e . in the top-level directory of the tetrad repository, and it should work fine.
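
An alternative to the try/except in the diff, shown only as a sketch and not as the change that was eventually merged: decode each name individually if it is bytes, and pass it through if it is already str. The helper name normalize_names and the sample names are hypothetical; in tetrad the input would be io5["snps"].attrs["names"].

```python
def normalize_names(names):
    """Return sample names as a list of str, decoding any bytes entries."""
    return [
        name.decode() if isinstance(name, bytes) else str(name)
        for name in names
    ]


# Made-up names standing in for the values read from the HDF5 attribute.
from_bytes = normalize_names([b"sample_A", b"sample_B"])
from_text = normalize_names(["sample_A", "sample_B"])
print(from_bytes == from_text)
```

Checking the type per element also handles a list that mixes bytes and str, which a single try/except around the whole comprehension would leave undecoded.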

joelnitta · 2021-07-02T22:24:30Z

Thanks!

I was running tetrad in a custom Docker image (which has since been updated to apply the patch).

However, I now have a different problem, which I think is just my incomplete understanding of how tetrad works. It ran successfully, but it prints "0 bootstrap result trees already exist for test." and produces no other output. What do I need to do for tetrad to produce a tree?

(base) joelnitta@beyond:~/hymeno-migseq/test$ docker run --rm -v /home/joelnitta/hymeno-migseq/test:/wd -w /wd joelnitta/tetrad:0.9.14-patch tetrad -i hymeno-v2.snps.hdf5

-------------------------------------------------------
tetrad [v.0.9.14] 
Quartet inference from phylogenetic invariants
-------------------------------------------------------
tetrad instance: test
loading snps array [134 taxa x 206615 snps]
max unlinked SNPs per quartet [nloci]: 18785
quartet sampler [random, nsamples**2.8]: 903427 / 12840751
0 bootstrap result trees already exist for test.

joelnitta · 2021-07-03T00:05:40Z

Never mind... it was indeed just my usage of tetrad. Once I added -b 100, the analysis produced output as expected. (Suggestion: if tetrad -i input without additional arguments doesn't actually do anything, you may want to change that example in the README.)

I am leaving this open for now because it seems to me it shouldn't be considered resolved until the patch gets merged.

isaacovercast · 2021-12-01T07:03:50Z

Good directions for installing the patch from @bmichanderson:

First, you may want to create a new environment without tetrad, or uninstall it from the current one, and make sure it isn't installed anywhere else (typing tetrad --version shouldn't give anything). Then you can install it as I mentioned. It doesn't matter where you clone the repository, since you can delete it afterward; the install puts the program and scripts in the right places.
Starting from your home directory (cd ~), you can run the commands I put above. First clone the repository, then change into the repository directory (tetrad). Now make a file called mydiff (or whatever you like), paste in the text from the comment I linked, and save it. Then apply it with git apply mydiff (or git apply <your_file> if you used a different name). After it completes, while you are still in the tetrad directory, run python setup.py install.

isaacovercast mentioned this issue Jul 20, 2021

vcf_to_hdf5 and tetrad: 'str' object has no attribute 'decode' dereneaton/ipyrad#451

Closed

isaacovercast closed this as completed in 461fe50 Nov 29, 2022
