Conversation

@wasade
Contributor

@wasade wasade commented May 27, 2015

Truncating is happening as expected now

@antgonza
Member

👍 if tests pass.

@adamrp
Contributor

adamrp commented May 27, 2015

code looks good, just testing for myself now

@ElDeveloper
Contributor

Looks good to me as well, would you mind adding a note to the ChangeLog? This seems like an important bug fix to mention there.

@adamrp
Contributor

adamrp commented May 27, 2015

When I test similar code on my system, I get an error from scikit-bio (0.2.3) not recognizing the filetype. The code I'm trying to execute is:

from h5py import File

from qiita_ware.demux import to_hdf5, fetch

hdf5_fp = '/tmp/variable.hdf5'
fastq_fp = '/tmp/variable.fq'

with open(fastq_fp, 'U') as fq, File(hdf5_fp, 'w') as h5:
    to_hdf5(fq, h5)

And the contents of /tmp/variable.fq are:

@a_1 orig_bc=abc new_bc=abc bc_diffs=0
ATG
+
ABC
@b_1 orig_bc=abw new_bc=wbc bc_diffs=4
ATG
+
DFG
@b_2 orig_bc=abw new_bc=wbc bc_diffs=4
ATGCG
+
DEF#G

So I'm not sure what's going on... I'm going to hold off investigating further until these tests finish.

@coveralls

Coverage Status

Coverage remained the same at 79.04% when pulling 454446b on wasade:demux_fetch_bug into 13e54c1 on biocore:master.

@adamrp
Contributor

adamrp commented May 27, 2015

well, I guess I'm doing something wrong since the tests appear to be working... but for the record the error I am getting is:

Traceback (most recent call last):
  File "/tmp/variable.py", line 9, in <module>
    to_hdf5(fq, h5)
  File "/Users/adam/git/qiita/qiita_ware/demux.py", line 355, in to_hdf5
    sample_stats, full_stats = _summarize_lengths(_per_sample_lengths(fp))
  File "/Users/adam/git/qiita/qiita_ware/demux.py", line 205, in _per_sample_lengths
    for record in load(fp):
  File "/Users/adam/git/scikit-bio/skbio/parse/sequences/factory.py", line 123, in load
    i_seqs, o_seqs = _determine_types_and_openers(seqs)
  File "/Users/adam/git/scikit-bio/skbio/parse/sequences/factory.py", line 43, in _determine_types_and_openers
    raise IOError("Unknown filetype for %s" % fpath)
IOError: Unknown filetype for @a_1 orig_bc=abc new_bc=abc bc_diffs=0

@adamrp
Contributor

adamrp commented May 27, 2015

Ah... to_hdf5 takes a filepath as its first argument and an open HDF5 file as its second. Corrected, and it works now!
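
For the record, here is a minimal sketch of why passing the file handle produces that particular error. `load_paths` and `KNOWN_EXTENSIONS` are hypothetical stand-ins, not scikit-bio's actual `load`; the sketch only assumes, based on the traceback above, that the loader treats its argument as one or more file paths and guesses the type from each path.

```python
import io

# Assumed list of recognized extensions, for illustration only.
KNOWN_EXTENSIONS = ('.fq', '.fastq', '.fna', '.fasta')


def load_paths(seqs):
    """Hypothetical stand-in for a path-based loader (not the real skbio API)."""
    # A single path is wrapped in a list; anything else is iterated as-is.
    paths = [seqs] if isinstance(seqs, str) else seqs
    checked = []
    for fpath in paths:
        fpath = fpath.rstrip('\n')
        if not fpath.endswith(KNOWN_EXTENSIONS):
            raise IOError("Unknown filetype for %s" % fpath)
        checked.append(fpath)
    return checked


# Passing a path works.
print(load_paths('/tmp/variable.fq'))

# Passing an open handle iterates over its lines, so the first FASTQ
# header is treated as if it were a file path.
handle = io.StringIO("@a_1 orig_bc=abc new_bc=abc bc_diffs=0\nATG\n+\nABC\n")
try:
    load_paths(handle)
except IOError as err:
    print(err)
```

In the script above, the corresponding fix is to open only the HDF5 file and call `to_hdf5(fastq_fp, h5)` with the path itself.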

adamrp added a commit that referenced this pull request May 27, 2015
@adamrp adamrp merged commit d435781 into qiita-spots:master May 27, 2015
@wasade
Copy link
Contributor Author

wasade commented May 27, 2015

I think the filename needs to be the input to to_hdf5, not the file handle

On Wed, May 27, 2015 at 5:30 PM, adamrp notifications@github.com wrote:

Merged #1214.


Reply to this email directly or view it on GitHub
#1214 (comment).

5 participants