normalize-by-median.py does not accept reads on the stdin #633

Closed
macmanes opened this issue Oct 8, 2014 · 6 comments · Fixed by dib-lab/screed#11

Comments

@macmanes

macmanes commented Oct 8, 2014

Is there any reason why normalize-by-median.py could not (or should not) accept reads on stdin from interleave-reads.py?

 interleave-reads.py clam_trim_1P.gz clam_trim_2P.gz | \
 normalize-by-median.py -p -k 20 -C 50 -N 4 -x 15e9 --savetable normC50k20.kh -

This would be super. I could imagine streaming input reads from a number of different sources as well.

P.S. Do you want these messages here or on the mailing list?

@brtaylor92
Contributor

See #393, if I'm understanding your comment correctly.

@mr-c
Contributor

mr-c commented Oct 8, 2014

Hello @macmanes. We don't support the dash syntax yet; did you try specifying /dev/stdin as the input file and using the --out option to specify the destination?
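
For readers unfamiliar with the convention, the "dash syntax" means treating a filename of `-` as standard input. A minimal sketch of how a script could support it with `argparse` follows. It is an illustration only, not khmer's argument handling; `open_input` and the `stdin` parameter are hypothetical names introduced here.

```python
import argparse
import io
import sys


def open_input(name, stdin=None):
    """Return a binary stream for `name`, mapping '-' to standard input.

    Hypothetical helper for illustration; not khmer's or screed's API.
    """
    if name == "-":
        return stdin if stdin is not None else sys.stdin.buffer
    return open(name, "rb")


parser = argparse.ArgumentParser(description="dash-as-stdin sketch")
parser.add_argument("input_filename", help="read file, or '-' for stdin")

# Parse a fixed argument list so the sketch runs without a real pipe.
args = parser.parse_args(["-"])

# An in-memory stream stands in for the real standard input.
fake_stdin = io.BytesIO(b">seq1\nACGT\n")
stream = open_input(args.input_filename, stdin=fake_stdin)
header = stream.readline()
```

Accepting `-` only solves the naming half of the problem: the stream it returns is still not seekable when it comes from a pipe, so any code downstream that rewinds the file would need to change as well.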

@mr-c
Contributor

mr-c commented Oct 8, 2014

Here is the correct place, thanks!

@macmanes
Author

macmanes commented Oct 8, 2014

Sorry, /dev/stdin does not work either:

 interleave-reads.py clam_trim_1P.gz clam_trim_2P.gz \
 | normalize-by-median.py -p -k 20 -C 50 -N 4 -x 15e9 \
 --out norm.fq --savetable normC50k20.kh /dev/stdin

I get the following output and error:

|| This is the script 'interleave-reads.py' in khmer.
|| You are running khmer version 1.2-rc2-6-gc5dee21
|| You are also using screed version 0.7
Interleaving:
    clam_trim_1P.gz
    clam_trim_2P.gz
... 0 pairs
|| This is the script 'normalize-by-median.py' in khmer.
|| You are running khmer version 1.2-rc2-6-gc5dee21
|| You are also using screed version 0.7
PARAMETERS:
 - kmer size =    20            (-k)
 - n tables =     4             (-N)
 - min tablesize = 1.5e+10      (-x)
Estimated memory usage is 6e+10 bytes (n_tables x min_tablesize)
--------
WARNING: Input file /dev/stdin is empty 
making k-mer counting table
** ERROR: [Errno 29] Illegal seek
** Failed on /dev/stdin:
** ...dumping k-mer counting table to stdin.ct.failed

Interestingly, a 56 GB stdin.ct.failed file is produced.
The input files are really there, and the individual components work fine on their own.

@mr-c
Contributor

mr-c commented Oct 9, 2014

Ah, drat: this is a screed issue.

To reproduce:

In normalize-by-median.py, comment out lines 254 and 258-268, adjusting indentation as necessary. Then run:

$ mkfifo named-pipe
$ interleave-reads.py tests/test-data/paired.fq.1 tests/test-data/paired.fq.2 > named-pipe & 
$ gdb python
(gdb) run scripts/normalize-by-median.py -p --out streaming named-pipe
[...]
  File "/home/mcrusoe/khmer/gl-master/env/local/lib/python2.7/site-packages/screed-0.7.1-py2.7.egg/screed/openscreed.py", line 52, in open_reader
    fp.seek(0)
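
For context, the `Illegal seek` error is the ordinary behaviour of pipes and FIFOs: they are not seekable, so a call like `fp.seek(0)` fails with `ESPIPE` (errno 29 on Linux). The sketch below reproduces that failure on an anonymous pipe and shows one seek-free way to inspect the start of a stream, using `io.BufferedReader.peek`. It is not the screed code and not necessarily the approach taken in the eventual fix; `sniff_format` is a hypothetical helper name.

```python
import errno
import io
import os


def sniff_format(raw):
    """Guess the record format from the first byte without consuming it.

    Hypothetical helper for illustration only; not part of screed or khmer.
    """
    buffered = io.BufferedReader(raw)
    first = buffered.peek(1)[:1]  # peek() does not advance the stream
    if first == b"@":
        kind = "fastq"
    elif first == b">":
        kind = "fasta"
    else:
        kind = "unknown"
    return kind, buffered


# An anonymous pipe stands in for stdin or a named pipe.
read_fd, write_fd = os.pipe()
with os.fdopen(write_fd, "wb") as writer:
    writer.write(b"@read1\nACGT\n+\nIIII\n")

# Unbuffered, so seek() goes straight to the operating system.
reader = os.fdopen(read_fd, "rb", buffering=0)

# Seeking on a pipe fails with ESPIPE ("Illegal seek").
try:
    reader.seek(0)
    seek_errno = None
except OSError as exc:
    seek_errno = exc.errno

# Peeking works, and the peeked bytes are still there for the parser.
kind, stream = sniff_format(reader)
first_line = stream.readline()
stream.close()
```

Here `seek_errno` ends up equal to `errno.ESPIPE`, while `first_line` is still the complete first header line, because `peek` leaves the data in the buffer for the reader that follows.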

@mr-c
Contributor

mr-c commented Oct 10, 2014

I have fixed this in dib-lab/screed#11, but the fix isn't compatible with Python 2.6 at this time.
