load-into-counting.py: ValueError: bigcount is not supported for this storage #1722

Closed
sjackman opened this issue Jun 7, 2017 · 10 comments

Comments

@sjackman

sjackman commented Jun 7, 2017

load-into-counting.py with --small-count fails with ValueError: bigcount is not supported for this storage. Adding --no-bigcount doesn't seem to help.

❯❯❯ load-into-counting.py --small-count -x 1e7 khmer2 foo.fq.gz
PARAMETERS:
 - kmer size =     32 		(-k)
 - n tables =      4 		(-N)
 - max tablesize = 1e+07 	(-x)
Estimated memory usage is 0.0 Gb (4e+07 bytes = 4 bytes x 1e+07 entries / 1 entries per byte)
--------
Saving k-mer countgraph to khmer2
Loading kmers from sequences in ['foo.fq.gz']
making countgraph
Traceback (most recent call last):
  File "/Users/sjackman/.homebrew/bin/load-into-counting.py", line 221, in <module>
    main()
  File "/Users/sjackman/.homebrew/bin/load-into-counting.py", line 139, in main
    countgraph.set_use_bigcount(args.bigcount)
ValueError: bigcount is not supported for this storage.
❯❯❯ load-into-counting.py --version
khmer 2.1.1
@betatim
Member

betatim commented Jun 7, 2017

I will take a look at this.

@betatim
Member

betatim commented Jun 7, 2017

Thanks for catching this!

@sjackman
Author

sjackman commented Jun 7, 2017

No worries! Is there a workaround for version 2.1.1? load-into-counting.py --small-count --no-bigcount -x 1e7 khmer2 foo.fq.gz didn't work for me.

@sjackman
Author

sjackman commented Jun 7, 2017

Or can you suggest a different command that I can use to test the memory usage of --small-count vs the default?

@standage
Member

standage commented Jun 7, 2017

I'm porting this to the v2.1 maintenance branch now. Will cut a bugfix release once the JOSS review is complete.

@sjackman
Author

sjackman commented Jun 7, 2017

For the JOSS review item "Performance: Have any performance claims of the software been confirmed?", I'm trying to confirm the reduced memory usage of --small-count vs the default. Can you suggest a command?

@standage
Member

standage commented Jun 7, 2017

Once #1726 is merged, you should be able to run the following commands on branch maint/2.1 (from which release version 2.1.2 will be cut when the JOSS review is complete).

Same graph size, same FPR, less memory

scripts/load-into-counting.py -x 2e6 default.count tests/test-data/test-reads.fq.gz
scripts/load-into-counting.py -x 2e6 small.count --small-count tests/test-data/test-reads.fq.gz

The reduced memory usage should be reflected in the configuration dump printed to the screen, as well as in the count graph file sizes (default.count vs small.count).

Same amount of memory, larger graph size, smaller FPR

load-into-counting.py -M 4M default.cg tests/test-data/test-reads.fq.gz
load-into-counting.py -M 4M small.cg --small-count tests/test-data/test-reads.fq.gz

The output file sizes should be the same, but the reported table size for the --small-count run should be larger, and the reported FPR smaller.
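
For anyone who wants to sanity-check the numbers before running the scripts, here is a rough sketch of the arithmetic behind both scenarios. It follows the formula shown in the configuration dump (tables x entries / entries per byte). Two things are assumptions on my part rather than anything stated in this thread: that the "4" in that dump line is the number of tables (-N), and that --small-count packs 2 entries per byte. The function names are made up for illustration and are not part of khmer.

```python
def estimated_bytes(table_size, n_tables=4, entries_per_byte=1):
    """Estimated memory in bytes: n_tables * table_size / entries_per_byte.

    Mirrors the formula printed in the configuration dump; not khmer code.
    """
    return n_tables * table_size / entries_per_byte


def table_size_for_budget(memory_bytes, n_tables=4, entries_per_byte=1):
    """Entries per table that fit in a fixed memory budget (the -M case)."""
    return memory_bytes * entries_per_byte / n_tables


# Scenario 1: same table size (-x 2e6), compare memory.
default_mem = estimated_bytes(2e6)                    # 1 entry per byte
small_mem = estimated_bytes(2e6, entries_per_byte=2)  # assumed 2 entries per byte

# Scenario 2: same memory budget (4e6 bytes assumed for -M 4M), compare table size.
default_size = table_size_for_budget(4e6)
small_size = table_size_for_budget(4e6, entries_per_byte=2)

print(default_mem, small_mem)
print(default_size, small_size)
```

Under those assumptions the first pair differs by a factor of two in memory, and the second pair differs by a factor of two in table size for the same memory.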

@sjackman
Author

sjackman commented Jun 7, 2017

Thanks. I've subscribed to PR #1726.

@sjackman
Author

sjackman commented Jun 9, 2017

Your suggested commands worked on the maint/2.1 branch. Thanks. I confirmed that the memory usage with --small-count is half that of the default.
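
In case it helps anyone repeating this check, the on-disk comparison can be done with a few lines of standard-library Python. This is only a sketch: the helper name is made up, and the paths are whatever output names were passed to load-into-counting.py. I would expect the ratio to be close to, but not exactly, one half if the files carry any header or metadata, which is an assumption rather than something verified here.

```python
import os


def size_ratio(small_path, default_path):
    """Return size of the --small-count file divided by the default file size."""
    return os.path.getsize(small_path) / os.path.getsize(default_path)


if __name__ == "__main__":
    # Example paths; substitute the output names used on the command line.
    if os.path.exists("small.count") and os.path.exists("default.count"):
        print(size_ratio("small.count", "default.count"))
```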

@standage
Member

standage commented Jun 9, 2017

Excellent.
