Issues with --nodc, --justref and --noref parameters #248

Open
darko-cucin opened this issue Oct 12, 2022 · 0 comments

I recently ran centrifuge-build (version 1.0.4, in a Docker container based on Ubuntu 18.04) on sequences downloaded from the RefSeq database. I tested all of the parameters, and for the three named in the title I am not sure whether they work properly or what use cases they are meant for. I then ran centrifuge-build on the example files provided in the example folder of the centrifuge toolkit, and the result was the same.

The commands I ran are as follows:

  1. centrifuge-build --conversion-table gi_to_tid.dmp --taxonomy-tree nodes.dmp --name-table names.dmp --nodc test.fa test_index_nodc

This is the stdout that the command produced:

Settings:
Output files: "test_index_nodc.*.cf"
Line rate: 7 (line is 128 bytes)
Lines per side: 1 (side is 128 bytes)
Offset rate: 4 (one in 16)
FTable chars: 10
Strings: unpacked
Local offset rate: 3 (one in 8)
Local fTable chars: 6
Max bucket size: default
Max bucket size, sqrt multiplier: default
Max bucket size, len divisor: 4
Difference-cover sample period: 1024
Endianness: little
Actual local endianness: little
Sanity checking: disabled
Assertions: disabled
Random seed: 0
Sizeofs: void*:8, int:4, long:8, size_t:8
Input files DNA, FASTA:
test.fa
Reading reference sizes
Time reading reference sizes: 00:00:00
Calculating joined length
Writing header
Reserving space for joined string
Joining reference sequences
Time to join reference sequences: 00:00:00
bmax according to bmaxDivN setting: 268
Using parameters --bmax 201 and *no difference cover*
Doing ahead-of-time memory usage test
qemu: uncaught target signal 8 (Arithmetic exception) - core dumped
Floating point exception

In this case, centrifuge-build produced 3 index files (1.cf and 2.cf are empty), and when centrifuge was subsequently run with this index, no output report file was produced because of the problem with the index files. Also, why does a floating point exception occur when the --nodc parameter is specified? Is there any use case in which the index files built with --nodc can be used for further analysis, or is this the expected behaviour?

  2. centrifuge-build --conversion-table gi_to_tid.dmp --taxonomy-tree nodes.dmp --name-table names.dmp --justref test.fa test_index_justref

This is the stdout that the command produced:

Settings:
Output files: "test_index_justref.*.cf"
Line rate: 7 (line is 128 bytes)
Lines per side: 1 (side is 128 bytes)
Offset rate: 4 (one in 16)
FTable chars: 10
Strings: unpacked
Local offset rate: 3 (one in 8)
Local fTable chars: 6
Max bucket size: default
Max bucket size, sqrt multiplier: default
Max bucket size, len divisor: 4
Difference-cover sample period: 1024
Endianness: little
Actual local endianness: little
Sanity checking: disabled
Assertions: disabled
Random seed: 0
Sizeofs: void*:8, int:4, long:8, size_t:8
Input files DNA, FASTA:
test.fa
Reading reference sizes
Time reading reference sizes: 00:00:00
Total time for call to driver() for forward index: 00:00:00

In this case, no output index files were produced. Is there any use case in which --justref produces output files, or is this the expected behaviour?
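
For anyone trying to reproduce the --nodc and --justref results, here is a minimal sketch for reporting which index files a run left behind. It assumes an index consists of <prefix>.1.cf, <prefix>.2.cf and <prefix>.3.cf, which is what the "Output files" line and the three files from the --nodc run suggest. check_index is a hypothetical helper of mine, not part of centrifuge.

```shell
#!/bin/sh
# check_index: hypothetical helper, not part of the centrifuge toolkit.
# Assumption: an index is made of <prefix>.1.cf, <prefix>.2.cf, <prefix>.3.cf.
check_index() {
    prefix=$1
    for n in 1 2 3; do
        f="$prefix.$n.cf"
        if [ ! -e "$f" ]; then
            echo "$f: missing"
        elif [ ! -s "$f" ]; then
            # File exists but has zero length
            echo "$f: empty"
        else
            echo "$f: $(wc -c < "$f" | tr -d ' ') bytes"
        fi
    done
}

# Report the state of the index files for each prefix used in this issue.
for p in test_index_nodc test_index_justref test_index_noref; do
    check_index "$p"
done
```

Run from the directory where the builds were executed, this prints one line per expected index file, so empty and missing files are easy to tell apart.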

  3. centrifuge-build --conversion-table gi_to_tid.dmp --taxonomy-tree nodes.dmp --name-table names.dmp --noref test.fa test_index_noref

In this case, the output index files were the same as the files produced without the --noref parameter. Is there any use case in which --noref yields index files that differ from those built with default parameters, or is this the expected behaviour?
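
The comparison between the --noref index and a default index can be repeated with a byte-wise check along these lines. This is only a sketch: compare_indexes is a hypothetical helper, test_index_default is an assumed prefix for an index built with the same command minus --noref, and the three-file layout (<prefix>.1.cf to <prefix>.3.cf) is the same assumption as above.

```shell
#!/bin/sh
# compare_indexes: hypothetical helper, not part of the centrifuge toolkit.
# Compares <a>.N.cf with <b>.N.cf byte by byte for N = 1, 2, 3.
# Returns 0 only if all three pairs exist and are identical.
compare_indexes() {
    a=$1
    b=$2
    status=0
    for n in 1 2 3; do
        if cmp -s "$a.$n.cf" "$b.$n.cf" 2>/dev/null; then
            echo "$n.cf: identical"
        else
            echo "$n.cf: differs or missing"
            status=1
        fi
    done
    return $status
}

# test_index_default is an assumed name for the index built without --noref.
compare_indexes test_index_default test_index_noref || true
```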

Thank you in advance.
