Error during db build: "find: -printf: unknown primary or operator" #48

Closed
kliere opened this issue Jul 22, 2016 · 10 comments

Comments

@kliere

kliere commented Jul 22, 2016

I'm building a custom db following the instructions on the kraken website using the scripts provided by Mick Watson (http://www.opiniomics.org/building-a-kraken-database-with-new-ftp-structure-and-no-gi-numbers/). Everything works fine up to the moment when I try to build the db using the following command:
karsten$ kraken-build --build --threads 24 --work-on-disk --db kraken_20160720
Kraken build set to minimize RAM usage.
Creating k-mer set (step 1 of 6)...
Found jellyfish v1.1.11
find: -printf: unknown primary or operator

Copied from the script, the line in question looks like:
"KRAKEN_HASH_SIZE=$(find library/ '(' -name '*.fna' -o -name '*.fa' -o -name '*.ffn' ')' -printf '%s\n' | perl -nle '$sum += $_; END {print int(1.15 * $sum)}')"

I'm using Mac OS X 10.11.6 with Perl v5.18.2.

So, what is wrong with the printf command?

Cheers,

Karsten
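
For context, -printf is a GNU find extension. It is not part of POSIX find, and the BSD find shipped with macOS rejects it with this "unknown primary or operator" message. A minimal sketch for probing which behaviour a given system has follows; the function name find_supports_printf is made up for illustration and is not part of Kraken.

```shell
#!/bin/sh
# Hypothetical helper (not part of Kraken): succeeds if the local find
# accepts the GNU-only -printf action, fails otherwise.
find_supports_printf() {
    # /dev/null always exists and is not a directory, so find visits one path.
    find /dev/null -printf '' >/dev/null 2>&1
}

if find_supports_printf; then
    echo "find supports -printf (GNU-style)"
else
    echo "find lacks -printf (e.g. BSD/macOS find)"
fi
```

On a stock macOS install this would be expected to take the second branch.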

@kliere
Author

kliere commented Jul 22, 2016

OK, I found the fix for the above problem in the pull requests section. However, changing the line in question to "KRAKEN_HASH_SIZE=$(find library/ -name '*.fna' -ls | awk '{print $7}' | perl -nle '$sum += $_; END {print int(1.15 * $sum)}')" now results in the following error:

karsten$ kraken-build --build --threads 24 --work-on-disk --db kraken_20160720
Kraken build set to minimize RAM usage.
Creating k-mer set (step 1 of 6)...
Found jellyfish v1.1.11
Hash size not specified, using '27010052168'
/usr/local/Cellar/kraken/0.10.5-beta/libexec/build_kraken_db.sh: line 96: 49200 Broken pipe: 13 find library/ '(' -name '*.fna' -o -name '*.fa' -o -name '*.ffn' ')' -print0
49201 Exit 1 | xargs -0 cat
49202 Killed: 9 | jellyfish count -m $KRAKEN_KMER_LEN -s $KRAKEN_HASH_SIZE -C -t $KRAKEN_THREAD_CT -o database /dev/fd/0

So, what's next? ;-)

Cheers,

Karsten

@kliere
Author

kliere commented Jul 25, 2016

Interestingly, trying to build the library under Linux (Ubuntu 16.04.1 LTS) results in pretty much the same error.

@andrewdavis3

Having the same issue, but I'm not getting a broken pipe. Did you ever find a fix?
Creating k-mer set (step 1 of 6)...
Found jellyfish v1.1.11
Hash size not specified, using '23770869521'
/usr/local/Cellar/kraken/0.10.5-beta/libexec/build_kraken_db.sh: line 97: 76250 Done find library/ '(' -name '*.fna' -o -name '*.fa' -o -name '*.ffn' ')' -print0
76251 Exit 1 | xargs -0 cat
76252 Killed: 9 | jellyfish count -m $KRAKEN_KMER_LEN -s $KRAKEN_HASH_SIZE -C -t $KRAKEN_THREAD_CT -o database /dev/fd/0

@tseemann

@kliere I see you are using Brew - I am still trying to get the builder working, as it depends on Jellyfish 1.1 not 2.0 as is installed. See https://github.com/Homebrew/homebrew-science/pull/2161

@andrewdavis3

andrewdavis3 commented Aug 29, 2016

It's the --jellyfish-hash-size that's causing it to exit. I'm not sure what the max should be, but we probably need to add a cap.
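
One way to apply such a cap, as a sketch only: clamp the estimated size to a ceiling before handing it to kraken-build through --jellyfish-hash-size. The helper name cap_hash_size and the ceiling value used in the example are assumptions for illustration. A sensible ceiling depends on the machine's RAM and is not stated in this thread, and whether a smaller hash is enough to avoid the "Killed: 9" exit is also untested here.

```shell
#!/bin/sh
# Hypothetical helper (not part of Kraken).
# cap_hash_size ESTIMATE MAX: print the smaller of two non-negative integers.
cap_hash_size() {
    # awk compares the two values numerically; %.0f avoids the 32-bit
    # limit that some awk implementations impose on %d.
    awk -v est="$1" -v max="$2" \
        'BEGIN { printf "%.0f\n", (est > max) ? max : est }'
}

# Example use (left commented out; the ceiling 6400000000 is a placeholder):
# size=$(cap_hash_size 27010052168 6400000000)
# kraken-build --build --threads 24 --work-on-disk \
#     --jellyfish-hash-size "$size" --db kraken_20160720
```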

@getopt

getopt commented Aug 3, 2017

The issue seems to be due to the macOS version of find not having the -printf option. Following https://stackoverflow.com/questions/752818/find-lacks-the-option-printf-now-what, I changed -printf '%s\n' to -print0 | xargs -0 stat -f '%z' and this worked for me on macOS Sierra. (The linked answer uses '%i', which in BSD stat is the inode number; '%z' is the file size in bytes, matching what -printf '%s\n' reports.)

KRAKEN_HASH_SIZE=$(find library/ '(' -name '*.fna' -o -name '*.fa' -o -name '*.ffn' ')' -print0 | xargs -0 stat -f '%z' | perl -nle '$sum += $_; END {print int(1.15 * $sum)}')
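
Because stat's format flags differ between BSD and GNU, a variant that avoids both -printf and stat may be easier to share between macOS and Linux. The following is a minimal sketch, assuming only POSIX find, wc, and awk. The function names fasta_bytes_total and estimate_hash_size are hypothetical and not part of Kraken; the 1.15 factor is the one from the original script line.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Kraken).

# fasta_bytes_total DIR: print the combined size in bytes of all
# *.fna, *.fa and *.ffn files under DIR.
fasta_bytes_total() {
    # "wc -c < file" prints only the byte count, so no file names or
    # "total" lines need to be filtered out afterwards.
    find "$1" '(' -name '*.fna' -o -name '*.fa' -o -name '*.ffn' ')' -type f \
        -exec sh -c 'for f do wc -c < "$f"; done' sh {} + |
        awk '{ sum += $1 } END { printf "%.0f\n", sum + 0 }'
}

# estimate_hash_size DIR: apply the script's 1.15 safety factor and
# truncate to an integer, as the original perl one-liner does.
estimate_hash_size() {
    fasta_bytes_total "$1" |
        awk '{ printf "%.0f\n", int(1.15 * $1) }'
}

# Example use (left commented out; assumes a library/ directory exists):
# KRAKEN_HASH_SIZE=$(estimate_hash_size library/)
```

Reading each file through wc is slower than asking the filesystem for its size, but this runs once per build and keeps the command identical across platforms.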

@tseemann

tseemann commented Aug 12, 2017

@getopt nice 'find' on the lack of -printf support on macOS ! :-)

@tseemann

@kliere @getopt @andrewdavis3 the bigger problem is that @DerrickWood has moved to industry and is no longer maintaining this software.

@kliere
Author

kliere commented Sep 6, 2017

@tseemann, yep indeed - though Kraken is still a great tool, one should look for alternatives such as CLARK...

@jenniferlu717
Collaborator

We are currently working on updates/fixes to the Kraken software. We haven't abandoned it, promise.
