Quoting of file paths #76

Open
mbhall88 opened this issue Jul 4, 2023 · 3 comments

mbhall88 commented Jul 4, 2023

Describe the bug
When a file path has "weird" characters like | in it, mashtree fails.

To Reproduce

Here is an example of my file of filenames

mycobacteria/Mycolicibacillus/kraken:taxid|1069220|NZ_AP022594.1.fa
mycobacteria/Mycolicibacillus/kraken:taxid|1069221|NZ_CP092365.1.fa
mycobacteria/Mycolicibacter/kraken:taxid|29314|NZ_AP022609.1.fa
mycobacteria/Mycolicibacter/kraken:taxid|2872309|NZ_CP084029.1.fa
mycobacteria/Mycolicibacter/kraken:taxid|2875777|NZ_CP084028.1.fa
mycobacteria/Mycolicibacter/kraken:taxid|1788|NZ_LT906469.1.fa

First up, I know, these are terrible filenames. Who in their right mind would name files this way? Well, there are a lot of them and I basically can't be bothered to change them. So you are well within your rights to tell me to buzz off :)

Here is an example of the command and output

$ mashtree --file-of-file all_myco.fofn --numcpus 8 --outtree tree.dnd
mashtree: main: Found mash version 2 - /home/mihall/sw/mambaforge/envs/classbench/bin/mash
mashtree: main: Temporary directory will be /tmp/MASHTREE.raIOT8
mashtree: main: mashtree on 1 files
mashtree: mashSketch(TID1): This thread will work on 137 sketches
mashtree: mashSketch(TID1): Working on file 1 out of 137
mashtree: mashSketch(TID2): This thread will work on 137 sketches
mashtree: mashSketch(TID2): Working on file 1 out of 137
sh: 1767: command not found
sh: 1767: command not found
mashtree: mashSketch(TID3): This thread will work on 136 sketches
mashtree: mashSketch(TID3): Working on file 1 out of 136
sh: NZ_AP024256.1.fa: command not found
sh: 1962118: command not found
sh: 1962118: command not found
sh: NZ_CP022235.1.fa: command not found
mashtree: mashSketch(TID1): ERROR running mash sketch -S 42 -k 21 -s 10000   -o /tmp/MASHTREE.raIOT8/kraken:taxid|1767|NZ_AP024256.1.fa mycobacteria/Mycobacterium/kraken:taxid|1767|NZ_AP024256.1.fa 2>&1!
  sh: NZ_AP024256.1.fa: command not found

Basically, I think word-splitting is causing Perl to think we're trying to pipe something? I don't know anything about Perl though, so I might be way off. Is there a concept of quoting file paths, like in bash, to avoid this behaviour?
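
The `sh: 1767: command not found` lines in the log are consistent with the command being handed to `sh` as a single string, in which case the shell (rather than Perl itself) would read each `|` as a pipe and try to run `1767` and `NZ_AP024256.1.fa` as commands. The sketch below reproduces that behaviour and two common ways around it. It is written in Python purely for illustration and is not mashtree's code; `echo` stands in for `mash sketch`, and the path is a made-up example modelled on the ones above.

```python
import shlex
import subprocess

# Hypothetical path modelled on the ones in this issue; the file need not exist.
path = "kraken:taxid|1767|NZ_AP024256.1.fa"

# 1. Unquoted inside a shell string: sh parses `|` as a pipe, so it tries to
#    run `1767` and `NZ_AP024256.1.fa` as commands and the pipeline fails.
unquoted = subprocess.run(
    f"echo {path}", shell=True, capture_output=True, text=True
)

# 2. Shell-quoted: the whole path reaches `echo` as one literal argument.
quoted = subprocess.run(
    f"echo {shlex.quote(path)}", shell=True, capture_output=True, text=True
)

# 3. Argument list with no shell involved: nothing is parsed, so nothing
#    needs quoting.
direct = subprocess.run(["echo", path], capture_output=True, text=True)

print("unquoted exit status:", unquoted.returncode)
print("quoted output:       ", quoted.stdout.strip())
print("direct output:       ", direct.stdout.strip())
```

Perl has a comparable choice: calling `system` with a list of arguments bypasses the shell, whereas a single command string containing shell metacharacters is passed to the shell. Whether either approach drops cleanly into mashtree's code is an assumption.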

Expected behavior
Michael names his files like a sane person. Or mashtree is super kind and forgiving and knows Michael is an idiot but makes him feel better about himself by handling his crappy file paths.

Desktop (please complete the following information):

  • OS: Linux
  • Version: 1.2.0
  • Install method: conda

lskatz (Owner) commented Jul 19, 2023

I think you have correctly identified this as a bug but it would probably be easier to just change the filenames :)
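
For anyone who would rather not rename the originals, one possible workaround is to symlink each file under a shell-safe name and point the file of filenames at the links. The following is a minimal sketch under that assumption; `safe_name` and `link_with_safe_names` are hypothetical helpers, not part of mashtree, and the choice of allowed characters is arbitrary.

```python
import re
from pathlib import Path


def safe_name(name: str) -> str:
    """Replace every character outside [A-Za-z0-9._-] with an underscore."""
    return re.sub(r"[^A-Za-z0-9._-]", "_", name)


def link_with_safe_names(paths, link_dir):
    """Symlink each path into link_dir under a shell-safe name.

    Returns the list of link paths. Raises FileExistsError if two inputs
    would map to the same safe name, rather than silently overwriting.
    """
    link_dir = Path(link_dir)
    link_dir.mkdir(parents=True, exist_ok=True)
    links = []
    for p in paths:
        src = Path(p).resolve()
        dest = link_dir / safe_name(src.name)
        if dest.is_symlink() or dest.exists():
            raise FileExistsError(f"{dest} already exists")
        dest.symlink_to(src)
        links.append(str(dest))
    return links
```

The returned paths can then be written one per line to a new file of filenames, leaving the original files untouched.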

mbhall88 (Author) commented Jul 20, 2023

I appreciate changing the filenames is the easier option, but is quoting the file paths also not an easy fix in Perl? (I don't know anything about Perl so feel free to correct me)

lskatz (Owner) commented Jul 26, 2023

It would take some time to find each place a filename is used and properly quote it, but yes, that would be ideal.
