multiple_split_libraries_fastq.py -- Argument list too long #2069
Comments
What is the max number of arguments? Or is it the length of the argument string that it's unhappy with? In either case, it might be possible to check for this, and if it's detected, split the input command into multiple commands, and check the counts of the output reads from each command to generate the next command with the --start_seq_id parameter added to give unique numbers to the reads, and finally concatenate the separate reads. |
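A minimal sketch of the splitting idea above, assuming the input paths are joined with commas and that the caller supplies the character budget. The helper names `chunk_by_length` and `start_seq_ids` are hypothetical and are not part of QIIME; starting the numbering at 0 is also an assumption.

```python
def chunk_by_length(paths, max_chars):
    """Group paths so each comma-joined group is at most max_chars long.

    Hypothetical helper, not part of QIIME. A single path longer than
    max_chars still gets a chunk of its own.
    """
    chunks, current, current_len = [], [], 0
    for path in paths:
        # One extra character for the comma that joins this path to the last.
        added = len(path) + (1 if current else 0)
        if current and current_len + added > max_chars:
            chunks.append(current)
            current, current_len = [], 0
            added = len(path)
        current.append(path)
        current_len += added
    if current:
        chunks.append(current)
    return chunks


def start_seq_ids(read_counts, first_id=0):
    """Return the --start_seq_id value for each chunk.

    read_counts[i] is the number of reads written by the command for chunk i.
    In practice each count is only known once that command has finished.
    """
    ids, next_id = [], first_id
    for count in read_counts:
        ids.append(next_id)
        next_id += count
    return ids
```

Each chunk would then become one `split_libraries_fastq.py` call, and the per-chunk outputs would be concatenated at the end.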
As far as I understand, this is a limitation imposed by the OS/kernel.
|
Yes, it is a kernel limit, and it is based on the length in characters, not on the number of parameters. The only way I can think of is splitting the commands by a |
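To make the character-based limit concrete, here is a small sketch that reads the limit on a POSIX system and estimates the size of an argument list. The function names `arg_max` and `command_size` are hypothetical. The real limit also has to accommodate the environment variables, so this is only a rough check.

```python
import os


def arg_max():
    """Return the system limit (in bytes) on arguments plus environment.

    Returns None where the value cannot be queried (for example on
    platforms without os.sysconf).
    """
    try:
        return os.sysconf("SC_ARG_MAX")
    except (AttributeError, ValueError, OSError):
        return None


def command_size(argv):
    """Estimate the bytes an argument list occupies.

    Each argument is counted with one extra byte for its terminating NUL.
    """
    return sum(len(os.fsencode(arg)) + 1 for arg in argv)
```

Comparing `command_size(argv)` against `arg_max()` before launching a command would be one way to detect the problem up front instead of failing at execution time.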
Would need to be separate logical lines, joining by ";" doesn't reduce
|
Agree, that was just an example 😄 |
How about this potential solution: |
The more I think about this the more I realize that most of these
|
Hi Yoshiki, I like your idea of bypassing the CLI/system call, but how would you deal with logging? Currently, the "command executor" An alternative would be to add a batch mode to |
Good point @agentfog! Maybe what needs to happen is that each sample has to be processed individually, and finally all results can be collated together? ... not ideal, but I guess it would work. |
After talking with @rob-knight, he suggested that the options that took parameters that can trigger this error ( |
That's what I meant by "batch mode". The way I imagined the text file, each
line would correspond to a sample and there would be two or three columns
depending on the demux mode. The columns would be either (read_file,
sample_id) or (read_file, barcode_file, mapping_file).
What do you think?
|
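The batch file described above could be read along these lines. This is only a sketch: the tab delimiter, the `#` comment convention, and the name `parse_batch_file` are assumptions for illustration, not an existing QIIME format or function.

```python
import csv
import io

# Column layouts from the proposal: two or three columns per sample line.
COLUMNS = {
    2: ("read_file", "sample_id"),
    3: ("read_file", "barcode_file", "mapping_file"),
}


def parse_batch_file(handle):
    """Parse a tab-separated batch file with one sample per line.

    Blank lines and lines starting with '#' are skipped (an assumption).
    All lines must use the same column layout.
    """
    rows = []
    width = None
    for line_no, fields in enumerate(csv.reader(handle, delimiter="\t"), 1):
        fields = [field.strip() for field in fields]
        if not any(fields) or fields[0].startswith("#"):
            continue
        if len(fields) not in COLUMNS:
            raise ValueError("line %d: expected 2 or 3 columns, got %d"
                             % (line_no, len(fields)))
        if width is None:
            width = len(fields)
        elif len(fields) != width:
            raise ValueError("line %d: mixed column layouts" % line_no)
        rows.append(dict(zip(COLUMNS[width], fields)))
    return rows


example = io.StringIO("s1.fastq\tS1\ns2.fastq\tS2\n")
samples = parse_batch_file(example)
```

Because the paths live in a file rather than on the command line, the number of samples no longer affects the length of the command itself.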
@agentfog Ah, got it - sorry for missing that! Yes, though I think we
|
Sounds good. |
BTW, I'm working on this right now, had to delay it due to other things that came up, but I should be able to post a PR sometime soon. |
With this new argument, split_libraries_fastq.py can now take a file that lists the input files for the `-b`, `-i`, `--sample_ids` and `--mapping_files` arguments. This allows an arbitrary number of input files, as opposed to the previous approach, where we were limited by the maximum number of characters that fit into a single command line execution. Fixes biocore#2069
If the length of the paths in the `split_libraries_fastq.py` command is too long (as happens when you have a few hundred samples, each in its own FASTQ file), `multiple_split_libraries_fastq.py` will fail with the following error:
I don't think there's a good solution for this; we had previously found this problem in Qiita but ended up not finding a good solution. Does anyone have any ideas about this?
@walterst @gregcaporaso