Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code Cleanup #43

Open
1 of 6 tasks
dfornika opened this issue Apr 3, 2019 · 15 comments
Open
1 of 6 tasks

Code Cleanup #43

dfornika opened this issue Apr 3, 2019 · 15 comments

Comments

@dfornika
Copy link
Member

dfornika commented Apr 3, 2019

  • Remove any unnecessary commented-out code
  • Standardize variable names
  • Collect related variables into collections (dictionaries) eg. collect file paths into a paths dict.
  • Document functions
  • Improve documentation in README.md
  • Review consistency of USAGE messages in all job scripts.
@DiDigsDNA
Copy link
Member

Let's check that variables listed in USAGE statement match those actually parsed in job scripts.

@dfornika
Copy link
Member Author

dfornika commented May 8, 2019

Good idea. I've added it to the list above.

@DiDigsDNA
Copy link
Member

USAGE statement variables in bwa_mem.sh script don't match those referenced later in script.

USAGE="qsub $( basename "$BASH_SOURCE" ) [-h] -f|--fasta\n\

 case $key in
    -1|--R1)
    # input_R1.fastq.gz file
    input_r1_fastq="$2"
    shift # past argument
    shift # past value
    ;;
    -2|--R2)
    # input_R2.fastq.gz file
    input_r2_fastq="$2"
    shift # past argument
    shift # past value
    ;;
    -r|--reference)
    # reference fasta file
    reference="$2"
    shift # past argument
    shift # past value
    ;;
    -o|--output)
    # output sam file
    output="$2"
    shift # past argument
    shift # past value
    ;;
  esac
done

I can change the USAGE string variables.

@DiDigsDNA
Copy link
Member

In this case, the USAGE statement contains the correct variables used in the script but these variables don't get initialized to "" (whereas an unused variable does).

input_fastq="" This variable not used in script.
These variables are used but aren't initialized:
input_r1_fastq="$2"
input_r2_fastq="$2"

I can fix this.

@DiDigsDNA
Copy link
Member

USAGE statement includes all variables used in script. However, the alignment variable is initialized to "", while the model variable isn't.

Do you want me to add model="" to line 21?

@DiDigsDNA
Copy link
Member

USAGE stmt has all the variables used within the script and 3 of 4 variables used in the script are initialized. I suspect treefile="" at line 21 should be replaced with output_file=""... do you agree?

@DiDigsDNA
Copy link
Member

DiDigsDNA commented Jun 8, 2019

  1. 2 of 3 variables initialized to "" except for label --> would you like me to change this?
  2. Comment should read # input contigs.fasta file
    # input_R1.fastq.gz file
  3. It looks like input variable was changed to assembly:

    mlst --label "${label}" "${assembly}" > "${output_file}"

    ...so I'll replace input="" with assembly="".

@DiDigsDNA
Copy link
Member

I noticed 2 issues with samtools_filter_fixmate_sort.sh:

  1. Confusing comment. I don't understand the use of the term integer here.
    # only include reads with none of the FLAGS in this integer present
  2. Shouldn't this comment read # output sam (or bam) file ?
    # only include reads with none of the FLAGS in this integer present

@DiDigsDNA
Copy link
Member

DiDigsDNA commented Jun 8, 2019

Inconsistency in variables initialized and those used:


And the output variable not initialized:
I suggest changing initial variable declarations to:

input=""
output=""

...and deleting flags=0 (because the flags variable isn't used)
...and changing

# only include reads with none of the FLAGS in this integer present

to read
`# output sam (or bam) file

@DiDigsDNA
Copy link
Member

Two issues with this script:

  1. Need to add -n|--name-order to USAGE stmt.
    USAGE="qsub $( basename "$BASH_SOURCE" ) [-h] -i|--input SAM_BAM -o|--output SAM_BAM\n\
  2. Should this comment read # output sam (or bam) file ?
    # only include reads with none of the FLAGS in this integer present

@DiDigsDNA
Copy link
Member

queries is initialized and declared but not reference otherwise in script:


I can delete it.

@DiDigsDNA
Copy link
Member

I noticed 4 issues with shovill.sh:

  1. -q|--queries declared in USAGE stmt but not used in script --> suggest removing it
    USAGE="qsub $( basename "$BASH_SOURCE" ) [-h] -1|--R1 INPUT_R1_FASTQ -2|--R2 INPUT_R2_FASTQ -q|--queries QUERIES_MSH -o|--output_file OUTPUT_FILE\n\
  2. -o|--output_file in in USAGE stmt doesn't match below, where -o|--output_dir is parsed
  3. -c|--mincov used in script but not declared or initialized and not mentioned in USAGE stmt. --> Should this be initialized to "" or some default value?
  4. -l|--minlen used in script but not declared or initialized and not mentioned in USAGE stmt. --> Should this be initialized to "" or some default value?

@DiDigsDNA
Copy link
Member

DiDigsDNA commented Jun 8, 2019

2 issues with snippy_ctgs.sh:

  1. 2 unused variables --> suggest eliminating them
  2. Suggest adding ctgs="" declaration to line 21 as this variable assigned later
    Note that snippy perl code sets default for ctgs to '' so this should work.
  3. USAGE stmt needs to instruct the user to specify a contigs file on the command line
    i.e. instead of USAGE="qsub $( basename "$BASH_SOURCE" ) [-h] -r|--ref REFERENCE_FASTA -c|--ctgs -o|--outdir OUTPUT_FILE
    use this:
    USAGE="qsub $( basename "$BASH_SOURCE" ) [-h] -r|--ref REFERENCE_FASTA -c|--ctgs CONTIGS_FASTA_FILE -o|--outdir OUTPUT_FILE

@DiDigsDNA
Copy link
Member

Only thing I noticed for snp-dists.sh is that comment states that the alignment file is a reference fasta

# reference fasta file

I suggest changing this to read # alignment fasta

@DiDigsDNA
Copy link
Member

Suggestion for snp-sites.sh to change comment from

# reference fasta file

to read # alignment fasta

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants