Spritz commandline usage

Anthony edited this page Aug 4, 2021 · 16 revisions

Running Spritz

If you are working on a protected-access machine, please perform the external Spritz setup before running Spritz.

  1. Install or activate conda, such as by installing miniconda3.
  2. Install git with conda install git
  3. Clone Spritz with git clone https://github.com/smith-chem-wisc/Spritz.git; cd Spritz/Spritz/workflow/
  4. Create and activate a conda environment for Spritz by running conda env create --name spritzbase --file envs/spritzbase.yaml; conda activate spritzbase
  5. Adapt the config/config.yaml file manually. Briefly:
  • Specify your analysis directory, which should have any input FASTQ files, and which will be used for saving output. If you downloaded SRAs externally in Spritz setup, make sure the SRA
  • List the data you intend to use in the sra, sra_se, fq, and fq_se fields. Leave empty any you don't intend to use; for example, sra: [] indicates you do not intend to use paired-end SRAs.
  • Input FASTQ files must be located in the specified analysis directory and must have filenames of the format {prefix}_1.fastq. List the prefixes in the fq or fq_se field. List only the prefixes, not the filenames themselves.
  • Specify the organism, genome version, and gene model version.
  6. Run Spritz with snakemake -j {threads} --use-conda --conda-frontend mamba --resources mem_mb={memory_megabytes}, replacing {threads} and {memory_megabytes} with your own values. For example, with 24 threads and 100 GB of RAM, this would be snakemake -j 24 --use-conda --conda-frontend mamba --resources mem_mb=100000
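
Before launching the run, it can help to confirm that every listed FASTQ prefix has a matching file and that the memory value is expressed in megabytes. The following is a minimal sketch, not part of Spritz itself: the function names missing_fastqs and spritz_run_cmd are hypothetical helpers, and the conversion assumes 1 GB = 1000 MB, matching the 100 GB to 100000 example above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; they are not shipped with Spritz.

# Print each prefix that lacks a matching {prefix}_1.fastq file.
# Arguments: analysis directory, then one or more prefixes.
missing_fastqs() {
  local dir="$1"; shift
  local prefix
  for prefix in "$@"; do
    [ -f "${dir}/${prefix}_1.fastq" ] || echo "${prefix}"
  done
}

# Print the snakemake command for a thread count and a memory budget in GB.
# Assumption: 1 GB is treated as 1000 MB, as in the 100 GB -> 100000 example.
spritz_run_cmd() {
  local threads="$1" mem_gb="$2"
  echo "snakemake -j ${threads} --use-conda --conda-frontend mamba --resources mem_mb=$(( mem_gb * 1000 ))"
}
```

For example, missing_fastqs /path/to/analysis sampleA sampleB prints the prefixes whose FASTQ file is absent (the path and sample names here are placeholders), and spritz_run_cmd 24 100 prints the command from step 6 so you can review it before running it.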

External Spritz setup. Use on protected-access analysis machines without access to the URLs below.

Spritz requires access to these URLs to perform its setup:

http://www.uniprot.org
https://api.nuget.org
http://ftp.ensembl.org
https://ftp.ncbi.nih.gov/
https://github.com/

You can test whether your analysis machine can reach these addresses by pinging each hostname, for example ping www.uniprot.org. Note that ping takes a hostname, not a full URL with the http:// prefix.
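
Some networks block ping while still allowing web traffic, so an HTTP request can be a useful second check. The sketch below assumes curl is installed; host_of and check_spritz_urls are hypothetical helper names, and "reachable" here only means that the server returned some response within 10 seconds.

```shell
#!/usr/bin/env bash
# The URLs Spritz needs during setup, as listed above.
SPRITZ_URLS=(
  "http://www.uniprot.org"
  "https://api.nuget.org"
  "http://ftp.ensembl.org"
  "https://ftp.ncbi.nih.gov/"
  "https://github.com/"
)

# Strip the scheme and any path from a URL, leaving the hostname
# (the form that ping expects).
host_of() {
  local url="$1"
  url="${url#*://}"   # drop "http://" or "https://"
  echo "${url%%/*}"   # drop everything from the first "/"
}

# Send a HEAD request to each URL and report whether any response came back.
check_spritz_urls() {
  local url
  for url in "${SPRITZ_URLS[@]}"; do
    if curl --silent --head --max-time 10 --output /dev/null "$url"; then
      echo "reachable:   $url"
    else
      echo "unreachable: $url"
    fi
  done
}
```

Calling check_spritz_urls prints one line per address. If any are unreachable, the external setup below is probably the safer route.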

  1. Install or activate conda, such as by installing miniconda3.
  2. Install git with conda install git
  3. Clone Spritz with git clone https://github.com/smith-chem-wisc/Spritz.git; cd Spritz/Spritz/workflow/
  4. Create and activate a conda environment for setting up Spritz by running conda env create --name spritzbase --file envs/spritzbase.yaml; conda activate spritzbase
  5. Specify SRAs, organism, and gene model version in the config/config.yaml file manually. Briefly:
  • List the data you intend to use in the sra, sra_se, fq, and fq_se fields. Leave empty any you don't intend to use; for example, sra: [SRR629563] specifies that this SRA should be downloaded, and sra: [] indicates you do not intend to use paired-end SRAs.
  • Specify the organism, genome version, and gene model version.
  • If you intend to use your own FASTQs, specify them in the next section after setting up Spritz and copying it to your analysis server.
  6. To set up Spritz, run snakemake -j {threads} --use-conda --conda-frontend mamba ../resources/setup.txt, with {threads} replaced by the number of threads on your machine. For example, use snakemake -j 16 --use-conda --conda-frontend mamba ../resources/setup.txt if 16 threads are available.
  7. Run cd ../../../ to exit the Spritz folder.
  8. Bundle and compress Spritz with tar cvzf Spritz.tar.gz Spritz
  9. Copy Spritz.tar.gz to the server for your analysis.
  10. Uncompress Spritz on the analysis server with tar xvzf Spritz.tar.gz
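
The bundle, copy, and uncompress steps can be sketched as two small shell functions. This is only an illustration: bundle and unpack are hypothetical names, and the copy step is left as a comment because the transfer method and server address depend on your site.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; they wrap the tar commands above
# without the verbose (v) flag.

# Bundle a directory into <name>.tar.gz. Run from the directory's parent.
bundle() {
  local name="$1"
  tar czf "${name}.tar.gz" "${name}"
}

# Unpack <archive> into the existing directory <dest>.
unpack() {
  local archive="$1" dest="$2"
  tar xzf "${archive}" -C "${dest}"
}

# Typical sequence (the copy command and server name are placeholders):
#   bundle Spritz                          # on the setup machine
#   scp Spritz.tar.gz user@analysis-host:  # or whatever transfer your site allows
#   unpack Spritz.tar.gz .                 # on the analysis server
```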

After this setup, follow the steps under Running Spritz above, starting at step 5 (adapting config/config.yaml). Some notes:

  • You may need to run module load conda instead of downloading and installing miniconda.
  • You may need to run snakemake using a SLURM command, such as srun -A sens2020### -t 2-0 -c 16 snakemake -j 16 --resources mem_mb=112000
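
When running under SLURM, the number of CPUs requested with -c and the snakemake job count given with -j are the same in the example above. A small helper can keep them in sync. This is a sketch only: spritz_srun_cmd is a hypothetical name, and the account, time limit, and memory values must come from your own allocation.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration; it only prints the command.
# Arguments: SLURM account, time limit, threads, memory in MB.
spritz_srun_cmd() {
  local account="$1" time_limit="$2" threads="$3" mem_mb="$4"
  # The same thread count is used for srun -c and snakemake -j.
  echo "srun -A ${account} -t ${time_limit} -c ${threads} snakemake -j ${threads} --resources mem_mb=${mem_mb}"
}
```

For instance, spritz_srun_cmd "$MY_ACCOUNT" 2-0 16 112000 prints a command of the same shape as the one in the note above, where MY_ACCOUNT is a placeholder variable holding your project account.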