Commit 8c1ac4a

Update quick start guide
Removed reference genome and diamond database path instructions from the quick start guide.
1 parent 247d794 commit 8c1ac4a

1 file changed, +4 -6 lines changed
posts/IMAM-02-quick_start.qmd

Lines changed: 4 additions & 6 deletions
@@ -70,25 +70,23 @@ singularity exec \
 -p {project.folder} \
 -n {name} \
 -r {reads} \
+--ref-genome {reference} \
+--diamond-db {database} \
 -t {threads}
 ```

 - `{name}`: The name of your study, no spaces allowed.
 - `{project.folder}`: The project folder where you run your workflow and store results.
 - `{reads}`: The folder that contains your raw .fastq.gz files. Raw read files must adhere to the naming scheme as described [here](https://help.basespace.illumina.com/files-used-by-basespace/fastq-files#naming){target="_blank"}.
+- `{reference}`: Absolute path pointing to your reference genome (.fna, .fasta, .fa).
+- `{database}`: Absolute path pointing to your diamond database (.dmnd).

 ::: callout-important
 **The `--bind` arguments are needed to explicitly tell Singularity to mount the necessary host directories into the container.** The part before the colon is the path on the host machine that you want to make available. The path after the colon is the path inside the container where the host directory should be mounted.

 As a default, Singularity often automatically binds your home directory (`$HOME`) and the current directory (`$PWD`). We also explicitly bind `/mnt/viro0002-data` in this example. If your input files (reads, reference, databases) or output project directory reside outside these locations, you MUST add specific `--bind /host/path:/container/path` options for those locations, otherwise the container won’t be able to find them.
 :::

-::: callout-note
-When **prepare_project.py** prompts for the **reference genome** and **diamond database** paths, you must enter the absolute host paths, and these paths must be accessible via one of the bind mounts.
-
-Also, it'll ask if you want to create a raw_data/ folder with softlinks to your raw fastq.gz files. This is not required for running the workflow, but it can be convenient to have softlinks to your raw data available in your project directory.
-:::
-
 After running the prepare_project.py helper script, you should have the following files in your project directory:

 - The **sample.tsv** should have 3 columns: sample (sample name), fq1 and fq2 (paths to raw read files). Please note that samples sequenced by Illumina machines can be ran across different lanes. In such cases, the Illumina software will generate multiple fastq files for each sample that are lane specific (e.g. L001 = Lane 1, etc). So you may end up with a sample.tsv file that contains samples like `1_S1_L001` and `1_S1_L002`, even though these are the same sample, just sequenced across different lanes. The snakemake workflow will recognize this behaviour and merge these files together accordingly.
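
The lane merging mentioned in the sample.tsv description can be pictured with a small sketch. This is not the workflow's actual Snakemake code. The function name `group_lanes`, the regular expression, and the assumption that lane-specific names always end in `_L` plus three digits are illustrative only, inferred from the `1_S1_L001` / `1_S1_L002` examples in the guide.

```python
import re
from collections import defaultdict

# Assumption: lane-specific sample names end in "_L" followed by three digits,
# as in the guide's examples `1_S1_L001` and `1_S1_L002`.
LANE_SUFFIX = re.compile(r"_L\d{3}$")


def group_lanes(sample_names):
    """Map each lane-free sample name to the lane-specific names that share it."""
    groups = defaultdict(list)
    for name in sample_names:
        # Strip the trailing lane token; names without one are kept unchanged.
        base = LANE_SUFFIX.sub("", name)
        groups[base].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical sample names for illustration.
    print(group_lanes(["1_S1_L001", "1_S1_L002", "2_S2_L001"]))
```

Grouping by the lane-free name is one plausible way to decide which fastq files belong together before concatenating them; the real workflow may use a different rule.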