A Snakemake workflow for quickly screening assembled genomes against SRA datasets.
For each genome and SRA run, the workflow:
- Downloads reads from SRA (fasterq-dump)
- Quality-trims reads (fastp)
- Maps reads to reference (bwa mem + samtools)
- Computes per-feature coverage (bedtools)
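The per-run steps above can be sketched as one shell command each. A minimal Python helper that assembles them (illustrative only — thread counts, file names, and flags here are assumptions, not the workflow's actual Snakemake rules):

```python
def build_commands(sra_id, fasta, gff3, map_threads=8, io_threads=4):
    """Return the per-run shell commands as strings (illustrative sketch;
    the real workflow's rules may use different flags and paths).
    Assumes `bwa index` has already been run on the FASTA."""
    r1, r2 = f"{sra_id}_1.fastq", f"{sra_id}_2.fastq"
    t1, t2 = f"{sra_id}_1.trim.fastq.gz", f"{sra_id}_2.trim.fastq.gz"
    return [
        # 1. Download paired reads from SRA
        f"fasterq-dump --split-files --threads {io_threads} {sra_id}",
        # 2. Quality-trim with fastp
        f"fastp -i {r1} -I {r2} -o {t1} -O {t2} --thread {io_threads}",
        # 3. Map with bwa mem, sort and index with samtools
        f"bwa mem -t {map_threads} {fasta} {t1} {t2} | samtools sort -o {sra_id}.bam -",
        f"samtools index {sra_id}.bam",
        # 4. Per-feature coverage against the annotation
        f"bedtools coverage -a {gff3} -b {sra_id}.bam > {sra_id}.coverage.tsv",
    ]

for cmd in build_commands("SRR123456", "genome.fna", "genome.gff3"):
    print(cmd)
```

In the actual workflow, Snakemake wires these steps together per SRA run and per genome, so they run in parallel up to the `--cores` limit.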
Output: One Excel file per genome with mapping statistics and feature-level coverage.
# 1. Install dependencies
conda env create -f environment.yml
conda activate screen-sra
# 2. Edit config.yaml with your SRA IDs and genome files
# 3. Run (workflow parallelizes over SRA IDs/genomes with --cores)
snakemake --cores 16
# Note: `threads: 8` in config is used for `bwa mem` mapping.
# Download uses 4 threads, QC uses 4 threads.
# 4. Check results in results/excel/

Edit config.yaml:
threads: 8 # Threads for bwa mem mapping
sra_ids_file: SRR_Acc_List.txt # File with SRA IDs (one per line)
keep_aux: true # Keep intermediate files (true/false)
keep_mapping: true # Keep CRAM/BAM files (true/false)
genomes:
  - genome_id: your_genome
    fasta: path/to/genome.fna
    gff3: path/to/genome.gff3

- SRA IDs: Text file with one SRA run ID per line (e.g., SRR123456)
- Genomes: FASTA + GFF3 files for each reference genome
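For example, the SRA ID file is just a plain list of run accessions, one per line (these IDs are placeholders):

```text
SRR123456
SRR123457
```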
results/
├── excel/
│ └── {genome_id}.xlsx # Final reports (one per genome)
│ ├── General_mapping sheet
│ └── Per-SRA sample sheets
├── reads/ # Raw fastq.gz files
├── qc/ # fastp reports
├── mapping/ # BAM files
└── gene_tables/ # Per-sample TSV tables
- This pipeline is designed for bacterial genome screening with annotation files generated by Bakta.
- Other annotation files should also work, depending on their GFF/GFF3 structure.
- Code was reviewed and optimized with GPT-5.3-Codex.
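If you want a quick sanity check on a non-Bakta annotation before running the workflow, counting the feature types in the GFF3 is a simple start. This helper is not part of the workflow — it is a hypothetical check based only on the standard GFF3 column layout:

```python
from collections import Counter

def count_feature_types(gff3_text):
    """Count feature types (GFF3 column 3); comment/pragma lines are skipped."""
    types = Counter()
    for line in gff3_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) >= 8:  # a well-formed GFF3 feature line has 9 columns
            types[cols[2]] += 1
    return types

# Tiny made-up example; real files come from your annotation tool.
example = "\n".join([
    "##gff-version 3",
    "chr1\tBakta\tgene\t1\t900\t.\t+\t.\tID=gene1",
    "chr1\tBakta\tCDS\t1\t900\t.\t+\t0\tID=cds1;Parent=gene1",
])
print(count_feature_types(example))
```

If the counts show gene/CDS features, the per-feature coverage step has something to work with; a GFF3 with unusual feature types may need adjustment.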