
# <span style="color:#006E7F">__Introduction to Oxford Nanopore Data Analysis__ <a class="anchor"></span>  


Created by J. Orjuela (DIADE-IRD), F. Sabot (DIADE-IRD) and G. Sarah (AGAP-INRAE) - September 2021, SouthGreen training

Adapted by J. Orjuela (DIADE-IRD), F. Sabot (DIADE-IRD) - November 2022

Adapted by J. Orjuela (DIADE-IRD) - September 2025

# <span style="color:#006E7F">__TP4 - VARIANTS DETECTION__ <a class="anchor" id="data"></span>  
    
# <span style="color: #4CACBC;"> Structural variation with Sniffles</span>  

Sniffles is a structural variation (SV) caller for third-generation sequencing data (PacBio or Oxford Nanopore).

It detects all types of SVs (10bp+) using evidence from split-read alignments, high-mismatch regions, and coverage analysis.
    
See the Sniffles repository (https://github.com/fritzsedlazeck/Sniffles/) and its wiki for more details.

## Prepare data

In [None]:
PTW="/path/to/work"

In [None]:
# download the archive containing the fastq.gz files of all clones
cd $PTW/DATA
wget --no-check-certificate -rm -nH --cut-dirs=1 --reject="index.html*" https://itrop.ird.fr/ont-training/all_clones_short.tar.gz

In [None]:
#decompress it
cd $PTW/DATA
tar zxvf all_clones_short.tar.gz

In [None]:
# create SNIFFLES folder
mkdir -p $PTW/RESULTS/SNIFFLES/
cd  $PTW/RESULTS/SNIFFLES/

# declare your Clone
CLONE="Clone10"

# symbolic link to the reference
ln -s $PTW/DATA/${CLONE}/reference.fasta .
REF="reference.fasta"

In [None]:
ls $PTW/DATA/all_clones_short/

# <span style="color: #4CACBC;">1. Mapping and SV detection for all CLONES</span>  



### Obtain calls for each sample

Call SV candidates and create an associated .snf file for each sample:

`sniffles --input sample1.bam --snf sample1.snf`
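Before launching the loop, it can help to confirm that the three tools it calls are on the `PATH`. The following is a minimal sketch; the helper name `missing_tools` is hypothetical and not part of the course material.

```shell
# Hypothetical helper: print the name of every tool that is not on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Tools called in the mapping and SV-calling loop of this notebook.
MISSING=$(missing_tools minimap2 samtools sniffles)
if [ -n "$MISSING" ]; then
    echo "Not found on PATH:" $MISSING
else
    echo "All tools found."
fi
```

If a tool is reported as missing, load the corresponding module or environment before running the loop.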


In [None]:
for i in {2,6,10,15,18}
    do
      cd  $PTW/RESULTS/SNIFFLES/
      printf "\n\n============ Clone%s ==============\n\n" "$i"
      CLONE="Clone${i}" # name of the clone processed in this iteration
      REF="reference.fasta"
      ONT="$PTW/DATA/all_clones_short/${CLONE}.fastq.gz"
      ## Mapping ONT reads (clone) against the reference using minimap2
      ## Replace the two ? with the reference, then the reads
      ## The read group is in double quotes so that ${CLONE} is expanded by the shell
      minimap2 -t 8 -ax map-ont --MD  -R "@RG\tID:${CLONE}\tSM:${CLONE}" ? ? > ${CLONE}.bam
      ## Sort the alignments (minimap2 -a writes SAM text despite the .bam name; samtools detects the format)
      samtools sort -@4 -o ${CLONE}_SORTED.bam ${CLONE}.bam
      # Index the sorted BAM
      samtools index -@4 ${CLONE}_SORTED.bam
      # Call SVs for this sample and write its .snf file
      sniffles -t 4 -i ${CLONE}_SORTED.bam --snf ${CLONE}.snf --allow-overwrite   > ${CLONE}_SV.log
    done

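# The option names below appear to follow Sniffles 1; Sniffles 2 (used above with --snf) names them differently
# (for example --minsupport, --minsvlen, --mapq). Run `sniffles --help` to check the names and defaults of your version.
# -s/--min_support	Minimum number of reads that support an SV for it to be reported. Default: 10
# -l/--min_length	Minimum length of an SV to be reported. Default: 30bp
# -q/--minmapping_qual	Minimum mapping quality of an alignment to be taken into account. Default: 20
# -r/--min_seq_size	Discard a read if none of its segments is larger than this. Default: 2kb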

### Count the number of variations

How many SVs were found for each clone?

Check the log files!
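One way to look at the logs is sketched here, assuming the loop wrote one `CloneN_SV.log` file per clone into the current directory. The helper name `show_log_tail` is hypothetical. It only prints the last lines of each log and makes no assumption about what Sniffles writes there.

```shell
# Hypothetical helper: print the last 5 lines of each log file given as argument.
show_log_tail() {
    for log in "$@"; do
        echo "== $log =="
        if [ -s "$log" ]; then
            tail -n 5 "$log"
        else
            echo "missing or empty"
        fi
    done
}

# Log files written by the loop of this notebook (one per clone).
show_log_tail Clone2_SV.log Clone6_SV.log Clone10_SV.log Clone15_SV.log Clone18_SV.log
```

A log reported as missing or empty usually means the corresponding iteration of the loop did not finish.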

### Create a variable containing the .snf file names

In [None]:
SNFS=""
for i in {2,6,10,15,18}; do SNFS="$SNFS Clone${i}.snf"; done
echo $SNFS
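Before merging, it may be worth checking that every file listed in `SNFS` was actually produced. This is a minimal sketch; `check_files` is a hypothetical helper, and the list of names is the one built by the cell above.

```shell
# Hypothetical helper: report any listed file that is missing or empty.
# Returns 0 when all files are present and non-empty, 1 otherwise.
check_files() {
    status=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            status=1
        fi
    done
    return $status
}

# Same list as the SNFS variable built in this notebook.
SNFS="Clone2.snf Clone6.snf Clone10.snf Clone15.snf Clone18.snf"
# $SNFS is left unquoted on purpose so the shell splits it into one word per file.
if check_files $SNFS; then
    echo "all .snf files present"
else
    echo "some .snf files are missing; re-run the loop for those clones"
fi
```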

# <span style="color: #4CACBC;"> 2. Merge the calls of all samples into a single VCF</span>  

Combine the calls stored in the per-sample .snf files into a single multi-sample .vcf:

`sniffles --input sample1.snf sample2.snf ... sampleN.snf --vcf multisample.vcf`
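With the list of `.snf` files in a variable, the template command can be filled in as sketched below. The output name `multisample.vcf` comes from the template. The `merge_snf` function and its `DRY_RUN` switch are hypothetical additions so that the command can be inspected before it is run.

```shell
# Hypothetical wrapper around the multi-sample merge.
# Usage: merge_snf OUTPUT.vcf FILE1.snf FILE2.snf ...
# With DRY_RUN=1 the command is printed instead of executed.
merge_snf() {
    out=$1
    shift
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "sniffles --input $* --vcf $out"
    else
        sniffles --input "$@" --vcf "$out"
    fi
}

# Same list as the SNFS variable built in this notebook.
SNFS="Clone2.snf Clone6.snf Clone10.snf Clone15.snf Clone18.snf"
# Print the command first; set DRY_RUN=0 to actually run Sniffles.
DRY_RUN=1
merge_snf multisample.vcf $SNFS
```

Printing the command first makes it easy to check that all five clones are listed before starting the merge.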

# Have a look at the VCF file
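A first look can be a count of records per SV type. The sketch below assumes that each record carries an `SVTYPE=` entry in its INFO column (column 8), as is conventional for structural-variant VCFs. `count_svtypes` is a hypothetical helper name.

```shell
# Hypothetical helper: count VCF records per SVTYPE value found in the INFO column.
count_svtypes() {
    awk -F'\t' '
        /^#/ { next }                        # skip header lines
        {
            n = split($8, info, ";")         # column 8 is INFO
            for (i = 1; i <= n; i++)
                if (info[i] ~ /^SVTYPE=/)
                    count[substr(info[i], 8)]++   # keep the text after "SVTYPE="
        }
        END { for (t in count) print t "\t" count[t] }
    ' "$1" | sort
}

# Run only if the merged file from the previous step exists.
if [ -f multisample.vcf ]; then
    grep -vc '^#' multisample.vcf    # total number of SV records
    count_svtypes multisample.vcf    # number of records per SV type
fi
```

The same function can be applied to any single-sample VCF to compare clones.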