genepi/lpa-pipeline

LPA Server Pipeline

This repository contains the pipeline used for the article A comprehensive map of single base polymorphisms in the hypervariable LPA Kringle-IV-2 copy number variation region. The paper can be found here.

You can upload a BAM file and get annotated low-level variants in return. In the current version, indel detection and BAQ features are disabled.

BAM file creation

We recommend using BWA-MEM to align your FASTQ files against a reference. Please use this reference for the alignment process.

bwa mem kiv2_6.fasta <file1.fastq> <file2.fastq> > aln-pe.sam
samtools view -S -b aln-pe.sam > sample.bam
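The two alignment commands above can be wrapped in a small helper when several samples need to be processed. The following is a minimal sketch, not part of the pipeline itself: the function name `align_commands` and its argument order are hypothetical, it assumes that bwa and samtools are installed and that paths contain no spaces, and it adds a `bwa index` step because `bwa mem` needs an index of the reference. The function only prints the commands, so they can be reviewed before anything is executed.

```shell
# Hypothetical helper: print the alignment commands for one paired-end sample.
# Arguments: <reference.fasta> <reads_1.fastq> <reads_2.fastq> <sample-name>
# Assumption: paths contain no spaces (the commands are printed unquoted).
align_commands() {
  ref="$1"; fq1="$2"; fq2="$3"; sample="$4"
  # bwa mem needs an index of the reference; building it once is enough.
  echo "bwa index $ref"
  # Align the read pair and write uncompressed SAM.
  echo "bwa mem $ref $fq1 $fq2 > $sample.sam"
  # Convert SAM to BAM, the format the pipeline expects as input.
  echo "samtools view -S -b $sample.sam > $sample.bam"
}
```

For example, `align_commands kiv2_6.fasta AK_1.fastq AK_2.fastq AK` prints three commands; piping that output to `sh` would run them. The FASTQ file names here are placeholders.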

Data

All sequence data has been uploaded to Dataverse and can be accessed here.

Source Code

Please check out this repository if you want to have a look at the actual source code.

Pipeline Steps

  • Low-level Variant Detection
  • Type-B Base Annotation
  • Region Annotation
  • Overall Statistics

Run the LPA Pipeline

  1. Install Cloudgene with the following commands
mkdir cloudgene
cd cloudgene
curl -s install.cloudgene.io | bash
  2. Install the LPA workflow
./cloudgene gh seppinho/lpa-workflow

3a) Start the local web service and run

./cloudgene server

Open your web browser and enter http://localhost:8082. Use admin and admin1978 to log in.

3b) Run on the command line

./cloudgene run seppinho-lpa-workflow --input <bam-folder> --archive <fasta file> --annotateBase <annotation file> --annotateRegion <region file>
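Since the command-line call takes four paths, it can help to check them before starting a run. The sketch below is an illustration under stated assumptions, not part of Cloudgene or the workflow: the function names `check_inputs` and `run_lpa` are hypothetical, and it assumes that the input folder holds files ending in `.bam`. The `./cloudgene run` call itself uses only the flags shown above.

```shell
# Hypothetical pre-flight check for the four pipeline inputs.
# Arguments: <bam-folder> <fasta file> <annotation file> <region file>
check_inputs() {
  bam_dir="$1"; fasta="$2"; base_file="$3"; region_file="$4"
  [ -d "$bam_dir" ] || { echo "missing BAM folder: $bam_dir" >&2; return 1; }
  # Assumption: input files carry the .bam extension; require at least one.
  ls "$bam_dir"/*.bam >/dev/null 2>&1 || { echo "no BAM files in: $bam_dir" >&2; return 1; }
  # The reference and both annotation inputs must be regular files.
  for f in "$fasta" "$base_file" "$region_file"; do
    [ -f "$f" ] || { echo "missing file: $f" >&2; return 1; }
  done
}

# Validate the inputs first, then start the workflow from the cloudgene folder.
run_lpa() {
  check_inputs "$@" || return 1
  ./cloudgene run seppinho-lpa-workflow --input "$1" --archive "$2" \
    --annotateBase "$3" --annotateRegion "$4"
}
```

Calling `run_lpa <bam-folder> <fasta file> <annotation file> <region file>` stops with a message on the first missing input and otherwise starts the run.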

Test Data

./cloudgene run seppinho-lpa-workflow --input <bam-folder-withAK-Sample> --archive kiv2_6.fasta --annotateBase typeb_annotation.csv --annotateRegion maplocus_v3.txt

Download 1000G Paper Data

The script to download data from 1000 Genomes can be found here.

Contact

Please contact Stefan Coassin and Sebastian Schoenherr in case of problems.