GitHub - anshess/SimGBS.jl: A simple method to simulate Genotyping-by-Sequencing (GBS) data.

SimGBS: A Julia Package to Simulate Genotyping-by-Sequencing (GBS) Data

Introduction

SimGBS is a versatile method for simulating Genotyping-by-Sequencing (GBS) data. It can be used with any genome of choice. Users can modify parameters to customise the GBS settings, such as the choice of restriction enzyme and the sequencing depth. Because it takes a gene-drop approach, users can also specify the demographic history and define the population structure (by supplying a pedigree file). Like real sequencers, SimGBS writes its output in FASTQ format.
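To make the gene-drop idea concrete, here is a minimal sketch in plain Julia. It is not SimGBS.jl's API or implementation: the function name `gene_drop`, the `(id, sire, dam)` row layout and the use of `0` for an unknown parent are all assumptions made for illustration. Founders receive uniquely labelled alleles, and each offspring inherits one randomly chosen allele from each parent.

```julia
using Random

# Illustrative sketch only; NOT SimGBS.jl's API.
# Each pedigree row is (id, sire, dam); 0 marks an unknown parent (assumed convention).
# Parents must appear before their offspring.
function gene_drop(pedigree::Vector{NTuple{3,Int}};
                   rng::AbstractRNG = Random.default_rng())
    genotypes = Dict{Int,Tuple{Int,Int}}()
    next_allele = 1
    for (id, sire, dam) in pedigree
        if sire == 0 || dam == 0
            # Founder: assign two new, uniquely labelled alleles
            genotypes[id] = (next_allele, next_allele + 1)
            next_allele += 2
        else
            # Offspring: draw one allele at random from each parent
            genotypes[id] = (rand(rng, genotypes[sire]), rand(rng, genotypes[dam]))
        end
    end
    return genotypes
end

# Two founders (1 and 2), their offspring (3), and a backcross (4)
ped = [(1, 0, 0), (2, 0, 0), (3, 1, 2), (4, 1, 3)]
geno = gene_drop(ped; rng = MersenneTwister(1))
```

Tracing founder allele labels through the pedigree in this way is what lets a pedigree file determine the relatedness among simulated individuals.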

Installation

SimGBS.jl is registered in the General registry. It can be installed using Pkg.add:

julia> import Pkg; Pkg.add("SimGBS")

or from the Pkg REPL mode (press ] at the julia> prompt):

julia> ]
pkg> add SimGBS

Input

Reference genome of the target species in FASTA format (e.g., xxx.fasta.gz or xxx.fa.gz)
A list of Illumina barcodes (e.g., GBS_Barcodes.txt)
(optional) Pedigree file (e.g., small.ped)

Output

GBS fragments generated by virtual digestion (e.g., rawGBStags.txt)
Selected GBS fragments after fragment size selection (e.g., GBStags.txt)
Haplotypes, SNP genotypes and QTL genotypes (e.g., hap.txt, snpGeno.txt and qtlGeno.txt)
Basic information about the simulated GBS experiment (e.g., keyFile.txt)
Simulated GBS reads in FASTQ format (e.g., xxxxx.fastq)

etc.
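The first two outputs come from virtual digestion followed by fragment size selection. The sketch below illustrates both steps on a toy sequence in plain Julia. It is not SimGBS.jl's API or implementation: `digest` and `size_select` are hypothetical helper names, and the size limits are arbitrary. The enzyme used is PstI, which recognises CTGCAG and cuts after the A (CTGCA^G).

```julia
# Illustrative sketch only; NOT SimGBS.jl's API.

# Cut `seq` at every occurrence of `site`. `cut` is the number of bases of the
# recognition site that remain on the left-hand fragment.
function digest(seq::AbstractString, site::AbstractString, cut::Int)
    fragments = String[]
    start = 1
    pos = findnext(site, seq, 1)          # index range of the next site, or nothing
    while pos !== nothing
        stop = first(pos) + cut - 1       # last base of the left-hand fragment
        push!(fragments, seq[start:stop])
        start = stop + 1
        pos = findnext(site, seq, first(pos) + 1)
    end
    push!(fragments, seq[start:end])      # remainder after the last cut
    return fragments
end

# Keep only fragments whose length lies within [lower, upper]
size_select(fragments, lower::Int, upper::Int) =
    filter(f -> lower <= length(f) <= upper, fragments)

seq = "AACTGCAGTTTTCTGCAGGG"
fragments = digest(seq, "CTGCAG", 5)      # PstI: CTGCA^G
selected = size_select(fragments, 5, 8)   # arbitrary toy size window
```

Here `fragments` corresponds conceptually to the raw digestion output and `selected` to the size-selected fragments. A real digest would also need to handle both strands, multiple chromosomes and ambiguous bases, which this sketch ignores.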

Overview

For more information, please visit the documentation page.

Citation

Please cite the following if you use SimGBS.jl:

Hess, A. S., M. K. Hess, K. G. Dodds, J. C. McEwan, S. M. Clarke, and S. J. Rowe. "A method to simulate low-depth genotyping-by-sequencing data for testing genomic analyses." Proc 11th World Congr Genet Appl to Livest Prod 385 (2018).

What's Next?

The following tools are recommended for downstream analyses of GBS data:

snpGBS: a simple bioinformatics workflow to identify single nucleotide polymorphisms (SNPs) from Genotyping-by-Sequencing (GBS) data.
KGD: R code for the analysis of genotyping-by-sequencing (GBS) data, primarily to construct a genomic relationship matrix for the genotyped individuals.
GUSLD: an R package for estimating linkage disequilibrium using low and/or high coverage sequencing data without requiring filtering with respect to read depth.
SMAP: a software package that analyzes read mapping distributions and performs haplotype calling to create multi-allelic molecular markers.
