zhuochenbioinfo/HFI

HFI

Haplotype Fixation Index (HFI) for crop populations with homozygous nature, such as rice.

NOTICE: The formulas and pictures in this page may not be displayed properly in some regions due to local Internet policies.

By Zhuo CHEN, contact: chenomics@163.com or zhuochen@fafu.edu.cn

Motivation and description:

Given the homozygous nature of cultivated rice, I designed a haplotype-based estimate, HFI, for genetic differentiation analysis. HFI was inspired by Weir and Cockerham's fixation index FST, but with several major changes.

Artificial hybrid breeding has shuffled the distribution of haplotypes in crop population genomes, so the number and frequency of the different haplotypes in a population may be more important than the base differences between them.

The motivation for designing this method was to assess changes in the absolute value of haplotype diversity rather than the fold-change, while ignoring the number of base differences between haplotypes.

HFI is based on two other haplotype-based estimates, namely hapDiv (haplotype diversity) and hapDist (haplotype distance).

[Equation image: hapDiv]

where n is the number of haplotypes in the window; xi and xj are the frequencies of haplotypes i and j; dij is the genetic distance between haplotypes i and j. If there is any clear base difference between the two haplotypes (sites with a missing or heterozygous genotype are excluded), dij is set to 1; otherwise, it is set to 0.
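The rule for dij can be sketched in a few lines of Python. This is an illustration only, not code taken from HFI.pl: the function name `hap_distance` is hypothetical, and representing a haplotype as a string of `0`, `1` and `-` characters is an assumption based on the genotype coding given under Usage.

```python
def hap_distance(hap_a: str, hap_b: str) -> int:
    """Return 1 if two haplotypes show any clear base difference, else 0.

    Sites where either haplotype is '-' (missing or heterozygous)
    are skipped, so they never count as a difference.
    """
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must cover the same sites")
    for a, b in zip(hap_a, hap_b):
        if a == "-" or b == "-":
            continue  # no clear call at this site
        if a != b:
            return 1  # a single clear difference is enough
    return 0


print(hap_distance("0101", "0111"))  # prints 1: the third site differs
print(hap_distance("01-1", "0101"))  # prints 0: the only mismatch is a missing call
```

Because the distance is binary, two haplotypes that differ at one site and two that differ at many sites are treated the same way, which matches the stated aim of ignoring the number of base differences.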

[Equation image: hapDist]

where the definitions of n, x and d are the same as in the equation for hapDiv.

Then:

[Equation image: HFI]

Usage:

Typical usage:

perl HFI.pl --in pop.geno --out out.hfi --list1 pop1.list --list2 pop2.list

For detailed usage:

perl HFI.pl

Input format: tab-delimited table with header. Each line contains: chr pos geno1 geno2 ... genoX

This program was designed only for populations with homozygous nature. The genotype coding in the input file is:

0 for reference type; 1 for alternative type; - for missing or heterozygous genotype.
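To make the input layout concrete, here is a minimal Python sketch that reads such a table and collects one genotype string per sample. It is not part of the HFI distribution. The function name `read_geno`, the assumption that the header carries sample names from the third column onward, and the two-sample example data are all hypothetical.

```python
import csv
import io


def read_geno(handle):
    """Parse a tab-delimited genotype table.

    Returns a list of (chr, pos) tuples and a dict mapping each
    sample name to its genotype string over all sites.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[2:]  # assumed: columns after chr and pos are samples
    sites = []
    calls = {name: [] for name in samples}
    for row in reader:
        if not row:
            continue  # skip blank lines
        chrom, pos, genos = row[0], int(row[1]), row[2:]
        if len(genos) != len(samples):
            raise ValueError(f"wrong column count at {chrom}:{pos}")
        for name, geno in zip(samples, genos):
            if geno not in ("0", "1", "-"):
                raise ValueError(f"unexpected genotype {geno!r} at {chrom}:{pos}")
            calls[name].append(geno)
        sites.append((chrom, pos))
    return sites, {name: "".join(g) for name, g in calls.items()}


# Invented example data: two sites, two samples.
example = "chr\tpos\ts1\ts2\nChr1\t100\t0\t1\nChr1\t250\t-\t1\n"
sites, haps = read_geno(io.StringIO(example))
print(sites)
print(haps)
```

Rejecting any value other than `0`, `1` or `-` is a design choice of this sketch; how HFI.pl itself treats unexpected values is not stated here.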

It is recommended to create the input file from a VCF file and prune it with the script SNP_pruning.r2.pl:

perl SNP_pruning.r2.pl --in pop.vcf --out pop.geno
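For readers who want to see how VCF genotypes relate to the 0 / 1 / - coding, the following Python sketch maps a single diploid GT value to that coding. It illustrates the coding only and is not a substitute for SNP_pruning.r2.pl, whose pruning criteria are not described here. The function name `encode_gt` is hypothetical, and treating any allele other than 0 or 1 as `-` is an assumption.

```python
def encode_gt(gt: str) -> str:
    """Map a diploid VCF GT value (e.g. '0/0', '1|1') to 0, 1 or -."""
    alleles = gt.replace("|", "/").split("/")
    if len(alleles) == 2 and alleles[0] == alleles[1]:
        if alleles[0] == "0":
            return "0"  # homozygous reference
        if alleles[0] == "1":
            return "1"  # homozygous alternative
    # heterozygous, missing ('./.'), or anything else (assumed)
    return "-"


for gt in ("0/0", "1|1", "0/1", "./."):
    print(gt, "->", encode_gt(gt))
```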

Reference:

Weir BS, Cockerham CC. Estimating F-statistics for the analysis of population structure. Evolution. 1984;38(6):1358–1370. doi:10.1111/j.1558-5646.1984.tb05657.x

Citation:

Zhuo Chen, Xiuxiu Li, Hongwei Lu, Qiang Gao, Huilong Du, Hua Peng, Peng Qin, Chengzhi Liang. Genomic atlases of introgression and differentiation reveal breeding footprints in Chinese cultivated rice. Journal of Genetics and Genomics, 2020, ISSN 1673-8527, https://doi.org/10.1016/j.jgg.2020.10.006.
