Code Used To Call Polymorphic TEs between B73 and the remaining 25 Nested Association Mapping Lines

mam737/PolymorphicTEs_NAM

Combined analysis of transposable elements and structural variation in maize genomes reveals genome contraction outpaces expansion

This repository contains the scripts used to identify polymorphic TEs between B73 and the remaining 25 inbred founder lines of the maize Nested Association Mapping (NAM) population. All relevant datasets can be found at the Data Repository for U of M at xxxxxx.

Subfolders within the scripts folder correspond to specific steps of the analysis.

  1. proccess_EDTA_TEAnnotations - Scripts used to filter and quality-check the raw panEDTA TE annotations generated in Ou et al. (bioRxiv 2022). These TE annotations are publicly available on MaizeGDB.

    • EDTA_gff_to_bed.sh: Shell script to convert GFF files to BED files
    • process_initial_EDTA_bed.R: Remove annotations for non-TE features, Helitrons, and specific features of structurally annotated LTRs
    • filter_problematic_TEAnnotation_overlaps.R: Remove overlapping TE annotations that are biologically implausible so that they are excluded from the TE annotation file
  2. process_AnchorWave_gvcfs - Scripts used to parse AnchorWave GVCF pairwise alignments into alignable, structural variant, and unalignable sequence

    • parse_AnchorWave_gvcfs.R: Parse AnchorWave GVCF pairwise alignments into nonvariant, SNP, InDel (<50 bp), and structural variant (>50 bp) sequence
    • generated_summarisedAW.R: Using the parsed AnchorWave outputs, bin regions into alignable (nonvariant, SNP, and InDel), structural variant (>50 bp in one genome, 0 bp in the other genome), or unalignable sequence
  3. classify_feature_annotations - Scripts used to classify features by intersecting annotations with summarised AnchorWave alignments

    • classify_polymorphic_TEAnnotations.R: Classify TE Annotations
    • classify_polymorphic_geneAnnotations.R: Classify Gene Annotations (both exon-only and full-length)
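
The three steps above can be illustrated with a minimal shell sketch. It is not the repository's code (most of the actual scripts are written in R) and only shows the general idea: converting GFF coordinates to BED, labelling a variant by size, and measuring how much of a feature overlaps each alignment class. All function names, the class labels, and the column layouts are hypothetical. Assigning a size difference of exactly 50 bp to the structural variant class is an assumption, since the descriptions above only give <50 bp and >50 bp. Real GVCF records with symbolic alleles such as `<NON_REF>` are not handled.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: function names and file layouts are hypothetical,
# not taken from the repository's scripts.

# Step 1: GFF (1-based, inclusive) -> BED (0-based, half-open).
# Output columns: chrom, start, end, GFF attributes, score placeholder, strand.
gff_to_bed() {
  awk 'BEGIN { FS = OFS = "\t" }
       /^#/ { next }                      # skip comment and header lines
       NF >= 9 { print $1, $4 - 1, $5, $9, ".", $7 }'
}

# Step 2: label one REF/ALT allele pair by size.
# Assumptions: literal alleles only; equal-length substitutions are lumped
# with SNPs; a length difference of exactly 50 bp counts as structural variant.
classify_variant() {
  local ref=$1 alt=$2
  local diff=$(( ${#ref} - ${#alt} ))
  if [ "$diff" -lt 0 ]; then diff=$(( 0 - diff )); fi
  if [ "$ref" = "$alt" ]; then echo nonvariant
  elif [ "$diff" -eq 0 ]; then echo SNP
  elif [ "$diff" -lt 50 ]; then echo InDel
  else echo structural_variant
  fi
}

# Step 3: report overlapping base pairs between each feature and each class.
# $1: regions BED (chrom, start, end, class)
# $2: features BED (chrom, start, end, name)
overlap_by_class() {
  awk 'BEGIN { FS = OFS = "\t" }
       NR == FNR { n++; rc[n] = $1; rs[n] = $2; re[n] = $3; rk[n] = $4; next }
       {
         for (i = 1; i <= n; i++) {
           if (rc[i] != $1) continue
           s = ($2 > rs[i]) ? $2 : rs[i]  # larger of the two starts
           e = ($3 < re[i]) ? $3 : re[i]  # smaller of the two ends
           if (e > s) print $4, rk[i], e - s
         }
       }' "$1" "$2"
}

# Usage with toy data.
printf 'chr1\tEDTA\tLTR\t1001\t2000\t.\t+\t.\tID=TE_1\n' | gff_to_bed
classify_variant A ACGT

regions=$(mktemp); features=$(mktemp)
printf 'chr1\t0\t1000\talignable\nchr1\t1000\t3000\tstructural_variant\n' > "$regions"
printf 'chr1\t900\t1500\tTE_1\n' > "$features"
overlap_by_class "$regions" "$features"
rm -f "$regions" "$features"
```

In the first usage line, the GFF start of 1001 becomes a BED start of 1000 while the end stays at 2000, which is the usual off-by-one difference between the two formats. The nested loop in `overlap_by_class` is quadratic and only suitable for toy inputs; a dedicated interval tool would be the practical choice for genome-scale files.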
