Adephaga_UCE README

title	author	reviewed by	date
Adephaga_UCE	CR Cardenas	Dr. Jeremy Gauthier	2023.09.26

Tasks

double check synteny workflow
add R markdown/script files
check grammar in markdown files

UCE characterization and concatenation for phylogenomic analysis of Adephaga

The goal is to map probes to genomes in order to determine the total overlap of all probes used in the Adephaga data, both for concatenating/merging multiple UCEs on the same gene (as in Van Dam et al. 2021) and for UCE characterization. In general, users will have only one probe set. Because I am integrating anchored hybrid enrichment data, Adephaga UCE probes, and the original Coleoptera UCE probes, I need to create a merged dataset (named joined).
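As a rough illustration of what "total overlap" between probe sets means here, the sketch below merges overlapping probe hits on the same contig and records which probe sets contributed to each merged region. It is not the code used in this repository: the function name, the tuple layout, the probe-set labels, and the 0-based half-open coordinates are all assumptions made for the example.

```python
from collections import defaultdict


def merge_probe_intervals(probes):
    """Merge overlapping probe hits per contig.

    probes: iterable of (contig, start, end, probe_set) tuples using
    0-based, half-open coordinates (an assumption for this sketch).
    Returns a list of (contig, start, end, sorted probe-set names).
    """
    by_contig = defaultdict(list)
    for contig, start, end, probe_set in probes:
        by_contig[contig].append((start, end, probe_set))

    merged = []
    for contig in sorted(by_contig):
        current = None  # [start, end, {probe sets}] of the region being built
        for start, end, probe_set in sorted(by_contig[contig]):
            if current is not None and start < current[1]:
                # Hit overlaps the open region: extend it and note the probe set.
                current[1] = max(current[1], end)
                current[2].add(probe_set)
            else:
                if current is not None:
                    merged.append((contig, current[0], current[1], sorted(current[2])))
                current = [start, end, {probe_set}]
        if current is not None:
            merged.append((contig, current[0], current[1], sorted(current[2])))
    return merged


# Hypothetical mapped probe hits from three probe sets.
hits = [
    ("scaffold_1", 100, 220, "adephaga_uce"),
    ("scaffold_1", 180, 300, "coleoptera_uce"),
    ("scaffold_1", 900, 1020, "ahe"),
    ("scaffold_2", 50, 170, "adephaga_uce"),
]
joined = merge_probe_intervals(hits)
for region in joined:
    print(region)
```

Regions that list more than one probe set are places where different probe sets target the same locus. In practice the same merge would more likely be done on BED files with bedtools.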

Two important assumptions being made about the genome and genes being used:

Intergenic sequences close to genes are not considered promoters, mainly because this information is unknown.
There is no alternative splicing and there are no overlapping genes.

Goals of this workflow:

identify which probes are genic or intergenic
identify the overlap/intersect of probesets used between datasets
create a new probeset that integrates all probes for use in Phyluce
create a list of probes that should be concatenated in a partition file for phylogenetic analysis
1. create a script to integrate it based on an existing partition file (e.g., output of Phyluce)
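The first and fourth goals can be sketched together: label each probe as genic or intergenic by coordinate overlap with gene annotations, then list the probes that fall on the same gene as candidates for concatenation. This is a minimal sketch under stated assumptions, not the repository's scripts: the function names, dictionary layout, probe and gene identifiers, and 0-based half-open coordinates are hypothetical. It relies on the assumption stated above that genes do not overlap.

```python
from collections import defaultdict


def classify_probes(probes, genes):
    """Label each probe with the gene it overlaps, or None if intergenic.

    probes: {probe_name: (contig, start, end)}
    genes:  {gene_id: (contig, start, end)}
    Coordinates are 0-based, half-open (an assumption for this sketch).
    """
    labels = {}
    for name, (p_contig, p_start, p_end) in probes.items():
        labels[name] = None
        for gene_id, (g_contig, g_start, g_end) in genes.items():
            if p_contig == g_contig and p_start < g_end and g_start < p_end:
                labels[name] = gene_id
                break  # genes are assumed not to overlap, so one hit is enough
    return labels


def loci_to_concatenate(labels):
    """Group genic probes by gene; keep genes hit by more than one probe."""
    by_gene = defaultdict(list)
    for probe, gene in labels.items():
        if gene is not None:
            by_gene[gene].append(probe)
    return {gene: sorted(names) for gene, names in by_gene.items() if len(names) > 1}


# Hypothetical probes and one hypothetical gene annotation.
probes = {
    "uce-1": ("scaffold_1", 100, 220),
    "uce-2": ("scaffold_1", 400, 520),
    "uce-3": ("scaffold_1", 5000, 5120),
}
genes = {"geneA": ("scaffold_1", 50, 600)}

labels = classify_probes(probes, genes)
groups = loci_to_concatenate(labels)
print(labels)
print(groups)
```

The nested loop is fine for a small example; for whole genomes an interval-based tool such as bedtools is the more practical choice.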

Software and packages used

!!! create a json file with the environment for download and easy duplication of the environment

Recommended installation procedure:

conda create --name characterization
conda activate characterization
conda install -c bioconda bedtools blast bwa samtools seqkit 
conda install -c anaconda natsort
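
After installation, a quick check that the tools are visible in the active environment can save debugging time later. The snippet below is only a convenience sketch and not part of the repository; the executable names (for example `blastn` as a stand-in for the BLAST+ package) are assumptions about what the conda packages put on the PATH.

```python
import importlib.util
import shutil

# Executable names assumed to be provided by the conda packages above.
REQUIRED_TOOLS = ["bedtools", "blastn", "bwa", "samtools", "seqkit"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from the list that are not found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def natsort_available():
    """Return True if the natsort Python package can be imported."""
    return importlib.util.find_spec("natsort") is not None


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All command-line tools found.")
print("natsort importable:", natsort_available())
```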

This pipeline should run on a personal laptop with sufficient storage, memory, and CPU available. It has been used on Linux and on Windows with the Linux subsystem; macOS is uncertain.

All scripts are found in the scripts directory.

To follow this workflow, for now see the complete markdown files, or follow the step-by-step workflow:

Additional Adephaga UCE scripts

An additional downstream result of this workflow is testing the effect of three treatments of the sequence data (flanking plus core probe regions, flanking regions only, and core probe regions only) on phylogenetic analysis inside Phyluce. Initial analyses showed very gappy alignments, likely due to the depth of the evolutionary relationships and the integration of a diverse set of genomic data collection methods.
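
To make the three treatments concrete, here is a minimal sketch that splits one locus sequence into its core probe region and flanks. It is not the slicing script from this repository: the function name, the 0-based half-open core coordinates, and the choice to join the left and right flanks into a single "flanking" sequence are assumptions for illustration.

```python
def slice_locus(sequence, core_start, core_end):
    """Split a UCE locus into the three treatments compared in the analysis.

    core_start and core_end mark the core probe region using 0-based,
    half-open coordinates (an assumption for this sketch).
    """
    if not 0 <= core_start <= core_end <= len(sequence):
        raise ValueError("core coordinates fall outside the sequence")
    left_flank = sequence[:core_start]
    core = sequence[core_start:core_end]
    right_flank = sequence[core_end:]
    return {
        "flanking+core": sequence,
        "core": core,
        # Assumption: both flanks are joined into one sequence.
        "flanking": left_flank + right_flank,
    }


# Hypothetical 12 bp locus whose core probe region spans positions 4 to 8.
slices = slice_locus("AAAACCCCGGGG", 4, 8)
print(slices)
```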

The following markdown file contains the scripts necessary to "slice" flanking regions from UCE data.

Slice flanking from core probe region
Use synteny for a comparison of probe targets between genomes.
Identify feature changes of probes between genomes (in prep)
