Creation of circos figure showing duplication overlap with effectors/HGT
Circos plot to represent TE, TD, HGT, and effector predictions
#/work/GIF/remkv6/Baum/CamTechGenomeComparison/58_Renamatorium/31_Fig3Circos
#Copied template karyotype.conf, ideogram.conf, bands.conf, and ticks.conf from Globodera rostochiensis synteny study
cp ../../26_GloboderaSynteny/otherGloboderaGenomes/rostochiensis/circos/* .
#remade the karyotype file so it is only SCN
bioawk -c fastx '{print $name,length($seq)}' ../1_genomeNgff/genome738sl.polished.mitoFixed.fa |sort -k2,2nr |head -n 20 |awk '{print "chr","-",$1,$1,"0",$2}' >Top20Scaf.kary
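A Circos karyotype line normally has the form `chr - ID LABEL START END COLOR`, while the command above writes six fields, so a color column may need to be appended before Circos will accept the file. The sketch below reproduces the same length-sort-format pipeline on a toy FASTA with plain awk instead of bioawk; the scaffold names and the `grey` color are placeholders, not values from this project.

```shell
# Toy FASTA standing in for the genome (hypothetical scaffold names).
tmp=$(mktemp -d)
cat > "$tmp/toy.fa" <<'EOF'
>scafA
ACGTACGTAC
>scafB
ACGT
ACGTAC
GG
>scafC
ACG
EOF

# Sum the sequence length per record, sort longest first, keep the top 2,
# and write karyotype lines with a placeholder color as the seventh field.
awk '/^>/ {if (name) print name, len; name=substr($1,2); len=0; next}
     {len+=length($0)}
     END {if (name) print name, len}' "$tmp/toy.fa" |
  sort -k2,2nr | head -n 2 |
  awk '{print "chr","-",$1,$1,"0",$2,"grey"}' > "$tmp/toy.kary"
cat "$tmp/toy.kary"
```

In these lines the scaffold ID is the third field, which is the column any later step has to read when looping over scaffolds.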
#need to make a histogram or plot points. I am going with histogram first to get a feel for the presentation
awk '{print $3}' Top20Scaf.kary |while read line; do echo "awk '\$1==\""$line"\" && \$3==\"gene\"' ../1_genomeNgff/fixed.augustus.gff3 >>Genesoftop20scaffolds.gff" ;done >GetGenesoftop20scaffolds.sh
sh GetGenesoftop20scaffolds.sh
less Genesoftop20scaffolds.gff|awk '{print $1"\t"$4"\t"$5"\t100"}' >Genes.histo
awk '{print $3}' Top20Scaf.kary |while read line; do echo "awk '\$1==\""$line"\" ' ../27_TandemRedo/ >>Genesoftop20scaffolds.gff" ;done >GetGenesoftop20scaffolds.sh
#/work/GIF/remkv6/Baum/CamTechGenomeComparison/58_Renamatorium/30_BenNewHGT
less HighConfHGT.list |sed 's/\.t/\t/g'|cut -f 1 |sort|uniq|grep -f - <(awk '$3=="gene"' ../1_genomeNgff/fixed.augustus.gff3|sed 's/;/\t/g') >HighConfHGT.gene.gff
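One thing to watch in the step above: plain `grep -f` treats each gene ID as a substring pattern, so an ID such as `g1` would also pull in `g10`. A minimal sketch of a stricter match with `grep -w -F` on toy data follows; the file names and gene IDs here are invented for illustration, and the transcript suffix is assumed to look like `.t1`.

```shell
# Toy transcript list and gene GFF3 (hypothetical IDs and coordinates).
tmp=$(mktemp -d)
printf 'g1.t1\ng1.t2\n' > "$tmp/toy.list"
{
  printf 'scafA\tAUGUSTUS\tgene\t100\t900\t.\t+\t.\tID=g1;\n'
  printf 'scafA\tAUGUSTUS\tgene\t2000\t2500\t.\t-\t.\tID=g10;\n'
  printf 'scafB\tAUGUSTUS\tgene\t50\t400\t.\t+\t.\tID=g2;\n'
} > "$tmp/toy.gff3"

# Strip the transcript suffix to get unique gene IDs.
sed 's/\.t[0-9]*$//' "$tmp/toy.list" | sort -u > "$tmp/gene.ids"

# -F: fixed strings, -w: whole-word match, so g1 does not match g10.
awk '$3=="gene"' "$tmp/toy.gff3" | grep -w -F -f "$tmp/gene.ids" > "$tmp/toy.hgt.gff"
cat "$tmp/toy.hgt.gff"
```

Only the `ID=g1;` record survives here; with substring matching the `g10` record would be kept as well.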
#/work/GIF/remkv6/Baum/CamTechGenomeComparison/58_Renamatorium/31_Fig3Circos
awk '{print $3}' Top20Scaf.kary |while read line; do echo "awk '\$1==\""$line"\"' HighConfHGT.gene.gff >>HGTGenesoftop20scaffolds.gff" ;done >HGTGetGenesoftop20scaffolds.sh
sh HGTGetGenesoftop20scaffolds.sh
less HGTGenesoftop20scaffolds.gff|awk '{print $1"\t"$4"\t"$5"\t100"}' >HGTGenes.histo
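The generate-a-script, run-it, then reformat pattern used for each track can also be collapsed into a single awk pass that reads the scaffold IDs from the karyotype file and writes the four-column track file directly. This is only a sketch of that alternative on toy data, not what was run here; the file names are hypothetical, and the fixed value of 100 is kept from the commands above.

```shell
# Toy karyotype (ID in field 3) and toy tab-delimited GFF.
tmp=$(mktemp -d)
printf 'chr - scafA scafA 0 5000\nchr - scafB scafB 0 3000\n' > "$tmp/toy.kary"
{
  printf 'scafA\tAUGUSTUS\tgene\t100\t900\t.\t+\t.\tID=g1;\n'
  printf 'scafC\tAUGUSTUS\tgene\t10\t20\t.\t+\t.\tID=g7;\n'
  printf 'scafB\tAUGUSTUS\tgene\t50\t400\t.\t-\t.\tID=g2;\n'
} > "$tmp/toy.gff"

# First file: remember each karyotype ID. Second file: keep features on
# those scaffolds and print scaffold, start, end, and a constant value.
awk -F'\t' 'NR==FNR {split($0, k, " "); keep[k[3]]; next}
            ($1 in keep) {print $1"\t"$4"\t"$5"\t100"}' \
    "$tmp/toy.kary" "$tmp/toy.gff" > "$tmp/toy.histo"
cat "$tmp/toy.histo"
```

Because nothing is appended with `>>`, rerunning the command does not duplicate records in the output file.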
ln -s ../26_ExpressionSets/AllPredictedEffectors.list
less AllPredictedEffectors.list |grep -f - <(awk '$3=="gene"' Genesoftop20scaffolds.gff|sed 's/;/\t/g') >AllPredictedEffectorsoftop20scaffolds.gff
less AllPredictedEffectorsoftop20scaffolds.gff|awk '{print $1"\t"$4"\t"$5"\t100"}' >AllPredictedEffectors.histo
ln -s ../23_LTR_finder/LtrRetroelementClassified.gff
awk '{print $3}' Top20Scaf.kary |while read line; do echo "awk '\$1==\""$line"\"' LtrRetroelementClassified.gff >>LTRsoftop20scaffolds.gff" ;done >LTRsoftop20scaffolds.sh
sh LTRsoftop20scaffolds.sh
less LTRsoftop20scaffolds.gff|awk '{print $1"\t"$4"\t"$5"\t100"}' >LTRsoftop20scaffolds.histo
ln -s ../24_IRF_DNATrans/SupportedIRFMergeClassified.gff
awk '{print $3}' Top20Scaf.kary |while read line; do echo "awk '\$1==\""$line"\"' SupportedIRFMergeClassified.gff >>TIRsoftop20scaffolds.gff" ;done >TIRsoftop20scaffolds.sh
sh TIRsoftop20scaffolds.sh
less TIRsoftop20scaffolds.gff|awk '{print $1"\t"$4"\t"$5"\t100"}' >TIRsoftop20scaffolds.histo
ln -s ../2_repeats/genome738sl.polished.mitoFixed.fa.out.gff
awk '{print $3}' Top20Scaf.kary |while read line; do echo "awk '\$1==\""$line"\"' genome738sl.polished.mitoFixed.fa.out.gff >>Repeatsoftop20scaffolds.gff" ;done >Repeatsoftop20scaffolds.sh
sh Repeatsoftop20scaffolds.sh
less Repeatsoftop20scaffolds.gff|awk '{print $1"\t"$4"\t"$5"\t100"}' >Repeatsoftop20scaffolds.histo
ln -s ../27_TandemRedo/TotalSections.gff
awk '{print $3}' Top20Scaf.kary |while read line; do echo "awk '\$1==\""$line"\"' TotalSections.gff >>TDRoftop20scaffolds.gff" ;done >TDRoftop20scaffolds.sh
sh TDRoftop20scaffolds.sh
less TDRoftop20scaffolds.gff|awk '{print $1"\t"$4"\t"$5"\t100"}' >TDRoftop20scaffolds.histo
These needed to be changed to heatmaps, and were not very interesting for the 20 largest scaffolds. Decided to look instead at the scaffolds with the greatest numbers of duplications.
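For a heatmap track, one option is to replace the constant value of 100 with a per-window feature count. The sketch below bins toy features into fixed windows by start coordinate; the 1000 bp window size, the file names, and the choice to bin by start position are all assumptions made for illustration.

```shell
# Toy four-column track file: scaffold, start, end, value.
tmp=$(mktemp -d)
{
  printf 'scafA\t100\t900\t100\n'
  printf 'scafA\t1200\t1800\t100\n'
  printf 'scafA\t1900\t2100\t100\n'
  printf 'scafB\t50\t400\t100\n'
} > "$tmp/toy.histo"

win=1000   # hypothetical window size in bp

# Count features per (scaffold, window), keyed on the start coordinate,
# then print scaffold, window start, window end, and the count.
awk -v w="$win" 'BEGIN {OFS="\t"}
     {b = int($2 / w); n[$1 OFS b]++}
     END {for (k in n) {split(k, p, OFS); print p[1], p[2]*w, (p[2]+1)*w-1, n[k]}}' \
    "$tmp/toy.histo" | sort -k1,1 -k2,2n > "$tmp/toy.heat"
cat "$tmp/toy.heat"
```

Windows with no features are simply absent from the output, so sparse scaffolds produce gaps rather than zero-valued cells.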
Figured out what needed to be displayed here: effector/HGT overlap with the duplication tracks. So grabbing the top ten scaffolds from this list and repeating the above analysis.
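One way to rank scaffolds by duplication content is to count the features per scaffold in the duplication GFF and keep the highest counts. This is a sketch under the assumption that the ranking is by number of records; the toy file, feature type, and the cutoff of 2 (standing in for ten) are illustrative only.

```shell
# Toy duplication GFF (hypothetical feature type and IDs).
tmp=$(mktemp -d)
{
  printf 'scafA\ttoy\tduplication\t10\t50\t.\t+\t.\tID=d1\n'
  printf 'scafB\ttoy\tduplication\t5\t90\t.\t+\t.\tID=d2\n'
  printf 'scafA\ttoy\tduplication\t100\t150\t.\t+\t.\tID=d3\n'
  printf 'scafC\ttoy\tduplication\t20\t60\t.\t-\t.\tID=d4\n'
  printf 'scafC\ttoy\tduplication\t200\t260\t.\t-\t.\tID=d5\n'
  printf 'scafA\ttoy\tduplication\t300\t350\t.\t+\t.\tID=d6\n'
} > "$tmp/toy.sections.gff"

# Count records per scaffold, sort by count (descending) with the scaffold
# name as a tie-breaker, and keep the IDs of the top 2.
cut -f1 "$tmp/toy.sections.gff" | sort | uniq -c |
  sort -k1,1nr -k2,2 | head -n 2 | awk '{print $2}' > "$tmp/top.ids"
cat "$tmp/top.ids"
```

The resulting ID list can then take the place of the karyotype ID column when repeating the per-track filtering.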