# <span style="color:#ff1414"> BEDtools analysis. </span>

This is a script to answer research questions outlined elsewhere. In summary, this script:

1. compares methylation results between different methylation-callers, and between different methylation sequencing methods.

2. compares methylation between genes and non-gene regions

3. compares methylation between transposons and non-repetitive regions

4. compares transposons and genes


Note:
- PB/pb = PacBio
- ONT/ont = Oxford Nanopore Technology
- NP = Nanopolish

In [31]:
import pybedtools
from pybedtools import BedTool
import os
import glob
import pprint
import numpy # need for p-value stats
import scipy

In [32]:
#First we need to define the base dirs
DIRS ={}
DIRS['BASE1'] = '/home/anjuni/methylation_calling/pacbio'
DIRS['BASE2'] = '/home/anjuni/analysis'
DIRS['BED_INPUT'] = os.path.join(DIRS['BASE2'], 'bedtools_output', 'sequencing_comparison')
DIRS['GFF_INPUT'] = os.path.join(DIRS['BASE2'], 'gff_output')
DIRS['WINDOW_OUTPUT'] = os.path.join(DIRS['BASE2'], 'windows')
DIRS['WINDOW_INPUT'] = os.path.join(DIRS['BASE2'], 'input_for_windows')
DIRS['REF'] = '/home/anjuni/Pst_104_v13_assembly/'

In [33]:
#Quick check that the directories exist
for value in DIRS.values():
    if not os.path.exists(value):
        print('%s does not exist' % value)

In [34]:
#Make filepaths
bed_file_list = [fn for fn in glob.iglob('%s/*.bed' % DIRS['BED_INPUT'], recursive=True)]
gff_file_list = [fn for fn in glob.iglob('%s/*anno.gff3' % DIRS['GFF_INPUT'], recursive=True)]
te_file_list = [fn for fn in glob.iglob('%s/*.gff' % DIRS['GFF_INPUT'], recursive=True)]

In [35]:
#Check that the list works
print(*bed_file_list, sep='\n')
print(*gff_file_list, sep='\n')
print(*te_file_list, sep='\n')

/home/anjuni/analysis/bedtools_output/sequencing_comparison/5mC_CpG_tombo_np.bed
/home/anjuni/analysis/bedtools_output/sequencing_comparison/5mC_tombo_np.bed
/home/anjuni/analysis/bedtools_output/sequencing_comparison/6mA_pb_ont.bed
/home/anjuni/analysis/bedtools_output/sequencing_comparison/5mC_hc_tombo_sorted.bed
/home/anjuni/analysis/bedtools_output/sequencing_comparison/5mC_CpG_np_tombo.bed
/home/anjuni/analysis/bedtools_output/sequencing_comparison/6mA_ont_pb.bed
/home/anjuni/analysis/bedtools_output/sequencing_comparison/5mC_np_tombo.bed
/home/anjuni/analysis/gff_output/Pst_104E_v13_p_ctg_combined_sorted_anno.gff3
/home/anjuni/analysis/gff_output/Pst_104E_v13_h_ctg_combined_sorted_anno.gff3
/home/anjuni/analysis/gff_output/Pst_104E_v13_h_ctg.REPET.sorted.filtered.superfamily.gff
/home/anjuni/analysis/gff_output/Pst_104E_v13_p_ctg.REPET.sorted.filtered.superfamily.gff
/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.REPET.sorted.filtered.superfamily.gff


## <span style='color:deeppink'> 1. Comparing methylation sequencing methods </span>

In [8]:
%%bash

# find overlap between 6mA from PacBio and Nanopore for 6mA data

pb=/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/6mA_prob_smrtlink_sorted.bed # use basecall accuracy instead of Phred score
ont=/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/6mA_hc_tombo_sorted.bed # use sites with non-zero methylation

out1=/home/anjuni/analysis/bedtools_output/sequencing_comparison/6mA_pb_ont.bed
out2=/home/anjuni/analysis/bedtools_output/sequencing_comparison/6mA_ont_pb.bed

echo $pb
echo $ont

bedtools intersect -a $pb -b $ont > $out1
bedtools intersect -a $ont -b $pb > $out2

/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/6mA_prob_smrtlink_sorted.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/6mA_hc_tombo_sorted.bed


In [12]:
%%bash

#check how many overlapping sites there were

cd /home/anjuni/analysis/bedtools_output/sequencing_comparison/
echo PacBio sites:
less /home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/6mA_prob_smrtlink_sorted.bed | wc -l

echo Overlapping sites:
less 6mA_pb_ont.bed | wc -l

echo Nanopore sites:
less /home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/6mA_hc_tombo_sorted.bed | wc -l

echo Overlapping sites:
less 6mA_ont_pb.bed | wc -l

echo Total adenine sites:
less /home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/6mA_tombo_sorted.bed | wc -l

PacBio sites:
88932
Overlapping sites:
84733
Nanopore sites:
83451878
Overlapping sites:
84733
Total adenine sites:
85779879


In [27]:
# Descriptive Statistics

print('Overlap between PacBio and Nanopore as a proportion of PB sites:', (84733/88932))
# = overlap between pb and ont, divided by total PacBio sites

print('Overlap between PacBio and Nanopore as a proportion of ONT sites:', (84733/83451878))
# = overlap between pb and ont, divided by total Nanopore sites

print('Proportion of adenines methylated:', (84733/85779879))
# = overlapping sites, divided by total number of adenines (taken from the number of lines in the tombo file, as tombo reports all adenines)

Overlap between PacBio and Nanopore as a proportion of PB sites: 0.9527841496874017
Overlap between PacBio and Nanopore as a proportion of ONT sites: 0.0010153516257596982
Proportion of adenines methylated: 0.0009877957510292129


#### <span style='color:deeppink'> Observations </span>

Nanopore and PacBio agree closely when the overlap is measured against PacBio: about 95% of PacBio sites are also found by Nanopore. However, the PacBio sites are only a small fraction of the Tombo sites (about 0.1%), and they include only highly accurate calls.

Overlapping PacBio with all Nanopore (Tombo) sites gave a higher overlap (88932) than overlapping only the non-zero PB and ONT sites (84733). This indicates that PB detected sites that Nanopore did not. These missed sites were high-probability ones, because the PB file only contains sites with >99% basecall accuracy.

More sites also overlap when the zero-probability sites from Tombo are included than when only high-confidence sites from both methods are used, which again suggests that Tombo/Nanopore missed some methylated sites.
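
The two overlap figures discussed above come from dividing the same overlap count by two different totals. A minimal sketch of that arithmetic, using the counts from the `wc -l` output; the helper name `overlap_proportions` is made up for illustration and is not part of the notebook:

```python
def overlap_proportions(n_overlap, n_a, n_b):
    """Return the overlap as a fraction of set A and as a fraction of set B."""
    if n_a <= 0 or n_b <= 0:
        raise ValueError("site counts must be positive")
    return n_overlap / n_a, n_overlap / n_b

# Counts taken from the line counts reported earlier in this section
pb_sites = 88932
ont_sites = 83451878
shared = 84733

frac_pb, frac_ont = overlap_proportions(shared, pb_sites, ont_sites)
print(f"Overlap as a fraction of PacBio sites:   {frac_pb:.4f}")
print(f"Overlap as a fraction of Nanopore sites: {frac_ont:.6f}")
```

The same overlap looks very high or very low depending on the denominator, so it helps to report both fractions side by side.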

## <span style='color:#ff14ff'> 2. Comparing methylation detection methods </span>

In [3]:
%%bash

# compare overlap between Tombo and Nanopolish for 5mC data
np=/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/5mC_hc_nanopolish_sorted.bed
tombo=/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/5mC_hc_tombo_sorted.bed

out1=/home/anjuni/analysis/bedtools_output/sequencing_comparison/5mC_np_tombo.bed
out2=/home/anjuni/analysis/bedtools_output/sequencing_comparison/5mC_tombo_np.bed

echo $np
echo $tombo

bedtools intersect -a $np -b $tombo > $out1
bedtools intersect -a $tombo -b $np > $out2

/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/5mC_hc_nanopolish_sorted.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/5mC_hc_tombo_sorted.bed


In [5]:
%%bash

#check how many overlapping sites there were

cd /home/anjuni/analysis/bedtools_output/sequencing_comparison/

echo Nanopolish sites:
less /home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/5mC_hc_nanopolish_sorted.bed | wc -l

echo Overlapping sites:
less 5mC_np_tombo.bed | wc -l

echo Tombo sites:
less /home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/5mC_hc_tombo_sorted.bed | wc -l

echo Overlapping sites:
less 5mC_tombo_np.bed | wc -l

echo Total cytosine sites:
less /home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/5mC_tombo_sorted.bed | wc -l

echo Total CpG sites:
less /home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/5mC_nanopolish_sorted.bed | wc -l

Nanopolish sites:
3783438
Overlapping sites:
1681653
Tombo sites:
67308386
Overlapping sites:
1681653
Total cytosine sites:
68536018
Total CpG sites:
5302131


In [14]:
# Descriptive Statistics

print('Overlap between Nanopolish and Tombo as a proportion of NP sites:', (1681653/3783438))
# = overlap between np and tombo, divided by total NP sites

print('Overlap between Nanopolish and Tombo as a proportion of Tombo sites:', (1681653/67308386))
# = overlap between np and tombo, divided by total Tombo sites

print('Proportion of cytosines methylated:', (1681653/68536018))
# = overlapping sites, divided by total number of cytosines (taken from the number of lines in the tombo file, as tombo reports all cytosines)

print('Proportion of CpG sites methylated:', (1681653/5302131))
# = overlapping sites, divided by total number of CpG sites (taken from the number of lines in the np file, as np reports all CpG sites)

Overlap between Nanopolish and Tombo as a proportion of NP sites: 0.4444774831779984
Overlap between Nanopolish and Tombo as a proportion of Tombo sites: 0.02498430136179465
Proportion of cytosines methylated: 0.02453677714395371
Proportion of CpG sites methylated: 0.3171654944021564


#### <span style='color:#ff14ff'> Observations </span>
While adenine methylation showed high similarity between ONT and PB, cytosine methylation showed only 44% overlap between NP and Tombo (as a proportion of NP sites). This is likely because NP only reports CpG sites, whereas Tombo reports all cytosine sites. Tombo therefore detects far more potentially methylated sites, including non-CpG ones, and starts with far more sites than NP.

In [24]:
%%bash

#check how many cytosine sites and CpG sites there are

cd /home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/

echo CpG sites:
less 5mC_nanopolish_sorted.bed | wc -l

echo Cytosine sites:
less 5mC_tombo_sorted.bed | wc -l

CpG sites:
5302131
Cytosine sites:
68536018


In [23]:
# Descriptive Statistics

print('Percentage of CpG sites as a proportion of cytosine sites:', (5302131/68536018))
# np sites divided by tombo sites

Percentage of CpG sites as a proportion of cytosine sites: 0.07736269416761271


#### <span style='color:#ff14ff'> Solution </span>
1. Make a file of methylated CpG sites detected by Tombo.
2. Intersect this with methylated (CpG) sites detected by Nanopolish.

Only CpG sites from Tombo and NP are overlapped, because NP only reports CpG sites while Tombo reports all cytosine sites.
The original overlap was therefore not a like-for-like comparison, because Tombo considers sites that NP does not.
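
The two steps above can be sketched with plain Python sets. This is a toy illustration only: the `(contig, position)` tuples are invented, and real files would be intersected with bedtools rather than loaded into memory.

```python
# Toy (contig, position) sites; all values are invented for illustration
all_cpg = {("ctg1", 10), ("ctg1", 25), ("ctg1", 40), ("ctg2", 5)}
tombo_methylated = {("ctg1", 10), ("ctg1", 12), ("ctg1", 25), ("ctg2", 7)}
np_methylated = {("ctg1", 10), ("ctg1", 40), ("ctg2", 5)}

# Step 1: keep only the Tombo calls that fall on CpG sites
tombo_cpg = tombo_methylated & all_cpg

# Step 2: intersect with the Nanopolish calls
shared_before = tombo_methylated & np_methylated
shared_after = tombo_cpg & np_methylated

# The shared sites are the same either way; only the Tombo denominator shrinks
print(len(shared_before), len(shared_after))
print(len(shared_after) / len(tombo_methylated), len(shared_after) / len(tombo_cpg))
```

Because Nanopolish calls are all on CpG sites, restricting Tombo to CpG sites cannot remove any shared site. It only changes the overlap expressed as a fraction of Tombo sites.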

### <span style='color:#ff14ff'> CpG Sites </span>

In [39]:
%%bash

# make a file of CpG sites from tombo
all_cpg=/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/5mC_nanopolish_sorted.bed
all_tombo=/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/5mC_hc_tombo_sorted.bed
tombo_cpg=/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/5mC_hc_tombo_sorted.CpG.bed

bedtools intersect -a $all_cpg -b $all_tombo > $tombo_cpg

In [19]:
%%bash

# intersect Tombo and NP CpG sites
tombo_cpg=/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/5mC_hc_tombo_sorted.CpG.bed
np_cpg=/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/5mC_hc_nanopolish_sorted.bed
m_tombo_cpg=/home/anjuni/analysis/bedtools_output/sequencing_comparison/5mC_CpG_tombo_np.bed
m_np_cpg=/home/anjuni/analysis/bedtools_output/sequencing_comparison/5mC_CpG_np_tombo.bed

bedtools intersect -a $tombo_cpg -b $np_cpg > $m_tombo_cpg
bedtools intersect -a $np_cpg -b $tombo_cpg > $m_np_cpg

In [8]:
%%bash

#check how many overlapping sites there were
cd /home/anjuni/analysis/bedtools_output/sequencing_comparison/

echo Nanopolish methylated CpG sites:
less /home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/5mC_hc_nanopolish_sorted.bed | wc -l

echo Tombo methylated CpG sites:
less /home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/5mC_hc_tombo_sorted.CpG.bed | wc -l

echo Overlapping methylated CpG sites:
less 5mC_CpG_tombo_np.bed | wc -l

echo Total CpG sites:
less /home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/5mC_nanopolish_sorted.bed | wc -l

Nanopolish methylated CpG sites:
3783438
Tombo methylated CpG sites:
2352724
Overlapping methylated CpG sites:
1681653
Total CpG sites:
5302131


In [9]:
# Descriptive Statistics

print('Overlap between Nanopolish and Tombo as a proportion of NP sites:', (1681653/3783438))
# = overlap between np and tombo, divided by total NP sites

print('Overlap between Nanopolish and Tombo as a proportion of Tombo sites:', (1681653/2352724))
# = overlap between np and tombo, divided by total Tombo CpG sites

print('Proportion of CpG sites methylated:', (1681653/5302131))
# = overlapping sites, divided by total number of CpG sites (taken from the number of lines in the np file, as np reports all CpG sites)

Overlap between Nanopolish and Tombo as a proportion of NP sites: 0.4444774831779984
Overlap between Nanopolish and Tombo as a proportion of Tombo sites: 0.7147684981323776
Proportion of CpG sites methylated: 0.3171654944021564


#### <span style='color:#ff14ff'> Observations </span>
Restricting the comparison to CpG sites did not change the number of overlapping sites or the overlap as a proportion of NP sites (44%), but the overlap as a proportion of Tombo sites rose from 2.5% to 71%. The remaining disagreement may be because NP only considers one strand while Tombo considers both.

This may be resolved by using only the plus-strand file from Tombo and comparing its results to NP.
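
If a separate plus-strand file is not at hand, one alternative is to filter a BED file on its strand column. The sketch below assumes a BED6 layout with the strand in column 6; whether the Tombo BED files used here follow that layout has not been checked, and `keep_strand` is a hypothetical helper.

```python
def keep_strand(bed_lines, strand="+"):
    """Yield BED lines whose strand column (column 6) equals `strand`."""
    for line in bed_lines:
        fields = line.rstrip("\n").split("\t")
        # Assumes BED6 layout: chrom, start, end, name, score, strand
        if len(fields) >= 6 and fields[5] == strand:
            yield line

# Toy records; names and scores are invented
toy = [
    "ctg1\t10\t11\tsite_a\t0.90\t+\n",
    "ctg1\t11\t12\tsite_b\t0.85\t-\n",
    "ctg2\t5\t6\tsite_c\t0.40\t+\n",
]
plus_only = list(keep_strand(toy))
print(len(plus_only))
```

Lines with fewer than six columns are skipped rather than guessed at, so a file without strand information yields nothing instead of a misleading result.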

#### <span style='color:#ff14ff'> Insert diagram: Venn diagram of Nanopolish CpG and Tombo CpG </span>

## <span style='color:#8a14ff'> 3. Making cutoff files. </span>

### <span style='color:#8a14ff'> 3.A Making cutoff files for overlapping files from previous section. </span>

In [None]:
%%bash

#Move the tombo hc file to the 'sequencing_comparison' folder with the other overlapped files to continue analysis
cp 5mC_hc_tombo_sorted.bed ~/analysis/bedtools_output/sequencing_comparison/

In [35]:
#Make filepaths for both 6mA files, both CpG files, and the tombo file
bed_file_list = ['/home/anjuni/analysis/bedtools_output/sequencing_comparison/6mA_ont_pb.bed', \
                 '/home/anjuni/analysis/bedtools_output/sequencing_comparison/6mA_pb_ont.bed', \
                 '/home/anjuni/analysis/bedtools_output/sequencing_comparison/5mC_CpG_tombo_np.bed', \
                 '/home/anjuni/analysis/bedtools_output/sequencing_comparison/5mC_CpG_np_tombo.bed', \
                 '/home/anjuni/analysis/bedtools_output/sequencing_comparison/5mC_hc_tombo_sorted.bed']

In [36]:
# Make the list of cutoffs
cutoff_list = [1.00, 0.99, 0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]

In [42]:
# Define function to filter
def score_filter(feature, L):
    """Returns True if the feature's score is greater than or equal to L"""
    return float(feature.score) >= L

def filter_by_cutoffs(bed_files, cutoffs, initial_file_path, final_file_path):
    """Filters files by the list of cutoffs given, and renames the file according to the cutoff."""
    for file in bed_files:
        pybed_object = BedTool(file)
        for x in cutoffs:
            filtered_file = pybed_object.filter(score_filter, x)
            # format with two decimals so that 0.9 is written as '0.90', matching the cutoff file names
            cutoff_name = '.cutoff.%.2f.bed' % x
            out_filename = file.replace('.bed', cutoff_name)
            out_filename = out_filename.replace(initial_file_path, final_file_path)
            filtered_file.saveas(out_filename)

In [None]:
#Run the function to filter all files
initial_fp = '/home/anjuni/analysis/bedtools_output/sequencing_comparison/'
final_fp = '/home/anjuni/analysis/bedtools_output/cutoffs_from_intersects/'
filter_by_cutoffs(bed_file_list, cutoff_list, initial_fp, final_fp)

### <span style='color:#8a14ff'> 3.B Making cutoff files from the original methylation-calling files and overlapping them, so that both files have a similar cutoff. This is because the cutoffs from the previous section were based only on the scores in the -a file in bedtools. </span>
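
The difference between the 3.A and 3.B orders can be seen on a toy example. The sketch below mimics the default `bedtools intersect -a A -b B` behaviour for single-base sites, where the output keeps the record (and score) from the -a file. The dictionaries and both helper functions are invented for illustration.

```python
def intersect_keep_a(a_sites, b_sites):
    """Keep the A record (and the A score) wherever B has a record at the same position."""
    return {pos: score for pos, score in a_sites.items() if pos in b_sites}

def apply_cutoff(sites, cutoff):
    """Keep sites whose score is at least `cutoff`."""
    return {pos: score for pos, score in sites.items() if score >= cutoff}

# Toy scores keyed by (contig, position); values are invented
tool_a = {("ctg1", 10): 0.95, ("ctg1", 25): 0.92, ("ctg2", 5): 0.30}
tool_b = {("ctg1", 10): 0.97, ("ctg1", 25): 0.15, ("ctg2", 5): 0.99}

# 3.A order: intersect first, then filter; only the -a score is tested
intersect_then_filter = apply_cutoff(intersect_keep_a(tool_a, tool_b), 0.90)

# 3.B order: filter both inputs, then intersect; both scores must pass
filter_then_intersect = intersect_keep_a(apply_cutoff(tool_a, 0.90), apply_cutoff(tool_b, 0.90))

print(sorted(intersect_then_filter))
print(sorted(filter_then_intersect))
```

In the 3.A order, the site at `("ctg1", 25)` survives even though the second tool scored it at 0.15, which is the asymmetry that motivates filtering both inputs first.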

In [40]:
# make file paths for the five input files
sorted_bed_files = ['/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/5mC_hc_nanopolish_sorted.bed', \
                   '/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/5mC_hc_tombo_sorted.CpG.bed', \
                   '/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/5mC_hc_tombo_sorted.bed', \
                   '/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/6mA_prob_smrtlink_sorted.bed', \
                   '/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/6mA_hc_tombo_sorted.bed']

In [45]:
#Run the function to filter all the sorted bed files
initial_fp1 = '/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/'
final_fp1 = '/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs/'
filter_by_cutoffs(sorted_bed_files, cutoff_list, initial_fp1, final_fp1)

In [61]:
%%bash

#Move the 6mA files and 5mC files to separate folders
cd /home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/
mkdir cutoffs_6mA
mkdir cutoffs_5mC
mv cutoffs/6mA* cutoffs_6mA
mv cutoffs/5mC* cutoffs_5mC
rmdir cutoffs

In [36]:
# define paths to the directories of 6mA and 5mC cutoff files
DIRS['BED_CUTOFFS'] = os.path.join(DIRS['BASE1'], 'input', 'sorted_bed_files', 'cutoffs')
DIRS['6MA_CUTOFFS'] = os.path.join(DIRS['BASE1'], 'input', 'sorted_bed_files', 'cutoffs_6mA')
DIRS['5MC_CUTOFFS'] = os.path.join(DIRS['BASE1'], 'input', 'sorted_bed_files', 'cutoffs_5mC')

In [37]:
print(DIRS['BED_CUTOFFS'])
print(DIRS['6MA_CUTOFFS'])
print(DIRS['5MC_CUTOFFS'])

/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_5mC


In [38]:
# make a list of 6mA cutoff files from Nanopore and PacBio
ont_6mA = [fn for fn in glob.iglob('%s/6mA_hc_tombo*.bed' % DIRS['6MA_CUTOFFS'], recursive=True)]
pb_6mA = [fn for fn in glob.iglob('%s/6mA_prob_smrtlink*.bed' % DIRS['6MA_CUTOFFS'], recursive=True)]

#test out these lists by printing
print(*ont_6mA, sep='\n')
print(*pb_6mA, sep='\n')

/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA/6mA_hc_tombo_sorted.cutoff.0.90.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA/6mA_hc_tombo_sorted.cutoff.0.60.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA/6mA_hc_tombo_sorted.cutoff.0.30.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA/6mA_hc_tombo_sorted.cutoff.0.80.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA/6mA_hc_tombo_sorted.cutoff.0.70.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA/6mA_hc_tombo_sorted.cutoff.0.20.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA/6mA_hc_tombo_sorted.cutoff.0.10.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA/6mA_hc_tombo_sorted.cutoff.0.40.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA/6mA_hc_tombo_sorted.cutoff.0.99.bed
/

In [39]:
# make a list of 5mC cutoff files from Nanopolish and Tombo
np_5mC = [fn for fn in glob.iglob('%s/5mC_hc_nanopolish*.bed' % DIRS['5MC_CUTOFFS'], recursive=True)]
tombo_CpG_5mC = [fn for fn in glob.iglob('%s/5mC_hc_tombo_sorted.CpG*.bed' % DIRS['5MC_CUTOFFS'], recursive=True)]
tombo_5mC = [fn for fn in glob.iglob('%s/5mC_hc_tombo_sorted.c*.bed' % DIRS['5MC_CUTOFFS'], recursive=True)]

#test out these lists by printing
print(*np_5mC, sep='\n')
print(*tombo_CpG_5mC, sep='\n')
print(*tombo_5mC, sep='\n')

/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_5mC/5mC_hc_nanopolish_sorted.cutoff.0.90.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_5mC/5mC_hc_nanopolish_sorted.cutoff.0.60.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_5mC/5mC_hc_nanopolish_sorted.cutoff.0.80.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_5mC/5mC_hc_nanopolish_sorted.cutoff.0.40.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_5mC/5mC_hc_nanopolish_sorted.cutoff.0.50.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_5mC/5mC_hc_nanopolish_sorted.cutoff.0.10.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_5mC/5mC_hc_nanopolish_sorted.cutoff.0.20.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_5mC/5mC_hc_nanopolish_sorted.cutoff.0.70.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_5

In [40]:
# the lists are not sorted, so sort them before doing cutoffs
ont_6mA = sorted(ont_6mA)
pb_6mA = sorted(pb_6mA)
np_5mC = sorted(np_5mC)
tombo_CpG_5mC = sorted(tombo_CpG_5mC)
tombo_5mC = sorted(tombo_5mC)
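
Sorting lines the lists up by position, which pairs files correctly only when every list holds the same set of cutoffs. A more defensive sketch is to read the cutoff out of each filename and pair on that. The regular expression assumes the `.cutoff.<value>.bed` naming shown above; the helper names and the shortened `x/` paths are hypothetical.

```python
import os
import re

CUTOFF_RE = re.compile(r"\.cutoff\.(\d+\.\d+)\.bed$")

def index_by_cutoff(paths):
    """Map the cutoff string embedded in each filename to its path."""
    indexed = {}
    for path in paths:
        match = CUTOFF_RE.search(os.path.basename(path))
        if match:
            indexed[match.group(1)] = path
    return indexed

def pair_by_cutoff(list_a, list_b):
    """Return (cutoff, a_path, b_path) for every cutoff present in both lists."""
    a_index, b_index = index_by_cutoff(list_a), index_by_cutoff(list_b)
    shared = sorted(set(a_index) & set(b_index), key=float)
    return [(c, a_index[c], b_index[c]) for c in shared]

# Hypothetical, shortened paths that follow the naming pattern used above
ont = ["x/6mA_hc_tombo_sorted.cutoff.0.90.bed", "x/6mA_hc_tombo_sorted.cutoff.0.10.bed"]
pb = ["x/6mA_prob_smrtlink_sorted.cutoff.0.10.bed",
      "x/6mA_prob_smrtlink_sorted.cutoff.0.90.bed",
      "x/6mA_prob_smrtlink_sorted.cutoff.0.50.bed"]

pairs = pair_by_cutoff(ont, pb)
for cutoff, a_path, b_path in pairs:
    print(cutoff, os.path.basename(a_path), os.path.basename(b_path))
```

A cutoff that exists in only one list (0.50 here) is dropped instead of shifting every later pair by one position.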

In [41]:
#Check if it worked. (It did!) :D
print(*ont_6mA, sep='\n')
print(*pb_6mA, sep='\n')
print(*np_5mC, sep='\n')
print(*tombo_CpG_5mC, sep='\n')
print(*tombo_5mC, sep='\n')

/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA/6mA_hc_tombo_sorted.cutoff.0.10.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA/6mA_hc_tombo_sorted.cutoff.0.20.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA/6mA_hc_tombo_sorted.cutoff.0.30.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA/6mA_hc_tombo_sorted.cutoff.0.40.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA/6mA_hc_tombo_sorted.cutoff.0.50.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA/6mA_hc_tombo_sorted.cutoff.0.60.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA/6mA_hc_tombo_sorted.cutoff.0.70.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA/6mA_hc_tombo_sorted.cutoff.0.80.bed
/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA/6mA_hc_tombo_sorted.cutoff.0.90.bed
/

In [42]:
#make the filepaths for intersects output

DIRS['I_FROM_C'] = os.path.join(DIRS['BASE2'], 'bedtools_output', 'intersects_from_cutoffs')

In [43]:
# loop over matched -a and -b files (one pair per cutoff) and intersect each pair
def intersect_cutoffs(list_a, list_b, n_elements):
    """Take a list of files and intersect them with another list of files, where files are matched by methylation cutoff."""
    for i in range(n_elements):
        a_bed = BedTool(list_a[i])
        b_bed = BedTool(list_b[i])
        # The output name below is an assumption; the original cell left it unfinished
        a_name = os.path.basename(list_a[i]).replace('.bed', '')
        out_name = os.path.join(DIRS['I_FROM_C'], a_name + '.x.' + os.path.basename(list_b[i]))
        a_bed.intersect(b_bed).saveas(out_name)

In [None]:
# Make lists of cutoffs from each of the 5 initial BED files
smrtlink

## <span style='color:#144fff'> 4. Making windows. </span>

In [44]:
# Make folder for windows. Each BED file will contain a series of windows
#os.mkdir(DIRS['WINDOW_OUTPUT'])
#os.mkdir()
gene_fn = '/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.gff3'
te_fn = '/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.TE.sorted.gff3'
reference_genome = os.path.join(DIRS['REF'], 'Pst_104E_v13_ph_ctg.fa')


In [58]:
# Make the genome size file for windows
!samtools faidx /home/anjuni/Pst_104_v13_assembly/Pst_104E_v13_ph_ctg.fa
!cut -f 1,2 /home/anjuni/Pst_104_v13_assembly/Pst_104E_v13_ph_ctg.fa.fai > /home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.genome_file
# Note: this does put the p contig values before h contig ones, while annotation files put h contig before p contig
# May be a problem in the future but probs not
# Sorted it anyway below, as reference genome fasta had contigs in that order arbitrarily:
!/home/anjuni/myapps/gff3sort/gff3sort.pl /home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.genome_file >  /home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.sorted.genome_file

Smartmatch is experimental at /home/anjuni/myapps/gff3sort/gff3sort.pl line 68.
Use of uninitialized value $pos in hash element at /home/anjuni/myapps/gff3sort/gff3sort.pl line 67, <$_[...]> line 1.
Argument "" isn't numeric in sort at /home/anjuni/myapps/gff3sort/gff3sort.pl line 104.

In [59]:
pprint.pprint(DIRS) # for reference

{'5MC_CUTOFFS': '/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_5mC',
 '6MA_CUTOFFS': '/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA',
 'BASE1': '/home/anjuni/methylation_calling/pacbio',
 'BASE2': '/home/anjuni/analysis',
 'BED_CUTOFFS': '/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs',
 'BED_INPUT': '/home/anjuni/analysis/bedtools_output/sequencing_comparison',
 'GFF_INPUT': '/home/anjuni/analysis/gff_output',
 'I_FROM_C': '/home/anjuni/analysis/bedtools_output/intersects_from_cutoffs',
 'REF': '/home/anjuni/Pst_104_v13_assembly/',
 'WINDOW_INPUT': '/home/anjuni/analysis/input_for_windows',
 'WINDOW_OUTPUT': '/home/anjuni/analysis/windows'}


In [46]:
# Make the window BED files
# Test it out on large windows for a small dataset
# Define all file paths
window_fn_dict = {}
window_bed_dict = {}
window_fn_dict['100kb'] = os.path.join(DIRS['WINDOW_OUTPUT'], 'Pst_104E_v13_ph_ctg_w100kb.bed')
window_fn_dict['30kb'] = os.path.join(DIRS['WINDOW_OUTPUT'], 'Pst_104E_v13_ph_ctg_w30kb.bed')
window_fn_dict['10kb'] = os.path.join(DIRS['WINDOW_OUTPUT'], 'Pst_104E_v13_ph_ctg_w10kb.bed')
genome_size_f_fn = os.path.join(DIRS['WINDOW_INPUT'], 'Pst_104E_v13_ph_ctg.sorted.genome_file')

In [61]:
# Check whether the dictionary looks nice :) (it does!) :D
pprint.pprint(window_fn_dict)

{'100kb': '/home/anjuni/analysis/windows/Pst_104E_v13_ph_ctg_w100kb.bed',
 '10kb': '/home/anjuni/analysis/windows/Pst_104E_v13_ph_ctg_w10kb.bed',
 '30kb': '/home/anjuni/analysis/windows/Pst_104E_v13_ph_ctg_w30kb.bed'}


In [62]:
# Make the actual windows! :D
!bedtools makewindows -g {genome_size_f_fn} -w 100000 > {window_fn_dict['100kb']}
!bedtools makewindows -g {genome_size_f_fn} -w 30000 > {window_fn_dict['30kb']}
!bedtools makewindows -g {genome_size_f_fn} -w 10000 > {window_fn_dict['10kb']}
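
For reference, the windows that `bedtools makewindows -w` writes are consecutive, non-overlapping intervals, with the last window on each contig cut short at the contig end. A pure-Python sketch of that layout, which should mirror the BED output; the contig lengths are invented and `make_windows` is a hypothetical helper:

```python
def make_windows(genome_sizes, window_size):
    """Yield (contig, start, end) windows in BED coordinates (0-based, end-exclusive)."""
    for contig, length in genome_sizes:
        for start in range(0, length, window_size):
            yield contig, start, min(start + window_size, length)

# Hypothetical contig lengths, in the two-column genome-file layout (name, length)
toy_genome = [("pcontig_000", 250000), ("hcontig_000_003", 30000)]

windows = list(make_windows(toy_genome, 100000))
for contig, start, end in windows:
    print(contig, start, end, sep="\t")
```

Contigs shorter than the window size produce a single short window, so window-level counts from small contigs cover less sequence than full-size windows.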

In [65]:
#now make a BedTool object for each window file
for key, value in window_fn_dict.items() :
    window_bed_dict[key] = BedTool(value)

In [66]:
# Check whether the bed file dictionary looks nice :) (it does!) :D
pprint.pprint(window_bed_dict)

{'100kb': <BedTool(/home/anjuni/analysis/windows/Pst_104E_v13_ph_ctg_w100kb.bed)>,
 '10kb': <BedTool(/home/anjuni/analysis/windows/Pst_104E_v13_ph_ctg_w10kb.bed)>,
 '30kb': <BedTool(/home/anjuni/analysis/windows/Pst_104E_v13_ph_ctg_w30kb.bed)>}


In [67]:
# Make filepaths for feature files for genes, effectors, TE, methylation
feature_fn_dict = {}
feature_fn_dict['genes'] = gene_fn
feature_fn_dict['TE'] = te_fn
feature_fn_dict['effector'] = os.path.join(DIRS['WINDOW_INPUT'], 'Pst_104E_v13_ph_ctg.effectors.gff3' )
feature_fn_dict['ont_6mA_0.10'] = ont_6mA[0]
feature_fn_dict['pb_6mA_0.10'] = pb_6mA[0]

In [68]:
# Check whether the feature file dictionary works (it does)
pprint.pprint(feature_fn_dict)

{'TE': '/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.TE.sorted.gff3',
 'effector': '/home/anjuni/analysis/input_for_windows/Pst_104E_v13_ph_ctg.effectors.gff3',
 'genes': '/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.gff3',
 'ont_6mA_0.10': '/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA/6mA_hc_tombo_sorted.cutoff.0.10.bed',
 'pb_6mA_0.10': '/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA/6mA_prob_smrtlink_sorted.cutoff.0.10.bed'}


In [17]:
%%bash
# Downloading the effector file (need raw version with only the file)
cd /home/anjuni/analysis/input_for_windows
wget https://raw.githubusercontent.com/BenjaminSchwessinger/Pst_104_E137_A-_genome/master/supplemental_files/Supplemental_file_9.txt
mv Supplemental_file_9.txt Candidate_effectors.txt

--2018-08-07 15:30:17--  https://raw.githubusercontent.com/BenjaminSchwessinger/Pst_104_E137_A-_genome/master/supplemental_files/Supplemental_file_9.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.80.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.80.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 49295 (48K) [text/plain]
Saving to: ‘Supplemental_file_9.txt’

     0K .......... .......... .......... .......... ........  100% 2.80M=0.02s

2018-08-07 15:30:22 (2.80 MB/s) - ‘Supplemental_file_9.txt’ saved [49295/49295]



In [24]:
# Make a GFF file of effector proteins

# First extract all lines whose feature type is 'gene' (not exon or CDS) from the gene annotation file
# Matching the tab-delimited column avoids lines that merely contain the word 'gene' elsewhere
! grep -P '\tgene\t' /home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.gff3 > /home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3

In [28]:
# Then extract the effector lines from the gene file

# First write a grep function for python
import fileinput
import re
import glob

def grep(PAT, FILES):
    """Return the first line in FILES (a glob pattern) that matches regex PAT, or None."""
    fileinput.close() # close the file in case the iterable was previously open to prevent the "input() already active" error
    try:
        for line in fileinput.input(glob.glob(FILES)):
            if re.search(PAT, line):
                print(fileinput.filename(), fileinput.lineno(), line)
                return line
        return None
    finally:
        fileinput.close() # always close, even when returning early


# Write function for filtering effectors from gene file using grep
def make_effector_gff(effector_list, gene_gff, out_gff):
    """Get effector protein features out of gene annotation files."""
    with open(out_gff, mode = 'w') as out_file:
        for effector in effector_list:
            # Escape the ID and anchor it so that '.1' cannot also match '.10' or '.120'
            match = grep('ID=' + re.escape(effector) + r'(;|\s*$)', gene_gff)
            if match is None:
                print('%s not found in %s' % (effector, gene_gff))
            else:
                out_file.write(match) # the line already ends with a newline

# And make a list of effectors
def make_effector_list(input_file):
    """Read one effector ID per line, skipping blank lines."""
    list_name = []
    with open(input_file) as file:
        for line in file:
            line = line.strip()
            if line:
                list_name.append(line)
    return list_name
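
The `grep` helper re-reads the gene file once for every effector. As a possible alternative, the sketch below reads the GFF a single time and compares whole `ID=` values against a set of wanted IDs. It assumes the ID sits in the ninth column as `ID=...;`, as in the gene lines of this annotation. The names `filter_gff_by_ids` and `ID_PATTERN` and the demo rows are hypothetical.

```python
import io
import re

# Capture the value of the ID attribute in a GFF3 attribute column.
ID_PATTERN = re.compile(r'(?:^|;)ID=([^;\s]+)')

def filter_gff_by_ids(in_handle, out_handle, wanted_ids):
    """Write rows whose ID is in wanted_ids; return the set of IDs never seen."""
    wanted = set(wanted_ids)
    found = set()
    for line in in_handle:
        fields = line.rstrip('\n').split('\t')
        if len(fields) < 9:
            continue  # not a complete GFF3 feature row
        match = ID_PATTERN.search(fields[8])
        if match and match.group(1) in wanted:
            out_handle.write(line)
            found.add(match.group(1))
    return wanted - found

# Hypothetical rows: only the exact ID 'evm.TU.demo.1' should be kept.
demo_genes = io.StringIO(
    "ctg1\tEVM\tgene\t1\t50\t.\t+\t.\tID=evm.TU.demo.1;Name=a\n"
    "ctg1\tEVM\tgene\t60\t90\t.\t-\t.\tID=evm.TU.demo.10;Name=b\n"
)
demo_effectors_out = io.StringIO()
not_found = filter_gff_by_ids(demo_genes, demo_effectors_out,
                              ['evm.TU.demo.1', 'evm.TU.demo.99'])
print(sorted(not_found))
```

Unlike the per-effector loop, this writes rows in the order they appear in the gene file rather than in the order of the effector list, and the returned set makes unmatched effector IDs easy to report.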

In [20]:
# Make the effector list
effectors = make_effector_list('/home/anjuni/analysis/input_for_windows/Candidate_effectors.txt')

In [26]:
# Check if list works (it does, but is not sorted)
print(effectors)

['evm.TU.hcontig_000_003.1', 'evm.TU.hcontig_000_003.10', 'evm.TU.hcontig_000_003.120', 'evm.TU.hcontig_000_003.158', 'evm.TU.hcontig_000_003.2', 'evm.TU.hcontig_000_003.20', 'evm.TU.hcontig_000_003.26', 'evm.TU.hcontig_000_003.314', 'evm.TU.hcontig_000_003.340', 'evm.TU.hcontig_000_003.380', 'evm.TU.hcontig_000_003.402', 'evm.TU.hcontig_000_003.419', 'evm.TU.hcontig_000_003.421', 'evm.TU.hcontig_000_003.423', 'evm.TU.hcontig_000_003.444', 'evm.TU.hcontig_000_003.450', 'evm.TU.hcontig_000_003.90', 'evm.TU.hcontig_000_031.4', 'evm.TU.hcontig_000_050.114', 'evm.TU.hcontig_000_050.122', 'evm.TU.hcontig_000_050.141', 'evm.TU.hcontig_000_050.144', 'evm.TU.hcontig_000_050.149', 'evm.TU.hcontig_000_050.34', 'evm.TU.hcontig_000_050.85', 'evm.TU.hcontig_000_050.87', 'evm.TU.hcontig_000_050.9', 'evm.TU.hcontig_000_050.93', 'evm.TU.hcontig_000_054.13', 'evm.TU.hcontig_000_054.43', 'evm.TU.hcontig_000_054.68', 'evm.TU.hcontig_001_001.103', 'evm.TU.hcontig_001_001.128', 'evm.TU.hcontig_001_001.129'

In [22]:
# Sort effector list
effectors.sort()
print(effectors) # Check if sorting worked (it did)

['evm.TU.hcontig_000_003.1', 'evm.TU.hcontig_000_003.10', 'evm.TU.hcontig_000_003.120', 'evm.TU.hcontig_000_003.158', 'evm.TU.hcontig_000_003.2', 'evm.TU.hcontig_000_003.20', 'evm.TU.hcontig_000_003.26', 'evm.TU.hcontig_000_003.314', 'evm.TU.hcontig_000_003.340', 'evm.TU.hcontig_000_003.380', 'evm.TU.hcontig_000_003.402', 'evm.TU.hcontig_000_003.419', 'evm.TU.hcontig_000_003.421', 'evm.TU.hcontig_000_003.423', 'evm.TU.hcontig_000_003.444', 'evm.TU.hcontig_000_003.450', 'evm.TU.hcontig_000_003.90', 'evm.TU.hcontig_000_031.4', 'evm.TU.hcontig_000_050.114', 'evm.TU.hcontig_000_050.122', 'evm.TU.hcontig_000_050.141', 'evm.TU.hcontig_000_050.144', 'evm.TU.hcontig_000_050.149', 'evm.TU.hcontig_000_050.34', 'evm.TU.hcontig_000_050.85', 'evm.TU.hcontig_000_050.87', 'evm.TU.hcontig_000_050.9', 'evm.TU.hcontig_000_050.93', 'evm.TU.hcontig_000_054.13', 'evm.TU.hcontig_000_054.43', 'evm.TU.hcontig_000_054.68', 'evm.TU.hcontig_001_001.103', 'evm.TU.hcontig_001_001.128', 'evm.TU.hcontig_001_001.129'
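
`list.sort()` orders strings character by character, which is why `evm.TU.hcontig_000_003.10` is listed before `evm.TU.hcontig_000_003.2`. If numeric order of the gene numbers is preferred, a natural-sort key is one option. This is a sketch only; `natural_key` is a hypothetical helper, and the three IDs are taken from the list above.

```python
import re

def natural_key(identifier):
    """Split an ID into text and integer chunks so digit runs sort by value."""
    return [int(chunk) if chunk.isdigit() else chunk
            for chunk in re.split(r'(\d+)', identifier)]

demo_ids = [
    'evm.TU.hcontig_000_003.10',
    'evm.TU.hcontig_000_003.2',
    'evm.TU.hcontig_000_003.1',
]
natural_sorted = sorted(demo_ids, key=natural_key)
print(natural_sorted)
```

In the notebook this would be applied as `effectors.sort(key=natural_key)`.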

In [29]:
# Run the function to make a file of effectors
genes_only_fn = '/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3'
effector_fn  = '/home/anjuni/analysis/input_for_windows/Pst_104E_v13_ph_ctg.effectors.gff3'
make_effector_gff(effectors, genes_only_fn, effector_fn)

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 1 hcontig_000_003	EVM	gene	1023	1469	.	+	.	ID=evm.TU.hcontig_000_003.1;Name=gene_model_hcontig_0000_03.1;locus_tag=Pst104E_15928

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 8 hcontig_000_003	EVM	gene	39750	40041	.	-	.	ID=evm.TU.hcontig_000_003.10;Name=gene_model_hcontig_0000_03.10;locus_tag=Pst104E_15935

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 102 hcontig_000_003	EVM	gene	488403	489285	.	-	.	ID=evm.TU.hcontig_000_003.120;Name=gene_model_hcontig_0000_03.120;locus_tag=Pst104E_16029

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 131 hcontig_000_003	EVM	gene	655467	655958	.	+	.	ID=evm.TU.hcontig_000_003.158;Name=gene_model_hcontig_0000_03.158;locus_tag=Pst104E_16058

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 2 hcontig_000_003	EVM	gene	4850	5854	.	+	.	ID=evm.T

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 1516 hcontig_002_028	EVM	gene	283952	284425	.	+	.	ID=evm.TU.hcontig_002_028.54;Name=gene_model_hcontig_0002_28.54;locus_tag=Pst104E_17443

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 1525 hcontig_002_028	EVM	gene	315708	316464	.	+	.	ID=evm.TU.hcontig_002_028.63;Name=gene_model_hcontig_0002_28.63;locus_tag=Pst104E_17452

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 1536 hcontig_002_028	EVM	gene	360966	361576	.	-	.	ID=evm.TU.hcontig_002_028.75;Name=gene_model_hcontig_0002_28.75;locus_tag=Pst104E_17463

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 1545 hcontig_002_028	EVM	gene	418475	419292	.	+	.	ID=evm.TU.hcontig_002_028.84;Name=gene_model_hcontig_0002_28.84;locus_tag=Pst104E_17472

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 1834 hcontig_003_002	EVM	gene	471745	47

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 2261 hcontig_004_020	EVM	gene	92789	93154	.	-	.	ID=evm.TU.hcontig_004_020.27;Name=gene_model_hcontig_0004_20.27;locus_tag=Pst104E_18188

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 2487 hcontig_004_020	EVM	gene	1265660	1266794	.	-	.	ID=evm.TU.hcontig_004_020.299;Name=gene_model_hcontig_0004_20.299;locus_tag=Pst104E_18414

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 2503 hcontig_004_020	EVM	gene	1370503	1371624	.	+	.	ID=evm.TU.hcontig_004_020.324;Name=gene_model_hcontig_0004_20.324;locus_tag=Pst104E_18430

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 2270 hcontig_004_020	EVM	gene	125461	125784	.	+	.	ID=evm.TU.hcontig_004_020.36;Name=gene_model_hcontig_0004_20.36;locus_tag=Pst104E_18197

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 2556 hcontig_004_020	EVM	gene	170

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 3529 hcontig_007_005	EVM	gene	244299	244856	.	+	.	ID=evm.TU.hcontig_007_005.66;Name=gene_model_hcontig_0007_05.66;locus_tag=Pst104E_19456

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 3475 hcontig_007_005	EVM	gene	19345	19862	.	+	.	ID=evm.TU.hcontig_007_005.7;Name=gene_model_hcontig_0007_05.7;locus_tag=Pst104E_19402

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 3532 hcontig_007_005	EVM	gene	276801	277354	.	+	.	ID=evm.TU.hcontig_007_005.75;Name=gene_model_hcontig_0007_05.75;locus_tag=Pst104E_19459

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 3535 hcontig_007_005	EVM	gene	281264	281824	.	+	.	ID=evm.TU.hcontig_007_005.78;Name=gene_model_hcontig_0007_05.78;locus_tag=Pst104E_19462

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 3627 hcontig_007_006	EVM	gene	492521	493106

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 4569 hcontig_010_016	EVM	gene	975936	976392	.	-	.	ID=evm.TU.hcontig_010_016.207;Name=gene_model_hcontig_0010_16.207;locus_tag=Pst104E_20496

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 4579 hcontig_010_016	EVM	gene	1018040	1018584	.	+	.	ID=evm.TU.hcontig_010_016.220;Name=gene_model_hcontig_0010_16.220;locus_tag=Pst104E_20506

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 4583 hcontig_010_016	EVM	gene	1028818	1029371	.	+	.	ID=evm.TU.hcontig_010_016.224;Name=gene_model_hcontig_0010_16.224;locus_tag=Pst104E_20510

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 4584 hcontig_010_016	EVM	gene	1031109	1031638	.	+	.	ID=evm.TU.hcontig_010_016.225;Name=gene_model_hcontig_0010_16.225;locus_tag=Pst104E_20511

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 4585 hcontig_010_016	EVM	

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 5055 hcontig_012_010	EVM	gene	291315	291625	.	-	.	ID=evm.TU.hcontig_012_010.72;Name=gene_model_hcontig_0012_10.72;locus_tag=Pst104E_20982

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 5149 hcontig_012_024	EVM	gene	316136	316481	.	+	.	ID=evm.TU.hcontig_012_024.75;Name=gene_model_hcontig_0012_24.75;locus_tag=Pst104E_21076

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 5259 hcontig_012_028	EVM	gene	410421	411419	.	-	.	ID=evm.TU.hcontig_012_028.106;Name=gene_model_hcontig_0012_28.106;locus_tag=Pst104E_21186

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 5273 hcontig_012_028	EVM	gene	505351	505809	.	+	.	ID=evm.TU.hcontig_012_028.128;Name=gene_model_hcontig_0012_28.128;locus_tag=Pst104E_21200

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 5289 hcontig_012_028	EVM	gene	57134

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 6117 hcontig_016_024	EVM	gene	574437	574868	.	-	.	ID=evm.TU.hcontig_016_024.126;Name=gene_model_hcontig_0016_24.126;locus_tag=Pst104E_22044

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 6121 hcontig_016_024	EVM	gene	596255	596725	.	+	.	ID=evm.TU.hcontig_016_024.133;Name=gene_model_hcontig_0016_24.133;locus_tag=Pst104E_22048

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 6146 hcontig_016_024	EVM	gene	751578	752304	.	+	.	ID=evm.TU.hcontig_016_024.163;Name=gene_model_hcontig_0016_24.163;locus_tag=Pst104E_22073

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 6039 hcontig_016_024	EVM	gene	99985	100686	.	-	.	ID=evm.TU.hcontig_016_024.22;Name=gene_model_hcontig_0016_24.22;locus_tag=Pst104E_21966

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 6067 hcontig_016_024	EVM	gene	2307

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 6745 hcontig_019_001	EVM	gene	119121	119738	.	+	.	ID=evm.TU.hcontig_019_001.33;Name=gene_model_hcontig_0019_01.33;locus_tag=Pst104E_22672

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 6729 hcontig_019_001	EVM	gene	49121	49706	.	+	.	ID=evm.TU.hcontig_019_001.9;Name=gene_model_hcontig_0019_01.9;locus_tag=Pst104E_22656

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 6813 hcontig_019_013	EVM	gene	52357	52829	.	+	.	ID=evm.TU.hcontig_019_013.16;Name=gene_model_hcontig_0019_13.16;locus_tag=Pst104E_22740

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 6815 hcontig_019_013	EVM	gene	55991	56458	.	-	.	ID=evm.TU.hcontig_019_013.18;Name=gene_model_hcontig_0019_13.18;locus_tag=Pst104E_22742

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 6819 hcontig_019_013	EVM	gene	88995	89597	.	+	.

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 7562 hcontig_021_026	EVM	gene	109922	110599	.	+	.	ID=evm.TU.hcontig_021_026.22;Name=gene_model_hcontig_0021_26.22;locus_tag=Pst104E_23489

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 7563 hcontig_021_026	EVM	gene	119377	120069	.	+	.	ID=evm.TU.hcontig_021_026.23;Name=gene_model_hcontig_0021_26.23;locus_tag=Pst104E_23490

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 7604 hcontig_021_027	EVM	gene	119481	120082	.	-	.	ID=evm.TU.hcontig_021_027.28;Name=gene_model_hcontig_0021_27.28;locus_tag=Pst104E_23531

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 7583 hcontig_021_027	EVM	gene	11278	12506	.	-	.	ID=evm.TU.hcontig_021_027.3;Name=gene_model_hcontig_0021_27.3;locus_tag=Pst104E_23510

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 7609 hcontig_021_027	EVM	gene	141335	142613

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 8250 hcontig_026_002	EVM	gene	20580	21473	.	-	.	ID=evm.TU.hcontig_026_002.7;Name=gene_model_hcontig_0026_02.7;locus_tag=Pst104E_24177

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 8251 hcontig_026_002	EVM	gene	24126	24991	.	-	.	ID=evm.TU.hcontig_026_002.8;Name=gene_model_hcontig_0026_02.8;locus_tag=Pst104E_24178

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 8406 hcontig_026_011	EVM	gene	49102	59249	.	-	.	ID=evm.TU.hcontig_026_011.10;Name=gene_model_hcontig_0026_11.10;locus_tag=Pst104E_24333

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 8407 hcontig_026_011	EVM	gene	61902	62767	.	-	.	ID=evm.TU.hcontig_026_011.11;Name=gene_model_hcontig_0026_11.11;locus_tag=Pst104E_24334

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 8405 hcontig_026_011	EVM	gene	45556	46449	.	-	.	ID=

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 9018 hcontig_029_010	EVM	gene	273019	273313	.	-	.	ID=evm.TU.hcontig_029_010.62;Name=gene_model_hcontig_0029_10.62;locus_tag=Pst104E_24945

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 9031 hcontig_029_010	EVM	gene	388393	388646	.	+	.	ID=evm.TU.hcontig_029_010.80;Name=gene_model_hcontig_0029_10.80;locus_tag=Pst104E_24958

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 9034 hcontig_029_010	EVM	gene	402639	402923	.	+	.	ID=evm.TU.hcontig_029_010.84;Name=gene_model_hcontig_0029_10.84;locus_tag=Pst104E_24961

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 9143 hcontig_029_013	EVM	gene	458647	459253	.	-	.	ID=evm.TU.hcontig_029_013.103;Name=gene_model_hcontig_0029_13.103;locus_tag=Pst104E_25070

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 9144 hcontig_029_013	EVM	gene	460782	

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 9486 hcontig_031_005	EVM	gene	443813	444235	.	-	.	ID=evm.TU.hcontig_031_005.99;Name=gene_model_hcontig_0031_05.99;locus_tag=Pst104E_25413

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 9528 hcontig_031_007	EVM	gene	105614	106143	.	+	.	ID=evm.TU.hcontig_031_007.24;Name=gene_model_hcontig_0031_07.24;locus_tag=Pst104E_25455

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 9530 hcontig_031_007	EVM	gene	112637	113128	.	-	.	ID=evm.TU.hcontig_031_007.26;Name=gene_model_hcontig_0031_07.26;locus_tag=Pst104E_25457

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 9539 hcontig_031_007	EVM	gene	139515	140569	.	+	.	ID=evm.TU.hcontig_031_007.35;Name=gene_model_hcontig_0031_07.35;locus_tag=Pst104E_25466

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 9542 hcontig_031_007	EVM	gene	164722	16

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 9989 hcontig_034_022	EVM	gene	28443	28769	.	-	.	ID=evm.TU.hcontig_034_022.9;Name=gene_model_hcontig_0034_22.9;locus_tag=Pst104E_25916

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 10040 hcontig_035_008	EVM	gene	30475	31177	.	-	.	ID=evm.TU.hcontig_035_008.10;Name=gene_model_hcontig_0035_08.10;locus_tag=Pst104E_25967

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 10041 hcontig_035_008	EVM	gene	33225	34496	.	-	.	ID=evm.TU.hcontig_035_008.11;Name=gene_model_hcontig_0035_08.11;locus_tag=Pst104E_25968

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 10133 hcontig_035_008	EVM	gene	603042	603524	.	+	.	ID=evm.TU.hcontig_035_008.129;Name=gene_model_hcontig_0035_08.129;locus_tag=Pst104E_26060

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 10134 hcontig_035_008	EVM	gene	604985	6057

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 10447 hcontig_037_011	EVM	gene	144909	145327	.	+	.	ID=evm.TU.hcontig_037_011.31;Name=gene_model_hcontig_0037_11.31;locus_tag=Pst104E_26374

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 10448 hcontig_037_011	EVM	gene	147380	147849	.	+	.	ID=evm.TU.hcontig_037_011.32;Name=gene_model_hcontig_0037_11.32;locus_tag=Pst104E_26375

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 10451 hcontig_037_011	EVM	gene	163919	164974	.	-	.	ID=evm.TU.hcontig_037_011.35;Name=gene_model_hcontig_0037_11.35;locus_tag=Pst104E_26378

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 10481 hcontig_037_011	EVM	gene	361900	362345	.	+	.	ID=evm.TU.hcontig_037_011.73;Name=gene_model_hcontig_0037_11.73;locus_tag=Pst104E_26408

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 10483 hcontig_037_011	EVM	gene	3693

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 10913 hcontig_041_004	EVM	gene	257090	257785	.	-	.	ID=evm.TU.hcontig_041_004.60;Name=gene_model_hcontig_0041_04.60;locus_tag=Pst104E_26840

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 10926 hcontig_041_004	EVM	gene	322078	322900	.	-	.	ID=evm.TU.hcontig_041_004.78;Name=gene_model_hcontig_0041_04.78;locus_tag=Pst104E_26853

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 10931 hcontig_041_004	EVM	gene	333233	334299	.	+	.	ID=evm.TU.hcontig_041_004.83;Name=gene_model_hcontig_0041_04.83;locus_tag=Pst104E_26858

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 10948 hcontig_041_005	EVM	gene	82261	82679	.	+	.	ID=evm.TU.hcontig_041_005.12;Name=gene_model_hcontig_0041_05.12;locus_tag=Pst104E_26875

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 10964 hcontig_041_005	EVM	gene	156779

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 11508 hcontig_045_002	EVM	gene	7036	7368	.	-	.	ID=evm.TU.hcontig_045_002.3;Name=gene_model_hcontig_0045_02.3;locus_tag=Pst104E_27435

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 11580 hcontig_045_002	EVM	gene	306464	306710	.	+	.	ID=evm.TU.hcontig_045_002.83;Name=gene_model_hcontig_0045_02.83;locus_tag=Pst104E_27507

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 11585 hcontig_045_002	EVM	gene	324223	324867	.	+	.	ID=evm.TU.hcontig_045_002.88;Name=gene_model_hcontig_0045_02.88;locus_tag=Pst104E_27512

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 11589 hcontig_045_002	EVM	gene	334698	334935	.	+	.	ID=evm.TU.hcontig_045_002.92;Name=gene_model_hcontig_0045_02.92;locus_tag=Pst104E_27516

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 11593 hcontig_045_002	EVM	gene	348216	348

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 12194 hcontig_050_121	EVM	gene	77328	77933	.	-	.	ID=evm.TU.hcontig_050_121.18;Name=gene_model_hcontig_050_121.18;locus_tag=Pst104E_28121

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 12187 hcontig_050_121	EVM	gene	20985	22001	.	+	.	ID=evm.TU.hcontig_050_121.9;Name=gene_model_hcontig_050_121.9;locus_tag=Pst104E_28114

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 12300 hcontig_052_006	EVM	gene	401349	402094	.	-	.	ID=evm.TU.hcontig_052_006.115;Name=gene_model_hcontig_0052_06.115;locus_tag=Pst104E_28227

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 12301 hcontig_052_006	EVM	gene	410205	411327	.	+	.	ID=evm.TU.hcontig_052_006.117;Name=gene_model_hcontig_0052_06.117;locus_tag=Pst104E_28228

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 12328 hcontig_052_006	EVM	gene	526951

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 12844 hcontig_057_003	EVM	gene	437904	438491	.	-	.	ID=evm.TU.hcontig_057_003.120;Name=gene_model_hcontig_0057_03.120;locus_tag=Pst104E_28771

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 12845 hcontig_057_003	EVM	gene	439781	441066	.	-	.	ID=evm.TU.hcontig_057_003.121;Name=gene_model_hcontig_0057_03.121;locus_tag=Pst104E_28772

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 12857 hcontig_057_003	EVM	gene	503572	504851	.	+	.	ID=evm.TU.hcontig_057_003.133;Name=gene_model_hcontig_0057_03.133;locus_tag=Pst104E_28784

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 12773 hcontig_057_003	EVM	gene	140063	140743	.	-	.	ID=evm.TU.hcontig_057_003.36;Name=gene_model_hcontig_0057_03.36;locus_tag=Pst104E_28700

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 12787 hcontig_057_003	EVM	gen

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 13275 hcontig_064_007	EVM	gene	272615	273123	.	-	.	ID=evm.TU.hcontig_064_007.75;Name=gene_model_hcontig_0064_07.75;locus_tag=Pst104E_29202

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 13220 hcontig_064_007	EVM	gene	30779	32046	.	-	.	ID=evm.TU.hcontig_064_007.9;Name=gene_model_hcontig_0064_07.9;locus_tag=Pst104E_29147

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 13313 hcontig_064_098	EVM	gene	182298	183031	.	+	.	ID=evm.TU.hcontig_064_098.38;Name=gene_model_hcontig_064_098.38;locus_tag=Pst104E_29240

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 13344 hcontig_065_003	EVM	gene	117769	118922	.	-	.	ID=evm.TU.hcontig_065_003.39;Name=gene_model_hcontig_0065_03.39;locus_tag=Pst104E_29271

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 13414 hcontig_065_094	EVM	gene	168027	1

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 14306 hcontig_189_001	EVM	gene	3136	3934	.	+	.	ID=evm.TU.hcontig_189_001.2;Name=gene_model_hcontig_0189_01.2;locus_tag=Pst104E_30233

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 14313 hcontig_225_001	EVM	gene	1065	1804	.	+	.	ID=evm.TU.hcontig_225_001.1;Name=gene_model_hcontig_0225_01.1;locus_tag=Pst104E_30240

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 14423 pcontig_000	EVM	gene	439981	440863	.	-	.	ID=evm.TU.pcontig_000.112;Name=gene_model_pcontig_000.112;locus_tag=Pst104E_00101

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 14340 pcontig_000	EVM	gene	66837	67400	.	-	.	ID=evm.TU.pcontig_000.19;Name=gene_model_pcontig_000.19;locus_tag=Pst104E_00018

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 14547 pcontig_000	EVM	gene	1075623	1077113	.	+	.	ID=evm.TU.pcontig_000.

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 14925 pcontig_001	EVM	gene	106800	107367	.	-	.	ID=evm.TU.pcontig_001.28;Name=gene_model_pcontig_001.28;locus_tag=Pst104E_00603

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 15155 pcontig_001	EVM	gene	1173998	1174985	.	-	.	ID=evm.TU.pcontig_001.288;Name=gene_model_pcontig_001.288;locus_tag=Pst104E_00833

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 15165 pcontig_001	EVM	gene	1232368	1232897	.	+	.	ID=evm.TU.pcontig_001.298;Name=gene_model_pcontig_001.298;locus_tag=Pst104E_00843

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 14907 pcontig_001	EVM	gene	7235	7782	.	-	.	ID=evm.TU.pcontig_001.3;Name=gene_model_pcontig_001.3;locus_tag=Pst104E_00585

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 15225 pcontig_001	EVM	gene	1537663	1538074	.	+	.	ID=evm.TU.pcontig_001.363;Name=g

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 15679 pcontig_002	EVM	gene	911584	912557	.	+	.	ID=evm.TU.pcontig_002.246;Name=gene_model_pcontig_002.246;locus_tag=Pst104E_01357

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 15489 pcontig_002	EVM	gene	124331	124824	.	-	.	ID=evm.TU.pcontig_002.28;Name=gene_model_pcontig_002.28;locus_tag=Pst104E_01167

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 15729 pcontig_002	EVM	gene	1193051	1193524	.	+	.	ID=evm.TU.pcontig_002.303;Name=gene_model_pcontig_002.303;locus_tag=Pst104E_01407

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 15739 pcontig_002	EVM	gene	1230375	1231131	.	+	.	ID=evm.TU.pcontig_002.315;Name=gene_model_pcontig_002.315;locus_tag=Pst104E_01417

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 15740 pcontig_002	EVM	gene	1235409	1236168	.	+	.	ID=evm.TU.pcontig_002.31

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 16143 pcontig_003	EVM	gene	1041891	1042959	.	+	.	ID=evm.TU.pcontig_003.240;Name=gene_model_pcontig_003.240;locus_tag=Pst104E_01821

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 16146 pcontig_003	EVM	gene	1051048	1051599	.	+	.	ID=evm.TU.pcontig_003.243;Name=gene_model_pcontig_003.243;locus_tag=Pst104E_01824

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 16150 pcontig_003	EVM	gene	1075985	1077012	.	-	.	ID=evm.TU.pcontig_003.248;Name=gene_model_pcontig_003.248;locus_tag=Pst104E_01828

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 16151 pcontig_003	EVM	gene	1079579	1080592	.	-	.	ID=evm.TU.pcontig_003.249;Name=gene_model_pcontig_003.249;locus_tag=Pst104E_01829

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 16156 pcontig_003	EVM	gene	1099037	1099956	.	-	.	ID=evm.TU.pcontig_

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 16434 pcontig_004	EVM	gene	65519	66389	.	+	.	ID=evm.TU.pcontig_004.19;Name=gene_model_pcontig_004.19;locus_tag=Pst104E_02112

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 16565 pcontig_004	EVM	gene	796769	797262	.	-	.	ID=evm.TU.pcontig_004.190;Name=gene_model_pcontig_004.190;locus_tag=Pst104E_02243

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 16566 pcontig_004	EVM	gene	799420	799997	.	-	.	ID=evm.TU.pcontig_004.191;Name=gene_model_pcontig_004.191;locus_tag=Pst104E_02244

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 16570 pcontig_004	EVM	gene	808330	809023	.	-	.	ID=evm.TU.pcontig_004.195;Name=gene_model_pcontig_004.195;locus_tag=Pst104E_02248

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 16589 pcontig_004	EVM	gene	882382	882924	.	+	.	ID=evm.TU.pcontig_004.214;Name=g

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 16838 pcontig_005	EVM	gene	200594	200910	.	+	.	ID=evm.TU.pcontig_005.51;Name=gene_model_pcontig_005.51;locus_tag=Pst104E_02516

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 16839 pcontig_005	EVM	gene	202004	203359	.	+	.	ID=evm.TU.pcontig_005.52;Name=gene_model_pcontig_005.52;locus_tag=Pst104E_02517

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 16793 pcontig_005	EVM	gene	18643	19574	.	-	.	ID=evm.TU.pcontig_005.6;Name=gene_model_pcontig_005.6;locus_tag=Pst104E_02471

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 16869 pcontig_005	EVM	gene	375893	376876	.	+	.	ID=evm.TU.pcontig_005.91;Name=gene_model_pcontig_005.91;locus_tag=Pst104E_02547

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 16873 pcontig_005	EVM	gene	393183	394679	.	+	.	ID=evm.TU.pcontig_005.95;Name=gene_model

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 17958 pcontig_007	EVM	gene	1713698	1714205	.	+	.	ID=evm.TU.pcontig_007.412;Name=gene_model_pcontig_007.412;locus_tag=Pst104E_03636

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 17962 pcontig_007	EVM	gene	1745437	1745968	.	+	.	ID=evm.TU.pcontig_007.418;Name=gene_model_pcontig_007.418;locus_tag=Pst104E_03640

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 17963 pcontig_007	EVM	gene	1755110	1755628	.	-	.	ID=evm.TU.pcontig_007.421;Name=gene_model_pcontig_007.421;locus_tag=Pst104E_03641

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 17964 pcontig_007	EVM	gene	1758365	1758886	.	-	.	ID=evm.TU.pcontig_007.422;Name=gene_model_pcontig_007.422;locus_tag=Pst104E_03642

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 17966 pcontig_007	EVM	gene	1762214	1762847	.	-	.	ID=evm.TU.pcontig_

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 18813 pcontig_010	EVM	gene	1012905	1013354	.	-	.	ID=evm.TU.pcontig_010.220;Name=gene_model_pcontig_010.220;locus_tag=Pst104E_04491

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 18824 pcontig_010	EVM	gene	1059873	1060315	.	+	.	ID=evm.TU.pcontig_010.231;Name=gene_model_pcontig_010.231;locus_tag=Pst104E_04502

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 18827 pcontig_010	EVM	gene	1070558	1071360	.	+	.	ID=evm.TU.pcontig_010.234;Name=gene_model_pcontig_010.234;locus_tag=Pst104E_04505

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 18830 pcontig_010	EVM	gene	1077227	1077728	.	-	.	ID=evm.TU.pcontig_010.237;Name=gene_model_pcontig_010.237;locus_tag=Pst104E_04508

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 18832 pcontig_010	EVM	gene	1082379	1082835	.	-	.	ID=evm.TU.pcontig_

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 19177 pcontig_011	EVM	gene	1230638	1231671	.	+	.	ID=evm.TU.pcontig_011.269;Name=gene_model_pcontig_011.269;locus_tag=Pst104E_04855

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 19195 pcontig_011	EVM	gene	1327282	1327725	.	-	.	ID=evm.TU.pcontig_011.292;Name=gene_model_pcontig_011.292;locus_tag=Pst104E_04873

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 19228 pcontig_011	EVM	gene	1486172	1486705	.	+	.	ID=evm.TU.pcontig_011.326;Name=gene_model_pcontig_011.326;locus_tag=Pst104E_04906

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 19245 pcontig_011	EVM	gene	1557403	1557829	.	+	.	ID=evm.TU.pcontig_011.347;Name=gene_model_pcontig_011.347;locus_tag=Pst104E_04923

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 19262 pcontig_011	EVM	gene	1606840	1607610	.	+	.	ID=evm.TU.pcontig_

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 19786 pcontig_013	EVM	gene	890614	891162	.	+	.	ID=evm.TU.pcontig_013.216;Name=gene_model_pcontig_013.216;locus_tag=Pst104E_05464

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 19804 pcontig_013	EVM	gene	962550	963070	.	+	.	ID=evm.TU.pcontig_013.238;Name=gene_model_pcontig_013.238;locus_tag=Pst104E_05482

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 19805 pcontig_013	EVM	gene	964082	964611	.	+	.	ID=evm.TU.pcontig_013.239;Name=gene_model_pcontig_013.239;locus_tag=Pst104E_05483

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 19813 pcontig_013	EVM	gene	1005660	1006500	.	-	.	ID=evm.TU.pcontig_013.247;Name=gene_model_pcontig_013.247;locus_tag=Pst104E_05491

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 19839 pcontig_013	EVM	gene	1131843	1132613	.	-	.	ID=evm.TU.pcontig_013.27

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 20662 pcontig_017	EVM	gene	645924	647381	.	+	.	ID=evm.TU.pcontig_017.154;Name=gene_model_pcontig_017.154;locus_tag=Pst104E_06340

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 20691 pcontig_017	EVM	gene	751493	752171	.	-	.	ID=evm.TU.pcontig_017.184;Name=gene_model_pcontig_017.184;locus_tag=Pst104E_06369

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 20703 pcontig_017	EVM	gene	778874	779535	.	+	.	ID=evm.TU.pcontig_017.196;Name=gene_model_pcontig_017.196;locus_tag=Pst104E_06381

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 20530 pcontig_017	EVM	gene	988	1494	.	-	.	ID=evm.TU.pcontig_017.2;Name=gene_model_pcontig_017.2;locus_tag=Pst104E_06208

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 20707 pcontig_017	EVM	gene	790006	790623	.	+	.	ID=evm.TU.pcontig_017.200;Name=gene_m

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 21208 pcontig_019	EVM	gene	803567	804100	.	+	.	ID=evm.TU.pcontig_019.168;Name=gene_model_pcontig_019.168;locus_tag=Pst104E_06886

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 21219 pcontig_019	EVM	gene	837168	837948	.	+	.	ID=evm.TU.pcontig_019.179;Name=gene_model_pcontig_019.179;locus_tag=Pst104E_06897

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 21232 pcontig_019	EVM	gene	898366	898885	.	+	.	ID=evm.TU.pcontig_019.197;Name=gene_model_pcontig_019.197;locus_tag=Pst104E_06910

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 21237 pcontig_019	EVM	gene	917385	917750	.	-	.	ID=evm.TU.pcontig_019.202;Name=gene_model_pcontig_019.202;locus_tag=Pst104E_06915

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 21257 pcontig_019	EVM	gene	1046095	1046680	.	+	.	ID=evm.TU.pcontig_019.231;

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 21867 pcontig_021	EVM	gene	1005612	1006212	.	-	.	ID=evm.TU.pcontig_021.253;Name=gene_model_pcontig_021.253;locus_tag=Pst104E_07545

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 21872 pcontig_021	EVM	gene	1036111	1037387	.	+	.	ID=evm.TU.pcontig_021.259;Name=gene_model_pcontig_021.259;locus_tag=Pst104E_07550

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 21885 pcontig_021	EVM	gene	1091270	1092108	.	+	.	ID=evm.TU.pcontig_021.273;Name=gene_model_pcontig_021.273;locus_tag=Pst104E_07563

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 21892 pcontig_021	EVM	gene	1131098	1131593	.	+	.	ID=evm.TU.pcontig_021.282;Name=gene_model_pcontig_021.282;locus_tag=Pst104E_07570

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 21897 pcontig_021	EVM	gene	1150752	1151568	.	+	.	ID=evm.TU.pcontig_

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 22345 pcontig_023	EVM	gene	730652	731770	.	+	.	ID=evm.TU.pcontig_023.158;Name=gene_model_pcontig_023.158;locus_tag=Pst104E_08023

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 22357 pcontig_023	EVM	gene	795774	796412	.	-	.	ID=evm.TU.pcontig_023.175;Name=gene_model_pcontig_023.175;locus_tag=Pst104E_08035

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 22364 pcontig_023	EVM	gene	821737	822241	.	+	.	ID=evm.TU.pcontig_023.182;Name=gene_model_pcontig_023.182;locus_tag=Pst104E_08042

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 22374 pcontig_023	EVM	gene	869570	870098	.	-	.	ID=evm.TU.pcontig_023.192;Name=gene_model_pcontig_023.192;locus_tag=Pst104E_08052

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 22389 pcontig_023	EVM	gene	950166	950781	.	+	.	ID=evm.TU.pcontig_023.212;Na

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 23009 pcontig_027	EVM	gene	601083	601569	.	-	.	ID=evm.TU.pcontig_027.132;Name=gene_model_pcontig_027.132;locus_tag=Pst104E_08687

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 23036 pcontig_027	EVM	gene	723192	723738	.	-	.	ID=evm.TU.pcontig_027.163;Name=gene_model_pcontig_027.163;locus_tag=Pst104E_08714

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 23037 pcontig_027	EVM	gene	724699	725247	.	-	.	ID=evm.TU.pcontig_027.164;Name=gene_model_pcontig_027.164;locus_tag=Pst104E_08715

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 23067 pcontig_027	EVM	gene	900255	900839	.	+	.	ID=evm.TU.pcontig_027.202;Name=gene_model_pcontig_027.202;locus_tag=Pst104E_08745

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 23070 pcontig_027	EVM	gene	912236	912833	.	+	.	ID=evm.TU.pcontig_027.205;Na

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 23391 pcontig_029	EVM	gene	419571	420534	.	+	.	ID=evm.TU.pcontig_029.90;Name=gene_model_pcontig_029.90;locus_tag=Pst104E_09069

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 23393 pcontig_029	EVM	gene	423922	424880	.	+	.	ID=evm.TU.pcontig_029.92;Name=gene_model_pcontig_029.92;locus_tag=Pst104E_09071

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 23396 pcontig_029	EVM	gene	438340	439586	.	+	.	ID=evm.TU.pcontig_029.97;Name=gene_model_pcontig_029.97;locus_tag=Pst104E_09074

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 23635 pcontig_030	EVM	gene	536355	537405	.	-	.	ID=evm.TU.pcontig_030.121;Name=gene_model_pcontig_030.121;locus_tag=Pst104E_09313

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 23683 pcontig_030	EVM	gene	795780	796296	.	-	.	ID=evm.TU.pcontig_030.175;Name=gen

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 24136 pcontig_033	EVM	gene	971215	971960	.	-	.	ID=evm.TU.pcontig_033.221;Name=gene_model_pcontig_033.221;locus_tag=Pst104E_09814

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 24144 pcontig_033	EVM	gene	986812	987515	.	-	.	ID=evm.TU.pcontig_033.229;Name=gene_model_pcontig_033.229;locus_tag=Pst104E_09822

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 24154 pcontig_033	EVM	gene	1047031	1047597	.	-	.	ID=evm.TU.pcontig_033.240;Name=gene_model_pcontig_033.240;locus_tag=Pst104E_09832

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 24004 pcontig_033	EVM	gene	330197	331056	.	+	.	ID=evm.TU.pcontig_033.73;Name=gene_model_pcontig_033.73;locus_tag=Pst104E_09682

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 24258 pcontig_034	EVM	gene	555423	555752	.	-	.	ID=evm.TU.pcontig_034.113;Na

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 24425 pcontig_035	EVM	gene	258505	259273	.	+	.	ID=evm.TU.pcontig_035.67;Name=gene_model_pcontig_035.67;locus_tag=Pst104E_10103

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 24426 pcontig_035	EVM	gene	260389	261137	.	+	.	ID=evm.TU.pcontig_035.68;Name=gene_model_pcontig_035.68;locus_tag=Pst104E_10104

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 24427 pcontig_035	EVM	gene	264098	264833	.	+	.	ID=evm.TU.pcontig_035.69;Name=gene_model_pcontig_035.69;locus_tag=Pst104E_10105

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 24428 pcontig_035	EVM	gene	273726	274463	.	+	.	ID=evm.TU.pcontig_035.70;Name=gene_model_pcontig_035.70;locus_tag=Pst104E_10106

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 24438 pcontig_035	EVM	gene	303311	304116	.	+	.	ID=evm.TU.pcontig_035.80;Name=gene_m

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 25051 pcontig_039	EVM	gene	828186	828678	.	-	.	ID=evm.TU.pcontig_039.169;Name=gene_model_pcontig_039.169;locus_tag=Pst104E_10729

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 25055 pcontig_039	EVM	gene	844980	845597	.	-	.	ID=evm.TU.pcontig_039.174;Name=gene_model_pcontig_039.174;locus_tag=Pst104E_10733

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 25056 pcontig_039	EVM	gene	847234	847798	.	-	.	ID=evm.TU.pcontig_039.175;Name=gene_model_pcontig_039.175;locus_tag=Pst104E_10734

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 25057 pcontig_039	EVM	gene	858904	859514	.	-	.	ID=evm.TU.pcontig_039.177;Name=gene_model_pcontig_039.177;locus_tag=Pst104E_10735

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 25061 pcontig_039	EVM	gene	869591	871112	.	+	.	ID=evm.TU.pcontig_039.181;Na

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 25531 pcontig_042	EVM	gene	481553	482560	.	-	.	ID=evm.TU.pcontig_042.130;Name=gene_model_pcontig_042.130;locus_tag=Pst104E_11209

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 25574 pcontig_042	EVM	gene	664350	665262	.	+	.	ID=evm.TU.pcontig_042.179;Name=gene_model_pcontig_042.179;locus_tag=Pst104E_11252

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 25582 pcontig_042	EVM	gene	720144	720566	.	-	.	ID=evm.TU.pcontig_042.188;Name=gene_model_pcontig_042.188;locus_tag=Pst104E_11260

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 25444 pcontig_042	EVM	gene	76119	76709	.	-	.	ID=evm.TU.pcontig_042.31;Name=gene_model_pcontig_042.31;locus_tag=Pst104E_11122

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 25431 pcontig_042	EVM	gene	5968	6965	.	+	.	ID=evm.TU.pcontig_042.4;Name=gene_mo

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 26174 pcontig_046	EVM	gene	698972	699530	.	-	.	ID=evm.TU.pcontig_046.159;Name=gene_model_pcontig_046.159;locus_tag=Pst104E_11852

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 26176 pcontig_046	EVM	gene	701169	701715	.	-	.	ID=evm.TU.pcontig_046.161;Name=gene_model_pcontig_046.161;locus_tag=Pst104E_11854

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 26177 pcontig_046	EVM	gene	705561	706108	.	-	.	ID=evm.TU.pcontig_046.162;Name=gene_model_pcontig_046.162;locus_tag=Pst104E_11855

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 26048 pcontig_046	EVM	gene	10717	11565	.	+	.	ID=evm.TU.pcontig_046.3;Name=gene_model_pcontig_046.3;locus_tag=Pst104E_11726

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 26076 pcontig_046	EVM	gene	170382	170737	.	+	.	ID=evm.TU.pcontig_046.35;Name=gene

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 26812 pcontig_052	EVM	gene	210164	211434	.	-	.	ID=evm.TU.pcontig_052.57;Name=gene_model_pcontig_052.57;locus_tag=Pst104E_12490

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 26813 pcontig_052	EVM	gene	212781	213626	.	+	.	ID=evm.TU.pcontig_052.58;Name=gene_model_pcontig_052.58;locus_tag=Pst104E_12491

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 26770 pcontig_052	EVM	gene	29108	29477	.	+	.	ID=evm.TU.pcontig_052.8;Name=gene_model_pcontig_052.8;locus_tag=Pst104E_12448

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 27018 pcontig_054	EVM	gene	520349	520886	.	-	.	ID=evm.TU.pcontig_054.116;Name=gene_model_pcontig_054.116;locus_tag=Pst104E_12696

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 27021 pcontig_054	EVM	gene	528679	529274	.	-	.	ID=evm.TU.pcontig_054.119;Name=gene_mo

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 27576 pcontig_059	EVM	gene	365500	365887	.	+	.	ID=evm.TU.pcontig_059.100;Name=gene_model_pcontig_059.100;locus_tag=Pst104E_13254

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 27594 pcontig_059	EVM	gene	435775	436458	.	-	.	ID=evm.TU.pcontig_059.118;Name=gene_model_pcontig_059.118;locus_tag=Pst104E_13272

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 27597 pcontig_059	EVM	gene	459166	460247	.	-	.	ID=evm.TU.pcontig_059.123;Name=gene_model_pcontig_059.123;locus_tag=Pst104E_13275

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 27501 pcontig_059	EVM	gene	49628	50394	.	-	.	ID=evm.TU.pcontig_059.13;Name=gene_model_pcontig_059.13;locus_tag=Pst104E_13179

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 27505 pcontig_059	EVM	gene	77912	78666	.	-	.	ID=evm.TU.pcontig_059.17;Name=gene

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 28006 pcontig_066	EVM	gene	2169	2840	.	+	.	ID=evm.TU.pcontig_066.1;Name=gene_model_pcontig_066.1;locus_tag=Pst104E_13684

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 28014 pcontig_066	EVM	gene	57439	58719	.	+	.	ID=evm.TU.pcontig_066.15;Name=gene_model_pcontig_066.15;locus_tag=Pst104E_13692

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 28046 pcontig_066	EVM	gene	250098	250741	.	-	.	ID=evm.TU.pcontig_066.53;Name=gene_model_pcontig_066.53;locus_tag=Pst104E_13724

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 28063 pcontig_066	EVM	gene	335626	336999	.	-	.	ID=evm.TU.pcontig_066.71;Name=gene_model_pcontig_066.71;locus_tag=Pst104E_13741

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 28065 pcontig_066	EVM	gene	344346	344833	.	-	.	ID=evm.TU.pcontig_066.73;Name=gene_model_pco

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 28798 pcontig_078	EVM	gene	156944	158830	.	+	.	ID=evm.TU.pcontig_078.34;Name=gene_model_pcontig_078.34;locus_tag=Pst104E_14476

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 28802 pcontig_078	EVM	gene	176380	177837	.	-	.	ID=evm.TU.pcontig_078.38;Name=gene_model_pcontig_078.38;locus_tag=Pst104E_14480

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 28807 pcontig_078	EVM	gene	203730	204651	.	-	.	ID=evm.TU.pcontig_078.45;Name=gene_model_pcontig_078.45;locus_tag=Pst104E_14485

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 28810 pcontig_078	EVM	gene	218768	219500	.	+	.	ID=evm.TU.pcontig_078.48;Name=gene_model_pcontig_078.48;locus_tag=Pst104E_14488

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 28819 pcontig_078	EVM	gene	262755	263910	.	+	.	ID=evm.TU.pcontig_078.60;Name=gene_m

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 29438 pcontig_092	EVM	gene	29053	29570	.	+	.	ID=evm.TU.pcontig_092.7;Name=gene_model_pcontig_092.7;locus_tag=Pst104E_15116

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 29439 pcontig_092	EVM	gene	32018	32820	.	+	.	ID=evm.TU.pcontig_092.8;Name=gene_model_pcontig_092.8;locus_tag=Pst104E_15117

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 29510 pcontig_095	EVM	gene	177817	179290	.	+	.	ID=evm.TU.pcontig_095.44;Name=gene_model_pcontig_095.44;locus_tag=Pst104E_15188

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 29533 pcontig_097	EVM	gene	99943	100330	.	+	.	ID=evm.TU.pcontig_097.21;Name=gene_model_pcontig_097.21;locus_tag=Pst104E_15211

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 29568 pcontig_100	EVM	gene	116545	117367	.	-	.	ID=evm.TU.pcontig_100.29;Name=gene_model_pcon

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 30176 pcontig_203	EVM	gene	19978	20847	.	-	.	ID=evm.TU.pcontig_203.6;Name=gene_model_pcontig_203.6;locus_tag=Pst104E_15854

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 30193 pcontig_207	EVM	gene	1295	2150	.	+	.	ID=evm.TU.pcontig_207.1;Name=gene_model_pcontig_207.1;locus_tag=Pst104E_15871

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 30236 pcontig_233	EVM	gene	24904	25501	.	+	.	ID=evm.TU.pcontig_233.6;Name=gene_model_pcontig_233.6;locus_tag=Pst104E_15914

/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.genes_only.gff3 30242 pcontig_235	EVM	gene	24769	25928	.	+	.	ID=evm.TU.pcontig_235.7;Name=gene_model_pcontig_235.7;locus_tag=Pst104E_15920



In [54]:
# Make a dictionary of feature files
feature_bed_dict = {}
for key, value in feature_fn_dict.items():
    feature_bed_dict[key] = BedTool(value)

In [69]:
# Check whether the feature bed dictionary works (it does)
pprint.pprint(feature_bed_dict)

{'TE': <BedTool(/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.TE.sorted.gff3)>,
 'effector': <BedTool(/home/anjuni/analysis/input_for_windows/Pst_104E_v13_ph_ctg.effectors.gff3)>,
 'genes': <BedTool(/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.anno.sorted.gff3)>,
 'ont_6mA_0.10': <BedTool(/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA/6mA_hc_tombo_sorted.cutoff.0.10.bed)>,
 'pb_6mA_0.10': <BedTool(/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA/6mA_prob_smrtlink_sorted.cutoff.0.10.bed)>}


In [76]:
%%bash
# Make a subset of windows from pcontig_019 as a test dataset
cd /home/anjuni/analysis/windows/
mkdir -p test_windows  # -p: no error if the directory already exists
for x in *.bed
do
name=${x%.bed}  # strip the .bed extension
echo "${name}"
grep 'pcontig_019' "${x}" > "test_windows/${name}.pcontig_019.bed"
done

Pst_104E_v13_ph_ctg_w100kb
Pst_104E_v13_ph_ctg_w10kb
Pst_104E_v13_ph_ctg_w30kb
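
The same subset could also be made from Python. This is a minimal sketch and not part of the original workflow: `subset_bed_by_contig` is a hypothetical helper and the window lines are made up. It compares the first column exactly, whereas `grep 'pcontig_019'` matches that text anywhere on a line.

```python
def subset_bed_by_contig(lines, contig):
    """Return the BED lines whose first (chrom) column equals contig."""
    kept = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        chrom = line.split('\t', 1)[0]
        if chrom == contig:
            kept.append(line)
    return kept


# Toy window lines, made up for illustration only
toy_windows = [
    'pcontig_018\t0\t100000\n',
    'pcontig_019\t0\t100000\n',
    'pcontig_019\t100000\t200000\n',
    'pcontig_0190\t0\t100000\n',  # a substring match would wrongly keep this
]
subset = subset_bed_by_contig(toy_windows, 'pcontig_019')
print(''.join(subset), end='')
```

For the contig names in this assembly the two approaches should give the same result; the exact match only matters if one contig name is a prefix of another.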


In [77]:
# Make a filepath dictionary and a bed file dictionary of the test windows
test_window_fn_dict = {}
test_window_fn_dict['100kb'] = os.path.join(DIRS['WINDOW_OUTPUT'], 'test_windows', 'Pst_104E_v13_ph_ctg_w100kb.pcontig_019.bed')
test_window_fn_dict['10kb'] = os.path.join(DIRS['WINDOW_OUTPUT'], 'test_windows', 'Pst_104E_v13_ph_ctg_w10kb.pcontig_019.bed')
test_window_fn_dict['30kb'] = os.path.join(DIRS['WINDOW_OUTPUT'], 'test_windows', 'Pst_104E_v13_ph_ctg_w30kb.pcontig_019.bed')

test_window_bed_dict = {}
for key, value in test_window_fn_dict.items():
    test_window_bed_dict[key] = BedTool(value)

In [78]:
pprint.pprint(test_window_bed_dict)

{'100kb': <BedTool(/home/anjuni/analysis/windows/test_windows/Pst_104E_v13_ph_ctg_w100kb.pcontig_019.bed)>,
 '10kb': <BedTool(/home/anjuni/analysis/windows/test_windows/Pst_104E_v13_ph_ctg_w10kb.pcontig_019.bed)>,
 '30kb': <BedTool(/home/anjuni/analysis/windows/test_windows/Pst_104E_v13_ph_ctg_w30kb.pcontig_019.bed)>}


In [143]:
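# Adjust the overlap function to apply to this test dataset
def overlap_windows_with_features(window_bed_dict, feature_bed_dict, feature_overlap_df_dict):
    """Run coverage of every feature set over every window set, save the tables,
    and fill feature_overlap_df_dict in place (uses the global feature_fn_dict for paths)."""
    for wkey, wbed in window_bed_dict.items():
        for fkey, fbed in feature_bed_dict.items():
            # F=0.1: count a feature only if at least 10% of its length overlaps the window
            tmp_df = wbed.coverage(fbed, F=0.1).to_dataframe().iloc[:,[0,1,2,3,6]]
            tmp_df.rename(columns={'name': 'overlap_count', 'thickStart': 'overlap_fraction'}, inplace=True)
            # Swap only the file extension: .replace('bed', ...) also hits 'sorted_bed_files' in the path
            # and leaves .gff3 paths unchanged, which would overwrite the input file
            base_fn = os.path.splitext(feature_fn_dict[fkey])[0]
            tmp_fn = '%s.%s.overlap.bed' % (base_fn, wkey)
            feature_overlap_df_dict[tmp_fn.split('/')[-1]] = tmp_df
            tmp_df.to_csv(tmp_fn, sep='\t', header=None, index=None) # no headers or row names
            tmp_fn = '%s.%s.overlap.circabed' % (base_fn, wkey)
            tmp_df.to_csv(tmp_fn, sep='\t', index=None) # same table with column headings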

In [79]:
%%bash
# Test out overlaps for test dataset on command line
cd /home/anjuni/analysis/windows/test_windows
features=/home/anjuni/analysis/gff_output
methyl=/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA
ont_6mA_100kb=100kb_6mA_hc_tombo_0.10.bed
pb_6mA_100kb=100kb_6mA_prob_smrtlink_0.10.bed

coverageBed -a Pst_104E_v13_ph_ctg_w100kb.pcontig_019.bed -b ${methyl}/6mA_hc_tombo_sorted.cutoff.0.10.bed > ${ont_6mA_100kb}

In [81]:
%%bash
# Test out the histogram function in coverageBed
cd /home/anjuni/analysis/windows/test_windows
features=/home/anjuni/analysis/gff_output
methyl=/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA
coverageBed -a Pst_104E_v13_ph_ctg_w100kb.pcontig_019.bed -b ${methyl}/6mA_hc_tombo_sorted.cutoff.0.10.bed -hist > h100kb_6mA_hc_tombo_0.10.bed

# With -hist, coverageBed writes a coverage histogram for each window,
# followed by summary rows labelled 'all' for all windows combined.
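
As the bedtools documentation describes it, `-hist` output ends with summary rows whose first column is `all`. A minimal sketch for separating those rows from the per-window rows, assuming tab-separated output; `split_hist_rows` is a hypothetical helper and the rows below are made up to show the layout, not taken from the real output file.

```python
def split_hist_rows(lines):
    """Split coverageBed -hist output into per-window rows and 'all' summary rows."""
    per_window, summary = [], []
    for line in lines:
        fields = line.rstrip('\n').split('\t')
        if fields == ['']:
            continue  # skip blank lines
        if fields[0] == 'all':
            summary.append(fields)
        else:
            per_window.append(fields)
    return per_window, summary


# Made-up rows. Per-window rows: window columns, depth, bases at that depth,
# window size, fraction. Summary rows: 'all', depth, bases, total size, fraction.
toy_hist = [
    'pcontig_019\t0\t100000\t0\t46004\t100000\t0.4600400\n',
    'pcontig_019\t0\t100000\t1\t53996\t100000\t0.5399600\n',
    'all\t0\t46004\t100000\t0.4600400\n',
    'all\t1\t53996\t100000\t0.5399600\n',
]
per_window, summary = split_hist_rows(toy_hist)
print(len(per_window), 'per-window rows;', len(summary), 'summary rows')
```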

In [82]:
# Test out Ben's function to see if it's easier?
# make a dataframe to put headings
# (passing F=0.1 to .coverage() would set a minimum overlap fraction; it is left out of this first test)
tmp_df = test_window_bed_dict['100kb'].coverage(feature_fn_dict['ont_6mA_0.10']).to_dataframe().iloc[:,[0,1,2,3,6]]

In [83]:
# check dataframe
tmp_df.head()

Unnamed: 0,chrom,start,end,name,thickStart
0,pcontig_019,0,100000,53996,0.53996
1,pcontig_019,100000,200000,52718,0.52718
2,pcontig_019,200000,300000,54436,0.54436
3,pcontig_019,300000,400000,53445,0.53445
4,pcontig_019,400000,500000,52814,0.52814


In [84]:
# rename headings
tmp_df.rename(columns={'name': 'overlap_count', 'thickStart': 'overlap_fraction'}, inplace=True)
tmp_df.head()

Unnamed: 0,chrom,start,end,overlap_count,overlap_fraction
0,pcontig_019,0,100000,53996,0.53996
1,pcontig_019,100000,200000,52718,0.52718
2,pcontig_019,200000,300000,54436,0.54436
3,pcontig_019,300000,400000,53445,0.53445
4,pcontig_019,400000,500000,52814,0.52814
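
The two renamed columns follow from what `bedtools coverage` reports for each window: the number of overlapping features, and the fraction of window bases covered by at least one feature. The sketch below is a simplified re-implementation for illustration only, not the bedtools code, and it ignores options such as `-F`; `window_coverage` is a hypothetical helper and the intervals in the usage line are made up.

```python
def window_coverage(win_start, win_end, features):
    """Return (overlap_count, covered_bases, window_length, overlap_fraction).

    Coordinates are half-open, as in BED: [start, end).
    """
    # Clip each feature to the window and drop features that do not overlap it
    clipped = sorted(
        (max(start, win_start), min(end, win_end))
        for start, end in features
        if start < win_end and end > win_start
    )
    covered = 0
    cur_start = cur_end = None
    for start, end in clipped:
        if cur_end is None or start > cur_end:
            # Close the previous merged block and start a new one
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent feature: extend the current block
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    length = win_end - win_start
    return len(clipped), covered, length, covered / length


# Two overlapping features, one hanging over the window end, one outside
result = window_coverage(0, 100, [(10, 20), (15, 30), (90, 120), (200, 210)])
print(result)
```

In the table above `overlap_count` equals `overlap_fraction` multiplied by the 100,000 bp window length, which is what this calculation gives when every feature is a single, non-overlapping base.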


In [87]:
# change output file path
tmp_fn = feature_fn_dict['ont_6mA_0.10'].replace('.bed', '%s.overlap.bed' % '.100kb')
print(tmp_fn)

/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA/6mA_hc_tombo_sorted.cutoff.0.10.100kb.overlap.bed
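
The `.replace('.bed', ...)` call above only changes paths that contain `.bed`; for a `.gff3` feature file it returns the path unchanged, so writing to that path would overwrite the input. A minimal sketch of an alternative that swaps the extension with `os.path.splitext`; `overlap_path` is a hypothetical helper name, and the two paths are the ones shown in `feature_bed_dict`.

```python
import os


def overlap_path(feature_fn, window_key, suffix='overlap.bed'):
    """Build an output path by replacing the extension of feature_fn."""
    root, _ext = os.path.splitext(feature_fn)  # _ext is '.bed' or '.gff3'
    return '%s.%s.%s' % (root, window_key, suffix)


bed_fn = ('/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/'
          'cutoffs_6mA/6mA_hc_tombo_sorted.cutoff.0.10.bed')
gff_fn = '/home/anjuni/analysis/gff_output/Pst_104E_v13_ph_ctg.TE.sorted.gff3'

bed_out = overlap_path(bed_fn, '100kb')
gff_out = overlap_path(gff_fn, '100kb')
print(bed_out)
print(gff_out)
```

Because only the final extension is dropped, `bed` inside directory names such as `sorted_bed_files` is left alone, and the output path always differs from the input path.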


In [93]:
# make a dictionary for overlap file name as key and dataframe as value
feature_overlap_df_dict = {}
feature_overlap_df_dict[tmp_fn.split('/')[-1]] = tmp_df
pprint.pprint(feature_overlap_df_dict)

{'6mA_hc_tombo_sorted.cutoff.0.10.100kb.overlap.bed':           chrom    start      end  overlap_count  overlap_fraction
0   pcontig_019        0   100000          53996          0.539960
1   pcontig_019   100000   200000          52718          0.527180
2   pcontig_019   200000   300000          54436          0.544360
3   pcontig_019   300000   400000          53445          0.534450
4   pcontig_019   400000   500000          52814          0.528140
5   pcontig_019   500000   600000          54537          0.545370
6   pcontig_019   600000   700000          53492          0.534920
7   pcontig_019   700000   800000          52387          0.523870
8   pcontig_019   800000   900000          52341          0.523410
9   pcontig_019   900000  1000000          52721          0.527210
10  pcontig_019  1000000  1100000          52403          0.524030
11  pcontig_019  1100000  1200000          53150          0.531500
12  pcontig_019  1200000  1300000          44636          0.446360
13  pcon

In [94]:
# save to a csv (note: pybedtools has more decimal places than bash bedtools)
tmp_df.to_csv(tmp_fn, sep='\t', header=None, index=None) # no headers or row names in csv

In [95]:
# make a file path for the .circabed version of the output
tmp_fn = feature_fn_dict['ont_6mA_0.10'].replace('.bed', '%s.overlap.circabed' % '.100kb')
print(tmp_fn)

/home/anjuni/methylation_calling/pacbio/input/sorted_bed_files/cutoffs_6mA/6mA_hc_tombo_sorted.cutoff.0.10.100kb.overlap.circabed


In [96]:
# this is the same table as the .overlap.bed file, but with column headings; I may not need this file
tmp_df.to_csv(tmp_fn, sep='\t', index=None)

In [None]:
# Fixed up this loop to work for my files
# Changed how the suffix is added: the bed file paths contain 'bed' twice, and the first 'bed' was being replaced instead of the extension
feature_overlap_df_dict = {}
for wkey, wbed in window_bed_dict.items():
    for fkey, fbed in feature_bed_dict.items():
        tmp_df = wbed.coverage(fbed).to_dataframe().iloc[:,[0,1,2,3,6]] # make a dataframe to put headings
        tmp_df.rename(columns={'name': 'overlap_count', 'thickStart': 'overlap_fraction'}, inplace=True) # rename headings
        base_fn = os.path.splitext(feature_fn_dict[fkey])[0] # drop the extension (.bed or .gff3) so input files are never overwritten
        tmp_fn = '%s.%s.overlap.bed' % (base_fn, wkey) # change output file path
        feature_overlap_df_dict[tmp_fn.split('/')[-1]] = tmp_df # file name as key and dataframe as value for overlap dict
        tmp_df.to_csv(tmp_fn, sep='\t', header=None, index=None) # save to a csv (pybedtools outputs more d.p. than BEDTools)

### In summary, use pybedtools for coverage and use an adapted version of Ben's function (above)

In [None]:
# Run overlaps for test dataset


In [None]:
# Run overlaps between windows and features
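# The function takes three arguments; the last dictionary is filled in place
feature_overlap_df_dict = {}
overlap_windows_with_features(window_bed_dict, feature_bed_dict, feature_overlap_df_dict)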

In [137]:
# Run overlaps between windows and features
feature_overlap_df_dict = {}
for wkey, wbed in window_bed_dict.items():
    for fkey, fbed in feature_bed_dict.items():
        # F=0.1: count a feature only if at least 10% of its length overlaps the window
        tmp_df = wbed.coverage(fbed, F=0.1).to_dataframe().iloc[:,[0,1,2,3,6]]
        tmp_df.rename(columns={'name': 'overlap_count', 'thickStart': 'overlap_fraction'}, inplace=True)
        # Swap only the extension; .replace('bed', ...) also changes 'sorted_bed_files' and leaves .gff3 paths unchanged
        base_fn = os.path.splitext(feature_fn_dict[fkey])[0]
        tmp_fn = '%s.%s.overlap.bed' % (base_fn, wkey)
        feature_overlap_df_dict[tmp_fn.split('/')[-1]] = tmp_df
        tmp_df.to_csv(tmp_fn, sep='\t', header=None, index=None)
        tmp_fn = '%s.%s.overlap.circabed' % (base_fn, wkey)
        tmp_df.to_csv(tmp_fn, sep='\t', index=None)

## <span style='color:#148aff'> 5. Intersecting methylation with gene annotation files. </span>

## <span style='color:#14c4ff'> 6. Analysing gene expression files. </span>

## <span style='color:#15c66e'> 7. Intersecting with transposon expression files. </span>

## <span style='color:#9ac615'> 8. Comparing methylated transposons and genes. </span>

## <span style='color:#ffa347'> 9. Identifying effector genes. </span>

## <span style='color:#ff4f14'> 10. Expression of methylation machinery throughout Pst life cycle. </span>