Genome 1 refers to the initially assembled genome (the "old genome" in the scripts), and Genome 2 refers to the genome produced by assembling with Flye followed by multiple rounds of medaka polishing (the "new genome" in the scripts).
A summary of all results is documented in this spreadsheet.
- Genome files (Genome 1 and Genome 2)
- RNA-seq library generated for P. africana
- Protein files
  - Protein file with both P. africana and Physcomitrium patens proteins
  - Protein file with only P. africana proteins
BUSCO was run on the assembled P. africana genome, against both the eukaryota and viridiplantae lineage databases.
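As a rough illustration of the two BUSCO runs described above, the sketch below only builds the command lines; it does not execute them. It assumes a BUSCO v4/v5-style `busco` executable and the `odb10` lineage datasets, neither of which is stated in this README. The genome file name and the output prefix are hypothetical.

```python
import shlex

# Assumption: odb10 lineage dataset names; the README only says
# "eukaryota" and "viridiplantae".
LINEAGES = ["eukaryota_odb10", "viridiplantae_odb10"]


def busco_command(genome_fasta, lineage, out_prefix="busco"):
    """Build a BUSCO genome-mode command as an argument list (not executed)."""
    return [
        "busco",
        "-i", genome_fasta,               # input assembly (FASTA)
        "-l", lineage,                    # lineage dataset to score against
        "-m", "genome",                   # assess a genome assembly
        "-o", f"{out_prefix}_{lineage}",  # name of the output run
    ]


if __name__ == "__main__":
    # "genome2.fasta" is a placeholder for the assembly being assessed.
    for lineage in LINEAGES:
        print(shlex.join(busco_command("genome2.fasta", lineage)))
```

Each printed line can be pasted into a shell or passed to `subprocess.run` once the file names match the actual data.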
HISAT2 was run to check the mapping rates for the following combinations of reads and genome:
Transcriptome input | Genome input | Mapping rate |
---|---|---|
P. africana trimmed transcriptomic reads | Genome 1 | 90.06% |
P. africana untrimmed transcriptomic reads | Genome 1 | 69.90% |
P. patens trimmed reads | Genome 1 | 42.62% |
P. africana trimmed transcriptomic reads | Genome 2 | 90.62% |
These mapping rates indicate that the RNA-seq libraries are from P. africana: trimmed P. africana reads map to Genome 1 at 90.06%, whereas trimmed P. patens reads map at only 42.62%.
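The mapping rates in the table correspond to the "overall alignment rate" line that HISAT2 prints at the end of its alignment summary. A minimal sketch for pulling that number out of a saved summary is shown below. The function name is hypothetical and the example summary text is made up for illustration; it is not output from this project.

```python
import re

# HISAT2 ends its alignment summary with a line such as
# "87.50% overall alignment rate".
_RATE = re.compile(r"^\s*([\d.]+)% overall alignment rate", re.MULTILINE)


def overall_alignment_rate(summary_text):
    """Return the overall alignment rate (in percent) from a HISAT2 summary."""
    match = _RATE.search(summary_text)
    if match is None:
        raise ValueError("no 'overall alignment rate' line found")
    return float(match.group(1))


# Illustrative summary only (invented numbers, single-end layout).
EXAMPLE_SUMMARY = """\
1000 reads; of these:
  1000 (100.00%) were unpaired; of these:
    125 (12.50%) aligned 0 times
    800 (80.00%) aligned exactly 1 time
    75 (7.50%) aligned >1 times
87.50% overall alignment rate
"""

if __name__ == "__main__":
    print(overall_alignment_rate(EXAMPLE_SUMMARY))
```

Running this over the summary of each HISAT2 run gives one value per table row, which makes the comparison between read sets easy to regenerate.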
Multiple iterations of BRAKER were run, each repeated for BRAKER v2.0.5 and BRAKER v2.1.5, with the following inputs:
- Genome + BAM file from the transcriptome only
- Genome + BAM + proteins from P. africana and P. patens
Additional BRAKER runs were performed for troubleshooting purposes (v2.1.5 only):
- Genome + P. patens BAM file
- Genome + BAM + proteins from P. africana
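The BRAKER runs listed above can be written down as a small run matrix, which helps keep track of which version was paired with which inputs. The sketch below is an assumption-laden illustration: all file names and run labels are hypothetical, and only the `--genome`, `--bam` and `--prot_seq` options of `braker.pl` are used. Runs that combine a BAM file with proteins may need additional mode options depending on the BRAKER version; those are left out because this README does not state them.

```python
from itertools import product

VERSIONS = ["2.0.5", "2.1.5"]

# Input combinations repeated for both versions (hypothetical file names).
MAIN_RUNS = {
    "genome_bam": {"bam": "africana.bam"},
    "genome_bam_prot_both": {
        "bam": "africana.bam",
        "prot_seq": "africana_patens.faa",
    },
}

# Troubleshooting combinations, run with v2.1.5 only.
EXTRA_RUNS = {
    "genome_patens_bam": {"bam": "patens.bam"},
    "genome_bam_prot_africana": {
        "bam": "africana.bam",
        "prot_seq": "africana.faa",
    },
}


def braker_args(genome, inputs):
    """Build a braker.pl argument list for one input combination."""
    args = ["braker.pl", f"--genome={genome}"]
    if "bam" in inputs:
        args.append(f"--bam={inputs['bam']}")
    if "prot_seq" in inputs:
        args.append(f"--prot_seq={inputs['prot_seq']}")
    return args


def run_matrix(genome):
    """Return (version, run label, argument list) for every planned run."""
    runs = [
        (version, name, braker_args(genome, inputs))
        for version, (name, inputs) in product(VERSIONS, MAIN_RUNS.items())
    ]
    runs += [
        ("2.1.5", name, braker_args(genome, inputs))
        for name, inputs in EXTRA_RUNS.items()
    ]
    return runs


if __name__ == "__main__":
    # "genome.fasta" is a placeholder for the assembly being annotated.
    for version, name, args in run_matrix("genome.fasta"):
        print(f"v{version}\t{name}\t{' '.join(args)}")
```

Under these assumptions the matrix has six entries per genome: two input combinations for each of the two versions, plus the two troubleshooting runs.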