Characterization of diploid genomes using k-mer spectra analysis

Kamil S. Jaron edited this page Feb 9, 2023 · 1 revision

4. Fitting genomescope models

This is a follow-up tutorial to the Introduction to K-mer spectra analysis.

Resources needed

This tutorial expects the following software to be installed:

If you use conda, you can follow this tutorial on how to make a conda environment that includes all the software needed in this section.


Probably the easiest way to get a feel for fitting genomescope models is to go through several k-mer histograms and fit a model to each of them. We have collected histograms of several species in an archive you can download from the following URL.

wget https://github.com/KamilSJaron/oh-know/raw/main/data/data_for_genomescope.tar.gz
tar -xzvf data_for_genomescope.tar.gz
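If the extraction fails, the downloaded file may not be a real archive (for example, an HTML page saved under the archive's name). The following is a minimal sketch of a sanity check, not part of the original tutorial; the helper name `check_archive` is hypothetical, and it relies only on the standard `gzip -t` and `tar -t` options.

```shell
# Hypothetical helper: report whether a file is a readable gzip archive.
check_archive() {
    if [ -f "$1" ] && gzip -t "$1" 2>/dev/null; then
        echo "ok"
    else
        echo "invalid"
    fi
}

ARCHIVE="data_for_genomescope.tar.gz"
if [ "$(check_archive "$ARCHIVE")" = "ok" ]; then
    # List the contents without extracting, to see how the histograms are named.
    tar -tzf "$ARCHIVE"
else
    echo "$ARCHIVE is missing or not a gzip archive; download it again" >&2
fi
```

Listing the archive first also shows the actual directory and file names, which you will need when pointing genomescope at individual histograms.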

These histograms show various patterns, including problems caused by wrongly fitted models. Some of the common pitfalls are explained in the lecture introducing this section.

Now, let's fit all the genomescope models. If you are not sure how to go about it, see the crayfish example below. If you manage to fit models 1 to 8, also check the collection of rather messy k-mer spectra of nematodes in the funky_nematodes directory.

  1. Begonia
  2. Bombina
  3. Cape Bees
  4. Crayfish
  5. Mercurialis
  6. Springtails
  7. Stick insects
  8. Strawberry
  9. BONUS Funky nematodes
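As a rough starting point for any of the species above, a single fit could be sketched as follows. This is only an illustration under stated assumptions: GenomeScope 2.0 is assumed to be installed as `genomescope2` (a source install provides `genomescope.R` instead), the histogram file name `crayfish.hist` and the output directory `crayfish_fit` are hypothetical, and `k = 21` with a diploid model are guesses that must be replaced with the values that match your data. The helper `genomescope_cmd` is likewise made up for this example.

```shell
# Hypothetical helper: build the genomescope2 command line for one histogram.
# Arguments: histogram file, output directory, k-mer length, ploidy.
genomescope_cmd() {
    echo "genomescope2 -i $1 -o $2 -k $3 -p $4"
}

HIST="crayfish.hist"   # hypothetical file name; use the real one from the archive
K=21                   # assumption: must equal the k used to count the k-mers
PLOIDY=2               # assumption: diploid model

CMD="$(genomescope_cmd "$HIST" crayfish_fit "$K" "$PLOIDY")"
echo "$CMD"            # inspect the command before running it

# Run only when the tool and the histogram are actually available.
if command -v genomescope2 >/dev/null 2>&1 && [ -f "$HIST" ]; then
    $CMD
fi
```

If the fitted model clearly misses the peaks in the plot, it is usually worth rerunning with a different ploidy or an explicit initial coverage estimate rather than trusting the first result.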

Where to go next

You can check our guest lecture, "allo or auto ploid? What do I expect my kmer spectra look like?" by Hannes Becher, to see other approaches to fitting models to a k-mer spectrum.
