Using the Python library

kPAL provides a lightweight Python library for creating, analysing, and manipulating k-mer profiles. It is implemented on top of NumPy.

This is a gentle introduction to the library. Consult the API documentation for a more detailed reference.

k-mer profiles

kpal.klib

The class Profile is the central object in kPAL. It encapsulates k-mer counts and provides operations on them.

Instead of using the Profile constructor directly, you should generally use one of the profile construction methods. One of those is Profile.from_fasta. The following code creates a 6-mer profile by counting from a FASTA file:

>>> from kpal.klib import Profile
>>> with open('a.fasta') as handle:
...     p = Profile.from_fasta(handle, 6)

The profile object has several properties. For example, we can ask for the k-mer length (also known as k), the total k-mer count, or the median count per k-mer:

>>> p.length
6
>>> p.total
49995
>>> p.median
12.0

Counts are stored as a NumPy ndarray (numpy.ndarray) of integers, one for each possible k-mer, in alphabetical order:

>>> len(p.counts)
4096
>>> p.counts
array([ 8, 11,  5, ...,  7, 12, 13])
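To make the layout of such a count array concrete, here is a minimal sketch that counts k-mers from a plain string using only NumPy and the standard library. The function name count_kmers and the choice to skip windows containing characters other than A, C, G, and T are assumptions made for illustration; this is not kPAL's own implementation.

```python
from itertools import product

import numpy as np


def count_kmers(sequence, k):
    """Count the k-mers of `sequence` into a dense array of 4**k integers.

    Hypothetical helper for illustration only, not part of kPAL.
    """
    # Enumerate every k-mer over ACGT in alphabetical order and remember
    # its position; that position is the k-mer's slot in the count array.
    index = {''.join(kmer): i
             for i, kmer in enumerate(product('ACGT', repeat=k))}
    counts = np.zeros(4 ** k, dtype=np.int64)
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        # Assumption: windows with other characters (such as N) are skipped.
        if kmer in index:
            counts[index[kmer]] += 1
    return counts


counts = count_kmers('ACGTACGTNACG', 2)
total = counts.sum()          # total number of counted k-mers
median = np.median(counts)    # median count over all 4**k possible k-mers
```

With k = 2 the array has 16 entries, which mirrors the 4096 entries of the 6-mer profile above. Whether p.total and p.median are computed in exactly this way is also an assumption.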

We can get the index in that array for a certain k-mer using the Profile.dna_to_binary method:

>>> i = p.dna_to_binary('AATTAA')
>>> p.counts[i]
13
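One way to obtain such an index is to read the k-mer as a base-4 number with A=0, C=1, G=2, and T=3, which yields exactly the alphabetical ordering described above. The sketch below shows that encoding and its inverse. The names kmer_to_index and index_to_kmer are hypothetical, and it is an assumption that dna_to_binary returns the same values.

```python
ALPHABET = 'ACGT'  # alphabetical order, so A=0, C=1, G=2, T=3


def kmer_to_index(kmer):
    """Return the rank of `kmer` among all k-mers of its length.

    Hypothetical helper for illustration only, not part of kPAL.
    Raises ValueError for characters outside ACGT.
    """
    index = 0
    for base in kmer:
        # Shift the digits seen so far by one base-4 position, then add
        # the digit for the current base.
        index = index * 4 + ALPHABET.index(base)
    return index


def index_to_kmer(index, k):
    """Inverse of kmer_to_index for k-mers of length `k`."""
    bases = []
    for _ in range(k):
        # Peel off the least significant base-4 digit.
        index, digit = divmod(index, 4)
        bases.append(ALPHABET[digit])
    # Digits were collected last base first, so reverse them.
    return ''.join(reversed(bases))


i = kmer_to_index('AATTAA')
kmer = index_to_kmer(i, 6)
```

Each base occupies two bits of the index, so a 6-mer fits in 12 bits and the largest index is 4**6 - 1 = 4095.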

Storing k-mer profiles

Todo.

Differences between k-mer profiles

Todo.