kPAL provides a light-weight Python library for creating, analysing, and manipulating k-mer profiles. It is implemented on top of NumPy.
This is a gentle introduction to the library. Consult the API reference for more detailed documentation.
kpal.klib
The Profile class is the central object in kPAL. It encapsulates k-mer counts and provides operations on them.
Instead of calling the Profile constructor directly, you should generally use one of the profile construction methods, such as Profile.from_fasta. The following code creates a 6-mer profile by counting from a FASTA file:
>>> from kpal.klib import Profile
>>> with open('a.fasta') as fasta_handle:
...     p = Profile.from_fasta(fasta_handle, 6)
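To give a feel for what counting k-mers means, here is a minimal sketch in plain Python that slides a window of length k over a single sequence. It is an illustration only, not kPAL's implementation. The function name count_kmers and the choice to skip windows containing characters other than A, C, G, and T are assumptions made for this example.

```python
from collections import Counter

# Hypothetical helper for illustration; kPAL's own counting code may differ.
VALID_BASES = set('ACGT')


def count_kmers(sequence, k):
    """Count every k-mer in `sequence` using a sliding window."""
    counts = Counter()
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        # Assumption: windows with ambiguous bases (such as N) are skipped.
        if set(kmer) <= VALID_BASES:
            counts[kmer] += 1
    return counts


counts = count_kmers('ACGTACGT', 2)
```

A sequence of length n yields at most n - k + 1 windows, so the eight-base sequence above contributes seven 2-mers in total.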
The profile object has several properties. For example, we can ask for the k-mer length (also known as k), the total k-mer count, or the median count per k-mer:
>>> p.length
6
>>> p.total
49995
>>> p.median
12.0
Counts are stored as a NumPy array (numpy.ndarray) of integers, one for each possible k-mer, in alphabetical order:
>>> len(p.counts)
4096
>>> p.counts
array([ 8, 11, 5, ..., 7, 12, 13])
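The array length of 4096 follows from the four-letter DNA alphabet: there are 4^6 possible 6-mers. The sketch below enumerates them in alphabetical order with the standard library, independently of kPAL, to show which k-mer would correspond to which position. The variable names are made up for this example.

```python
from itertools import product

k = 6

# product() over a sorted alphabet yields tuples in lexicographic order,
# so joining them gives all k-mers in alphabetical order.
kmers = [''.join(bases) for bases in product('ACGT', repeat=k)]

number_of_kmers = len(kmers)
first_kmer = kmers[0]
last_kmer = kmers[-1]
```

Under this ordering, the first entry of the counts array belongs to AAAAAA and the last to TTTTTT.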
We can get the index in that array for a certain k-mer using the Profile.dna_to_binary method:
>>> i = p.dna_to_binary('AATTAA')
>>> p.counts[i]
13
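One way to picture such an index is as the alphabetical rank of the k-mer, which can be computed by reading the k-mer as a base-4 number. The following sketch assumes the encoding A=0, C=1, G=2, T=3, which is consistent with alphabetical ordering. The function kmer_rank is a hypothetical stand-in and is not part of kPAL.

```python
# Assumed 2-bit encoding that matches alphabetical order (A < C < G < T).
ENCODING = {'A': 0, 'C': 1, 'G': 2, 'T': 3}


def kmer_rank(kmer):
    """Return the alphabetical rank of `kmer` among all k-mers of its length."""
    rank = 0
    for base in kmer:
        # Shift the rank by one base-4 digit and add the current base.
        rank = rank * 4 + ENCODING[base]
    return rank


rank = kmer_rank('AATTAA')
```

With this encoding, each base occupies two bits of the resulting integer, so a k-mer of length k maps to a value between 0 and 4^k - 1.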
Todo.
Todo.