Commit

Merge pull request #13 from jermp/dev
Dev
jermp committed Jul 1, 2022
2 parents f03d70e + 6a0b266 commit 61061b8
Showing 1 changed file with 12 additions and 7 deletions.
19 changes: 12 additions & 7 deletions README.md
@@ -17,18 +17,23 @@ Please, cite these papers if you use SSHash.
 For a dictionary of n k-mers,
 two basic queries are supported:

-- i = Lookup(g), where i is in [0,n) if the k-mer g is found in the dictionary or i = -1 otherwise;
-- g = Access(i), where g is the k-mer associated to the identifier i.
+- i = **Lookup**(g), where i is in [0,n) if the k-mer g is found in the dictionary or i = -1 otherwise;
+- g = **Access**(i), where g is the k-mer associated to the identifier i.

 If also the weights of the k-mers (their frequency counts) are stored in the dictionary, then the dictionary is said to be *weighted* and it also supports:

-- w = Weight(i), where i is a given k-mer identifier and w is the weight of the k-mer.
+- w = **Weight**(i), where i is a given k-mer identifier and w is the weight of the k-mer.

-A membership query (determine if a given k-mer is present in the dictionary or not) is, therefore, supported by means of the Lookup query.
-The dictionary can also stream through all k-mers of a given DNA file
+Other supported queries are:
+
+- **Membership Queries**: determine if a given k-mer is present in the dictionary or not.
+- **Streaming Queries**: stream through all k-mers of a given DNA file
 (.fasta or .fastq formats) to determine their membership to the dictionary.
+- **Navigational Queries**: given a k-mer g[1..k] determine if g[2..k]+x is present (forward neighbourhood) and if x+g[1..k-1] is present (backward neighbourhood), for x = A, C, G, T ('+' here means string concatenation).
+SSHash internally stores a set of strings, called *contigs* in the following, each associated to a distinct identifier.
+If a contig identifier is specified for a navigational query (rather than a k-mer), then the backward neighbourhood of the first k-mer and the forward neighbourhood of the last k-mer in the contig are returned.

-**NOTE**: The Lookup query assumes that two k-mers being the *reverse complement* of each other are the same.
+**NOTE**: It is assumed that two k-mers being the *reverse complement* of each other are the same.

 #### Table of contents
 * [Compiling the Code](#compiling-the-code)
@@ -357,7 +362,7 @@ Below the complete query reports.
 Author
 ------

-Giulio Ermanno Pibiri - <giulio.ermanno.pibiri@isti.cnr.it>
+[Giulio Ermanno Pibiri](https://jermp.github.io) - <giulioermanno.pibiri@unive.it>

 References
 -----
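
Note for readers of this diff: the query semantics described in the updated README text (Lookup, Access, Weight, navigational queries, and the reverse-complement rule) can be illustrated with a toy model. The following is a minimal Python sketch under stated assumptions, not SSHash's implementation or API. The class `ToyKmerDictionary`, its method names, the sorted-list storage, and the choice of the lexicographically smaller strand as the representative of a k-mer are all hypothetical and used only for illustration.

```python
from bisect import bisect_left

# Translation table for DNA complementation.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return kmer.translate(COMPLEMENT)[::-1]


def canonical(kmer):
    # Assumption: a k-mer and its reverse complement are represented by
    # the lexicographically smaller of the two strings.
    return min(kmer, reverse_complement(kmer))


class ToyKmerDictionary:
    """Hypothetical toy model of the queries described in the README."""

    def __init__(self, kmers, weights=None):
        # Store each distinct canonical k-mer once, in sorted order;
        # the position in this list plays the role of the identifier i.
        self.kmers = sorted({canonical(g) for g in kmers})
        self.weights = None
        if weights is not None:
            # weights must provide a count for every stored k-mer.
            by_canon = {canonical(g): w for g, w in weights.items()}
            self.weights = [by_canon[g] for g in self.kmers]

    def lookup(self, g):
        """Return i in [0, n) if g (or its reverse complement) is stored, else -1."""
        c = canonical(g)
        i = bisect_left(self.kmers, c)
        return i if i < len(self.kmers) and self.kmers[i] == c else -1

    def access(self, i):
        """Return the k-mer associated with identifier i."""
        return self.kmers[i]

    def weight(self, i):
        """Return the weight of the k-mer with identifier i (weighted case only)."""
        if self.weights is None:
            raise ValueError("dictionary is not weighted")
        return self.weights[i]

    def neighbours(self, g):
        """Navigational query: test g[2..k]+x and x+g[1..k-1] for x in A, C, G, T."""
        forward = {x: self.lookup(g[1:] + x) != -1 for x in "ACGT"}
        backward = {x: self.lookup(x + g[:-1]) != -1 for x in "ACGT"}
        return forward, backward


d = ToyKmerDictionary(["ACG", "GTA"], weights={"ACG": 5, "GTA": 2})
print(d.lookup("CGT"))  # 0: CGT is the reverse complement of ACG
print(d.lookup("AAA"))  # -1: not in the dictionary
```

In this sketch a membership query is just `lookup(g) != -1`, and a streaming query would amount to calling `lookup` on each consecutive k-mer of an input sequence.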
