fix: sampled suffix array (#447)
* Fixed sampled suffix array sentinels bug and added more tests

* Serde derive for sampled suffix array

* Cargo fmt

* Clarified comments

* Allow Occ to be clonable

* Make fm index accessible in sampled suffix array

* Added unsafe fmd index construction useful for special cases

* Added random test cases
Daniel-Liu-c0deb0t committed Aug 23, 2021
1 parent 125bf20 commit 00f9846
Showing 3 changed files with 250 additions and 141 deletions.
@@ -73,7 +73,7 @@ pub fn invert_bwt(bwt: &BWTSlice) -> Vec<u8> {
}

/// An occurrence array implementation.
-#[derive(Serialize, Deserialize)]
+#[derive(Clone, Serialize, Deserialize)]
pub struct Occ {
occ: Vec<Vec<usize>>,
k: u32,
@@ -480,6 +480,21 @@ impl<DBWT: Borrow<BWT>, DLess: Borrow<Less>, DOcc: Borrow<Occ>> FMDIndex<DBWT, D

self.backward_ext(&interval.swapped(), comp_a).swapped()
}

/// Construct a new instance of the FMD index (see Heng Li (2012) Bioinformatics)
/// without checking whether the text is over the DNA alphabet with N.
/// This expects a BWT that was created from a text over the DNA alphabet with N
/// (`alphabets::dna::n_alphabet()`) consisting of the
/// concatenation with its reverse complement, separated by the sentinel symbol `$`.
/// I.e., let T be the original text and R be its reverse complement.
/// Then, the expected text is T$R$. Further, multiple concatenated texts are allowed, e.g.
/// T1$R1$T2$R2$T3$R3$.
/// It is unsafe to construct an FMD index from an FM index that is not built on the DNA alphabet.
pub unsafe fn from_fmindex_unchecked(
fmindex: FMIndex<DBWT, DLess, DOcc>,
) -> FMDIndex<DBWT, DLess, DOcc> {
FMDIndex { fmindex }
}
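
The text layout that the doc comment describes (T$R$, or T1$R1$T2$R2$... for several texts) can be sketched as follows. This is only an illustration of the expected input, not code from the commit: `complement`, `revcomp` and `fmd_text` are hypothetical helpers written here with the standard library alone, and the BWT construction that would follow is left out.

```rust
/// Complement of one base over the DNA alphabet with N.
/// Hypothetical helper for illustration; anything that is not A, C, G or T
/// (for example N) is returned unchanged.
fn complement(base: u8) -> u8 {
    match base {
        b'A' => b'T',
        b'C' => b'G',
        b'G' => b'C',
        b'T' => b'A',
        other => other,
    }
}

/// Reverse complement of a text (hypothetical helper).
fn revcomp(text: &[u8]) -> Vec<u8> {
    text.iter().rev().map(|&b| complement(b)).collect()
}

/// Concatenate each text T with its reverse complement R as T$R$,
/// producing T1$R1$T2$R2$... for several inputs.
fn fmd_text(texts: &[&[u8]]) -> Vec<u8> {
    let mut out = Vec::new();
    for &t in texts {
        out.extend_from_slice(t);
        out.push(b'$');
        out.extend(revcomp(t));
        out.push(b'$');
    }
    out
}

fn main() {
    let texts: [&[u8]; 2] = [b"GATTACA", b"ACGN"];
    let joined = fmd_text(&texts);
    println!("{}", String::from_utf8_lossy(&joined));
}
```

A BWT built from text in this shape is what `from_fmindex_unchecked` assumes; the function itself performs no check.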

Check warning on line 497 in src/data_structures/fmindex.rs

GitHub Actions / clippy

unsafe function's docs miss `# Safety` section

warning: unsafe function's docs miss `# Safety` section
   --> src/data_structures/fmindex.rs:493:5
    |
493 | /     pub unsafe fn from_fmindex_unchecked(
494 | |         fmindex: FMIndex<DBWT, DLess, DOcc>,
495 | |     ) -> FMDIndex<DBWT, DLess, DOcc> {
496 | |         FMDIndex { fmindex }
497 | |     }
    | |_____^
    |
    = note: `#[warn(clippy::missing_safety_doc)]` on by default
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#missing_safety_doc
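
One way the lint could be satisfied is to move the caller's obligations into a `# Safety` section of the doc comment. The sketch below is an assumption about how that might look, not the fix adopted upstream: `FMIndex` and `FMDIndex` are simplified, non-generic stand-ins for the crate's types, and the `bwt` field is a placeholder payload.

```rust
/// Simplified stand-in for the crate's FM index (hypothetical; the real type
/// is generic over the BWT, Less and Occ containers).
#[derive(Debug, PartialEq)]
pub struct FMIndex {
    pub bwt: Vec<u8>,
}

/// Simplified stand-in for the crate's FMD index (hypothetical).
#[derive(Debug, PartialEq)]
pub struct FMDIndex {
    pub fmindex: FMIndex,
}

impl FMDIndex {
    /// Construct an FMD index without checking the alphabet of the text.
    ///
    /// # Safety
    ///
    /// The caller must ensure that `fmindex` was built from a text over the
    /// DNA alphabet with N, consisting of each sequence followed by its
    /// reverse complement, both terminated by the sentinel `$` (T$R$).
    pub unsafe fn from_fmindex_unchecked(fmindex: FMIndex) -> FMDIndex {
        FMDIndex { fmindex }
    }
}

fn main() {
    // Placeholder bytes only; real code would pass an FM index built from
    // the BWT of a T$R$ text.
    let fmindex = FMIndex { bwt: b"placeholder".to_vec() };

    // SAFETY: in real code, the invariant from the `# Safety` section must
    // hold here; this sketch only demonstrates the call shape.
    let fmdindex = unsafe { FMDIndex::from_fmindex_unchecked(fmindex) };
    println!("{:?}", fmdindex);
}
```

With a `# Safety` heading present in the doc comment, `clippy::missing_safety_doc` no longer fires for the function.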

}

#[cfg(test)]
