chordclust

Chordclust implements similarity clustering using rust-bio.

Algorithm

The algorithm is a greedy search, similar to the one described in https://www.drive5.com/usearch/manual/uclust_algo.html. For now, it uses similarity instead of identity:

Sort the sequences by length (longest first)
For each sequence, compare it with the database of centroids:

If similarity with the best match > T: add the sequence to the cluster of the best match
Else: form a new cluster with the sequence as its centroid
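
The steps above can be sketched in plain Rust as follows. This is a minimal illustration, not the crate's implementation: the names `similarity`, `Cluster` and `greedy_cluster` are hypothetical, and the toy position-wise similarity is an assumption standing in for whatever alignment-based score is computed with rust-bio.

```rust
/// Toy similarity (an assumption for this sketch): fraction of positions that
/// match, relative to the longer sequence. Not an alignment score.
fn similarity(a: &str, b: &str) -> f64 {
    let longest = a.len().max(b.len());
    if longest == 0 {
        return 1.0;
    }
    let matches = a.bytes().zip(b.bytes()).filter(|(x, y)| x == y).count();
    matches as f64 / longest as f64
}

/// A cluster: the index of its centroid plus the indices of all its members
/// (the centroid included). Indices refer to the input slice.
#[derive(Debug, Clone, PartialEq)]
struct Cluster {
    centroid: usize,
    members: Vec<usize>,
}

/// Greedy clustering with similarity threshold `threshold`.
fn greedy_cluster(seqs: &[&str], threshold: f64) -> Vec<Cluster> {
    // Sort indices by sequence length, longest first. The sort is stable,
    // so sequences of equal length keep their input order.
    let mut order: Vec<usize> = (0..seqs.len()).collect();
    order.sort_by(|&a, &b| seqs[b].len().cmp(&seqs[a].len()));

    let mut clusters: Vec<Cluster> = Vec::new();
    for i in order {
        // Compare the sequence with every centroid and keep the best match.
        let mut best: Option<(usize, f64)> = None;
        for (c, cluster) in clusters.iter().enumerate() {
            let s = similarity(seqs[i], seqs[cluster.centroid]);
            if best.map_or(true, |(_, top)| s > top) {
                best = Some((c, s));
            }
        }
        match best {
            // Similar enough: join the cluster of the best match.
            Some((c, s)) if s > threshold => clusters[c].members.push(i),
            // Otherwise the sequence becomes the centroid of a new cluster.
            _ => clusters.push(Cluster { centroid: i, members: vec![i] }),
        }
    }
    clusters
}
```

For example, calling `greedy_cluster(&["ACGTACGT", "ACGTACGA", "TTTTTTTT", "ACGTAC"], 0.7)` with this toy similarity groups the first, second and fourth sequences around the first one and leaves `"TTTTTTTT"` in a cluster of its own.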

Hierarchical

With this kind of heuristic clustering, a hierarchical approach is recommended:

Given the sequences to cluster, seqs, and an array of similarity thresholds [T] in descending order:
For each similarity threshold T in [T]:

Apply clustering with T to seqs
seqs <- current centroids

The final structure is built by expanding the lower similarity clusters with the members of their corresponding higher clusters.
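
A rough sketch of this loop, including the expansion step, is shown below. It is an illustration under assumptions rather than the crate's API: `Level`, `greedy_pass` and `hierarchical` are hypothetical names, and the similarity function is passed in as a parameter so that any scoring (for instance, one computed with rust-bio) could be plugged in.

```rust
/// Clusters at one level, as (centroid index, member indices) pairs.
/// All indices refer to the original input slice.
type Level = Vec<(usize, Vec<usize>)>;

/// One greedy pass restricted to the sequences listed in `active`.
fn greedy_pass(
    seqs: &[&str],
    active: &[usize],
    t: f64,
    sim: &dyn Fn(&str, &str) -> f64,
) -> Level {
    // Longest sequences first.
    let mut order = active.to_vec();
    order.sort_by(|&a, &b| seqs[b].len().cmp(&seqs[a].len()));

    let mut level: Level = Vec::new();
    for i in order {
        // Best-scoring centroid so far, if there is any.
        let best = level
            .iter()
            .enumerate()
            .map(|(c, (centroid, _))| (c, sim(seqs[i], seqs[*centroid])))
            .max_by(|a, b| a.1.total_cmp(&b.1));
        match best {
            Some((c, s)) if s > t => level[c].1.push(i),
            _ => level.push((i, vec![i])),
        }
    }
    level
}

/// Runs one pass per threshold (expected in descending order). Each pass only
/// sees the centroids of the previous one; its clusters are then expanded
/// with the members those centroids already represent.
fn hierarchical(
    seqs: &[&str],
    thresholds: &[f64],
    sim: &dyn Fn(&str, &str) -> f64,
) -> Vec<Level> {
    let mut active: Vec<usize> = (0..seqs.len()).collect();
    // represented[i]: every original sequence currently grouped under i.
    let mut represented: Vec<Vec<usize>> = (0..seqs.len()).map(|i| vec![i]).collect();
    let mut levels = Vec::new();

    for &t in thresholds {
        let pass = greedy_pass(seqs, &active, t, sim);
        // Expand: replace each member (a centroid of the previous level)
        // with all the sequences it represents.
        let expanded: Level = pass
            .iter()
            .map(|(centroid, members)| {
                let all: Vec<usize> = members
                    .iter()
                    .flat_map(|&m| represented[m].clone())
                    .collect();
                (*centroid, all)
            })
            .collect();
        for (centroid, all) in &expanded {
            represented[*centroid] = all.clone();
        }
        // seqs <- current centroids
        active = expanded.iter().map(|(c, _)| *c).collect();
        levels.push(expanded);
    }
    levels
}
```

The returned vector holds one set of clusters per threshold, from the highest similarity to the lowest, so the last entry is the coarsest grouping. Only centroids are compared at the lower thresholds, which keeps the number of comparisons small at the cost of never re-checking non-centroid members.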

License

Licensed under either of

Apache License, Version 2.0, (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

README.md is automatically generated on CI using cargo-readme. Please modify README.tpl or lib.rs instead (check the GitHub workflow for more details).
