Releases: sourmash-bio/sourmash
v4.8.0
Major new features:
Minor new features:
- update prefetch/gather output to be clearer (#2543)
Cleanup and documentation updates:
Developer updates:
- Remove pkg_resources usage (#2505)
- Add LICENSE and test data to sdist (#2490)
- Build pyodide wheels in CI (#2433)
- Update nix deps (#2506)
- bump to pyodide 0.23 (#2545)
Dependabot updates:
- Bump serde_json from 1.0.94 to 1.0.95 (#2540)
- Bump needletail from 0.5.0 to 0.5.1 (#2541)
- Bump serde from 1.0.156 to 1.0.158 (#2534)
- Bump thiserror from 1.0.39 to 1.0.40 (#2533)
- Bump typed-builder from 0.13.0 to 0.14.0 (#2527)
- Bump serde from 1.0.152 to 1.0.156 (#2530)
- Bump minimum rust version to 1.60 (#2528)
- Bump myst-parser from 0.19.1 to 1.0.0 (#2525)
- Bump chrono from 0.4.23 to 0.4.24 (#2524)
- Bump pypa/cibuildwheel from 2.12.0 to 2.12.1 (#2523)
- Bump myst-parser from 0.18.1 to 0.19.1 (#2507)
- Bump rayon from 1.6.1 to 1.7.0 (#2515)
- Bump tempfile from 3.3.0 to 3.4.0 (#2514)
- Bump needletail from 0.4.1 to 0.5.0 (#2512)
- Bump memmap2 from 0.5.9 to 0.5.10 (#2516)
- Bump thiserror from 1.0.38 to 1.0.39 (#2509)
- Bump mymindstorm/setup-emsdk from 11 to 12 (#2508)
- Bump serde_json from 1.0.93 to 1.0.94 (#2510)
- Bump typed-builder from 0.12.0 to 0.13.0 (#2511)
v4.7.0
sourmash release 4.7.0
Major new features:
- provide an initial plugin architecture for sourmash that supports new signature saving & loading mechanisms (#2428)
- add plugin support for new command-line subcommands (#2438)
- debias all containment values (#2243)
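Python plugin systems are commonly built on package entry points. The sketch below assumes that mechanism purely for illustration; the notes above do not say how sourmash discovers its plugins, and the group name `example_tool.plugins` and both helper functions are hypothetical.

```python
from importlib.metadata import entry_points

# Placeholder group name for illustration; not necessarily one sourmash uses.
EXAMPLE_GROUP = "example_tool.plugins"


def find_plugins(group):
    """Return the entry points registered under `group` (possibly empty)."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ returns an object with a select() method.
        return list(eps.select(group=group))
    # Older Pythons return a dict mapping group names to entry points.
    return list(eps.get(group, []))


def load_plugins(group):
    """Import each plugin and return a {name: loaded object} mapping."""
    return {ep.name: ep.load() for ep in find_plugins(group)}


plugins = load_plugins(EXAMPLE_GROUP)
print(f"found {len(plugins)} plugin(s) in {EXAMPLE_GROUP!r}")
```

A package would advertise a plugin by declaring an entry point under the chosen group in its packaging metadata; the host tool then only needs the group name to find it.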
Minor new features:
- Use RankLineageInfo to simplify reading lineages (#2467)
- store taxids in lineageDB (#2466)
- Use new tax classes for taxonomic summarization (#2443)
- add tax summarization dataclasses for safety and flexibility (#2439)
- add `--scaled` to sourmash compare (#2414)
- replace `lca_utils.LineagePair` with `tax_utils.LineagePair` (#2441)
- Add new classes for lineage manipulation (#2437)
Cleanup and documentation updates:
Developer updates:
- fix python tests by bumping tox and pip cache versions (#2424)
- Update sphinx requirement from <6,>=4.4.0 to >=4.4.0,<7 (#2430)
- Build: replace milksnake with maturin (#2393)
- importlib_metadata is a dependency on old Python versions (#2484)
- Release docs: use two separate sed commands (#2483)
- minor fixes to release behavior (#2479)
- Use screed and maturin from nixpkgs in `flake.nix` (#2481)
- update release procedure after v4.6.0 and v4.6.1 (#2386)
- Update makefile and docs (#2432)
Dependabot updates:
- Bump once_cell from 1.17.0 to 1.17.1 (#2488)
- Bump ouroboros from 0.15.5 to 0.15.6 (#2487)
- Bump memmap2 from 0.5.8 to 0.5.9 (#2486)
- Bump supercharge/redis-github-action from 1.4.0 to 1.5.0 (#2485)
- Bump proptest from 1.0.0 to 1.1.0 (#2460)
- Bump web-sys from 0.3.60 to 0.3.61 (#2461)
- Bump serde_json from 1.0.91 to 1.0.93 (#2471)
- Bump wasm-bindgen-test from 0.3.33 to 0.3.34 (#2463)
- Bump cachix/install-nix-action from 18 to 19 (#2459)
- Bump wasm-bindgen from 0.2.83 to 0.2.84 (#2464)
- Bump typed-builder from 0.11.0 to 0.12.0 (#2451)
- Bump bumpalo from 3.9.1 to 3.12.0 (#2450)
- Bump pypa/cibuildwheel from 2.11.4 to 2.12.0 (#2447)
- Bump bzip2 from 0.4.3 to 0.4.4 (#2444)
- Bump once_cell from 1.14.0 to 1.17.0 (#2429)
- Bump serde from 1.0.151 to 1.0.152 (#2423)
- Bump pypa/cibuildwheel from 2.11.3 to 2.11.4 (#2422)
- Bump serde_json from 1.0.89 to 1.0.91 (#2418)
- Bump serde from 1.0.150 to 1.0.151 (#2419)
- Bump thiserror from 1.0.37 to 1.0.38 (#2417)
- Bump finch from 0.4.3 to 0.5.0 (#2416)
- Bump rayon from 1.6.0 to 1.6.1 (#2404)
- Bump serde from 1.0.149 to 1.0.150 (#2403)
- Bump pypa/cibuildwheel from 2.11.2 to 2.11.3 (#2402)
- Bump serde from 1.0.148 to 1.0.149 (#2397)
- Bump capnp from 0.14.5 to 0.14.11 (#2396)
v4.6.1
This is a quick patch-fix for sourmash v4.6.0, which introduced bug #2390. This bug broke `sourmash sketch ... -o <file>` with multiple ksizes, so that `.zip` and `.sqldb` output files contained only one ksize.
Bug fixes:
v4.6.0
The major new feature in this release is the addition of `tax summarize`, which produces a human-readable summary of taxonomy databases.
The various `tax` functions also now support ingest of the output of `tax annotate` as a lineage spreadsheet - see the `tax prepare` documentation. This allows you to (for example) run `tax summarize` on the output of `tax annotate`.
Major new features:
- add `tax summarize` and support gather-tax input to taxonomy functions (#2333)
- report both weighted and unweighted % recovered in gather (#2301)
- replace chernoff bounds with exact probabilities (#2268)
Minor new features:
- switch remaining sig submodule commands over to sourmash_args sig output (#2377)
- use modern signature saving API throughout main CLI commands. (#2338)
- add column 3 to kreport (#2306)
- allow gzipped gather csv inputs to tax (#2339)
- display a better error message on attempting to write a read-only sqlite database (#2376)
- fix manifest load function to properly catch `gzip.BadGzipFile` (#2375)
- update kreport proportion for better resolution; match other tool outputs (#2331)
Bug fixes:
- Fix `multigather` so that the output CSV contains all matches. (#2322)
- remove default ksize of 31 from help message when it's not actually true. (#2295)
Cleanup and documentation updates:
- Updated python version (#2286)
- update docs re using multiple dbs (#2296)
- fix some `tax` doc issues (#2365)
- fix kreport documentation (#2302)
Developer updates:
- Fix cibuildwheel actions (#2384, #2385, #2388)
- return Err for angular_similarity when abundance tracking is off (#2327)
- cargo check fixes for Rust beta (1.65) (#2298)
- fix unnecessary typecasts in Rust (#2366)
- fix `Signature.minhash` API during `sourmash sketch` (#2329)
- fix return type of `LCA_SqliteDatabase.select` (#2382)
Dependabot updates:
- Bump conda-incubator/setup-miniconda from 2.1.1 to 2.2.0 (#2363)
- Bump counter from 0.5.6 to 0.5.7 (#2336)
- Bump finch from 0.4.1 to 0.4.3 (#2283)
- Bump getrandom from 0.2.7 to 0.2.8 (#2347)
- Bump memmap2 from 0.5.7 to 0.5.8 (#2364)
- Bump myst-parser from 0.18.0 to 0.18.1 (#2345)
- Bump pypa/cibuildwheel from 2.11.1 to 2.11.2 (#2353)
- Bump pypa/cibuildwheel from 2.9.0 to 2.10.2 (#2307)
- Bump rayon from 1.5.3 to 1.6.0 (#2373)
- Bump serde from 1.0.145 to 1.0.147 (#2348)
- Bump serde from 1.0.147 to 1.0.148 (#2378)
- Bump serde_json from 1.0.86 to 1.0.87 (#2349)
- Bump serde_json from 1.0.87 to 1.0.88 (#2374)
- Bump serde_json from 1.0.88 to 1.0.89 (#2379)
- Bump typed-builder from 0.10.0 to 0.11.0 (#2356)
- Update bitstring requirement from <4,>=3.1.9 to >=3.1.9,<5 (#2372)
- Update docutils requirement from <0.18,>=0.17.1 to >=0.17.1,<0.20 (#2344)
- Update pytest requirement from <7.2.0,>=6.2.4 to >=6.2.4,<7.3.0 (#2354)
- Update pytest-cov requirement from <4.0,>=2.12 to >=2.12,<5.0 (#2346)
- Version bumps (#2282)
- Combine latest dependabot PRs: cibuildwheel, finch, serde_json (#2343)
- Rust deps updates without MSRV bump (#2315)
v4.5.0
sourmash v4.5.0 provides several minor bug fixes, as well as a number of new features.
This release also includes two minor Python API breaking changes - by default, `SourmashSignature` objects loaded from files are "frozen", and we force explicit keyword arguments on `MinHash` object construction.
Finally, this release updates the sourmash documentation with several new tutorials, including one on using `sourmash tax` to classify metagenomes with MAGs + GTDB.
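The two API changes mentioned above follow general Python patterns: keyword-only constructor arguments and objects that refuse modification. The sketch below illustrates those patterns with hypothetical stand-in classes (`ExampleSketch`, `ExampleFrozenSignature`); it does not reproduce the sourmash classes or how they are implemented.

```python
from dataclasses import dataclass, FrozenInstanceError


class ExampleSketch:
    """Hypothetical stand-in for a sketch type; not the sourmash MinHash API."""

    def __init__(self, *, ksize, scaled):
        # The bare `*` makes every parameter keyword-only.
        self.ksize = ksize
        self.scaled = scaled


@dataclass(frozen=True)
class ExampleFrozenSignature:
    """Hypothetical read-only record; assignment after creation raises."""
    name: str
    ksize: int


sketch = ExampleSketch(ksize=31, scaled=1000)  # OK: explicit keywords

try:
    ExampleSketch(31, 1000)  # positional arguments are rejected
except TypeError as exc:
    print("TypeError:", exc)

sig = ExampleFrozenSignature(name="genome-A", ksize=31)
try:
    sig.name = "renamed"  # frozen objects refuse modification
except FrozenInstanceError:
    print("cannot modify a frozen signature")
```

Code written against earlier releases that passes constructor arguments positionally, or that modifies loaded signatures in place, is the kind of code these changes would affect.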
Bug fixes
- Fix `sourmash tax` argument parsing for multiple `-g` and `-t` arguments (#2218)
- Prevent loading multiple independent gather results files in `sourmash tax` (#2244)
- Fix `query_abundance` column when `--ignore-abundance` is set in gather (#2251)
- fix pickle protocol to properly adjust `ksize` in `__getstate__` (#2265)
- clean up zip error handling for bad zip files (#2270)
Minor new features
- Use the bias factor for containment when estimating ANI (#2057)
- add human output format to `sourmash tax`; provide tutorials (#2158)
- add kreport output format to tax metagenome (#2239, #2249)
- add `--distance-matrix` option to `sourmash compare` (#2225)
- update database load UX for `gather` etc. (#2204)
- add generic support for gzipped and zipfile CSVs (#2195)
- implement `tax grep` to produce identifier picklists from taxonomies (#2178)
Cleanup and documentation fixes
- add `sourmash tax` tutorial (#2158)
- revise command-line docs for `sourmash sig` subcommands (#1714, #1717)
- Clarify containment direction for matrix output (#2215)
- Add ANGUS tutorial to docs (#1114)
- update links to static rmd (#1177)
- update `search` documentation, help, and output. (#2222)
- Fix signature filter command (#2159)
- fix notification message about query scaled (#2183)
- adjust gather output width on terminal (#2176)
Developer updates
- Add `FrozenSourmashSignature` (#1610)
- force explicit kwargs on MinHash constructor (#2174)
- fix ReadTheDocs by using a more recent conda version (#2231)
- refactor and add tests for containment direction for ANI calculation (#2215)
- fix `test_storage_convert` to allow success of `sourmash convert` (#2232)
- Updating `tests/test_sourmash.py::test_storage_convert` to use `runtmp` fixture instead of `utils.TempDirectory()` (#1739)
- Bump pypa/cibuildwheel from 2.8.1 to 2.9.0 (#2207)
- use stderr for test output printing (#2217)
- fix for sphinx 5.10 (#2147)
v4.4.3
v4.4.2
Minor fixes and performance improvements:
- circumvent a very slow `MinHash.remove_many(...)` call in `sourmash gather` (#2123)
Developer updates:
- substantial refactoring of `CounterGather` and related `Index` code. (#2116)
- update `Index` protocol tests to include tests for `peek` and `consume` (#2111)
- Bump pypa/cibuildwheel from 2.7.0 to 2.8.0 (#2118)
- test insert after downsample for LCA_Database (#2117)
- update release notes & pyproject.toml after v4.4.1 (#2114)
v4.4.1
Major new features:
- less stringent size accuracy parameters for ANI accuracy reporting (#2074)
- only skip dist est if containment/jaccard are 0 or 1 (#2060)
- emit fewer warnings about potential ANI estimation issues (#2061)
Minor new features:
- fix `lca summarize` to support general collections for queries (#2107)
- add compare --avg-containment (#2056)
Documentation updates:
- fix search and gather docs (#2105)
- fix `CITATION.cff` YAML and add a test for parseability and content. (#2103)
Developer updates:
- move setup.cfg into pyproject.toml (#2097)
- Fix downsample_scaled in `core` (#2108)
- add picklist tests; support for allow_empty (#2106)
- remove LazyLoadedIndex (#2104)
- Bump web-sys from 0.3.57 to 0.3.58 (#2092)
- Bump getrandom from 0.2.6 to 0.2.7 (#2090)
- Bump wasm-bindgen-test from 0.3.30 to 0.3.31 (#2093)
- Bump pypa/cibuildwheel from 2.6.1 to 2.7.0 (#2089)
- Build: nix updates (#2088)
- CI: split wheel building (#2087)
- rust version bumps (#2086)
- Update sphinx requirement from <5,>=4.4.0 to >=4.4.0,<6 (#2068)
- Bump actions/setup-python from 3 to 4 (#2080)
- Bump myst-parser from 0.17.2 to 0.18.0 (#2081)
- Bump pypa/cibuildwheel from 2.5.0 to 2.6.1 (#2079)
- remove unnecessary `object` from `class` definitions (#2077)
v4.4.0
This release contains many new features! Of particular note:
- sourmash now estimates and outputs average nucleotide identity (ANI) based on k-mer measures;
- `sourmash sketch translate` is no longer unusably slow;
- we provide Mac OS 'arm64' wheels for the new M1 Macs;
- we've added a number of support features for managing large collections of signatures and building very large databases;
- and we've added support for SQLite databases that can be used for storing and searching signatures and doing Kraken-style LCA analysis of genomes and metagenomes.
In addition, we have built updated Genbank genome databases (with contents from March 2022) as well as GTDB R07-RS207 databases; see the prepared databases page. We've also made some benchmarks available for these databases, so you can get some idea of the necessary computational resources for your searches.
Last but by no means least, we have begun providing a number of examples and recipes for using sourmash - see the new sourmash examples Web site!
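To make the ANI item above more concrete: under a simple model in which each base mutates independently, a k-mer survives intact with probability ANI^k, so containment C gives the point estimate ANI ≈ C^(1/k). The sketch below applies that relationship to toy hash sets. It is an illustration under that stated model, not the sourmash implementation, and both function names are hypothetical.

```python
def ani_from_containment(containment, ksize):
    """Point estimate of ANI from k-mer containment: C ** (1/k)."""
    if not 0.0 <= containment <= 1.0:
        raise ValueError("containment must be between 0 and 1")
    if ksize < 1:
        raise ValueError("ksize must be a positive integer")
    return containment ** (1.0 / ksize)


def containment(query_hashes, subject_hashes):
    """Fraction of the query's hashes that also occur in the subject."""
    query_hashes = set(query_hashes)
    if not query_hashes:
        return 0.0
    return len(query_hashes & set(subject_hashes)) / len(query_hashes)


# Toy hash sets standing in for sketches of two genomes.
query = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
subject = {1, 2, 3, 4, 5, 6, 7, 8, 20, 30}

c = containment(query, subject)  # 8 of the 10 query hashes are shared
print(f"containment = {c:.2f}")
print(f"estimated ANI (k=31) = {ani_from_containment(c, 31):.4f}")
```

Because the exponent is 1/k, even a moderate containment maps to a high ANI at typical k-mer sizes, which is why k-mer measures remain informative for closely related genomes.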
Major new features:
- add ANI output to search, prefetch, and gather (#1934, #1952, #1955, #1966, #1967, #2011, #2031, #2032)
- new GTDB and Genbank database releases (#2013, #2038)
- provide macos arm64 wheels (#1935)
- support for SQLite databases (#1808)
- implement `sourmash sketch fromfile` (#1884, #1885, #1886, #2009)
- add `sourmash sig check` for comparing picklists and databases (#1907, #1915, #1917)
- add `sig collect` command (#2036) for building standalone manifests from many databases
- Add direct loading of manifest CSVs as sourmash indices (#1891)
- add `-A/--abundance-from` to `sig subtract` & add `sig inflate` (#1889)
- advanced database format documentation (#2025)
Minor new features:
- add `-d/--debug` to `sourmash sig describe`; upgrade output errors. (#1782)
- add `sum_hashes` to `sourmash sig describe` output. (#1882)
Bug fixes:
- catch TypeError in search w/abund vs flat at the command line (#1928)
- speed up `SeqToHashes` `translate` (#1938, #1946)
Cleanup and documentation fixes:
- better handle some pickfile errors (#1924)
- remove unnecessary downsampling warnings (#1971)
- use same wording for dayhoff/hp as for dna/protein (#1929)
- rename `covered_bp` property to better reflect function (#2050)
Developer updates:
- provide "protocol" tests for `Index`, `CollectionManifest`, and `LCA_Database` classes (#1936)
- remove khmer CI tests (#1950)
- Benchmarks for seq_to_hashes in protein mode (#1944)
- add some tests for Jaccard output ordering (#1926)
- Oxidize ZipStorage (#1909)
- cleanup and commenting of `test_index.py` tests. (#1898, #1900)
- rationalize `_signatures_with_internal` (#1896)
- Convert nix to flakes (#1904)
- fix docs build (#1897)
- Fix build/CI and unused imports papercuts (#1974)
- fix hypothesis CI (#2028)
- dependabot version updates (#1977, #1978, #1979, #1980, #1981, #1982, #1983, #1984, #1985, #1986, #1987, #1988, #1989, #1991, #1993, #1994, #1995, #1996, #1997, #1998, #2017, #2019, #2020, #2021, #2022, #2023, #2042)
v4.3.0
New features:
- add `sourmash sig grep` (#1864)
- add `sourmash sig summarize` (#1837, #1863)
- add `--include-db-pattern` and `--exclude-db-pattern` to many commands (#1871)
- update lca summarize output to output total counts (#1838)
Bug fixes:
- fix `sourmash prefetch` to work when db scaled is larger than query scaled (#1870)
- fix `sourmash prefetch` for multiple ksizes in database (#1866)
- allow missing columns in tax CSV files (#1869)
- fix containment calculation for nodegraphs (#1862)
- fix `tax prepare` SQL code for empty/blank taxonomic ranks (#1843)
Cleanup and documentation fixes:
- clean up 'describe' a little bit, add a test (#1861)
- add --output-dir as alias for every --outdir (#1817)
- fix doc titles in `command-line.md` and update description a bit (#1874)
Developer updates: