Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] Fix multigather so that the output CSV contains all matches. #2322

Merged
merged 4 commits into from
Oct 14, 2022

Conversation

ctb
Copy link
Contributor

@ctb ctb commented Oct 10, 2022

This PR fixes #2321 so that more than one output line is placed in the CSV. Oops!

It also adds a notification of what the CSV output file name is.

Last but not least, it supports --output-dir as a way to set the base path for all output files.

Fixes #2321.

TODO:

@codecov
Copy link

codecov bot commented Oct 10, 2022

Codecov Report

Merging #2322 (0f488ec) into latest (5518c6f) will increase coverage by 7.30%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           latest    #2322      +/-   ##
==========================================
+ Coverage   84.84%   92.14%   +7.30%     
==========================================
  Files         131      100      -31     
  Lines       15689    11420    -4269     
  Branches     2190     2191       +1     
==========================================
- Hits        13311    10523    -2788     
+ Misses       2083      602    -1481     
  Partials      295      295              
Flag Coverage Δ
python 92.14% <100.00%> (+<0.01%) ⬆️
rust ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/sourmash/cli/multigather.py 100.00% <100.00%> (ø)
src/sourmash/commands.py 90.28% <100.00%> (+0.03%) ⬆️
src/core/src/ffi/index/revindex.rs
src/core/src/ffi/cmd/compute.rs
src/core/src/index/search.rs
src/core/src/ffi/hyperloglog.rs
src/core/src/signature.rs
src/core/src/sketch/minhash.rs
src/core/tests/storage.rs
src/core/src/ffi/mod.rs
... and 23 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@ctb ctb changed the title Fix multigather so that the output CSV contains all matches. [WIP] Fix multigather so that the output CSV contains all matches. Oct 13, 2022
@ctb ctb changed the title [WIP] Fix multigather so that the output CSV contains all matches. [MRG] Fix multigather so that the output CSV contains all matches. Oct 13, 2022
@ctb
Copy link
Contributor Author

ctb commented Oct 13, 2022

@sourmash-bio/devs @dkoslicki ready for review!

@ctb ctb closed this Oct 13, 2022
@ctb ctb reopened this Oct 13, 2022
Copy link
Collaborator

@dkoslicki dkoslicki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me! When I have a chance, I'll probably give a shot at parallelizing this. I have plans for 1000's of gathers

src/sourmash/cli/multigather.py Show resolved Hide resolved
@ctb ctb merged commit 49008f1 into latest Oct 14, 2022
@ctb ctb deleted the fix/multigather_indent branch October 14, 2022 02:05
olgabot pushed a commit that referenced this pull request Aug 21, 2023
Note: PR into #2065.

This PR updates #2065 with all the changes from `latest`. This includes
the fix & update to multigather in
#2322, which:
- fixes the output of multigather to include more than one line 😆 
- prints out the output filename
- supports `--output-dir`

No functional changes are made beyond the merge, so some tests are still
failing; will discuss fixes in yet a new PR :).

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Keya Barve <53328492+keyabarve@users.noreply.github.com>
Co-authored-by: ccbaumler <63077899+ccbaumler@users.noreply.github.com>
Co-authored-by: Tessa Pierce Ward <bluegenes@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Taylor Reiter <taylorreiter@gmail.com>
Co-authored-by: Erik Young <eeyoung@ucdavis.edu>
Co-authored-by: David Koslicki <dmkoslicki@gmail.com>
Co-authored-by: Luiz Irber <luizirber@users.noreply.github.com>
Co-authored-by: Colton Baumler <baumlerc@farm.ucdavis.edu>
Co-authored-by: Luiz Irber <contact+github@luizirber.org>
Co-authored-by: N. Tessa Pierce-Ward <ntpierce@gmail.com>
Co-authored-by: Peter Cock <p.j.a.cock@googlemail.com>
Co-authored-by: Francesco Beghini <francesco.beghini@yale.edu>
Co-authored-by: Jason Stajich <jason.stajich@ucr.edu>
Co-authored-by: Katrin Leinweber <9948149+katrinleinweber@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

multigather only saves first match to CSV
2 participants