Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more SubstitutionMatrix operations #60

Merged
merged 2 commits into from
Jul 1, 2022

Conversation

timholy
Copy link
Contributor

@timholy timholy commented Nov 22, 2021

SubstitutionMatrix is not a subtype of AbstractArray.
This may or may not be intentional, but currently the number of
supported operations is fairly minimal.
This provides a few more conveniences for users who might want to
manipulate them beyond mere lookup tables.

Types of changes

This PR implements the following changes:
(Please tick any or all of the following that are applicable)

  • ✨ New feature (A non-breaking change which adds functionality).
  • πŸ› Bug fix (A non-breaking change, which fixes an issue).
  • πŸ’₯ Breaking change (fix or feature that would cause existing functionality to change).

πŸ“‹ Additional detail

  • This implements similar, isassigned, a meaningful eltype, and logical indexing for SubstitutionMatrix.
  • These are useful utilities if, for example, a user creates a new measure and wants to compare the values against standard substitution matrix values.
  • The tests are good examples of the functionality.
  • This primarily implements new behavior. The only change is that setindex! now returns val rather than the matrix. This allows assignment operations to be chained and is the standard way setindex! should behave. (I would call the old behavior a bug, hence the minor increment rather than breaking release.)

β˜‘οΈ Checklist

  • 🎨 The changes implemented is consistent with the julia style guide.
  • πŸ“˜ I have updated and added relevant docstrings, in a manner consistent with the documentation styleguide.
  • πŸ“˜ I have added or updated relevant user and developer manuals/documentation in docs/src/.
  • πŸ†— There are unit tests that cover the code changes I have made.
  • πŸ†— The unit tests cover my code changes AND they pass.
  • πŸ“ I have added an entry to the [UNRELEASED] section of the manually curated CHANGELOG.md file for this repository.
  • πŸ†— All changes should be compatible with the latest stable version of Julia.
  • πŸ’­ I have commented liberally for any complex pieces of internal code.

Most of the items in this checklist don't apply, since these are just "missing" Base methods that I'd expect from anything with Matrix in the name.

@timholy
Copy link
Contributor Author

timholy commented Nov 22, 2021

Alternatively, I'd be happy to make this a subtype of AbstractMatrix. I think it mostly comes down to whether you want to support indexing with integers or not. My guess is "not," and that's the reason these are not subtypes.

@codecov
Copy link

codecov bot commented Nov 22, 2021

Codecov Report

Merging #60 (a0c70c3) into master (5a3354e) will increase coverage by 0.44%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master      #60      +/-   ##
==========================================
+ Coverage   87.64%   88.08%   +0.44%     
==========================================
  Files          16       16              
  Lines        1052     1133      +81     
==========================================
+ Hits          922      998      +76     
- Misses        130      135       +5     
Impacted Files Coverage Ξ”
src/submat.jl 98.44% <100.00%> (+0.37%) ⬆️
src/pairwise/algorithms/banded_needleman_wunsch.jl 75.00% <0.00%> (-1.36%) ⬇️
src/pairwise/algorithms/needleman_wunsch.jl 83.59% <0.00%> (-0.71%) ⬇️
src/anchors.jl 100.00% <0.00%> (ΓΈ)
src/pairwise/algorithms/edit_distance.jl 100.00% <0.00%> (ΓΈ)
src/pairwise/algorithms/hamming_distance.jl 100.00% <0.00%> (ΓΈ)
src/pairwise/alignment.jl 98.26% <0.00%> (+0.04%) ⬆️
src/pairwise/algorithms/common.jl 90.62% <0.00%> (+0.30%) ⬆️
src/models.jl 85.50% <0.00%> (+0.43%) ⬆️
... and 5 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Ξ” = absolute <relative> (impact), ΓΈ = not affected, ? = missing data
Powered by Codecov. Last update 5a3354e...a0c70c3. Read the comment docs.

SubstitutionMatrix is not a subtype of AbstractArray.
This may or may not be intentional, but currently the number of
supported operations is fairly minimal.
This provides a few more conveniences for users who might want to
manipulate them beyond mere lookup tables.
@kescobo
Copy link
Member

kescobo commented Nov 22, 2021

I didn't write this, so I'm speculating, but I'm guessing the reason the wasn't implemented previously is that substitution matrices tend to be pretty static, and don't require much beyond being able to reference a particular value. Things like BLOSUM have been around for decades more or less unchanged.

That said, I can't think of any reason NOT to do this, other than the up-front effort that you've already expended. So I'm in favor of merging. Additional maintenance burden seems trivial.

@timholy
Copy link
Contributor Author

timholy commented Nov 22, 2021

Out of curiousity, do you happen to know which version is implemented? https://www.nature.com/articles/nbt0308-274

@kescobo
Copy link
Member

kescobo commented Nov 22, 2021

We have all of them - I think they're just downloaded direct from NCBI: https://biojulia.net/BioAlignments.jl/dev/pairalign/#Substitution-matrix-types-1

@timholy
Copy link
Contributor Author

timholy commented Nov 22, 2021

Right, I just meant, does NCBI host the original or corrected versions? But now I see the "corrected" versions are called RBLOSUMxx, so I'm guessing they must be the original ones.

@kescobo
Copy link
Member

kescobo commented Nov 22, 2021

Oh, I see - I misread that reference 😳. I think you're right

@CiaranOMara CiaranOMara mentioned this pull request Jun 8, 2022
@CiaranOMara
Copy link
Member

CiaranOMara commented Jun 8, 2022

I think the hold up here is that nobody has been able to provide definitive clarity around the types used. Unless the original authors @bicycle1885 or @SabrinaJaye can comment, someone would need to go through the code and rationalise the already made decisions, though I think @timholy has done this and is most familiar with this aspect of the code. So @timholy, I am happy to take your preference, whether to subtype from AbstractMatrix or AbstractSubstitutionMatrix. I also agree with @kescobo that maintenance either way would seem trivial.

@kescobo kescobo modified the milestone: Version 3.0 Jun 16, 2022
@CiaranOMara CiaranOMara merged commit 1ea30c7 into BioJulia:master Jul 1, 2022
CiaranOMara pushed a commit that referenced this pull request Jul 1, 2022
SubstitutionMatrix is not a subtype of AbstractArray.
This may or may not be intentional, but currently, the number of supported operations is fairly minimal.
This provides a few more conveniences for users who might want to manipulate them beyond mere lookup tables.
@timholy timholy deleted the teh/abstractarray branch May 24, 2024 14:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

None yet

3 participants