Skip to content

Matrix subset according to type of input #3485

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 33 commits into
base: v15
Choose a base branch
from

Conversation

dvd101x
Copy link
Collaborator

@dvd101x dvd101x commented May 31, 2025

Hi Jos,

This addresses #2344

It changes the behavior of index to accommodate for scalar indices and the size of the output depends on the input.

I'm still doing some tests and reviewing what documentation needs to change, but it all seems to work according to:
#2344 (comment)

A = [1,2,3; 4,5,6]
A[2,3]  --> 6
A[[1,2],[2,3]] --> [2,3; 5,6]
A[[2],[2,3]] --> [[5, 6]]
A[2,[2,3]] --> [5,6]
A[[1,2],[2]] --> [2; 5]  // which is the same as [[2], [5]]
A[[1,2],2] --> [2,5]
A[[2], [2]] --> [[5]]
A[2, [2]] --> [5]
A[[2], 2] --> [5]
A[[],[]] --> []
A[1,[]] --> []
A[[],1] --> []

@dvd101x
Copy link
Collaborator Author

dvd101x commented May 31, 2025

I think it's mostly ready for review.

Just wanted to review the following,

The elements of an index are called ranges like index(range1, range2) but the ranges are not always Range. Sometimes ranges are number, string, bigNumber, Array, Matrix, Array of Booleans, Matrix of Booleans or Range.

So I was thinking if in the documentation and maybe in the code to find a more general name for the elements of an index like index(indexElement1, indexElement2).

As a reference in numpy it's more like you slice with indices and there is no index class. So you slice A[index1, index2]

https://mathjs.org/docs/reference/functions/index.html

@josdejong josdejong changed the base branch from develop to v15 June 4, 2025 08:45
@josdejong josdejong added this to the v15 milestone Jun 4, 2025
@josdejong josdejong mentioned this pull request Jun 4, 2025
6 tasks
@josdejong
Copy link
Owner

The code looks really neat, thanks David!

Some thoughts:

  1. This is a breaking change, therefore I've changed this PR to merge into the v15 branch, and listed it at #3453.

  2. So I was thinking if in the documentation and maybe in the code to find a more general name for the elements of an index like index(indexElement1, indexElement2).

    Maybe we can call the arguments index(dim1, dim2, ...)?

  3. Does this PR have impact on the performance of subset?

  4. I'm a bit concerned about how hugely breaking this change is. Existing code will probably still work but now return differently dimensioned matrices, which can give subtle and hard to debug errors. Can we somehow help people migrate, or detect "old" usage and warn the user? Or make it backward compatible with a config flag?

@dvd101x
Copy link
Collaborator Author

dvd101x commented Jun 4, 2025

Thanks for your review Jos,

  1. Ok.
  2. Good idea! that aligns better with the rest of the code base.
  3. Makes a few of the tests slightly faster within the margin of error, but the main speed benefit might be the Matrix creation during the creation of Index, which is outside the scope of the subset benchmark. I will add some tests including the Index creation.
  4. I agree it's a big change. I'm not sure how to proceed. I'll give it some thought.

@dvd101x
Copy link
Collaborator Author

dvd101x commented Jun 7, 2025

Regarding the benchmarks, I said something wrong. The current benchmarks already take into account the index creation.

The main benefit is the index creation, if tested on its own it's about twice as fast. But if tested on slicing as I said it's slightly faster.

image

So this task was already done.

Regarding the new behavior, I think it's possible to do the config flag, what would be a good name for it? Like config.legacy ?

@josdejong
Copy link
Owner

Thanks for testing the performance David, most important is that there is not a sudden detoriation. Little performance improvements are of course always welcome :)

Regarding the new behavior, I think it's possible to do the config flag, what would be a good name for it? Like config.legacy ?

Great to hear a config flag is most likely possible. I think we should give it a name specific for this very subset/index refactoring. Something like config.legacyIndex or config.backwardCompatibleIndex? I think I like config.legacyIndex more of these two but no strong feelings.

@dvd101x
Copy link
Collaborator Author

dvd101x commented Jun 19, 2025

Thanks Jos!

It might be too complicated to change the behavior of Index so I would like to try first to change subset with config.legacySubset flag. I will try this first and report back for your review.

@dvd101x
Copy link
Collaborator Author

dvd101x commented Jun 26, 2025

Hi, this is still a work in progress but I think the overall implementation is working. Next steps are:

  • add tests for subset with config.legacySubset
  • sylvester, row and column depend on subset, thus they should save current config.legacySubset, change it to false and restore its original value at the end.
  • change documentation.

@dvd101x
Copy link
Collaborator Author

dvd101x commented Jul 2, 2025

I think it's mostly done, the only thing left is changing the documentation.

@dvd101x
Copy link
Collaborator Author

dvd101x commented Jul 3, 2025

Hi, I included the changes in documentation and it's ready for review.

Copy link
Owner

@josdejong josdejong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks David, this is an impressive piece of work! I'm really happy you managed to implement the legacySubset config option, without it it will be a really painful upgrade for many users I think.

I made a few minor inline comments, can you have a look at those?

@@ -274,9 +274,30 @@ created using the function `index`. When getting a single value from a matrix,
`subset` will return the value itself instead of a matrix containing just this
value.

The function `subset` normally returns a subset, but when getting or setting a
single value in a matrix, the value itself is returned.
**New behavior:**
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think people upgrading to this breaking change will have the following question: "I upgrade to v15 and now my existing code is broken. How do I migrate"?

Can you write a bit more about how to update existing code? I think it should be a separate section with a title like "Migrate to v15". I think a small table showing some examples of old notation and new notation side by side (to get the same result) would help. What do you think?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good idea, I included that in the latest commit.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I would like to try to simplify the table a bit, it's showing quite a lot at the same time. How about this:

const m = math.matrix([[1, 2, 3], [4, 5, 6]])
v14 code v15 equivalent code Result
math.subset(m, math.index([0, 1], [1, 2])) No change needed [[2, 3], [5, 6]]
math.subset(m, math.index(1, [1, 2])) math.subset(m, math.index([1], [1, 2])) [[5, 6]]
math.subset(m, math.index([0, 1], 2)) math.subset(m, math.index([0, 1], [2])) [[3], [6]]
math.subset(m, math.index(1, 2)) No change needed 6

I would also love to add a similar "Migration section" to the Matrix section on the Syntax page: https://mathjs.org/docs/expressions/syntax.html#matrices. To help people migrate that only use the expression parser and not the JS API. Can you please add that too?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, yes, it makes sense, I will work on it and report back soon.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, the comments are addressed and it's ready for review.

@dvd101x
Copy link
Collaborator Author

dvd101x commented Jul 4, 2025

Thanks Jos, for your kind words and thorough review.

Will address these topics and report back in the following days.

@dvd101x
Copy link
Collaborator Author

dvd101x commented Jul 7, 2025

Hi, the topics are addressed in general, please let me know if the chapter of migration with this new behavior should be more in detail or with a different focus.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants