StateVector indexing leads to unnecessarily complicated code #186

erogers-dstl · 2020-04-06T11:37:55Z

Firstly, let me say that I think the StateVector type generally useful and the removes confusion (keeping vectors as column vectors is a good idea).

Having said that, in the code I have been writing recently, I have been find some 'gotchas' in the behaviour of StateVectors and I think it could be changed to improve things.

The main issue is that my_state_vector[1] returns a 1x1 Statevector rather than a number (int, float etc). If you have

my_list = [0, 1, 2]
my_array = np.array([0, 1, 2])
my_sv = StateVector([0, 1, 2])

then the following are true (and expected)

my_list[1] == 1 and isinstance(my_list[1], int)
my_array[1] == 1 and isinstance(my_array[1], int)

However the following is NOT true

my_sv[1] == 1 and isinstance(my_sv[1], int)

In fact (in psuedocode) my_sv[1] == StateVector([[1]]) i.e. a 1x1 StateVector with the right value, but unexpected type. To work around this, my code (and a lot of code in the code base) has to use the pattern my_sv[1, 0]. If we do that then:

my_sv[1, 0] == 1 and isinstance(my_sv[1, 0], int)

as expected.

Current behaviour also means that iterating through a StaveVector produces a sequence of StateVector rather than the expected sequence of numbers.

There is a fairly simple fix to this, overriding the behaviour of StateVector.__getitem__() and I will submit a pull request shortly demonstrating this. The change does not break the existing pattern m_sv[1, 0], but it does break a few tests that rely on iteration. However, I have fixed those in the pull request, and I would argue that the fixes reduce complexity and increase readability.

I acknowledge that this change may be personal preference and would appreciate other thoughts on this. To me, though, it seems a simple improvement, without much downside.

The text was updated successfully, but these errors were encountered:

sdhiscocks · 2020-04-06T15:21:33Z

I'd be hesitant to do this, as it's unknown what impact this might have when passing a StateVector to numpy, scipy or other 3rd party libraries. It also have different behaviour if in some cases you have a mix of StateVector and standard column vectors in your code.

erogers-dstl · 2020-04-06T16:08:59Z

Hmm, that's an interesting point about the interface with other libraries. I hadn't considered that.

In the existing codebase, it works as expected, but I suppose maybe it could cause subtle bugs elsewhere. I would argue that you "shouldn't" be using other column vectors, but not sure if we can assume it.

I guess to me it comes down to which is the "lesser of two evils"

erogers-dstl · 2020-04-06T18:45:10Z

I've been thinking more (always dangerous I know!)

While there could be some edge cases where library code breaks, it would have to be doing something a bit odd: deliberately selecting a 1x1 array out of a column vector. If it tried to select an element, my code would not break it. If it is done in c code, my code would not break the c interface.

This is not to say my code couldn't break numpy, just that it would be an odd case if it did.

sdhiscocks · 2020-04-07T08:11:44Z

I'm not sure that saving typing of , 0 is significant to outweigh unexpected consequences. The PR already has an example where tuples cause incorrect behaviour with this change.

erogers-dstl · 2020-04-07T08:58:21Z

OK, let's drop this for now.

There is a (sort of) work around: using my_statevector.flat which solves some of the problems.

Just for info, I would say this is more about unexpected consequences before the change, than clean code after. This, of course, depends on what you expect if the developer is more used to column vectors than I am, they may get exactly the behaviour they expect.

I would also argue that the tuple change above is "incorrect" before, rather than after, though again that is debatable. The extra index ([tuple, 0] tips numpy into advanced indexing mode (which I doubt was what the developer had in mind when they used a tuple: everywhere else mappings are lists or arrays.

Anyway, should I mark this closed and close the PR, or does anyone else want to chime in?

sdhiscocks · 2020-04-07T10:16:00Z

OK, let's drop this for now.

There is a (sort of) work around: using my_statevector.flat which solves some of the problems.

What issue are you having?

Just for info, I would say this is more about unexpected consequences before the change, than clean code after. This, of course, depends on what you expect if the developer is more used to column vectors than I am, they may get exactly the behaviour they expect.

I think it better to stick with standard way column vectors are handled in numpy, as this will be more familiar with community than introducing a bespoke way.

I would also argue that the tuple change above is "incorrect" before, rather than after, though again that is debatable. The extra index ([tuple, 0] tips numpy into advanced indexing mode (which I doubt was what the developer had in mind when they used a tuple: everywhere else mappings are lists or arrays.

Again, this is acceptable usage in numpy, but to be fair does catch people out. Generally tuples are preferable, as list and arrays are mutable types that are really designed for other usage than just a fixed sequence. It wont be obvious that a mapping will necessarily be used in slicing inside the code (from perspective of someone using the component), as such if this change was made, this would have to be factored in.

Anyway, should I mark this closed and close the PR, or does anyone else want to chime in?

I wouldn't close yet as to give others chance to comment.

erogers-dstl · 2020-04-07T10:57:00Z

What issue are you having?

This is very topical as (even after being aware of the issue yesterday) I just ran into it: I am testing some code I wrote yesterday afternoon and ran into a bug which I think makes a good example:

The (minimal working example) code in question is

velocity = StateVector([0, 0, 1])
_, bearing, elevation = cart2sphere(*velocity)
orientation = StateVector([0, bearing, elevation])
assert np.allclose(orientation, StateVector([0, 0, np.pi/2]))

This, to me, looks entirely reasonable (given that cart2sphere takes 3 input arguments, rather than a vector). Interestingly it's not, and the error is (which I at least couldn't have predicted):
TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

The problem (which isn't obvious to me until a fair amount of debugging) is that *velocity splits into a sequence of 1x1 matrices: therefore bearing and elevation are 1x1 matrices, so orientation.dtype == object, and so isclose can't parse it.

Workarounds:

_, bearing, elevation = cart2sphere(*velocity.flat)
_, bearing, elevation = cart2sphere(*velocity[:, 0])

The workarounds aren't too bad, but the unexpected behaviour what I'd prefer to avoid.

erogers-dstl · 2020-04-08T06:54:52Z

Steve just pointed out another issue: How does setting items work?

What does state_vector[1] = 10 do?

I need to look into this...

erogers-dstl · 2020-04-08T09:54:14Z

In answer to

What does state_vector[1] = 10 do?

It currently does "RuntimeError: Getitem not returning array"

This is fixed by also overriding __setitem__ in a similar way to __getitem__ and the PR is now updated with this code

sdhiscocks · 2020-04-08T12:06:14Z

What issue are you having?

This is very topical as (even after being aware of the issue yesterday) I just ran into it: I am testing some code I wrote yesterday afternoon and ran into a bug which I think makes a good example:

The (minimal working example) code in question is
velocity = StateVector([0, 0, 1])
_, bearing, elevation = cart2sphere(*velocity)
orientation = StateVector([0, bearing, elevation])
assert np.allclose(orientation, StateVector([0, 0, np.pi/2]))

It this confusion because we allow StateVectors to be defined using an array? This was added in #104. I'm wondering if we should reverse this?

erogers-dstl · 2020-04-08T15:28:09Z

The issue would remain if you passed in a list: either way I (personally) expect StateVector to be iterable, and basically it isn't. At least, not in a way you would expect.

erogers-dstl · 2020-04-08T15:30:24Z

Maybe it's a PEBCAK, and I shouldn't expect them to be iterable

DaveKirkland · 2020-04-08T18:13:31Z

What issue are you having?

This is very topical as (even after being aware of the issue yesterday) I just ran into it: I am testing some code I wrote yesterday afternoon and ran into a bug which I think makes a good example:
The (minimal working example) code in question is
velocity = StateVector([0, 0, 1])
_, bearing, elevation = cart2sphere(*velocity)
orientation = StateVector([0, bearing, elevation])
assert np.allclose(orientation, StateVector([0, 0, np.pi/2]))
It this confusion because we allow StateVectors to be defined using an array? This was added in #104. I'm wondering if we should reverse this?

Actually, you're passing in a list. I added this as a convenience function when creating StateVectors. Before you often saw.
vec = StateVector(np.array([[1., 2., 3.]]).T) or

vec = StateVector([[1], [2], [3]])
Either way it was converting a list into a StateVector

I believe the documentation tries to clearly state that this is for convenience :) If not, I should rewrite it :) I hadn't heard of PEBCAK before :)

erogers-dstl · 2020-04-09T13:31:00Z

@DaveKirkland I notice you didn't say what you think of the idea... I would love to get more viewpoints from those with more Stone Soup experience!

erogers-dstl · 2020-04-09T13:36:18Z

After all the discussion above, I think I now have a better idea of the problem(s) I was trying to fix in the first place.

There are two:

StateVector is not (usefully) iterable. I would expect it to be, and it can be very useful (iteration is powerful python tool for a reason).
The indexing is a bit awkward: statevector[i, 0]. "Solving" this is purely for convenience.

Both these can be worked around easily, if the user is aware. The former causes problems that are harder to debug that the latter.

I do think that maybe Steve is right, and "fixing" these might cause more headaches later on than it is worth, but I'd like to hear more voices.

jmbarr · 2020-04-17T09:50:44Z

Apologies for arriving late, but I've just encountered this issue in practice.

My opinion: indexing StateVector should return an element of the type in the state vector. Just returning a StateVector type feels like a bit of a cop out. What if you want to change behaviour according to type?

State vectors can be composed of different types; they are Stone Soup's bread and butter and it's worth spending a bit of time getting them right. I wouldn't expect a brand new user to have to do mystatevector[i, 0] to get the appropriate type.

sdhiscocks · 2020-04-17T10:31:23Z

History just to add some context:

Originally we used pure numpy arrays as state vectors.
People had habit of defining them as np.ndarray([1, 2, 3]) which is an array not a vector, which caused errors and odd behaviour (especially in matrix multiplication) which weren't obvious were they came from.
We added the StateVector type, to enforce defining them as vectors (e.g. np.ndarray([[1], [2], [3]])
We added a convenience function to StateVector on initialisation so that an array would be converted into a vector. So StateVector([1, 2, 3]) would convert to StateVector([[1], [2], [3]]).

Proposal here is to change from equivalence:
state_vector[n] -> state_vector[n, :]
to:
state_vector[n] -> state_vector[n, 0]

And iteration from:
for val in state_vector[:, 0]
to:
for val in state_vector

My main concern as raised is that this will cause unexpected behaviour, so what third party code (e.g. numpy, scipy examples of heavily used libraries) sees as a vector, now has array like behaviour for iterating and slicing rows (by integer only).

I believe the way numpy handles arrays vs. vectors is different to MatLab, so not sure if this is why behaviour is expected to be more like the proposed?

sdhiscocks · 2020-04-27T15:14:11Z

Having recently dipped my toe into the depths of NumPy and SciPy in #204 and #205, and gotten a better understanding of sub-classing, custom types, etc. I think in reality, we'll likely come up against issues with 3rd party code at some point regardless. With introduction of Angle types, I think we've already decided we need custom behaviour, so therefore I'm also supportive of this proposed change to indexing.

sdhiscocks · 2020-07-16T18:16:06Z

Closed with #187

erogers-dstl added enhancement question labels Apr 6, 2020

erogers-dstl self-assigned this Apr 6, 2020

erogers-dstl mentioned this issue Apr 6, 2020

Change StateVector indexing #187

Merged

jmbarr mentioned this issue Jul 16, 2020

Unscented transform doesn't know about relative offsets #255

Closed

sdhiscocks closed this as completed Jul 16, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

StateVector indexing leads to unnecessarily complicated code #186

StateVector indexing leads to unnecessarily complicated code #186

erogers-dstl commented Apr 6, 2020

sdhiscocks commented Apr 6, 2020

erogers-dstl commented Apr 6, 2020 •

edited

Loading

erogers-dstl commented Apr 6, 2020

sdhiscocks commented Apr 7, 2020

erogers-dstl commented Apr 7, 2020 •

edited

Loading

sdhiscocks commented Apr 7, 2020

erogers-dstl commented Apr 7, 2020 •

edited

Loading

erogers-dstl commented Apr 8, 2020

erogers-dstl commented Apr 8, 2020

sdhiscocks commented Apr 8, 2020

erogers-dstl commented Apr 8, 2020 •

edited

Loading

erogers-dstl commented Apr 8, 2020 •

edited

Loading

DaveKirkland commented Apr 8, 2020

erogers-dstl commented Apr 9, 2020

erogers-dstl commented Apr 9, 2020

jmbarr commented Apr 17, 2020

sdhiscocks commented Apr 17, 2020

sdhiscocks commented Apr 27, 2020

sdhiscocks commented Jul 16, 2020

StateVector indexing leads to unnecessarily complicated code #186

StateVector indexing leads to unnecessarily complicated code #186

Comments

erogers-dstl commented Apr 6, 2020

sdhiscocks commented Apr 6, 2020

erogers-dstl commented Apr 6, 2020 • edited Loading

erogers-dstl commented Apr 6, 2020

sdhiscocks commented Apr 7, 2020

erogers-dstl commented Apr 7, 2020 • edited Loading

sdhiscocks commented Apr 7, 2020

erogers-dstl commented Apr 7, 2020 • edited Loading

erogers-dstl commented Apr 8, 2020

erogers-dstl commented Apr 8, 2020

sdhiscocks commented Apr 8, 2020

erogers-dstl commented Apr 8, 2020 • edited Loading

erogers-dstl commented Apr 8, 2020 • edited Loading

DaveKirkland commented Apr 8, 2020

erogers-dstl commented Apr 9, 2020

erogers-dstl commented Apr 9, 2020

jmbarr commented Apr 17, 2020

sdhiscocks commented Apr 17, 2020

sdhiscocks commented Apr 27, 2020

sdhiscocks commented Jul 16, 2020

erogers-dstl commented Apr 6, 2020 •

edited

Loading

erogers-dstl commented Apr 7, 2020 •

edited

Loading

erogers-dstl commented Apr 7, 2020 •

edited

Loading

erogers-dstl commented Apr 8, 2020 •

edited

Loading

erogers-dstl commented Apr 8, 2020 •

edited

Loading