Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Indexed LValue bug #2795

Open
VMatthijs opened this issue Aug 7, 2019 · 11 comments

Comments

@VMatthijs
Copy link
Member

commented Aug 7, 2019

Summary:

Currently, Stan treats LValues (left-hand-side of assignments) as having a flat indexing structure consisting of a single list of indices. This seems in contradiction with the treatment of RValues (right-hand-side of assignments).

Description:

For example: a[:][2] and a[:, 2] are equivalent as an LValue but not as an RValue.

Reproducible Steps:

Run

model {
    int a[2, 2] = {{1, 2}, {3, 4}};
    a[:][2] = {17, 18};
    print(a);
}

and

model {
    int a[2, 2] = {{1, 2}, {3, 4}};
    a[:, 2] = {17, 18};
    print(a);
}

They give the same output.

Then run

model {
    int a[2, 2] = {{1, 2}, {3, 4}};
    print(a[:][2]);
}

and

model {
    int a[2, 2] = {{1, 2}, {3, 4}};
    print(a[:, 2]);
}

They give different output.

Current Output:

Inconsistent behaviour between indexing in LValues and RValues.

Expected Output:

Consistency. I think the behaviour of RValues is the correct one.

Current Version:

v2.20.0

@VMatthijs

This comment has been minimized.

Copy link
Member Author

commented Aug 8, 2019

@mitzimorris , do you know if this would be easy to fix in the code generation? Or would it be a lot of work?

The indexing system is just pretty complicated and it is also not entirely sufficiently documented in the manual.

@mitzimorris

This comment has been minimized.

Copy link
Member

commented Aug 8, 2019

@VMatthijs - I will look into it and let you know - can't get to this one right away - tomorrow?

@VMatthijs

This comment has been minimized.

Copy link
Member Author

commented Aug 8, 2019

Thanks! No rush at all. It's not something that is blocking me. Just felt like it might be important to let someone know who has more knowledge of the codegen than I do. :-)

@bob-carpenter

This comment has been minimized.

Copy link
Contributor

commented Aug 8, 2019

I ran this:

parameters {
  real<lower = 0, upper = 1> theta;
}
model {
    int a[2, 2] = {{1, 2},
                   {3, 4}};
    print("a[2] = ", a[2]);
    print("a[ : ][2] = ", a[ : ][2]);
    print("a[ : , 2] = ", a[ : , 2]);
}

which when run prints

a[2] = [3,4]
a[ : ][2] = [3,4]
a[ : , 2] = [2,4]

As @VMatthijs said in the issue, this is the intended behavior for rvalues.

To write out the semantics more clearly, : used as an index is shorthand for

a[:] = a[1:size(a)]

where size here might be size for arrays, rows for matrices or vectors, and cols for row vectors. Then the value of a range is just the sequence

lb:ub = {lb, lb + 1, ..., ub - 1, ub }

and in general sequence indexing is defined by

a[idxs][i] = a[idxs[i]]

That generalizes to multiple indexes,

a[idxs1, ..., idxsN][i1, ..., iN] = a[idxs1[i1], ..., idxsN[iN]]

We really need to write out a proper semantics for the language!

@seantalts

This comment has been minimized.

Copy link
Member

commented Aug 8, 2019

We can also leave this unfixed in the old parser, right?

@bob-carpenter

This comment has been minimized.

Copy link
Contributor

commented Aug 8, 2019

Right---I labeled it won't fix till new compiler.

@seantalts

This comment has been minimized.

Copy link
Member

commented Aug 8, 2019

Makes sense - thanks.

Also, I thought I posted the following on this thread but can't find it - sorry if I'm repeating myself.

I would propose that we move to disallow nested indexing in the lhs of assignment rather than fixing it to do the proper thing. My reasoning is that the proper semantics here would be meaningless - if we move towards rvalue semantics, a multi-index always generates a copy, so the effect of a[multi_index][2] = rhs; is always a no-op if properly interpreted. (where multi-index is any of 2:3, :, 2:, :2, some_int_array).

Another alternative would be to retain the current behavior and just document it, but I'd rather deal with folks who are mad that their model now generates an error message (assuming we also include the "how to easily fix" in the error message) rather than the folks who learned the wrong semantics for lhs and are confused in a weird way about indexing semantics. Perhaps this is because I was myself confused about index semantics in exactly this way and wasted a few full-time days because of that confusion. 😅

@bob-carpenter

This comment has been minimized.

Copy link
Contributor

commented Aug 8, 2019

@VMatthijs

This comment has been minimized.

Copy link
Member Author

commented Aug 9, 2019

My reasoning is that the proper semantics here would be meaningless - if we move towards rvalue semantics, a multi-index always generates a copy, so the effect of a[multi_index][2] = rhs; is always a no-op if properly interpreted. (where multi-index is any of 2:3, :, 2:, :2, some_int_array).

Could you explain that a bit more? Presumably,
a[multi_index] = rhs; would not generate a copy right and be a no-op, right? How is that implemented? Can we not implement something similar for multiple indices on the LHS? Or is it just not worth the hassle?

@bob-carpenter

This comment has been minimized.

Copy link
Contributor

commented Aug 9, 2019

@seantalts

This comment has been minimized.

Copy link
Member

commented Aug 9, 2019

I guess there are a few ways you could interpret it (one of them being the current interpretation, which seems like a bug) and there is a reasonable non-noop operation you could perform after simplification, but it seems better to just limit users to expressing indexing in a single coherent index notation. I just meant that if you were thinking about nested indexing as you think about nested function calls (which is how they work on the rhs) then the inner a[multi_index] would return a copy that [2] then indexes into, so an assignment into that copy would immediately be thrown away. Anyway, doesn't really matter.

Can the code just be written so that lvalues and rvalues both get evaluated as references?

Not 100% sure if this is what you mean, but I think if we allowed something like Eigen Maps everywhere in our Math library this might be possible. It'd be a backwards-incompatible change, though, since currently we provide copies.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.