-
-
Notifications
You must be signed in to change notification settings - Fork 377
FEATURE REQUEST: Can we have native support for ragged arrays? #3326
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
There is an older design document which proposes this feature at stan-dev/design-docs#7 You are correct that there are some implementation challenges, but hearing that people would really use it is always motivating! |
There are two semantic issues with the current approach. One, we have to translate data types. Two, we have to apply our own constraints. To mitigate both issues, we really should have our built-in constraints implemented to map an array, a position, and a size/constraint specification into the relevant data structure---building in that deserialization would help. I'm thinking something like: tuple(matrix, real) read_cov_matrix(array[] real x, int start) that returns the constrained matrix and log Jacobian. This is essentially how our built-in constraints work. Alternatively, we could define the deserializer as matrix read_cov_matrix_jac(array[] real x, int start) and increment the log density for the Jacobian within the function. Our built-ins support this pattern, too. I'm not sure if we actually have the Native ragged arrays would be much better, but they're hard to specify with Stan's size-based design. Please feel free to comment on the design doc. Along with closures, it's one of the two big features we're looking at implementing. Tuples was the last one. |
It's called But it hasn't been started. I was originally imagining it as wrapping the stan-dev/math versions of the functions, while your comment sounds more like wrapping the stan-dev/stan deserializer class versions, so there would have to be some call there (the main difference is for e.g. simplex_jacobian, wrapping the stan-math implementation would require the input already be a vector, not an array[] real) |
Thanks, @WardBrian. Using a vector would also work. I just realized that what I proposed above isn't enough for a stateless implementation. We really need something like this: tuple(matrix, real, int) read_cov_matrix(vector x, int pos, int N) where If we want just the plain old transforms, where we consume a whole vector, that'd be tuple(matrix, real) to_cov_matrix(vector x); This would require the user to pass a vector that has a size equal to N choose 2 + N for some N, and the return is just the covariance matrix and Jacobian adjustment. We can also have a slightly more efficient version without the Jacobian. |
I think it would be great if there is a native support for ragged arrays in the datatypes. The need for this appears in practice, when we have groups of measurements, but each group has different size. Such problems are often suitable for hierarchical modelling for which STAN is often a good tool of choice.
The user guide suggests two approaches how to address this in the current state.
It's quite obvious that 1) is a possibly huge waste of memory in many cases.
Technically, 2) works but it's still not great. The main issue I have with it is that it's at the cost of loss of semantics: It makes the code harder to write, read and maintain. Performance-wise, 2) still (according to the guide) makes a copy of parts of the array within
model
(using a function likeblock
orsegment
), which is a bit of a waste, it's not terrible, but certainly not optimal if we could instead reference the data directly without copying.Description:
My idea is to allow type declarations for array of arrays of unknown length (especially in the
data
section) such as:Then I expect to be able to get a vector using
vector[] v = array_of_varying_length_vectors[k]
and obtaining it's length usingsize(v)
or a similar function.Not sure how difficult is this to implement, I'm just a measly user, seems probably like a bigger change.
The text was updated successfully, but these errors were encountered: