C++ getParents and conditionally independent sets #1094
|
Comment for the record: Test failures are of two kinds:
|
|
I was confused by the
About posterior predictive nodes, I guess I'm fine with the default being to exclude (not include) them in the results, but how about via an option
I'm ok with the long name
I imagine that eventually we'd make these model class functions, is that the idea?
|
|
I accidentally pushed some changes before I was finished, so this branch is in an unintended intermediate state.
|
This PR is now substantially changed. The two main components are now:
Note that the arguments closely follow getDependencies but have different defaults. By default, only stochastic nodes, omitting self, are returned.
There is also a
An intermediate step, described when this PR was opened, was
We could provide this feature not as a model method if there is concern about clogging up models. Here is a copy of the draft roxygen method text:
Get a list of conditionally independent sets of nodes in a nimble model.
Conditionally independent sets of nodes are typically groups of latent states whose joint [conditional] probability (density) will not change even if any other non-fixed node is changed. Default fixed nodes are data nodes and parameter nodes ([nodes] with no parent nodes), but this can be controlled.
model: A nimble model object (uncompiled or compiled).
nodes: A vector of node names or their graph IDs that are the starting nodes from which conditionally independent sets of nodes should be found. If omitted, the default will be all latent nodes, defined as stochastic nodes that are not data and have at least one stochastic parent node (possibly with deterministic nodes in between). Note that this will omit latent states that have no hyperparameters. An example is the first latent state in some state-space (time-series) models, which is sometimes declared with a known prior. See type because it relates to the interpretation of nodes.
givenNodes: A vector of node names or their graph IDs that should be considered as fixed and hence can be conditioned on. If omitted, the default will be all data nodes and all parameter nodes, the latter defined as nodes with no stochastic parent nodes (skipping over deterministic parent nodes).
omit: A vector of node names or their graph IDs that should be omitted and should block further graph exploration.
type: Type of graph exploration (upstream through parent nodes, downstream through dependent nodes, or both). For "both", the input nodes are interpreted as latent states, from which both downstream and upstream exploration should be done to find nodes in the same set (nodes that are not conditionally independent from each other). For "fromTop", the input nodes are interpreted as parameters, so graph exploration begins from the top (input) downstream. For "fromBottom", the input nodes are interpreted as data nodes, so graph exploration begins from the bottom (input) upstream.
stochOnly: Logical for whether only stochastic nodes should be returned (default = TRUE). If FALSE, both deterministic and stochastic nodes are returned.
returnType: Either "names" for returned nodes to be node names or "ids" for returned nodes to be graph IDs.
returnScalarComponents: If FALSE (default), multivariate nodes are returned as full names (e.g. "x[1:3]"). If TRUE, they are returned as scalar elements (e.g. "x[1]", "x[2]", "x[3]").
Details: This function returns sets of conditionally independent nodes. Multiple input nodes might be in the same set or different sets, and other nodes (not in nodes) will be included.
There is a non-exported function
Return value: List of nodes that are in conditionally independent sets. Within each set, nodes are returned in topologically sorted order. The sets themselves are returned in topologically sorted order of their first nodes.
There are some tests in test-getDependencies. These are features that invite creative models and invocations, potentially including unwise/unintended invocations, so there is room for more tests.
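For readers who want to see the idea in the draft text concretely, here is a minimal sketch in Python. It is not nimble code and not the algorithm in this PR: the function name, the dict-based graph representation, and the argument names are all hypothetical. It only covers the "both" exploration type with stochastic nodes, and it assumes that when downstream exploration reaches a given child, the child's other non-given parents belong to the same set (they share that child's density term). Given parents simply block upstream exploration.

```python
from collections import deque


def conditionally_independent_sets(parents, topo_order, given, start_nodes=None):
    """Group non-given nodes into conditionally independent sets.

    parents:     dict mapping node -> list of parent nodes (hypothetical format)
    topo_order:  all nodes, listed in topologically sorted order
    given:       nodes treated as fixed; they block graph exploration
    start_nodes: nodes to start from (default: every non-given node)
    """
    rank = {node: i for i, node in enumerate(topo_order)}
    children = {node: [] for node in topo_order}
    for node in topo_order:
        for p in parents.get(node, []):
            children[p].append(node)

    given = set(given)
    if start_nodes is None:
        start_nodes = [n for n in topo_order if n not in given]

    seen = set()
    result = []
    for start in sorted(start_nodes, key=rank.__getitem__):
        if start in seen or start in given:
            continue
        group = []
        queue = deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            group.append(node)
            # Upstream candidates: the node's own parents.
            neighbors = list(parents.get(node, []))
            for child in children[node]:
                if child in given:
                    # Assumption: co-parents of a given child share its
                    # density term, so they join the same set.
                    neighbors.extend(parents.get(child, []))
                else:
                    # Downstream candidates: non-given children.
                    neighbors.append(child)
            for nb in neighbors:
                # Given nodes are never added and never explored further.
                if nb not in given and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        # Within a set, report nodes in topological order.
        result.append(sorted(group, key=rank.__getitem__))
    # Order the sets by the topological position of their first node.
    result.sort(key=lambda g: rank[g[0]])
    return result


# Toy model: mu is a top-level parameter, y1..y3 are data.
parents = {
    "mu": [],
    "x1": ["mu"], "x2": ["mu"], "x3": ["mu"],
    "y1": ["x1"], "y2": ["x2"], "y3": ["x2", "x3"],
}
topo_order = ["mu", "x1", "x2", "x3", "y1", "y2", "y3"]
given = {"mu", "y1", "y2", "y3"}

sets = conditionally_independent_sets(parents, topo_order, given)
print(sets)  # [['x1'], ['x2', 'x3']]
```

In the toy model, x2 and x3 land in the same set because the data node y3 depends on both, while x1 is only linked to the others through the given node mu and so forms its own set.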
|
Comments from the 2021-02-18 meeting.
For
Once Perry does final development work, Chris can merge in and can update the
|
I have updated this PR as follows:
There is no special handling of posterior predictive nodes. An early comment in the code raised this issue, but right now they end up treated like any other nodes.
|
@perrydv Do we definitely like the new name
And thanks for explaining (by way of your A, B, C, D, ... example) the new functionality for grouping deterministic nodes along with given nodes. With your example, that makes a lot of sense, and I agree it sounds like the functionality that we want.
|
minor two cents - I agree with Daniel about 'immediate' seeming more intuitive to me. |
|
I changed
I think this is ready to include. Note that I have not put anything in the user manual on getParents or getConditionallyIndependentSets. We could keep both low profile for now, or we could declare getParents more stable (and document it) but hold off on getConditionallyIndependentSets in case its syntax needs to evolve. Up to you @paciorek. Both do have roxygen method entries.
|
I added a brief mention of |
This PR has two distinct components:
1. getParents. This is behind a nimble option use_C_getParents, which is turned on so that testing will be interesting. Note that the uncompiled version of getParents returned nodes in undefined order. The new version returns nodes in sorted order. That means the new version will not match the old version in order, and it does not seem worth the effort to make it do so, since the orders result from different algorithms that make sense in the different languages. A model interface to getParents might want some options similar to getDependencies.
2a. A new model structure algorithm called getConditionallyIndependentSets. (Ideas for a shorter name, or do we like how explicit it is?) Documentation is drafted in comments above the function in BUGS_model.R. Basic use: getConditionallyIndependentSets(model) will default to finding conditionally independent sets of stochastic latent nodes given top nodes and data nodes.
2b. A test function testConditionallyIndependentSets that checks whether sets of nodes are conditionally independent by holding one constant while simulating all the others and checking that the logProbs of the held-constant set do not change, then doing that for each set. See comments above the function.
Items 2a and 2b are not exported and not used. They are for upcoming development.
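To illustrate the getParents behavior described in item 1 (stochastic parents only, self omitted, results in sorted order), here is a small Python sketch. It is an illustration only, not the C++ code in this PR: the function name, the dict-based graph, and the keyword arguments are hypothetical. It also assumes that upstream traversal passes through deterministic nodes and stops at the first stochastic node on each path, by analogy with how getDependencies treats downstream nodes.

```python
def get_parents(parents, topo_order, stochastic, nodes,
                stoch_only=True, include_self=False):
    """Return upstream parents of `nodes` in topologically sorted order.

    parents:    dict mapping node -> list of parent nodes (hypothetical format)
    topo_order: all nodes, listed in topologically sorted order
    stochastic: set of stochastic nodes; every other node is deterministic
    """
    rank = {node: i for i, node in enumerate(topo_order)}
    start = set(nodes)
    # Self is omitted unless requested (and allowed by stoch_only).
    found = {n for n in start
             if include_self and (n in stochastic or not stoch_only)}
    visited = set(start)
    stack = list(start)
    while stack:
        node = stack.pop()
        for parent in parents.get(node, []):
            if parent in visited:
                continue
            visited.add(parent)
            if parent in stochastic:
                # Assumption: a stochastic parent ends this upstream path.
                found.add(parent)
            else:
                # Deterministic parent: optionally report it, keep going up.
                if not stoch_only:
                    found.add(parent)
                stack.append(parent)
    # Sorting by topological rank makes the output order well defined.
    return sorted(found, key=rank.__getitem__)


# Toy model: d is deterministic (a function of a and b); y depends on d and sigma.
parents = {
    "mu": [], "a": ["mu"], "b": [], "sigma": [],
    "d": ["a", "b"],
    "y": ["d", "sigma"],
}
topo_order = ["mu", "a", "b", "sigma", "d", "y"]
stochastic = {"mu", "a", "b", "sigma", "y"}

print(get_parents(parents, topo_order, stochastic, ["y"]))
# ['a', 'b', 'sigma']
print(get_parents(parents, topo_order, stochastic, ["y"], stoch_only=False))
# ['a', 'b', 'sigma', 'd']
```

Note that mu is not returned for y, because the stochastic node a sits between them. Sorting by topological rank is what removes the dependence on traversal order, which is the difference from the uncompiled version noted above.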