Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Not checking if _lpmf functions are only used within other _lp* functions and generating invalid C++ #335

Closed
wds15 opened this issue Oct 14, 2019 · 9 comments
Labels
bug Something isn't working typechecker

Comments

@wds15
Copy link

wds15 commented Oct 14, 2019

The old compiler previously allowed functions containing "lp" to access target; the new compiler follows the manual and only allows functions ending in _lp, _lpmf, _lpdf, or _log to access them. Right now this is generating bad C++ code that gives a compile error complaining about error: use of undeclared identifier 'lp__' or error: use of undeclared identifier 'lp_accum__'.

Example

This model runs just fine with 2.20, but the new stanc 2.21 compiler seems to create non-compiling C++ code.

fail log, code and data attached.

stanc3-failure-oncobayes.zip

@seantalts
Copy link
Member

This looks like it's failing because Stan 2 assumed that a function ending in _lpmf_comp was also a user-defined distribution, but as far as I know that's a bug (is that right, @bob-carpenter ?). You can see the spec here: https://mc-stan.org/docs/2_18/stan-users-guide/user-defined-probability-functions.html

@wds15
Copy link
Author

wds15 commented Oct 14, 2019

But stanc3 creates non compile-able c++ code. I would expect an error message from the parser at least.

@seantalts
Copy link
Member

Fair enough! This does seem like something the type checker should be catching.

@seantalts seantalts changed the title Failing model Not checking if _lpmf functions are only used within other _lp* functions and generating invalid C++ Oct 14, 2019
@seantalts seantalts added typechecker bug Something isn't working labels Oct 14, 2019
@seantalts seantalts added this to the initial release milestone Oct 14, 2019
@wds15
Copy link
Author

wds15 commented Oct 15, 2019

After fixing the problem with _lpmf I am now getting a C++ file which cannot be compiled:

../OncoBayes2/adapt_delta_tweak/ob2_model-stanc3.hpp:3989:68: note: insert an explicit cast to silence this issue
                                                                   g,
                                                                   ^
                                                                   static_cast<int>( )
../OncoBayes2/adapt_delta_tweak/ob2_model-stanc3.hpp:3990:68: error: non-constant-expression cannot be narrowed from type 'unsigned long' to 'int' in initializer list [-Wc++11-narrowing]
                                                                   (g + num_groups)}), nil_index_list()), "beta"),
                                                                   ^~~~~~~~~~~~~~~~
../OncoBayes2/adapt_delta_tweak/ob2_model-stanc3.hpp:3990:68: note: insert an explicit cast to silence this issue
                                                                   (g + num_groups)}), nil_index_list()), "beta"),
                                                                   ^~~~~~~~~~~~~~~~
                                                                   static_cast<int>( )
../OncoBayes2/adapt_delta_tweak/ob2_model-stanc3.hpp:3993:68: error: non-constant-expression cannot be narrowed from type 'size_t' (aka 'unsigned long') to 'int' in initializer list
      [-Wc++11-narrowing]
                                                                   g,
                                                                   ^
../OncoBayes2/adapt_delta_tweak/ob2_model-stanc3.hpp:3993:68: note: insert an explicit cast to silence this issue
                                                                   g,
                                                                   ^
                                                                   static_cast<int>( )
../OncoBayes2/adapt_delta_tweak/ob2_model-stanc3.hpp:3994:68: error: non-constant-expression cannot be narrowed from type 'unsigned long' to 'int' in initializer list [-Wc++11-narrowing]
                                                                   (g + num_groups)}), nil_index_list()), "eta"),
                                                                   ^~~~~~~~~~~~~~~~
../OncoBayes2/adapt_delta_tweak/ob2_model-stanc3.hpp:3994:68: note: insert an explicit cast to silence this issue
                                                                   (g + num_groups)}), nil_index_list()), "eta"),
                                                                   ^~~~~~~~~~~~~~~~
                                                                   static_cast<int>( )
../OncoBayes2/adapt_delta_tweak/ob2_model-stanc3.hpp:3989:68: error: non-constant-expression cannot be narrowed from type 'size_t' (aka 'unsigned long') to 'int' in initializer list
      [-Wc++11-narrowing]
                                                                   g,
                                                                   ^
../OncoBayes2/adapt_delta_tweak/ob2_model-stanc3.hpp:5206:14: note: in instantiation of function template specialization 'ob2_model_stanc3_model_namespace::ob2_model_stanc3_model::log_prob<false,
      false, stan::math::var>' requested here
      return log_prob<propto__,jacobian__,T_>(vec_params_r, vec_params_i, pstream);
             ^

Modified model is here:

ob2_model-stanc3_stan.txt

@seantalts seantalts added bug Something isn't working and removed bug Something isn't working labels Oct 19, 2019
@seantalts
Copy link
Member

Re: the post-fix model - it looks like that's because you have a function ending in complpmf that tries to use distributions, but it would need to end in _lpmf (note the underscore) for it to have access to those. And for it to end in _lpmf, it would need to return a real, not a vector. So I think that function as-is shouldn't work, though we need to still catch this bug during semantic checking and before invalid C++ is generated.

@seantalts
Copy link
Member

@VMatthijs Do you have a sense for where this should happen in Semantic_check? Basically need to check that only functions that end in _lp(mf|df) have access to other functions that end in _lp(mf|df), I think. (and does that sound right?).

@nhuurre
Copy link
Collaborator

nhuurre commented Dec 6, 2019

The old compiler allowed calling _lpdf functions anywhere. Stan Math library _lpdf functions do not access target.

Stan User Guide section 18.8 says

Only functions ending in _lpmf or _lpdf and with return type real may be used as probability functions in sampling statements.

Only functions ending in _lp may access the log probability accumulator through sampling statements or target += statements. Such functions may only be used in the transformed parameters or model blocks.

Stan Reference Manual section 9.3 says

Functions whose name ends in _lpdf or _lpmf (density and mass functions) may be used as probability functions and may be used in place of parameterized distributions on the right-hand-side of sampling statements. There is no restriction on where such functions may be used.

and later

Functions whose names end in _lp assume access to the log probability accumulator and are only available in the transformed parameter and model blocks. Functions whose names end in _rng assume access to the random number generator and may only be used within the generated quantities block, transformed data block, and within user-defined functions ending in _rng.

Section 9.5 also has

Functions that include sampling statements or log probability increment statements must have a name that ends in _lp. Attempts to use sampling statements or increment log probability statements in other functions leads to a compile-time error.

While it does say that

Functions whose names end in _lpdf and _lpmf (density and mass functions) can be used as probability functions in sampling statements.

there is no claim that sampling statements can be used inside _lpdf functions.

Section 13.4 says that _log functions are deprecated in favour of _lpdf/_lpmf functions. But how would that work if _lpdf functions are more restricted?

So the documentation is univocal: _lpdf functions are just regular pure functions; they don't have access to target.

Also, it's not just Semantic_check. Modifying target is a side effect and the optimizer assumes that only _lp functions can have side effects. (This is a bit wrong though, any function can reject() which is sort of a "side effect".)

@bob-carpenter
Copy link
Contributor

bob-carpenter commented Dec 6, 2019 via email

@seantalts
Copy link
Member

Closed by #397.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working typechecker
Projects
None yet
Development

No branches or pull requests

4 participants