Parser errors/warnings for new syntax are confusing #1940

jgabry · 2016-06-28T15:10:48Z

Summary:

When playing around with the new syntax I found that the parser errors and warnings were not very informative. Well, they were informative enough for me because I already know the correct syntax and was intentionally trying to trigger the error messages, but I imagine they would pretty opaque to users. Example in Reproducible Steps, below.

Description:

See below in Reproducible Steps

fix error message for missing vertical bar

The duplication problem is a duplicate isseu with #1946, where there are more cases to fix.

Reproducible Steps:

First put the following in logistic.stan

data {
  int<lower=0> N;
  int<lower=0,upper=1> y[N];
}
parameters {
  real theta;
}
model {
  theta ~ normal(0, 1);
  y ~ bernoulli_logit(theta);
}
generated quantities {
  vector[N] log_lik;
  for (n in 1:N)
    log_lik[n] = bernoulli_logit_log(y[n], theta);
}

then run

library("rstan")
stanc(file = "logistic.stan")

This gives the warning:

DIAGNOSTIC(S) FROM PARSER:
Warning: Deprecated function 'bernoulli_logit_log'; please replace suffix '_log' with '_lpdf' for density functions or '_lpmf' for mass functions
Warning: Deprecated function 'bernoulli_logit_log'; please replace suffix '_log' with '_lpdf' for density functions or '_lpmf' for mass functions

which is good (not one of the ones I found confusing), but why does it appear twice if
bernoulli_logit_log is only used once in the Stan program? Is it because the sampling
statement y ~ bernoulli_logit(theta) is also used? Either way it's kind of weird.

The stuff I found more confusing comes next, when trying to follow the suggestion in the warning. First I modify the bernoulli_logit_log line in generated quantities:

log_lik[n] = bernoulli_logit_lpdf(y[n], theta); \\ I know this is incorrect

and the error is

SYNTAX ERROR, MESSAGE(S) FROM PARSER:
ERROR at line 15

 13:      vector[N] log_lik;
 14:      for (n in 1:N)
 15:        log_lik[n] = bernoulli_logit_lpdf(y[n], theta);
                                             ^
 16:    }

PARSER EXPECTED: <probability function arguments>
Error in stanc(model_code = paste(program, collapse = "\n"), model_name = model_cppname,  : 
  failed to parse Stan model 'logistic' due to the above error.

which just says that it expected <probability function arguments>. I assume that's trying to tell me that I forgot the vertical bar but that's not what the message says. Would it be possible to mention the vertical bar in the message? In addition, I used a function that doesn't exist (bernoulli_logit_lpdf even though it's a mass function) but didn't get any error about that. So it seems to be giving me an error about using the wrong arguments to a function that doesn't exist.

So next I use the vertical bar:

log_lik[n] = bernoulli_logit_lpdf(y[n] | theta);

and that gets rid of the mysterious <probability function arguments> and now I get the error about bernoulli_logit_lpdf not existing:

SYNTAX ERROR, MESSAGE(S) FROM PARSER:

No matches for: 

  bernoulli_logit_lpdf(int, real)

Function bernoulli_logit_lpdf not found.

ERROR at line 15

 13:      vector[N] log_lik;
 14:      for (n in 1:N)
 15:        log_lik[n] = bernoulli_logit_lpdf(y[n] | theta);
                                                           ^
 16:    }

Error in stanc(model_code = paste(program, collapse = "\n"), model_name = model_cppname,  : 
  failed to parse Stan model 'logistic' due to the above error.

except it doesn't tell me that I just need to use lpmf instead of lpdf. Would it be possible to detect that that someone is using a function that just has an incorrect suffix?

Current Output:

The current output is given above in Reproducible Steps

Expected Output:

I think it would be helpful if:

Errors about functions not existing were thrown before errors about incorrect arguments
Errors about missing vertical bars said something about missing vertical bars instead of
just <probability function arguments>
Only one warning is issued if deprecated function is only once in a Stan program
Using lpdf instead of lpmf or vice-versa resulted in a more informative message
indicating that the correct function just has a different suffix

Are there reasons I'm overlook why any of these would be difficult?

Additional Information:

Current Version:

v2.10.0

The text was updated successfully, but these errors were encountered:

jgabry · 2016-06-28T20:42:00Z

Here are some possible alternatives for a few of the error/warning messages mentioned above:

Current:

Function bernoulli_logit_lpdf not found.

Suggestions:

Function bernoulli_logit_lpdf not found. Did you mean bernoulli_logit_lpmf?
Function bernoulli_logit_lpdf not found. For mass functions use _lpmf instead of _lpdf.

Current: - `PARSER EXPECTED: `

I saw this in the context of a missing vertical bar, but I'm not sure if that's the only time this error is thrown. If it only pertains to the vertical bar then maybe replace with something like one of these

Suggestions:

PARSER EXPECTED: vertical bar instead of comma after first argument of bernoulli_logit_lpmf
PARSER EXPECTED: '|' instead of ',' after first argument of bernoulli_logit_lpmf

jrnold · 2016-07-03T22:18:32Z

Regarding PARSER EXPECTED: <probability function arguments>, between that error message and the manual, it's not clear to me when I'm supposed to use |.

bob-carpenter · 2016-07-04T18:54:42Z

After the first argument whenever there's an lpdf, lpmf, lcdf, or lccdf.

I could relax the syntax to make the '|' optionally a ',', but I've been
reluctant to have two ways of doing the exact same thing.

On Jul 3, 2016, at 3:18 PM, Jeffrey Arnold notifications@github.com wrote:

Regarding PARSER EXPECTED: , between that error message and the manual, it's not clear to me when I'm supposed to use |.

—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or mute the thread.

betanalpha · 2016-07-04T19:11:10Z

But can the parser identify when a | is missing and suggest it instead
of just giving the “probability function arguments” warning?

On Jul 4, 2016, at 8:54 PM, Bob Carpenter notifications@github.com wrote:

After the first argument whenever there's an lpdf, lpmf, lcdf, or lccdf.

I could relax the syntax to make the '|' optionally a ',', but I've been
reluctant to have two ways of doing the exact same thing.

On Jul 3, 2016, at 3:18 PM, Jeffrey Arnold notifications@github.com wrote:

Regarding PARSER EXPECTED: , between that error message and the manual, it's not clear to me when I'm supposed to use |.

—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or mute the thread.

—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or mute the thread.

bob-carpenter · 2016-07-04T20:38:36Z

Yes, that's what I plan to do. It's already recognizing
it, just a poor choice of error message---I should've provided
an example, as in variate | parameters. I'm still not exactly
sure how to say that. Should I just say "expected |" and be done
with it?

On Jul 4, 2016, at 12:11 PM, Michael Betancourt notifications@github.com wrote:

But can the parser identify when a | is missing and suggest it instead
of just giving the “probability function arguments” warning?

On Jul 4, 2016, at 8:54 PM, Bob Carpenter notifications@github.com wrote:

After the first argument whenever there's an lpdf, lpmf, lcdf, or lccdf.

I could relax the syntax to make the '|' optionally a ',', but I've been
reluctant to have two ways of doing the exact same thing.

On Jul 3, 2016, at 3:18 PM, Jeffrey Arnold notifications@github.com wrote:

Regarding PARSER EXPECTED: , between that error message and the manual, it's not clear to me when I'm supposed to use |.

—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or mute the thread.

—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or mute the thread.

—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub, or mute the thread.

betanalpha · 2016-07-04T21:21:05Z

How about “Expected a | separating the random variables from the parameters in ”?

On Jul 4, 2016, at 10:38 PM, Bob Carpenter notifications@github.com wrote:

Yes, that's what I plan to do. It's already recognizing
it, just a poor choice of error message---I should've provided
an example, as in variate | parameters. I'm still not exactly
sure how to say that. Should I just say "expected |" and be done
with it?

On Jul 4, 2016, at 12:11 PM, Michael Betancourt notifications@github.com wrote:

But can the parser identify when a | is missing and suggest it instead
of just giving the “probability function arguments” warning?

On Jul 4, 2016, at 8:54 PM, Bob Carpenter notifications@github.com wrote:

After the first argument whenever there's an lpdf, lpmf, lcdf, or lccdf.

I could relax the syntax to make the '|' optionally a ',', but I've been
reluctant to have two ways of doing the exact same thing.

On Jul 3, 2016, at 3:18 PM, Jeffrey Arnold notifications@github.com wrote:

Regarding PARSER EXPECTED: , between that error message and the manual, it's not clear to me when I'm supposed to use |.

—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or mute the thread.

—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or mute the thread.

—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub, or mute the thread.

—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub, or mute the thread.

Samreay · 2016-08-22T04:55:52Z

I'll +1 this issue as well if possible.

I was going through the deprecation appendix in the latest stan doco and updating all _log functions to _lpdf calls, and had no idea that the | syntax was even used, especially because the API documentation explicitly uses a comma and not a separator (eg chapter 44.3 for skew_normal - the first distribution I went to update and got stuck on, not a special example).

bob-carpenter · 2016-08-22T12:19:17Z

Agreed. Those will both be fixed for the next release. The issue's flagged here for the parser warning and in the manual issue for the manual.

bob-carpenter · 2016-08-22T16:07:02Z

@betanalpha That would be confusing because any of the arguments can be random variables and the first one isn't required to be a random variable. It's much more work to call out the specific function (I'd have to write a new rule for the case where someone forgets the vertical bar, then create a custom message for it, then have it fail after it's matched and reported an error.) What I went with produces this:

SYNTAX ERROR, MESSAGE(S) FROM PARSER:

ERROR at line 5

  3:    }
  4:    model {
  5:      target += normal_lpdf(y, 0, 1);
                               ^
  6:    }

PARSER EXPECTED: <probability function arguments with first
 two arguments separated by a vertical bar;
 e.g., foo_lpdf(y | alpha, beta).
 This syntax required for functions with suffixes
 _lpdf, _lpmf, _lcdf, _lccdf>

betanalpha · 2016-08-22T16:50:38Z

To be technical, variables on the right hand side of the bar aren’t
actually random anymore. When you condition you condition on
an explicit value. In other words, the conditional distribution is not
defining a distribution for those variables, hence they take on a
different interpretation.

On Aug 22, 2016, at 9:07 AM, Bob Carpenter notifications@github.com wrote:

@betanalpha That would be confusing because any of the arguments can be random variables and the first one isn't required to be a random variable. It's much more work to call out the specific function (I'd have to write a new rule for the case where someone forgets the vertical bar, then create a custom message for it, then have it fail after it's matched and reported an error.) What I went with produces this:

SYNTAX ERROR, MESSAGE(S) FROM PARSER:

ERROR at line 5

3: }
4: model {
5: target += normal_lpdf(y, 0, 1);
^
6: }

PARSER EXPECTED: <probability function arguments with first
two arguments separated by a vertical bar;
e.g., foo_lpdf(y | alpha, beta).
This syntax required for functions with suffixes
_lpdf, _lpmf, _lcdf, _lccdf>
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.

bob-carpenter · 2016-08-22T18:03:15Z

Understood, but it's the "anymore" that's problematic. They are
random variables from the Stan program's perspective. And if you
talk to Andrew (or read BDA), then everything's a random variable
in Bayesian stats, even the predictors in a regression that come in
as data and get no distribution. That's why he refuses to let us use
the semicolon notation (the frequentists use that precisely to distinguish
the random variables from other variables). Going against BDA, even with
technical correctness on our side, seems like a mug's game --- it'll just
confuse users and annoy Andrew without much upside.

betanalpha · 2016-08-22T18:50:00Z

That’s not quite the argument.

The semicolon is purely frequentist and is meant to avoid
interpreting the likelihood as a conditional probability distribution,
and I agree that this has no place in Bayesian inference (in some
sense the difference between | and ; distills the fundamental
differences between frequentist and Bayesian inference).

Instead the difference is in the definition and use of a random
variable. When using a conditional probability distribution, any
variables to the left of the | are having their distribution defined
by the conditional distribution. Any variables to the right of the
|, however, are just be queried by value independent of their
distribution. Sequential generative modeling with conditional
probability distributions is just a way to isolate how a random
variable is defined and how it is used to influence other random
variables.

In other words, the interpretation of what is a random variable
and what isn’t is being locally scoped just to the definition of
that conditional probability distribution. I understand how this
would be confusing to users not familiar with probability, and
even to those who are as the error messages would be
referring to this very local scope and not the global scope of
the program where everything is random variable. It was just
a suggestion. Ultimately anything we choose is either going to
be ambiguous at some level, inconsistent with the actual math,uses
or confusing to most users.

On Aug 22, 2016, at 11:03 AM, Bob Carpenter notifications@github.com wrote:

Understood, but it's the "anymore" that's problematic. They are
random variables from the Stan program's perspective. And if you
talk to Andrew (or read BDA), then everything's a random variable
in Bayesian stats, even the predictors in a regression that come in
as data and get no distribution. That's why he refuses to let us use
the semicolon notation (the frequentists use that precisely to distinguish
the random variables from other variables). Going against BDA, even with
technical correctness on our side, seems like a mug's game --- it'll just
confuse users and annoy Andrew without much upside.

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.

update error msg for , instead of | in prob funs, fixes #1940

bob-carpenter added feature language labels Jul 4, 2016

bob-carpenter added this to the v2.10.0++ milestone Jul 4, 2016

bob-carpenter self-assigned this Jul 4, 2016

syclik modified the milestones: v2.11.0, v2.11.0++ Jul 27, 2016

bob-carpenter mentioned this issue Aug 19, 2016

nile.stan file missing stan-dev/example-models#73

Closed

jgabry mentioned this issue Aug 23, 2016

update error msg for , instead of | in prob funs, fixes #1940 #2026

Merged

3 tasks

bob-carpenter closed this as completed in 86547e6 Aug 24, 2016

bob-carpenter pushed a commit that referenced this issue Aug 24, 2016

Merge pull request #2026 from stan-dev/feature/1940-prob-syntax-warn

00dbb53

update error msg for , instead of | in prob funs, fixes #1940

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Parser errors/warnings for new syntax are confusing #1940

Parser errors/warnings for new syntax are confusing #1940

jgabry commented Jun 28, 2016 •

edited by bob-carpenter

Loading

jgabry commented Jun 28, 2016

jrnold commented Jul 3, 2016

bob-carpenter commented Jul 4, 2016

betanalpha commented Jul 4, 2016

bob-carpenter commented Jul 4, 2016

betanalpha commented Jul 4, 2016

Samreay commented Aug 22, 2016

bob-carpenter commented Aug 22, 2016 via email

bob-carpenter commented Aug 22, 2016

betanalpha commented Aug 22, 2016

bob-carpenter commented Aug 22, 2016

betanalpha commented Aug 22, 2016

Parser errors/warnings for new syntax are confusing #1940

Parser errors/warnings for new syntax are confusing #1940

Comments

jgabry commented Jun 28, 2016 • edited by bob-carpenter Loading

Summary:

Description:

Reproducible Steps:

Current Output:

Expected Output:

Additional Information:

Current Version:

jgabry commented Jun 28, 2016

jrnold commented Jul 3, 2016

bob-carpenter commented Jul 4, 2016

betanalpha commented Jul 4, 2016

bob-carpenter commented Jul 4, 2016

betanalpha commented Jul 4, 2016

Samreay commented Aug 22, 2016

bob-carpenter commented Aug 22, 2016 via email

bob-carpenter commented Aug 22, 2016

betanalpha commented Aug 22, 2016

bob-carpenter commented Aug 22, 2016

betanalpha commented Aug 22, 2016

jgabry commented Jun 28, 2016 •

edited by bob-carpenter

Loading