Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

next manual, 2.20++ #36

Closed
mitzimorris opened this issue Mar 22, 2019 · 34 comments
Closed

next manual, 2.20++ #36

mitzimorris opened this issue Mar 22, 2019 · 34 comments
Milestone

Comments

@mitzimorris
Copy link
Member

mitzimorris commented Mar 22, 2019

Summary:

This is the place to record typos and brainos or suggestions for the Stan manuals: User's Guide, Reference Manual, Functions Reference. Please report broken links, incorrect code or math as well as questions and corrections to discussion and modeling advice.

Current Version:

v2.19.0

@mitzimorris mitzimorris added this to the 2.19++ milestone Mar 22, 2019
@avehtari
Copy link
Contributor

avehtari commented May 17, 2019

  • In section Normal-Id Generalised Linear Model (Linear Regression)
    normal_id_glm_lpmf should be normal_id_glm_lpdf
    and 15.2.1 Probability Mass Function should be 15.2.1 Probability Distribution Function

@jaehyunjoo
Copy link

jaehyunjoo commented May 22, 2019

  • Just reporting minor things.

Stan User's Guide:
In section 9.2 Soft K-Means, need to remove a redundant close paren after soft_z[n]
// likelihood
for (n in 1:N)
target += log_sum_exp(soft_z[n]));

Stan Language Functions Reference:
In section 15.1 Normal Distribution, may need a close brace? \pitemTwo{y}
std_normal(\pitemTwo{y) and std_normal_lpdf( y | \pitemTwo{y)

@bob-carpenter
Copy link
Contributor

Thanks, @jaehyunjoo. We'll get those fixed.

@bertcarnell
Copy link

bertcarnell commented Jun 18, 2019

I think that the above section should be changed from
MultiNormalPrecision(y|mu, Omega) = MultiNormal(y|mu, Sigma^-1)
to
MultiNormalPrecision(y|mu, Omega) = MultiNormal(y|mu, Omega^-1)

@mitzimorris
Copy link
Member Author

mitzimorris commented Jun 18, 2019

@bertcarnell - good catch, many thanks!

revising above statement - docs are correct, not fixing

OK - @bertcarnell - absolutely correct - I made same mistake as whoever wrote this.
many thanks!

@enbrown
Copy link

enbrown commented Jun 22, 2019

This line in the Custom Probability Functions chapter of the User Manual doesn't show correctly in the HTML output

\mbox{binormal\_cdf}(z_1, z_2, \rho) = \mbox{Pr}[Z_1 > z_1 \mbox{ and } Z_2 > z_2]

Additionally, to have a more consistent format in the LaTeX PDF output, it's possible that the use of \mbox{binormal\_cdf} would be better replaced by \mathtt{binormal\_cdf} so that the font of this Stan function name is in a monospaced font similar to the monospaced/console font used in the verbatim Stan code sections. There are many places throughout the documentation where using \mathtt in equations when referring to Stan functions would likely be more clear.

@enbrown
Copy link

enbrown commented Jun 22, 2019

These 4 equations don't render correctly in the LaTeX PDF version of the User Guide and all show up on the same line without linebreaks between the 4 equations

$$
e \sim \mathsf{Exponential}(r_e)
\\
l \sim \mathsf{Exponential}(r_l)
\\
s \sim \mathsf{Uniform}(1, T)
\\
D_t \sim \mathsf{Poisson}(t < s \ ? \ e \ : \ l)
$$

This code works in LaTeX but I'm not sure how a LaTeX align environment outside of math mode would be processed into HTML by pandoc:

\begin{align*}
e   &\sim  \mathsf{Exponential}(r_e) \\
l   &\sim  \mathsf{Exponential}(r_l) \\
s   &\sim  \mathsf{Uniform}(1, T) \\
D_t &\sim  \mathsf{Poisson}(t < s \ ? \ e \ : \ l)
\end{align*}

If the align LaTeX environment is able to be used, then there are many places throughout the User Guide that could benefit. For example, a little later in the same chapter, the code

$$
p(e,l,s,D)
= p(e) \, p(l) \, p(s) \, p(D | s, e, l)
\\
=
\mathsf{Exponential}(e|r_e) \ \mathsf{Exponential}(l|r_l) \
\mathsf{Uniform}(s|1, T)
\prod_{t=1}^T \mathsf{Poisson}(D_t | t < s \ ? \ e \ : \ l),
$$

produces an unreadably-long line in the PDF output because its all on one line. Replacing it with

\begin{align*}
p(e,l,s,D)
 &=  p(e) \, p(l) \, p(s) \, p(D | s, e, l) \\
 &=
\mathsf{Exponential}(e|r_e) \ \mathsf{Exponential}(l|r_l) \
\mathsf{Uniform}(s|1, T) \\
 &\qquad \prod_{t=1}^T \mathsf{Poisson}(D_t | t < s \ ? \ e \ : \ l),
\end{align*}

would help immensely with readability.

@enbrown
Copy link

enbrown commented Jun 22, 2019

There are scattered lines that show up in both the HTML and PDF outputs as cryptic references to something like id:foo-stuff.figure. One example is

id:change-point-model.figure

@mitzimorris
Copy link
Member Author

@enbrown - many thanks for all of the above. will investigate and try to address.

@enbrown
Copy link

enbrown commented Jun 22, 2019

Here's another table that's present in the LaTeX PDF output that seems to be missing from the HTML files

\begin{center}
\begin{tabular}{c|ccc|c}
& \multicolumn{3}{|c|}{{\it captures}}
\\
{\it profile} & 1 & 2 & 3 & {\it probability}
\\ \hline
{0} & - & - & - & n/a
\\
{1} & - & - & + & n/a
\\ \hline
{2} & - & + & - & $\chi_2$
\\
{3} & - & + & + & $\phi_2 \, p_3$
\\ \hline
{4} & + & - & - & $\chi_1$
\\
{5} & + & - & + & $\phi_1 \, (1 - p_2) \, \phi_2 \, p_3$
\\ \hline
{6} & + & + & - & $ \phi_1 \, p_2 \, \chi_2$
\\
{7} & + & + & + & $\phi_1 \, p_2 \, \phi_2 \, p_3$
\end{tabular}
\end{center}

@enbrown
Copy link

enbrown commented Jun 22, 2019

Here are some small typos in the latent-discrete.Rmd file:

doct@rs should be doctors in:

doct@s say aboutpatient histories; the same model can be used

In a few places the LaTeX nonbreaking space ~ is used rather than the Markdown \ form. So in one, Equation~(2.7) should be Equation\ (2.7):

in the Stan program using `target~+=` rather than

In the inner loop, the observed capture status `y[i,~t]` for

Equation~(2.7), required for the E-step in their expectation


The two figures in section 7.2 seem to result in broken links on the HTML website (possibly due to them being PDF images):
```{r include=TRUE, fig.align="center", fig.cap=c("Analytical change-point posterior"), echo=FALSE}
knitr::include_graphics("./img/change-point-posterior.pdf")
```

```{r include=TRUE, fig.align="center", fig.cap=c("Sampled change-point posterior"), echo=FALSE}
knitr::include_graphics("./img/s-discrete-posterior.pdf")
```

@bob-carpenter
Copy link
Contributor

@enbrown Thanks so much for all the careful comments on doc.

Rather than \mbox{foo_cdf}, we should be using \textrm{foo\_cdf}. Not a big deal.

it's possible that the use of \mbox{binormal_cdf} would be better replaced by \mathtt{binormal_cdf}

I'm OK with this, but it would be a huge change throughout the doc.

WARNING: For words, we need to use \texttt{binormal\_cdf} rather than the mathtt version. Also, we should be using \textrm rather than \mbox as it scales size properly.

@enbrown
Copy link

enbrown commented Jun 30, 2019

I'm OK with this, but it would be a huge change throughout the doc.

Yup, its massive. I've forked the repository, made a ton of changes, and I hope created a pull request correctly. There were random typos, tables and pictures that were missing from the HTML version, text that was missing from the HTML version (but present in the LaTeX version), and fixes to the formatting of the equations (the vast majority.)

@bob-carpenter
Copy link
Contributor

bob-carpenter commented Jul 1, 2019 via email

@mitzimorris
Copy link
Member Author

this looks totally awesome - happy to review your PR as is - many thanks for doing this!

@enbrown
Copy link

enbrown commented Jul 1, 2019

this looks totally awesome - happy to review your PR as is - many thanks for doing this!

No problem. Procrastinating with typography!

Is there a Stan code highlighting style that is used by other people? Its not too difficult to add syntax highlighting (4 line change to an _output.yml file, annotating all Stan code blocks with {.stan} and including two MIT-licensed style and theme files.)

@mitzimorris
Copy link
Member Author

what kind of style and theme files? latex? css?

@enbrown
Copy link

enbrown commented Jul 2, 2019

what kind of style and theme files? latex? css?

Pandoc-based syntax highlighting:
https://github.com/enbrown/stan-docs/blob/format_stan_blocks/src/stan-users-guide/stan.xml
https://github.com/enbrown/stan-docs/blob/format_stan_blocks/src/stan-users-guide/stan.theme

The _output.yml file just needs to pass options to Pandoc to use these:
https://github.com/enbrown/stan-docs/blob/format_stan_blocks/src/stan-users-guide/_output.yml

Finally, Stan code blocks need to be identified:

``` {.stan}
data {
 ...
}
model {
  y ~ normal(0, 1)
}
```

@bob-carpenter
Copy link
Contributor

I thought RMarkdown compiled Stan models when it saw them in markdown. We don't want to try to be compiling these things as a lot of them are just program fragments.

@enbrown
Copy link

enbrown commented Jul 2, 2019

RMarkdown shows code annotated like this without actually compiling anything, but it can highlight the syntax in them: make the different program blocks one color, control flow (if/then/for) another color, known functions a different color, and distributions bold. Its purely formatting like this:

image

When I annotated all Stan blocks in the user guide (375 I think), I didn't notice it take any longer to make the HTML and PDF outputs, but it could add a few seconds overall to the processing time.

@bob-carpenter
Copy link
Contributor

Processing time's no big deal. It already takes a long time to generate HTML and PDF.

Thanks much for looking into this and pasting in an example.

I've never been crazy about this level of syntax highlighting. I turn on the lightest mode possible in anything I'm doing as I find all the colors and fonts distracting. I'm curious what others think on this, as I'm not the dictator of this thing.

Green and red won't work as color choices due to color blindness. Any idea why "normal" is set in the same color as "lower"? The block names should probably be a more distinct color, too.

I like the lighter color for ocmments, but I'm not crazy about the italic typewriter font. At least the font's fixed width.

@enbrown
Copy link

enbrown commented Jul 4, 2019

These were completely arbitrary choices (that I made based on the defaults which were worse) that can be modified at will. The font color and background color can be set to any RGB color. Then they can be bold or not, italicized or not, and underlined or not. These are the things that can be changed in the Pandoc theme.

It might be possible to delve deeper into the CSS and overwrite these choices (because the Stan codeblocks are actually in HTML <code class="sourceCode stan"> elements with <span class="XX"> subelements that do the styling. But that's likely more trouble than its worth (although you could use that to do user-customizable style selections so someone could have WILD colors and others can be monochromatic.)

@bob-carpenter
Copy link
Contributor

bob-carpenter commented Jul 6, 2019

We shouldn't hold up this PR for color choices in Stan programs. Just please choose something relatively low contrast, with matching darknesses, and don't let users adjust it. We can fix it later, but discussing it ad nauseum now would literally be an instance of a classic bikeshed issue.

Let me outline some guidelines from standard book design practice:

  • contrast shouldn't be too high so that it's easy to read
  • darkness (what typographers call "color") should be matched and relatively dark
  • keep a minimal number of fonts (ideally one)
  • keep line-length down to 60 max

Together, these let you read the marked-up text like regular text much more easily and will make the text in code look more seamlessly like part of the rest of the book. Darkness is what typographers call "color". It has to match for things to be readable. And for them to be projectable. That's really the main consideration here---can people read them in print and on screen and when projected.

In the end, we want to move to a situation where we have code in actual functional program contexts that we can check for syntax, etc., then cut into the doc.

P.S. CSS would be problematic as none of the components have span styling.

@mitzimorris
Copy link
Member Author

I think that the above section should be changed from
MultiNormalPrecision(y|mu, Omega) = MultiNormal(y|mu, Sigma^-1)
to
MultiNormalPrecision(y|mu, Omega) = MultiNormal(y|mu, Omega^-1)

the documentation is correct - this statement is relating the MultiNormalPrecision to the MultiNormal - the precision parameter Omega is related to the variance parameter Sigma.

@mitzimorris
Copy link
Member Author

would like to close this for 2.20 release but would also like to get @enbrown's fixes in - where does this discussion stand?

@bertcarnell
Copy link

Sorry for missing your disagreement with my comment. I don't think that you are correct. Think about the mathematical statement that is being made, If K is an integer, and mu is a real vector, and Omega is a real square matrix, then for y as a real vector is distributed as multivariate normal using the Omega precision matrix, or equivalent, it is distributed as multivariate normal using the variance matrix (which is the Omega inverse). If you introduce the Sigma matrix here, then you haven't defined it. See the multivariate normal Cholesky parameterization as an equivalent.

@bob-carpenter
Copy link
Contributor

See: #71

@bob-carpenter bob-carpenter changed the title next manual, 2.19++ next manual, 2.20++ Jul 29, 2019
@lukas-rokka
Copy link

lukas-rokka commented Sep 1, 2019

  • Typo on page 106 in the Function reference documentation 2.20:
    "lognormal_rng(reals mu, reals beta)" should be "lognormal_rng(reals mu, reals sigma)"

Current Version:

v2.20.0

@mitzimorris
Copy link
Member Author

good catch, many thanks, will fix!

@mcol mcol mentioned this issue Oct 23, 2019
2 tasks
@asperamanka
Copy link

asperamanka commented Oct 29, 2019

  • In Stan function manual 2.20 chapter 15.1:
    sampling statement of std_normal should be y ~ std_normal() instead of y ~ std_normal(y)

@mitzimorris
Copy link
Member Author

good catch - thanks - will fix!

@wesbarnett
Copy link

wesbarnett commented Nov 25, 2019

@mitzimorris
Copy link
Member Author

that's funny!

@mitzimorris
Copy link
Member Author

leaving this issue open until 2.22 release

note that changes suggested by enbrown have been incorporated via PRs #90 and #126

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants