Explain sample_stats naming convention #1063

Merged
merged 9 commits into from Jan 16, 2021

nitishp25
Contributor

@nitishp25 nitishp25 commented Feb 10, 2020

Description

fixes #1053

Checklist

  • Does the PR follow official PR format?
  • Is the code style correct (follows pylint and black guidelines)?
  • Is the change listed in changelog?

@nitishp25
Contributor Author

@OriolAbril I have updated the descriptions and added a few missing variables. Are there any more variables to be added?

@OriolAbril
Member

I was thinking about it and I think we should either rename it (preferably) or not have it in sample stats. I have two main concerns. First, as I understand it, one of the goals of ArviZ is to ease comparison between different inference libraries. We do not use it for now, but our users may, and accessing a value in an InferenceData object should not depend on the inference library. Second, we may have a use for it at some point (right now I can't think of anything other than including the acceptance probability warning in our summary, but who knows), and the naming differences may block or delay the new feature. This second concern is why I prefer renaming over removing it.

@nitishp25
Contributor Author

I see. I will rename it once you confirm @OriolAbril @ahartikainen

@sethaxen
Member

I also support renaming.

@ahartikainen
Contributor

For Stan the outputs for sample_stats are

| original name | current name | unified name | meaning | source |
| --- | --- | --- | --- | --- |
| lp__ | lp | | the log posterior density (up to a constant); see note 1 | src1 |
| accept_stat__ | accept_stat | | the average acceptance probabilities of all possible samples in the proposed tree | src2 |
| stepsize__ | stepsize | | the step size used by NUTS in its Hamiltonian simulation | src2 |
| treedepth__ | treedepth | | the depth of tree used by NUTS, which is the log (base 2) of the number of leapfrog steps taken during the Hamiltonian simulation | src2 |
| n_leapfrog__ | n_leapfrog | | the actual number of leapfrog steps computed | src3 |
| divergent__ | diverging | | the number of leapfrog transitions with diverging error. Because NUTS terminates at the first divergence this will be either 0 or 1 for each iteration | src2 |
| energy__ | energy | | the value of the Hamiltonian (up to an additive constant) at each iteration | src2 |

1. The lp__ value also represents the potential energy in the Hamiltonian system and is rate bounded by
the randomly supplied kinetic energy each iteration, which follows a Chi-square distribution in the number
of parameters.
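
The relation between the original and current Stan names in the table can be sketched as follows. This is only an illustration, assuming the current names are obtained by dropping the trailing double underscore, with `divergent__` as the one special case listed in the table. `clean_stan_name` and `SPECIAL_CASES` are hypothetical helpers, not part of Stan or ArviZ.

```python
# Hypothetical helper, not part of Stan or ArviZ.
# The only irregular rename in the table: divergent__ becomes diverging.
SPECIAL_CASES = {"divergent__": "diverging"}


def clean_stan_name(name):
    """Return the shortened name for a Stan sampler column."""
    if name in SPECIAL_CASES:
        return SPECIAL_CASES[name]
    if name.endswith("__"):
        # Drop the trailing double underscore Stan appends to sampler columns.
        return name[:-2]
    # Anything else (e.g. a model parameter) is left untouched.
    return name


stan_columns = [
    "lp__",
    "accept_stat__",
    "stepsize__",
    "treedepth__",
    "n_leapfrog__",
    "divergent__",
    "energy__",
]
renamed = {column: clean_stan_name(column) for column in stan_columns}
```

Keeping the special cases in a dict makes it easy to add further irregular renames once unified names are agreed on.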

@ahartikainen
Contributor

ahartikainen commented Feb 21, 2020

Then there are also other settings.

See, for example, the CmdStan output:

# stan_version_major = 2
# stan_version_minor = 18
# stan_version_patch = 0
# model = eight_schools_nc_model
# method = sample (Default)
#   sample
#     num_samples = 100
#     num_warmup = 1000 (Default)
#     save_warmup = 0 (Default)
#     thin = 1 (Default)
#     adapt
#       engaged = 1 (Default)
#       gamma = 0.050000000000000003 (Default)
#       delta = 0.80000000000000004 (Default)
#       kappa = 0.75 (Default)
#       t0 = 10 (Default)
#       init_buffer = 75 (Default)
#       term_buffer = 50 (Default)
#       window = 25 (Default)
#     algorithm = hmc (Default)
#       hmc
#         engine = nuts (Default)
#           nuts
#             max_depth = 10 (Default)
#         metric = diag_e (Default)
#         metric_file =  (Default)
#         stepsize = 1 (Default)
#         stepsize_jitter = 0 (Default)
# id = 0 (Default)
# data
#   file = eight_schools.data.R
# init = 2 (Default)
# random
#   seed = 779839997
# output
#   file = eight_schools_output1.csv
#   diagnostic_file =  (Default)
#   refresh = 100 (Default)

These are probably needed as well, and for other libraries too.
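
A rough sketch of how such a comment header could be turned into a dictionary of settings. It assumes the layout shown above: `#`-prefixed lines, `key = value` pairs, an optional ` (Default)` suffix, and section names on their own line with indentation expressing nesting. `parse_cmdstan_header` is a hypothetical function, not an existing CmdStan or ArviZ API.

```python
def parse_cmdstan_header(lines):
    """Collect `key = value` settings from '#'-prefixed header lines.

    Section names (lines without '=') are joined with dots, using the
    indentation after '#' to decide nesting. Hypothetical helper.
    """
    settings = {}
    sections = []  # stack of (indent, section_name)
    for raw in lines:
        if not raw.startswith("#"):
            continue
        body = raw[1:]
        text = body.strip()
        if not text:
            continue
        indent = len(body) - len(body.lstrip())
        # Leave any section that is not shallower than the current line.
        while sections and sections[-1][0] >= indent:
            sections.pop()
        if "=" in text:
            key, _, value = text.partition("=")
            value = value.strip()
            if value.endswith("(Default)"):
                value = value[: -len("(Default)")].strip()
            path = ".".join([name for _, name in sections] + [key.strip()])
            settings[path] = value
        else:
            sections.append((indent, text))
    return settings


example_header = """\
# method = sample (Default)
#   sample
#     num_samples = 100
#     adapt
#       delta = 0.80000000000000004 (Default)
# data
#   file = eight_schools.data.R
# output
#   file = eight_schools_output1.csv
#   diagnostic_file =  (Default)
""".splitlines()

parsed = parse_cmdstan_header(example_header)
```

Dotted keys keep the two `file` entries (under `data` and `output`) from overwriting each other. All values stay as strings; converting them to numbers is left out of this sketch.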

@sethaxen
Member

Then there are also other settings.

See, for example, the CmdStan output:

These would probably go under attributes though, right?

@ahartikainen
Contributor

Yes, I would put them under attributes.

Maybe under sampler_settings or something similar.

@sethaxen
Member

Yes, I would put them under attributes.

Maybe under sampler_settings or something similar.

That sounds good. Perhaps that should be a separate issue.

For Turing.jl, we have the following sample stats for HMC:

  • acceptance_rate: MH stats, i.e. sum of MH accept prob for all leapfrog steps (src)
  • hamiltonian_energy: value of the Hamiltonian energy for the accepted proposal, to within an additive constant
  • hamiltonian_energy_error: difference in the Hamiltonian energy between the initial point and the proposed point
  • is_adapt: boolean, whether the current sample is part of adaptation
  • max_hamiltonian_energy_error: energy in tree with largest absolute difference from initial energy (src)
  • n_steps: total number of leapfrog steps, i.e. phase points in a trajectory (src)
  • numerical_error: termination due to large energy deviation from the starting point (possibly numerical errors) (src)
  • lp/log_density: log probability to within an additive constant
  • tree_depth: the number of tree doublings in the balanced binary tree
  • step_size: current integration step size (src)
  • nom_step_size: the nominal integration step size. The current integration step size may differ from this, for example if the step size is jittered. The nominal step size is usually used in adaptation. (src)

For SMC:

  • le: The log evidence retrieved from the particle
  • weight: The weight of the particle the sample was retrieved from.

Others that are supposedly parameters but I've never seen used:

  • elapsed
  • eval_num
  • lf_eps

@OriolAbril
Member

OriolAbril commented Feb 21, 2020

I'll try to summarize the information gathered, grouping equivalent sampler stats from different libraries. Descriptions and sources should still be checked in the original comments. Feel free to edit (I think members should have permission).

HMC

| Stan | Turing.jl | PyMC3 | Pyro | NumPyro | Unified name |
| --- | --- | --- | --- | --- | --- |
| lp__ | lp/log_density | model_logp | - | potential_energy | lp |
| accept_stat__ | acceptance_rate | mean_tree_accept | acceptance rate | accept_prob | acceptance_rate |
| stepsize__ | step_size | step_size | - | adapt_state.step_size | step_size |
| - | nom_step_size | - | - | | |
| - | - | step_size_bar | - | | |
| treedepth__ | tree_depth | depth | - | | tree_depth |
| n_leapfrog__ | n_steps | tree_size | - | num_steps | n_steps |
| divergent__ | numerical_error | diverging | divergences | diverging | diverging |
| energy__ | hamiltonian_energy | energy | - | energy | energy |
| - | hamiltonian_energy_error | energy_error | - | | energy_error |
| - | max_hamiltonian_energy_error | max_energy_error | - | | max_energy_error |
| - | is_adapt | tune | - | | removed (see #1126) |
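
The table above can be read as a set of per-library rename mappings. The sketch below is one possible way to apply it, assuming sample stats arrive as a plain dict of name to values. `UNIFIED_NAMES` and `unify_sample_stats` are hypothetical names, not an ArviZ API, and only cells that are filled in with a unified name are included (Pyro is left out for brevity).

```python
# Hypothetical mapping built from the table above; "-" and empty cells are skipped.
UNIFIED_NAMES = {
    "stan": {
        "lp__": "lp",
        "accept_stat__": "acceptance_rate",
        "stepsize__": "step_size",
        "treedepth__": "tree_depth",
        "n_leapfrog__": "n_steps",
        "divergent__": "diverging",
        "energy__": "energy",
    },
    "turing": {
        "lp": "lp",
        "log_density": "lp",
        "acceptance_rate": "acceptance_rate",
        "step_size": "step_size",
        "tree_depth": "tree_depth",
        "n_steps": "n_steps",
        "numerical_error": "diverging",
        "hamiltonian_energy": "energy",
        "hamiltonian_energy_error": "energy_error",
        "max_hamiltonian_energy_error": "max_energy_error",
    },
    "pymc3": {
        "model_logp": "lp",
        "mean_tree_accept": "acceptance_rate",
        "step_size": "step_size",
        "depth": "tree_depth",
        "tree_size": "n_steps",
        "diverging": "diverging",
        "energy": "energy",
        "energy_error": "energy_error",
        "max_energy_error": "max_energy_error",
    },
    "numpyro": {
        "potential_energy": "lp",
        "accept_prob": "acceptance_rate",
        "adapt_state.step_size": "step_size",
        "num_steps": "n_steps",
        "diverging": "diverging",
        "energy": "energy",
    },
}


def unify_sample_stats(stats, library):
    """Rename known sampler stats; names without a unified equivalent pass through."""
    mapping = UNIFIED_NAMES[library]
    return {mapping.get(name, name): values for name, values in stats.items()}


# Made-up values, for illustration only.
stan_stats = {"lp__": [-4.2, -3.9], "divergent__": [0, 1], "int_time__": [0.1, 0.1]}
unified = unify_sample_stats(stan_stats, "stan")
```

Passing unknown names through unchanged means library-specific extras are kept rather than silently dropped.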

@fehiepsi could you please check Pyro and NumPyro names? I think I am on the right track but not completely sure.

I don't know enough about TFP to include anything here. Does anybody come to mind that we could tag?

SMC

Any thoughts, @aloctavodia?

| Turing.jl | PyMC3 | Unified name |
| --- | --- | --- |
| le | | |
| weight | | |

MH?

Notes:

I think the only sampler stats currently used (in plotting or in stats) are diverging and energy.
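
As an illustration of why `diverging` and `energy` need stable names, here is a library-independent sketch of two typical diagnostics computed from them. It assumes both stats are nested lists shaped (nchains, ndraws). The function names are hypothetical. The second function follows the usual energy-based BFMI estimate (mean squared change in energy divided by the variance of the energy) and is not claimed to match any library's implementation exactly.

```python
from statistics import mean, pvariance


def count_divergences(diverging):
    """Number of divergent transitions per chain."""
    return [sum(bool(flag) for flag in chain) for chain in diverging]


def energy_bfmi(energy):
    """Per-chain ratio of mean squared energy change to energy variance."""
    result = []
    for chain in energy:
        squared_changes = [(b - a) ** 2 for a, b in zip(chain, chain[1:])]
        result.append(mean(squared_changes) / pvariance(chain))
    return result


# Made-up values, for illustration only.
diverging = [[0, 1, 0, 0], [0, 0, 0, 0]]
energy = [[1.0, 2.0, 1.0, 2.0], [3.0, 5.0, 3.0, 5.0]]

n_divergent = count_divergences(diverging)
bfmi_per_chain = energy_bfmi(energy)
```

With a unified convention, code like this can look up `diverging` and `energy` without knowing which library produced the samples.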

PyMC3 reference: src1, (MH related, not really sure they are relevant: src2, src3, src4)
Pyro reference: src
NumPyro reference: src

@fehiepsi
Member

Thanks @OriolAbril, those are the correct names.

@ahartikainen
Contributor

What does TFP use?

@junpenglao
Contributor

TFP does not have an internal naming convention, as the stats are function outputs (tensors or arrays) and users are free to name them whatever they want. I was mapping them manually, e.g.: https://colab.research.google.com/github/tensorflow/probability/blob/master/tensorflow_probability/examples/jupyter_notebooks/Modeling_with_JointDistribution.ipynb#scrollTo=4qQdOPk90f7t
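
A minimal sketch of what such a manual mapping could look like, assuming the sampler returns an unnamed sequence of outputs in a known order. It uses plain Python lists as stand-ins and makes no TFP calls. `name_sampler_outputs` and the example names are hypothetical.

```python
def name_sampler_outputs(outputs, names):
    """Pair positional sampler outputs with user-chosen names."""
    if len(outputs) != len(names):
        raise ValueError(
            f"got {len(outputs)} outputs but {len(names)} names"
        )
    return dict(zip(names, outputs))


# Stand-ins for the arrays a trace function might return (made-up values).
raw_outputs = ([-3.1, -2.8], [False, True], [0.25, 0.25])
sample_stats = name_sampler_outputs(
    raw_outputs, ["lp", "diverging", "step_size"]
)
```

Because the names are chosen by the user, this is the point where a unified convention could be applied directly.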

@ahartikainen
Contributor

ahartikainen commented Feb 22, 2020

Stan HMC has these (I need to verify):

accept_stat__ stepsize__ int_time__

@aloctavodia
Contributor

Currently PyMC3's SMC does not return any statistics, but I should fix that.

@OriolAbril
Member

I updated my previous comment with the table summary to try to restart the discussion. I tried to use the names that I felt had the most consensus (e.g. more than one library had similar or equal names). Please comment; I actually don't know the reasons behind any of the naming conventions chosen by each library.

The ones I am having trouble with are the step_size related parameters. After warmup (whose sample stats should be stored in warmup groups, see #1126), step_size should have converged to a given value, so we are basically storing repeated values unless the step size is jittered (which seems to be available only in Turing). It feels unnecessary to store step_size in an (nchains, ndraws) array, but given the jitter possibility it may be the simplest way to accommodate all the libraries in a single naming convention (with nom_step_size generally missing).

Regarding step_size_bar, I thought it could be removed from sample_stats: the PyMC3 docs say that after tuning step_size is set to step_size_bar, but that does not seem to be the case. All stored ArviZ InferenceData objects have different values for step_size and step_size_bar (each is constant across draws, but the two differ from each other).
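
The check described here can be sketched in a few lines, assuming `step_size` and `step_size_bar` are available as nested lists shaped (nchains, ndraws). The function names and the example numbers are hypothetical and only illustrate the comparison.

```python
import math


def is_constant(chain, rel_tol=1e-9):
    """True if every draw in the chain equals the first one (within tolerance)."""
    return all(math.isclose(value, chain[0], rel_tol=rel_tol) for value in chain)


def summarize_step_sizes(step_size, step_size_bar, rel_tol=1e-9):
    """Per chain: is each stat constant, and do the two stats agree?"""
    summary = []
    for size_chain, bar_chain in zip(step_size, step_size_bar):
        summary.append(
            {
                "step_size_constant": is_constant(size_chain, rel_tol),
                "step_size_bar_constant": is_constant(bar_chain, rel_tol),
                "equal": all(
                    math.isclose(a, b, rel_tol=rel_tol)
                    for a, b in zip(size_chain, bar_chain)
                ),
            }
        )
    return summary


# Made-up values: constant within each chain, but different between the two stats.
step_size = [[0.50, 0.50, 0.50], [0.45, 0.45, 0.45]]
step_size_bar = [[0.40, 0.40, 0.40], [0.42, 0.42, 0.42]]
report = summarize_step_sizes(step_size, step_size_bar)
```

If a stat is constant for every chain, a single value per chain would carry the same information as the full (nchains, ndraws) array, which is the redundancy discussed above.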

@nitishp25
Contributor Author

By the way, should the renaming be done in this PR itself, or should this one cover only the descriptions of the existing sample_stats?

@nitishp25 nitishp25 force-pushed the sample-stats-schema branch 2 times, most recently from 000776a to af4834f Compare April 27, 2020 15:47
@canyon289
Member

Checking in on old PRs. Any possibility of bringing this over the line or should we close?

@OriolAbril
Member

This one depends on core contributors having some consensus on the names to use. Maybe a lab meeting could help in finishing this?

@canyon289
Member

Sounds good. I'll add it as a topic for the next lab meeting.

Member

@sethaxen sethaxen left a comment

Mostly clarifying questions. Also, your table above is super useful. I got a question just this week about the meaning of these parameters, and it would be great if this table lived somewhere. Perhaps ArviZ's docs are the best place for it. Would you like to include it?

@OriolAbril
Member

OriolAbril commented Sep 30, 2020

We have agreed on the names, see table in previous comment, but we still need to update the PR to match said table and add the definitions. There are some definitions in previous comments when describing Stan and Turing sample stats.

I think now would be a good time to implement the convention so sample stats can be used in https://github.com/arviz-devs/arviz_dashboard independently of the sampling backend

@ahartikainen
Contributor

Explicit names are better. Then let's add a description somewhere of what is what.

E.g. could the attrs field contain something?

@nitishp25
Contributor Author

nitishp25 commented Oct 3, 2020

Okay, so will you be working on this PR now? I don't have much knowledge about the definitions but I can update if you want

Contributor

@ahartikainen ahartikainen left a comment

Do we want to mention sample_stats_prior, which is analogous to sample_stats except that the one-to-one correspondence is with the prior and not with the posterior?

@ahartikainen
Contributor

ahartikainen commented Nov 25, 2020

Oh, did we miss int_time__? See https://mc-stan.org/docs/2_25/cmdstan-guide/mcmc-intro.html

int_time__ - total integration time (static HMC sampler)

Also, lp__ is described there as the total log probability density (up to an additive constant) at each sample.

@OriolAbril
Member

Oh, did we miss the int_time__

Looks like it. int_time looks like a good name, unless someone wants to propose an alternative.

@OriolAbril OriolAbril added this to In progress in Documentation via automation Jan 8, 2021
@OriolAbril OriolAbril moved this from In progress to Review in progress in Documentation Jan 8, 2021
@OriolAbril OriolAbril changed the title [WIP] Explain sample_stats naming convention Explain sample_stats naming convention Jan 13, 2021
@mjhajharia
Contributor

I'll try to summarize the information gathered grouping equivalent sampler stats from different libraries. Descriptions and sources should still be checked in the original comment. Feel free to edit (I think members should have permission).

Maybe I missed something, but I saw perf_counter_diff, process_time_diff and perf_counter_start in trace.sample_stats.

@OriolAbril
Member

These were added after this comment was written. Let's continue this in pymc-devs/pymc-examples#95 instead.

Successfully merging this pull request may close these issues.

Explain naming convention in sample_stats
9 participants