Dev 114 cmdstan #801

Merged: 46 commits merged into dev from dev-114-cmdstan on Jan 21, 2023
Collaborator

@edwinnglabs edwinnglabs commented Jan 3, 2023

Description

A working branch proposing a first solution that uses CmdStanPy instead of PyStan.

Fixes #793

Type of change

  • Use CmdStanPy instead of PyStan in the Stan estimator
  • Update all documentation to reflect the new API
  • Possible further enhancement: suppress the CmdStanPy log output
  • Add Python 3.9 to testing and reduce the workflow trigger to just publish
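
The log suppression mentioned in the list above could be sketched roughly as follows. This is not code from the PR: it assumes CmdStanPy emits its messages through the standard-library logger named "cmdstanpy", and the helper name `quiet_cmdstanpy` is hypothetical. It is probably best called after `cmdstanpy` has been imported, since the library may configure that logger itself on first use.

```python
import logging


def quiet_cmdstanpy(level: int = logging.WARNING) -> logging.Logger:
    """Raise the threshold of the "cmdstanpy" logger (hypothetical helper).

    Assumes CmdStanPy logs through logging.getLogger("cmdstanpy"), so that
    messages below `level` (e.g. INFO chain progress notes) are dropped.
    """
    logger = logging.getLogger("cmdstanpy")
    logger.setLevel(level)
    return logger


# Keep warnings and errors, drop INFO-level chatter.
cmdstanpy_logger = quiet_cmdstanpy()
```

Progress bars are a separate matter; they are likely controlled per call (for example through the `show_progress` argument of `CmdStanModel.sample`) rather than through the logger.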

How Has This Been Tested?

The original unit tests should be sufficient, since this change only swaps the underlying API. One small change: loglk is now added to the posterior keys for all estimator types that use Stan.

@edwinnglabs edwinnglabs self-assigned this Jan 7, 2023
@edwinnglabs edwinnglabs added labels Jan 7, 2023: documentation (improvements or additions to documentation), review needed (need someone to review), backend (improvements on backend work such as integration test and deployment automation), enhancement (utils and interface enhancement / more flexibility)
Collaborator

@jeongyoonlee jeongyoonlee left a comment

LGTM.

Comment on lines 170 to 175
 training_metrics = {"loglk": loglk}
-training_metrics.update(
-    {"log_posterior": stan_mcmc_fit.get_logposterior(inc_warmup=True)}
-)
+# log_posterior is not supported in cmdstanpy
+# training_metrics.update(
+#     {"log_posterior": stan_mcmc_fit.get_logposterior(inc_warmup=True)}
+# )
 training_metrics.update({"sampling_temperature": sampling_temperature})

nit: we can create training_metrics at once instead of in two steps.

training_metrics = {
    "loglk": loglk,
    "sampling_temperature": sampling_temperature,
}

Comment on lines 260 to 266
 training_metrics = dict()
 
-# extract `log_prob` in addition to defined model params
-# this is for the BIC calculation
+# loglk is needed for BIC calculation
 training_metrics.update({"loglk": stan_extract["loglk"]})
-# FIXME: this needs to be the full length of all parameters instead of the one we sampled?
-# FIXME: or it should be not include latent varaibles / derive variables?
+# TODO: this needs to be the full length of all parameters instead of the one we sampled?
+# TODO: or it should be not include latent variables / derive variables?
 training_metrics.update({"num_of_params": len(model_param_names)})

nit: same as above. Is there any reason we're doing it this way (i.e., creating a dict in multiple steps)?
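
For illustration, the single-step construction could look like the sketch below, together with one way the two metrics might feed a BIC calculation. The values of `stan_extract`, `model_param_names`, and `num_of_obs` are made-up stand-ins, the `bic` helper is hypothetical, and the formula is the textbook definition BIC = k ln(n) - 2 ln(L); it assumes `loglk` is a single total log-likelihood and is not necessarily how the library computes it.

```python
import math

# Stand-in inputs (assumptions, not values from the PR).
stan_extract = {"loglk": -123.4}        # hypothetical total log-likelihood
model_param_names = ["l", "b", "s"]     # hypothetical parameter names
num_of_obs = 52                         # hypothetical number of observations

# Build the metrics dict in one literal instead of repeated update() calls.
training_metrics = {
    "loglk": stan_extract["loglk"],
    "num_of_params": len(model_param_names),
}


def bic(loglk: float, num_of_params: int, num_of_obs: int) -> float:
    """Textbook BIC: k * ln(n) - 2 * ln(L), with loglk = ln(L)."""
    return num_of_params * math.log(num_of_obs) - 2.0 * loglk


bic_value = bic(
    training_metrics["loglk"],
    training_metrics["num_of_params"],
    num_of_obs,
)
```

Whether `num_of_params` should count only the sampled parameters is the open question raised in the TODO comments above; the sketch simply uses the length of the name list.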

@edwinnglabs edwinnglabs merged commit 30701da into dev Jan 21, 2023
@edwinnglabs edwinnglabs deleted the dev-114-cmdstan branch January 21, 2023 00:42
@edwinnglabs edwinnglabs mentioned this pull request Jan 23, 2023
@juanitorduz
Contributor

amazing! 🚀 !

Successfully merging this pull request may close these issues.

cmdstanpy instead of pystan