Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Paper: Parameter Estimation Using the Python Package pymcmcstat #466

Open
wants to merge 35 commits into
base: 2019
from

Conversation

Projects
None yet
4 participants
@prmiles
Copy link

commented May 21, 2019

If you are creating this PR in order to submit a draft of your paper,
see http://procbuild.scipy.org/ for logs generated by the build
process.

See the project readme
for more information.

@nudro
Copy link

left a comment

I might shy away from indicating that the API is straight forward for "new Python users"; I would qualify in your abstracts for new Python users with a mathematical background applying metropolis algorithms and implementing mcmc analysis for physics and engineering use cases. I might also consider proactively laying out for the reader the two types of use cases or case studies you intend to walk through using pymcmcstat to include smart material systems and radiation source localizations.

@@ -174,7 +179,7 @@ The information defined in the data structure can be accessed in the sum-of-squa
The function is evaluated at a subset of times and then interpolated to the points where experimental data exists. This speeds up the model evaluation with a limited decrease in accuracy as it is reasonable to assume linear behavior within the time intervals specified. We note that computational performance can be significantly improved by writing the model functions in C++ or Fortran. You can easily call these functions by utilizing the `ctypes package <https://docs.python.org/3/library/ctypes.html>`_ and an example of how to do this with pymcmcstat can be found in the `Viscoelasticity Example <https://nbviewer.jupyter.org/github/prmiles/pymcmcstat/blob/master/tutorials/viscoelasticity/viscoelastic_analysis_using_ctypes.ipynb>`_ tutorial.

This comment has been minimized.

Copy link
@nudro

nudro Jun 11, 2019

How long does it currently take, given the VHB 4910 dataset, since you claim the computational performance can be improved writing in C++ and Fortran?

This comment has been minimized.

Copy link
@prmiles

prmiles Jun 24, 2019

Author

I wrote the model in both Python and C++. On average, the C++ version evaluated in 0.09 milliseconds, whereas the Python version took about 8 milliseconds. I have added discussion to this section, including the average run times, in order to clarify the potential usefulness of having a computationally efficient model.

@@ -209,7 +214,7 @@ Most models are comprised of multiple parameters, not all of which should be inc
We note that the :code:`lstheta0` and :code:`bounds` are simply dictionaries containing the results of a least-squares optimization and user-defined parameter bounds, respectively. The values can be put directly in the parameter structure or referenced in any manner that the user prefers.
Figure :ref:`figcpvisc` shows the burned-in parameter values sampled during the MCMC process. We remove the first part of the sampling chain to allow for burn-in to the posterior distribution. Given the consistent behavior of the sampling chain, we can be reasonably confident that the chains have converged to the posterior densities. In Figure :ref:`figdpvisc`, we have plotted the posterior distributions using a kernel density estimator (KDE). The distributions appear to be nominally Gaussian in nature; however, that is not a requirement when running MCMC. For a more rigorous assessment of chain convergence, the user can generate multiple sets of chains and use Gelman-Rubin diagnostics :cite:`gelman1992inference`. An example of how to generate multiple chains with pymcmcstat can be found in the `Running Parallel Chains <https://nbviewer.jupyter.org/github/prmiles/pymcmcstat/blob/master/tutorials/running_parallel_chains/running_parallel_chains.ipynb>`_ tutorial.
Figure :ref:`figcpvisc` shows the burned-in parameter values sampled during the MCMC process. By "burn-in" we mean that we have sampled the values sufficiently such that the chains have converged to the posterior densities. We remove the first part of the sampling chain to allow for burn-in and take the remaining portion to be the posterior distribution. Given the consistent behavior of the sampling chain, we can be reasonably confident that the chains have converged to the posterior densities. In Figure :ref:`figdpvisc`, we have plotted the posterior distributions using a kernel density estimator (KDE). The distributions appear to be nominally Gaussian in nature; however, that is not a requirement when running MCMC. For a more rigorous assessment of chain convergence, the user can generate multiple sets of chains and use Gelman-Rubin diagnostics :cite:`gelman1992inference`. An example of how to generate multiple chains with pymcmcstat can be found in the `Running Parallel Chains <https://nbviewer.jupyter.org/github/prmiles/pymcmcstat/blob/master/tutorials/running_parallel_chains/running_parallel_chains.ipynb>`_ tutorial.

This comment has been minimized.

Copy link
@nudro

nudro Jun 11, 2019

I would note that the Running Parallel Chains tutorial contains how to diagnose using Gelman-Rubin.

This comment has been minimized.

Copy link
@prmiles

prmiles Jun 24, 2019

Author

I added a note indicating that the tutorial contains information on how to use Gelman-Rubin. Thanks for pointing this out.

@nudro
Copy link

left a comment

Thank you generously for the opportunity to review. I was able to provide my comments earlier than expected. This is a rich paper defined very specifically for the materials science, mathematics, and engineering communities. I think there is an opportunity for the authors to clarify the language, minimize technical jargon, and better articulate the reasons why pymcmcstat are useful for specific periods in the workflow of the various use cases.

In pymcmcstat, the user is provided with a suite of Metropolis based algorithms, with the primary approach being Delayed Rejection Adaptive Metropolis (DRAM) :cite:`haario2006dram`, :cite:`haario2001adaptive`. A simple procedure of adding data, defining model parameters and settings, and setting up simulation options provides the user with a wide variety of computational tools for considering inverse problems. This approach to inverse problems utilizes data to provide insight into model limitations and provide accurate estimation of the underlying model and observation uncertainty.

In pymcmcstat, the user is provided with a suite of Metropolis based algorithms, with the primary approach being Delayed Rejection Adaptive Metropolis (DRAM) :cite:`haario2006dram`, :cite:`haario2001adaptive`. A simple procedure of adding data, defining model parameters and settings, and setting up simulation options provides the user with a wide variety of computational tools for considering inverse problems. This approach to inverse problems utilizes data to provide insight into model limitations and provide accurate estimation of the underlying model and observation uncertainty.

As many Python packages currently exist for performing MCMC simulations, we had several goals in developing this code. To our knowledge, no current package contains the :math:`n`-stage delayed rejection algorithm, so pymcmcstat was intended to fill this gap. Furthermore, many researchers in our community have extensive experience using the MATLAB toolbox `mcmcstat <https://mjlaine.github.io/mcmcstat/>`_. Our implementation provides a similar user environment, while exploiting Python structures. We hope to decrease dependence on MATLAB in academic communities by advertising comparable tools in Python.

This comment has been minimized.

Copy link
@nudro

nudro Jun 11, 2019

For example taking the below "This package has been applied to a wide variety of engineering problems, including constitutive model development of smart material systems as well as radiation source localization" and moving up to the abstract.

This comment has been minimized.

Copy link
@prmiles

prmiles Jun 24, 2019

Author

I have an item in the abstract that mentions these examples. I'm happy to update it further if desired.


Direct evaluation of (:ref:`eqnbayes`) is often computationally untenable due to the integral in the denominator. To avoid the issues that arise due to quadrature, we alternatively employ Markov Chain Monte Carlo (MCMC) methods. In MCMC we use sampling based Metropolis algorithms whose stationary distribution is the posterior density :math:`\pi(q|F^{obs}(i))`.
Direct evaluation of (:ref:`eqnbayes`) is often computationally untenable due to the integral in the denominator. To avoid the issues that arise due to quadrature, we alternatively employ Markov Chain Monte Carlo (MCMC) methods. In MCMC, we use sampling based Metropolis algorithms whose stationary distribution is the posterior density :math:`\pi(q|F^{obs}(i))`.

The pymcmcstat package is designed to work with statistical models of the form

This comment has been minimized.

Copy link
@nudro

nudro Jun 11, 2019

Although I appreciate the transparency via the maths you provide, I'm concerned that this will be misconstrued as "technical jargon" unless the goal and utility for why you lay out the functions are relevant - particularly to emphasize the goal of how approximating these functions directly are intractable (as you mentioned the quadtature)... and therefore are better suited for mcmc - enter pymcmcstat and metropolis algorithms.

This comment has been minimized.

Copy link
@prmiles

prmiles Jun 24, 2019

Author

Thank you for pointing this out. I have broken the Methodology section into two subsections: 1) "Bayesian Framework" and 2) "Basic Example". I believe these two sections compliment each other, and help give context to the technical jargon. I also added a separate section to discuss Metropolis algorithms, including a basic description of how they work.

+----+--------------------------+
|DRAM| DR + AM |
+----+--------------------------+

Procedurally, to run an MCMC simulation using pymcmcstat, the user will need to complete the following steps:

This comment has been minimized.

Copy link
@nudro

nudro Jun 11, 2019

This is one of the clearer parts of this paper, where you describe a workflow and exemplify in code. I think the sections that become quite dense particularly as you describe the various features addressable through pymcmcstat in Smart Material Systems.

This comment has been minimized.

Copy link
@prmiles

prmiles Jun 24, 2019

Author

In the latest draft, I have expanded the discussion in this section and cut back on discussing Smart Material Systems.

@@ -150,7 +155,7 @@ We can perform the MCMC simulation using the basic procedure previously outlined
num=400)
The information defined in the data structure can be accessed in the sum-of-squares function

This comment has been minimized.

Copy link
@nudro

nudro Jun 11, 2019

Unclear to the reader what this code bloc def ssfun(thetavec, data) is trying to convey. What does stretch_function mean? Is this a placeholder for a specific nonaffine and integer-order model?

This comment has been minimized.

Copy link
@prmiles

prmiles Jun 24, 2019

Author

Thank you for pointing this out. I agree it is unclear, and I believe explaining all the nuances of how it works would distract the reader. I removed this part of the discussion/code from the section.


Smart Material Systems
----------------------
Many smart material systems depend on robust constitutive relations for applications in robotics, flow control, and energy harvesting :cite:`lines2001principles`, :cite:`cattafesta2011actuators`. To fully characterize the material or system behavior, uncertainty in the model must be accurately represented. By using experimental data in conjunction with pymcmcstat, we can estimate the model parameter distributions and visualize how that uncertainty propagates through the system. We will consider specific examples in viscoelastic modeling of dielectric elastomers and also continuum approximations of ferroelectric monodomain crystal structures.

Viscoelastic Modeling of Dielectric Elastomers

This comment has been minimized.

Copy link
@nudro

nudro Jun 11, 2019

In this entire section, I reread the paragraphs several times to better understand your flow of describing how the pymcmcstat package addresses the following:

For example, Viscoeleastic Modeling of Dielectric Elastomers:

  • You describe the data VHB 4910
  • You describe how you will model the non-affine and integer-order models
  • You add additional information to this current workflow by adding user-defined-objects (udo)
  • You indicate the functions are easily callable
  • You show how to ignore specific functions via sample = false
  • You show convergence and how the values have been burned in and mention a diagnostic to measure this
  • You describe how you can show how uncertainty propagates through the model via a 95% CI

These sections are a bit disjointed and lack a segue from paragraph to paragraph and sometimes comes off as rambling; almost as if they were from a set of PowerPoint slides, blended together. I would suggest starting off each paragraph with a reason why these specific activities are relevant and/or common to a physicist/engineer going through a similar challenge. You wrote, "In many problems it is of interest to observe how uncertainty propagates through the model to affect the output." which is a little better. This section requires the most clarification. It currently reads like a scientific results section in a materials science paper as opposed to a submission for SciPy that emphasizes the application of Python for materials science.

This comment has been minimized.

Copy link
@prmiles

prmiles Jun 24, 2019

Author

Thank you for pointing this out. I apologize for the lack of transitions between paragraphs. I updated this section quite a bit in the latest draft, and I have moved most of the material to an initial example section ("Basic Example"). The basic example walks through the entire workflow of using pymcmcstat without focusing on a specific scientific problem. I use the two case studies (Viscoelasticity and Radiation Source Localization) to highlight specific features that I have found useful.

Monodomain Crystal Structure Modeling in Ferroelectric Ceramics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Ferroelectric materials are used in a wide variety of engineering applications, necessitating methodologies that can account for uncertainty across multi-scale physics models. Bayesian statistics allow us to quantify model parameter uncertainty associated with approximating lattice strain and full-field electron density from density functional theory (DFT) calculations as a homogenized, electromechanical continuum.
Consider the 6th order Landau function, :math:`u(q, {\bf P})`, where :math:`q = [\alpha_{1},\alpha_{11}, \alpha_{111},\alpha_{12},\alpha_{112},\alpha_{123}]`. The Landau energy is a function of 3-dimensional polarization space, :math:`{\bf P}=[P_1, P_2, P_3]`. For the purpose of this example, we consider the case where :math:`P_1 = 0`. Often times we are interested in using information calculated from DFT calculations in order to inform our continuum approximations, such as our Landau function. For this example, we will assume we have a set of energy DFT calculations corresponding to different values of :math:`P_2` and :math:`P_3` as seen in Figure :ref:`figdftdata`. For more details regarding this research, the reader is referred to :cite:`miles2018analysis` and :cite:`leon2018analysis`.

This comment has been minimized.

Copy link
@nudro

nudro Jun 11, 2019

Again, I am a bit concerned with the level of math and jargon that is very specific to your topic area. I might preface the reason why we are to consider the 6th order Landau function. Not knowing myself very much on the topic, I might preface it with motivations to model phase transitions.

"approximating lattice strain and full-field electron density from density functional theory (DFT) calculations as a homogenized, electromechanical continuum" - this section is near unreadable for an untrained physicist or engineer.

This comment has been minimized.

Copy link
@Mahdisadjadi

Mahdisadjadi Jun 12, 2019

I want to highlight this comment. The examples in the paper have great potential to convey the usefulness of pymcmcstat and personally, I enjoyed them a lot. But the examples are somewhat inaccessible for a reader with a non-physics background.

This comment has been minimized.

Copy link
@prmiles

prmiles Jun 24, 2019

Author

Thank you both for pointing this out. I do not believe this section added a lot of value to understanding pymcmcstat, so I have removed it in the latest draft of the paper. The information I hoped to convey has been included in other sections of the paper.

.. figure:: figures/x_vs_y_2d.png
:figclass: tb
Marginal posteriors from MCMC simulation presented in urban environment. Actual source location is denoted by the red circle. :label:`figxymarg`
Concluding Remarks
------------------
In this paper we have demonstrated several distinct areas of scientific study where MCMC methods provide enhanced understanding of the underlying physics. The pymcmcstat package presents a robust platform from which to perform a wide array of Bayesian inverse problems using the Delayed Rejection Adaptive Metropolis (DRAM) algorithm.

This comment has been minimized.

Copy link
@nudro

nudro Jun 11, 2019

Here in your concluding remarks you indicate "demonstrated several distinct areas of scientific study". What do you envision is more important to communicate in this paper - the scientific findings of these use cases, or the application of the pymcmcstat package you developed? It seems like there is a battle between both focuses in the paper. This currently reads like a technical paper submitted to a material sciences journal with a smattering of Python. There is still greater opportunity to demonstrate the value of pymcmcstat as it directly solves these use cases. My recommendation is to revisit the language and more clearly articulate the reasons why specific use cases are being conveyed, why the reader would care to use pymcmcstat over their own implementation, and how key issues and problems that commonly face physicists would be more made more efficient using pymcmcstat.

This comment has been minimized.

Copy link
@prmiles

prmiles Jun 24, 2019

Author

Thank you for pointing this out - I had the wrong audience in mind when I first wrote the paper. My background is mechanical engineering/material science, and I am very knew to the Python community.

I have tried to communicate key features using generic examples, which hopefully will be easier for users to transfer to their own research. I have kept parts of the two case studies (Viscoelasticity and Radiation Source Localization) in order to highlight specific aspects of how pymcmcstat was used to solve those problems.

I hope I have struck a better balance, but please let me know if I need to clarify any details.

Monodomain Crystal Structure Modeling in Ferroelectric Ceramics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Ferroelectric materials are used in a wide variety of engineering applications, necessitating methodologies that can account for uncertainty across multi-scale physics models. Bayesian statistics allow us to quantify model parameter uncertainty associated with approximating lattice strain and full-field electron density from density functional theory (DFT) calculations as a homogenized, electromechanical continuum.
Consider the 6th order Landau function, :math:`u(q, {\bf P})`, where :math:`q = [\alpha_{1},\alpha_{11}, \alpha_{111},\alpha_{12},\alpha_{112},\alpha_{123}]`. The Landau energy is a function of 3-dimensional polarization space, :math:`{\bf P}=[P_1, P_2, P_3]`. For the purpose of this example, we consider the case where :math:`P_1 = 0`. Often times we are interested in using information calculated from DFT calculations in order to inform our continuum approximations, such as our Landau function. For this example, we will assume we have a set of energy DFT calculations corresponding to different values of :math:`P_2` and :math:`P_3` as seen in Figure :ref:`figdftdata`. For more details regarding this research, the reader is referred to :cite:`miles2018analysis` and :cite:`leon2018analysis`.
Consider the 6th order Landau function, :math:`u(q, {\bf P})`, where :math:`q = [\alpha_{1},\alpha_{11}, \alpha_{111},\alpha_{12},\alpha_{112},\alpha_{123}]`. The Landau energy is a function of 3-dimensional polarization space, :math:`{\bf P}=[P_1, P_2, P_3]`. For the purpose of this example, we consider the case where :math:`P_1 = 0`. We are often interested in using information calculated from DFT calculations in order to inform our continuum approximations, such as our Landau function. For this example, we will assume we have a set of energy DFT calculations corresponding to different values of :math:`P_2` and :math:`P_3` as seen in Figure :ref:`figdftdata`. For more details regarding this research, the reader is referred to :cite:`miles2018analysis` and :cite:`leon2018analysis`.

This comment has been minimized.

Copy link
@nudro

nudro Jun 11, 2019

You cite many figures. Are these figures a part of the pymcmcstat API, or must a user write matplotlib code to develop them independently of the package?

This comment has been minimized.

Copy link
@prmiles

prmiles Jun 24, 2019

Author

I removed this section, so these particular figures are no longer included. The "Basic Example" section walks through the main plotting methods that are part of the pymcmcstat API, so hopefully that makes things easier to understand.

@Mahdisadjadi

This comment has been minimized.

Copy link

commented Jun 12, 2019

Hi @prmiles ! I'm one of the reviewers for this paper. Nice work! I think pymcmcstat is a very useful package with an intuitive and flexible API.

I generally agree with @nudro 's comments, so I try to focus on notes not yet covered:

  • I think it'd be useful to include some discussion on performance and tips on how to make more efficient simulations.

  • You have mentioned the inclusion of delayed rejection (DR) as a goal for your package. I suggest you include some discussion as to why the addition of DR (and in turn DRAM) is important and expand the explanation of DR beyond "helps to stimulate mixing within the sampling chain".

  • The Github repo, documentation, and tutorials for pymcmcstat appear as links. But I think their URLs can be added as references to make them more accessible/searchable. In fact, the repo points to many more code examples that might be worth mentioning briefly in the paper.

# Add model parameters
mcstat.parameters.add_model_parameter(
name='m',
theta0=2.)

This comment has been minimized.

Copy link
@Mahdisadjadi

Mahdisadjadi Jun 12, 2019

I think that theta0 is the initial guess for the defined parameter. This can be added as a comment.

This comment has been minimized.

Copy link
@prmiles

prmiles Jun 24, 2019

Author

Thank you for pointing this out. I have added a comment specifying that theta0 is the initial value.

kind='linear')
# compute sum-of-squares error
res = ydata - model(time).reshape(ydata.shape)
return ss = (res ** 2).sum(axis=0)

This comment has been minimized.

Copy link
@Mahdisadjadi

Mahdisadjadi Jun 12, 2019

This line will return a syntax error. Probably, you meant:

ss = (res ** 2).sum(axis=0)
return ss

This comment has been minimized.

Copy link
@prmiles

prmiles Jun 24, 2019

Author

Thank you for pointing this out. I corrected this in the code snippets in the latest draft.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dielectric elastomers as part of adaptive structures provide unique capabilities for control of a structure's shape, stiffness, and damping :cite:`smith2005smart`. Many of these materials exhibit viscoelastic behavior, which varies significantly with the rate of deformation :cite:`rubinstein2003polymer`. Figure :ref:`figfinalcycles` shows uni-axial experimental data for the elastomer Very High Bond (VHB) 4910, which highlights how the hysteretic behavior increases with the rate of deformation. For more details regarding the experimental procedure used to generate this data, the reader is referred to :cite:`miles2015bayesian`.

.. figure:: figures/final_cycle_for_each_rate.png

This comment has been minimized.

Copy link
@Mahdisadjadi

Mahdisadjadi Jun 12, 2019

The unit for stretch (x-axis) is missing in the figure.

This comment has been minimized.

Copy link
@prmiles

prmiles Jun 24, 2019

Author

Thank you for pointing this out. Stretch is actually a unit less quantity, but I don't think I mentioned that anywhere. I'll try to add a note to the text to clarify this. Would you also like me to update the x-label to be "Stretch (-)"?

.. figure:: figures/ionv_pi.png
:figclass: tb

This comment has been minimized.

Copy link
@Mahdisadjadi

Mahdisadjadi Jun 12, 2019

The axes don't have labels/units (should be similar to Fig 1).

This comment has been minimized.

Copy link
@prmiles

prmiles Jun 24, 2019

Author

In my updated draft I have removed this figure. Thank you for pointing out the mistake.

prmiles added some commits Jun 21, 2019

updated intro
- Generalized intro to make it amenable to a broader audience
- Added discussion of model development and uncertainty
fixed typo
- ke -> key
added description of key pieces
- connected dots between api and Bayesian stats... hopefully
@prmiles

This comment has been minimized.

Copy link
Author

commented Jun 24, 2019

Thank you both for your feedback as well as patience as I work through your comments. I have made significant modifications for the latest draft in order to communicate to a more general audience. I have commented on specific items above, but there are a few items that I am still working on. Please let me know if there are any items requiring additional clarity. Also, I'm not entirely sure what the due date is, so I apologize for not getting to this sooner.

I had questions regarding these comments by @Mahdisadjadi

  • I think it'd be useful to include some discussion on performance and tips on how to make more efficient simulations.

Do you want me to provide more details on how adaptation and delayed rejection can yield more efficient simulations?

  • You have mentioned the inclusion of delayed rejection (DR) as a goal for your package. I suggest you include some discussion as to why the addition of DR (and in turn DRAM) is important and expand the explanation of DR beyond "helps to stimulate mixing within the sampling chain".

I added a section called "Metropolis Algorithms". I added a couple sentences explaining the usefulness of delayed rejection, so please let me know if more information would be helpful.

  • The Github repo, documentation, and tutorials for pymcmcstat appear as links. But I think their URLs can be added as references to make them more accessible/searchable. In fact, the repo points to many more code examples that might be worth mentioning briefly in the paper.

Would you like me to include the full URLs somewhere? The URLs are fairly long because of how I setup the tutorials, so I wasn't sure.

@deniederhut

This comment has been minimized.

Copy link
Member

commented Jun 25, 2019

The urls will get printed in a smaller font (and take up less space) if you have them as a footnote 😄.

@nudro will you have time to take a second look?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.