Add information from model to fit #75

ahartikainen · 2019-06-02T07:36:12Z

Hi,

Could we add model.program_code, and maybe other information too, to the fit, so that one can infer and recreate the model without the model instance?

They could go in a dict under fit.model_info?


riddell-stan · 2019-06-02T14:01:59Z

I'd like to keep things as simple as possible, in the interest of making the code very easy/pleasant to read. Perhaps we could wait until someone asks for this explicitly?


ahartikainen · 2019-06-02T14:20:25Z

True. Could we at least add the Stan code? I don't know how many times I have pickled a fit without any record of the model (this is not an issue with pystan, where I pickle fit + model together).

This could also simplify the implementation in arviz.

riddell-stan · 2019-06-03T11:12:57Z

How about storing a copy of model_name, which is a hash of the program_code, with the fit object? This would let us avoid storing duplicate information.
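A minimal sketch of the hashing idea, assuming SHA-256 over the program text. The helper name `model_name_from_code`, the `model_` prefix, and the 12-character truncation are hypothetical and are not claimed to match the project's actual naming scheme.

```python
import hashlib


def model_name_from_code(program_code: str) -> str:
    """Derive a short, deterministic name from Stan program code (hypothetical helper)."""
    digest = hashlib.sha256(program_code.encode("utf-8")).hexdigest()
    # The prefix and the truncation length are illustrative choices only.
    return f"model_{digest[:12]}"


code_v1 = "parameters { real theta; } model { theta ~ normal(0, 1); }"
code_v2 = "parameters { real theta; } model { theta ~ normal(0, 2); }"

name_v1 = model_name_from_code(code_v1)
name_v2 = model_name_from_code(code_v2)
```

The same program text always yields the same name, and an edited program yields a different one. The name identifies a program but cannot be used to recover its text.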

ahartikainen · 2019-06-03T11:41:21Z

That could work, but it fails when the user incrementally updates the model, overwriting the old one.

I don't think the Stan code would create too much duplication, but it would increase the robustness of the fit object.

riddell-stan · 2019-06-03T12:11:36Z

I have trouble imagining a scenario where I would lose track of my Stan program code. (I like the idea of storing the fit_name with the fit. This includes the model name (which has the hash of the stan program code).)

So here are my two arguments against adding the program code to the Fit object:

1. It violates DRY; we already have the program_code attached to the Model instance. People should keep track of their Models. If they don't, that's their fault.
2. It breaks the abstraction. The Fit is not the Model. The Model is the logical place for the program code.

If people are losing track of their Models, then this is a problem and we should find a way to help them. I can imagine some sort of helper function which bundles a Model and its Fits together. We could add this helper function to the documentation just like we did with stan_cache for PyStan 2.
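A helper of this kind could look roughly like the sketch below. It assumes the model and fit objects can be pickled; the names `save_bundle` and `load_bundle` are hypothetical, and plain dicts stand in for real Model and Fit instances.

```python
import os
import pickle
import tempfile
from typing import Any, List, Tuple


def save_bundle(path: str, model: Any, fits: List[Any]) -> None:
    """Pickle a model and its fits into one file (hypothetical helper)."""
    with open(path, "wb") as fh:
        pickle.dump({"model": model, "fits": fits}, fh)


def load_bundle(path: str) -> Tuple[Any, List[Any]]:
    """Load a model and its fits from a file written by save_bundle."""
    with open(path, "rb") as fh:
        bundle = pickle.load(fh)
    return bundle["model"], bundle["fits"]


# Stand-ins for real Model and Fit objects, used only to show the round trip.
model = {"program_code": "parameters { real theta; } model { theta ~ normal(0, 1); }"}
fits = [{"theta": [0.1, 0.2]}, {"theta": [0.3, 0.4]}]

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "bundle.pkl")
    save_bundle(path, model, fits)
    loaded_model, loaded_fits = load_bundle(path)
```

Keeping both objects in one file means a fit cannot be reloaded without the model it came from, at the cost of asking users to call the helper instead of pickling the fit directly.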

Edit: edits for clarity

ahartikainen · 2019-06-03T14:59:24Z

1. True, it violates DRY. But Model and Fit are separate objects. Users will still save only the Fit, change models, and forget models (reloading a Fit after many years, sharing a Fit with a coworker, making mistakes). Also, Stan code is small, doesn't lose its descriptiveness, and is "universal" because it is a str.
2. Sure, we could make them save together, but then what would be the point of having a separate Fit object?

Scikit-learn keeps the model and the fit in the same class. I'm not saying we need to create hooks to the Model object, just that we could add its Stan code (and maybe include files, if they are text), which would let one wrap the Fit object so that the user can recreate the model used to sample its data.

Also, the Stan code is currently the only way to parse dtypes for models, but I hope Stan3 fixes this problem.
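One way to picture the wrapping idea is the sketch below. The class `FitWithCode` and its field names are hypothetical, and a plain dict stands in for a real fit object.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class FitWithCode:
    """Hypothetical wrapper pairing a fit with the Stan source that produced it."""

    fit: Any
    program_code: str
    # Optional include files, keyed by filename, stored as plain text.
    include_files: Dict[str, str] = field(default_factory=dict)


wrapped = FitWithCode(
    fit={"theta": [0.1, 0.2]},  # stand-in for a real fit object
    program_code="parameters { real theta; } model { theta ~ normal(0, 1); }",
)
```

Because the program code is stored as a plain string next to the fit, anyone who receives the wrapped object can read the model definition without access to the original Model instance.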

riddell-stan · 2019-06-03T17:52:14Z

Could we perhaps address this after the final 3.0 version? We don't even have a way of serializing fits yet (from the pystan side). Also, I'm willing to change my opinion if there's an example or two of people saving fits and forgetting the model from which it came. In general, I'm genuinely worried about avoiding cruft <https://martinfowler.com/articles/is-quality-worth-cost.html> and keeping things as simple as possible because we have limited developer time. PyStan 2 is so messy (esp the C++ stuff) that it's virtually impossible to understand, hence costly to maintain.


ahartikainen · 2019-06-03T18:13:05Z

Yes, this can be addressed after Stan3 lands.

I think my logic is that, given that Stan code is text (plus the text of any #include files), including it would not add complexity and would be an easy way to increase the robustness of the fit (what is theta, etc.).

The current Stan2 fit... is a beast :) (I was trying to add some minimal changes to VI to get log_p and log_g, but I guess the reader needs a rewrite).

riddell-stan · 2019-06-04T12:38:11Z

I definitely think it is a bug that there's no way right now to match a Fit instance to a Model instance. We definitely need to add fit_name as a field so people can at least look at the hashes. This discussion has brought that up.

There are so many ways to do this particular API. I'm not confident that this way is any better than, say, the way sklearn does it. All I know is that it is essentially what we settled on many years ago here: https://github.com/stan-dev/stan/wiki/User-Interface-Guidelines-for-Developers

Let's definitely revisit this in the future.

ahartikainen · 2019-06-04T15:33:05Z

Is there any mechanism for this in RStan or CmdStan?

cc. @bgoodri @seantalts

stale · 2019-10-22T17:27:11Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

riddell-stan mentioned this issue Jun 4, 2019

No way to match Fit instance with Model instance #76

Closed

stale bot added the wontfix label Oct 22, 2019

stale bot closed this as completed Oct 29, 2019
