Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Two different syntaxes for getting Parameter values? #45

Closed
tBuLi opened this issue Dec 21, 2015 · 11 comments
Closed

Two different syntaxes for getting Parameter values? #45

tBuLi opened this issue Dec 21, 2015 · 11 comments

Comments

@tBuLi
Copy link
Owner

tBuLi commented Dec 21, 2015

Currently there are two different ways to get the value and stdev of a parameter:

# a = Parameter()

assert fit_result.params.a == fit_result.value(a)

This kinda conflicts with the principle that there should be only one obvious way to do something. Which is better?

If we decide to remove the first one, that means the ParameterDict object can be removed entirely and replaced with an OrderedDict. This is something I would like, as the current ParameterDict is kind of a "don't touch it, it works"-thing.

Furthermore, the second syntax is better when dealing with dynamically generated Parameter objects whose name you don't know or care about.

The only thing going for the first one is that I think it's rather elegant.
@Jhsmit, @Pitje06, @Eljee What do you think?

@pckroon
Copy link
Collaborator

pckroon commented Dec 31, 2015

I always find these questions very hard... But here's my two cents.

I'm in favor of the first syntax, since IMHO it's more Pythonic (attributes over getter functions). Optionally you could change it to fit_result.a.
The main downside of the first variant is however, that you need to know the names of your parameters when you're writing your code. Otherwise you're stuck with getattr to retrieve your values (but this should be <20% of all use cases).
Also, a should be a parameter object (subclass from float?) that has an stddev attribute so it can be used as transparently as possible. If possible, it should be tied in with one of the sympy primitives such that errors get propagated.
As for the implementation, I don't see why you can not implement this relatively simple with a __getattr__ method that first tries to get it from the object itself (__dict__?) and otherwise from a param_dict attribute. that gets it from a param_dict attribute. See https://docs.python.org/3/reference/datamodel.html#object.__getattr__

EDIT: read up on Python documentation. __getattr__ is called only if normal attribute lookup fails.

@tBuLi
Copy link
Owner Author

tBuLi commented Jan 10, 2016

Thanks for your reply. Upon reading it I realized I should have thought about my original post longer because I didn't specify all the problems I have with the current fit_result.params.a syntax. So here goes:

To get standard deviation in each paradigm you do

assert fit_result.params.a_stdev == fit_result.stdev(a)

(Btw, the first works by overloading __getattr__ like you suggest)

However, this way of accesing the value and stdev has no consistent way of getting covariances between fit parameters. Therefore it would be nice to have one syntax which feels self-consistent. This is why I implemented the following functions (#43):

fit_result.value(a)
fit_result.stdev(a)
fit_result.covariance(a, b)
fit_result.variance(a) 

They all have a similar feel to them, which is why I like them better.

Now, the ParameterDict was implemented purely to create the ** shorthand when calling models:

y = model(x=xdata, **fit_result.params)

And then given the additional tasks of returning the value and stdev by attribute. (This is why it's fit_result.params.a rather than fit_result.a.)

So this is why currently my way of thinking is to get the values, covariance etc. with the new syntax but to still have fit_result.params as a simple OrderedDict so the ** still works.

One could even consider making FitResults objects themselves mappable for this, but I don't think that a good idea since conceptually I would not expect to iterate over parameter values when iterating over FitResults. Rather, I would expect to get the different solutions in case of multiple solutions, which is indeed something that is currently not accounted for.

The fact that all this originates from the need for having the ** shorthand hopefully also explains why fit_result.params.a returns a float, not an object. It's a wrapper for fit_result.params['a']. This could of course be changed if we think that would be better, but personally I think returning an object would ad an unnecessary level of complexity.

@pckroon
Copy link
Collaborator

pckroon commented Jan 14, 2016

Ok, that changes things.

Now, a covariance between a and b only makes sense in the context of a FitResult. You can use the same reasoning to argue that the variance of a also only makes sense in this context. This would mean that you 'fix' the following:

fit_result.stdev(a)
fit_result.covariance(a, b)
fit_result.variance(a)

Then, for the sake of consistency you should also use fit_result.value(a).
Food for thought: fit_result.value(a) or fit_result.value("a")?

When making a FitResult mappable, can you distinguish this from an iterable? I.e. **fit_result acts like a parameter dict while iter(fit_result) iterates over the separate solutions?
After reading the python data model (again): you can't distinguish these, and that's probably a good thing.

Or should you return a list of FitResults when you have multiple solutions (but that messes with your return types)?

@pckroon
Copy link
Collaborator

pckroon commented May 20, 2016

Coming back to this my view hasn't changed much.

a = Parameter(3.14)
...  # set up model and Fit object
fit_result = fit.execute()

a.value should always be 3.14 (my initial guess) since I may reuse this parameter in a later fit with the same initial guess. Conversely, fit_result.value(a) should be whatever value is found in fitting. fit_results.params should be a mapping to the resulting values (the outcome of the fit) for sake of convenience.
I personally don't really care about fit_result.params.a. It might be clearer to deprecate this syntax altogether in order to emphasize the difference between a.value and fit_result.value(a).

Note that InteractiveGuess (PR #33) does for param in model.params: param.value = new_value. It would be nice if that didn't break ;-)
This would mean that model.params is a(n) (ordered)dict which maps parameter objects to their values. This should (of course) be the same kind of object as fit_result.params; but I guess that has to be a {p.name: p.value} mapping...
Now for the bonus question: should model.params refer to the original parameter objects which have their initial guesses as values, or should it refer to the parameter objects that are the result from fitting? I vote for the first.

@Jhsmit
Copy link
Collaborator

Jhsmit commented Aug 3, 2016

To continue the disussion in #57, for me it would make sense to implement more Dict-like behaviour. I really like the ability to ** unpack, which suggests a dict, which makes me want to do this:

for key in result.params:
    result.params[key]

Which doesnt work because the key is a Parameter instance and not a string. This does work:

for key in result.params:
    result.params[str(key)]

Would it make sense to change the __getitem__ of ParameterDict to:

def __getitem__(self, param):
    """
   ....
    """
    if isinstance(param, Parameter):
        param_name = str(param)
    elif isinstance(param, str):
        param_name = param
    else: 
        #maybe some error raised
    return getattr(self, param_name)

@tBuLi
Copy link
Owner Author

tBuLi commented Aug 3, 2016

Could work, but actually I think it would make more sense to change result.params from a ParameterDict to an OrderedDict. I think that would offer all the functionality needed if we do away with the result.params.a syntax (which I think we are all onboard with).

This would solve all these problems right?

@pckroon
Copy link
Collaborator

pckroon commented Aug 3, 2016

Hang on, I'm totally lost now.

We have 3 (?) things related to parameters: Parameter objects, Model objects (which have a params attribute) and FitResults objects (which also have a params attribute).

Requirements stated above:

  1. Parameter object's values do not change when fitting
  2. model(x=data, **fitresults.params) works
  3. Model.params and FitResults.params should be the same kind of object (should they? The first could be a list AFAICT)
  4. Model.params, Model.independent_vars and Model.dependent_vars should be similar/identical in behaviour
  5. preferably, it should have the Parameter object as key, but that might be incompatible with requirement 2.

Still unclear:

  1. should the values of Model.params change upon fitting? I think not.

I think an OrderedDict fits all these requirements, but did I miss anything?

@tBuLi
Copy link
Owner Author

tBuLi commented Aug 4, 2016

Let me go through those points.

  1. yes
  2. yes
  3. Model.params, Model.independent_vars and Model.dependent_vars all are list containing the respective Parameter's or Variable's for that model, ordered by alphabet. To clarify, no information of the fit is transferred back to the Model in any way. Parameter.value is only read by symfit, never written.
  4. They are not the same, and in retrospect FitResults.params maybe should have had a different name, but I couldn't think of a better one at the time. Maybe FitResults.kwargs or FitResults.as_kwargs? Because it's simply not even related to Model.params in terms of what it does. It's simply Parameter.name: best_fit_value pairs. (It's still early enough in the project that this could be changed while adding a deprecation warning to the old name such that it still works)
  5. FitResults.params only exists for the purpose of ** unpacking. It is not currently and will never be used for anything else. It's just a view on the actual info. With that in mind it's key has to be str.

I think that also answers the unclear point. So yes, I think an OrderedDict should work right?

And maybe another name to avoid confusion?

@pckroon
Copy link
Collaborator

pckroon commented Aug 5, 2016

Thanks.
Both a normal dict and and OrderedDict would suit these purposes. Another name might be desirable to avoid confusion with Model.params.
Would it be feasible to make FitResults the actual mapping object? i.e. model(x=data, **fitresults). IIRC the argument against this was that you could get multiple results from one fit, right? But I think that opens up a different discussion that should get it's own issue.

tBuLi pushed a commit that referenced this issue Feb 5, 2017
…dentity crisis. It has been replaced with an OrderedDict. However, Pitje06's point in #45 is now even more evident; confusion between the content of FitResults.params and Model.params is bound to arrise.
@pckroon
Copy link
Collaborator

pckroon commented Jan 25, 2018

Is this still an issue?

@tBuLi
Copy link
Owner Author

tBuLi commented Jan 26, 2018

No, the switch to OrederedDicts has been made. Thanks

@tBuLi tBuLi closed this as completed Jan 26, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants