Skip to content
This repository has been archived by the owner on May 26, 2022. It is now read-only.

added new model parameter API. #9

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

wkerzendorf
Copy link
Member

I have started to implement basic WCS functionality in specutils and worked with the new Model class from astropy.modeling. This showed me that the functionality in Model is quite good, however the API is a bit "clunky". I've worked a lot with the sqlalchemy - an object relational mapper. They represent their Tables (or table rows) as classes and have implemented a simple, but powerful API. Each individual column has a name, a type (and other properties like Foreignkey). This is very similar in my mind to a Model which has Parameters which have properties (e.g. fixed). Attached my suggestion of what I feel would make Model much easier to work with (mostly when writing your own).

What are people's thoughts.

@embray
Copy link
Member

embray commented Jun 25, 2013

Yes, this declarative approach to defining model classes looks very similar to what I already started in #1086. That didn't include descriptors like ParameterSet or ModelInput/Output but those seem worth considering as well.

@wkerzendorf
Copy link
Member Author

@iguananaut I think it would be good to make a combined effort to deal with this. Currently, we have efforts scattered across astropy/astropy#1203, astropy/astropy#1086, astropy/astropy#980 as far as I can tell.

It doesn't necessarily have to be here on this PR, but I think an API PR would be great. If you think this PR is a good start feel free to commit to it.

@embray
Copy link
Member

embray commented Jun 26, 2013

Well. I think 980 doesn't represent any particular substantive work--it just says "this is unideal". 1086 is my effort to simply and streamline things; though I really need to update and rebase it. 1203 is a separate effort to improve some of the issues brought up in 980 but for the most part doesn't overlap with it functionality-wise.

@wkerzendorf
Copy link
Member Author

@iguananaut is there anything I can do to help? should I close this PR here?

@embray
Copy link
Member

embray commented Jun 27, 2013

To the contrary, I think maybe we should post a link to this on the mailinglist or something to get a little more attention.

In my PR I sort of just went straight forward and DID what I had in mind, but that led to a very complicated diff and made it more difficult to go back and explain the issue (the comments in the PR got further confused by a mostly unrelated discussion about computational efficiency).

It would be nice to have more discussion about this API approach and its merits.

@wkerzendorf
Copy link
Member Author

@iguananaut: Well we should make sure that this is inline with your PR. What ideas are in your PR that are not on this API PR?

@embray
Copy link
Member

embray commented Jun 27, 2013

The main thing I emphasized in my PR that is not in your notes here is that I wanted to avoid duplication of data, and to store all pertinent data related to a model in one place.

The problem with the current design is that there are at least two copies of all the parameter values: There are copies of the values stored in the Parameter objects as attributes on the Model class, and then there's an additional copy stored in the .parameters array. This leads to complicated, messy code to try to keep the two in sync. Better would be to have the parameter values stored once and only once.

My current approach stored the parameter values as attributes on the Model object and the .parameters attribute presented a sort of array-like view of the values. But Nadia pointed out that this is too slow for passing the parameters to the fitter and updating them through multiple rounds of fitting.

So what I really need to do is the opposite of what I currently have: make the .parameters array the place where parameter values are stored, and make the per-parameter attributes extract their values from that array. Does that make sense?

This post specifically discussed parameter values, but the same principle applies to things like constraints.

@wkerzendorf
Copy link
Member Author

@iguananaut I think what you're discussing here are more technical problems that are not that related to the actual API (still very important). So if you don't have anything to add, I will get something out to astropy-dev suggesting a similar API to this PR and linking to your implementation PR (highlighting the mentioned problem).

@embray
Copy link
Member

embray commented Jun 27, 2013

You're right--I realized that while I was out to lunch. All that stuff I just wrote is more implementation detail than API-related (though it can affect the API in some places, it's mostly a separate issue). Otherwise, I really like your suggestion here, and it's basically the same thing I had in mind.


mylittlepoly = Poly1DModel()
mylittlepoly.c_0 = 10.
mylittlepoly.fixed

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does it mean to have a whole model that is fixed? Should it be mylittlepoly.c_0.fixed=True?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be my guess as to what was meant. Though the current ParametricModel class does I think have a "fixed" attribute (or maybe it's "_fixed") that just returns a dict mapping each parameter to whether or not it's fixed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to show that fixed is a property of a parameter (like currently in the Model class). You know what I mean @keflavich?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I admit I'm still a little lost. What should I see if I do print mylittlepoly.fixed? Is it as @iguananaut says:
{'c_0': False, 'c_1': False, ...}
or just False, or
{'c':[False,False,False,False,False]}, or
something else?

[edit: the API suggests False on line 51, but I thing @iguananaut's suggestion probably makes more sense]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it's wrong I should have written mylittlepoly.c_0.fixed - sorry for the confusion. I think mylittlepoly.fixed shouldn't print anything it should throw an AttributeError because you can't "fix" a model, just individual parameters of a model.

@keflavich
Copy link

I like the general idea, and have no specific objections to what's shown here, but I think the examples could be fleshed out a little more.


class Poly1DModel(modeling.Model):

c = ParameterSet(5, syntax='_%s', unit=None, equivalency=None, fixed=False, tied=False, bounds=())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also rather than "syntax" I might call this something like "format"...maybe?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suffix_format, perhaps?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear to me what ParameterSet is. In the current framework the term is used for defining simultaneously the same kind of model with different parameters. For example p = Poly1DModel(degree=2, param_dim=2) will create a Poly1DModel with two parameter sets (effectively 2 models) which can be evaluated and fit simultaneously. The comment above states that ParameterSet is used for model with multiple parameters . Did you mean parameters which are not scalars? If so why is a separate class needed?

@wkerzendorf
Copy link
Member Author

@keflavich Can you specify some examples that you want to see.

@nden: I changed from ParameterSet to ParameterCollection as the idea is different than the currently implemented ParameterSet. I don't quite understand why we need a ParameterSet, why can't we just a list of the same models with different parameters?

@nden
Copy link

nden commented Jun 28, 2013

@wkerzendorf The short answer is efficiency.
If you have a cube (30, 1000, 1000) of ramp data and want to fit a polynomial to each pixel (a common use case in IR), you'll get a list of 1000000 models and so many fitters vs 1 model, 1 fitter and that many parameter sets. Huge difference!

@nden
Copy link

nden commented Jun 28, 2013

I am still not sure what ParameterCollection is used for. The proposed API says it's for models with multiple parameters. Does the Gaussian1DModel, for example, fall into this case, since it has three parameters or do you mean single parameters which are not scalars? If so how are these two cases handled:

  • a model with two parameters - one is an array, one is a scalar
  • the above case of multiple parameter sets

@embray
Copy link
Member

embray commented Jun 28, 2013

@nden I agree with you that a ParameterSet descriptor is unnecessary--it should just be possible to instantiate a model with multiple parameter dimensions.

I think that what he's trying to go for with ParameterCollection is a generalization of models like polynomial models that can have a variable number of parameters possibly in a variable number of dimensions. Its purpose just needs to be worded better...assuming my understanding is correct.

@wkerzendorf
Copy link
Member Author

So the ParameterCollection is exactly what @iguananaut is saying. A way to have an arbitrary number of Parameters being created. I think one could argue if that's necessary. I think just saying mylittlepoly.c[0] = 5 might be good enough to access these different parameters.

That brings me to the next point: There are two very different concepts we are addressing with parameter "dimensions" (I'm going to be using the word Modelclass for all models that use the same parameters and the same evaluation function, e.g. y=mx+b is a Modelclass, y=5*x+10 is a Model).

The first concept is the one that Nadia describes - sometimes there's a need to fit the same Modelclass to a large collection of data. I feel that there should be a ModelSet that does this because it's many models being fit to many datasets.

The second concept is that a parameter might be intrinsically multidimensional (e.g. a rotation matrix, polynomial coefficients, lookup table, ..) but still is one Model.

When we mix these we run the danger of having inconsistent models: What should happen if I give a n-dimensional input? Does it evaluate all the inputs with all the ParameterSets(using the term as @nden uses it). Do we evaluate the inputs with the first first parameterset?

@wkerzendorf
Copy link
Member Author

@iguananaut @nden: I want to push this a little bit. What needs doing to get a version of this API implemented (I think this a more a question for @iguananaut). What's up with the PR that already exists?

@embray
Copy link
Member

embray commented Jul 8, 2013

So, I need to do some work on rebasing my PR on master and integrating it with some of the changes that have already since occurred in the modeling package. Then I need to fix several issues it has with parameter constraints, and address the performance concerns Nadia had. I was actually hoping to start on that later this afternoon once I get through a few more things, but I'm not sure how long it will take. It's going to be an ugly merge :)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants