Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

parameter-based GSObject classes #195

Closed
TallJimbo opened this issue Jun 18, 2012 · 34 comments
Closed

parameter-based GSObject classes #195

TallJimbo opened this issue Jun 18, 2012 · 34 comments
Assignees
Labels
base classes Related to GSObject, Image, or other base GalSim classes

Comments

@TallJimbo
Copy link
Member

In the comments for #148, Barney suggested moving one aspect of a proposal I made there into a new issue, so here it is.

The idea is to change the GSObject class (and subclasses) from something that constructs its SBProfile immediately, and then holds it as an attribute, to something that holds all the parameters needed to define a particular SBProfile, and then constructs it (and possible caches it) when it is first requested.

I think there are two and a half motivations for this:

  • It makes it much easier for GSObjects to have user-accessible parameters (like radii and ellipticity); doing this in the current model is problematic because the SBProfile object may be an SBDistort or SBConvolve that does not have accessors.
  • It makes it possible to unify the definitions of the parameters of different kinds of objects in the GSObject hierarchy, so that we wouldn't have to duplicate that information when defining the config file schema, because it makes the GSObject classes more introspectable. That will require additional work on Mechanisms for more restricted config files #148, but I think it makes sense to complete this issue first before pressing forward on that.

Another advantage is that I think it also hides the implementation details of SBProfile from the user a little. For instance, ellipticity will feel more natural as a part of a GSObject's definition; the fact that this is implemented by an SBDistort adapter around the original profile class becomes completely invisible. In fact, while I think I could make this change in a way that doesn't change the current GSObject API, I'd like to also remove the "public" .SBProfile attribute and replace it with something that starts with an underscore (in either case, this would become a property that is computed on-the-fly, of course, though it may be cached).

This touches a lot of code, but I think it'd be very straightforward to do, and even I could get it done pretty quickly. I'm going to start converting a few classes on the branch immediately to serve as examples for discussion.

@ghost ghost assigned TallJimbo Jun 18, 2012
@rmjarvis
Copy link
Member

I think reason 1 is obviated by the changes I made in #142. When it's a Distortion, python knows and gives a TypeError if someone tries to use the getters. I think this is exactly the functionality we want for this. (After all, what should getSigma return if you really have a sheared and/or dilated Gaussian?) In any case, you should probably start your branch from the current #142 branch (which will shortly be merged) since it included quite a few changes around this part of the code.

I'm not sure about your second reason. Probably because I don't really know what you have in mind for #148. I would have thought all the information you need is already in the definition of the constructor for each sub-type of GSObject.

I do agree that the SBProfile should be made "private". I don't think that should be part of the public interface.

@TallJimbo
Copy link
Member Author

Raising a descriptive TypeError is better than some other failure, but I still don't think it's nearly as nice as being able to return the value the user was asking for. And while there are many radii one can define for an elliptical object, people certainly do talk about the radii of elliptical things, and we just need to document what we're returning. And there are things like flux that aren't at all ambiguous are are very useful to be able to get from an object, even if you've transformed it some way.

On the second point, the information we need for the configuration interface is essentially:

  • what parameters define an object
  • what the data types are
  • what their defaults should be
  • whether None is an allowed value (i.e. whether the default can be computed)
  • which parameters are redundant and shouldn't be specified together
  • (optional) what range of values is allowed for each parameter

If we don't use *args and **kwds in our constructors, we could extract some of that from the __init__ method signature with the inspect module, but that's clunky, and it doesn't give us all the information. And I think we do want to use **kwds (as we have been doing).

Alternatively, if we can find a way to make the configs work by just using the constructor, then we don't need to extract that information - but that's not what we've done so far, and I don't think we'd be happy with the config interface we get if we go that route.

Anyhow, I will definitely start from the #142 branch. I'll also take another look to see if I can get #148 a little further so as to provide a little more motivation by example; there's a bit of a chicken-and-egg problem between the two issues.

@rmjarvis
Copy link
Member

And there are things like flux that aren't at all ambiguous are are very useful to be able to get from an object

getFlux is available for all GSObjects.

And I think we do want to use **kwds (as we have been doing).

The only one is Convolve. If that's the sticking point, it wouldn't be hard to switch that to explicit keywords in the init rather than parsing several options. The only reason I did it that way is for the syntactic sugar of allowing both Convolve([psf,gal]) and Convolve(psf,gal). We can easily remove that latter functionality, which would let us use a normal init definition.

@TallJimbo
Copy link
Member Author

Thanks for the clarification, and sorry I maligned the current state of the code!

@barnabytprowe
Copy link
Member

I think that some of the information that the inspect module doesn't give us, and which we want to be available to the config module at least, are the types of parameters - currently we have required, size and optional, all of which may vary in length. Parameter based GSObject classes would allow this information to be accessible to the config parser by introspection as Jim says, without needing the special functions of Mike's original config implementation, or the imperfect object_param_dict currently being used.

I would see this as the primary benefit of this work - allowing a GSObject to know exactly what it stores and thus what it needs to be constructed from heterogeneous user input.

As a secondary point, I do also think it would be nice to continue to return sizes and ellipticities from even distorted objects. Obviously, all bets are off once you get to Adds and Convolves, but if an object is merely sheared and/or dilated we can simply adopt a convention (I always think that r^2 -> a*b for semi major and minor axes a & b makes sense) and stick to it... The actual implementation would be fairly easy, and it would be undeniably nice...

@rmjarvis
Copy link
Member

(I always think that r^2 -> a*b for semi major and minor axes a & b makes sense)

And I prefer r^2 = (a^2 + b^2)/2.

Which gets immediately to the problem. If we have some kind of getSize() that everything implements in some way or another, users will assume it means something in particular, and many of them will assume wrong. I think we should let each class implement the things that actually do make sense for it and not try to shoehorn in things that are imprecise at best and sometimes even nonsensical.

@rmjarvis
Copy link
Member

On the first point, I'll take your and Jim's word for it that this will be useful. So I'm remaining agnostic until I see how this affects the implementation of the config stuff. I don't have any particular aversion to it. I just don't yet see how it will be helpful.

@barnabytprowe
Copy link
Member

Taking on this task in light of the discussion and design review telecon held for #148. Will update with changes ASAP, I fully expect to get this wrong the first one (or two) tries!

barnabytprowe added a commit that referenced this issue Jul 18, 2012
… more directed questions. Think so. Will email to set up a chat.

(#195)
barnabytprowe added a commit that referenced this issue Jul 18, 2012
barnabytprowe added a commit that referenced this issue Jul 19, 2012
barnabytprowe added a commit that referenced this issue Jul 19, 2012
barnabytprowe added a commit that referenced this issue Jul 20, 2012
barnabytprowe added a commit that referenced this issue Jul 20, 2012
@barnabytprowe
Copy link
Member

Hi all,

Have spent a bit of time working on this revision of the GSObject internals, and have put a trial implementation of only the Gaussian class into a new (temporary) file called galsim/sandbox.py.

Would really like to get some people's opinions on this (anyone welcome, but I'm picking out @rmjarvis , @joezuntz , @pmelchior , @rearmstr , @joergdietrich for special pinging!).

I found the stuff on Python descriptors very impenetrable at first, but Jim has been extremely helpful and today we had a breakthrough! Nonetheless, it's something of a big change so we would welcome your thoughts on this trial, 'sandbox' implementation.

Main points:

  1. We wanted to make the GSObject derived classes (e.g. Gaussian or Exponential) know what parameters they need to initialize themselves, so that user input in config files or catalogues can be parsed. Moving this info into the classes themselves is desirable so that there is no longer any need to keep a separate record (e.g. the object_param_dict in base.py, which is our old workaround).
  2. One way to do this is to make these parameters attributes of the Class, with values that get set when an instance is initialized. Then the config routines can cycle through these attributes and find which ones it needs from the user config files.
  3. However, we want these attributes to have some customized behaviour, and so Jim suggested we make them descriptor instances. A new Gaussian1 reimplementation of the Gaussian can be seen in sandbox.py. This also requires two descriptor classes SimpleParam and GetSetParam, and some minor mods to the way the GSObject stores its SBProfile.

So much for the motivation, and broad sweep of what's been tried here. Some of the new features of this implementation for the Gaussian1 :

a) The sigma, half_light_radius, fwhm and flux are all now attributes of a Gaussian1 instance. These keep themselves updated, for example...

In [6]: g = galsim.sandbox.Gaussian1(half_light_radius=1., flux=2.)

In [7]: g.sigma
Out[7]: 1.1774100225154747

In [8]: g.sigma=3

In [9]: g.half_light_radius
Out[9]: 2.547965400864057

(N.B. In the current example implementation I have removed the getSigma, setSigma etc. methods for clarity, but I will re-instate these for checking purposes if completing this revamp. These do the same calculations by different paths.)

b) Not only this, but the C++ SBProfile instance for which the GSObject classes are a glorified container is also updated for any change in the parameter attributes. This is achieved via the freedom to customize the way the descriptors behave: g.sigma = 3 above not only changes the underlying size for the Gaussian1, but it also sets the stored SBProfile to None. When this SBProfile is eventually needed, it is re-initialized using the most up-to-date set of attributes.

This is not completed by a long way, and there are some things that will need to be streamlined/added for handling the whole set of GSObjects: many of them have things in common like being defined by a radial profile, and so a shared intermediate base class would avoid much duplication. We would also probably exampled the descriptors SimpleParam, GetSetParam to store more info of use to users and the config routines, and move these to a separate file descriptors.py.

However, the basic pattern is outlined here.

Questions:

Does this scheme seem comprehensible? (It took me a while to grasp it)

If it seems a little more complex, do we think this is worth it to be able to have all the GSObjects store all the information needed to build themselves in one place?

Is this going to be an overly high bar to future expansion by possible third parties?

My personal feeling is that it is worth it, but smoothing it all out will be a bit of work. Yet, I would love it if the GSObject classes and class instances carried around all the info they need...

@rmjarvis
Copy link
Member

This is a little bit unwieldy, especially since many classes are going to have almost identical code for their set of multiple size types (that takes up most of the code in Gaussian1), so it would be nice if there was an intermediate class that did all this work for the types that had multiple sizes. Or maybe a function that can set this up automatically? For example, I think it would be possible in python to have the Gaussian1 class look something like the following:

class Gaussian1(GSObject):
    half_light_radius = SimpleParam(
        "half_light_radius", default=None,
        doc="Half light radius, kept consistent with the other size attributes.")

    flux = SimpleParam("flux", default=1., doc="flux of this object")

    def _SBInitialize(self):
        GSObject.__init__(self, galsim.SBGaussian(half_light_radius=self.half_light_radius,
                                                  flux=self.flux))

    def __init__(self, half_light_radius=None, sigma=None, fwhm=None, flux=1.):
        checkSizes(half_light_radius, sigma, fwhm)
        self.flux = flux
        self._SBInitialize()

AddAlternateSize(Gaussian1, 'fwhm', 2.)  # fwhm = 2 * half_light_radius
AddAlternateSize(Gaussian1, 'sigma', 1.1774100225154747)   # sigma = sqrt[2ln(2)] * half_light_radius

Then checkSizes would do the checks that exactly one size is set. And AddAlternateSize would add the setters and getters, etc. to the class specified for the extra size name. This would make the upkeep requirements simpler (the code to do all this stuff only exists once), and it lowers the bar for adding new profile classes.

The only other comment I have is that we might not want flux to be a SimpleParam. We don't need to recreate the SBProfile if the user changes the flux. We can just setFlux on the one that is already there, which wraps the existing profile with a flux scaling. Most of the SBProfiles have fairly trivial constructors, so it's probably not too inefficient to recreate them most of the time. But a few of them aren't. (For example, SBMoffat currently calculates its fourier transform on construction. We might want to move this to an Info class like Sersic and Airy do, but that's not how it works right now.) I'm not sure whether the additional complication of making the flux work like this is worth it. For now you should probably leave it as a SimpleParam and have the update to something more sophisticated be a future issue.

@rmandelb
Copy link
Member

Just to be clear, that formalism you have above works for all the simple parametric profiles. We'll have to do something different for AtmosphericPSF, OpticalPSF, RealGalaxy, Add, Convolve. I know that we discussed these briefly (or at least, we discussed the ones that contain an SBInterpolatedImage, maybe not Add or Convolve). Can you give a summary here for how those would work, in case it's useful for everyone in deciding how they like the new proposed structure for our base classes?

@TallJimbo
Copy link
Member Author

Mike has brought up some issues on the #220 that suggest we should hold off on making descriptors for some of the geometric parameters of objects. We'll need to come up with some other way of declaring all the possible geometric transforms and their parameters, in a way that also allows us to specify the order of operations.

We may want to start by looking at all our potential parameters, and categorizing them all as "apply only" (e.g. shears), "get/set only" (e.g. Sersic index), or "apply + get/set" (e.g. flux). And see if there's anything that doesn't fit into one of those categories.

@rmandelb
Copy link
Member

I wonder if there is value to another discussion now that we have an explicit code example to talk about? (Jim, I realize you won't be available next week.) There's been a lot of GitHub activity that might be more easily addressed by talking.

@TallJimbo
Copy link
Member Author

A discussion might be valuable for a discussion to evaluate the complexity/maintainability of what Barney has started with (and I wouldn't need to be there for that, I think).

But I think we need to come up with some example code for how to handle the apply-based parameters in configs before it's worthwhile to discuss those live.

@barnabytprowe
Copy link
Member

Hi all,

Apologies for the radio silence, got home yesterday from Pittsburgh and the internet at home is not functioning. So I'll be quick, and I'll work offline on expanding the examples taking into account the suggestions you've made (e.g. doing some other classes as per Rachel, giving the flux a special status since it doesn't necessitate rebuilding the whole SBProfile as per Mike).

However, the purpose here, since only on Thursday did I actually get this working at all, was to illustrate the new approach on our simplest class and generate informed discussion: this seems to have worked.

A quick direct response:

This is a little bit unwieldy, especially since many classes are going to have almost identical code for their set of multiple size types (that takes up most of the code in Gaussian1), so it would be nice if there was an intermediate class that did all this work for the types that had multiple sizes.

This is not completed by a long way, and there are some things that will need to be streamlined/added for handling the whole set of GSObjects: many of them have things in common like being defined by a radial profile, and so a shared intermediate base class would avoid much duplication.

I think I prefer the intermediate base class to the function idea, as it would perhaps help us to group the classes into their more conceptually distinct types... But I'll try both. At any rate, there will be some cases (e.g. HLR for a truncated Moffat) where a simple multiplication-by-a-constant is not sufficient, so there will need to be some mechanism to allow each GSObject to define a function if it's not super simple.

Finally, I also think I like the suggestion on #220 to make shears "apply/get". We can shear all objects, so the config machinery really doesn't need to know that it has to set these shear parameters: it can assume that it is asked to shear things then it can. That's my 2p anyhow, I'll either be back in touch tomorrow (if our internet is fixed/I go into Caltech) or on Monday with an update. After that, we can think about another chat on the phone...

barnabytprowe added a commit that referenced this issue Aug 12, 2012
barnabytprowe added a commit that referenced this issue Aug 12, 2012
…g group= that I noticed and really should not have got wrong... :S

(#195)
rmandelb pushed a commit that referenced this issue Aug 17, 2012
rmandelb pushed a commit that referenced this issue Aug 17, 2012
barnabytprowe added a commit that referenced this issue Aug 20, 2012
barnabytprowe added a commit that referenced this issue Aug 20, 2012
barnabytprowe added a commit that referenced this issue Aug 20, 2012
barnabytprowe added a commit that referenced this issue Aug 20, 2012
barnabytprowe added a commit that referenced this issue Aug 21, 2012
barnabytprowe added a commit that referenced this issue Aug 21, 2012
barnabytprowe added a commit that referenced this issue Aug 22, 2012
…ns to within _setup_data_store where they can be done as instance variables. Also corrected an error in copy which needs to catch extra such variables in special cases too.

(#195)
barnabytprowe added a commit that referenced this issue Aug 22, 2012
…in _get_sbprofile raising an informative Exception.

(#195)
barnabytprowe added a commit that referenced this issue Aug 22, 2012
@barnabytprowe
Copy link
Member

This is now being closed in light of other config work on #234, #148 and #241.

@rmjarvis rmjarvis added base classes Related to GSObject, Image, or other base GalSim classes and removed core labels Nov 21, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
base classes Related to GSObject, Image, or other base GalSim classes
Projects
None yet
Development

No branches or pull requests

4 participants