Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reworking astropy model core code #7945

Closed
wants to merge 69 commits into from
Closed

Reworking astropy model core code #7945

wants to merge 69 commits into from

Conversation

perrygreenfield
Copy link
Member

Since travis is not currently picking up the latest version, I'm submitting this before knowing what issue travis will raise. The compound-models.rst file doesn't pass doctests currently, but it needs significant editing.

The following describes major changes to the internals of modeling I’ve made over the past year.

Rationale for major changes to modeling:

  1. [This may be more detail than some wish to see] Much of the original rationale was driven by improving the performance. At the time it seemed that compound models were using enormous amounts of memory for complex models (we are talking on the order of 90 component models and approximately 280 parameters). It appeared that parameter objects were being propagated at every node of the expression tree leading to a huge number of parameter object instances. But in doing the work, I don’t actually see how that happened since the previous implementation does go to pains to avoid such parameter creation. It may be that ASDF somehow triggered that regardless, or some other fix made since then has rectified the previous memory issue. Even so, the changes still appeared worthwhile for the reasons listed below.

  2. Removing use of the metaclass from compound models. It is not believed that generating compound classes through expressions is useful enough to justify the complexity introduced by yet another metaclass (the current implementation has the CompoundModel use a metaclass and is also a subclass of Model, which has its own metaclass as well. Besides the complexity of this approach, there are very unexpected results when mixing classes and instances within an expression. The new implementation is more straightforward in approach, though not necessarily shorter in lines of code (though probably as well since calls to other libraries implementing tree handling and such are now eliminated). Essentially, the primary approach is to make CompoundModels intrinsically tree structures, i.e., they are themselves nodes in the expression tree, rather than resulting in a tree attribute that mirrors the implied tree structure.

  3. Changing the semantics of parameters so that parameters are now not shared objects between models of the same class. This has many benefits since the previous behavior led to surprising results (e.g., corresponding parameters in compound models were disconnected from the value contained in the constituent models; change one and the other would not see the change). Making this change makes it simpler to clone models and change aspects of the parameters; something that is quite difficult to do now without resorting to interface violations or tedious construction of the arguments to the class constructor. Making changes to this effectively required a major reimplementation of compound models so these changes were coupled rather than implemented separately.

Changes in Behavior or Interface

This version of modeling has made substantial internal changes to the implementation of modeling and their associated parameters. Some of these changes impact previous behavior. This will summarize the most important changes.

  1. Compound model classes are no longer supported. In previous versions it was possible to create a new model class (not instance!) by doing something like this:

from astropy.modeling.models import Gaussian1D, Const1D
NewCompoundClass = Gaussian1d + Const1D

That no longer is possible. It was removed to make the implementation simpler, as well as eliminate a number of odd results (particularly when combining classes with instances). As far as we know, there are not many uses of this capability, and most such uses have alternatives.

  1. Parameters now store their values within the parameter instance. Previously, all parameter values were stored with the model the parameters were defined with. This leads to differences in behavior. In the past, if one defined a compound model, the parameter values were copied to the compound model instance. If the parameter value was changed in the compound model, that would not be reflected in the parameter value in constituent model that made up part of the compound model. In the current implementation the compound model uses the parameter values within the constituent model so changes to that parameter affect both. With the new implementation, it is possible for multiple models to share the same parameter and thus all be affected by a change in its value. Implementing this change while retaining the syntactic convenience of Descriptors requires some trickery. This was implemented by retaining the definition of parameters at the class level. But now when a model is instantiated, copies of the parameter instances are made and referenced at the model instance level; the existence of these parameter instances in the model’s instance dict effectively shields the class level instances (so long as the class level instance is not a descriptor).

  2. When n_models > 1, previously the shape of the corresponding parameters removed the axis corresponding to the model_set_axis. Now the model_set_axis is part of the parameter shape. To illustrate:

Old:

In [1]: g = Gaussian1D([1,1], [1,2], [2,4], n_models=2)

In [2]: g.amplitude
Out[2]: Parameter('amplitude', value=[1. 1.])

In [3]: g.amplitude.shape
Out[3]: ()

New:

In [1]: g = Gaussian1D([1,1], [1,2], [2,4], n_models=2)

In [2]: g.amplitude
Out[2]: Parameter('amplitude', value=[1. 1.])

In [3]: g.amplitude.shape
Out[3]: (2,)

  1. The implementation of the parameter validator method has changed though the behavior for most cases of use shouldn’t. But note, the current method stands double duty as a decorator and a method depending on the context. In the previous version the context was provided by whether the parameter had a link to a model. If not, it was interpreted as a decorator (it’s complicated; even though the use of the decorator is within a model, at the time the decorator is used, there is no link to the model—no, you don’t want to know the details). Since parameters no longer have links to the model, the method uses the type of the argument to distinguish the context. Callables imply decorator context; numbers imply method context. I recommend the method context be deprecated. A new method called “validate” is now available and recommended for that purpose.

  2. Generally we expect the new implementation is faster, particularly for more complex models. If you find cases where it is significantly slower, please file an issue.

Other changes involve methods of compound model classes and parameter classes, mostly internal ones (i.e., ones that start with an underscore). The new compound model class no longer uses _CompoundModelMeta.

perrygreenfield and others added 30 commits October 18, 2018 16:43
@nden
Copy link
Contributor

nden commented Jan 17, 2019

@perrygreenfield Can you rebase this?

perrygreenfield and others added 8 commits January 17, 2019 13:01
Avoid importing astropy.stats for astropy.modeling

(cherry picked from commit d5a6ac1)
fix np dev failure

(cherry picked from commit 38130bb)
Remove arguments for super

(cherry picked from commit edb7238)
fix serialization of bounding_box with quantities

(cherry picked from commit d78b3cb)
Fix ``Moffat1D`` and ``Moffat2D`` derivatives. [#8094]

(cherry picked from commit 98771ac)
@wkerzendorf
Copy link
Member

@perrygreenfield - as discussed on slack. We are very dependent on this PR as our models contain large arrays and the current implementation seems to deepcopy the model when combining them. Thanks for fixing this. We might give the PR a try (as a beta testers 😉 ) once you think it's in a reasonable state.

@perrygreenfield
Copy link
Member Author

@wkerzendorf Yes, I think it currently is worth trying. Let me know if there are any problems. A number of private methods and attributes may have changed, and creation of compound classes no longer is supported (compound instances still work!)

@pllim
Copy link
Member

pllim commented Mar 19, 2019

@perrygreenfield , if you want people to take this for a spin, it is best to rebase to get rid of the conflicts with master.

@@ -297,17 +297,18 @@ def _separable(transform):
transform : `astropy.modeling.Model`
A transform (usually a compound model).

Returns
-------
Returns :
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original syntax is actually the correct one for NumpyDoc to pick it up.

@astrofrog
Copy link
Member

@perrygreenfield - I'm happy to try and review this for v3.2, but as @pllim said this should be rebased first. Hopefully the conflicts aren't too difficult to resolve!

@nden
Copy link
Contributor

nden commented Apr 12, 2019

@astrofrog @pllim This is going to go in 4.0. I don't think it's possible to rebase or resolve conflicts. In some aspect the code is completely different. The approach taken so far is to just copy the code in modeling. Changes that need to be included have been tracked and copied over.

@astrofrog
Copy link
Member

@nden - 4.0 will be an LTS, so we should make sure we are completely happy with the updated behavior/API if it will be going directly into 4.0. For that reason, I'd recommend that even if this can't make it into 3.2, it's included as soon as possible into master so that some of us working with the latest developer version have a chance to use it for a little while before the 4.0 release (in other word s, this shouldn't be merged during the 4.0 feature freeze)

@nden
Copy link
Contributor

nden commented Apr 12, 2019

@astrofrog The decision taken at the Astropy coordination meeting in Tucson was to include it in 4.0 and merge it with master immediately after the 3.2 release. The reason was some were concerned there's no deprecation period for disappearing features like the ability to combine classes. I wouldn't have a problem putting it in 3.2 if I knew your concerns ahead of time. However, I think it's too late for 3.2 now.

@nden
Copy link
Contributor

nden commented May 3, 2019

@astrofrog Are you still willing to review this PR? I am going to do one final review and would like to merge it as soon as possible so we can work out possible issues. Please let me know if you will have time to review it.

@astrofrog astrofrog self-requested a review May 3, 2019 14:44
@@ -109,6 +109,11 @@ astropy.io.votable
astropy.modeling
^^^^^^^^^^^^^^^^

- Major rework of modeling internals. See release notes for details. Eliminates
support for compound classes (but not compound instances!)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This entry needs to be moved under 4.0 (API changes). Also add the PR number.

Since many old entries show as new it's best to modify the latest version of the file in master.

from .basic import TransformType, ConstantType
from ......modeling.core import Model, CompoundModel
from ......modeling.models import Identity, Mapping
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make these absolute imports like on master.

from .basic import TransformType

from ......tests.helper import assert_quantity_allclose

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make all imports absolute except the ones in this subpackage.

ExpressionTree, AliasDict, get_inputs_and_params,
_BoundingBox, _combine_equivalency_dict)
from ..nddata.utils import add_array, extract_array
#from . import compound
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this


from ..utils import indent
from .utils import combine_labels, _BoundingBox
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make all imports from other subpackages absolute. You can copy from current master - github is not giving correct diff results.

"""

self.model.parameters[:] = np.array([3, 4, 5, 6, 7])
assert (self.model.parameters == [3., 4., 5., 6., 7.]).all()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should probably be mentioned in the change log

def test_parameters_wrong_shape(self):
sh1 = models.Shift(2)
with pytest.raises(InputParameterError):
sh1.offset = [3, 3]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add back the test?

@@ -378,6 +378,9 @@ def test_2d_orthopolynomial_in_compound_model():

fitter = fitting.LevMarLSQFitter() # re-init to compare like with like
compound_model = Identity(2) | Chebyshev2D(2, 2)
#compound_model.map_parameters()
compound_model.fittable = True
compound_model.linear = True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean to assert this? Those attributes appear to be set correctly now.

@@ -42,7 +42,7 @@ def test_evaluate_with_quantities():
# error is raised
with pytest.raises(UnitsError) as exc:
gq(1)
assert exc.value.args[0] == ("Gaussian1D: Units of input 'x', (dimensionless), could not be "
assert exc.value.args[0] == ("Units of input 'x', (dimensionless), could not be "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is realted to the UnitsError change in core.py. If the UnitsError is restored then this test should pass as is.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True for the rest of the differences in this module.

Looks for ordered descriptors in a class definition and
copies such definitions in two new class attributes,
one being a dictionary of these objects keyed by their
attribute names, and the other a simple list of those names.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks out of date.

@astrofrog
Copy link
Member

I can review this shortly, but it might make sense to rebase it first, so I can also play around with it?

@nden
Copy link
Contributor

nden commented May 20, 2019

@astrofrog A rebased version of this PR is coming in the next few days.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet