Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spectrum and Morphology as Factor #191

Merged
merged 41 commits into from
Aug 7, 2020
Merged

Spectrum and Morphology as Factor #191

merged 41 commits into from
Aug 7, 2020

Conversation

pmelchior
Copy link
Owner

@pmelchior pmelchior commented Jul 15, 2020

This PR refactors (sic!) FactorizedComponent, which was defined for non-parametric models only. The new API creates a mostly abstract Factor class and derives Spectrum and Morphology from it. Existing code has been moved into classes such as ImageMorphology and StarletMorphology. Each derived class needs to implement get_model() anyway it wants, and set its bbox to determine how the model fits into the model_frame.

This allows for reusable implementations, e.g. combining a custom black-body spectrum with an existing point-source morphology, or a non-parametric spectrum with a starlet-morphology, etc.

In the process, the properties Component.morph and Component.sed got renamed into spectrum and morphology, and FunctionalComponent was deprecated because FactorizedComponent can express functional models for its factors.

Other changes are minor bugfixes, in particular in the initialization, plotting, and docs. In addition, Box gained the ability to make outer products with the @ operator and slices.

@pmelchior
Copy link
Owner Author

One thing we should implement soon is a scarlet recipe repo, where source and factor implementation can be shared. I don't know how exactly but it would be helpful to provide a space for model builders

@herjy
Copy link
Collaborator

herjy commented Jul 15, 2020

Can you explain the rationale behind the switch from sed to spectrum? I'm, a priori, not a huge fan. At the moment, we still have seds. Spectrum is an abuse of language.

@pmelchior
Copy link
Owner Author

au contraire! We never had SEDs, we have SFDs: we count flux, not energy, per wavelength bin. So, it was abuse of language thus far (admittedly an abuse that is common in the astro community), but the fact that we mostly use a few broadband filters is secondary.

@herjy
Copy link
Collaborator

herjy commented Jul 15, 2020

True, but I'm worried that people who are used to working with high resolution spectra will expect then expect scarlet to get full spectra (which we could given the right models, but by default, we only get the band integrated flux)

@pmelchior
Copy link
Owner Author

But that's a matter of quantity, not of quality.

A different proposal: instead of renaming the outward-facing properties .morph and .sed, I suggest to deprecated them.

  1. They only exist for FactorizedComponent, and are thus not general. But more importantly:
  2. They are somewhat misleading because the normalization of the .sed depends on the normalization of the morphology factor (and vice versa).

Because we cannot know how anyone might implement any of the factors, we actually recommend to use the measure methods. They are consistent for all components and component trees because they see the full hyperspectral model. The difference matters already now for MultiComponentSource, which doesn't have .sed but a well-defined per-band flux.

Experienced users can still read off the properties of the factors from their internal structures, but we probably should advertise these once-useful shorthands any longer.

Thumbs up/down, please

@herjy
Copy link
Collaborator

herjy commented Jul 16, 2020

Why do I get an error when I run display notebook:

AttributeError                            Traceback (most recent call last)
<ipython-input-5-61f207fe59e1> in <module>
      2 
      3 fp = open("../hsc_cosmos_35.sca", "rb")
----> 4 sources = pickle.load(fp)
      5 fp.close()
      6 

AttributeError: 'PointSource' object has no attribute '_psf_wrapper'

Copy link
Collaborator

@herjy herjy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly great new stuffs!
I think a couple of things can be changed for consistency. I didn't find anything major. I'm just mildly annoyed by having to create morphology and source objects to make a new source.

docs/1-concepts.ipynb Outdated Show resolved Hide resolved
docs/tutorials/multiresolution.ipynb Show resolved Hide resolved
docs/tutorials/multiresolution.ipynb Outdated Show resolved Hide resolved
docs/tutorials/wavelet_model.ipynb Outdated Show resolved Hide resolved
scarlet/bbox.py Show resolved Hide resolved
scarlet/initialization.py Show resolved Hide resolved
scarlet/source.py Outdated Show resolved Hide resolved
scarlet/source.py Show resolved Hide resolved
scarlet/source.py Show resolved Hide resolved
scarlet/source.py Show resolved Hide resolved
@pmelchior
Copy link
Owner Author

Why do I get an error when I run display notebook:

AttributeError                            Traceback (most recent call last)
<ipython-input-5-61f207fe59e1> in <module>
      2 
      3 fp = open("../hsc_cosmos_35.sca", "rb")
----> 4 sources = pickle.load(fp)
      5 fp.close()
      6 

AttributeError: 'PointSource' object has no attribute '_psf_wrapper'

Probably because you ran that notebook before quickstart, which recreates the sca file with the new structure.

@herjy
Copy link
Collaborator

herjy commented Jul 17, 2020

Indeed, it works now, but I would have expected open to tell me that the file isn't there.

@pmelchior
Copy link
Owner Author

The file was there from an earlier run of quickstart, but it was outdated.

@herjy
Copy link
Collaborator

herjy commented Jul 17, 2020

The file was there from an earlier run of quickstart, but it was outdated.

Then we should make sure that this deprecated file is not in the repo or save the new version.

scarlet/source.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@fred3m fred3m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've looked at everything except the docs, which I'll do now but I thought that I'd give you my comments first so that you can start to look at them. I like this version of the code a lot and I think it will be much more future proof. Most of my corrections are minor things, like undoing some of the improper imports that you implemented in this PR (see https://www.python.org/dev/peps/pep-0008/#imports).

Also, please remember to update your docstrings. Most of the new classes/methods are missing docstrings and many of the old ones are outdated.

scarlet/constraint.py Outdated Show resolved Hide resolved
def parameters(self):
"""List of parameters, including from the children.
"""
return self._parameters + tuple(p for c in self.children for p in c.parameters)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on our discussion offline, I wonder if it's better to have two internal attributes, _parameters and _all_parameters, where _all_parameters is built on initialization and updated if the user sets parameters.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's necessary any more. The list of parameters is always complete (incl. from the children), so there's no ambiguity.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't mean in terms of ambiguity, I meant because the list of parameters should rarely change, so it would be faster to cache it on init as opposed to adding tuples together every time parameters is called.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like caching the result of an operation that takes very little time

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but we call self.parameters enough that I feel like it might add up. The only time I don't like caching things like this is when we expect something to change regularly, since we would spend more time caching than we would retrieving it. In this case we are unlikely to ever modify the pointer to _parameters for a model or its children, so a simple change doesn't cost us anything.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I timed the call in the quickstart notebook with 9 sources. It takes 60 µs, comparable to a single application of monotonicity. I think we are OK.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, I withdraw my objection.

scarlet/model.py Outdated Show resolved Hide resolved
scarlet/component.py Outdated Show resolved Hide resolved
scarlet/component.py Show resolved Hide resolved
scarlet/display.py Outdated Show resolved Hide resolved
tests/test_component.py Outdated Show resolved Hide resolved
scarlet/component.py Outdated Show resolved Hide resolved
scarlet/component.py Show resolved Hide resolved
scarlet/source.py Show resolved Hide resolved
docs/1-concepts.ipynb Outdated Show resolved Hide resolved
Copy link
Collaborator

@fred3m fred3m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs look good. I would recommend restructuring the "Core Concepts" to explain the data structures in the order Parameter(Constraints, priors, etc)->Model->Component->Source->Blend, as I think that it makes a more natural introduction. But this can be made an issue and done in another ticket (but one that gets merged soon, as I think it will help clarify to users how we intend them to use scarlet. We should also have a tutorial on creating custom sources to illustrate best practices.

@pmelchior
Copy link
Owner Author

Tutorial is #198

@pmelchior
Copy link
Owner Author

I don't know why I can't respond to this comment inline, but so be it: #191 (comment)

I think we are dealing with it now without needing the warning. If someone initializes a sources at the wrong location, they will get nothing, but this should not lead to a crash. With the new non-zero PositivityConstraint we should be fine.

scarlet/initialization.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@herjy herjy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm good with the changes, there are a few conversations that need resolving. I'm particularly attached to this one:
#191 (comment)
But I can address it when pushing a better wavelet tutorial.

@pmelchior pmelchior mentioned this pull request Aug 5, 2020
@pmelchior
Copy link
Owner Author

Thanks, @herjy. As for the comment you're referring to, I made the method more concise in #202, but made sure that your use for starlet sources is still fully supported.

scarlet/component.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@fred3m fred3m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few minor cleanups and unresolved issues to address. Get those resolved and merge #201 and #202 into this branch and this should be OK to merge.

@pmelchior pmelchior merged commit 7e60932 into master Aug 7, 2020
@pmelchior pmelchior deleted the factored branch August 7, 2020 17:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants