
Add Datasets class #2030

Merged
merged 10 commits into gammapy:master on Feb 15, 2019

adonath (Member) commented Feb 14, 2019:

This PR introduces the Datasets object to enable combining multiple datasets for fitting. There is one important API change involved: the covariance now has to be accessed via result.parameters.covariance.
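To illustrate the API change, here is a minimal sketch of the new access pattern. The class names below are simplified stand-ins for illustration, not the real gammapy implementation:

```python
import numpy as np

# Simplified stand-ins: the fit result no longer carries a reference
# to the model; the covariance lives on the joined parameter list.
class Parameters:
    """Joined parameter list across all datasets (illustrative)."""
    def __init__(self, names):
        self.names = names
        self.covariance = None  # set by the optimizer after the fit

class FitResult:
    """Result object returned by the optimizer (illustrative)."""
    def __init__(self, parameters):
        self.parameters = parameters

# After a (mock) fit, the optimizer attaches the covariance matrix
# to the joined parameter list:
pars = Parameters(["index", "amplitude"])
pars.covariance = np.eye(2)
result = FitResult(pars)

# New access pattern introduced by this PR:
covariance = result.parameters.covariance
print(covariance.shape)  # (2, 2)
```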

@adonath adonath self-assigned this Feb 14, 2019

@adonath adonath added the feature label Feb 14, 2019

@adonath adonath added this to the 0.11 milestone Feb 14, 2019

@adonath adonath requested review from AtreyeeS and registerrier Feb 14, 2019

AtreyeeS (Member) left a comment

This looks really good @adonath! Will you address the issue of unused parameters in likelihood() in a follow-up PR?

return self.datasets[0].model

@lazyproperty
def is_all_same_type(self):

AtreyeeS (Member) commented Feb 14, 2019:

Since you are providing this option, maybe it would also be helpful to have a function that returns the types of the datasets present?

adonath (Author, Member) replied Feb 14, 2019:

Good suggestion, I'll add such a property.
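A sketch of what such a property could look like, alongside the is_all_same_type check discussed above (class and property names here are illustrative, not necessarily what was merged):

```python
class Datasets:
    """Simplified container sketch holding a list of datasets."""
    def __init__(self, datasets):
        self.datasets = datasets

    @property
    def types(self):
        """Class names of the contained datasets."""
        return [type(dataset).__name__ for dataset in self.datasets]

    @property
    def is_all_same_type(self):
        """Whether all contained datasets share the same type."""
        return len(set(self.types)) == 1

# Dummy dataset classes standing in for MapDataset, SpectrumDataset, etc.
class MapDataset: pass
class SpectrumDataset: pass

datasets = Datasets([MapDataset(), SpectrumDataset()])
print(datasets.types)             # ['MapDataset', 'SpectrumDataset']
print(datasets.is_all_same_type)  # False
```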


@property
def model(self):
return self.datasets[0].model

AtreyeeS (Member) commented Feb 14, 2019:
I am a bit confused. The same model need not reside on all datasets, right? So what is the utility of this function?

adonath (Author, Member) replied Feb 14, 2019:

I decided to remove this again and break backwards compatibility. The result object returned by Fit.optimize() no longer contains a reference to the model, but instead to the joined parameter list. I think this is a justified change, because there is not necessarily a single model across many datasets (e.g. if there is a background model per dataset).

registerrier (Contributor) left a comment

Thanks @adonath . This looks good.

The global mask might be a problem if we don't have a simple way to test that the datasets have the same dimensions.
I suppose also that a global model must be used, or we need to introduce at some point a list of models or a Models object.

return self.datasets[0].model

@lazyproperty
def is_all_same_type(self):

registerrier (Contributor) commented Feb 14, 2019:

If the mask is to be applied to likelihood_per_bin, then it is not enough to test the type of each member of the list; you also need to test their dimensions, no?

adonath (Author, Member) replied Feb 14, 2019:

Thanks, I modified the check accordingly.
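The stricter check could look roughly like this, assuming each dataset exposes a data_shape attribute (the property name is_all_same_shape is illustrative, not necessarily the merged name):

```python
class Datasets:
    """Simplified container sketch for the dimension check."""
    def __init__(self, datasets):
        self.datasets = datasets

    @property
    def is_all_same_shape(self):
        """Whether all datasets have the same data shape, so that a
        single global mask can be applied to likelihood_per_bin."""
        shapes = [dataset.data_shape for dataset in self.datasets]
        return all(shape == shapes[0] for shape in shapes)

# Fake dataset standing in for MapDataset, SpectrumDataset, etc.
class FakeDataset:
    def __init__(self, shape):
        self.data_shape = shape

same = Datasets([FakeDataset((10, 10)), FakeDataset((10, 10))])
mixed = Datasets([FakeDataset((10, 10)), FakeDataset((5,))])
print(same.is_all_same_shape)   # True
print(mixed.is_all_same_shape)  # False
```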


@property
def model(self):
return self.datasets[0].model

registerrier (Contributor) commented Feb 14, 2019:

Since the model is not attached to the list of datasets but to each Dataset in particular, why would you return the first of the list?
OK, I understand, this is for compatibility reasons.

adonath (Author, Member) replied Feb 14, 2019:

See answer to @AtreyeeS above.

@adonath adonath dismissed stale reviews from registerrier and AtreyeeS via 07658c1 Feb 14, 2019

adonath (Author, Member) commented Feb 15, 2019:

@registerrier @AtreyeeS Any further comments? If not, I'd like to go ahead and merge...

registerrier (Contributor) left a comment

Some comments, but they can probably be addressed later when the various dataset classes are introduced.

@property
def data_shape(self):
"""Shape of the counts data"""
return self.counts.data.shape

registerrier (Contributor) commented Feb 15, 2019:

I think this is fine for the moment.
Ultimately, it is probably better to compare the MapGeom because that is what needs to be identical. The issue being that a FluxPointDataset or a SpectrumDataset doesn't have a Geom.

registerrier (Contributor) commented Feb 15, 2019:

Shouldn't there be a test specifically for MapDataset testing the properties as well as the likelihood methods?

adonath (Author, Member) replied Feb 15, 2019:

Indeed I'll add a test for MapDataset...
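Such a test could look roughly like the following, written here against a toy dataset with a chi-square style likelihood. MyDataset below is a stand-in helper, not the real MapDataset; the method names follow the PR discussion:

```python
import numpy as np

class MyDataset:
    """Toy stand-in with a chi-square style fit statistic."""
    def __init__(self, data):
        self.data = np.asarray(data, dtype=float)

    @property
    def data_shape(self):
        """Shape of the data array."""
        return self.data.shape

    def likelihood_per_bin(self, model_value):
        """Per-bin fit statistic for a constant model value."""
        return (self.data - model_value) ** 2

    def likelihood(self, model_value, mask=None):
        """Total fit statistic, optionally restricted by a mask."""
        stat = self.likelihood_per_bin(model_value)
        if mask is not None:
            stat = stat[mask]
        return stat.sum()

def test_dataset_properties_and_likelihood():
    dataset = MyDataset([1.0, 2.0, 3.0])
    assert dataset.data_shape == (3,)
    # (1-2)^2 + (2-2)^2 + (3-2)^2 = 2
    assert dataset.likelihood(2.0) == 2.0
    # masking drops the middle (zero-contribution) bin
    mask = np.array([True, False, True])
    assert dataset.likelihood(2.0, mask=mask) == 2.0

test_dataset_properties_and_likelihood()
```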

@@ -1082,6 +1082,10 @@ def __init__(self, model, data, mask=None, likelihood="chi2"):
" either 'chi2' or 'chi2assym'"
)

def data_shape(self):
"""Shape of the flux points data"""
return self.data.e_ref.shape

registerrier (Contributor) commented Feb 15, 2019:

Same comment here. Should it be the shape or e_ref itself that is equal?

adonath (Author, Member) replied Feb 15, 2019:

Yes, e_ref is the reference energy for the data...

import pytest
import numpy as np
from numpy.testing import assert_allclose
from .test_fit import MyDataset

registerrier (Contributor) commented Feb 15, 2019:

Should the test rely only on a fake Dataset, or should there be specific tests for each type of dataset and also for a heterogeneous set? Maybe this is for later.

adonath (Author, Member) replied Feb 15, 2019:

I'll add a test for multiple MapDataset now, the joint fit for SpectrumDataset and FluxPointsDataset could come later...
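A joint-fit test could then check that the container sums the per-dataset statistics, roughly like this (toy classes for illustration; the real Datasets.likelihood signature may differ):

```python
import numpy as np

class MyDataset:
    """Toy dataset standing in for MapDataset in this sketch."""
    def __init__(self, data):
        self.data = np.asarray(data, dtype=float)

    def likelihood(self, model_value):
        """Chi-square style fit statistic for a constant model value."""
        return float(((self.data - model_value) ** 2).sum())

class Datasets:
    """Simplified joint container: the total fit statistic is the
    sum over all member datasets."""
    def __init__(self, datasets):
        self.datasets = datasets

    def likelihood(self, model_value):
        return sum(d.likelihood(model_value) for d in self.datasets)

d1 = MyDataset([1.0, 3.0])
d2 = MyDataset([2.0, 2.0])
joint = Datasets([d1, d2])

# the joint statistic equals the sum of the individual statistics
assert joint.likelihood(2.0) == d1.likelihood(2.0) + d2.likelihood(2.0)
print(joint.likelihood(2.0))  # 2.0
```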

@adonath adonath merged commit d696043 into gammapy:master Feb 15, 2019

3 checks passed

- Codacy/PR Quality Review: Up to standards. A positive pull request.
- Scrutinizer: Analysis — 2 new issues, 25 updated code elements. Tests: passed.
- continuous-integration/travis-ci/pr: The Travis CI build passed.