Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not scale the samples by default #284

Merged
merged 3 commits into from
Jan 28, 2021

Conversation

LDAP
Copy link
Contributor

@LDAP LDAP commented Jan 27, 2021

The __call__ method does scale the result from _compute() by default. This is unexpected and not documented. Also, it does prevent subclasses to generate samples in the input space directly.

@relf
Copy link
Member

relf commented Jan 28, 2021

I agree we should document the fact the __call__ method does a scaling and that _compute is expected to give samples in the [0., 1.] hypercube. At the moment, all sampling methods are intended to follow that pattern that we recently enforced by adding the @abstractmethod on _compute. 😉

If we imagine a sampling method not following that pattern, maybe it means that the @abstractmethod should be moved on the __call__ method which is actually the true API to respect or we just have to remove the @abstractmethod as I feel now it is just getting in the way.

@LDAP
Copy link
Contributor Author

LDAP commented Jan 28, 2021

At least I would ensure that sampling methods are able to compute samples outside the [0,1] hypercube. Also I find it confusing that subclasses should overwrite a __call__ method which does have a special meaning in python.

In any case, I would prefer the call to a utility method instead of using the template method pattern. (here: scale_to_xlimits)
If you move the @abstractmethod then the _compute method should be removed too.

@relf
Copy link
Member

relf commented Jan 28, 2021

At least I would ensure that sampling methods are able to compute samples outside the [0,1] hypercube

Not sure to understand here: do you mean we could ensure that the _compute return only samples in [0., 1.] hypercube?

I prefer to wait for an actual sampling method generating directly in the input space before duplicating scale_to_xlimits calls everywhere or moving the @abstractmethod. As for now, I find the current implementation more straightforward.

Though a bit awkward for sure, the override of __call__ with a no-op implementation for _compute is still possible for such new sampling method. And in in this case, we could reconsider extracting scale_to_xlimits.

@LDAP
Copy link
Contributor Author

LDAP commented Jan 28, 2021

At least I would ensure that sampling methods are able to compute samples outside the [0,1] hypercube

Not sure to understand here: do you mean we could ensure that the _compute return only samples in [0., 1.] hypercube?

No, I would allow a sampling method to generate samples where it wants to.

I prefer to wait for an actual sampling method generating directly in the input space before duplicating scale_to_xlimits calls everywhere or moving the @abstractmethod. As for now, I find the current implementation more straightforward.

Actually, that is why I proposed this pull request. I am currently working on a sampling method which does create samples in the input space directly.

Should I remove _compute then completely? And overwrite __call__ directly in the pr?

@relf
Copy link
Member

relf commented Jan 28, 2021

... or remove the @abstractmethod enforcement (so you can just override __call__) letting options open for the developper, make scale_to_xlimits a method, keep _compute and the template pattern method like:

def __call__(self, n):
    x = self._compute(n)
    return self._scale_to_xlimits(x)

@LDAP
Copy link
Contributor Author

LDAP commented Jan 28, 2021

... or remove the @abstractmethod enforcement (so you can just override __call__) letting options open for the developper, make scale_to_xlimits a method, keep _compute and the template pattern method like:

def __call__(self, n):
    x = self._compute(n)
    return self._scale_to_xlimits(x)

Personally I find this very dangerous: A developer then has to know that overwriting _compute does scale the samples while overwriting __call__ would be the correct option. Since the util module is also public I would suggest the developer to use scale_to_xlimits directly.

(also the doctring of _compute states "Returns: The sampling locations in the input space.")

The @abstractmethod annotation is in this case a good thing because a developer knows directly which method he must overwrite. We could add a hint to scale_to_xlimits in the docstring so that the developer does not need to rewrite existing code.

@relf
Copy link
Member

relf commented Jan 28, 2021

In any case I agree the documentation has to be updated. Still I am not fan of the scale_to_xlimits call duplication and a separate utility very specific to sampling methods.

Let's introduce an intermediate class, ScaledSamplingMethod?, which inherits from a SamplingMethod which implements __call__ with the template method pattern above and exposes the _compute (with proper docstring) to be implemented. Current sampling methods will then inherit from ScaledSamplingMethod and you can implement yours inheriting directly from SamplingMethod.

@LDAP
Copy link
Contributor Author

LDAP commented Jan 28, 2021

I do like this approach. How about UnitIntervalSamplingMethod or UnitHypercubeSamplingMethod to make clear that the samples have to be generated in the unit interval.

@relf
Copy link
Member

relf commented Jan 28, 2021

Yes I have tought about UnitIntervalSamplingMethod but then found a bit weird that __call__ returns something not in the unit interval, so I would keep ScaledSamplingMethod and rather document the class and the _compute method properly.

@LDAP
Copy link
Contributor Author

LDAP commented Jan 28, 2021

See 0581492

Copy link
Member

@relf relf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent!

@relf relf merged commit 257a464 into SMTorg:master Jan 28, 2021
@LDAP LDAP deleted the feature/refactor_sampling_method branch January 28, 2021 14:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants