Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added lower bound option to UniformHypersphere dist #799

Merged
merged 1 commit into from
May 11, 2017

Conversation

xchoo
Copy link
Member

@xchoo xchoo commented Jul 26, 2015

Added low and high options to UniformHypersphere distribution.

nengo/dists.py Outdated
@@ -156,8 +156,10 @@ class UniformHypersphere(Distribution):

"""

def __init__(self, surface=False):
def __init__(self, low=0, high=1, surface=False):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docstring needs to be updated with these parameters, along with an explanation of how they affect the distribution.

@arvoelke
Copy link
Contributor

Missing unit test.

@xchoo
Copy link
Member Author

xchoo commented Jul 26, 2015

Haha. Yeah. I realized after making the pull request that I didn't do the tests for them. 😆

@jgosmann
Copy link
Collaborator

Doesn't this make the surface parameter redundant because you could set low = high = 1.?

@arvoelke
Copy link
Contributor

Edit: Ignore my previous comment. I think you are right. 😄

For backwards-compatibility I wouldn't want to remove the surface parameter, though.

@xchoo
Copy link
Member Author

xchoo commented Jul 26, 2015

Yeah. There's some interaction between the low, high and surface parameters that I didn't consider. I will fix that too.

@tcstewar
Copy link
Contributor

Hmm, if I'm interpretting this right, setting low=0.5 and high=0.75 would lead to sample points that have a radius between 0.5^(1/D) and 0.75^(1/D) (but still uniformly distributed). Is that right? If so, it might be clearer to actually specify the radius, so that the parameters have a clearer meaning. Something like:

samples *= rng.uniform(low=self.low ** d, high=self.high ** d, size=(n, 1)) ** (1.0 / d)

That said, I'm not sure what the situation is you want these parameters for, so maybe they're more useful the way you have them -- I just thought I'd point out the possibility. :)

@xchoo
Copy link
Member Author

xchoo commented Jul 26, 2015

Oh. I think you are right. haha

@xchoo
Copy link
Member Author

xchoo commented Jul 26, 2015

Alrighty. I think I've addressed the issues listed here.

nengo/dists.py Outdated
self.high = max(high, self.low)

if surface:
self.low = self.high
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic (of the interaction between high, low, and surface) should be documented in the parameters

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is? (well, maybe not the surface one, but the rest are?)

@jgosmann
Copy link
Collaborator

I don't like, the interaction of the arguments and the possible to pass nonsensical combinations. With surface=False the vector length is < high, but with surface=True the vector length is <= high. Also, with surface=True the value of low has no effect.

Not sure if there is a good way to work around this. Maybe deprecate surface to remove it in a future version? Or remove high and allow to pass in the value to surface?

@tcstewar
Copy link
Contributor

I don't like, the interaction of the arguments and the possible to pass nonsensical combinations. With surface=False the vector length is < high, but with surface=True the vector length is <= high. Also, with surface=True the value of low has no effect.

That's my concern as well. And right now, the only documentation about this interaction is that "Values provided out of this range will be corrected", which i don't think is clear enough about what this interaction will do.

I'm still not sure what the use case is for this change, but one possibility would be to get rid of high as a parameter. In most of the use cases I can think of, high is redundant with radius, and is just a multiply anyway. So that might be sufficient.

@xchoo , what is the use case you're thinking of for this change? Would it work without high?

@xchoo
Copy link
Member Author

xchoo commented Jul 27, 2015

Yup! It would work without high. Although, if that's the case, I would move to rename UniformHypersphere to UniformUnitHypersphere.

@jgosmann
Copy link
Collaborator

But it's only unit if the radius is 1 ...

@xchoo
Copy link
Member Author

xchoo commented Jul 27, 2015

If you leave out high, the max magnitude of the distribution will be 1. In that case, if you set surface=True, the distribution will ignore the low value and put all the points on the surface of the hypersphere.

@xchoo
Copy link
Member Author

xchoo commented Jul 27, 2015

(Also, the description of the class is Distributions over an n-dimensional unit hypersphere.)

@tcstewar
Copy link
Contributor

Although, if that's the case, I would move to rename UniformHypersphere to UniformUnitHypersphere.

I would vote against that renaming, just on backwards compatibility reasons. :)

@xchoo
Copy link
Member Author

xchoo commented Jul 27, 2015

I would vote against that renaming, just on backwards compatibility reasons. :)

I guessssssssssss.........

@jgosmann
Copy link
Collaborator

You could allow a number as argument for surface (in addition to a boolean) which would then be treated as high.

@xchoo
Copy link
Member Author

xchoo commented Jul 27, 2015

But how would you indicate if you want to have the points on the surface or not? E.g., you want a hypersphere of radius 2, and all points on the surface.

@jgosmann
Copy link
Collaborator

low=2, surface=2'? (but this is somewhat inconsistent withrandintwherelow == high` is not allowed)

Maybe we should implement two distributions: One samples from the whole hypersphere (with arguments low and high) and the other one just from the surface (by internally using the first distribution from the whole hypersphere and normalizing) with one argument radius. For backward compatibility we would have to keep the current API too, but we could mark it as deprecated and remove it one (or a few more releases) later.

@tcstewar
Copy link
Contributor

Your documentation still doesn't indicate the interaction between surface and low. In particular, if surface is True, then low is ignored. Furthermore, your documentation still refers to the high parameter, which has been removed.

I will also note that, how it currently stands, setting surface to True gives exactly the same result as setting low to 1.0, and setting surface to False gives exactly the same result as setting low to 0.0. This suggests that you could also get rid of the low parameter and just have surface accept a float or a boolean. This would at least get rid of the confusion surrounding the interaction between the two variables.

@xchoo
Copy link
Member Author

xchoo commented Jul 27, 2015

This suggests that you could also get rid of the low parameter and just have surface accept a float or a boolean. This would at least get rid of the confusion surrounding the interaction between the two variables.

But naming that parameter surface doesn't make any sense. The surface parameter name has been left there for compatibility. Ideally, I'd have a high and a low and no surface parameter.

@tcstewar
Copy link
Contributor

One samples from the whole hypersphere (with arguments low and high) and the other one just from the surface (by internally using the first distribution from the whole hypersphere and normalizing) with one argument radius. For backward compatibility we would have to keep the current API too, but we could mark it as deprecated and remove it one (or a few more releases) later.

Hmm... I'd like to avoid bringing radius into this discussion, as that's an Ensemble-level parameter and I could see also sorts of potential for confusion here.

Also, before looking at an API change like that, I'd still like to know what the use case is for something like this. I've never wanted to sample from a uniform hyper-doughnut before -- why do we want that capability now?

@xchoo
Copy link
Member Author

xchoo commented Jul 27, 2015

I do. Same logic as to why I use a Uniform distribution with a non-zero low when messing around with non standard intercepts.

@xchoo
Copy link
Member Author

xchoo commented Jul 27, 2015

(i.e. I want to provide only valid sample points within the intercept ranges)

@tcstewar
Copy link
Contributor

But naming that parameter surface doesn't make any sense. The surface parameter name has been left there for compatibility. Ideally, I'd have a high and a low and no surface parameter.

It makes as much sense to me as low and high do. I had to read your code in order to understand that it wasn't a bounding box, which was my initial assumption.

@xchoo
Copy link
Member Author

xchoo commented Jul 27, 2015

It makes as much sense to me as low`` andhigh``` do. I had to read your code in order to understand that it wasn't a bounding box, which was my initial assumption.

I could use the name min_magnitude and max_magnitude. I used low and high to keep it kinda consistent with the uniform distribution.

nengo/dists.py Outdated
Default: False

"""

def __init__(self, surface=False):
def __init__(self, min_magnitude=0, surface=False):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new parameter should go after the existing surface, just in case people were using it positionally. Note that this change will affect the position in the docstring and in __repr__, too.

@Seanny123 Seanny123 changed the title Added low & high options to UniformHypersphere dist Added lower bound option to UniformHypersphere dist Dec 11, 2015
nengo/dists.py Outdated
surface : bool
Whether sample points should be distributed uniformly
over the surface of the hyperphere (True),
or within the hypersphere (False).
Note: Setting surface to True generates sample points from the
hypersphere surface regardless of what the value of min_magnitude is.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd remove this note, and in the description of min_magnitude say that it's ignored if surface == True.

nengo/dists.py Outdated
such that the sampled vector magnitudes >= min_magnitude.
The default value and minimum value for min_magnitude is 0.
The maximum value for min_magnitude is 1. Values outside this range
will be clipped to this range.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be more concise? e.g. "Lower bound on the returned vector magnitudes (i.e. magnitude >= min_magnitude). Must be in the range [0, 1]; defaults to 0. Ignored if surface == True."

@Seanny123
Copy link
Contributor

@hunse I fixed the rest of your comments and it's now passing all the tests. Thanks for being so throrough! Do you think it's ready to merge now?

nengo/dists.py Outdated
return "UniformHypersphere(%s)" % (
"surface=True" if self.surface else "")
return "UniformHypersphere(min_magnitude=%s%r)" % (
", surface=True" if self.surface else "", self.min_magnitude,)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still not right. Also, min_magnitude should only appear if it's different from the default.

@Seanny123
Copy link
Contributor

@hunse I did my best to fix your comments on this issue. Does it look okay now?

@Seanny123 Seanny123 force-pushed the UniformHypersphereLowHighDist branch from 224bf9a to 17ee763 Compare April 22, 2017 10:23
nengo/dists.py Outdated
min_magnitude : Number, optional (Default: 0)
Lower bound on the returned vector magnitudes (i.e.
magnitude >= min_magnitude). Must be in the range [0, 1]; defaults
to 0. Values outside this range will be clipped to this range. Ignored
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this description is not entirely correct. If values outside of the range get clipped, I understand that values get sampled in the range [0, 1] and then everything below min_magnitude will set to min_magnitude (which will cluster vectors at min_magnitude). But as I understand the code values are sampled from [min_magnitude, 1] and thus are without clustering at min_magnitude.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all this docstring is saying is that values for min_magnitude will be clipped to [0, 1]. I agree that we should have an exception instead of clipping. This is already done by the low and high in the parameter definition.

nengo/dists.py Outdated
super(UniformHypersphere, self).__init__()
self.min_magnitude = np.clip(min_magnitude, 0, 1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to raise an exception when min_magnitude is outside of the range (which should happen by just removing the call to clip because it should be checked by the NumberParam).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, and I think this is what caused @jgosmann's confusion with the docstring.

nengo/dists.py Outdated
"""

surface = BoolParam('surface')
min_magnitude = NumberParam('min_magnitude', low=0, high=1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add high_open=True so that we don't allow min_magnitude == 1, which I think would cause problems because then we'd be sampling from an empty interval.

@hunse
Copy link
Collaborator

hunse commented Apr 22, 2017

Ok, I've addressed @jgosmann's suggestions.

Copy link
Collaborator

@jgosmann jgosmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @hunse!

@Seanny123
Copy link
Contributor

Those changes look awesome. Thanks @hunse. I wasn't sure how much of the tests were redundant and how to fix the __repr__ string code.

Copy link
Member

@tbekolay tbekolay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM; I pushed a fixup to fix the changelog formatting (can't have spaces after / before the asterisks).

I share the concerns people have about the surface and min_magnitude parameters being slightly redundant... Unfortunately I think there's no way around this, since surface=True is more clear than min_magntiude=1. However, to at least be explicit when the two parameters are both set, I added warning when surface is True and min_magnitude > 0 in the fixup commit.

Will fix up history and merge when CI finishes, unless someone stops me in the next half hour or so!

@tbekolay tbekolay force-pushed the UniformHypersphereLowHighDist branch from 1847d12 to 88ce22e Compare May 11, 2017 19:25
@tbekolay tbekolay force-pushed the UniformHypersphereLowHighDist branch from 88ce22e to 07c09ed Compare May 11, 2017 21:10
@tbekolay tbekolay merged commit 07c09ed into master May 11, 2017
@tbekolay tbekolay deleted the UniformHypersphereLowHighDist branch May 11, 2017 21:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants