Added distribution for dot products / intercepts #768
Also, I think the
Could definitely be useful (but I don't need it for the paper/spaopt).
@arvoelke why is this marked as "work in progress"? What else is left to be done to this pull request before it can be properly reviewed?
Think I did that in case @jgosmann had suggestions based on his research, and because of the two todos in the original post (better naming? changelog?). These can be subsumed by the review (so yes, it's ready). 😄
ndarray
    Evaluation points `x` in [0, 1] such that `P(X <= x) = y`.
"""
from scipy.special import betaincinv
Usually if we have a SciPy dependency, we throw a special error if it's not found. Should this be done here as well?
This is how it was done in a couple of other places in this file.
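For illustration only, a guarded import along the lines the reviewer describes might look like the sketch below. The function name `sqrt_beta_ppf`, the error message, and the `Beta(m / 2, n / 2)` parameterization are assumptions made for this example; the pull request itself keeps the plain in-function import shown in the diff.

```python
import numpy as np


def sqrt_beta_ppf(y, n, m=1):
    """Inverse CDF of sqrt(B), where B ~ Beta(m / 2, n / 2).

    Hypothetical helper: the name and parameterization are assumptions.
    """
    try:
        # Imported lazily so that NumPy-only installs can still import the module.
        from scipy.special import betaincinv
    except ImportError as err:
        raise ImportError(
            "sqrt_beta_ppf requires SciPy (scipy.special.betaincinv)"
        ) from err

    y = np.asarray(y, dtype=float)
    # If P(B <= b) = y, then P(sqrt(B) <= sqrt(b)) = y as well,
    # so the quantile of sqrt(B) is the square root of the Beta quantile.
    return np.sqrt(betaincinv(m / 2.0, n / 2.0, y))
```

The lazy import keeps the SciPy requirement local to the one method that needs it, which is the trade-off being discussed in this thread.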
376d3de to cf8bae9
Renamed `invcdf` to `ppf`. Think this should be ready if no complaints.
ppf = dist.ppf(cdf)

# The pdf should reflect the samples
np.random.seed(seed)
You shouldn't have to do this. Make a `numpy.random.RandomState` and pass it where you need it.
Actually, you can just get one with the `rng` fixture. The same applies to the tests above and below.
Done
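The review point above, passing a generator explicitly instead of seeding the global NumPy state, can be sketched as follows. The helper `sample_sqrt_beta` and its parameterization are assumptions for illustration and are not code from this pull request; in the test suite, the `rng` fixture would supply the generator instead of the hand-built `RandomState` used here.

```python
import numpy as np


def sample_sqrt_beta(num, n, m=1, rng=np.random):
    """Draw `num` samples of sqrt(B), B ~ Beta(m / 2, n / 2), using `rng`.

    Hypothetical helper: the name and parameterization are assumptions.
    """
    return np.sqrt(rng.beta(m / 2.0, n / 2.0, size=num))


# Two generators built from the same seed produce identical draws,
# and the global np.random state is never modified.
a = sample_sqrt_beta(5, n=3, rng=np.random.RandomState(1))
b = sample_sqrt_beta(5, n=3, rng=np.random.RandomState(1))
```

Because each call receives its own generator, tests stay reproducible without depending on the order in which other tests consume random numbers.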
Made a few comments. Go ahead and add the changelog entry, too (that's the submitter's job).
Added changelog and addressed above comments (thanks)!
Bumping since this is needed by @mundya and @neworderofjamie this week.
def __init__(self, dimensions):
    super(CosineSimilarity, self).__init__(dimensions)

def sample(self, num, rng=np.random):
`sample` doesn't have the same signature as we typically use for `Distribution.sample`. We might as well take the extra `d` parameter like `SubvectorLength` does, if it's not too much trouble (and I don't think it should be).
Good point. Fixed this now, added a test, and removed some repeat logic in the process.
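A minimal sketch of a `sample` method that accepts the extra `d` parameter is shown below. The class `CosineSimilaritySketch` is a hypothetical stand-in rather than the class added in this pull request, and the shape convention (`(n,)` when `d` is `None`, otherwise `(n, d)`) is an assumption. It relies on the fact that a single coordinate `x` of a uniformly random unit vector in `D` dimensions satisfies `(x + 1) / 2 ~ Beta((D - 1) / 2, (D - 1) / 2)`.

```python
import numpy as np


class CosineSimilaritySketch(object):
    """Hypothetical stand-in; not the class added in this pull request."""

    def __init__(self, dimensions):
        # Needs dimensions >= 2 so that the Beta parameters are positive.
        self.dimensions = dimensions

    def sample(self, n, d=None, rng=np.random):
        # Assumed shape convention: (n,) when d is None, else (n, d).
        shape = (n,) if d is None else (n, d)
        half = (self.dimensions - 1) / 2.0
        # Draw from the symmetric Beta on [0, 1], then rescale to [-1, 1].
        return 2.0 * rng.beta(half, half, size=shape) - 1.0
```

Keeping `d` optional lets the same object fill either a vector or a matrix of samples, which is what the comparison with `SubvectorLength` is asking for.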
3f86bc5 to 8f482b6
def __init__(self, dimensions, subdimensions=1):
    super(SubvectorLength, self).__init__(
        dimensions - subdimensions, subdimensions)


class CosineSimilarity(SubvectorLength):
    """Distribution of dot products between random unit vectors.
"Distribution of cosine similarity between two random vectors."
or maybe
"Distribution of the cosine of the angle between two random vectors."
then below: "The cosine similarity is given by the cosine of the angle between two vectors, which is equal to the norm of the dot product of the vectors, divided by the norms of the individual vectors".
Done. But "norm of the dot product" -> just "dot product" (note the result is signed, unlike `SubvectorLength`).
Improved documentation to clarify the connection between this distribution and the cosine angle.
It is the distribution of the cosine of the angle between two random vectors, and can be useful to calculate intercepts such that a particular neuron has a given probability `p` of firing in response to a unit length input.
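The link between this distribution and intercepts can be checked numerically. The sketch below is a Monte Carlo estimate, not the method used in the pull request (which would presumably evaluate the distribution's `ppf` at `1 - p`). The function name `empirical_intercept` is hypothetical, and it assumes a neuron fires when the dot product of its unit-length encoder with the input exceeds the intercept.

```python
import numpy as np


def empirical_intercept(dimensions, p, num=20000, rng=np.random):
    """Estimate the intercept c such that P(dot(encoder, x) >= c) = p.

    Hypothetical helper: uses sampling instead of an analytic inverse CDF.
    """
    # Uniform unit vectors: normalize isotropic Gaussian samples.
    x = rng.normal(size=(num, dimensions))
    x /= np.linalg.norm(x, axis=1, keepdims=True)
    # By symmetry every fixed unit encoder gives the same distribution,
    # so take the first axis: the dot product is just the first coordinate.
    dots = x[:, 0]
    # The (1 - p) quantile leaves a fraction p of the dot products above it.
    return np.percentile(dots, 100.0 * (1.0 - p))


c = empirical_intercept(3, 0.25, rng=np.random.RandomState(0))
```

In three dimensions the dot product is uniform on [-1, 1], so the estimate for `p = 0.25` should land close to 0.5, which gives a quick sanity check on the approach.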
8de1da4 to 4a6917a
Subclassed `SubvectorLength` to get the distribution that I use for my adaptive cleanup (heteroassociative memory). This is important for setting the intercepts of neurons so that they fire with some chosen probability (see docstring for details). Also added the inverse cdf to `SqrtBeta`, and some optional unit tests for the `SqrtBeta` distribution.