## Probability of a dot product between two n-dimensional vectors

We know that two random n-dimensional vectors are [expected to be orthogonal](https://softwaredoug.com/blog/2022/12/26/surpries-at-hi-dimensions-orthoginality.html).

But when we compute the dot product of two vectors, it can be useful to know how 'special' that value is. What's the probability of seeing it by chance?

For example, two extremely similar (far from orthogonal) vectors would be extremely rare, whereas orthogonality would be exceptionally common. We have an intuition of why from the above blog post, but can we quantify it?

| Dot Product | Expected Prob |
| ----------- | ------------- |
| > 0.9       | Very rare     |
| > 0.05      | Very common   |


It turns out, yes! We can do this by comparing the area of a 'cap' on the sphere to the area of the whole sphere. A small 'polar cap' (the vectors within a small angle of a given vector) covers far less area than the cap formed by a much larger angle, as shown below.

![](similar_cap.svg )

![](dissimilar_cap.svg)


To compute the probability, we just need the ratio of [the cap area to the area of the whole hypersphere](https://docsdrive.com/pdfs/ansinet/ajms/0000/22275-22275.pdf); see also [this question](https://math.stackexchange.com/questions/374022/probability-that-random-vectors-have-a-certain-dot-product).

### N-Sphere area

First we need to compute the surface area of a sphere in n dimensions, given by the following. We can ignore r because we're dealing with unit spheres.

Here Γ is the [gamma function](https://docs.scipy.org/doc/scipy/reference/generated/scipy.special.gamma.html).

![n sphere area](n_sphere_area.png)
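
In case the image doesn't render, here is the formula written out. This is a reading of the code below (which sets `r = 1`), assuming it matches the pictured formula:

$$
A_n(r) = \frac{2\pi^{n/2}}{\Gamma\left(\frac{n}{2}\right)}\, r^{n-1}
$$

Note that `n` counts the dimensions of the space the sphere sits in, so `n = 3` is the ordinary sphere with area 4π.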

In [3]:
from scipy.special import gamma
from math import pi

def n_sphere_area(n: int):
    numerator = 2 * (pi ** (n / 2))
    denominator = gamma(n/2)
    return numerator / denominator

n_sphere_area(3)

12.566370614359174
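
As a quick sanity check, the formula should reproduce a few familiar values: the circumference of the unit circle (2π) for `n = 2` and the area of the ordinary unit sphere (4π) for `n = 3`. A minimal sketch that repeats the function above so it runs on its own:

```python
from math import pi, isclose
from scipy.special import gamma

def n_sphere_area(n: int) -> float:
    # Same formula as above, for the unit sphere (r = 1)
    return 2 * (pi ** (n / 2)) / gamma(n / 2)

# n = 2: circumference of the unit circle
assert isclose(n_sphere_area(2), 2 * pi)
# n = 3: surface area of the ordinary unit sphere
assert isclose(n_sphere_area(3), 4 * pi)
# n = 4: known closed form 2 * pi^2
assert isclose(n_sphere_area(4), 2 * pi ** 2)
```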

## Cap area, dimensions N given angle theta

Now to compute the area of a cap, which is:

![area of cap given theta](cap_area.png)

Here I is the [regularized incomplete beta function](https://docs.scipy.org/doc/scipy/reference/generated/scipy.special.betainc.html) and `A_n` is the area of the n-sphere we just defined.
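
Written out, and again read off from the code below rather than the image, the cap area for the unit sphere is:

$$
A_n^{\text{cap}}(\theta) = \frac{1}{2}\, A_n\, I_{\sin^2\theta}\!\left(\frac{n-1}{2}, \frac{1}{2}\right), \qquad 0 \le \theta \le \frac{\pi}{2}
$$

The restriction to angles of at most 90 degrees matters: the formula only sees `sin²θ`, which can't tell an angle θ apart from π − θ.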

In [9]:
from scipy.special import betainc
from math import sin, isclose

def n_cap_area(n: int, theta: float):
    # Half the sphere's area, scaled by the regularized incomplete beta function
    half_sphere_area = 0.5 * n_sphere_area(n)
    sin_squared = sin(theta) ** 2
    return half_sphere_area * betainc((n-1)/2, 0.5, sin_squared)

# Ninety degrees should be half the sphere (isclose guards against floating point error)
assert isclose(n_cap_area(3, pi / 2) / n_sphere_area(3), 0.5)

n_cap_area(3, pi / 2) / n_sphere_area(3)

0.5
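
Another way to build confidence in `n_cap_area` is to compare it with the textbook area of a spherical cap in three dimensions, `2π(1 − cos θ)` on the unit sphere. The sketch below repeats the functions above so it is self-contained; `cap_area_3d` is a helper name introduced here for illustration.

```python
from math import pi, sin, cos, isclose
from scipy.special import betainc, gamma

def n_sphere_area(n: int) -> float:
    return 2 * (pi ** (n / 2)) / gamma(n / 2)

def n_cap_area(n: int, theta: float) -> float:
    return 0.5 * n_sphere_area(n) * betainc((n - 1) / 2, 0.5, sin(theta) ** 2)

def cap_area_3d(theta: float) -> float:
    # Classic spherical cap area on the unit sphere (illustrative helper)
    return 2 * pi * (1 - cos(theta))

# The general formula should agree with the 3D special case up to 90 degrees
for theta in (pi / 6, pi / 4, pi / 3, pi / 2):
    assert isclose(n_cap_area(3, theta), cap_area_3d(theta), rel_tol=1e-9)
```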

## Dot product area

Now we just need to go from what we have (dot products) to angles. We do this by taking the arccos of the dot product, since the dot product is given by:

```
u . v = |u| |v| cos(theta)
```

Since these are unit vectors, `|u|` and `|v|` are both 1 and drop out.
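
To make the conversion concrete, here is a small sketch that recovers the angle between two vectors from their dot product. It assumes NumPy is available; `angle_between` is a hypothetical helper, not something used elsewhere in this notebook. It normalizes its inputs, so they don't need to be unit vectors already.

```python
import numpy as np

def angle_between(u, v) -> float:
    """Angle in radians between vectors u and v (illustrative helper)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # Clip to [-1, 1] so rounding error can't push arccos out of its domain
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

# These two vectors are 45 degrees apart
theta = angle_between([1, 0, 0], [1, 1, 0])
print(theta, np.degrees(theta))
```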

In [10]:
from math import acos

def dot_prod_area(n: int, dot_product: float):
    theta = acos(dot_product)
    return n_cap_area(n, theta)

assert dot_prod_area(3, 0.0) == n_cap_area(3, pi / 2)
dot_prod_area(3, 0.0)

6.283185307179587

## Dot product probability

The probability that a dot product is `dot_product` or above is the ratio of `dot_prod_area` to the area of the n-sphere:

In [12]:
def dot_prod_probability(n: int, dot_product: float):
    return dot_prod_area(n, dot_product) / n_sphere_area(n)

dot_prod_probability(3, 0.25)

0.3749999999999998

In [13]:
dot_prod_probability(3, 0.0)

0.5

In [15]:
dot_prod_probability(3, 0.9)

0.049999999999999996
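
These numbers can be checked empirically with a quick simulation. The sketch below assumes NumPy and samples random unit vectors by normalizing Gaussian draws, a standard way to get directions uniformly distributed on the sphere. `empirical_dot_prod_probability`, the sample count, and the seed are all choices made for this illustration.

```python
import numpy as np

def empirical_dot_prod_probability(n: int, dot_product: float,
                                   samples: int = 200_000, seed: int = 0) -> float:
    """Fraction of random unit-vector pairs in n dimensions with u . v >= dot_product."""
    rng = np.random.default_rng(seed)
    # Normalized Gaussian vectors are uniformly distributed on the unit sphere
    u = rng.standard_normal((samples, n))
    v = rng.standard_normal((samples, n))
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    v /= np.linalg.norm(v, axis=1, keepdims=True)
    dots = np.sum(u * v, axis=1)
    return float(np.mean(dots >= dot_product))

# Compare against the values dot_prod_probability(3, ...) returned above
for threshold, from_formula in [(0.25, 0.375), (0.0, 0.5), (0.9, 0.05)]:
    estimate = empirical_dot_prod_probability(3, threshold)
    print(f"dot >= {threshold}: simulated {estimate:.4f}, formula {from_formula}")
```

The simulated fractions should land close to the formula's values, with the usual sampling noise.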

## Fix for negative dot products

We actually need to tweak this for negative dot products, as it appears to give the area of the lower cap (the cap around the opposite pole).

If we want *at or above* `dot_product`, we need to account for this:
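
One way to see why, sketched under the assumption that the cap formula only depends on the angle through `sin²θ`: for a dot product `d` with angle `θ = arccos(d)`,

$$
\sin^2\theta = 1 - d^2 = \sin^2(\pi - \theta)
\quad\Longrightarrow\quad
P(u \cdot v \ge d) = 1 - P(u \cdot v \ge -d) \quad \text{for } d < 0
$$

So `d` and `-d` produce the same cap area, and for a negative `d` the probability we want is the complement of that smaller cap.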

In [29]:
dot_prod_probability(3, -0.1)

0.44999999999999996

In [31]:
def dot_prod_probability(n: int, dot_product: float):
    if dot_product < 0:
        return 1.0 - dot_prod_area(n, dot_product) / n_sphere_area(n)
    else:
        return dot_prod_area(n, dot_product) / n_sphere_area(n)

In [34]:
dot_prod_probability(3, 0.1)

0.44999999999999996

In [37]:
dot_prod_probability(3, -0.1)

0.55

In [38]:
dot_prod_probability(3, -0.4)

0.7000000000000002

In [39]:
dot_prod_probability(3, -0.8)

0.9
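
One practical caveat for embedding-sized vectors: `gamma(n/2)` exceeds the floating point range once `n` reaches a few hundred, so `n_sphere_area` stops being usable there. Since the sphere area appears in both the numerator and the denominator of the ratio, it cancels, and the probability can be computed from the incomplete beta function alone. A minimal sketch of that simplification follows; `dot_prod_probability_stable` is a name introduced here, not part of the notebook above.

```python
from math import isclose
from scipy.special import betainc

def dot_prod_probability_stable(n: int, dot_product: float) -> float:
    """P(u . v >= dot_product) for random unit vectors in n dimensions."""
    # sin(acos(d))**2 equals 1 - d**2, so no trig calls are needed
    sin_squared = 1.0 - dot_product ** 2
    # A_n cancels in cap_area / sphere_area, leaving half the beta function
    small_cap = 0.5 * betainc((n - 1) / 2, 0.5, sin_squared)
    # Same fix as above: negative dot products take the complement
    return 1.0 - small_cap if dot_product < 0 else small_cap

# Agrees with the 3-dimensional results above
assert isclose(dot_prod_probability_stable(3, 0.9), 0.05)
assert isclose(dot_prod_probability_stable(3, -0.1), 0.55)

# Usable at dimensions where gamma(n / 2) would overflow
for n in (3, 10, 100, 768):
    print(n, dot_prod_probability_stable(n, 0.05), dot_prod_probability_stable(n, 0.9))
```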