-
Notifications
You must be signed in to change notification settings - Fork 80
error bound tests #3104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
error bound tests #3104
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #3104 +/- ##
=======================================
Coverage 89.95% 89.95%
=======================================
Files 29 29
Lines 32471 32471
Branches 5823 5823
=======================================
Hits 29209 29209
Misses 1860 1860
Partials 1402 1402
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
jeromekelleher
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. @hanbin973 can you have a look please?
| # = \sum_i b_i (lambda - L_i)^2 + lambda^2 |delta|^2 | ||
| # < \epsilon (where epsilon is the spectral norm bound error_bound) | ||
| # so | ||
| # |delta| < sqrt(epsilon / lambda) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lambda should be at the outside of the sqrt.
lambda ^2 |delta| ^2 < \epsilon so |delta| < sqrt(epsilon) / lambda.
| # then let v = \sum_i b_i u_i + delta be the projection of v into U, | ||
| # and we have that | ||
| # |lambda v - U L U* v|^2 | ||
| # = \sum_i b_i (lambda - L_i)^2 + lambda^2 |delta|^2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
b_i should be b_i^2.
See |lambda v- ULU*|^2 = \sum_i (\lambda - L_i)^2 b_i^2 + \lambda^2 |delta|^2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whoops - typo!
| # |delta| < sqrt(epsilon / lambda) | ||
| # since this is the amount by which the eigenvector v isn't hit by the columns of U. | ||
| # Then also for each i that if b_i is not small then | ||
| # |lambda - L_i| < sqrt(epsilon) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to elaborate this part.
Let m = min_i |lambda - L_i|^2. Then, epsilon > \sum_i (lambda - L_i)^2 b_i^2 + lambda^2 |delta|^2 >= m * \sum_i b_i^2 + lambda^2 |delta|^2 = m * (1-|delta|^2) + lambda^2 |delta|^2.
Hence, min_i |lambda-L_i|^2 = m < (epsilon - lambda^2 |delta|^2) / (1- |delta|^2).
I'm not sure how exactly the RHS of the final equation is strictly smaller than sqrt(epsilon).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The RHS is epsilon + |delta|^2 * (epsilon - \lambda^2) / (1-|delta|^2) and assuming that lambda is suitably larger than epsilon which is roughly lambda_{k+1} (k is the rank of the approximation, the formula in eq 1.11 of the paper), the residual term |delta|^2 * (epsilon - \lambda^2) / (1-|delta|^2) is negative, proving the inequality. This somewhat explains why we can't trust the l-th component that is too close to the k-th component: the singular value of the l-th component should be reasonably larger than the k+1-th component for the bound to hold.
| # Bounds on the error are from equation 1.11 in https://arxiv.org/pdf/0909.4061 - | ||
| # this gives a bound on reconstruction error (i.e., spectral norm between the GRM | ||
| # and the low-diml approx). But since the spectral norm is | ||
| # |X| = sup_v |Xv|^2/|v|^2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The correct definition is |X| = sup_v |Xv|_2 / |v|_2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh good! I got it in my head that it was squared, which seemed wrong.
| # epsilon > \sum_i (lambda - L_i)^2 b_i^2 + lambda^2 |delta|^2 | ||
| # >= m * \sum_i b_i^2 + lambda^2 |delta|^2 | ||
| # = m * (1-|delta|^2) + lambda^2 |delta|^2. | ||
| # Hence, min_i |lambda-L_i|^2 = m < (epsilon - lambda^2 |delta|^2) / (1- |delta|^2). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
L841 and 844's epsilon both needs a square. Everything else looks okay.
|
Let's merge. |
Here's a more precise description of what (we think) the error bound implies for precision, and additional tests.
I was having trouble getting this to pass, but then realized it was because the
time_windowin the test was making it so the matrix was low rank in some genomic windows. So, I removed the time windowing (we don't need to test that part here, as I don't see how it'd interact with the random approximation besides in trivial ways like this).Also note that the error_bound is labeled "experimental" - which is good, as this is not a rigorous benchmarking.