Applying SCORE to large datasets #2
Hello Frederic,
What you actually need to compute is the matrix K, which is n×n and which you should be able to store. Computing X_diff is a handy way to obtain K, but it is indeed n×n×d. What you can do is split the computation of K into parts. Suppose you decompose your dataset into 10 subsets: you then compute X_diff for each pair of subsets, which is of size (n/10)×(n/10)×d and should be tractable. You need to do this for the 100 pairs of subsets, but it shouldn't take too long. The same approach applies to computing nablaK and nabla2K.
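To make the split concrete, here is a minimal PyTorch sketch of assembling K block by block. It assumes a Gaussian kernel of the form exp(-||x_i - x_j||^2 / (2 s^2)) with a given bandwidth s; the function name and the `block` size are illustrative and not part of the SCORE codebase, so adjust the kernel normalisation to match what Stein_hess actually uses.

```python
import torch

def rbf_kernel_blockwise(X, s, block=1000):
    """Assemble the full n x n kernel matrix K without ever materialising
    the n x n x d tensor X_diff in one piece.

    X: (n, d) data matrix, s: kernel bandwidth, block: chunk size.
    Each inner iteration only allocates a (block, block, d) slice of X_diff.
    """
    n = X.shape[0]
    K = torch.empty(n, n, dtype=X.dtype, device=X.device)
    for i in range(0, n, block):
        Xi = X[i:i + block]                              # (bi, d)
        for j in range(0, n, block):
            Xj = X[j:j + block]                          # (bj, d)
            diff = Xi.unsqueeze(1) - Xj.unsqueeze(0)     # (bi, bj, d), small
            sq_dist = (diff ** 2).sum(dim=-1)            # (bi, bj)
            K[i:i + block, j:j + block] = torch.exp(-sq_dist / (2 * s ** 2))
            # the (bi, bj, d) slice is freed before the next pair of blocks
    return K
```

The same double loop can accumulate nablaK and nabla2K by summing the per-block contributions over j, adapting the expressions used inside Stein_hess to per-block sums, so none of the three quantities requires the full X_diff.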
The next challenge you will face is inverting the large dense matrix K. But you don't necessarily need to invert it; you can instead solve the linear system (K + eta*I) X = nabla2K for X. This can be done with classical techniques, e.g. conjugate gradient. Also note that this is a matrix linear system where X is of size n×d, but it can be separated column by column into d vector linear systems.
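As an illustration of this point, here is a minimal conjugate-gradient sketch in PyTorch that solves (K + eta*I) X = nabla2K without forming any inverse. All d columns are iterated in one batched loop, which is equivalent to solving the d vector systems separately; the function name, tolerance, and iteration cap are illustrative assumptions, not SCORE's API.

```python
import torch

def solve_reg_system(K, B, eta, tol=1e-6, max_iter=500):
    """Solve (K + eta*I) X = B by conjugate gradient.

    K: (n, n) symmetric PSD kernel matrix, B: (n, d) right-hand side
    (e.g. nabla2K), eta > 0 ridge term. Each column of B is an independent
    linear system; they all advance by one CG step per loop turn.
    """
    matvec = lambda V: K @ V + eta * V        # only mat-vec products, no inverse
    X = torch.zeros_like(B)
    R = B - matvec(X)                         # residuals, one column per system
    P = R.clone()
    rs_old = (R * R).sum(dim=0)               # squared residual norms, shape (d,)
    for _ in range(max_iter):
        AP = matvec(P)
        alpha = rs_old / (P * AP).sum(dim=0)  # per-column step size
        X = X + alpha * P
        R = R - alpha * AP
        rs_new = (R * R).sum(dim=0)
        if rs_new.max().sqrt() < tol:
            break
        P = R + (rs_new / rs_old) * P
        rs_old = rs_new
    return X
```

Because K + eta*I is symmetric positive definite, CG converges without any factorisation, and the product K @ V can itself be applied block by block (as in the sketch above) if storing the dense n×n matrix is also too tight.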
Does this make sense?
Best,
Paul
On 26 Jan 2024, at 17:37, fred887 wrote:
Hello,
I would like to apply SCORE to some very large datasets (e.g. DAGs with d ≈ 200 nodes and n ≈ 20000 samples).
I am currently blocked by an out-of-memory error inside the procedure Stein_hess(), due to the X_diff tensor, whose memory footprint becomes huge:
memory_X_diff = n * n * d * sizeof(float)
For a float64 tensor (the default setting), the memory needed by X_diff is ~800 GB; ~400 GB for float32 and ~200 GB for float16: this is just too big to fit in memory.
I know the standard approach would be to subsample my dataset until everything fits in memory, but I need to keep all samples for my experiments.
I have seen in your article that it is possible to use kernel approximation methods (such as MEKA) to reduce the computation load.
Indeed, with MEKA we can compute the kernel matrix K without needing to compute the overly large X_diff tensor.
However, the X_diff tensor is still necessary to compute the nablaK and nabla2K variables.
Would you know a way to completely avoid the computation of the very large X_diff tensor?
Or any other way to compute the diagonal elements of the Hessian matrix with large datasets?
Thank you very much for your help,
Best,
Frederic