
Memory usage of dss_line() #56

Closed · eort opened this issue on Nov 30, 2021 · 13 comments · Fixed by #58

Labels: enhancement (New feature or request)

Comments

eort (Contributor) commented on Nov 30, 2021

Hi,

I came across this repository when I was looking for an implementation of ZapLine. Really nice that you took the effort to translate the original MATLAB package. I have played around with it on my data, and for most of it, it works really nicely.

However, one thing that worries me a bit is the memory usage of dss_line(). So far I have tried to run it on MEG data (300 channels × 1.8×10⁶ samples), but there was no chance that my laptop (24 GB) could offer enough memory for dss_line() to finish. It only worked once I cropped the data considerably.

Of course, the scripts will eventually run on an HPC, where memory shouldn't be a big problem, but it is still quite annoying that I can't run and check them locally before exporting them to the HPC. So I was wondering whether there are some magic tricks I can use to lower the memory footprint. Settings that I have missed, perhaps? I traced the memory consumption of the function (where I was sufficiently confident that I wouldn't affect functionality) and could indeed reduce the usage here and there, but the peak is still higher than what my PC can handle.

If helpful, I can provide more information of any kind.

nbara (Owner) commented on Nov 30, 2021

Hi @eort,

You are 100% right, and this is something I've identified already (see #50). One obvious thing to try is to compute the covariance by blocks. I'll try to get around to it soon (but if you want to open a PR sooner, you're welcome 😉).
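
For illustration only, here is a rough sketch of what block-wise covariance accumulation could look like (a hypothetical helper, not meegkit's actual code): each iteration only touches a blocksize-long slice of the data, and the accumulator stays at a small (n_channels, n_channels) size.

    import numpy as np

    def blockwise_cov(data, blocksize=4096):
        # Accumulate data.T @ data over blocks of samples (hypothetical helper).
        # data: (n_samples, n_channels)
        n_samples, n_channels = data.shape
        cov = np.zeros((n_channels, n_channels))
        for start in range(0, n_samples, blocksize):
            block = data[start:start + blocksize]  # only this slice is in flight
            cov += block.T @ block                 # small (n_channels, n_channels) update
        return cov / n_samples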

nbara added the enhancement label on Nov 30, 2021
nbara mentioned this issue on Nov 30, 2021
eort (Contributor, Author) commented on Nov 30, 2021

Ah, great! Well, I doubt I can produce a reasonable PR, but I shall try to look further into which parts of the code are particularly memory-hungry. Thanks!

nbara (Owner) commented on Dec 1, 2021

Hi @eort, can you try the code in #57? I added a blocksize parameter to compute the covariances by blocks. This should speed up the computation significantly.

Let me know.
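
For reference, a minimal usage sketch, under the assumption that the #57 branch exposes blocksize as a keyword argument of dss_line (all values below are illustrative):

    import numpy as np
    from meegkit.dss import dss_line

    sfreq = 1000.0                        # sampling rate in Hz (example value)
    data = np.random.randn(100_000, 300)  # (n_samples, n_channels), MEG-like

    # Remove 50 Hz line noise; covariances are accumulated over blocks of
    # `blocksize` samples to cap peak memory (per the PR description).
    clean, artifact = dss_line(data, fline=50., sfreq=sfreq, blocksize=8192)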

nbara self-assigned this on Dec 1, 2021
eort (Contributor, Author) commented on Dec 1, 2021

Speed might be up somewhat. It's hard to say, because I don't have a good reference point for a complete data file of mine: so far I have always run out of memory before it could finish. This time, though, my memory filled up quite quickly, so I guess it is somewhat faster :)

nbara (Owner) commented on Dec 1, 2021

In that case, I'll leave the PR open until you have had enough time to test it, OK?

eort (Contributor, Author) commented on Dec 1, 2021

Sure! While testing, it has already become quite clear that the slowest part of the process (so far) is the FFTs in gaussfilter:

    import numpy as np

    # `data` and `fx` come from the enclosing function: `data` is the input
    # array and `fx` is the frequency-domain gain of the Gaussian filter.
    tmp = np.fft.fft(data, axis=0)
    if data.ndim == 2:
        tmp *= fx[:, None]
    elif data.ndim == 3:
        tmp *= fx[:, None, None]
    filtdat = 2 * np.real(np.fft.ifft(tmp, axis=0))

nbara (Owner) commented on Dec 1, 2021

Yes, I came to the same conclusion. I have to do some FFTs here to compute the bias covariance, but at least now they are done over (hopefully) smaller matrices, so it should not hog your memory as much 🤞
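
As a purely speculative aside (an assumption, not something #57 implements): since the data are real-valued, np.fft.rfft stores only the one-sided spectrum, which roughly halves the size of the complex intermediate. A minimal sketch of the 2-D case, assuming a hypothetical gain fx already sampled at the rfft frequencies:

    import numpy as np

    def gaussfilt_lowmem(data, fx):
        # data: (n_samples, n_channels), real-valued
        # fx: one-sided gain with n_samples // 2 + 1 points (hypothetical layout)
        spec = np.fft.rfft(data, axis=0)  # one-sided spectrum only
        spec *= fx[:, None]               # apply gain in place, no extra copy
        return np.fft.irfft(spec, n=data.shape[0], axis=0)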

eort (Contributor, Author) commented on Dec 1, 2021

I'm not sure it really saved memory, though. But I managed to edit the code here and there to save memory, and it finally finished! (Beautiful PSDs afterwards!) Tomorrow I shall have a somewhat more systematic look at which of my edits actually saved memory, and compose a PR from them. Then you can decide which parts you want to keep 😄

nbara (Owner) commented on Dec 1, 2021

Sounds good

eort (Contributor, Author) commented on Dec 2, 2021

Quick question: does it make more sense to base the PR on #57, or on master?

nbara (Owner) commented on Dec 2, 2021

You can base it on #57.

nbara (Owner) commented on Dec 2, 2021

Closed via #57 (for now)

nbara closed this as completed on Dec 2, 2021
nbara (Owner) commented on Dec 2, 2021

Thanks @eort!

nbara linked a pull request on Dec 2, 2021 that will close this issue