
MRG+1: Use frequency domain for xc/ac #4119

Merged: 14 commits, Apr 20, 2017

Conversation (8 participants)
larsoner (Member) commented Mar 29, 2017

Here you go @agramfort.

In addition to tutorials/plot_receptive_field.py and examples/plot_receptive_field.py, I've also been using this gist.

jona-sassenhagen (Contributor) commented Mar 31, 2017

@agramfort and you made fun of me once for using "\" :D

larsoner (Member) commented Mar 31, 2017

Yeah I'm modifying @choldgraf's code, and didn't have the heart to make him change it

larsoner (Member) commented Apr 4, 2017

Okay I've done some benchmarking. It looks like for any sensible combination of delays and input sizes (e.g., at least a few seconds of data and at least 100 ms in the receptive field) the FFT method is better.

Here is the speedup ratio as a function of the number of input channels, input durations, and RF durations in predicting a single output:

[figure: FFT-vs-direct speedup ratio]

@choldgraf can you run my gist on this branch and see if you get similar results?

Examples still look good:

https://4304-1301584-gh.circle-artifacts.com/0/home/ubuntu/mne-python/doc/_build/html/auto_examples/decoding/plot_receptive_field.html

https://4304-1301584-gh.circle-artifacts.com/0/home/ubuntu/mne-python/doc/_build/html/auto_tutorials/plot_receptive_field.html

Ready for review/merge from my end.
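For reference, the equivalence the FFT method exploits can be sketched in a few lines of NumPy. This is illustrative only, not the PR's implementation; all names here are made up:

```python
import numpy as np

rng = np.random.RandomState(0)
x = rng.randn(1000)  # e.g., a stimulus channel
y = rng.randn(1000)  # e.g., a neural response channel
n_lags = 50

# Direct cross-correlation at non-negative lags: sum_t x[t] * y[t + lag]
direct = np.array([np.dot(x[:x.size - lag], y[lag:])
                   for lag in range(n_lags)])

# Same values via FFT; zero-padding prevents circular wrap-around
n_fft = 2 * x.size - 1
fx = np.fft.rfft(x, n_fft)
fy = np.fft.rfft(y, n_fft)
xcorr = np.fft.irfft(fx.conj() * fy, n_fft)[:n_lags]
```

The direct loop does O(n_lags * n_times) work, while the FFT route is O(n_times log n_times) regardless of the number of lags, which is where the speedup for long recordings and many delays comes from.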

@larsoner larsoner changed the title from WIP: Use frequency domain for xc/ac to MRG: Use frequency domain for xc/ac Apr 4, 2017

choldgraf (Contributor) commented Apr 4, 2017

that's a cool figure...it would be nice in a blogpost ;-)

I'll get this running sometime today or tomorrow! Just got back from NYC yesterday so I have a mountain of stuff to catch up on!

larsoner (Member) commented Apr 4, 2017

cc @rkmaddox feel free to have a look if you want to see how your code has been reused

rkmaddox (Contributor) commented Apr 4, 2017

@Eric89GXL Cool! And think of the speedup when dealing with an hour at 10k samples / s.

Any plans to let the FFTs be CUDA-fied, or did that not speed it up much?

codecov-io commented Apr 4, 2017

Codecov Report

Merging #4119 into master will increase coverage by 0.07%.
The diff coverage is 97.22%.

@@            Coverage Diff             @@
##           master    #4119      +/-   ##
==========================================
+ Coverage   86.18%   86.26%   +0.07%     
==========================================
  Files         356      357       +1     
  Lines       64337    64638     +301     
  Branches     9798     9918     +120     
==========================================
+ Hits        55451    55759     +308     
+ Misses       6180     6172       -8     
- Partials     2706     2707       +1
larsoner (Member) commented Apr 4, 2017

> Any plans to let the FFTs be CUDA-fied, or did that not speed it up much?

Haven't thought about it, but we could do that in a subsequent PR, sure.

choldgraf (Contributor) commented Apr 5, 2017

@Eric89GXL I just tried to run your gist but ended up crashing my computer because python was using like 6 gigs of ram...is that expected?

choldgraf (Contributor) commented Apr 5, 2017

(not the crash obv, but the 6 gigs)

larsoner (Member) commented Apr 5, 2017

Yeah, it's a lot of data. You can get rid of the largest N or largest RF size to reduce memory consumption.

Or, get a better computer :)

choldgraf (Contributor) commented Apr 5, 2017

:P

here's what I get:

[image: benchmark results on choldgraf's machine]

looks similar yeah?

larsoner (Member) commented Apr 5, 2017

Yeah, and actually a lot better (in some cases) relative FFT performance for you. But the important thing is that, for any reasonable combination of n_delays and X.shape[0], the FFT mode wins.

Feel free to review the code, then :)

choldgraf (Contributor) commented Apr 5, 2017

yeah for sure - def faster :)

ok I'll review in a bit...I spent more time trying to figure out why my computer slowed to a crawl than I was expecting to :P

choldgraf reviewed:

here are a few comments!

mne/decoding/receptive_field.py
@@ -186,8 +201,11 @@ def predict(self, X):
        if not hasattr(self, 'delays_'):
            raise ValueError('Estimator has not been fit yet.')
        X, _ = self._check_dimensions(X, None, predict=True)
        X_del = _delay_time_series(X, self.tmin, self.tmax, self.sfreq,
                                   newaxis=X.ndim)
        if not isinstance(self.estimator_, TimeDelayingRidge):

choldgraf (Contributor) commented Apr 5, 2017

could we just spin this into a _delay_for_fit method and save some lines?
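For context, "delaying" a time series just means stacking shifted copies of the signal as columns of the design matrix. A simplified, single-channel sketch of what a helper like `_delay_time_series` conceptually does (the real function also handles tmin/tmax/sfreq and multichannel input; this version is illustrative):

```python
import numpy as np

def delay_time_series(x, lags):
    """Stack lagged copies of a 1D signal into an (n_times, n_lags) matrix.

    Positive lags shift the signal forward in time; samples shifted in
    from outside the signal are zero-padded.
    """
    n = len(x)
    out = np.zeros((n, len(lags)))
    for j, lag in enumerate(lags):
        if lag >= 0:
            out[lag:, j] = x[:n - lag]
        else:
            out[:n + lag, j] = x[-lag:]
    return out

x = np.arange(5.0)
X_del = delay_time_series(x, lags=[0, 1, 2])
# column j is x delayed by j samples, zero-padded at the start
```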

        elif is_regressor(self.estimator):
            estimator = clone(self.estimator)
        else:
            raise ValueError('`estimator` must be a float or an instance'
                             ' of `BaseEstimator`,'
                             ' got type %s.' % type(self.estimator))
        self.estimator_ = estimator
        del estimator

choldgraf (Contributor) commented Apr 5, 2017

why?

larsoner (Member) commented Apr 5, 2017

to force me (and others) to use the property for code clarity

@@ -10,4 +10,5 @@
 from .time_gen import GeneralizationAcrossTime, TimeDecoding
 from .time_frequency import TimeFrequency
 from .receptive_field import ReceptiveField
+from .time_delaying_ridge import TimeDelayingRidge

choldgraf (Contributor) commented Apr 5, 2017

why not mne.decoding.model instead of having its own module? That way if we want to add any custom models they can go in there?

larsoner (Member) commented Apr 5, 2017

I'd assume YAGNI and we can rename later if need be

choldgraf (Contributor) commented Apr 5, 2017

sounds good - so long as you think it's fine renaming etc. I'm still a little annoyed that we have a top-level module called decoding that does a ton of non-decoding stuff :P

larsoner (Member) commented Apr 6, 2017

yeah we expose mne.decoding.* objects / methods, so that one is harder to change the name of

@@ -1116,6 +1116,7 @@ Classes:
    UnsupervisedSpatialFilter
    Vectorizer
    ReceptiveField
+   TimeDelayingRidge

choldgraf (Contributor) commented Apr 5, 2017

do we want to expose this model to the user or just have them use it as a part of the RF class? I'm happy either way

larsoner (Member) commented Apr 5, 2017

I'd like to expose it because it does have advanced parameters people can choose to use if they know what they're doing

    model.fit(x, y)
    assert_array_equal(model.delays_,
                       np.arange(slim[0], slim[1] + 1))
    assert_allclose(model.coef_, expected, atol=1e-1)

choldgraf (Contributor) commented Apr 5, 2017

awesome that this works ;-)

            # is raised
            w = linalg.solve(mat, x_y.T, sym_pos=True, overwrite_a=False)
        except np.linalg.LinAlgError:
            warnings.warn('Singular matrix in solving dual problem. Using '

choldgraf (Contributor) commented Apr 5, 2017

this could only happen if we don't regularize at all, right? doesn't the regularization force it to be a solvable problem?
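The snippet above guards a symmetric positive-definite solve with a least-squares fallback. The general pattern can be sketched as follows; the function name and the exact fallback are illustrative, not the PR's code:

```python
import warnings

import numpy as np
from scipy import linalg

def reg_solve(mat, b):
    """Solve mat @ w = b for a symmetric positive-definite system,
    falling back to least-squares if mat is numerically singular."""
    try:
        # Cholesky-based solve; raises LinAlgError if mat is not
        # positive definite (e.g., zero regularization + collinear data)
        return linalg.solve(mat, b, assume_a='pos')
    except np.linalg.LinAlgError:
        warnings.warn('Singular matrix in solving dual problem; '
                      'falling back to least-squares.')
        return np.linalg.lstsq(mat, b, rcond=None)[0]

# Well-conditioned case: exact solution
w = reg_solve(np.array([[2., 1.], [1., 2.]]), np.array([3., 3.]))

# Singular case triggers the fallback (minimum-norm solution)
w_sing = reg_solve(np.ones((2, 2)), np.array([2., 2.]))
```

As the review comment notes, any nonzero ridge penalty makes the Gram matrix strictly positive definite, so the fallback should only fire with zero (or numerically negligible) regularization.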

    return xxt


class TimeDelayingRidge(BaseEstimator):

choldgraf (Contributor) commented Apr 5, 2017

maybe a See :ref:mne.decoding.ReceptiveField for more information about the point of time delaying etc etc etc?

mne/decoding/time_delaying_ridge.py

        Returns
        -------
        self : instance of LinearModel

choldgraf (Contributor) commented Apr 5, 2017

instance of TimeDelayingRidge

mne/decoding/time_delaying_ridge.py

        offset = max(self._smin, 0)
        for oi in range(self.coef_.shape[0]):
            for fi in range(self.coef_.shape[1]):
                temp = np.convolve(X[:, fi], self.coef_[oi, fi][::-1])

@@ -190,7 +189,7 @@
 scores = np.zeros_like(alphas)
 models = []
 for ii, alpha in enumerate(alphas):
-    rf = ReceptiveField(tmin, tmax, sfreq, freqs, estimator=Ridge(alpha))
+    rf = ReceptiveField(tmin, tmax, sfreq, freqs, estimator=alpha)

choldgraf (Contributor) commented Apr 5, 2017

maybe this is where we show off the laplacian as well? I think we should highlight it somehow as most people won't really know what this is. Actually maybe we need a different section of this tutorial for this, or a short example that compares vanilla ridge + laplacian.

larsoner (Member) commented Apr 5, 2017

Yeah I could add it here or in the example, whichever looks better

choldgraf (Contributor) commented Apr 5, 2017

well I think in the name of our conversations about keeping the tutorials / examples clearly delineated from one another, it should be in the tutorial. Unless we had one very specific and short example that made a comparison between the two.
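The "laplacian"/"quadratic" penalty under discussion penalizes squared differences between neighboring coefficients rather than the coefficients themselves. A hedged sketch of such a regularization matrix (the boundary handling and scaling here are illustrative and may differ from what TimeDelayingRidge actually builds):

```python
import numpy as np

def laplacian_reg(n):
    """Second-difference ("quadratic") regularization matrix D.T @ D,
    which penalizes roughness between adjacent coefficients."""
    D = np.diff(np.eye(n), axis=0)  # (n-1, n) first-difference operator
    return D.T @ D                  # tridiagonal, with -1 off-diagonals

n = 5
L = laplacian_reg(n)
# Plain ridge solves      w = (X.T @ X + alpha * I)^-1 @ X.T @ y
# The quadratic penalty:  w = (X.T @ X + alpha * L)^-1 @ X.T @ y
```

The -1 entries on the off-diagonals are what couple neighboring delays and produce smooth coefficient estimates.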

larsoner (Member) commented Apr 6, 2017

Okay @choldgraf tutorial updated and other comments addressed. I changed the name to "quadratic" because that's what they call it in their paper.

choldgraf (Contributor) commented Apr 6, 2017

re: the tutorial, I think we could add a short explanation that if you give a float to ReceptiveField's alpha param, it will under the hood use an instance of TimeDelayingRidge, then explain how you can also just pass your own TDR estimator with the parameters you like. It's the first time we speak about this estimator class specifically, right?

Also, in the performance STRF plot, the annotation looks like it's off the drawn figure at the bottom. And I wonder whether this is a great example to demonstrate this, because all the STRFs with quadratic regularization look crappier. I assume this is because the STRF we simulated didn't adhere to the statistics that the regularization assumes? It'd be best if there were a short example that explained how this particular kind of regularization improves the model fit.

mmagnuski (Member) commented Apr 6, 2017

hm, the smoothness constraint is just along the x axis - that's why the data look weird when you put it back to 2D. I'm not sure it is worth adding a 2D smoothness constraint to TimeDelayingRidge, but I agree a short comment would help.

choldgraf (Contributor) commented Apr 6, 2017

On that note, would it be relatively easy to impose a smoothness constraint on the X dimension as well? If so then we could use this as an extra flag, e.g. smoothing= ('delays' | 'features' | 'both')

mmagnuski (Member) commented Apr 6, 2017

That depends on whether TimeDelayingRidge gets the data in 2D or not. If it does, it is just a matter of checking the dimensions and formatting the regularization matrix differently.
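One way to "format the regularization matrix differently" for 2D smoothness, assuming coefficients are flattened feature-major over (features, delays), is a Kronecker sum of 1D second-difference penalties. This is a hypothetical sketch of the idea, not what the PR implements:

```python
import numpy as np

def second_diff(n):
    """1D second-difference (roughness) penalty matrix D.T @ D."""
    D = np.diff(np.eye(n), axis=0)
    return D.T @ D

n_feats, n_lags = 3, 4

# Penalize roughness along delays only, features only, or both, for a
# coefficient vector of length n_feats * n_lags (feature-major flattening)
reg_delays = np.kron(np.eye(n_feats), second_diff(n_lags))
reg_feats = np.kron(second_diff(n_feats), np.eye(n_lags))
reg_both = reg_delays + reg_feats
```

A `smoothing=('delays' | 'features' | 'both')` flag like the one proposed above would then just select which of these matrices to add to the Gram matrix.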

larsoner (Member) commented Apr 6, 2017

I think @rkmaddox had some function to impose smoothness on features as well as delays. I'll have to bother him about it. In theory you could use different regularizations on different dimensions, I think.

mmagnuski (Member) commented Apr 6, 2017

BTW - there are some minuses missing in the rendered equation in the tutorial (regularization matrix). :)

choldgraf (Contributor) commented Apr 6, 2017

For sure - I'm just thinking of the special case where you have features whose proximity in the matrix has "meaning" to it, and where you want to smooth it. I could imagine this being true for many sensory stimulus features etc. I think this could be a different PR though

larsoner (Member) commented Apr 6, 2017

> I think this could be a different PR though

I can probably do it here. It shouldn't be so bad (I don't think?), and there is no rush to get this in.

choldgraf (Contributor) commented Apr 6, 2017

sounds good - just trying to be respectful of your time / the bounds of the PR :)

@larsoner larsoner added this to the 0.15 milestone Apr 6, 2017

agramfort (Member) commented Apr 7, 2017

choldgraf (Contributor) commented Apr 7, 2017

ahh that's awesome! I love the new plot

Did you basically just noisify everything a bit more? That makes sense as a means of highlighting the value of the smoothing etc.

It also feels more smooth along the y-axis as well...is that a change you made or just a product of the noisifying?

Only comment I'd make is that it would be good to have a short explanation of why the smooth regularization is better at the end. I can add this as a patch if you like, once you tell me how to push to your branch on the other thread :)

larsoner (Member) commented Apr 7, 2017

You can make a PR from any branch on any fork to any branch on any other fork. GitHub defaults to doing user/branch into mne-tools/master because of forking, but in the "create pull request" page you can set the upstream fork to mine and the branch to speed.

choldgraf (Contributor) commented Apr 7, 2017

added a short explanation to the tutorial

tutorials/plot_receptive_field.py

# Below we visualize the model performance of each regularization method
# (ridge vs. quadratic) for different levels of alpha. As you can see, the
# quadratic method performs better in general, because it imposes a smoothness
# constraint along the time dimension of the coefficients. This matches

larsoner (Member) commented Apr 7, 2017

not just time, also features

choldgraf (Contributor) commented Apr 7, 2017

ah ok - I didn't realize you'd implemented both. Is that a hard requirement? Does that assumption hold if, e.g., you had log-spaced features?

choldgraf (Contributor) commented Apr 7, 2017

edit made in any case

larsoner (Member) commented Apr 7, 2017

It should be clear in the TimeDelayingRidge docstring, but no, it's not required. Although yes, it would probably still work for log-spaced freqs due to smoothness of our estimators.

larsoner (Member) commented Apr 7, 2017

smoothness of our TF functions, rather

choldgraf (Contributor) commented Apr 7, 2017

LGTM

@choldgraf choldgraf changed the title from MRG: Use frequency domain for xc/ac to MRG+1: Use frequency domain for xc/ac Apr 7, 2017

mne/decoding/time_delaying_ridge.py

        if fit_intercept:
            # We could do this in the Fourier domain, too, but it should
            # be a bit cleaner numerically do do it here.

agramfort (Member) commented Apr 8, 2017

do do

tutorials/plot_receptive_field.py

scores = np.zeros_like(alphas)
models = []
for ii, alpha in enumerate(alphas):
    estimator = TimeDelayingRidge(tmin, tmax, sfreq, reg_type='quadratic',

agramfort (Member) commented Apr 8, 2017

did you always look at the validity of the results when using ['quadratic', 'ridge'] or vice versa?

larsoner (Member) commented Apr 10, 2017

Each permutation does what you'd expect (no smoothness, smoothness in time, smoothness in freq, or smoothness in both)

agramfort (Member) commented Apr 8, 2017

@Eric89GXL how do you handle the edges in Fourier?

larsoner (Member) commented Apr 10, 2017

> how do you handle the edges in Fourier?

Unlike in the Ridge case they are not removed. I'm not sure if they can be.

In practice I don't think this should make much difference.
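The edge question comes down to "full" versus "valid" overlap: FFT-based correlation naturally produces the full result, where samples near the edges are built on implicit zeros, while a design-matrix approach can crop to only fully overlapping samples. A small illustration with NumPy (illustrative only, not the PR's code):

```python
import numpy as np

x = np.arange(1.0, 9.0)            # 8-sample signal
h = np.array([1.0, 2.0, 3.0])      # 3-tap filter

full = np.convolve(x, h)                 # length 10; edges use implicit zeros
valid = np.convolve(x, h, mode='valid')  # length 6; fully-overlapping samples

# An FFT with enough zero-padding reproduces the 'full' (linear) result;
# cropping len(h) - 1 samples from each end recovers 'valid'
n_fft = len(x) + len(h) - 1
fft_full = np.fft.irfft(np.fft.rfft(x, n_fft) * np.fft.rfft(h, n_fft), n_fft)
```

So the edge samples are not lost in the frequency domain per se; they simply carry the zero-padding assumption, which for long recordings affects a vanishing fraction of samples.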

        self.feature_names = feature_names
        self.sfreq = float(sfreq)
        self.tmin = tmin
        self.tmax = tmax
        self.estimator = 0. if estimator is None else estimator
        if scoring not in _SCORERS.keys():
            raise ValueError('scoring must be one of %s, got'
                             '%s ' % (_SCORERS.keys(), scoring))

larsoner (Member) commented Apr 10, 2017

I was told no type checking or anything should be done in __init__ so I moved this to fit

larsoner (Member) commented Apr 10, 2017

Okay @agramfort comments addressed

choldgraf (Contributor) commented Apr 10, 2017

> Unlike in the Ridge case they are not removed. I'm not sure if they can be.
> In practice I don't think this should make much difference.

Sounds like fodder for a blog post ;-)

    for estimator in (Ridge(alpha=0.), tdr,
                      Ridge(alpha=0., fit_intercept=False), tdr_no):
        x -= np.mean(x, axis=0)
        y -= np.mean(y, axis=0)

agramfort (Member) commented Apr 11, 2017

you should not need to do this. Results should match for Ridge and TimeDelayingRidge, with and without intercept, whether or not the data have been centered.

I am asking because I see slightly different values for intercept_ when I switch base estimators. If you have a bug, you will have a discrepancy that gets worse and worse as y becomes less centered.

choldgraf (Contributor) commented Apr 11, 2017

don't we have tests that compare the results of ridge + TDR? Are we not testing intercept values?

This comment has been minimized.

@larsoner

larsoner Apr 11, 2017

Member

Results should match for Ridge and TimeDelayingRidge with and without intercept whatever if the data have been centered or not.

I don't think the results were the same even for Ridge with fit_intercept=True vs fit_intercept=False, but I'll double-check.

I am asking as I see slightly different values for the intercept_ when I switch base estimators.

I think this might be due to the differences in behavior with cropping to "valid" samples, but I'll try to confirm that, too.
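An illustrative sketch (toy data, not MNE code) of the "valid"-sample cropping mentioned above: with a 3-sample kernel of delays, "full" mode keeps edge samples that lack complete temporal context, while "valid" mode drops them, so estimators that crop differently can disagree at the edges.

```python
import numpy as np

x = np.arange(10, dtype=float)
kernel = np.ones(3) / 3.0          # simple 3-tap moving average

full = np.convolve(x, kernel, mode='full')    # len 10 + 3 - 1 = 12, edges included
valid = np.convolve(x, kernel, mode='valid')  # len 10 - 3 + 1 = 8, edges cropped

# the "valid" output is exactly the interior of the "full" output
print(full.shape, valid.shape)  # (12,) (8,)
```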

@agramfort

Member

agramfort commented Apr 11, 2017

x += 1e3
model.fit(x, y)
if estimator.fit_intercept:
    val, itol, ctol = [-6000, 4000], 20., 1e-1

@larsoner

larsoner Apr 11, 2017

Member

@agramfort these are the tests @choldgraf is talking about

@larsoner

larsoner Apr 11, 2017

Member

i.e., we do check that we are within some percentage of the expected non-zero values.
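An illustration (hypothetical numbers, not the actual MNE test) of checking fitted values against expected non-zero values to within a percentage, using NumPy's relative-tolerance assertion.

```python
import numpy as np

expected = np.array([-6000.0, 4000.0])
fitted = expected * 1.005           # simulate a fit that is off by 0.5 %

# passes: every element is within 1 % of its expected value
np.testing.assert_allclose(fitted, expected, rtol=1e-2)
print("within 1% of expected")
```

The relative tolerance scales with the magnitude of each expected value, which is why the check stays meaningful for large intercepts like these.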

@agramfort

Member

agramfort commented Apr 12, 2017

ok indeed.

please update what's new and merge.

thanks a lot !

@larsoner

Member

larsoner commented Apr 12, 2017

@agramfort updated. Good to go, then?

@larsoner

Member

larsoner commented Apr 20, 2017

ping @agramfort

@larsoner

Member

larsoner commented Apr 20, 2017

I just noticed you said it's good to go, I'll go ahead and merge once the CIs come back happy (renamed "quadratic" as "laplacian" and rebased).

@agramfort

Member

agramfort commented Apr 20, 2017

@choldgraf

Contributor

choldgraf commented Apr 20, 2017

@larsoner larsoner merged commit 8fc2a54 into mne-tools:master Apr 20, 2017

4 checks passed

ci/circleci: Your tests passed on CircleCI!
codecov/project: 86.26% (+0.07%) compared to a2b75f7
continuous-integration/appveyor/pr: AppVeyor build succeeded
continuous-integration/travis-ci/pr: The Travis CI build passed
@larsoner

Member

larsoner commented Apr 20, 2017

Thanks for the extra push @choldgraf

@larsoner larsoner deleted the larsoner:speed branch Apr 20, 2017

@choldgraf

Contributor

choldgraf commented Apr 20, 2017

I've been waiting to interweave some mid-2000s hip hop into github for quite some time now

@jona-sassenhagen

Contributor

jona-sassenhagen commented Aug 10, 2018

@larsoner have you yet experimented with using CUDA here btw? My GPU is currently mostly running idle.

@rkmaddox

@rkmaddox

rkmaddox Aug 10, 2018

Contributor

@jona-sassenhagen In the code that was originally adapted and improved (massively) by @larsoner I had a lazy cuda option where I just imported the FFT from mne.cuda instead of from scipy. It did indeed speed things up, though not more than an order of magnitude compared to n_jobs=10 for the types of data I was doing (~1000 lags from a couple hours of 1-channel EEG @ 10 kHz, though processed in minute-long chunks).
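The speedup being discussed comes from the frequency-domain trick this PR is built on: cross-correlation computed via the FFT matches the direct time-domain method, and the FFT routine itself is the swappable piece (SciPy here; a CUDA FFT would be a drop-in for long signals). A small sketch of that equivalence:

```python
import numpy as np
from scipy.signal import fftconvolve

rng = np.random.RandomState(0)
x, y = rng.randn(1024), rng.randn(1024)

# direct time-domain cross-correlation: O(N^2) sums
direct = np.correlate(x, y, mode='full')

# same result via the FFT (convolution with one input reversed): O(N log N)
via_fft = fftconvolve(x, y[::-1], mode='full')

print(np.allclose(direct, via_fft))  # True
```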

@larsoner

Member

larsoner commented Aug 10, 2018

I think that it should be straightforward to make this change now based on how we handle CUDA vs non-CUDA. @jona-sassenhagen want to give it a shot?

@jona-sassenhagen

Contributor

jona-sassenhagen commented Aug 10, 2018

I would prefer not to! My work with CUDA + FFTs has been one huge disappointment.
And you already forced me to work on the isotrack thing :)

@jona-sassenhagen

Contributor

jona-sassenhagen commented Aug 10, 2018

@rkmaddox an order of magnitude would be a huge improvement though!
