Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem in _smooth_spectrum function #2

Closed
krisvanneste opened this issue Feb 28, 2022 · 11 comments
Closed

Problem in _smooth_spectrum function #2

krisvanneste opened this issue Feb 28, 2022 · 11 comments

Comments

@krisvanneste
Copy link
Collaborator

I think there is a problem in the _smooth_spectrum function. When the spectrum is converted from linear space to log space, the spectrum is first decimated to 50 points (the default number of values generated by np.logspace) before the actual smoothing is applied.

To avoid loosing nformation, it would be much better to generate the freq_log sequence with the smallest log_df spacing in the linear series, as follows:
log_df = np.diff(np.log10(freq[-2:]))[0]
freq_log = 10**(np.arange(np.log10(freq[0]), np.log10(freq[-1]) + log_df, log_df))

When you do that, however, the 'npts' argument doesn't make much sense anymore. It would be better to define the amount of smoothing in terms of bandwidth in log space. The number of points can then be computed from the bandwidth as follows:
npts = max(1, int(round(bandwidth / log_df)))

freq_log = np.logspace(np.log10(freq[0]), np.log10(freq[-1]))

@claudiodsf
Copy link
Member

Hi Kris and thanks for the feedback!

What you propose is very interesting.

I'm going to create a branch so that we can test this idea.

@krisvanneste
Copy link
Collaborator Author

Claudio,
Thanks for looking into this!

Kris

@claudiodsf
Copy link
Member

I'm trying right now with this:

def _smooth_spectrum(spec):
    """Smooth spectrum in a log_freq space."""
    freq = spec.get_freq()
    f = interp1d(freq, spec.data, fill_value='extrapolate')
    _freq_log = np.log10(freq)
    # minimum frequency interval in log10
    log_df = _freq_log[-1] - _freq_log[-2]
    # frequencies in logarithmic spacing
    freq_log = 10**(np.arange(_freq_log[0], _freq_log[-1]+log_df, log_df))
    spec.freq_log = freq_log
    spec.data_log = f(freq_log)
    # number of points for smoothing
    log_bandwidth = _freq_log[-1] - _freq_log[0]
    npts = max(1, int(round(log_bandwidth / log_df)))
    spec.data_log = smooth(spec.data_log, window_len=npts)
    f = interp1d(freq_log, spec.data_log, fill_value='extrapolate')
    spec.data = f(freq)

But npts is of the same order of len(freq_log), which makes the spectrum way over smoothed...

Maybe I misunderstood what you mean for "bandwidth in log space" (log_bandwidth here)?

@krisvanneste
Copy link
Collaborator Author

Claudio,

log_bandwidth should be an argument of the _smooth_spectrum function (replacing npts). I use a default value of 0.05, which seems to give good results, but it should be configurable by the end user.

Kris

claudiodsf added a commit that referenced this issue Feb 28, 2022
@claudiodsf
Copy link
Member

I just pushed the branch https://github.com/SeismicSource/sourcespec/tree/smooth_spectrum

The main difference with your code is that I set the number of points in logspace to be equal to the points of the original spectrum: I found that your approach was generating way too many spectral points, which has a heavy cost for grid search inversion.

Otherwise, I think that introducing this new parameter (which I named spectral_smooth_width_decades) is a clearer approach. 😉

Could you please test this branch and tell me your opinion?

@krisvanneste
Copy link
Collaborator Author

Claudio,

Thanks for updating your code. I can't run sourcespec to evaluate the impact on the magnitude estimation right now, but I will try to evaluate the spectral smoothing by tomorrow, if that's OK.

Kris

@claudiodsf
Copy link
Member

No worry! Take your time!

@claudiodsf
Copy link
Member

Just pushed a new commit 16949b1

Here I resample the smoothed log-spectrum using a sampling rate that is 1/5 of the smooth window size.

In my tests, it is a sufficient number of points to correctly represent the smoothed spectrum.

Let me know what do you think.

@krisvanneste
Copy link
Collaborator Author

Claudio,

I compared your new smoothing function with mine and with your previous one in the attached plot.

With the same bandwidth (smooth_width_decades), the result of the new smoothing function is very close to mine (labeled "robspy" in the plot). With the default value of 0.2, it is close to your original function. Anyway, the new function can be configured much better than the original one, which smoothed a great deal more than you would expect with a smoothing window (npts) of only 5 points.

The only potential problem I see is that in your spec object, the linear spectrum (spec.freq, spec.data) has a different resolution compared to the logarithmic spectrum (spec.freq_log, spd.data_log). I hope this doesn't cause any problems later on in the process.

Kris
spectral_smoothing_comparison

@claudiodsf
Copy link
Member

Hi Kris and thanks for this test!

It looks like I prefer smoothing more than you 😉. So it's good to have a parameter for that!

The linear spectrum is not used anymore, except for plotting. But I think I'll replace spec.data with spec.data_log also for plotting, so that one can visually judge the effect of the downsampling.

I'll do this latest modification later today and then merge the smooth_spectrum branch.

Thanks again!

@krisvanneste
Copy link
Collaborator Author

Claudio,

OK, thanks for addressing this!
And yes, maybe I need to smooth a little more...

Kris

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants