
Different Lomb-Scargle normalizations yield different False Alarm Probabilities #11891

Open
rdungee opened this issue Jun 24, 2021 · 6 comments

Comments


rdungee commented Jun 24, 2021

Description

I'm currently working with light curves collected from a ground-based telescope, and I've been using astropy.timeseries.LombScargle() to extract the rotation period. One of the ways I've been assessing the significance of a detection is the False Alarm Probability (specifically, one calculated by the bootstrap method). What I've found is that the probability I get depends on the normalization I choose for the periodogram. I generally find that the false alarm probability for the 'standard' normalization is 4-5% (absolute) smaller than for the 'psd' normalization, even though everything else is identical. For example, 'standard' yields a probability of 1% where 'psd' yields 5%, or 'standard' yields 25% where 'psd' yields 30%. The discrepancy persists out to n_bootstraps of 100000; I have not tested larger values because of the computation time required.

Expected behavior

I would have expected that the false alarm probability be the same regardless of normalization, especially with all else the same.
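The intuition behind this expectation can be sketched with plain NumPy: the bootstrap FAP is a rank statistic (the fraction of bootstrap maxima that exceed the observed maximum), so any fixed, strictly increasing rescaling of the powers leaves it unchanged. This assumes a change of normalization amounts to such a fixed rescaling; the numbers below are made up and stand in for periodogram maxima:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up stand-ins for bootstrap maximum powers and an observed maximum,
# in arbitrary "psd-like" units.
boot_max = rng.random(10_000) * 5.0
observed = 3.0

def fap(observed_max, bootstrap_maxima):
    # Empirical FAP: fraction of bootstrap maxima exceeding the observed one.
    return np.mean(bootstrap_maxima > observed_max)

# A fixed, strictly increasing rescaling (e.g. a change of normalization
# with a data-independent constant) preserves every ">" comparison,
# so the empirical FAP is identical before and after rescaling.
scale = 1.0 / 123.4
assert fap(observed, boot_max) == fap(observed * scale, boot_max * scale)
```

If the normalization constant were instead recomputed per bootstrap resample, this argument would no longer apply.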

Actual behavior

normalization='standard' consistently yields a different false alarm probability than normalization='psd', even though everything else is identical (same input light curve, same frequency grid, same method, and the same random_seed for the bootstrapping).

Steps to Reproduce

The code below roughly reproduces the issue. The sampling in time differs from my real data, but even for n_bootstraps of 100000 the printed values are 0.075 vs. 0.116.

from astropy.timeseries import LombScargle
import numpy as np

rand = np.random.RandomState(13)
t = 100 * rand.rand(100)
y = 0.6 * np.sin(2 * np.pi * t) + rand.randn(100)

lsc1 = LombScargle(t, y, normalization='standard')
freq, power = lsc1.autopower()
fap1 = lsc1.false_alarm_probability(np.max(power), method='bootstrap',
                                    method_kwds={'random_seed': 42, 'n_bootstraps': 10000})

lsc2 = LombScargle(t, y, normalization='psd')
freq, power = lsc2.autopower()
fap2 = lsc2.false_alarm_probability(np.max(power), method='bootstrap',
                                    method_kwds={'random_seed': 42, 'n_bootstraps': 10000})

print(fap1, fap2)

System Details

Linux-3.10.0-1062.12.1.el7.x86_64-x86_64-with-centos-7.7.1908-Core
Python 3.6.8 (default, Jun 20 2020, 11:12:59)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)]
Numpy 1.19.1
astropy 4.1
Scipy 1.5.2
Matplotlib 3.3.0

@github-actions

Welcome to Astropy 👋 and thank you for your first issue!

A project member will respond to you as soon as possible; in the meantime, please double-check the guidelines for submitting issues and make sure you've provided the requested details.

If you feel that this issue has not been responded to in a timely manner, please leave a comment mentioning our software support engineer @embray, or send a message directly to the development mailing list. If the issue is urgent or sensitive in nature (e.g., a security vulnerability) please send an e-mail directly to the private e-mail feedback@astropy.org.


embray commented Jun 29, 2021

Hi @rdungee thanks for the detailed report. I think lately we've been suffering from a lack of dedicated expertise on astropy.timeseries, so I'm not sure who would be able to comment more deeply on this.

I did see that in the docs there is a note:

Note also that for normalization='psd', the distribution can only be computed for periodograms constructed with errors specified.

which I see you did not provide in your example. I have no idea why this limitation exists, though. Could it have something to do with it?


embray commented Jun 29, 2021

I also see that there's nothing in the code that makes this limitation explicit, and I don't know how universal the limitation is. For example, in the case of method='single' the errors are not explicitly used at all, while some of the other methods do use them explicitly. In the case of method='bootstrap' the relationship seems more complicated, and I'm not sure where the limitation arises from.

Perhaps if nothing else, assuming it's a true limitation of the algorithm, there should be an explicit warning about it in the code...
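One speculative mechanism for where such a discrepancy could arise (this is an illustration, not the actual astropy code path): if the normalization divisor is recomputed from each resampled series — for example its sample variance, which does change under resampling with replacement — then the rescaling is no longer fixed across bootstrap iterations, and the two FAPs can legitimately differ. A minimal NumPy sketch of that mechanism, using a stand-in "power" that is not the real Lomb-Scargle computation:

```python
import numpy as np

rng = np.random.default_rng(42)
y = rng.normal(size=100)

def raw_max_power(values):
    # Stand-in for "maximum unnormalized periodogram power" of a series;
    # the details don't matter, only that it is some raw spectral maximum.
    centered = values - values.mean()
    return np.max(np.abs(np.fft.rfft(centered))[1:]) ** 2

observed_raw = raw_max_power(y)

n_boot = 2000
raw = np.empty(n_boot)
var = np.empty(n_boot)
for i in range(n_boot):
    resample = rng.choice(y, size=y.size, replace=True)
    raw[i] = raw_max_power(resample)
    var[i] = resample.var()  # differs from y.var() under resampling

# "psd-like": compare raw powers directly.
fap_psd = np.mean(raw > observed_raw)

# "standard-like": divide each resample's power by that resample's OWN
# variance.  The divisor now varies from iteration to iteration, so this
# is no longer a fixed monotone rescaling, and the two FAPs can differ.
fap_std = np.mean(raw / var > observed_raw / y.var())

print(fap_psd, fap_std)
```

Whether astropy's bootstrap actually renormalizes per resample would need to be checked in the source; the sketch only shows that a per-resample divisor is sufficient to produce the reported behavior.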


rdungee commented Jun 29, 2021

Thank you for taking a look at this! I missed that note; I'll have to look into it and think about it some more. That said, my real dataset does have error bars for every point, so if it would be helpful I can edit the example to include error bars while still replicating the issue.


embray commented Jul 8, 2021

I guess what I'm wondering is: is there a specific reason to expect that the normalization method shouldn't affect false alarm probabilities? I don't know of any reason to expect that it wouldn't.


giordano commented Aug 17, 2021

For what it's worth, with the Julia package LombScargle.jl I get the same false alarm probability with all normalizations:

julia> using LombScargle, Random

julia> function test(seed::Int = 42)
           @info "Seed for bootstrap: $(seed)"
           Random.seed!(42)
           t = 100 .* rand(100)
           y = 0.6 .* sin.(2 .* pi .* t) .+ randn.()
           for normalization in (:standard, :model, :log, :psd, :Scargle, :HorneBaliunas, :Cumming)
               plan = LombScargle.plan(t, y; normalization)
               periodogram = lombscargle(plan)
               maxpower = findmaxpower(periodogram)
               rng = Random.seed!(seed)
               boot = LombScargle.bootstrap(rng, 10_000, plan)
               println("Normalization: ", lpad(normalization, 13),
                       ", false alarm probability: ", fap(boot, maxpower))
           end
       end
test (generic function with 2 methods)

julia> @time test()
[ Info: Seed for bootstrap: 42
Normalization:      standard, false alarm probability: 0.7194
Normalization:         model, false alarm probability: 0.7194
Normalization:           log, false alarm probability: 0.7194
Normalization:           psd, false alarm probability: 0.7194
Normalization:       Scargle, false alarm probability: 0.7194
Normalization: HorneBaliunas, false alarm probability: 0.7194
Normalization:       Cumming, false alarm probability: 0.7194
 26.857431 seconds (25.08 M allocations: 12.352 GiB, 2.46% gc time)

julia> test.(rand(1:1000, 10)); # try 10 different random seeds just because I can
[ Info: Seed for bootstrap: 886
Normalization:      standard, false alarm probability: 0.7167
Normalization:         model, false alarm probability: 0.7167
Normalization:           log, false alarm probability: 0.7167
Normalization:           psd, false alarm probability: 0.7167
Normalization:       Scargle, false alarm probability: 0.7167
Normalization: HorneBaliunas, false alarm probability: 0.7167
Normalization:       Cumming, false alarm probability: 0.7167
[ Info: Seed for bootstrap: 122
Normalization:      standard, false alarm probability: 0.7186
Normalization:         model, false alarm probability: 0.7186
Normalization:           log, false alarm probability: 0.7186
Normalization:           psd, false alarm probability: 0.7186
Normalization:       Scargle, false alarm probability: 0.7186
Normalization: HorneBaliunas, false alarm probability: 0.7186
Normalization:       Cumming, false alarm probability: 0.7186
[ Info: Seed for bootstrap: 991
Normalization:      standard, false alarm probability: 0.7142
Normalization:         model, false alarm probability: 0.7142
Normalization:           log, false alarm probability: 0.7142
Normalization:           psd, false alarm probability: 0.7142
Normalization:       Scargle, false alarm probability: 0.7142
Normalization: HorneBaliunas, false alarm probability: 0.7142
Normalization:       Cumming, false alarm probability: 0.7142
[ Info: Seed for bootstrap: 240
Normalization:      standard, false alarm probability: 0.7153
Normalization:         model, false alarm probability: 0.7153
Normalization:           log, false alarm probability: 0.7153
Normalization:           psd, false alarm probability: 0.7153
Normalization:       Scargle, false alarm probability: 0.7153
Normalization: HorneBaliunas, false alarm probability: 0.7153
Normalization:       Cumming, false alarm probability: 0.7153
[ Info: Seed for bootstrap: 770
Normalization:      standard, false alarm probability: 0.7183
Normalization:         model, false alarm probability: 0.7183
Normalization:           log, false alarm probability: 0.7183
Normalization:           psd, false alarm probability: 0.7183
Normalization:       Scargle, false alarm probability: 0.7183
Normalization: HorneBaliunas, false alarm probability: 0.7183
Normalization:       Cumming, false alarm probability: 0.7183
[ Info: Seed for bootstrap: 704
Normalization:      standard, false alarm probability: 0.716
Normalization:         model, false alarm probability: 0.716
Normalization:           log, false alarm probability: 0.716
Normalization:           psd, false alarm probability: 0.716
Normalization:       Scargle, false alarm probability: 0.716
Normalization: HorneBaliunas, false alarm probability: 0.716
Normalization:       Cumming, false alarm probability: 0.716
[ Info: Seed for bootstrap: 585
Normalization:      standard, false alarm probability: 0.7104
Normalization:         model, false alarm probability: 0.7104
Normalization:           log, false alarm probability: 0.7104
Normalization:           psd, false alarm probability: 0.7104
Normalization:       Scargle, false alarm probability: 0.7104
Normalization: HorneBaliunas, false alarm probability: 0.7104
Normalization:       Cumming, false alarm probability: 0.7104
[ Info: Seed for bootstrap: 626
Normalization:      standard, false alarm probability: 0.7063
Normalization:         model, false alarm probability: 0.7063
Normalization:           log, false alarm probability: 0.7063
Normalization:           psd, false alarm probability: 0.7063
Normalization:       Scargle, false alarm probability: 0.7063
Normalization: HorneBaliunas, false alarm probability: 0.7063
Normalization:       Cumming, false alarm probability: 0.7063
[ Info: Seed for bootstrap: 287
Normalization:      standard, false alarm probability: 0.7083
Normalization:         model, false alarm probability: 0.7083
Normalization:           log, false alarm probability: 0.7083
Normalization:           psd, false alarm probability: 0.7083
Normalization:       Scargle, false alarm probability: 0.7083
Normalization: HorneBaliunas, false alarm probability: 0.7083
Normalization:       Cumming, false alarm probability: 0.7083
[ Info: Seed for bootstrap: 351
Normalization:      standard, false alarm probability: 0.7147
Normalization:         model, false alarm probability: 0.7147
Normalization:           log, false alarm probability: 0.7147
Normalization:           psd, false alarm probability: 0.7147
Normalization:       Scargle, false alarm probability: 0.7147
Normalization: HorneBaliunas, false alarm probability: 0.7147
Normalization:       Cumming, false alarm probability: 0.7147
