Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LombScargle, different methods different results(fast vs others) #11302

Open
rilesdg3 opened this issue Feb 5, 2021 · 10 comments
Open

LombScargle, different methods different results(fast vs others) #11302

rilesdg3 opened this issue Feb 5, 2021 · 10 comments

Comments

@rilesdg3
Copy link

rilesdg3 commented Feb 5, 2021

Description

In short the LombScargle.autopower(method= 'fast') returns some drastically different results than either the LombScargle.autopower('slow') method and the LombScargle.autopower('cython') method. I understand that fast is doing some estimation to speed things up but the results shouldn't be that different. The charts below illustrate the difference. The blue line is the method = 'fast' and the orange is either method = 'cython', or method = 'slow'. This only seems to be the case for various 'nyquist_factor'. This can be a huge deal when it comes to statistical significance.

chart using LINEAR_14752041data set
img1

chart using tmp data set
img2

EDIT: Embed image in issue.

Expected behavior

pretty close to the same results

Steps to Reproduce

code is below

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from scipy import signal
from astropy.time import Time
from astropy.timeseries import LombScargle


m5_data = pd.read_csv('LINEAR_14752041.csv')
#m5_data =pd.read_csv(tmp')
'''Using 0 to .size for t '''
t = m5_data.iloc[:,0]
t = np.arange(0,t.size,1)
#t = t + np.random.random(t.size)
y = m5_data.iloc[:,1]# - np.mean(m5_data.iloc[:,1])
m5_ls = LombScargle(t, y,)
m5_frequency, m5_power =m5_ls.autopower(nyquist_factor=10)
fap = m5_ls.false_alarm_level(.05, nyquist_factor=10)
print('fap auto', fap)
plt.plot(m5_frequency,m5_power)
plt.show()


m5_frequency, m5_power = m5_ls.autopower(method='cython', nyquist_factor=10)
fap = m5_ls.false_alarm_level(.05, nyquist_factor=10)
print('fap cython', fap)
plt.plot(m5_frequency,m5_power)
plt.show()

System Details

Linux-4.15.0-72-generic-x86_64-with-Ubuntu-18.04-bionic
Python 3.6.9 (default, Nov 7 2019, 10:44:02)
[GCC 8.3.0]
Numpy 1.19.4
astropy 4.1
Scipy 1.5.4
Matplotlib 3.0.2

@github-actions
Copy link

github-actions bot commented Feb 5, 2021

Welcome to Astropy 👋 and thank you for your first issue!

A project member will respond to you as soon as possible; in the meantime, please double-check the guidelines for submitting issues and make sure you've provided the requested details.

If you feel that this issue has not been responded to in a timely manner, please leave a comment mentioning our software support engineer @embray, or send a message directly to the development mailing list. If the issue is urgent or sensitive in nature (e.g., a security vulnerability) please send an e-mail directly to the private e-mail feedback@astropy.org.

@embray
Copy link
Member

embray commented Feb 7, 2021

Maybe @jakevdp has some ideas about this?

@rilesdg3
Copy link
Author

Hello, I just wanted to check and see if any one has any thoughts on this

thank you

@pllim
Copy link
Member

pllim commented Feb 10, 2021

Unfortunately, LS author is no longer in the project and other timeseries maintainers are unavailable right now. You can try to ask in other places listed in https://www.astropy.org/help if you cannot wait. We apologize for any convenience caused.

@jakevdp
Copy link
Contributor

jakevdp commented Feb 10, 2021

There's probably some assumption made by the fast algorithm that is not applicable to your data. The fast implementation follows the algorithm published in Press & Rybicki (1989); you might check there to see if there are some known conditions that your data do not meet.

@rilesdg3
Copy link
Author

rilesdg3 commented Feb 10, 2021

@jakevdp The first chart uses the LINEAR_1475204 dataset from your article "Understanding the Lomb–Scargle Periodogram". The second chart is my personal data set.

Do you know if there is some assumption made by the fast implementation that is not applicable to the LINEAR_1475204?

P.S. Your article and posting of the code and data sets has been immensely helpful in understanding lomb-scargle. I can't thank you enough for posting the code and data.

@embray
Copy link
Member

embray commented Feb 15, 2021

FWIW here is the original article describing the algorithm, and here is the Python implementation of it.

What happens if you pass method='fast', assume_regular_frequency=False?

@rilesdg3
Copy link
Author

rilesdg3 commented Feb 18, 2021

it doesn't seem to make a difference

LSProblem_3

m5_data = pd.read_csv('tmp')
'''Using 0 to .size for t '''
t = m5_data.iloc[:,0]
t = np.arange(0,t.size,1)
y = m5_data.iloc[:,1]
m5_ls = LombScargle(t, y)
m5_frequency = m5_ls.autofrequency( nyquist_factor=10)
m5_power =m5_ls.power(m5_frequency, method = 'fast',assume_regular_frequency=False)
plt.plot(m5_frequency,m5_power)
#plt.show()

ls = LombScargle(t, y,)
frequency, power = ls.autopower(method='cython', nyquist_factor=10)
plt.plot(frequency,power)
plt.show()

@embray
Copy link
Member

embray commented Feb 22, 2021

Ok, to clarify I didn't think assume_regular_frequency=False would fix anything; rather it would have thrown an error if the assumption failed.

@rilesdg3
Copy link
Author

ah ok

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants