Improve normalization and plotting options in LombScarglePeriodogram
#618
We had a productive in-person discussion on normalization involving @ojhall94 @danhey @jsk389 and others. Here is a picture of the whiteboard: @ojhall94 has kindly volunteered to write a paragraph to capture what we learned! Useful references to cite in the Lightkurve docs on this topic include the PhD theses by @jsk389 & @rhandberg and a KASC document written by Bill Chaplin on 2009 Nov 23 (filename: wg1_wgmail02.pdf).
/cc @keatonb
Love it! This does just what I was asking for in the discussion of #570. Just checking: does the power property really return flux_variance / [frequency unit], or is Hz hard-coded here as the frequency unit to be returned?
At the TASOC/TDA meeting earlier this month, I learned that asteroseismologists prefer the following definitions of the words
where Does that sound right? @ojhall94
a996e9a to f99fb21
It is useful to compare the observed frequency-domain periodogram PSD with analytic power spectra from their time-domain counterparts, Gaussian Process covariance kernels. This mapping is exact via Bochner's theorem. Currently lightkurve's PSD and celerité's In other words, if you have a GP kernel
import numpy as np
from celerite import terms
kernel_mat = terms.Matern32Term(log_sigma=np.log(sigma_guess), log_rho=np.log(rho_guess))
# get_psd takes angular frequencies, hence the factor of 2*pi
power_lk = kernel_mat.get_psd(2*np.pi*pg.frequency.value) * 4
I think most lightkurve/celerite users can simply multiply the celerite PSD by 4. If we were really keen on highlighting this connection, we could add some sort of The celerite normalization tutorial is a bit inconsistent; the correct normalization was settled on here: (Addendum: As a sanity check, I have experimentally confirmed that the lightkurve and celerite PSDs match for known inputs only if the scalar factor of 4 shown above is applied.)
Looks reasonable to me. I don't know who works in power directly or what they expect. Basu & Chaplin (Section 5.1.4, though this also advocates calibrating with Parseval's theorem which I'm not convinced about) define the regular power spectrum as power per bin without the extra factor of 2 (i.e., |
lightkurve/periodogram.py
Outdated
@@ -62,17 +70,20 @@ class Periodogram(object):
         Free-form metadata associated with the Periodogram.
     """
     def __init__(self, frequency, power, nyquist=None, label=None,
-                 targetid=None, default_view='frequency', meta={}):
+                 targetid=None, default_view='frequency',
Issue found: Trailing whitespace
Hey all, I think this is an important feature we should push ASAP! Right now, if I call
pg = lc.to_periodogram(normalization='amplitude')
pg.plot()
then the plot says the y-units are in power (not amplitude!). I've had a few confused people ask me about this. The current behaviour is very confusing and we should either roll back our current changes until this is ready or commit this soon.
I agree with Dan! I'm at that point in my thesis where I need to understand this stuff. I've run some tests, and as far as I can tell the following things are true:
Amplitude & Power normalisation:
For an amplitude of A in the time-domain, this produces peaks at
PSD normalisation:
This obeys Parseval's theorem (before division by the frequency spacing), and is as Chaplin+Basu17 suggest for solar-like oscillators. Can somebody (@keatonb, @jsk389?) confirm I'm not going crazy?
@ojhall94 This appears to agree with my understanding. To be clear, A there is the semiamplitude, like the light curve signal is A*sin(omega*t...)
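The two checks discussed above (a sinusoid with semiamplitude A should peak at A in the amplitude spectrum, and the power should satisfy Parseval's theorem before dividing by the frequency spacing) can be reproduced on evenly sampled data with a plain discrete Fourier transform. This is a minimal sketch and not lightkurve's Lomb-Scargle implementation: the helper `dft`, the variable names, and the one-sided power scaling 2|X_k|^2/N^2 per bin are assumptions chosen for illustration.

```python
import cmath
import math


def dft(samples):
    """Plain O(N^2) discrete Fourier transform of a real sequence."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


n = 64                # number of evenly spaced samples
semi_amplitude = 3.0  # A in flux = A * sin(omega * t)
cycles = 5            # whole number of cycles, so the signal sits on one bin

flux = [semi_amplitude * math.sin(2 * math.pi * cycles * t / n) for t in range(n)]
spectrum = dft(flux)

# Amplitude spectrum: 2|X_k|/N for the positive, non-Nyquist bins k = 1 .. N/2 - 1.
amplitude = [2 * abs(spectrum[k]) / n for k in range(1, n // 2)]
peak_amplitude = max(amplitude)

# One-sided power per bin; the Nyquist bin has no negative-frequency twin,
# so it is not doubled.
power = [2 * abs(spectrum[k]) ** 2 / n ** 2 for k in range(1, n // 2)]
power.append(abs(spectrum[n // 2]) ** 2 / n ** 2)

mean_flux = sum(flux) / n
variance = sum((x - mean_flux) ** 2 for x in flux) / n

print(peak_amplitude)        # close to semi_amplitude
print(sum(power), variance)  # the two agree (Parseval's theorem)
```

With this scaling the peak power equals A^2/2, the variance of the sinusoid, so the squared amplitude is twice the per-bin power. That factor of 2 is one of the conventions that differs between references.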
d42af85 to 32105aa
I pushed a new version to this branch which is ready for testing. As explained before, Periodograms now have In addition, I have removed the For example:
# show frequency (microhertz) vs amplitude (ppm)
pg.plot()
# show amplitude vs period
pg.plot(xunit='day', yunit='ppm')
# show power vs frequency
pg.plot(xunit='1/day', yunit='ppm2')
# show psd vs frequency
pg.plot(xunit='microhertz', yunit='ppm**2/microhertz')
I made
@danhey @ojhall94 @keatonb Would you be willing to check out this branch and see if this works for you? The changes I made to Something like this should work to try this PR:
git remote add geert https://github.com/barentsen/lightkurve.git
git fetch geert
git checkout add-amplitude
python setup.py develop
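The `xunit` values in the plotting examples above switch the x-axis between microhertz, cycles per day, and period in days. As a rough illustration of the arithmetic behind such a switch, here is a small standalone sketch. The helper names are hypothetical and are not part of lightkurve, and this says nothing about how the branch performs the conversion internally.

```python
SECONDS_PER_DAY = 86400.0


def microhertz_to_per_day(frequency_uhz):
    """Convert a frequency from microhertz to cycles per day."""
    return frequency_uhz * 1e-6 * SECONDS_PER_DAY


def microhertz_to_period_days(frequency_uhz):
    """Convert a frequency in microhertz to the matching period in days."""
    return 1.0 / microhertz_to_per_day(frequency_uhz)


frequency_uhz = 100.0
print(microhertz_to_per_day(frequency_uhz))      # about 8.64 cycles per day
print(microhertz_to_period_days(frequency_uhz))  # about 0.1157 days
```

Only the x-axis is covered here. Converting the y-axis between amplitude, power, and PSD depends on the normalization convention under discussion in this thread.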
Hi Geert, awesome work! I can confirm that this works and I have played around with a few test systems. My main concern is that Also, how can we just plot And this is a weird bug, but it seems that when I import your branch of lightkurve then I can no longer render plots in Jupyter. Plots don't appear and calling |
@danhey Thanks for the feedback! I didn't pay attention to maintaining the existing defaults. I've changed it so you now get Your Jupyter problem is surprising. Does using Please keep playing with this branch! It's a big change so I'd like for all of us to use it for a week. |
Hey all, I've had a play around with it, it works very smoothly! I have some feedback & questions:
I'm not sure what the [normalized] units represent in this case, because the amplitude output in |
I second @ojhall94's comment about keeping a Can we keep the output of the periodogram in Ollie also raised a good point about breaking existing code which relies on |
I've just been playing around with going from a periodogram to seismology--- the |
Thank you for the feedback both! This all seems sensible. I'm hoping most of it can be resolved using small changes, but it may take me some time to get to it.
I've played around with @barentsen's changes and have implemented my thoughts. You can see it in action here: https://gist.github.com/danhey/53f1374ce87bf21cbaa5a8166292136c There are a few options we can take here:
Anyway, I'm sure there are other options so please chime in and I'll try to implement some of these to play around with! |
Thanks @danhey! I like the behavior demonstrated in the I would be happy to merge your PR into this PR! Any objections? (Footnote for others: this is a bit complicated, but, over on barentsen#4, @danhey posted a PR against the branch on which this PR is based, i.e. proposing to change this PR. This is actually a great way to propose changes to a PR. Dan is showing off some pro GitHub skills here!) |
reintroduce normalization, move things to power.
psd and amplitude properties to LombScarglePeriodogram
LombScarglePeriodogram
I just took a look and found some strange behaviors from trying to guess at an intuitive way to compute power rather than amplitude... (edited to remove an example from my original comment, which I realize comes from "normalization" changing the default oversampling, which I'm used to specifying explicitly)
I could even generate periodograms with yunits of ppt^10, but they were just amplitude scaled way up. Also, why not just call these The documentation for I don't mind keeping the Overall, I think it would be best if both LightCurve and Periodogram objects could be defined and manipulated with similar-looking methods, so whatever users get used to is applicable to both (#657 (comment)). The LightCurve instantiation takes |
Thanks for taking a look @keatonb !
That wasn't part of my code but I believe it's because
I understand your point here -- and that is the most pythonic solution. The problem being that everyone who uses
I agree with this. An issue I found yesterday while testing this PR is that the periodogram will completely ignore the units of the input light curve! I am also not sure how to get the units of the input light curve time. It looks like they're stripped out?
Hi everyone! I've had a play around with it. This is definitely more intuitive and accessible. In its current format, with the Thanks so much for working on this for us, Geert! Some further comments:
@ojhall94, I don't think I fixed this problem previously, but I do recall saying that I think this is the expected behavior. I would say that either one of the psds above (with or without min/max set) is equally valid and useful to analyze. You're right that we see the bigger difference in the psd rather than amplitude because of the sparser frequency sampling, but there should be differences in both if you look closely. This is one reason that I prefer to oversample when inspecting power spectra.
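The effect of frequency sampling on peak height can be seen with a sinusoid whose frequency falls between two critically sampled bins. The sketch below evaluates a plain Fourier sum on evenly sampled data rather than a Lomb-Scargle periodogram, and the function names and `oversample_factor` argument are hypothetical, chosen only to mirror the idea of oversampling.

```python
import cmath
import math


def amplitude_at(flux, cycles_per_baseline):
    """Amplitude 2|X(f)|/N at a frequency given in cycles per baseline."""
    n = len(flux)
    total = sum(
        x * cmath.exp(-2j * math.pi * cycles_per_baseline * t / n)
        for t, x in enumerate(flux)
    )
    return 2 * abs(total) / n


n = 64
semi_amplitude = 1.0
true_cycles = 5.5  # halfway between two critically sampled bins

flux = [semi_amplitude * math.sin(2 * math.pi * true_cycles * t / n) for t in range(n)]


def peak_amplitude(oversample_factor):
    """Highest amplitude on a grid with `oversample_factor` points per bin."""
    steps = (n // 2) * oversample_factor
    grid = [k / oversample_factor for k in range(1, steps)]
    return max(amplitude_at(flux, f) for f in grid)


peak_critical = peak_amplitude(1)      # noticeably below semi_amplitude
peak_oversampled = peak_amplitude(10)  # close to semi_amplitude

print(peak_critical, peak_oversampled)
```

The critically sampled grid misses the true frequency, so its highest point underestimates the semiamplitude, while the oversampled grid includes 5.5 cycles per baseline and recovers it. Since power goes as the square of amplitude, the same shortfall looks larger in a power or PSD plot.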
Thank you all for the thoughtful comments! Would it be correct to say that most concerns could be addressed if we made the following changes:
Before doing so, I would love to think more about @danhey's suggestion to have separate |
ping @danhey 👋
Hi all, sorry it's taking me so long to get around to this. I've pushed some very small fixes to the current implementation which should get it mostly working. At this point, I consider new additions to be secondary to fixing the current bug which displays amplitude as power (as of v1.10). My idea for subclassing would be having AmplitudeSpectrum and PowerSpectralDensity classes. The functionality would still be the same, but it would let us support fitting things like Lorentzians for PSDs in the future. On top of this, I would like to add a normalization option for the window function.
@danhey That sounds like it may be an effective solution. Please ping me if you need help!
Based on discussions with @danhey and @ojhall94 at the TASOC workshop this week, the idea has emerged to add amplitude and psd properties to LombScarglePeriodogram to enable anyone to use the normalization they prefer.
Example
TODO
normalization parameter.
pg.plot() knows which normalization to use for its Y axis.
pg.amplitude values match a sine with a known amplitude.