Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds Power Spectrum example for a timeseries #2496

Merged
merged 23 commits into from
Apr 13, 2018

Conversation

yashkgp
Copy link
Contributor

@yashkgp yashkgp commented Mar 2, 2018

Issue #2483
Here is an example :

import sunpy.timeseries
import sunpy.data.sample
import matplotlib.pyplot as plt
import pandas as pd 
from scipy import signal
ts = sunpy.timeseries.TimeSeries(sunpy.data.sample.GOES_XRS_TIMESERIES)
k,l=ts.power_spectra()
plt.semilogy(k[0], l[0])
plt.show()

image

Apart from adding the unit tests,is anything else needed for this method?

@pep8speaks
Copy link

pep8speaks commented Mar 2, 2018

Hello @yashkgp! Thanks for updating the PR.

Line 12:32: W291 trailing whitespace
Line 21:79: W291 trailing whitespace
Line 22:11: W291 trailing whitespace
Line 27:65: W291 trailing whitespace
Line 29:79: W291 trailing whitespace

Line 138:9: E265 block comment should start with '# '
Line 139:9: E265 block comment should start with '# '
Line 184:13: E221 multiple spaces before operator
Line 216:13: E221 multiple spaces before operator
Line 217:13: E221 multiple spaces before operator
Line 246:101: E501 line too long (138 > 100 characters)
Line 277:16: E221 multiple spaces before operator
Line 281:16: E221 multiple spaces before operator
Line 318:101: E501 line too long (107 > 100 characters)
Line 324:101: E501 line too long (127 > 100 characters)
Line 435:101: E501 line too long (113 > 100 characters)
Line 507:33: E231 missing whitespace after ':'
Line 604:101: E501 line too long (110 > 100 characters)

Comment last updated on April 12, 2018 at 11:42 Hours UTC

Copy link
Member

@dstansby dstansby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the timeseries data isn't on continuously spaced timestamps?

data_ = self.data[col]
freq, spectra = signal.periodogram(data_, fs)
Freqs.append(freq)
Spectra.append(spectra)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from functools import partial
periodogram_fs = partial(signal.periodogram, fs=fs)
Freqs, Spectra = zip(*map(periodogram_fs, self.data))

This will remove all the loop, it will produce tuples instead of lists though.


Returns
-------
Freqs : array of arrays
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are not returning array but a list of arrays.


Parameter
----------
fs : `float, optional`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional should be outside the backtick.

@nabobalis
Copy link
Contributor

Hey @yashkgp, sorry for the late reply.

We aren't sure we want to add methods on to time series right now. It sparked quite a discussion about methods in various parts of SunPy. That shake the foundation of SunPy itself.

What we want from this now is if you could turn it into an example we could add to our Gallery? If you have a look at the other examples here for how we format them!

@yashkgp
Copy link
Contributor Author

yashkgp commented Mar 7, 2018

Sure @nabobalis Not a Problem .

@yashkgp yashkgp changed the title Implements Power Spectrum for a TimeSeries Object Adds Power Spectrum example for a timeseries Mar 10, 2018
@yashkgp
Copy link
Contributor Author

yashkgp commented Mar 12, 2018

Hey @nabobalis Could you give it a review?

Copy link
Contributor

@nabobalis nabobalis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there needs to be more explanations and warnings for people who read the example.

@@ -0,0 +1,43 @@
"""
===============================
POWER SPECTRUM OF A TIMESERIES
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not all caps, just normal case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

POWER SPECTRUM OF A TIMESERIES
===============================

An example showing how to estimate power spectrum of a TimeSeries
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a the should between estimate and power. Full stop at the end.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

###############################################################################
# Let's first load a TimeSeries from sample data

ts = sunpy.timeseries.TimeSeries(GOES_XRS_TIMESERIES)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to describe the data you will working on to give whoever reads this an idea of what the input is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a brief description about the input data now . Is it okay ?

ts = sunpy.timeseries.TimeSeries(GOES_XRS_TIMESERIES)

###############################################################################
# We now use scipy's periodogram to estimate the power spectra of the first
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You want to explain why you use the periodogram instead of say the FFT. Also we probably want to mention the limitations or requirements of using these methods. We would like people not to assume they can use these methods on any signal.

I think there is one in Astropy as well now. But I am unsure of what version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah , just discovered that . Astropy's Periodogram has some amazing functionalities . We should probably switch over to Astropy's Periodogram .

# column of the Timeseries

cols=ts.columns
freq, spectra = signal.periodogram(ts.data[cols[0]], fs=1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is fs=1 correct for the data?

Also I wonder if we could use the false alarm probability in this example as well.

Copy link
Contributor Author

@yashkgp yashkgp Mar 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feature is covered in Astropy's periodogram . Should we switch over to Astropy's periodogram ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it works out regarding the version of Astropy it is supported in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah , it's there in Astropy 3.0

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well we still have to support Astropy 2 until our next release.





Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All these extra lines need to go.


plt.semilogy(freq, spectra)
plt.title('Power Spectrum of {}'.format(cols[0]))
plt.ylabel('Power Spectral Density')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing the unit

@yashkgp
Copy link
Contributor Author

yashkgp commented Mar 12, 2018

Hey @nabobalis , Just discovered that Astropy's Periodogram is much more powerful than scipy's Periodogram . They have full fledged tutorial dedicated to Periodogram http://docs.astropy.org/en/stable/stats/lombscargle.html#.It has various functionalities like frequency grid spacing, false alarm probabilities, normalization etc.We should definitely switch over to Astrpoy's peiodogram . What do you say @nabobalis ?

# We now use scipy's periodogram to estimate the power spectra of the first
# column of the Timeseries

cols=ts.columns
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since probably these examples need to be somehow meaningful I would do the following instead of using a variable called cols:

xrays_short = ts.columns[0]

and change cols[0] for xrays_short everywhere. That makes the code more readable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done .

@yashkgp
Copy link
Contributor Author

yashkgp commented Mar 14, 2018

Hey @nabobalis @dpshelio , I tried switching to Astropy's version of periodogram (yeah, its supported in Astropy 2 ) but its throwing a ValueError

TypeError: ufunc true_divide cannot use operands with types dtype('float64') and dtype('<m8[ns]')

It seems like Astropy's periodogram does accept Pandas Timeseries object . One possible solution might to use Timedelta and convert dtype('<m8[ns]') into float . Is there any easy way to solve this problem ?

@nabobalis
Copy link
Contributor

If you can't get the astropy one to work, It would just be worth noting that its available and linking to it in the example.

ts = sunpy.timeseries.TimeSeries(GOES_XRS_TIMESERIES)
# Data is taken from GOES X-Ray Sensor(XRS), which observes full-disk-integrated
# solar flux in two broadband channels: 1--8 angstrom (long); and
# 0.5--4 angstrom (short)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be moved above the python code.

But what I wanted here was not what is GOES, it was can we use signal analysis on this data. Is it evenly spaced in time, whats its time step, are there any problems that might mean that signal analysis would be tricky etc. We don't want to give the impression you can just singal analysis any signal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Data is not completely evenly sampled,it has 2 outliers .Should I mention this?


###############################################################################
# We now use scipy's periodogram to estimate the power spectra of the first
# column of the Timeseries
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make a link to the astropy version here, saying it's available instead.

cols=ts.columns
freq, spectra = signal.periodogram(ts.data[cols[0]], fs=1)
xray_short = ts.columns[0]
freq, spectra = signal.periodogram(ts.data[xray_short], fs=1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've looked in detail into the data and I don't believe this is correct. fs=1 means the cadence of the data in Hz, the data seems to not be completely evenly sampled!!

>>> sample = np.array([(x - y).total_seconds() for x, y in zip(ts.index[1:], ts.index)])
>>> sample.min(), sample.max(), sample.mean(), sample.std()
  (2.0430000000000001, 14.337000000000002, 2.0485031771623676, 0.077912702228747382)

I don't know how there's a 14.33 seconds gap!!! If you plot that you see it happens twice! This means that this analysis would be wrong all together!

In case the cadence would be 2.043, then I think the fs should be 1/2.043 s = 0.48 Hz

Copy link
Contributor Author

@yashkgp yashkgp Mar 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I clean the data by removing those 2 outliers ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You cannot clean that by removing them... you would have bigger gaps! The only way to "clean" it is to resample the data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I just change fs to 0.48 and notify the users about the irregularities in data or should I change the input data ?

plt.title('Power Spectrum of {}'.format(cols[0]))
plt.ylabel('Power Spectral Density')
plt.title('Power Spectrum of {}'.format(xray_short))
plt.ylabel('Power Spectral Density(({})**2/Hz)'.format(ts.units[xray_short]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use units here as:

import astropy.units as u
plt.ylabel('Power Spectral Density ({:LaTeX})'.format(ts.units[xray_short] ** 2 / u.Hz))

Also, put a space between the Density and the ()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe square brackets for the units are more common: [Hz] instead of (Hz)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is much cleaner.

plt.title('Power Spectrum of {}'.format(cols[0]))
plt.ylabel('Power Spectral Density')
plt.title('Power Spectrum of {}'.format(xray_short))
plt.ylabel('Power Spectral Density(({})**2/Hz)'.format(ts.units[xray_short]))
plt.xlabel('Frequency(Hz)')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also space between the title and the units.

@nabobalis nabobalis added this to the 0.9 milestone Mar 16, 2018
@yashkgp
Copy link
Contributor Author

yashkgp commented Mar 19, 2018

Hey @nabobalis @dpshelio , I have changed the input data, as the previous one was not evenly sampled . I have switched to RHESSI_TIMESERIES , which is a evenly sampled timeseries with a time step of 4 sec. Could you please give it a review?

import matplotlib.pyplot as plt
import pandas as pd
from scipy import signal

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove this blank line.


###############################################################################
# Let's first load a TimeSeries from sample data.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this line.

ts = sunpy.timeseries.TimeSeries(RHESSI_TIMESERIES)

###############################################################################
# We now use scipy's periodogram to estimate the power spectra of the first
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SciPy, also what is the first column?

###############################################################################
# We now use scipy's periodogram to estimate the power spectra of the first
# column of the Timeseries. Astropy's Lomb-Scargle Periodograms can be used
# instead of scipy's periodogram.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SciPy


###############################################################################
# We now use scipy's periodogram to estimate the power spectra of the first
# column of the Timeseries. Astropy's Lomb-Scargle Periodograms can be used
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Link to Astropy's documentation for it.

x_ray = ts.columns[0]
# The suitable value for fs would be 0.25 Hz as the time step is 4 s.
freq, spectra = signal.periodogram(ts.data[xray_short], fs=0.25)
###############################################################################
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blank line here above the ####...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


x_ray = ts.columns[0]
# The suitable value for fs would be 0.25 Hz as the time step is 4 s.
freq, spectra = signal.periodogram(ts.data[xray_short], fs=0.25)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

x_ray

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @nabobalis ,had updated in my local branch , but somehow forgot to push it.

# Plot the power spectrum

plt.semilogy(freq, spectra)
plt.title('Power Spectrum of {}'.format(xray_short))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

x_ray

# Let's first load a TimeSeries from sample data.
# Input data contains 9 columns, which are evenly sampled with a time
# step of 4 seconds. So, the data is suitable for signal analysis.
ts = sunpy.timeseries.TimeSeries(RHESSI_TIMESERIES)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line break above here just to be consistent.

from sunpy.data.sample import RHESSI_TIMESERIES

###############################################################################
# Let's first load a TimeSeries from sample data.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets first load a RHESSI TimeSeries from SunPy's sample data


###############################################################################
# Let's first load a TimeSeries from sample data.
# Input data contains 9 columns, which are evenly sampled with a time
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This data contains

###############################################################################
# Let's first load a TimeSeries from sample data.
# Input data contains 9 columns, which are evenly sampled with a time
# step of 4 seconds. So, the data is suitable for signal analysis.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the last sentence.

ts = sunpy.timeseries.TimeSeries(RHESSI_TIMESERIES)

###############################################################################
# We now use SciPy's periodogram to estimate the power spectra of the first
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that ~scipy.signal.lombscargle should work as a link to the documentation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh! Thats quite useful. Thanks a lot @nabobalis

# We now use SciPy's periodogram to estimate the power spectra of the first
# column of the Timeseries. The first column contains X-Ray emmisions in the range
# of 3-6 keV. Astropy's Lomb-Scargle Periodograms
# (http://docs.astropy.org/en/v2.0.5/stats/lombscargle.html) can be used
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try ~astropy.stats.LombScargle instead. Sphinx should link to it since we have that setup.

I would rephrase the last sentence as well. An alterantive version is .... or something.

# Start by importing the necessary modules.

import matplotlib.pyplot as plt
import pandas as pd
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pandas is not used

ts = sunpy.timeseries.TimeSeries(RHESSI_TIMESERIES)

###############################################################################
# We now use SciPy's periodogram(~scipy.signal.lombscargle) to estimate the power
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait... signal.periodogram is the same than signal.lombscargle? If so, then you could have used the previous series that was not evenly spaced.

Copy link
Contributor Author

@yashkgp yashkgp Mar 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dpshelio Its not the same , I'll just change it in the next commit .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @dpshelio , I am not that well acquainted with lomb scargle periodogram . Is there any advantage of using signal.lombscargle over signal.periodogram?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are not the same, that should point to scipy.signal.periodogram instead.

@nabobalis
Copy link
Contributor

Odd how the sphinx interlink hasn't worked. I wonder what is wrong.

ts = sunpy.timeseries.TimeSeries(RHESSI_TIMESERIES)

###############################################################################
# We now use SciPy's periodogram(~scipy.signal.periodogram) to estimate the power
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There needs to be ` on each side for these links to work.

# We now use SciPy's periodogram(~scipy.signal.periodogram) to estimate the power
# spectra of the first column of the Timeseries. The first column contains X-Ray
# emmisions in the range of 3-6 keV. An alterantive version is Astropy's
# Lomb-Scargle Periodograms(~astropy.stats.LombScargle).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There needs to be ` on each side for these links to work.

Copy link
Contributor

@nabobalis nabobalis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the backticks, it looks good to me.

@yashkgp
Copy link
Contributor Author

yashkgp commented Apr 10, 2018

Sorry for the late update , had my exams going on.

ts = sunpy.timeseries.TimeSeries(RHESSI_TIMESERIES)

###############################################################################
# We now use SciPy's periodogram(`~scipy.signal.periodogram`) to estimate the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the final result, i think we want this to be SciPy's ~scipy.signal.periodogram

# We now use SciPy's periodogram(`~scipy.signal.periodogram`) to estimate the
# power spectra of the first column of the Timeseries. The first column contains
# X-Ray emmisions in the range of 3-6 keV. An alterantive version is Astropy's
# Lomb-Scargle Periodograms(`~astropy.stats.LombScargle`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar here. Astropy's ~astropy.stats.LombScargle.

Copy link
Contributor

@nabobalis nabobalis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need two minor changes

@nabobalis nabobalis merged commit ae70bda into sunpy:master Apr 13, 2018
@nabobalis
Copy link
Contributor

Thanks @yashkgp!

@yashkgp yashkgp deleted the power_spectra branch April 16, 2018 10:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants