Added new astropy.timeseries sub-package #8540
This pull request adds a new astropy.timeseries sub-package to the core package. The aim of this sub-package is to provide core classes for representing time series, using QTable as a foundation. The general design of this sub-package has been discussed in APE9 which is not yet accepted, but the hope is to try and bring this to a conclusion before the 3.2 release.
Those of you who have used Table/QTable with mix-in columns will know that it's already possible to represent time series with a Table and a mix-in Time column. This PR just provides a sub-class of QTable that does this automatically so that this becomes more user-friendly - but fundamentally TimeSeries still has all the flexibility of Table/QTable.
There are two main classes here, TimeSeries and BinnedTimeSeries - there is a detailed discussion on this in APE9, but the short version is that in most cases people just have some quantity vs time and TimeSeries is sufficient for this, but there are cases where being explicit about the binning is important (e.g. with binned X-ray time series).
Note that we are subclassing from QTable instead of Table, since QTable is the way we would have designed the table class from the start if we could. Since this is a new class, we have the opportunity to introduce the 'sensible' behavior from the start, namely that columns with units are quantities.
You can find a preview of the documentation at http://astropy-timeseries.readthedocs.org
I think it would be nice to aim to merge this before the 3.2 feature freeze. I am opening this well in advance of the feature freeze to allow people to have time to play with it. I recommend trying out the standalone astropy-timeseries package. If you would like to make improvements, you are welcome to make pull requests to https://github.com/aperiosoftware/astropy-timeseries as I have a script to easily convert that package into the branch for this pull request. None of the API is set in stone, so we can still discuss/work on it!
In any case, feel free to start reviewing this now!
Here are some things that I think we should do, though preferably in a separate pull request
…r than start/end time
… for Getting Started
* Added entry in stability page of docs
* Add documentation about missing values in the pandas conversion
* Added Kepler/TESS reader to the API docs
* Use real files to demonstrate I/O
* Clarify example of .loc
* Added a note about duplicates and sorting with vstack
* Fix example on analysis page
* Clean up formatting and docstrings
* Improve example in docs to include units
Thanks everyone for your reviews!
I've done my best to try and address everything, and where I couldn't address it now I've opened new issues and linked to them. I've left a few comments as unresolved above so you can see the answer, but if you are happy with it please mark them as resolved.
Regarding the inclusion of the Kepler/TESS readers - I think that in cases where the format is well defined and documented and versioned, it's ok to have readers/writers in the core package? (after all, for io.ascii we have e.g. the IPAC reader which is reasonably specific but well documented). What I wouldn't want to see in the core package are formats that aren't actually documented. Also since Kepler/TESS time series are quite common at the moment, having these readers might entice more users to start using this subpackage. Does that seem reasonable?
Regarding the astropy-benchmarks - yes we should add some there, but that can wait until after feature freeze. It's also too soon indeed for performance tips I think.
Regarding the periodograms and moving them over to astropy.timeseries, I'll try and do this tomorrow in a separate PR.
@taldcroft - thanks for your detailed comments! Here are some thoughts below:
I see your point, but I don't really like that idea of being able to have a TimeSeries object without a time column for quite a while until eventually e.g. .fold() or another time series-specific method is used. The issue is that since this is mostly a container class with not much functionality, many callers will actually be third-party functions which will be expecting the time column to be there (and we can't decorate them). At the same time, you are right that currently one can e.g. rename or remove the time column without any warning/error, so this is clearly not ideal. My plan is to try and investigate a little more how we could get around this, but I might run out of time for the 3.2 feature freeze. However, arguably the current behavior of being able to rename or remove the time column is a bug, so I believe this is something I could address in a bug fix release in the worst case.
I've gone back and forth on this and I do think dropping down to QTable if the user does e.g.
I can also see the benefit of this, but creating a time series would then be a two-step operation for users instead of a one-step operation. The class method
Likewise, the vector column approach for binned time series is an elegant use of the Time/Table capabilities, but I worry that it's going to also end up being quite confusing for users. It's also not clear how indexing would work in this case.
Having said that, I'm totally on board with being able to use things like
At this point, I'll track down the last failures in CI, but otherwise @pllim and @eteq feel free to take another look and leave more comments if necessary or approve if you are happy with it. I'll be working on moving the periodograms in a separate PR.
A few quick follow-ons from above:
I sympathize with @taldcroft's suggestion here. But I'm not sure I'm a fan of doing it "lazy", because it feels as likely to end up inconsistent as the approach @astrofrog is using for un-decorated methods. Perhaps a safer thing to do is instead centralize all the checking in one method and call it on demand as @taldcroft suggests in the lazy way as well as at the end of the
    t = Table.read('junk.ecsv')
    t.rename_column('col2', 'time')
    ts = TimeSeries(t)
Because then the
Relatedly I also favor
I really like @taldcroft's idea of the
Ok so I've now cleaned up the checking significantly in e7c1807 - I've added a mechanism to decorate any method that adds/removes columns, and I check for consistency after the method is called. There are a few corner cases where we need to relax or disable the checks - for example when using vstack, an empty time series is temporarily created, so we need to account for that. What I've written up now deals with all these corner cases and I've made sure they are well tested so I think things are a lot more robust now.
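For readers who want a picture of the general idea, here is a minimal pure-Python sketch of "decorate the column-modifying methods and check afterwards". It is not the code from e7c1807: `MiniSeries`, `check_after` and `_check_required_columns` are hypothetical names invented for this illustration, and the real implementation works on table columns rather than a dict.

```python
from functools import wraps


def check_after(method):
    """Run the consistency check once the wrapped method has finished."""
    @wraps(method)
    def wrapper(self, *args, **kwargs):
        result = method(self, *args, **kwargs)
        self._check_required_columns()
        return result
    return wrapper


class MiniSeries:
    """Hypothetical stand-in for a table whose first column must be 'time'."""

    _required = "time"

    def __init__(self, columns=None):
        self.columns = dict(columns or {})
        self._check_required_columns()

    def _check_required_columns(self):
        # Relaxed case: an empty container is allowed, because helpers
        # (stacking, for example) may create one temporarily.
        if not self.columns:
            return
        first = next(iter(self.columns))
        if first != self._required:
            raise ValueError(
                f"expected '{self._required}' as the first column, found '{first}'"
            )

    @check_after
    def add_column(self, name, values):
        self.columns[name] = values

    @check_after
    def remove_column(self, name):
        del self.columns[name]

    @check_after
    def rename_column(self, old, new):
        # Rebuild the dict so the column order is preserved
        self.columns = {(new if k == old else k): v
                        for k, v in self.columns.items()}


ts = MiniSeries({"time": [0, 1, 2]})
ts.add_column("flux", [1.0, 2.0, 1.5])

try:
    ts.rename_column("time", "t")  # breaks the invariant
    rename_failed = False
except ValueError:
    rename_failed = True
```

Because the check runs after the method body, the object has already been modified when the error is raised; a more careful version could validate before committing the change.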
I definitely won't have time for this PR
Should be fixed now
No I don't think so. Maybe open an issue for astropy.time if you think this would be important, but here 'format' refers not so much to the precision but the type of string or float representation.