Description
The first phase of this project is focused on building a prototype that lays out the basic implementation of the various modules/connector pieces and provides an initial API for time-domain style work. This issue tracks the status of the project relative to the MVP goals. In general, we would like to have a working package that can execute the following workflow:
from nested_pandas import NestedFrame, read_parquet
from nested_pandas.utils import calc_nobs
# Read in parquet data
nf = read_parquet(
    data="objects.parquet",
    to_pack={"dia": "dia_sources.parquet", "forced": "forced_sources.parquet"},  # auto-packs these source files
)
# pre-filter on base columns
nf = nf.query("hostgal_photoz > 3.0")
# calculate nobs for timeseries and use it to filter objects
nf = calc_nobs(nf, "dia", by_band=True)  # calculates per-band observation counts for the dia struct
# ^ may be better in a separate library of apply utilities, to keep the core API minimal (see the pandas sketch after this workflow)
nf = nf.query("dia_nobs_g > 50")
# drop NaNs from the timeseries struct
nf = nf.dropna(subset=["timeseries.flux", "timeseries.mjd"]) # passes the dropna command along to the timeseries struct
# there's ambiguity on what this should return
# select a lightcurve
nf.loc["id"]["dia"]
# Apply a function
nf_r = nf.query("timeseries.band == 'r'")
from light_curve import Periodogram
periodogram = Periodogram()
r_pdgm = nf_r.batch(periodogram, "timeseries.mjd", "timeseries.flux", "timeseries.band") # or apply?
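
To make the expected contract of batch (or apply) concrete, here is a minimal sketch of a plain user-defined reduce function, assuming the batch(func, *columns) signature proposed above; the function name and behaviour are purely illustrative, not a settled API:

import numpy as np

def amplitude(mjd, flux):
    # hypothetical reduce function: peak-to-peak flux of one object's light curve
    return np.max(flux) - np.min(flux)

# assumed usage, mirroring the Periodogram call above
amp = nf_r.batch(amplitude, "timeseries.mjd", "timeseries.flux")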
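
For reference on the calc_nobs step, a rough pandas-only sketch of what the by_band=True case could compute; the helper name, the column naming, and the assumption that the source table is indexed by object id are illustrative, not the settled implementation:

import pandas as pd

def calc_nobs_sketch(objects: pd.DataFrame, sources: pd.DataFrame, prefix: str = "dia") -> pd.DataFrame:
    # count sources per object and band, then join the counts back onto the
    # object table as base columns named like `dia_nobs_g`, `dia_nobs_r`, ...
    counts = (
        sources.groupby([sources.index, "band"])
        .size()
        .unstack("band", fill_value=0)
        .add_prefix(f"{prefix}_nobs_")
    )
    return objects.join(counts)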
I will open issues for any blockers to this goal; those issues will link back to this ticket.