# Convert Nested Lightcurves to Single Observations with `explode`

In this tutorial, we demonstrate how to use the `explode` method from `nested_pandas` to convert nested lightcurves into a table of single observations. This is particularly useful for time-series analysis where each observation needs to be treated individually. As with other pandas operations, `explode` can be applied to `lsdb` catalogs using the `map_partitions` method.
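
To make the transformation concrete, here is a minimal plain-Python sketch of the idea behind exploding a nested column. It is a conceptual stand-in, not the nested-pandas implementation: the helper `explode_rows` is hypothetical, the field names only mirror the ZTF columns used in this tutorial, and the values are made up for illustration. It also simply skips objects with an empty lightcurve, which may differ from what the library does.

```python
# Conceptual sketch only: plain Python standing in for what `explode` does.
# `explode_rows` is a hypothetical helper, and the values below are made up.

def explode_rows(objects, nested_key):
    """Yield one flat row per nested observation, repeating object-level fields."""
    for obj in objects:
        # Object-level fields are everything except the nested column.
        base = {k: v for k, v in obj.items() if k != nested_key}
        for obs in obj[nested_key]:
            # Merge the repeated object-level fields with one observation.
            yield {**base, **obs}


nested = [
    {"objectid": 1, "filterid": 2,
     "lightcurve": [{"hmjd": 58761.4, "mag": 20.7}]},
    {"objectid": 2, "filterid": 2,
     "lightcurve": [{"hmjd": 58761.4, "mag": 20.9},
                    {"hmjd": 58773.3, "mag": 20.3}]},
]

flat = list(explode_rows(nested, "lightcurve"))
for row in flat:
    print(row)
```

Two objects holding three observations in total become three flat rows, with `objectid` and `filterid` repeated on each row.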

First, we'll load the ZTF DR23 lightcurve catalog.

In [1]:
import lsdb

In [2]:
ztf = lsdb.open_catalog('s3://ipac-irsa-ztf/contributed/dr23/lc/hats')
ztf

| npartitions=9933 | objectid | filterid | objra | objdec | lightcurve |
|---|---|---|---|---|---|
| Order: 4, Pixel: 0 | int64[pyarrow] | int8[pyarrow] | float[pyarrow] | float[pyarrow] | `nested<hmjd: [double], mag: [float], magerr: [...` |
| Order: 4, Pixel: 1 | ... | ... | ... | ... | ... |
| ... | ... | ... | ... | ... | ... |
| Order: 5, Pixel: 12286 | ... | ... | ... | ... | ... |
| Order: 5, Pixel: 12287 | ... | ... | ... | ... | ... |


In [3]:
ztf.head()

|  | objectid | filterid | objra | objdec | lightcurve (first observation: hmjd, mag, magerr, clrcoeff, catflags) |
|---|---|---|---|---|---|
| 184342612410390 | 1447212400010477 | 2 | 44.042023 | 1.264162 | 58761.42485, 20.727491, 0.2088, 0.124223, 0 (+0 rows) |
| 189475338943485 | 1447212400010480 | 2 | 44.006325 | 1.263639 | 58761.42485, 20.928879, 0.226445, 0.124223, 0 (+13 rows) |
| 171958220169309 | 1447212400010486 | 2 | 44.685963 | 1.265697 | 58740.4977, 20.131371, 0.154903, 0.126351, 0 (+0 rows) |
| ... | ... | ... | ... | ... | 58356.39099, 18.124929, 0.033738, 0.124563, 0 (+61 rows) |
| ... | ... | ... | ... | ... | 58787.27224, 20.190992, 0.160638, 0.13095, 0 (+2 rows) |


The `lightcurve` column contains nested data, which can be accessed through the features of nested-pandas. It can also be converted into a flat structure where each row corresponds to a single observation, as in traditional lightcurve tables. To do this, we use the [`explode` method from nested-pandas](https://nested-pandas.readthedocs.io/en/latest/reference/api/nested_pandas.NestedFrame.explode.html), applied to the `lightcurve` column of an lsdb catalog via `map_partitions`. We define a function that applies `explode` to a single partition of the catalog, and then map this function across all partitions.

In [4]:
def explode_lcs(partition):
    # Expand the nested "lightcurve" column of one partition
    # into one row per observation.
    return partition.explode("lightcurve")

exploded_cat = ztf.map_partitions(explode_lcs)
exploded_cat

| npartitions=9933 | objectid | filterid | objra | objdec | hmjd | mag | magerr | clrcoeff | catflags |
|---|---|---|---|---|---|---|---|---|---|
| Order: 4, Pixel: 0 | int64[pyarrow] | int8[pyarrow] | float[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | int32[pyarrow] |
| Order: 4, Pixel: 1 | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| Order: 5, Pixel: 12286 | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| Order: 5, Pixel: 12287 | ... | ... | ... | ... | ... | ... | ... | ... | ... |


In [5]:
exploded_cat.head(10)

| _healpix_29 | objectid | filterid | objra | objdec | hmjd | mag | magerr | clrcoeff | catflags |
|---|---|---|---|---|---|---|---|---|---|
| 184342612410390 | 1447212400010477 | 2 | 44.042023 | 1.264162 | 58761.42485 | 20.727491 | 0.2088 | 0.124223 | 0 |
| 189475338943485 | 1447212400010480 | 2 | 44.006325 | 1.263639 | 58761.42485 | 20.928879 | 0.226445 | 0.124223 | 0 |
| 189475338943485 | 1447212400010480 | 2 | 44.006325 | 1.263639 | 58761.42531 | 20.858665 | 0.220293 | 0.126322 | 0 |
| 189475338943485 | 1447212400010480 | 2 | 44.006325 | 1.263639 | 58773.33318 | 20.28091 | 0.169244 | 0.101843 | 0 |
| 189475338943485 | 1447212400010480 | 2 | 44.006325 | 1.263639 | 58775.35076 | 20.71442 | 0.207655 | 0.107146 | 0 |
| 189475338943485 | 1447212400010480 | 2 | 44.006325 | 1.263639 | 58777.43861 | 20.532925 | 0.191753 | 0.120748 | 0 |
| 189475338943485 | 1447212400010480 | 2 | 44.006325 | 1.263639 | 58812.29228 | 20.832573 | 0.218007 | 0.113882 | 0 |
| 189475338943485 | 1447212400010480 | 2 | 44.006325 | 1.263639 | 58861.17844 | 20.225426 | 0.163945 | 0.125741 | 0 |
| 189475338943485 | 1447212400010480 | 2 | 44.006325 | 1.263639 | 59061.45527 | 20.921619 | 0.225809 | 0.12306 | 0 |
| 189475338943485 | 1447212400010480 | 2 | 44.006325 | 1.263639 | 59090.5126 | 20.474043 | 0.186864 | 0.115983 | 32768 |


The resulting catalog has a flat structure: each row represents a single observation from the original nested lightcurves, and the associated object-level columns are repeated on every row. While nested-pandas provides a convenient and efficient way to analyze and work with nested data like lightcurves, this exploded format may be more familiar for traditional time-series analysis workflows and packages that expect flat tables.
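
As one illustration of a flat-table workflow, the sketch below runs an ordinary pandas `groupby` over a handful of exploded rows. It assumes the exploded data is available as a regular pandas DataFrame. Here a few rows from the `head(10)` output above are typed in by hand as a stand-in, with only a subset of columns, and the summary column names (`n_obs`, `mean_mag`, `first_hmjd`) are chosen for illustration only.

```python
import pandas as pd

# Stand-in for an exploded partition: four rows copied by hand from the
# head(10) output above, keeping only a subset of the columns.
flat = pd.DataFrame(
    {
        "objectid": [
            1447212400010477,
            1447212400010480,
            1447212400010480,
            1447212400010480,
        ],
        "hmjd": [58761.42485, 58761.42485, 58761.42531, 58773.33318],
        "mag": [20.727491, 20.928879, 20.858665, 20.28091],
    }
)

# Per-object summary computed directly on the flat table.
# The output column names are illustrative, not part of any catalog schema.
summary = flat.groupby("objectid").agg(
    n_obs=("mag", "count"),
    mean_mag=("mag", "mean"),
    first_hmjd=("hmjd", "min"),
)
print(summary)
```

Because every row carries its `objectid`, the per-object grouping needs no knowledge of the original nested layout.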