# Get light-curves from ZTF, PS1, and Gaia for SNAD catalog

This notebook demonstrates how to get epoch photometry for a custom pointing catalog,
in this case, the [SNAD catalog](https://snad.space/catalog/).
We will load the SNAD catalog as a pandas dataframe and cross-match it with the [PS1 DR2 object (OTMO) and detection](https://outerspace.stsci.edu/display/PANSTARRS/) catalogs, the [ZTF DR20 Zubercal](http://nesssi.cacr.caltech.edu/ZTF/Web/Zuber.html) catalog, and the [Gaia DR3 epoch photometry](https://www.cosmos.esa.int/web/gaia/dr3) catalog.

## 1. Install and import required packages

In [1]:
# Install lsdb and matplotlib
%pip install --quiet lsdb matplotlib

Note: you may need to restart the kernel to use updated packages.


In [2]:
import dask.distributed
import matplotlib.pyplot as plt
import nested_pandas as npd
import numpy as np
import pandas as pd
import pyarrow as pa

from lsdb import open_catalog, from_dataframe
from nested_pandas.series.dtype import NestedDtype
from upath import UPath

## 2. Load catalogs

### 2.1. Define paths to catalogs

We access all the data through the web, so we don't need to download it in advance.
However, the pipeline reads much more data than we are actually going to use, so it will take some time.

In [3]:
# Define paths to PS1 DR2 catalogs
PS1_PATH = UPath("s3://stpubdata/panstarrs/ps1/public/hats/", anon=True)
PS1_OBJECT = PS1_PATH / "otmo"
PS1_DETECTION = PS1_PATH / "detection"

# Define path to ZTF DR20 Zubercal catalog
ZUBERCAL_PATH = "https://data.lsdb.io/hats/ztf_dr20/zubercal/"

# Define path to Gaia DR3 epoch photometry catalog
GAIA_EPOCH_PHOT_PATH = "https://data.lsdb.io/hats/gaia_dr3_epoch_phot/"

### 2.2. Download and convert SNAD catalog to LSDB Catalog, in memory

The [SNAD catalog](https://snad.space/catalog/) has only ~100 rows, so we download it with pandas and convert it to LSDB's Catalog object with `lsdb.from_dataframe`.

In [4]:
# Load SNAD catalog and remove rows with missing values
snad_df = pd.read_csv(
    "https://snad.space/catalog/snad_catalog.csv",
    dtype_backend="pyarrow",
).dropna()
display(snad_df)

# Convert to LSDB's Catalog object
snad_catalog = from_dataframe(
    snad_df,
    # Optimize partition size
    drop_empty_siblings=True,
    # Keep partitions small
    lowest_order=5,
    # Specify columns with coordinates
    ra_column="R.A.",
    dec_column="Dec.",
)
display(snad_catalog)

Unnamed: 0,Name,R.A.,Dec.,OID,Discovery date (UT),mag,er_down,er_up,ref,er_ref,TNS/VSX,Type,Comments
0,SNAD101,247.45543,24.77282,633207400004730,2018-04-08 09:45:49,21.11,0.27,0.36,20.84,0.06,AT 2018lwh,PSN,ZTF18abqkqdm
1,SNAD102,245.05375,28.3822,633216300024691,2018-03-21 11:08:19,21.18,0.28,0.39,20.26,0.07,AT 2018lwi,PSN,ZTF18abdgwos
...,...,...,...,...,...,...,...,...,...,...,...,...,...
195,SNAD297,254.78806,64.62319,825216400010767,2018-06-30 08:10:55,20.97,0.25,0.33,21.457,0.141,AT2018mqh,PSN,ZTF18abmmpmt
197,SNAD299,212.92509,40.61877,718209300006481,2020-05-06 05:02:10,20.75,0.23,0.29,20.926,0.056,AT2020afiw,AGN,+


Unnamed: 0_level_0,Name,R.A.,Dec.,OID,Discovery date (UT),mag,er_down,er_up,ref,er_ref,TNS/VSX,Type,Comments
npartitions=120,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1
"Order: 7, Pixel: 3996",string[pyarrow],double[pyarrow],double[pyarrow],int64[pyarrow],string[pyarrow],double[pyarrow],double[pyarrow],double[pyarrow],string[pyarrow],double[pyarrow],string[pyarrow],string[pyarrow],string[pyarrow]
"Order: 7, Pixel: 8902",...,...,...,...,...,...,...,...,...,...,...,...,...
...,...,...,...,...,...,...,...,...,...,...,...,...,...
"Order: 5, Pixel: 8184",...,...,...,...,...,...,...,...,...,...,...,...,...
"Order: 7, Pixel: 144952",...,...,...,...,...,...,...,...,...,...,...,...,...


### 2.3. Load PS1 catalog structure and metadata

`lsdb.open_catalog` doesn't load the data immediately; it only reads the metadata and structure of the catalog.
Here we use it to load PS1 DR2 object and detection catalogs, selecting only a few columns to speed up the pipeline.
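
The catalog summaries printed in this notebook label each partition as `Order: k, Pixel: n`. These are HEALPix orders and pixel indices, and a HEALPix grid of order `k` has `12 * 4**k` pixels, which is why the largest order-6 index shown is 49151. The following minimal sketch (helper names are illustrative and not part of LSDB) computes pixel counts and approximate pixel areas for the orders that appear in the summaries:

```python
import math

# Area of the full sky in square degrees, about 41253 deg^2
FULL_SKY_SQ_DEG = 4.0 * math.pi * (180.0 / math.pi) ** 2


def healpix_npix(order):
    """Number of HEALPix pixels at a given order (nside = 2**order)."""
    return 12 * 4**order


def healpix_pixel_area_sq_deg(order):
    """Area of one (equal-area) HEALPix pixel in square degrees."""
    return FULL_SKY_SQ_DEG / healpix_npix(order)


for order in (2, 3, 5, 6, 7):
    print(
        f"order {order}: {healpix_npix(order)} pixels, "
        f"{healpix_pixel_area_sq_deg(order):.4f} deg^2 each"
    )
```

Higher orders mean smaller pixels, so dense regions of a catalog can be split into finer partitions while sparse regions stay at a lower order.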

In [5]:
# Load PS1 catalogs metadata
ps1_object = open_catalog(
    PS1_OBJECT,
    columns=[
        "objID",  # PS1 ID
        "raMean",
        "decMean",  # coordinates to use for cross-matching
        "nStackDetections",  # some other data to use
    ],
)
display(ps1_object)

ps1_detection = open_catalog(
    PS1_DETECTION,
    columns=[
        "objID",  # PS1 object ID
        "detectID",  # PS1 detection ID
        # not really going to use it, but we can alternatively directly cross-match with detection table
        "ra",
        "dec",
        # light-curve stuff
        "obsTime",
        "filterID",
        "psfFlux",
        "psfFluxErr",
    ],
)
display(ps1_detection)

Unnamed: 0_level_0,objID,raMean,decMean,nStackDetections
npartitions=27161,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
"Order: 5, Pixel: 0",int64[pyarrow],double[pyarrow],double[pyarrow],int16[pyarrow]
"Order: 5, Pixel: 1",...,...,...,...
...,...,...,...,...
"Order: 6, Pixel: 49150",...,...,...,...
"Order: 6, Pixel: 49151",...,...,...,...


Unnamed: 0_level_0,objID,detectID,ra,dec,obsTime,filterID,psfFlux,psfFluxErr
npartitions=83004,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
"Order: 6, Pixel: 0",int64[pyarrow],int64[pyarrow],double[pyarrow],double[pyarrow],double[pyarrow],int16[pyarrow],double[pyarrow],double[pyarrow]
"Order: 6, Pixel: 1",...,...,...,...,...,...,...,...
...,...,...,...,...,...,...,...,...
"Order: 6, Pixel: 49150",...,...,...,...,...,...,...,...
"Order: 6, Pixel: 49151",...,...,...,...,...,...,...,...


### 2.4. Load ZTF DR20 Zubercal catalog

Zubercal is a Zuber-calibrated light-curve catalog for ZTF DR20.
It matches ZTF detections to PS1 objects and provides calibrated magnitudes. 

In [6]:
# Load Zubercal metadata
zubercal = open_catalog(
    ZUBERCAL_PATH,
    columns=[
        "objectid",  # matches to PS1 objID
        "mjd",
        "band",
        "mag",
        "magerr",  # integer, units are 1e-4 mag
    ],
)
display(zubercal)

Unnamed: 0_level_0,objectid,mjd,band,mag,magerr,objra,objdec
npartitions=76662,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
"Order: 5, Pixel: 0",int64[pyarrow],double[pyarrow],string[pyarrow],float[pyarrow],uint16[pyarrow],float[pyarrow],float[pyarrow]
"Order: 5, Pixel: 1",...,...,...,...,...,...,...
...,...,...,...,...,...,...,...
"Order: 6, Pixel: 49150",...,...,...,...,...,...,...
"Order: 6, Pixel: 49151",...,...,...,...,...,...,...


### 2.5. Load Gaia DR3 epoch photometry catalog

[Gaia DR3 epoch photometry](https://gea.esac.esa.int/archive/documentation/GDR3/Gaia_archive/chap_datamodel/sec_dm_photometry/ssec_dm_epoch_photometry.html) provides multi-band (G, BP, RP) time-series photometry for ~12 million sources.
The data is stored in a "wide" format with separate arrays per band (e.g. `g_transit_time`, `bp_obs_time`, `rp_obs_time`).
We will later transform this into a standard long-format light curve.
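
To make the wide-versus-long distinction concrete, here is a toy sketch in plain Python. The column names are shortened and the values are made up for illustration; the actual transformation in this notebook is done by the UDF defined in section 3.2.

```python
# Toy wide-format record: one list per band and quantity (made-up values).
wide = {
    "g_transit_time": [1700.1, 1750.3],
    "g_transit_mag": [18.2, 18.4],
    "bp_obs_time": [1700.1],
    "bp_mag": [18.9],
    "rp_obs_time": [1700.1, 1750.3],
    "rp_mag": [17.6, 17.7],
}

# (band label, time column, magnitude column)
BAND_COLUMNS = [
    ("G", "g_transit_time", "g_transit_mag"),
    ("BP", "bp_obs_time", "bp_mag"),
    ("RP", "rp_obs_time", "rp_mag"),
]


def wide_to_long(record):
    """Flatten per-band lists into one list of rows tagged with the band."""
    rows = []
    for band, time_col, mag_col in BAND_COLUMNS:
        for t, m in zip(record[time_col], record[mag_col]):
            rows.append({"time": t, "mag": m, "band": band})
    # Sort by time; the sort is stable, so band order is kept for equal times
    return sorted(rows, key=lambda row: row["time"])


long_rows = wide_to_long(wide)
for row in long_rows:
    print(row)
```

In the long format every observation is one row, so all bands can be filtered, sorted, and plotted with the same code.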

In [7]:
# Load Gaia DR3 epoch photometry metadata
gaia_epoch_phot = open_catalog(
    GAIA_EPOCH_PHOT_PATH,
    columns=[
        "source_id",
        "ra",
        "dec",
        # G-band epoch photometry
        "epoch_photometry.g_transit_time",
        "epoch_photometry.g_transit_mag",
        "epoch_photometry.g_transit_flux_over_error",
        "epoch_photometry.variability_flag_g_reject",
        # BP-band epoch photometry
        "epoch_photometry.bp_obs_time",
        "epoch_photometry.bp_mag",
        "epoch_photometry.bp_flux_over_error",
        "epoch_photometry.variability_flag_bp_reject",
        # RP-band epoch photometry
        "epoch_photometry.rp_obs_time",
        "epoch_photometry.rp_mag",
        "epoch_photometry.rp_flux_over_error",
        "epoch_photometry.variability_flag_rp_reject",
    ],
)
display(gaia_epoch_phot)

Unnamed: 0_level_0,source_id,ra,dec,epoch_photometry
npartitions=1305,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
"Order: 2, Pixel: 0",int64[pyarrow],double[pyarrow],double[pyarrow],"nested<g_transit_time: [double], g_transit_mag..."
"Order: 2, Pixel: 1",...,...,...,...
...,...,...,...,...
"Order: 3, Pixel: 766",...,...,...,...
"Order: 3, Pixel: 767",...,...,...,...


## 3. Plan cross-matching and joining

LSDB doesn't do any work until you call the `compute` method, so here we only plan the work.

### 3.1. PS1 and ZTF pipeline

1. We find PS1 objects that are within 0.2 arcsec of SNAD objects.
2. Join the Zubercal catalog to aggregate ZTF light-curves. Each light-curve is represented by a "nested" dataframe, where each row is a detection.
3. Do the same for PS1 detections.  
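
To give a feeling for the 0.2 arcsec criterion in step 1, here is a small standalone sketch of an angular-separation check using the haversine formula. It is not how LSDB implements cross-matching; the SNAD position is taken from the table above, while the two "PS1" positions are hypothetical.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcsec between two (RA, Dec) points in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec**2 + math.cos(dec1) * math.cos(dec2) * sin_dra**2
    return math.degrees(2.0 * math.asin(math.sqrt(a))) * 3600.0


RADIUS_ARCSEC = 0.2

# SNAD101 position from the catalog table
snad_ra, snad_dec = 247.45543, 24.77282

# Hypothetical PS1 positions: 0.1 arcsec away in Dec, and 1 arcsec away in RA
ps1_near = (snad_ra, snad_dec + 0.1 / 3600.0)
ps1_far = (snad_ra + 1.0 / 3600.0, snad_dec)

sep_near = angular_separation_arcsec(snad_ra, snad_dec, *ps1_near)
sep_far = angular_separation_arcsec(snad_ra, snad_dec, *ps1_far)

print(f"near: {sep_near:.3f} arcsec, match: {sep_near <= RADIUS_ARCSEC}")
print(f"far:  {sep_far:.3f} arcsec, match: {sep_far <= RADIUS_ARCSEC}")
```

Note that an offset of 1 arcsec in RA corresponds to a smaller separation on the sky, because RA offsets shrink by a factor of cos(Dec).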

In [8]:
# Planning cross-matching with objects, no work happens here
snad_ps1 = snad_catalog.crossmatch(
    ps1_object,
    radius_arcsec=0.2,
    suffixes=["", ""],
    suffix_method="overlapping_columns",
)

# Join with Zubercal detections to get ZTF light-curves
snad_ztf_lc = snad_ps1.join_nested(
    zubercal,
    left_on="objID",
    right_on="objectid",
    # light-curve will live in "ztf_lc" column
    nested_column_name="ztf_lc",
    output_catalog_name="snad_ztf_lc",
)

# Join with PS1 detections to get PS1 light-curves
snad_ps1_ztf_lc = snad_ztf_lc.join_nested(
    ps1_detection,
    left_on="objID",
    right_on="objID",
    # light-curve will live in "ps1_lc" column
    nested_column_name="ps1_lc",
    output_catalog_name="snad_ps1_ztf_lc",
)

display(snad_ps1_ztf_lc)

Unnamed: 0_level_0,Name,R.A.,Dec.,OID,Discovery date (UT),mag,er_down,er_up,ref,er_ref,TNS/VSX,Type,Comments,objID,raMean,decMean,nStackDetections,_dist_arcsec,ztf_lc,ps1_lc
npartitions=129,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1
"Order: 7, Pixel: 3996",string[pyarrow],double[pyarrow],double[pyarrow],int64[pyarrow],string[pyarrow],double[pyarrow],double[pyarrow],double[pyarrow],string[pyarrow],double[pyarrow],string[pyarrow],string[pyarrow],string[pyarrow],int64[pyarrow],double[pyarrow],double[pyarrow],int16[pyarrow],double[pyarrow],"nested<mjd: [double], band: [string], mag: [fl...","nested<detectID: [int64], ra: [double], dec: [..."
"Order: 7, Pixel: 8902",...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
"Order: 7, Pixel: 130954",...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
"Order: 7, Pixel: 144952",...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...


### 3.2. Gaia DR3 epoch photometry pipeline

We cross-match SNAD with Gaia DR3 epoch photometry by coordinates.
The epoch photometry data comes in a "wide" format: separate arrays per band (G, BP, RP) for times, magnitudes, and quality flags.
We define a UDF to transform this into a standard "long" format with columns: `mjd`, `mag` (in AB), `mag_err`, and `band`.

The UDF also applies quality filtering: it uses `flux_over_error` (the signal-to-noise ratio) together with the variability rejection flags to drop unreliable measurements.
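
Two of the conversions used by the UDF can be checked in isolation. The sketch below is a standalone illustration in plain Python with hypothetical helper names. The time offset follows from MJD = JD - 2400000.5 and ignores the small difference between barycentric and observatory-frame times. The magnitude error follows from propagating the flux error through m = -2.5 log10(F), which gives sigma_m = (2.5 / ln 10) * sigma_F / F.

```python
import math

# Gaia times are given as BJD(TCB) - 2455197.5, and MJD = JD - 2400000.5
GAIA_TIME_OFFSET_TO_MJD = 2455197.5 - 2400000.5  # = 55197.0


def gaia_time_to_mjd(gaia_time):
    """Convert a Gaia epoch-photometry time to (approximate) MJD."""
    return gaia_time + GAIA_TIME_OFFSET_TO_MJD


def mag_err_from_snr(flux_over_error):
    """First-order magnitude uncertainty from the flux signal-to-noise ratio."""
    return 2.5 / math.log(10.0) / flux_over_error


print(gaia_time_to_mjd(0.0))  # the Gaia reference epoch expressed as MJD
print(f"{mag_err_from_snr(10.0):.4f}")  # uncertainty in mag for S/N = 10
```

The first-order formula is a good approximation for high signal-to-noise and becomes less accurate as the ratio approaches the `min_s2n` threshold.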

In [9]:
# Gaia DR3 epoch photometry reference epoch: BJD 2455197.5 (TCB) = 2010-01-01
# Times in the catalog are BJD - 2455197.5, so MJD = time + 2455197.5 - 2400000.5
GAIA_TIME_OFFSET_TO_MJD = 55197.0

# Approximate Vega-to-AB magnitude offsets for Gaia DR3 passbands
# See https://www.cosmos.esa.int/web/gaia/dr3-passbands
GAIA_VEGA_TO_AB = {"G": 0.126, "BP": 0.024, "RP": 0.383}


def parse_gaia_epoch_photometry(
    g_time,
    g_mag,
    g_s2n,
    g_flag,
    bp_time,
    bp_mag,
    bp_s2n,
    bp_flag,
    rp_time,
    rp_mag,
    rp_s2n,
    rp_flag,
    *,
    min_s2n=3.0,
):
    """Transform Gaia DR3 epoch photometry from wide per-band arrays to a
    standard long-format light curve with quality filtering.

    Keeps observations with flux_over_error > min_s2n that are not marked
    by the per-band variability rejection flag, converts times to MJD,
    and converts magnitudes from the Vega to the AB system.
    """
    mjds, mags, mag_errs, bands = [], [], [], []
    for band, time_arr, mag_arr, s2n_arr, flag_arr in [
        ("G", g_time, g_mag, g_s2n, g_flag),
        ("BP", bp_time, bp_mag, bp_s2n, bp_flag),
        ("RP", rp_time, rp_mag, rp_s2n, rp_flag),
    ]:
        # time_arr = np.asarray(time_arr, dtype=np.float64)
        # mag_arr = np.asarray(mag_arr, dtype=np.float64)
        # s2n_arr = np.asarray(s2n_arr, dtype=np.float64)
        # variability_flag_*_reject is True for rejected observations,
        # so flag_arr (the flag of the current band) must be inverted

        # Quality filter: keep observations with sufficient signal-to-noise
        # that were not rejected
        good = (s2n_arr > min_s2n) & ~flag_arr
        t = time_arr[good] + GAIA_TIME_OFFSET_TO_MJD
        m = mag_arr[good] + GAIA_VEGA_TO_AB[band]
        # mag_err ≈ 1.0857 / flux_over_error (from error propagation of -2.5*log10(flux))
        m_err = 2.5 / np.log(10.0) / s2n_arr[good]

        mjds.append(t)
        mags.append(m)
        mag_errs.append(m_err)
        bands.extend([band] * len(t))

    return {
        "gaia_lc.mjd": np.concatenate(mjds) if mjds else np.array([], dtype=np.float64),
        "gaia_lc.mag": np.concatenate(mags) if mags else np.array([], dtype=np.float64),
        "gaia_lc.mag_err": np.concatenate(mag_errs) if mag_errs else np.array([], dtype=np.float64),
        "gaia_lc.band": bands,
    }

In [10]:
# Cross-match SNAD with Gaia DR3 epoch photometry
snad_gaia = snad_catalog.crossmatch(
    gaia_epoch_phot,
    radius_arcsec=1.0,
    suffixes=["", ""],
    suffix_method="overlapping_columns",
)

# Define the output metadata for the parsed light curve nested column
gaia_lc_dtype = NestedDtype(
    pa.struct(
        [
            pa.field("mjd", pa.list_(pa.float64())),
            pa.field("mag", pa.list_(pa.float64())),
            pa.field("mag_err", pa.list_(pa.float64())),
            pa.field("band", pa.list_(pa.string())),
        ]
    )
)

# Apply the UDF to transform the per-band epoch photometry into a standard light curve
snad_gaia_lc = snad_gaia.map_rows(
    parse_gaia_epoch_photometry,
    columns=[
        "epoch_photometry.g_transit_time",
        "epoch_photometry.g_transit_mag",
        "epoch_photometry.g_transit_flux_over_error",
        "epoch_photometry.variability_flag_g_reject",
        "epoch_photometry.bp_obs_time",
        "epoch_photometry.bp_mag",
        "epoch_photometry.bp_flux_over_error",
        "epoch_photometry.variability_flag_bp_reject",
        "epoch_photometry.rp_obs_time",
        "epoch_photometry.rp_mag",
        "epoch_photometry.rp_flux_over_error",
        "epoch_photometry.variability_flag_rp_reject",
    ],
    row_container="args",
    append_columns=True,
    infer_nesting=True,
    meta=npd.NestedFrame({"gaia_lc": pd.Series([], dtype=gaia_lc_dtype)}),
).map_partitions(lambda df: df.drop(columns=["epoch_photometry"]))

display(snad_gaia_lc)

Unnamed: 0_level_0,Name,R.A.,Dec.,OID,Discovery date (UT),mag,er_down,er_up,ref,er_ref,TNS/VSX,Type,Comments,source_id,ra,dec,_dist_arcsec,gaia_lc
npartitions=109,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1
"Order: 7, Pixel: 3996",string[pyarrow],double[pyarrow],double[pyarrow],int64[pyarrow],string[pyarrow],double[pyarrow],double[pyarrow],double[pyarrow],string[pyarrow],double[pyarrow],string[pyarrow],string[pyarrow],string[pyarrow],int64[pyarrow],double[pyarrow],double[pyarrow],double[pyarrow],"nested<mjd: [double], mag: [double], mag_err: ..."
"Order: 7, Pixel: 22298",...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
"Order: 5, Pixel: 8184",...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
"Order: 7, Pixel: 144952",...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...


## 4. Run the pipeline!

Here we finally run the pipeline and get the light-curves.
This requires a Dask client for parallel execution, and you probably want to adjust its parameters to your hardware.

We compute two pipelines:
1. PS1/ZTF: produces `ndf` with `ps1_lc` and `ztf_lc` nested columns.
2. Gaia: produces `ndf_gaia` with `gaia_lc` nested column (parsed from the wide per-band format into a standard long format).

We then merge the results on the SNAD object name.

In [11]:
%%time

# It will take a while to fetch all the data from the Internet
with dask.distributed.Client() as client:
    display(client)
    ndf = snad_ps1_ztf_lc.compute()
    ndf_gaia = snad_gaia_lc.compute()

# Merge PS1/ZTF and Gaia results on SNAD Name
ndf = ndf.merge(
    ndf_gaia[["Name", "source_id", "gaia_lc"]],
    on="Name",
    how="inner",
)

display(ndf)

Client: connected via Cluster object (distributed.LocalCluster)
Dashboard: http://127.0.0.1:8787/status
Workers: 4, total threads: 12, total memory: 32.00 GiB (3 threads and 8.00 GiB per worker)
Status: running, using processes: True


This may cause some slowdown.
Consider loading the data with Dask directly
 or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
2026-02-12 14:17:35,319 - distributed.worker - ERROR - Compute Failed
Key:       ('read_pixel-_to_string_dtype-nestedframe-59a802b23a510d81d16ff2c7da8ff129', 13932)
State:     executing
Task:  <Task ('read_pixel-_to_string_dtype-nestedframe-59a802b23a510d81d16ff2c7da8ff129', 13932) _execute_subgraph(...)>
Exception: 'TypeError("\'ClientConnectorError\' object is not subscriptable")'
Traceback: '  File "/Users/hombit/.virtualenvs/lsdb/lib/python3.14/site-packages/dask/dataframe/core.py", line 98, in apply_and_enforce\n    df = func(*args, **kwargs)\n  File "/Users/hombit/projects/lincc-frameworks/lsdb/src/lsdb/loaders/hats/read_hats.py", line 499, in read_pixel\n    return _read_parquet_file(\n        path_generator(\n    .

CPU times: user 31.2 s, sys: 6.11 s, total: 37.3 s
Wall time: 7min 19s


TypeError: 'ClientConnectorError' object is not subscriptable

## 5. Plot first five light curves

Here we plot light curves from all three surveys (PS1, ZTF, and Gaia).

`NestedFrame` from `nested-pandas` ([docs](https://nested-pandas.readthedocs.org)) is just a wrapper around the pandas `DataFrame`, so everything you know about pandas applies here.
Each light-curve is packed into a single item of the `ps1_lc`, `ztf_lc`, or `gaia_lc` column.
When you get a single element from a nested column, you get a pandas dataframe!

In [None]:
ps1_filter_id_to_name = {1: "g", 2: "r", 3: "i", 4: "z", 5: "y"}
get_ps1_name_from_filter_id = np.vectorize(ps1_filter_id_to_name.get)
filter_colors = {
    "g": "green",
    "r": "red",
    "i": "black",
    "z": "purple",
    "y": "cyan",
    "G": "olive",
    "BP": "steelblue",
    "RP": "firebrick",
}

ndf_sorted = ndf.sort_values("Name")

# Plot only the first five objects
for i in range(min(5, len(ndf_sorted))):
    row = ndf_sorted.iloc[i]
    snad_name = row["Name"]
    ps1_lc = row["ps1_lc"]
    ztf_lc = row["ztf_lc"]
    gaia_lc = row["gaia_lc"]

    plt.figure(figsize=(12, 6))

    # --- PS1 ---
    ps1_bands = get_ps1_name_from_filter_id(ps1_lc["filterID"])
    for band in "grizy":
        color = filter_colors[band]
        band_idx = ps1_bands == band
        t = ps1_lc["obsTime"][band_idx]
        flux = ps1_lc["psfFlux"][band_idx] * 1e6  # micro Jy
        err = ps1_lc["psfFluxErr"][band_idx] * 1e6
        mag = 8.9 - 2.5 * np.log10(flux / 1e6)
        mag_plus = 8.9 - 2.5 * np.log10((flux - err) / 1e6)
        mag_minus = 8.9 - 2.5 * np.log10((flux + err) / 1e6)
        plt.scatter(t, mag, marker="*", color=color, label=f"PS1 {band}", alpha=0.3)
        plt.errorbar(
            t, mag, [mag - mag_minus, mag_plus - mag], ls="", color=color, alpha=0.15
        )

    # --- ZTF ---
    ztf_bands = ztf_lc["band"]
    for band in "gri":
        band_idx = ztf_bands == band
        color = filter_colors[band]
        t = ztf_lc["mjd"][band_idx]
        mag = ztf_lc["mag"][band_idx]
        magerr = np.asarray(ztf_lc["magerr"][band_idx], dtype=float) * 1e-4
        plt.scatter(t, mag, marker="x", color=color, label=f"ZTF {band}", alpha=0.3)
        plt.errorbar(t, mag, magerr, ls="", color=color, alpha=0.15)

    # --- Gaia DR3 ---
    if isinstance(gaia_lc, pd.DataFrame) and len(gaia_lc) > 0:
        gaia_bands = gaia_lc["band"]
        for band in ["G", "BP", "RP"]:
            band_idx = gaia_bands == band
            color = filter_colors[band]
            t = gaia_lc["mjd"][band_idx]
            mag = gaia_lc["mag"][band_idx]
            mag_err = gaia_lc["mag_err"][band_idx]
            plt.scatter(
                t, mag, marker="d", color=color, label=f"Gaia {band}", alpha=0.4
            )
            plt.errorbar(t, mag, mag_err, ls="", color=color, alpha=0.2)

    plt.title(snad_name)
    plt.xlabel("MJD")
    plt.ylabel("mag (AB)")
    plt.gca().invert_yaxis()
    plt.legend(loc="upper left")
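
The PS1 branch of the plotting code converts PSF fluxes in Jy to AB magnitudes with `mag = 8.9 - 2.5 * log10(flux)` and derives asymmetric error bars from `flux ± err`. The following standalone sketch (hypothetical helper names, made-up flux values) shows the same arithmetic for a single measurement:

```python
import math


def flux_jy_to_ab_mag(flux_jy):
    """AB magnitude of a flux density in Jy (8.9 mag corresponds to 1 Jy)."""
    return 8.9 - 2.5 * math.log10(flux_jy)


def ab_mag_with_errors(flux_jy, flux_err_jy):
    """Return (mag, error towards brighter, error towards fainter).

    Assumes flux_jy > flux_err_jy > 0; otherwise the faint-side
    magnitude is undefined.
    """
    mag = flux_jy_to_ab_mag(flux_jy)
    mag_bright = flux_jy_to_ab_mag(flux_jy + flux_err_jy)
    mag_faint = flux_jy_to_ab_mag(flux_jy - flux_err_jy)
    return mag, mag - mag_bright, mag_faint - mag


# Made-up example: 20 +/- 4 micro-Jy
mag, err_bright, err_faint = ab_mag_with_errors(20e-6, 4e-6)
print(f"mag = {mag:.3f} -{err_bright:.3f} +{err_faint:.3f}")
```

The fainter-side error is always the larger one because the logarithm is steeper at lower fluxes. In the notebook, measurements with `flux <= err` or negative flux produce NaN values through `np.log10`, and those points do not appear in the plot.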

Some SNAD objects are missing from the result. This is because Pan-STARRS lacks information about the transients themselves, and the host center is either too faint or too far from the transient to be matched.

None of these objects were active in Pan-STARRS, but they were active in ZTF, confirming that they are robust supernova candidates. Gaia DR3 epoch photometry provides additional pre-outburst and outburst data in G, BP, and RP bands when available, offering complementary wavelength coverage.

## About

**Authors**: Konstantin Malanchev

**Last updated on**: Feb 10, 2026

If you use `lsdb` for published research, please cite it following these [instructions](https://docs.lsdb.io/en/stable/citation.html).