# Accessing Rubin Data Preview 1 (DP1)

In this tutorial, we will:

- access Rubin's Data Preview 1 with LSDB
  * at RSP (Rubin Science Platform), based in the USA and the UK (Poland coming soon)
  * at NERSC (National Energy Research Scientific Computing Center) for [LSST DESC](https://lsstdesc.org) members
  * at CANFAR (Canadian Advanced Network for Astronomical Research), Canadian Independent Data Access Center
  * at LIneA (Laboratório Interinstitucional de e-Astronomia), Brazilian Independent Data Access Center

## Introduction

### Prerequisites

In order to access Rubin data, you must be a [Rubin data rights holder](https://rubinobservatory.org/for-scientists/data-products/data-policy).


## 1. Accessing the data on Rubin Science Platform (RSP)

### 1.1 Prepare your RSP container

Visit https://data.lsst.cloud, or https://rsp.lsst.ac.uk if you access RSP through the UK IDAC participation program. Log in using your identity provider. Once in, you will see Portal, Notebooks, and APIs. Choose Notebooks.

When asked which container to start, choose "Recommended" on the left and "Large" on the right.

Once this has started, create a new notebook.

#### 1.1.1 Ensure your notebook kernel has the right version of lsdb

Make sure you've got at *least* version 0.6.3 of lsdb.  Try the following.

In [1]:
import lsdb

lsdb.__version__

'0.7.4.dev37+gcc1d426c9'

If the import fails or the version is older than 0.6.3, install or upgrade lsdb in your container. Be sure to precede this command
with the `%` so that it applies to *this notebook kernel*:

In [None]:
%pip install -U lsdb

You will probably see some version of these errors.  **You can safely ignore them.**

```
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
schedview 0.17.0 requires lsst-resources, which is not installed.
nested-dask 0.3.4 requires nested-pandas<0.4.0,>=0.3.1, but you have nested-pandas 0.4.5 which is incompatible.
```

#### 1.1.2 Restart your notebook kernel

Next, be sure to *restart the kernel*.  You've made a change to the kernel's dependencies, but you won't pick it up
until the kernel has been restarted.  Then re-import lsdb and verify the version.

In [2]:
import lsdb

lsdb.__version__

'0.7.4.dev37+gcc1d426c9'
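The version check above can also be done programmatically. The following is a minimal sketch using only the standard library; the helper names `release_tuple` and `lsdb_is_recent_enough` are hypothetical, and the comparison deliberately ignores development and pre-release suffixes (so `0.6.3.dev1` would count as `0.6.3`).

```python
import re
from importlib import metadata
from typing import Optional, Tuple

MIN_LSDB = (0, 6, 3)  # minimum lsdb version recommended in this tutorial


def release_tuple(version: str) -> Tuple[int, ...]:
    """Return the leading numeric release of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def lsdb_is_recent_enough(version: Optional[str] = None) -> bool:
    """Compare an lsdb version (the installed one by default) with MIN_LSDB."""
    if version is None:
        try:
            version = metadata.version("lsdb")
        except metadata.PackageNotFoundError:
            return False  # lsdb is not installed in this environment
    return release_tuple(version) >= MIN_LSDB


print(lsdb_is_recent_enough("0.7.4.dev37+gcc1d426c9"))  # True
print(lsdb_is_recent_enough("0.6.2"))  # False
```

Calling `lsdb_is_recent_enough()` without arguments checks whatever version is installed in the current kernel.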

### 1.2 Create a Dask Client

In [3]:
# Dask puts out more advisory logging than we care for in this tutorial.
# It takes some doing to quiet all of it, but this recipe works.

import dask

dask.config.set({"logging.distributed": "critical"})

import logging

# This also has to be done, for the above to be effective
logger = logging.getLogger("distributed")
logger.setLevel(logging.CRITICAL)

import warnings

# Finally, suppress the specific warning about Dask dashboard port usage
warnings.filterwarnings("ignore", message="Port 8787 is already in use.")

In [4]:
from dask.distributed import Client

client = Client(n_workers=4, threads_per_worker=1, memory_limit="auto")
client

- Client: connection method "Cluster object", cluster type `distributed.LocalCluster`
- Dashboard: https://olynn.nb.data.lsst.cloud/nb/user/olynn/proxy/8787/status
- Cluster: 4 workers, 4 threads in total, 16.00 GiB of memory in total; status: running; using processes: True
- Scheduler: `tcp://127.0.0.1:45451`, started just now
- Workers: 4, each with 1 thread, 4.00 GiB of memory, its own dashboard port, and a local directory under `/tmp/dask-scratch-space/`


Your Dask dashboard will be accessible at `https://{username}.nb.data.lsst.cloud/nb/user/{username}/proxy/{port}/status`.
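As a small illustration, the URL template above can be filled in with a helper like the one below. This is only a sketch: the function name `rsp_dashboard_url` is hypothetical, the template applies to `data.lsst.cloud` as quoted above, and the default port 8787 is the one shown in the example output (Dask may pick a different port if that one is already in use).

```python
def rsp_dashboard_url(username: str, port: int = 8787) -> str:
    """Fill in the RSP Dask dashboard URL template from this tutorial."""
    return (
        f"https://{username}.nb.data.lsst.cloud"
        f"/nb/user/{username}/proxy/{port}/status"
    )


print(rsp_dashboard_url("olynn"))
# https://olynn.nb.data.lsst.cloud/nb/user/olynn/proxy/8787/status
```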

### 1.3 Opening a Catalog

The data is divided into `objects` and `dia_objects`.  Let's open both catalogs:

In [5]:
from upath import UPath

base_path = UPath("/rubin/lsdb_data")

object_cat = lsdb.open_catalog(base_path / "object_collection")
dia_object_cat = lsdb.open_catalog(base_path / "dia_object_collection")

In [6]:
object_cat

`npartitions=389`, 42 columns:

| Type | Columns |
|---|---|
| double[pyarrow] | coord_dec, coord_ra, x, y |
| float[pyarrow] | coord_decErr, coord_raErr, refFwhm, shape_xx, shape_xy, shape_yy, xErr, yErr, and `{band}_psfFlux`, `{band}_psfFluxErr`, `{band}_psfMag`, `{band}_psfMagErr` for each band in u, g, r, i, z, y |
| int64[pyarrow] | objectId, patch, tract |
| string[pyarrow] | refBand |
| bool[pyarrow] | shape_flag |
| nested | objectForcedSource: `nested<coord_ra: [double], coord_dec: [double]...` |

Partitions shown: Order: 6, Pixel: 130; Order: 8, Pixel: 2176; ...; Order: 9, Pixel: 2302101; Order: 7, Pixel: 143884


In [7]:
dia_object_cat

| npartitions=208 | dec | diaObjectId | nDiaSources | ra | radecMjdTai | tract | diaObjectForcedSource | diaSource |
|---|---|---|---|---|---|---|---|---|
| Order: 6, Pixel: 130 | double[pyarrow] | int64[pyarrow] | int64[pyarrow] | double[pyarrow] | double[pyarrow] | int64[pyarrow] | `nested<band: [string], coord_dec: [double], co...` | `nested<band: [string], centroid_flag: [bool], ...` |
| Order: 6, Pixel: 136 | ... | ... | ... | ... | ... | ... | ... | ... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| Order: 11, Pixel: 36833621 | ... | ... | ... | ... | ... | ... | ... | ... |
| Order: 7, Pixel: 143884 | ... | ... | ... | ... | ... | ... | ... | ... |
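The object catalog has many columns, and often only a few are needed. The sketch below builds a column list from the `{band}_psfMag` naming pattern visible in the output above and passes it to `lsdb.open_catalog` through its `columns` argument. The helper names `psf_mag_columns` and `open_object_subset` are hypothetical, and `lsdb` is imported inside the second function so that the first one can be used on its own.

```python
from typing import Iterable, List


def psf_mag_columns(bands: Iterable[str] = "ugrizy") -> List[str]:
    """Build a column list: identifiers, coordinates, and PSF magnitudes per band."""
    columns = ["objectId", "coord_ra", "coord_dec"]
    for band in bands:
        columns += [f"{band}_psfMag", f"{band}_psfMagErr"]
    return columns


def open_object_subset(base_path, bands: Iterable[str] = "gri"):
    """Lazily open the object catalog with only the selected columns."""
    import lsdb  # imported here so psf_mag_columns works without lsdb installed

    return lsdb.open_catalog(
        f"{base_path}/object_collection", columns=psf_mag_columns(bands)
    )


print(psf_mag_columns("gr"))
# ['objectId', 'coord_ra', 'coord_dec', 'g_psfMag', 'g_psfMagErr', 'r_psfMag', 'r_psfMagErr']
```

On RSP, something like `open_object_subset("/rubin/lsdb_data")` should then return a catalog restricted to those columns, assuming the path from the cell above.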


### 1.4 Downloading data to your machine

If you need to work with the data on your own machine, you can `scp` it from the container.
Suppose you have an account named `myself` on a machine named `big-box.astro.somewhere.edu`, and the data is in
a directory called `./some_data`. The command below copies that directory to one of the same name in your home
directory on that machine.

```shell
scp -r ./some_data myself@big-box.astro.somewhere.edu:some_data
```
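If you prefer to trigger the copy from a notebook cell, the same command can be assembled in Python. This is a sketch only: `build_scp_command` is a hypothetical helper, the account and host names are the placeholders from the text above, and the command is printed rather than executed.

```python
import shlex
from typing import List


def build_scp_command(local_dir: str, user: str, host: str, remote_dir: str) -> List[str]:
    """Return the argument list for a recursive scp copy of local_dir."""
    return ["scp", "-r", local_dir, f"{user}@{host}:{remote_dir}"]


cmd = build_scp_command(
    "./some_data", "myself", "big-box.astro.somewhere.edu", "some_data"
)
print(shlex.join(cmd))
# scp -r ./some_data myself@big-box.astro.somewhere.edu:some_data

# To actually run it (this will prompt for credentials as usual):
# import subprocess; subprocess.run(cmd, check=True)
```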

## 2. Accessing the data at NERSC (Perlmutter)

If you are part of the LSST DESC collaboration and have a NERSC account, you can access Rubin DP1 via the Perlmutter cluster.
You can use either batch jobs or jupyter.nersc.gov; below we assume that you use NERSC's JupyterHub.

### 2.1 Launch Jupyter

Log in to NERSC at https://jupyter.nersc.gov. Select "Login Node" for data exploration, configuration, and code development. Use "Exclusive CPU Node" for larger tasks, such as full-catalog analysis.

Please also see NERSC documentation for [Dask configuration](https://gitlab.com/NERSC/nersc-notebooks/-/tree/main/perlmutter/dask).

### 2.2a Start kernel with LSDB

LSDB is available in a Jupyter kernel `desc-td_env-dev`.

If you haven't already set up the DESC Jupyter kernels at NERSC, run the one-time setup step on the Perlmutter command line:

```bash
source /global/common/software/lsst/common/miniconda/kernels/setup.sh
```

The next time you start jupyter.nersc.gov, you'll have access to a few `desc-*` Jupyter kernels. More information can be found in [this td_env issue comment](https://github.com/LSSTDESC/td_env/issues/111#issuecomment-3534996204).



### 2.2b Alternative: Install LSDB

For a conda installation, run `conda install -c conda-forge lsdb` in the terminal.
For a pip installation, run `python -m pip install lsdb` or the following cell in a Jupyter notebook:

In [None]:
%pip install lsdb

Restart the kernel and check that the `lsdb` version is up-to-date:

In [3]:
import lsdb

lsdb.__version__

'0.6.7'

### 2.3 Open a catalog

The data is divided into `objects` and `dia_objects`.  Let's open both catalogs:

In [6]:
import lsdb
from upath import UPath

base_path = UPath("/global/cfs/cdirs/lsst/shared/rubin/DP1/HATS/dp1_full/hats/v29_0_0")

object_cat = lsdb.open_catalog(base_path / "object_collection")
dia_object_cat = lsdb.open_catalog(base_path / "dia_object_collection")

In [7]:
object_cat

`npartitions=389`, 42 columns:

| Type | Columns |
|---|---|
| double[pyarrow] | coord_dec, coord_ra, x, y |
| float[pyarrow] | coord_decErr, coord_raErr, refFwhm, shape_xx, shape_xy, shape_yy, xErr, yErr, and `{band}_psfFlux`, `{band}_psfFluxErr`, `{band}_psfMag`, `{band}_psfMagErr` for each band in u, g, r, i, z, y |
| int64[pyarrow] | objectId, patch, tract |
| string[pyarrow] | refBand |
| bool[pyarrow] | shape_flag |
| nested | objectForcedSource: `nested<band: [string], coord_dec: [double], co...` |

Partitions shown: Order: 6, Pixel: 130; Order: 8, Pixel: 2176; ...; Order: 9, Pixel: 2302101; Order: 7, Pixel: 143884


In [8]:
dia_object_cat

| npartitions=208 | dec | diaObjectId | nDiaSources | ra | radecMjdTai | tract | diaObjectForcedSource | diaSource |
|---|---|---|---|---|---|---|---|---|
| Order: 6, Pixel: 130 | double[pyarrow] | int64[pyarrow] | int64[pyarrow] | double[pyarrow] | double[pyarrow] | int64[pyarrow] | `nested<band: [string], coord_dec: [double], co...` | `nested<band: [string], centroid_flag: [bool], ...` |
| Order: 6, Pixel: 136 | ... | ... | ... | ... | ... | ... | ... | ... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| Order: 11, Pixel: 36833621 | ... | ... | ... | ... | ... | ... | ... | ... |
| Order: 7, Pixel: 143884 | ... | ... | ... | ... | ... | ... | ... | ... |


## 3. Accessing the data at CANFAR (Canadian Independent Data Access Center)

### 3.1 Launch science platform

If you are a member of the Canadian astronomical community, you can access Rubin DP1 via the CANFAR science portal. Open https://www.canfar.net/science-portal. We recommend the defaults, i.e., project `skaha` and container image `astroml:latest`.

### 3.2 Opening a Catalog

The data is divided into `objects` and `dia_objects`.  Let's open both catalogs:

In [None]:
import lsdb
from upath import UPath

base_path = UPath("/arc/projects/hats/lsst/dp1")

object_cat = lsdb.open_catalog(base_path / "object_collection")
dia_object_cat = lsdb.open_catalog(base_path / "dia_object_collection")

Authenticated users can also access DP1 through an sshfs session or with the VOSpace client. More information can be found at https://www.opencadc.org/canfar/latest/platform/storage. If you get a permission error but you are a Canadian Rubin data rights holder, contact CADC for assistance.
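Since the catalog location differs between RSP, NERSC, and CANFAR, a notebook meant to run at more than one site can probe for the paths used in this tutorial. The following is a minimal sketch; `DP1_BASE_PATHS` and `find_dp1_base_path` are hypothetical names, and the paths are simply the ones quoted in the sections above.

```python
import os
from pathlib import Path
from typing import Dict, Optional

# Catalog locations quoted in this tutorial, keyed by site.
DP1_BASE_PATHS: Dict[str, str] = {
    "RSP": "/rubin/lsdb_data",
    "NERSC": "/global/cfs/cdirs/lsst/shared/rubin/DP1/HATS/dp1_full/hats/v29_0_0",
    "CANFAR": "/arc/projects/hats/lsst/dp1",
}


def find_dp1_base_path(candidates: Dict[str, str] = DP1_BASE_PATHS) -> Optional[Path]:
    """Return the first candidate directory that exists here, or None."""
    for site, location in candidates.items():
        # os.path.isdir returns False (rather than raising) for unreadable paths
        if os.path.isdir(location):
            print(f"Found DP1 catalogs at {site}: {location}")
            return Path(location)
    return None


base_path = find_dp1_base_path()
if base_path is None:
    print("No known DP1 location found on this machine.")
```

When a path is found, it can be used in place of the hard-coded `base_path` in the cells above, for example `lsdb.open_catalog(base_path / "object_collection")`.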

## 4. Accessing the data at LIneA (Laboratório Interinstitucional de e-Astronomia)

Detailed instructions for access via Brazilian IDAC are available at https://data.linea.org.br/en/lsdb/how_to_access_rubin_dp1.html.

## 5. Cleaning up (if you pip-installed LSDB)

When you're done working, if you used `pip` to install LSDB, remember to uninstall it. This ensures that your kernels pick up newer versions as LSDB development continues.

In [None]:
%pip uninstall -y lsdb

## About

**Authors**: Neven Caplar, Derek Jones, Konstantin Malanchev

**Last updated on**: November 15, 2025

**Last run:** Jan 7, 2026 (RSP section only)

If you use `lsdb` for published research, please cite it following these [instructions](https://docs.lsdb.io/en/stable/citation.html).