# Resuming import with different HEALPix order

**Author:** Sandro Campos | [GitHub Issue](https://github.com/astronomy-commons/hipscat-import/issues/261)

**Context:** When importing a catalog, the pipeline might be pre-empted while the histogram binaries are being generated and written to disk. If we then resume the pipeline with a different HEALPix order, it fails because the histograms already on disk have an incompatible size.

This issue first appeared in a [workflow](https://github.com/lincc-frameworks/notebooks_lf/blob/main/DELVE_gaia_xmatch/Spectroscopy_failed.ipynb) by Julie Xue. In this notebook we demonstrate how the import pipeline behaves in this scenario.

In [1]:
import os
import healpy as hp
import numpy as np

from dask.distributed import Client
from hipscat_import.catalog.arguments import ImportArguments
from hipscat_import.pipeline import pipeline_with_client

In [2]:
client = Client(n_workers=4)



We first ran the pipeline with a maximum HEALPix order of 2 and stopped the import before it finished, which left the intermediate files on disk.

```python
args = ImportArguments(
    output_artifact_name="MagE_hipscat",
    file_reader="fits",
    input_file_list=["mage_bonaca_rcat_V0.05.fits"],
    ra_column="GAIAEDR3_RA",
    dec_column="GAIAEDR3_DEC",
    output_path=".",
    pixel_threshold=500_000,
    highest_healpix_order=2,
)
pipeline_with_client(args, client)
```

In [3]:
os.listdir("resume_catalog/intermediate")

['input_paths.txt',
 'splitting',
 'reducing',
 'mapping_done',
 'mapping_histogram.binary',
 'order_0',
 'splitting_done']

In [4]:
os.listdir("resume_catalog/Norder=0/Dir=0")

['Npix=10.parquet', 'Npix=4.parquet']

Importing stopped at the reducing stage. We know this because the catalog's `Norder=0` directory is already present and contains 2 partition files. Notice that, because we only specified a highest order, the catalog was imported at the lowest possible order, which for this extremely small catalog was 0. However, the `mapping_histogram` is always generated at the highest order (in this case, 2). A histogram has 192 entries at order 2 and 768 entries at order 3: one entry per HEALPix pixel at the given highest order.
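
To illustrate how a histogram at the highest order can still describe partitions at a lower order, here is a minimal sketch that sums a fine histogram down to a coarser one. It assumes the histogram is indexed in the HEALPix NESTED scheme, where the children of a pixel occupy a contiguous block of indices. The function name `coarsen_nested_histogram` and the row counts are hypothetical; this is not the pipeline's own code.

```python
def coarsen_nested_histogram(histogram, from_order, to_order):
    """Sum a NESTED-indexed pixel histogram down to a lower HEALPix order."""
    if to_order > from_order:
        raise ValueError("to_order must not exceed from_order")
    expected = 12 * 4**from_order
    if len(histogram) != expected:
        raise ValueError(f"expected {expected} entries, got {len(histogram)}")
    # In NESTED indexing, each coarse pixel owns a contiguous run of
    # 4**(from_order - to_order) fine pixels.
    children = 4 ** (from_order - to_order)
    return [
        sum(histogram[start:start + children])
        for start in range(0, len(histogram), children)
    ]


# Toy order-2 histogram (made-up counts): every row falls inside
# order-0 pixels 4 and 10.
fine = [0] * 192
fine[4 * 16 + 3] = 120
fine[4 * 16 + 9] = 30
fine[10 * 16] = 50

coarse = coarsen_nested_histogram(fine, from_order=2, to_order=0)
print(len(coarse))  # 12
print([(pix, count) for pix, count in enumerate(coarse) if count])
# [(4, 150), (10, 50)]
```

The coarse result has 12 entries, one per order-0 pixel, while the input keeps its 192 entries. Only the fine histogram is stored on disk, which is why its length depends on the highest order and not on the order the catalog ends up with.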

In [5]:
print(f"Histogram size (order 2): {hp.order2npix(2)}")
print(f"Histogram size (order 3): {hp.order2npix(3)}")

Histogram size (order 2): 192
Histogram size (order 3): 768
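
These sizes follow from the HEALPix pixel count, `12 * nside**2` with `nside = 2**order`. The sketch below reproduces them without `healpy`. The helper name `histogram_size` is hypothetical, and the byte count assumes the file stores one `int64` per pixel, which is how this notebook reads it back.

```python
INT64_BYTES = 8  # assumption: one np.int64 counter per pixel


def histogram_size(order):
    """Number of HEALPix pixels (histogram entries) at a given order."""
    n_side = 2**order
    return 12 * n_side**2


for order in (0, 1, 2, 3):
    n_pix = histogram_size(order)
    print(f"order {order}: {n_pix} entries, {n_pix * INT64_BYTES} bytes as int64")
```

Each increase in order multiplies the number of entries by 4, so no two orders ever share a histogram length.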


Loading the previously generated `mapping_histogram` confirms this:

In [6]:
with open("resume_catalog/intermediate/mapping_histogram.binary", "rb") as file_handle:
    print(len(np.frombuffer(file_handle.read(), dtype=np.int64)))

192


This means that resuming the previous pipeline with a higher pixel order will not work. To resume a pipeline, the histogram size (and therefore the highest pixel order) must match.
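
The same comparison can be made by hand before resuming. The sketch below recovers the HEALPix order from the size of a histogram file and compares it with the order about to be requested. It assumes the file is a raw array of `int64` counters, as read above; `order_from_histogram_file` and `can_resume` are hypothetical helpers, not part of `hipscat_import`, and the demo writes a stand-in file of zeros instead of touching a real catalog.

```python
import os
import tempfile

INT64_BYTES = 8  # assumption: the histogram is stored as raw int64 values


def order_from_histogram_file(path):
    """Infer the HEALPix order from the byte size of a histogram file."""
    n_bytes = os.path.getsize(path)
    if n_bytes % INT64_BYTES:
        raise ValueError(f"{n_bytes} bytes is not a whole number of int64 values")
    n_pixels = n_bytes // INT64_BYTES
    order = 0
    while 12 * 4**order < n_pixels:
        order += 1
    if 12 * 4**order != n_pixels:
        raise ValueError(f"{n_pixels} is not a valid HEALPix pixel count")
    return order


def can_resume(path, requested_order):
    """True when the stored histogram matches the requested highest order."""
    return order_from_histogram_file(path) == requested_order


# Stand-in for intermediate/mapping_histogram.binary: 192 zeroed int64 counters.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "mapping_histogram.binary")
    with open(demo_path, "wb") as file_handle:
        file_handle.write(bytes(192 * INT64_BYTES))
    stored_order = order_from_histogram_file(demo_path)
    resume_order_2 = can_resume(demo_path, 2)
    resume_order_3 = can_resume(demo_path, 3)

print(stored_order, resume_order_2, resume_order_3)  # 2 True False
```

Using the file size avoids loading the histogram into memory, which matters little at order 2 but more at high orders where the array grows by a factor of 4 per level.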

In [7]:
args = ImportArguments(
    output_artifact_name="MagE_hipscat",
    file_reader="fits",
    input_file_list=["mage_bonaca_rcat_V0.05.fits"],
    ra_column="GAIAEDR3_RA",
    dec_column="GAIAEDR3_DEC",
    output_path=".",
    pixel_threshold=500_000,
    highest_healpix_order=3,
)
pipeline_with_client(args, client)

Planning  :   0%|          | 0/5 [00:00<?, ?it/s]



Binning   :   0%|          | 0/2 [00:00<?, ?it/s]

ValueError: The histogram from the previous execution is incompatible with the current healpix order. To run with a different configuration set `resume` to False

We have updated this error message so that whoever runs the pipeline knows they must either restore the original highest HEALPix order or set `resume` to `False` to restart the pipeline with the new configuration. Below, we restart the import with `resume=False` and a constant HEALPix order of 3.

In [8]:
args = ImportArguments(
    output_artifact_name="MagE_hipscat",
    file_reader="fits",
    input_file_list=["mage_bonaca_rcat_V0.05.fits"],
    ra_column="GAIAEDR3_RA",
    dec_column="GAIAEDR3_DEC",
    output_path=".",
    pixel_threshold=500_000,
    constant_healpix_order=3,
    resume=False,
)
pipeline_with_client(args, client)

Planning  :   0%|          | 0/5 [00:00<?, ?it/s]

Mapping   :   0%|          | 0/1 [00:00<?, ?it/s]

Binning   :   0%|          | 0/2 [00:00<?, ?it/s]

Splitting :   0%|          | 0/1 [00:00<?, ?it/s]

Reducing  :   0%|          | 0/66 [00:00<?, ?it/s]

Finishing :   0%|          | 0/5 [00:00<?, ?it/s]

In [9]:
os.listdir("MagE_hipscat/Norder=3/Dir=0")

['Npix=297.parquet',
 'Npix=322.parquet',
 'Npix=266.parquet',
 'Npix=393.parquet',
 'Npix=525.parquet',
 'Npix=331.parquet',
 'Npix=358.parquet',
 'Npix=618.parquet',
 'Npix=556.parquet',
 'Npix=468.parquet',
 'Npix=366.parquet',
 'Npix=534.parquet',
 'Npix=646.parquet',
 'Npix=763.parquet',
 'Npix=627.parquet',
 'Npix=541.parquet',
 'Npix=757.parquet',
 'Npix=323.parquet',
 'Npix=355.parquet',
 'Npix=353.parquet',
 'Npix=260.parquet',
 'Npix=558.parquet',
 'Npix=542.parquet',
 'Npix=299.parquet',
 'Npix=387.parquet',
 'Npix=298.parquet',
 'Npix=289.parquet',
 'Npix=523.parquet',
 'Npix=324.parquet',
 'Npix=361.parquet',
 'Npix=465.parquet',
 'Npix=386.parquet',
 'Npix=531.parquet',
 'Npix=535.parquet',
 'Npix=539.parquet',
 'Npix=456.parquet',
 'Npix=363.parquet',
 'Npix=619.parquet',
 'Npix=620.parquet',
 'Npix=669.parquet',
 'Npix=761.parquet',
 'Npix=564.parquet',
 'Npix=546.parquet',
 'Npix=522.parquet',
 'Npix=526.parquet',
 'Npix=630.parquet',
 'Npix=631.parquet',
 'Npix=743.pa

In [10]:
client.shutdown()