# LOCAL unit test data

Unit tests in this repo use two kinds of data: local and cloud. This notebook covers only the local test data and shows how to regenerate it.

## Object catalog: small sky

This is the same "object catalog" used in the hats and LSDB unit test suites: 131 randomly generated (ra, dec) values, all inside healpix pixel 11 at order 0.

In [None]:
import tempfile

import hats_import.pipeline as runner
from hats_import.catalog.arguments import ImportArguments
from dask.distributed import Client
from hats.io.file_io import remove_directory

tmp_path = tempfile.TemporaryDirectory()
tmp_dir = tmp_path.name

client = Client(n_workers=1, threads_per_worker=1, local_directory=tmp_dir)

### small_sky

This catalog was generated with the following snippet:

In [None]:
remove_directory("./small_sky")
with tempfile.TemporaryDirectory() as pipeline_tmp:
    args = ImportArguments(
        input_path="small_sky_parts",
        output_path=".",
        highest_healpix_order=1,
        file_reader="csv",
        output_artifact_name="small_sky",
        tmp_dir=pipeline_tmp,
    )
    runner.pipeline_with_client(args, client)

### small_sky_order1

This catalog has the same data points as the other small sky catalogs, but the import is forced to spread them over partitions at order 1 instead of order 0.

This produces 4 leaf partition files instead of just 1, which makes it useful for confirming reads and writes across multiple leaf partition files.

NB: Setting `constant_healpix_order=1` forces the import pipeline to create leaf partitions at order 1.
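To see where the count of 4 comes from, here is a minimal sketch, assuming the NESTED healpix numbering scheme, in which the children of pixel `p` one order down are `4*p` through `4*p + 3`. The helper `child_pixels` is illustrative only and is not part of hats or hats_import.

```python
def child_pixels(pixel: int, order: int, target_order: int) -> list[int]:
    """Return the NESTED-scheme descendants of `pixel` at `target_order`."""
    if target_order < order:
        raise ValueError("target_order must be >= order")
    # Each step down one order splits a pixel into 4 children: 4*p .. 4*p + 3.
    factor = 4 ** (target_order - order)
    return list(range(pixel * factor, (pixel + 1) * factor))


# Order-0 pixel 11 splits into four order-1 pixels.
print(child_pixels(11, 0, 1))  # [44, 45, 46, 47]
```

Under that assumption, the four leaf partitions of this catalog correspond to order-1 pixels 44 through 47.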

This catalog was generated with the following snippet:

In [None]:
remove_directory("./small_sky_order1")
with tempfile.TemporaryDirectory() as pipeline_tmp:
    args = ImportArguments(
        input_path="small_sky_parts",
        output_path=".",
        file_reader="csv",
        output_artifact_name="small_sky_order1",
        constant_healpix_order=1,
        tmp_dir=pipeline_tmp,
    )
    runner.pipeline_with_client(args, client)
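As a quick sanity check after regenerating, the number of leaf files can be counted directly on disk. This is a rough sketch only: it assumes leaf partitions are written as `.parquet` files with an `Npix=` component somewhere in their path, which may differ between hats versions, and `count_leaf_files` is a hypothetical helper, not part of the hats API.

```python
from pathlib import Path


def count_leaf_files(catalog_dir: str) -> int:
    """Count parquet files that sit at or under an `Npix=` path component."""
    root = Path(catalog_dir)
    return sum(
        1
        for path in root.rglob("*.parquet")
        # Assumption: leaf files are identified by "Npix=" in their relative path.
        if any(part.startswith("Npix=") for part in path.relative_to(root).parts)
    )
```

If the layout assumption holds, `count_leaf_files("./small_sky_order1")` should return 4 and `count_leaf_files("./small_sky")` should return 1.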

And that's it!

Everything else consists of raw files and data that must be edited by hand.

In [None]:
# Close the client first: its workers use tmp_dir as their local directory.
client.close()
tmp_path.cleanup()