Open
Description
I ran the following:
export REF_DATASET_CACHE_DIR="/pscratch/sd/m/minxu/mytest_ref_2026-03-14/cache"
export REF_CONFIGURATION="/pscratch/sd/m/minxu/mytest_ref_2026-03-14/config"
export REF_INSTALLATION_DIR="/pscratch/sd/m/minxu/mytest_ref_2026-03-14/climate-ref"
ref datasets ingest --source-type pmp-climatology "${REF_DATASET_CACHE_DIR}/datasets/pmp-climatology"
and got the following error:
2026-03-14 14:40:04.493 -07:00 | WARNING | climate_ref.datasets.base - Files to remove: ['/global/homes/m/minxu/.cache/climate_ref/PMP_obs4MIPsClims/ts/gr/v20250224/ts_mon_ERA-5_PCMDI_gr_198101-200412_AC_v20250224_2.5x2.5.nc']
╭─────────────────────────────────────────────────────────────────────────────────────────────────────────── Traceback (most recent call last) ────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ /pscratch/sd/m/minxu/mytest_ref_2026-03-14/climate-ref/packages/climate-ref/src/climate_ref/cli/datasets.py:202 in ingest │
│ │
│ 199 │ │ │ │ │ │ logger.info(f"Would save dataset {instance_id} to the database") │
│ 200 │ │ else: │
│ 201 │ │ │ # Use shared ingestion logic with pre-validated catalog │
│ ❱ 202 │ │ │ stats = ingest_datasets(adapter, None, db, data_catalog=data_catalog, skip_i │
│ 203 │ │ │ stats.log_summary() │
│ 204 │ │
│ 205 │ if solve: │
│ │
│ /pscratch/sd/m/minxu/mytest_ref_2026-03-14/climate-ref/packages/climate-ref/src/climate_ref/datasets/__init__.py:138 in ingest_datasets │
│ │
│ 135 │ for instance_id, data_catalog_dataset in data_catalog.groupby(adapter.slug_column): │
│ 136 │ │ logger.debug(f"Processing dataset {instance_id}") │
│ 137 │ │ with db.session.begin(): │
│ ❱ 138 │ │ │ results = adapter.register_dataset(db, data_catalog_dataset) │
│ 139 │ │ │ │
│ 140 │ │ │ if results.dataset_state == ModelState.CREATED: │
│ 141 │ │ │ │ stats.datasets_created += 1 │
│ │
│ /pscratch/sd/m/minxu/mytest_ref_2026-03-14/climate-ref/packages/climate-ref/src/climate_ref/datasets/base.py:316 in register_dataset │
│ │
│ 313 │ │ if files_to_remove: │
│ 314 │ │ │ files_removed = list(files_to_remove) │
│ 315 │ │ │ logger.warning(f"Files to remove: {files_removed}") │
│ ❱ 316 │ │ │ raise NotImplementedError("Removing files is not yet supported") │
│ 317 │ │ │
│ 318 │ │ # Update existing files if any file-specific metadata has changed │
│ 319 │ │ for file_path, existing_file in current_file_paths.items(): │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
NotImplementedError: Removing files is not yet supported
The ingestion code tried to remove data in the default cache directory ($HOME/.cache) and did not respect the cache directory that I set with the environment variables.
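To make the expected behaviour concrete, here is a minimal Python sketch of how I would expect the cache directory to be resolved. It is not the actual climate-ref code: `resolve_cache_dir` and `is_under` are hypothetical helpers written for illustration, and the `~/.cache/climate_ref` fallback is only inferred from the path in the warning above. The idea is that `REF_DATASET_CACHE_DIR` takes precedence when it is set, so a file under `$HOME/.cache` should not be touched in that case.

```python
import os
from pathlib import Path


def resolve_cache_dir(env=None, default=None):
    """Hypothetical helper (not part of climate-ref).

    Return the directory named by REF_DATASET_CACHE_DIR when it is set and
    non-empty; otherwise fall back to a default location.
    """
    env = os.environ if env is None else env
    if default is None:
        # Assumed fallback, inferred from the path in the warning message.
        default = Path.home() / ".cache" / "climate_ref"
    override = env.get("REF_DATASET_CACHE_DIR")
    return Path(override) if override else Path(default)


def is_under(path, root):
    """Hypothetical helper: True if `path` lies inside `root` (Python 3.9+)."""
    return Path(path).is_relative_to(root)
```

With the variables from this report, `resolve_cache_dir()` would return the `/pscratch/...` cache path, and `is_under()` would be false for the `/global/homes/m/minxu/.cache/...` file listed in the warning, which is why I did not expect the ingest step to consider that file at all.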