Implement CoDICE L1a hi-ialirt #1786
bourque merged 40 commits into IMAP-Science-Operations-Center:dev from
Conversation
Pull Request Overview
This PR implements processing for the CoDICE L1a hi-ialirt data product by updating test expectations, configuring a dedicated energy table, and refactoring data reshaping and decompression for I‑ALiRT datasets.
- Updated tests in test_codice_l1a.py to reflect the new array shapes and variable counts for hi‑ialirt
- Added a dedicated IALIRT_ENERGY_TABLE and updated the configuration for COD_HI_IAL
- Modified the pipeline and binned dataset creation to better accommodate hi‑ialirt processing
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| imap_processing/tests/codice/test_codice_l1a.py | Adjusted expected shapes/variables and updated xfail conditions for hi‑ialirt tests |
| imap_processing/codice/constants.py | Added IALIRT_ENERGY_TABLE and updated COD_HI_IAL configuration to use it |
| imap_processing/codice/codice_l1a.py | Refactored decompression and data reshaping logic to support hi‑ialirt processing |
| imap_processing/cdf/config/imap_codice_l1a_variable_attrs.yaml | Added variable attributes for hi‑ialirt energy table fields |
Comments suppressed due to low confidence (2)
imap_processing/tests/codice/test_codice_l1a.py:250
- [nitpick] Consider applying consistent xfail handling for hi-ialirt datasets across all test functions. Some tests no longer mark hi-ialirt as xfail while others do, so unifying this behavior until full validation is available would improve test consistency.
if descriptor == "hi-ialirt":
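One way to unify the handling is a small helper applied in every test function. This is a hedged sketch only: the helper name and the set of unvalidated descriptors are assumptions, not code from the PR.

```python
import pytest


def xfail_if_unvalidated(descriptor: str) -> None:
    """Hypothetical helper for consistent xfail handling across tests.

    Marks the running test as an expected failure for any descriptor
    that has not yet been validated (currently only hi-ialirt).
    """
    unvalidated = {"hi-ialirt"}
    if descriptor in unvalidated:
        pytest.xfail(f"{descriptor} data product is not yet validated")
```

Calling one helper from each test keeps the xfail policy in a single place until full validation lands.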
imap_processing/codice/codice_l1a.py:136
- [nitpick] Ensure that the general check for 'ialirt' in dataset_name does not unintentionally cover cases that need distinct processing (e.g., lo-ialirt vs. hi-ialirt) by confirming that this condition aligns with the intended behavior for each data type.
if "ialirt" in self.config["dataset_name"]:
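A hedged illustration of making the coverage explicit rather than relying on a substring test. The lo-ialirt dataset name below is an assumption extrapolated from the hi-ialirt naming in this PR; the helper itself is not part of the codebase.

```python
def is_ialirt(dataset_name: str) -> bool:
    """Illustrative alternative to `"ialirt" in dataset_name`.

    The substring test matches both lo- and hi- variants; an explicit
    set makes the intended coverage visible and reviewable.
    """
    # "imap_codice_l1a_lo-ialirt" is an assumed name, mirroring the
    # hi-ialirt dataset name shown in this PR.
    ialirt_products = {
        "imap_codice_l1a_lo-ialirt",
        "imap_codice_l1a_hi-ialirt",
    }
    return dataset_name in ialirt_products
```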
```diff
 CODICEAPID.COD_HI_IAL: {
     "dataset_name": "imap_codice_l1a_hi-ialirt",
-    "energy_table": OMNI_ENERGY_TABLE,
+    "energy_table": IALIRT_ENERGY_TABLE,
```
[nitpick] Consider adding a comment explaining why hi-ialirt processing uses a separate IALIRT_ENERGY_TABLE instead of the OMNI_ENERGY_TABLE for clarity.
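The suggestion amounts to something like the following. This is a sketch only: in the real module `CODICEAPID.COD_HI_IAL` is an enum key in a larger configuration dict, and the bin values here are placeholders, not flight values.

```python
# Placeholder bin edges for illustration only (table includes endpoints).
IALIRT_ENERGY_TABLE = {
    "h": [1.0, 2.0, 4.0, 8.0],
}

# Stand-in for the real CODICEAPID.COD_HI_IAL configuration entry.
COD_HI_IAL_CONFIG = {
    "dataset_name": "imap_codice_l1a_hi-ialirt",
    # hi-ialirt uses a dedicated table rather than OMNI_ENERGY_TABLE
    # because the I-ALiRT product bins energies differently than the
    # omni-directional product.
    "energy_table": IALIRT_ENERGY_TABLE,
}
```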
```
`acq_start_subseconds` fields in the packet. The exception to this is
the I-ALiRT packets, which use "acquisition_time".
```
It turns out that I didn't need to treat I-ALiRT differently here.
```diff
-def create_binned_dataset(apid: int, dataset: xr.Dataset) -> xr.Dataset:
+def create_binned_dataset(
+    apid: int, dataset: xr.Dataset, science_values: list[str]
```
Now passing in science_values as a parameter to make it easier to re-use this function in multiple places.
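A minimal sketch of the motivation, with made-up stand-ins for the real module contents: once `science_values` is a parameter, the same builder can serve both binned products.

```python
# Hedged sketch only: the real create_binned_dataset() operates on an
# xarray.Dataset; this trivial stand-in just shows the re-use pattern.
def create_binned_dataset(apid, dataset, science_values):
    # Stand-in for the real reshaping logic: one output list per
    # science variable.
    return {name: [] for name in science_values}


# APID values and species lists are placeholders, not flight values.
omni = create_binned_dataset(0, None, ["h", "he3", "he4"])
ialirt = create_binned_dataset(0, None, ["h", "he4"])
```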
```python
# hi-omni data gets reshaped a bit differently than other products,
# so we need to stray away from the nominal pipeline
stacked_data = np.stack(
    [np.array(item, dtype=np.uint32) for item in pipeline.raw_data]
)

# This will hold all of the data per-species and support variables,
# ready to be put in a CDF file
data: dict[str, list] = {}
for species in pipeline.config["energy_table"]:
    data[species] = []
data["epoch"] = []
data["spin_period"] = []
data["data_quality"] = []

# Get the number of spins per species
num_spins = pipeline.config["num_spins"]

# Iterate through each epoch's data and pull out the data for each
# species
for i, epoch in enumerate(stacked_data):
    current_epoch = dataset.epoch.data[i]
    position = 0
    for species in pipeline.config["energy_table"]:
        num_bins = (
            len(pipeline.config["energy_table"][species]) - 1
        )  # Subtracting one here since the table includes endpoints
        species_data = (
            epoch[position : position + num_bins * pipeline.config["num_spins"]]
            .reshape(num_bins, num_spins)
            .T
        )

        # Now pull out the data for each spin within the species data
        for spin_data in species_data:
            data[species].append(spin_data)

            # We only need one set of support variables in the CDF,
            # so just iterate using one species for these
            if species == "h":
                # For each spin, we add <spin_period>*<num_spins> to the epoch value
                spin_period = (
                    dataset.spin_period.data[i] * constants.SPIN_PERIOD_CONVERSION
                )
                epoch_value = current_epoch + np.int64(
                    (spin_period * num_spins) * 1e9  # Convert from s to ns
                )
                data["epoch"].append(epoch_value)
                current_epoch = epoch_value

                # Other support variables
                data["spin_period"].append(spin_period)
                data["data_quality"].append(dataset.suspect.data[i])

        position += num_bins * num_spins
```
I pulled a lot of this out and put it in the new reshape_binned_data() method
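The reshape at the heart of that block can be demonstrated in isolation. This is a hedged sketch with made-up bin edges and counts, not the actual `reshape_binned_data()` implementation: each epoch's flat array is sliced per species, reshaped to `(num_bins, num_spins)`, and transposed so each row is one spin.

```python
import numpy as np

num_spins = 2
# Placeholder energy tables; edges include endpoints, hence the -1 below.
energy_table = {"h": [1, 2, 4, 8], "he4": [1, 2, 4]}

# One epoch's flat counts: species laid out back-to-back, each
# occupying num_bins * num_spins values.
epoch = np.arange(3 * num_spins + 2 * num_spins, dtype=np.uint32)

position = 0
per_spin = {}
for species, edges in energy_table.items():
    num_bins = len(edges) - 1  # table includes endpoints
    block = epoch[position : position + num_bins * num_spins]
    # Rows of the transposed array are spins, columns are energy bins.
    per_spin[species] = block.reshape(num_bins, num_spins).T
    position += num_bins * num_spins

print(per_spin["h"].shape)  # (num_spins, num_bins) -> (2, 3)
```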
```python
# Iterate through each epoch's data and pull out the data for each
# species
stacked_data = np.stack(
    [np.array(item, dtype=np.uint32) for item in self.raw_data]
```
Can you turn the data into an array directly and then reshape it?
```diff
-    [np.array(item, dtype=np.uint32) for item in self.raw_data]
+    np.array(self.raw_data, dtype=np.uint32).reshape(...)
```
Yes, good simplification!
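A quick demonstration of why the simplification is safe, assuming every item in `raw_data` has the same length (which a subsequent `reshape` requires anyway): a single `np.array` call produces the same 2-D array as stacking per-item arrays.

```python
import numpy as np

# Made-up raw data standing in for the decompressed packet values.
raw_data = [[1, 2, 3], [4, 5, 6]]

stacked = np.stack([np.array(item, dtype=np.uint32) for item in raw_data])
direct = np.array(raw_data, dtype=np.uint32)

# Both approaches yield an identical (n_epochs, n_values) uint32 array.
assert np.array_equal(stacked, direct)
print(direct.shape)  # (2, 3)
```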
```python
for species in self.config["energy_table"]:
    num_bins = (
        len(self.config["energy_table"][species]) - 1
    )  # Subtracting one here since the table includes endpoints
```
I think the auto-formatter may have moved this comment below the line it references. I'd suggest moving it above the `num_bins` line; otherwise it reads as applying to the following lines, where there is no "-1".
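The suggestion amounts to this placement, shown here with a placeholder energy table so the comment sits directly above the line it describes and the formatter cannot separate them:

```python
# Placeholder table for illustration; edges include endpoints.
energy_table = {"h": [1, 2, 4, 8]}

for species in energy_table:
    # Subtract one since the table includes endpoints
    num_bins = len(energy_table[species]) - 1
```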
This PR implements processing for CoDICE L1a hi-ialirt. A few things to note:

- hi-ialirt processing is similar to the hi-omni data product in that it is what I consider a "binned dataset", so I tried to re-use as much code as I could from that.
- There are some nuances in the hi-ialirt validation data that I need to better understand and ask Joey about. I will validate this product in a future PR.

Closes #252 and #785
Pertains to #1107