Implement CoDICE L1a hi-ialirt #1786
bourque merged 40 commits into IMAP-Science-Operations-Center:dev from
Conversation
Pull Request Overview
This PR implements processing for the CoDICE L1a hi-ialirt data product by updating test expectations, configuring a dedicated energy table, and refactoring data reshaping and decompression for I‑ALiRT datasets.
- Updated tests in test_codice_l1a.py to reflect the new array shapes and variable counts for hi‑ialirt
- Added a dedicated IALIRT_ENERGY_TABLE and updated the configuration for COD_HI_IAL
- Modified the pipeline and binned dataset creation to better accommodate hi‑ialirt processing
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| imap_processing/tests/codice/test_codice_l1a.py | Adjusted expected shapes/variables and updated xfail conditions for hi‑ialirt tests |
| imap_processing/codice/constants.py | Added IALIRT_ENERGY_TABLE and updated COD_HI_IAL configuration to use it |
| imap_processing/codice/codice_l1a.py | Refactored decompression and data reshaping logic to support hi‑ialirt processing |
| imap_processing/cdf/config/imap_codice_l1a_variable_attrs.yaml | Added variable attributes for hi‑ialirt energy table fields |
Comments suppressed due to low confidence (2)
imap_processing/tests/codice/test_codice_l1a.py:250
- [nitpick] Consider applying consistent xfail handling for hi-ialirt datasets across all test functions. Some tests no longer mark hi-ialirt as xfail while others do, so unifying this behavior until full validation is available would improve test consistency.
if descriptor == "hi-ialirt":
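One way to unify the handling is a small helper applied in every test function. This is a hedged sketch only: the helper name and the set of unvalidated descriptors are assumptions, not code from the PR.

```python
import pytest


def xfail_if_unvalidated(descriptor: str) -> None:
    """Hypothetical helper for consistent xfail handling across tests.

    Marks the running test as an expected failure for any descriptor
    that has not yet been validated (currently only hi-ialirt).
    """
    unvalidated = {"hi-ialirt"}
    if descriptor in unvalidated:
        pytest.xfail(f"{descriptor} data product is not yet validated")
```

Calling one helper from each test keeps the xfail policy in a single place until full validation lands.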
imap_processing/codice/codice_l1a.py:136
- [nitpick] Ensure that the general check for 'ialirt' in dataset_name does not unintentionally cover cases that need distinct processing (e.g., lo-ialirt vs. hi-ialirt) by confirming that this condition aligns with the intended behavior for each data type.
if "ialirt" in self.config["dataset_name"]:
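A hedged illustration of making the coverage explicit rather than relying on a substring test. The lo-ialirt dataset name below is an assumption extrapolated from the hi-ialirt naming in this PR; the helper itself is not part of the codebase.

```python
def is_ialirt(dataset_name: str) -> bool:
    """Illustrative alternative to `"ialirt" in dataset_name`.

    The substring test matches both lo- and hi- variants; an explicit
    set makes the intended coverage visible and reviewable.
    """
    # "imap_codice_l1a_lo-ialirt" is an assumed name, mirroring the
    # hi-ialirt dataset name shown in this PR.
    ialirt_products = {
        "imap_codice_l1a_lo-ialirt",
        "imap_codice_l1a_hi-ialirt",
    }
    return dataset_name in ialirt_products
```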
```diff
 CODICEAPID.COD_HI_IAL: {
     "dataset_name": "imap_codice_l1a_hi-ialirt",
-    "energy_table": OMNI_ENERGY_TABLE,
+    "energy_table": IALIRT_ENERGY_TABLE,
```
[nitpick] Consider adding a comment explaining why hi-ialirt processing uses a separate IALIRT_ENERGY_TABLE instead of the OMNI_ENERGY_TABLE for clarity.
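The suggestion amounts to something like the following. This is a sketch only: in the real module `CODICEAPID.COD_HI_IAL` is an enum key in a larger configuration dict, and the bin values here are placeholders, not flight values.

```python
# Placeholder bin edges for illustration only (table includes endpoints).
IALIRT_ENERGY_TABLE = {
    "h": [1.0, 2.0, 4.0, 8.0],
}

# Stand-in for the real CODICEAPID.COD_HI_IAL configuration entry.
COD_HI_IAL_CONFIG = {
    "dataset_name": "imap_codice_l1a_hi-ialirt",
    # hi-ialirt uses a dedicated table rather than OMNI_ENERGY_TABLE
    # because the I-ALiRT product bins energies differently than the
    # omni-directional product.
    "energy_table": IALIRT_ENERGY_TABLE,
}
```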
```
`acq_start_subseconds` fields in the packet. The exception to this is
the I-ALiRT packets, which use "acquisition_time".
```
It turns out that I didn't need to treat I-ALiRT differently here.
```diff
-def create_binned_dataset(apid: int, dataset: xr.Dataset) -> xr.Dataset:
+def create_binned_dataset(
+    apid: int, dataset: xr.Dataset, science_values: list[str]
```
Now passing in science_values as a parameter to make it easier to re-use this function in multiple places.
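A minimal sketch of the motivation, with made-up stand-ins for the real module contents: once `science_values` is a parameter, the same builder can serve both binned products.

```python
# Hedged sketch only: the real create_binned_dataset() operates on an
# xarray.Dataset; this trivial stand-in just shows the re-use pattern.
def create_binned_dataset(apid, dataset, science_values):
    # Stand-in for the real reshaping logic: one output list per
    # science variable.
    return {name: [] for name in science_values}


# APID values and species lists are placeholders, not flight values.
omni = create_binned_dataset(0, None, ["h", "he3", "he4"])
ialirt = create_binned_dataset(0, None, ["h", "he4"])
```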
```python
# hi-omni data gets reshaped a bit differently than other products,
# so we need to stray away from the nominal pipeline
stacked_data = np.stack(
    [np.array(item, dtype=np.uint32) for item in pipeline.raw_data]
)

# This will hold all of the data per-species and support variables,
# ready to be put in a CDF file
data: dict[str, list] = {}
for species in pipeline.config["energy_table"]:
    data[species] = []
data["epoch"] = []
data["spin_period"] = []
data["data_quality"] = []

# Get the number of spins per species
num_spins = pipeline.config["num_spins"]

# Iterate through each epoch's data and pull out the data for each
# species
for i, epoch in enumerate(stacked_data):
    current_epoch = dataset.epoch.data[i]
    position = 0
    for species in pipeline.config["energy_table"]:
        num_bins = (
            len(pipeline.config["energy_table"][species]) - 1
        )  # Subtracting one here since the table includes endpoints
        species_data = (
            epoch[position : position + num_bins * pipeline.config["num_spins"]]
            .reshape(num_bins, num_spins)
            .T
        )

        # Now pull out the data for each spin within the species data
        for spin_data in species_data:
            data[species].append(spin_data)

            # We only need one set of support variables in the CDF,
            # so just iterate using one species for these
            if species == "h":
                # For each spin, we add <spin_period>*<num_spins> to the epoch value
                spin_period = (
                    dataset.spin_period.data[i] * constants.SPIN_PERIOD_CONVERSION
                )
                epoch_value = current_epoch + np.int64(
                    (spin_period * num_spins) * 1e9  # Convert from s to ns
                )
                data["epoch"].append(epoch_value)
                current_epoch = epoch_value

                # Other support variables
                data["spin_period"].append(spin_period)
                data["data_quality"].append(dataset.suspect.data[i])

        position += num_bins * num_spins
```
I pulled a lot of this out and put it in the new reshape_binned_data() method
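The reshape at the heart of that block can be demonstrated in isolation. This is a hedged sketch with made-up bin edges and counts, not the actual `reshape_binned_data()` implementation: each epoch's flat array is sliced per species, reshaped to `(num_bins, num_spins)`, and transposed so each row is one spin.

```python
import numpy as np

num_spins = 2
# Placeholder energy tables; edges include endpoints, hence the -1 below.
energy_table = {"h": [1, 2, 4, 8], "he4": [1, 2, 4]}

# One epoch's flat counts: species laid out back-to-back, each
# occupying num_bins * num_spins values.
epoch = np.arange(3 * num_spins + 2 * num_spins, dtype=np.uint32)

position = 0
per_spin = {}
for species, edges in energy_table.items():
    num_bins = len(edges) - 1  # table includes endpoints
    block = epoch[position : position + num_bins * num_spins]
    # Rows of the transposed array are spins, columns are energy bins.
    per_spin[species] = block.reshape(num_bins, num_spins).T
    position += num_bins * num_spins

print(per_spin["h"].shape)  # (num_spins, num_bins) -> (2, 3)
```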
```python
# Iterate through each epoch's data and pull out the data for each
# species
stacked_data = np.stack(
    [np.array(item, dtype=np.uint32) for item in self.raw_data]
```
Can you turn the data into an array directly and then reshape it?
```diff
-    [np.array(item, dtype=np.uint32) for item in self.raw_data]
+    np.array(self.raw_data, dtype=np.uint32).reshape(...)
```
Yes, good simplification!
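A quick demonstration of why the simplification is safe, assuming every item in `raw_data` has the same length (which a subsequent `reshape` requires anyway): a single `np.array` call produces the same 2-D array as stacking per-item arrays.

```python
import numpy as np

# Made-up raw data standing in for the decompressed packet values.
raw_data = [[1, 2, 3], [4, 5, 6]]

stacked = np.stack([np.array(item, dtype=np.uint32) for item in raw_data])
direct = np.array(raw_data, dtype=np.uint32)

# Both approaches yield an identical (n_epochs, n_values) uint32 array.
assert np.array_equal(stacked, direct)
print(direct.shape)  # (2, 3)
```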
```python
for species in self.config["energy_table"]:
    num_bins = (
        len(self.config["energy_table"][species]) - 1
    )  # Subtracting one here since the table includes endpoints
```
I think the auto-formatter may have moved this comment below the line it references. I'd suggest moving it above the `num_bins` line; otherwise it reads as applying to the following lines, where there is no "-1".
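The suggestion amounts to this placement, shown here with a placeholder energy table so the comment sits directly above the line it describes and the formatter cannot separate them:

```python
# Placeholder table for illustration; edges include endpoints.
energy_table = {"h": [1, 2, 4, 8]}

for species in energy_table:
    # Subtract one since the table includes endpoints
    num_bins = len(energy_table[species]) - 1
```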
This PR implements processing for CoDICE L1a hi-ialirt. A few things to note:

- hi-ialirt processing is similar to the hi-omni data product in that it is what I consider a "binned dataset", so I tried to re-use as much code as I could from that.
- There are some nuances in the hi-ialirt validation data that I need to better understand and ask Joey about. I will validate this product in a future PR.

Closes #252 and #785
Pertains to #1107