Turned the repo into a pip-installable package. #1
Merged
ChrisJChang added a commit that referenced this pull request on Mar 10, 2026
#1 - read_hdf5_datasets: eliminate np.append in loop
Replaced the per-file np.append pattern (which reallocates the full array on every call) with collecting chunks in a list and calling np.concatenate once at the end. For N files of M points each, this reduces allocations from O(N) full-array copies to one.

#2 - fill_nan_with_neighbor_mean: vectorise triple-nested loop
Replaced the Python for i / for j / for neighbour loop with scipy.ndimage.generic_filter, which operates in compiled C over the entire array in one pass.

#3 - plot_2D_posterior: deduplicate sorted-histogram computation
The same np.sort + np.cumsum + normalisation was computed twice in the same function call (once for CR masking, once for contour drawing). The result is now initialised to None and computed at most once, with the second use reusing the cached result.

#4 - Replace deepcopy with shallow copies throughout
- Numeric bounds (xy_bounds, x_bounds): replaced with [list(b[0]), list(b[1])] / list(b), which is sufficient since only scalars are mutated.
- requested_datasets (list of immutable tuples): replaced with list().
- NumPy array slices (y_data[mask], z_data[mask], etc.): removed entirely, since fancy indexing already returns a copy.
- deepcopy(y_data) in posterior shading: replaced with y_data.copy().
- Removed the now-unused deepcopy import.

#5 - plot_1D_profile confidence band loop: single mask computation
The boolean mask (x_values > x_start) & (x_values < x_end) was evaluated twice per segment (once for x, once for y) and converted through a Python list. It is now computed once, with np.concatenate instead of list wrapping.

#6 - bin_and_profile_2D: nested loop → np.meshgrid
Replaced the double Python loop over bin centres with a single np.meshgrid call + .ravel(), keeping the same flat-array layout (index = y_bin_index * n_xbins + x_bin_index).
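A minimal sketch of three of the patterns described above (#1, #4 and #6), written to illustrate the techniques rather than to reproduce the repository's code. The names concat_chunks, masked_copy and bin_centre_grid are hypothetical and do not come from the commit.

```python
import numpy as np


def concat_chunks(chunks):
    """Item #1 pattern: gather per-file arrays in a list, concatenate once.

    Hypothetical stand-in for the loop in read_hdf5_datasets; the real
    function reads from HDF5 files, which is not shown here.
    """
    collected = []
    for chunk in chunks:
        collected.append(np.asarray(chunk))  # cheap list append, no array copy
    # One allocation for the final array instead of one per file.
    return np.concatenate(collected) if collected else np.empty(0)


def masked_copy(data, mask):
    """Item #4 pattern: boolean (fancy) indexing already returns a new array,
    so wrapping the result in deepcopy adds nothing."""
    return data[mask]


def bin_centre_grid(x_centres, y_centres):
    """Item #6 pattern: flat bin-centre arrays via np.meshgrid + ravel.

    With the default indexing="xy", the grids have shape (n_ybins, n_xbins),
    so C-order ravel gives index = y_bin_index * n_xbins + x_bin_index.
    """
    xx, yy = np.meshgrid(x_centres, y_centres)
    return xx.ravel(), yy.ravel()
```

For example, with three x bins and two y bins, the entry at flat index 1 * 3 + 2 holds the third x centre and the second y centre, which matches the layout stated in item #6.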
Bonus - Module-level np.finfo constants
np.finfo(float).max and .eps (called ~20 times across function signatures and bodies) are now computed once at import time as _FLOAT_MAX and _FLOAT_EPS.
No description provided.