Release 2.5.4 · OasisLMF/OasisLMF

OasisLMF Changelog - 2.5.4

#1940 - enhancement/profile check script
#1942 - Fix brittle PolNumber backfill in IL input preparation
#1947 - enhancement/conversion_tool_speed
#1955 - improved quadratic interpolation so it evaluates in a way that's robu…
#1957 - Stochastic hazard dynamic footprint
#1963 - Update API client for OIDC M2M
#1964 - perf(gulmc): replace numba dicts with precomputed array-backed structures
#1967 - Fix for stalled runs on V2 workers
#1968 - improve numerical stability in variance calculations and add unit tests
#1969 - Improved bash error detection
#1971 - Fix IL merge failure when layers sharing a CondTag mix %-TIV and flat terms
#1973 - Add portfolio complexity metrics to oasislmf exposure run
#1974 - Improve rtree builtin
#1975 - Fix platform checks for external PRs
#1979 - Feature/hazard selection dynamic
#1980 - Round progress bar down
#1985 - port receiving data from non oasis source wont crash
#1987 - fix/pytools-empty-inputs
#1992 - Speed up summarypy read_buffer
#1993 - fix broken docs link
#1994 - fix summarypy missing dtypes
#1997 - Fix for ci error
#1999 - fix/input_gen_status

OasisLMF Notes

fixes to default profile + tests - (PR #1940)

Removed non-OED fields:
- PolLimit
- CondNumber
Fixed Cyber names.
Created tests for default_acc_profile and default_loc_profile that check:
- each field is in OEDSpec (with exceptions for BI Type fields and Cyber TIV fields)
- ProfileElementName matches its key (except for the Cyber TIV field)

enhancement/conversion_tool_speed - (PR #1947)

Update conversion tools for speed

Rewrites the Python converter implementations (csvtobin, bintocsv, bintoparquet, parquettobin) to reduce peak memory and improve throughput. Changes apply across all converter directions.

What changed:

The core change across all converters is chunked processing: CSV is read in fixed DEFAULT_BUFFER_SIZE chunks via iter_csv_as_ndarray(), binary output is written through pre-allocated batch buffers (_BATCH_ROWS), and parquet I/O streams via PyArrow's native ParquetWriter/iter_batches(). Binary inputs switch from np.fromfile to np.memmap. Hot-path encoding in fm, gul, and summarycalc csvtobin uses Numba JIT to build the binary stream format per chunk; validation state is carried across chunk boundaries as scalars rather than accumulating full-file structures.
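
The chunked pattern described above can be sketched in a few lines. This is a simplified stand-in using only the standard library, not the converters' actual code: the record layout (`RECORD`), the tiny `BATCH_ROWS` value, and the `csv_to_bin` helper are all hypothetical, and the real implementation works on NumPy arrays with Numba-compiled encoders. The point it illustrates is that one pre-allocated buffer is refilled and flushed per batch, so peak memory depends on the batch size rather than the file size.

```python
import csv
import io
import struct

# Hypothetical record layout for illustration: int32 id followed by float32 value.
RECORD = struct.Struct("<if")
BATCH_ROWS = 4  # tiny for illustration; a real batch would hold many more rows


def csv_to_bin(csv_file, bin_file, batch_rows=BATCH_ROWS):
    """Stream CSV rows into fixed-size binary records, one batch at a time."""
    buf = bytearray(RECORD.size * batch_rows)  # allocated once, reused for every batch
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row if there is one
    filled = 0
    total = 0
    for row in reader:
        RECORD.pack_into(buf, filled * RECORD.size, int(row[0]), float(row[1]))
        filled += 1
        if filled == batch_rows:
            bin_file.write(buf)  # flush a full batch
            total += filled
            filled = 0
    if filled:
        # Flush only the used part of the final, partially filled batch.
        bin_file.write(memoryview(buf)[: filled * RECORD.size])
        total += filled
    return total
```

Calling `csv_to_bin(io.StringIO("id,value\n1,0.5\n"), io.BytesIO())` writes one 8-byte record and returns the row count.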

Behaviour changes worth noting:

Vulnerability csvtobin: three validation checks removed — damage_bin_id contiguity, damage_bin_id starts at 1, and intensity_bin_id contiguity within each vulnerability. These no longer run even when no_validation=False. The suppress_int_bin_checks=False global intensity-bin consistency check is also replaced by a rolling per-vulnerability check, so cross-file inconsistencies between non-adjacent vulnerabilities are no longer caught.
Footprint: new decompressed_size flag writes the uncompressed size into zip .idx files; the bintocsv zip path reuses a single pre-allocated decompression buffer when the field is present.
Occurrence csvtobin: the no_date_alg=True path now validates period_no ≤ no_of_periods (previously unchecked).

Affected converters:

csvtobin: amplifications, coverages, damagebin, fm, footprint, gul, lossfactors, occurrence, summarycalc, vulnerability
bintocsv: amplifications, coverages, cdf, footprint, lossfactors, occurrence, vulnerability
bintoparquet / parquettobin: default handler (aal, melt, periods, items, correlations)

Tests: parametrised round-trip coverage for all converter types, no_validation paths, and decompressed_size index format.

Benchmark results

Best of 3 repeats. Memory via tracemalloc (Python heap + NumPy; excludes Numba-JIT internals).
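
For readers who want to take comparable measurements, the following is a minimal sketch of a best-of-N harness built on tracemalloc. The `benchmark` helper is hypothetical and is not the script used to produce the tables below. Tracing also slows execution, so the original timings may well have been taken in separate untraced runs.

```python
import time
import tracemalloc


def benchmark(func, repeats=3):
    """Return (best wall time in seconds, highest traced peak in bytes)."""
    best_time = float("inf")
    peak_bytes = 0
    for _ in range(repeats):
        tracemalloc.start()
        start = time.perf_counter()
        func()
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()  # (current, peak) since start()
        tracemalloc.stop()
        best_time = min(best_time, elapsed)
        peak_bytes = max(peak_bytes, peak)
    return best_time, peak_bytes
```

As the note above says, memory allocated outside the traced allocators (such as Numba-JIT internals) does not show up in the peak figure.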

csvtobin

Converter	Dataset	Speedup	Peak mem: orig → new
fm	40k ev × 5 items × 100 samples	10.1x (25.2s → 2.5s)	610 MB → 89 MB (6.9x)
gul	40k ev × 5 items × 100 samples	10.4x (28.0s → 2.7s)	610 MB → 89 MB (6.9x)
summarycalc	20k ev × 3 summaries × 100 samples	8.3x (10.4s → 1.3s)	229 MB → 108 MB (2.1x)
lossfactors	200k ev × 10 amp	6.5x (12.1s → 1.9s)	221 MB → 52 MB (4.3x)
footprint	15k ev × 100 ap × 2 ib	105x (44.5s → 0.42s)	261 MB → 78 MB (3.3x)
footprint (zip)	15k ev × 100 ap × 2 ib	32x (48.1s → 1.5s)	261 MB → 67 MB (3.9x)
vulnerability + idx	15k v × 50 ib × 10 db	96x (233s → 2.4s)	229 MB → 57 MB (4.0x)
vulnerability + zip + idx	15k v × 50 ib × 10 db	53x (232s → 4.4s)	229 MB → 57 MB (4.0x)
occurrence	10M events	1.1x (2.2s → 2.1s)	382 MB → 96 MB (4.0x)
coverages	5M coverages	1.4x (0.63s → 0.47s)	76 MB → 12 MB (6.2x)
amplifications	5M items	1.1x (0.43s → 0.40s)	76 MB → 28 MB (2.8x)
damagebin	5M bins	1.1x (1.19s → 1.13s)	219 MB → 83 MB (2.6x)

bintocsv

Converter	Dataset	Speedup	Peak mem: orig → new
footprint	15k ev × 100 ap × 2 ib	2.2x (0.33s → 0.15s)	39 MB → 707 KB (56.7x)
footprint (zip)	15k ev × 100 ap × 2 ib	1.2x	13 MB → 93 KB (144.8x)
vulnerability + idx	15k v × 50 ib × 10 db	1.4x	200 MB → 692 KB (296.5x)
vulnerability + zip + idx	15k v × 50 ib × 10 db	1.0x	153 MB → 80 KB (1960x)
lossfactors	200k ev × 10 amp	1.9x (3.1s → 1.6s)	244 MB → 337 KB (740.6x)
cdf	3k ev × 30 ap × 2 vuln × 10 bins	21.2x (5.5s → 0.26s)	~384 KB → ~883 KB
occurrence	10M events	2.6x (1.8s → 0.69s)	59 MB → 66 MB (~)
amplifications	5M items	1.6x (0.15s → 0.09s)	169 MB → 23 MB (7.4x)
coverages	5M coverages	1.3x (0.31s → 0.23s)	245 MB → 49 MB (5.0x)

bintoparquet / parquettobin (default handler: aal, melt, periods, items, correlations)

Direction	Converter	Dataset	Speedup	Peak mem: orig → new
bintoparquet	aal	5M rows	1.2x (0.39s → 0.32s)	229 MB → 30 MB (7.5x)
bintoparquet	melt	5M rows	1.2x (1.27s → 1.06s)	629 MB → 84 MB (7.5x)
parquettobin	aal	5M rows	2.1x (0.17s → 0.08s)	153 MB → 43 MB (3.5x)
parquettobin	melt	5M rows	1.7x (0.52s → 0.30s)	420 MB → 126 MB (3.3x)

closes #1944

Update API client for OIDC M2M - (PR #1963)

Added a new auth_mode, m2m, which uses the client_credentials grant directly against the IdP. Added three new flags to the API client CLI to support this:

  --auth-type {simple,oidc,m2m}
                        Authentication type: simple (username/password JWT), oidc (client credentials via platform),
                        m2m (client credentials direct to IdP)
  --oidc-token-url OIDC_TOKEN_URL
                        Token endpoint URL for m2m client_credentials grant (e.g.
                        https://idp.example.com/oauth2/token)
  --oidc-scope OIDC_SCOPE
                        OAuth2 scope to request when fetching an m2m token (e.g. oasis/m2m)
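
For orientation, a client_credentials token request to an IdP typically looks like the sketch below. It only builds the HTTP request and does not send it. The `build_token_request` helper and the client id and secret values are hypothetical, and this is not the API client's own code. It assumes the IdP accepts HTTP Basic client authentication; some providers expect the credentials in the form body instead.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(token_url, client_id, client_secret, scope=None):
    """Build (but do not send) an OAuth2 client_credentials token request."""
    form = {"grant_type": "client_credentials"}
    if scope:
        form["scope"] = scope
    credentials = f"{client_id}:{client_secret}".encode()
    headers = {
        # Assumption: the IdP accepts HTTP Basic client authentication.
        "Authorization": "Basic " + base64.b64encode(credentials).decode(),
        "Content-Type": "application/x-www-form-urlencoded",
    }
    return urllib.request.Request(
        token_url,
        data=urllib.parse.urlencode(form).encode(),
        headers=headers,
        method="POST",
    )
```

The two example values from the help text, `https://idp.example.com/oauth2/token` and `oasis/m2m`, correspond to the `token_url` and `scope` arguments here.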

perf(gulmc): replace numba dicts with precomputed array-backed structures - (PR #1964)

Ground-up loss (gulmc) now runs ~45% faster end-to-end and uses ~30% less peak memory on representative workloads, by replacing numba dicts with precomputed array-backed structures.
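
As a generic illustration of the idea (not the structures used in gulmc, which are not described here), a dict keyed by small non-negative integer ids can be replaced by a flat array built once up front, so each lookup becomes a plain index operation. The `build_index` and `lookup` helpers below are hypothetical, and the sketch assumes a non-empty list of keys that are dense enough for `max(keys) + 1` slots to be affordable.

```python
from array import array


def build_index(keys):
    """Map sparse integer keys to dense positions using a flat array instead of a dict."""
    size = max(keys) + 1
    index = array("q", [-1]) * size  # -1 marks "key not present"
    for pos, key in enumerate(keys):
        index[key] = pos
    return index


def lookup(index, key):
    """Return the dense position for key, or -1 when the key is unknown."""
    if 0 <= key < len(index):
        return index[key]
    return -1
```

The trade-off is memory proportional to the largest key rather than to the number of keys, in exchange for avoiding hashing on the hot path.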

Fix for stalled runs on V2 workers - (PR #1967)

Fixed an issue where one run script matched and deleted another chunk's FIFO queues, causing that chunk of events to stall.

Improved bash error detection - (PR #1969)

Bash script generation now checks for bash version support and adds -p var to wait calls. This checks the exit code of tracked background processes and kills the script if one errors.
Moved the bash tracing support check into Python.
Added a check that all expected named pipes exist and are FIFOs (not regular files); the check runs before the main execution starts. See: #1967
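
The named-pipe check can be pictured with the sketch below. It is written in Python for illustration only; the `find_bad_pipes` helper is hypothetical, and the real check may live in the generated bash script rather than in Python.

```python
import os
import stat


def find_bad_pipes(paths):
    """Return the paths that are missing or are not FIFOs (for example regular files)."""
    bad = []
    for path in paths:
        try:
            mode = os.stat(path).st_mode  # stat does not block on a FIFO; only open does
        except FileNotFoundError:
            bad.append(path)
            continue
        if not stat.S_ISFIFO(mode):
            bad.append(path)
    return bad
```

A caller would abort before starting the main execution if the returned list is non-empty.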

Fix IL merge failure when layers sharing a CondTag mix %-TIV and flat terms - (PR #1971)

Fixes a crash in IL input preparation (pandas.errors.MergeError: Merge keys are not unique in right dataset; not a many-to-one merge) that occurred when multiple policies/layers on the same account share a CondTag and at least one declares a %-of-TIV (or BI) financial term while another declares a flat or non-TIV-dependent term.

Improvements to rtree lookup builtin - (PR #1974)

Improve performance by using vectorised operations.
Rename parameter from nearest_neighbor_min_distance to nearest_neighbor_max_distance to correctly reflect that this is the greatest distance at which a point will be associated with a geometry. Former parameter is still accepted but will log a deprecation warning.
Hide the warning about distances being incorrect when using a geographical coordinate system (this is not ideal but can still function as a rough threshold).
Add comments explaining that the distance is the Euclidean distance, not the more accurate spherical or ellipsoidal approximation.
Add tests.
Remove references in code and parameter names to "area peril" since this is a generic function that can be used for other purposes.
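
To see why a Euclidean distance on a geographical coordinate system is only a rough threshold, the sketch below compares a planar distance in degrees with a great-circle (haversine) distance. Both functions are illustrative and are not part of the rtree builtin; the 6371 km Earth radius is the usual spherical approximation.

```python
import math


def euclidean_degrees(lon1, lat1, lon2, lat2):
    """Planar distance computed directly on degree coordinates."""
    return math.hypot(lon2 - lon1, lat2 - lat1)


def haversine_km(lon1, lat1, lon2, lat2, radius_km=6371.0):
    """Great-circle distance on a sphere of the given radius."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))
```

One degree of longitude gives the same Euclidean value at the equator and at 60°N, while the great-circle distance at 60°N is roughly half the equatorial one, so a fixed threshold in degrees covers different ground distances at different latitudes.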

Fix platform checks for external PRs - (PR #1975)

Fix so that platform checks work on external PRs.

fix/pytools-empty-inputs - (PR #1987)

Updates elt, plt, aal, lec, kat and join-summary-info code and tests to handle empty input files.

closes #1986

fix/input_gen_status - (PR #1999)

Adds OasisExceptionNoKeys error to generate files

closes OasisLMF/OasisPlatform#974
