charex 0.5.0
charex 0.5.0 is the shape/layout release for the Numba 0.65.1 compatibility
line. It extends the read-only string operation surface from scalar, 0-D, and
1-D inputs to NumPy-matching N-D and broadcast-compatible shapes.
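As a rough illustration of what "NumPy-matching N-D and broadcast-compatible shapes" means, the sketch below uses plain NumPy only. It shows the reference behavior that the release targets. How the same calls are invoked through charex inside Numba-compiled code is not described in this section, so that part is deliberately left out.

```python
import numpy as np

# Fixed-width unicode inputs with broadcast-compatible shapes (2, 1) and (3,).
left = np.array([["apple"], ["banana"]])
right = np.array(["apple", "banana", "cherry"])

# The result shape follows ordinary NumPy broadcasting rules.
matches = np.char.equal(left, right)
print(matches.shape)     # (2, 3)
print(matches.tolist())  # [[True, False, False], [False, True, False]]

# A 2-D input to an information function keeps its shape.
grid = np.array([["a", "bb"], ["ccc", ""]])
lengths = np.char.str_len(grid)
print(lengths.tolist())  # [[1, 2], [3, 0]]
```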
Compatibility
- Python >=3.10,<3.15
- Numba >=0.65.1,<0.66
- NumPy supported/tested window: >=1.22,<1.27 or >=2.0,<2.5
- llvmlite 0.47.x
np.strings support is conditional on NumPy 2.x.
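Because the supported window spans both NumPy 1.x and 2.x, calling code may need to choose a namespace at runtime. The following is a minimal sketch of one way to do that in plain NumPy. The name HAS_NP_STRINGS is a hypothetical helper for this example and is not part of charex.

```python
import numpy as np

# np.strings exists only on NumPy 2.x; np.char is available on both lines.
# HAS_NP_STRINGS is a hypothetical flag used only in this sketch.
HAS_NP_STRINGS = hasattr(np, "strings")

words = np.array(["alpha", "beta"])

if HAS_NP_STRINGS:
    lengths = np.strings.str_len(words)
else:
    lengths = np.char.str_len(words)

print(lengths.tolist())  # [5, 4]
```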
Highlights
- Supports N-D and broadcast-compatible shapes for fixed-width S/U arrays
  through both np.char and np.strings.
- Supports N-D and broadcast-compatible shapes for NumPy 2.x variable-width
  StringDType arrays through np.strings.
- Supports contiguous arrays, read-only views, positive and negative strides,
  zero-stride views, and empty views for the supported read-only catalog.
- Supports default StringDType() and StringDType(na_object=...) variants
  with NumPy-matching operation-specific null behavior.
- Preserves separate np.char and np.strings semantics, including the
  trailing whitespace/NUL behavior difference.
- Keeps transformation/output-producing operations outside this release scope.
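The layouts named in the list above can be constructed with ordinary NumPy slicing and broadcasting. The sketch below builds one of each and runs a read-only operation on it in plain NumPy. It shows only the reference results that charex is stated to match and does not call charex itself.

```python
import numpy as np

base = np.array(["a", "bb", "ccc", "dddd"])

reversed_view = base[::-1]                  # negative stride
every_other = base[::2]                     # positive, non-unit stride
repeated = np.broadcast_to(base[:1], (3,))  # zero-stride, read-only view
empty = base[:0]                            # empty view

frozen = base.view()                        # explicitly read-only view
frozen.flags.writeable = False

print(np.char.str_len(reversed_view).tolist())  # [4, 3, 2, 1]
print(np.char.str_len(every_other).tolist())    # [1, 3]
print(np.char.str_len(repeated).tolist())       # [1, 1, 1]
print(np.char.str_len(empty).tolist())          # []
print(np.char.str_len(frozen).tolist())         # [1, 2, 3, 4]
```

None of these views copies the underlying buffer, which is why stride handling matters for a read-only catalog.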
Supported Read-Only Catalog
- comparisons: equal, not_equal, greater, greater_equal, less, less_equal;
- occurrence/search: count, startswith, endswith, find, rfind, index, rindex;
- information/predicates: str_len, isalpha, isalnum, isdigit, isdecimal,
  isnumeric, isspace, islower, isupper, istitle;
- np.char.compare_chararrays for fixed-width S/U.
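The trailing-whitespace difference between the np.char and np.strings comparison semantics can be seen in plain NumPy with np.char.compare_chararrays, whose last argument controls whether trailing whitespace is stripped before comparing. This is a sketch of the NumPy reference behavior only. The np.strings branch is guarded because that namespace requires NumPy 2.x.

```python
import numpy as np

a = np.array(["a ", "b"])
b = np.array(["a", "c"])

# rstrip=True: trailing whitespace is removed before comparing.
stripped = np.char.compare_chararrays(a, b, "==", True)
# rstrip=False: the trailing space is significant.
exact = np.char.compare_chararrays(a, b, "==", False)

print(stripped.tolist())  # [True, False]
print(exact.tolist())     # [False, False]

# np.char.equal uses the stripping behavior.
char_equal = np.char.equal(a, b)
print(char_equal.tolist())  # [True, False]

# np.strings.equal (NumPy 2.x only) compares without stripping whitespace.
if hasattr(np, "strings"):
    strings_equal = np.strings.equal(a, b)
    print(strings_equal.tolist())  # [False, False]
```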
Parity Audit
The full shape audit on Python 3.12.8, NumPy 2.4.6, and Numba 0.65.1 reports:
- rows: 1702
- matching rows: 1702
- mismatches: 0
- NumPy accepts but charex rejects: 0
The audit CSV is written to
docs/exploration/string_array_shape_audit.csv; the summary is maintained in
docs/string-array-shape-parity.md.
Benchmarks
The Numba 0.65.1 benchmark matrix in
docs/benchmarks/numba-v-0.65.1 includes
fixed-width np.char inputs and NumPy 2.x StringDType inputs through
np.strings.
The current matrix reports a 1.60x median speedup over NumPy across 135
fixed-width and StringDType cases, with per-case results ranging from 1.02x
to 6.51x NumPy speed.
Not In Scope
- Transformation/output-producing operations such as replace, case conversion,
  strip, pad, join, split, encode, and decode.
- Object array bridges.
- Max-performance experimental kernels that have not been distilled.