Fix cartola XLS parsing for binary .xls files (v0.8.2) (#17)

@mabahamo mabahamo released this 06 Feb 21:54
· 16 commits to main since this release
6304464
* Fix cartola XLS parsing for binary .xls files with dynamic column detection

The XLS extractor had hardcoded column indices that only worked with
XLSX files. Real Banco de Chile binary .xls files produce a different
DataFrame layout (10 columns with shifted offsets due to merged cells).

- Add _find_value_column() and _find_header_column() helpers
- Make metadata extraction use flexible label matching (Sr(a), Cuenta)
- Dynamically detect balance/totals column positions
- Scan all columns for statement date in split "Movimientos" rows
- Add _detect_transaction_columns() to map header names to indices
- Add debug logging to identify() for diagnosing parse failures
- Add binary XLS test fixture and comprehensive tests
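
As a rough illustration of the dynamic detection described above, the sketch below maps header names to column indices by scanning rows rather than assuming fixed positions. It is not the project's actual code: the function names drop the leading underscore to mark them as stand-ins, the `HEADER_ALIASES` labels are assumed, and rows are modeled as plain Python lists (for example, what `DataFrame.values.tolist()` would yield) so the example runs with the standard library only.

```python
from typing import Dict, Optional, Sequence

# Hypothetical header labels; the labels the real extractor matches are not
# listed in these release notes.
HEADER_ALIASES = {
    "date": ("fecha",),
    "description": ("descripcion", "descripción"),
    "debit": ("cargo",),
    "credit": ("abono",),
    "balance": ("saldo",),
}


def _norm(cell) -> str:
    """Normalize a cell to lowercase text; None and NaN become ''."""
    if cell is None:
        return ""
    if isinstance(cell, float) and cell != cell:  # NaN is never equal to itself
        return ""
    return str(cell).strip().lower()


def detect_transaction_columns(rows: Sequence[Sequence]) -> Optional[Dict[str, int]]:
    """Return {field: column index} for the first row that looks like a header."""
    for row in rows:
        mapping: Dict[str, int] = {}
        for idx, cell in enumerate(row):
            text = _norm(cell)
            if not text:
                continue  # merged cells show up as empty columns
            for field, aliases in HEADER_ALIASES.items():
                if field not in mapping and any(text.startswith(a) for a in aliases):
                    mapping[field] = idx
                    break
        # Treat the row as a header only if the key columns were found.
        if "date" in mapping and "description" in mapping:
            return mapping
    return None


def find_value_column(row: Sequence, label_idx: int) -> Optional[int]:
    """Return the index of the first non-empty cell to the right of a label."""
    for idx in range(label_idx + 1, len(row)):
        if _norm(row[idx]):
            return idx
    return None


# The same header in a compact layout and in a 10-column layout with gaps.
xlsx_rows = [["Fecha", "Descripción", "Cargos", "Abonos", "Saldo"]]
xls_rows = [
    ["Movimientos", None, None, None, None, None, None, None, None, None],
    [None, "Fecha", None, "Descripción", None, None, "Cargos", "Abonos", None, "Saldo"],
]
xlsx_cols = detect_transaction_columns(xlsx_rows)
xls_cols = detect_transaction_columns(xls_rows)
```

Because both layouts resolve to a field-to-index mapping, the downstream parsing code can read `row[cols["balance"]]` without knowing which file type produced the rows.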

* Bump version to 0.8.2