bug: CC transactions missing filename cause unknown grouping in CCGroupingService #127

@longieirl

Description

CC transactions were being grouped under "unknown" in CCGroupingService because tx.filename was empty for some transactions, causing pdf_card_numbers.get(tx.filename) to return None.

Root Cause

In processor.py, _build_grouping_inputs() collects CC transactions from ExtractionResult.transactions and passes them directly to group_by_card(). However, _enrich_with_filename() (which backfills tx.filename from additional_fields['source_pdf']) is only called later, inside prepare_transactions() for IBAN groups. It is never called for CC transactions before grouping.

Some transactions produced by RowPostProcessor have an empty Filename field, so tx.filename = '' after from_dict(). These can't be looked up in pdf_card_numbers and fall through to card_suffix = 'unknown'.
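
The failure mode can be reproduced in isolation. The following is a minimal sketch, not the project's code: the Tx class, the card_suffix_for() helper, the file name statement.pdf, and the masked card number format are all assumptions made for illustration. Only the pdf_card_numbers.get(tx.filename) lookup and the 'unknown' fallback come from the description above.

```python
from dataclasses import dataclass


@dataclass
class Tx:
    # Hypothetical stand-in for the project's transaction model.
    filename: str


def card_suffix_for(tx, pdf_card_numbers):
    # Hypothetical helper mirroring the lookup described in this issue:
    # a filename that is not a key in the mapping yields None,
    # and the suffix then falls back to "unknown".
    card_number = pdf_card_numbers.get(tx.filename)
    return card_number[-4:] if card_number else "unknown"


# Illustrative mapping; the real value format is an assumption.
pdf_card_numbers = {"statement.pdf": "**** **** **** 9459"}

good = card_suffix_for(Tx(filename="statement.pdf"), pdf_card_numbers)
bad = card_suffix_for(Tx(filename=""), pdf_card_numbers)
```

Here good resolves to the card suffix, while bad ends up as "unknown" because the empty string is not a key in pdf_card_numbers.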

Fix

In _build_grouping_inputs(), stamp tx.filename = extraction.source_file.name for any CC transaction with an empty filename before extending all_cc_txns.
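
A rough sketch of what that stamping step might look like follows. The real _build_grouping_inputs() is not shown in this issue, so the Tx and ExtractionResult classes and the collect_cc_transactions() function below are simplified, hypothetical stand-ins. Only the names source_file, transactions, filename, and all_cc_txns are taken from the text above.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Tx:
    # Hypothetical stand-in for the project's transaction model.
    filename: str = ""


@dataclass
class ExtractionResult:
    # Simplified stand-in; the real class likely carries more fields.
    source_file: Path
    transactions: list = field(default_factory=list)


def collect_cc_transactions(extractions):
    # Hypothetical excerpt of the CC branch of _build_grouping_inputs().
    all_cc_txns = []
    for extraction in extractions:
        for tx in extraction.transactions:
            # Backfill only when the filename is empty, so values that
            # were already set upstream are left untouched.
            if not tx.filename:
                tx.filename = extraction.source_file.name
        all_cc_txns.extend(extraction.transactions)
    return all_cc_txns


extraction = ExtractionResult(
    source_file=Path("input/statement.pdf"),  # illustrative path
    transactions=[Tx(filename=""), Tx(filename="other.pdf")],
)
stamped = collect_cc_transactions([extraction])
```

After the call, the first transaction carries statement.pdf as its filename and the second keeps other.pdf, so every CC transaction reaches group_by_card() with a non-empty key for the pdf_card_numbers lookup.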

Impact

Paid tier: 7 transactions were bucketed under unknown instead of their correct card suffix (9459). After the fix, bank_statements_unknown.* files are no longer produced.

Metadata

Labels: bug