feat(csv): read-only mode for files past a size threshold (memory-mapped / DuckDB-backed) #1472

@salmonumbrella

Problem

TablePro currently loads CSVs fully into memory. Open a 2 GB Stripe export or a multi-million-row log dump, and the app either pauses for minutes or fails. Tablecruncher opens 2 GB / 16M rows in 32 s on an M2 by skipping the full load; Modern CSV advertises load times "up to 11× Excel" via the same technique. This is the bar for being the only CSV tool on the user's Mac.

Proposed solution

A Read-Only Big-File mode that engages automatically past a configurable threshold (default 1 GB or a few million rows):

  • File opens via memory-mapping, not full in-memory load
  • Grid scrolls and renders normally; the column-header context menu still offers Copy Column Values, Copy as CSV / Markdown / IN Clause, Hide Column, Column Statistics…, and the per-column filter from #1454 (feat(datagrid): per-column local value filter (Excel-style))
  • Disabled in this mode: cell edit, fill column, structural ops, find & replace
  • A Switch to Edit Mode action loads the file fully into memory and re-enables every mutation surface
  • A mode indicator lives in the window header so a greyed-out action is never confusing
  • Threshold configurable in Preferences
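
To make the threshold and memory-mapping bullets concrete, here is a rough sketch in Python (not TablePro's actual code). The names `choose_mode`, `build_row_index`, and `read_rows` are hypothetical, the 1 GiB default only mirrors the number proposed above, and the row index naively splits on every newline, so quoted fields containing newlines would need a real CSV scanner.

```python
import mmap
import os
import tempfile

DEFAULT_THRESHOLD_BYTES = 1 << 30  # assumption: 1 GiB, mirroring the proposed default


def choose_mode(path, threshold_bytes=DEFAULT_THRESHOLD_BYTES):
    """Return 'read-only' for files at or past the threshold, else 'edit'."""
    return "read-only" if os.path.getsize(path) >= threshold_bytes else "edit"


def build_row_index(mm):
    """Collect the byte offset at which each line starts.

    Naive on purpose: every b"\\n" ends a row, so newlines inside quoted
    fields are NOT handled here.
    """
    offsets = [0] if len(mm) else []
    pos = mm.find(b"\n")
    while pos != -1:
        if pos + 1 < len(mm):
            offsets.append(pos + 1)
        pos = mm.find(b"\n", pos + 1)
    return offsets


def read_rows(mm, offsets, first, count):
    """Decode only the rows the grid asked for; the rest stays on disk."""
    rows = []
    for i in range(first, min(first + count, len(offsets))):
        end = offsets[i + 1] if i + 1 < len(offsets) else len(mm)
        rows.append(mm[offsets[i]:end].decode("utf-8").rstrip("\r\n"))
    return rows


# Tiny demonstration file standing in for a multi-gigabyte export.
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(b"id,amount\n1,10\n2,20\n3,30\n")
    sample_path = tmp.name

mode_small_threshold = choose_mode(sample_path, threshold_bytes=10)
mode_default = choose_mode(sample_path)

with open(sample_path, "rb") as fh:
    mapping = mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ)
    row_offsets = build_row_index(mapping)
    visible_rows = read_rows(mapping, row_offsets, first=1, count=2)
    mapping.close()

os.unlink(sample_path)
```

The point of the sketch is that only the offsets list lives in memory; row text is decoded on demand for whatever window the grid is showing. A row-count threshold would need at least one pass over the file, so a size check is the cheaper first gate.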

Implementation

  • The DuckDB plugin's duckdb.read_csv_auto over a memory-mapped file is the most direct backing path. Query results stream into the grid via the existing datasource interface.
  • Read-only as a primitive is independently useful. Some workflows want a guarantee the on-disk file isn't being silently dirtied during inspection.
  • Tablecruncher's parser lives in src/csvparser.cpp / src/csvdatastorage.cpp and is GPLv3
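
The read-only guarantee mentioned above can be illustrated with a small Python sketch, assuming the file is opened through a read-only memory mapping. `open_read_only` and `sha256_of` are hypothetical helper names, and this only shows the idea at the OS/mapping level; it says nothing about how the DuckDB plugin opens files.

```python
import hashlib
import mmap
import os
import tempfile


def open_read_only(path):
    """Map `path` so that any write through the mapping is rejected."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return mmap.mmap(fd, 0, access=mmap.ACCESS_READ)
    finally:
        os.close(fd)  # mmap keeps its own duplicate of the descriptor


def sha256_of(path):
    """Hash the on-disk bytes so before/after states can be compared."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Small stand-in file for the demonstration.
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(b"id,amount\n1,10\n")
    sample_path = tmp.name

checksum_before = sha256_of(sample_path)
view = open_read_only(sample_path)
try:
    view[0:2] = b"XX"  # an accidental mutation attempt
    write_rejected = False
except TypeError:
    # Python's mmap refuses item assignment on ACCESS_READ mappings.
    write_rejected = True
view.close()
checksum_after = sha256_of(sample_path)
os.unlink(sample_path)
```

Enforcing read-only at the mapping level means a bug in a mutation surface fails loudly instead of silently dirtying the file, which is the guarantee the inspection workflows above are asking for.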
