Douwe de Vries edited this page Jul 2, 2026 · 1 revision

Data loading and JSON fields

Active contributors: Douwe de Vries

Purpose

The data layer parses CSV input, detects the delimiter and column types, rejects unsupported headers (for example, duplicates), and exposes fields of JSON objects as virtual, selectable columns. It runs before every mapping or comparison workflow.

Directory layout

src/data/
├── csv_loader.rs
├── json_fields.rs
├── types.rs
├── export.rs
└── mod.rs

Key abstractions

| Name | File | Description |
| --- | --- | --- |
| CsvData | src/data/types.rs | Headers and row matrix for a loaded file. |
| load_csv_from_bytes | src/data/csv_loader.rs | Decodes file bytes and parses CSV text. |
| detect_columns | src/data/csv_loader.rs | Infers string, integer, float, or date-like columns. |
| discover_virtual_headers | src/data/json_fields.rs | Finds JSON object paths inside cells and exposes labels. |
| ColumnSelection | src/data/json_fields.rs | Physical or virtual selection resolved against headers. |
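
To make the detect_columns row more concrete, here is a minimal sketch of per-column type inference. It is not the code in src/data/csv_loader.rs: the `ColumnKind` enum, the `infer_kind` function, and the `YYYY-MM-DD` date shape are all assumptions made for illustration.

```rust
/// Hypothetical column kinds; the real definitions live in src/data/types.rs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ColumnKind {
    Integer,
    Float,
    DateLike,
    String,
}

/// Returns true for values shaped like `YYYY-MM-DD` (an assumed date format).
fn looks_like_iso_date(value: &str) -> bool {
    let bytes = value.as_bytes();
    bytes.len() == 10
        && bytes.iter().enumerate().all(|(i, b)| match i {
            4 | 7 => *b == b'-',
            _ => b.is_ascii_digit(),
        })
}

/// Infers the narrowest kind that every non-empty cell satisfies.
/// Note: `f64` parsing also accepts forms such as `1e5`, `inf`, and `NaN`.
pub fn infer_kind<'a>(cells: impl IntoIterator<Item = &'a str>) -> ColumnKind {
    let mut seen_any = false;
    let (mut all_int, mut all_float, mut all_date) = (true, true, true);
    for cell in cells {
        let value = cell.trim();
        if value.is_empty() {
            continue; // blank cells do not vote
        }
        seen_any = true;
        all_int &= value.parse::<i64>().is_ok();
        all_float &= value.parse::<f64>().is_ok();
        all_date &= looks_like_iso_date(value);
    }
    match (seen_any, all_int, all_float, all_date) {
        (false, ..) => ColumnKind::String, // empty column: fall back to string
        (_, true, ..) => ColumnKind::Integer,
        (_, _, true, _) => ColumnKind::Float,
        (_, _, _, true) => ColumnKind::DateLike,
        _ => ColumnKind::String,
    }
}
```

Checking the integer case before the float case matters here, because every value that parses as `i64` also parses as `f64`.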

How it works

src/data/csv_loader.rs decodes bytes with encoding_rs_io, strips a leading UTF-8 BOM, detects whether the delimiter is a comma or a semicolon from the first non-empty row, and parses the text as strict CSV with a header row. Duplicate headers are rejected. src/data/json_fields.rs scans cells that hold JSON objects and emits a dotted virtual label for each path, falling back to a # label when a physical header prefix would make the dotted form ambiguous.
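
The header-handling steps can be sketched with the standard library alone. This is a simplified illustration under stated assumptions, not the loader itself: `strip_bom`, `detect_delimiter`, and `parse_headers` are hypothetical names, the input is assumed to be already decoded text (the real loader decodes with encoding_rs_io first), and quoted fields are ignored.

```rust
use std::collections::HashSet;

/// Removes a leading UTF-8 byte order mark, if one is present.
pub fn strip_bom(text: &str) -> &str {
    text.strip_prefix('\u{feff}').unwrap_or(text)
}

/// Returns the first row that is not blank, if any.
fn first_non_empty_row(text: &str) -> Option<&str> {
    text.lines().find(|line| !line.trim().is_empty())
}

/// Picks `;` when the first non-empty row has more semicolons than commas,
/// and `,` otherwise. Quoted fields are not considered in this sketch.
pub fn detect_delimiter(text: &str) -> char {
    let row = first_non_empty_row(text).unwrap_or("");
    let commas = row.matches(',').count();
    let semicolons = row.matches(';').count();
    if semicolons > commas { ';' } else { ',' }
}

/// Splits the header row and rejects duplicate names.
pub fn parse_headers(text: &str) -> Result<Vec<String>, String> {
    let text = strip_bom(text);
    let delimiter = detect_delimiter(text);
    let header_row =
        first_non_empty_row(text).ok_or_else(|| "no header row found".to_string())?;
    let mut seen = HashSet::new();
    let mut headers = Vec::new();
    for name in header_row.split(delimiter) {
        let name = name.trim().to_string();
        // `insert` returns false when the name was already present.
        if !seen.insert(name.clone()) {
            return Err(format!("duplicate header: {}", name));
        }
        headers.push(name);
    }
    Ok(headers)
}
```

Under these assumptions, `parse_headers("id,name,id\n1,x,2")` returns an error, while a semicolon-delimited file with a BOM yields clean header names.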

graph TD
    Bytes[CSV bytes] --> Decode[decode_csv_text]
    Decode --> Parse[parse_csv_text]
    Parse --> Columns[detect_columns]
    Parse --> Virtual[discover_virtual_headers]
    Virtual --> Labels[physical and virtual labels]
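
The virtual-header branch of the pipeline could look roughly like the sketch below. Several things are assumptions rather than facts about src/data/json_fields.rs: the `Json` enum stands in for an already parsed cell, `virtual_labels` is a hypothetical name, the label forms are taken to be `header.path` and `header#path`, and a dotted label counts as ambiguous only when it collides with a physical header. The real rule may be broader.

```rust
use std::collections::BTreeSet;

/// Minimal stand-in for a parsed JSON cell (hypothetical type).
pub enum Json {
    Scalar(String),
    Object(Vec<(String, Json)>),
}

/// Collects the dotted path of every leaf inside an object value.
/// Scalar cells contribute nothing.
fn collect_paths(value: &Json, prefix: &str, out: &mut Vec<String>) {
    if let Json::Object(fields) = value {
        for (key, child) in fields {
            let path = if prefix.is_empty() {
                key.clone()
            } else {
                format!("{}.{}", prefix, key)
            };
            match child {
                Json::Object(_) => collect_paths(child, &path, out),
                Json::Scalar(_) => out.push(path),
            }
        }
    }
}

/// Builds sorted, de-duplicated virtual labels for one column.
/// Uses `header.path`, or `header#path` (assumed form) when the dotted
/// label would collide with an existing physical header.
pub fn virtual_labels(header: &str, cells: &[Json], physical: &[&str]) -> Vec<String> {
    let mut paths = Vec::new();
    for cell in cells {
        collect_paths(cell, "", &mut paths);
    }
    let unique: BTreeSet<String> = paths.into_iter().collect();
    unique
        .into_iter()
        .map(|path| {
            let dotted = format!("{}.{}", header, path);
            if physical.iter().any(|p| *p == dotted) {
                format!("{}#{}", header, path)
            } else {
                dotted
            }
        })
        .collect()
}
```

For a `meta` column whose cells contain `id` and `geo.lat`, with a physical `meta.id` header already present, this sketch yields `meta.geo.lat` and `meta#id`.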

Integration points

src/backend/workflow.rs builds FileLoadResponse with physical headers, virtual headers, detected columns, and row counts. src/backend/validation.rs, src/comparison/rows.rs, src/backend/pair_order.rs, and src/backend/persistence/v1/mod.rs all resolve physical and virtual labels.
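
As a rough picture of what resolving a label involves, the sketch below maps a label back to a physical column index or to a column plus a JSON path. The `ColumnSelection` variants and the `resolve_label` function shown here are hypothetical shapes, and the `.` and `#` separators are assumed from the label description above; the actual type in src/data/json_fields.rs may differ.

```rust
/// Hypothetical shape of a resolved selection.
#[derive(Debug, PartialEq, Eq)]
pub enum ColumnSelection {
    Physical { index: usize },
    Virtual { index: usize, path: Vec<String> },
}

/// Resolves a label against the physical headers of a loaded file.
pub fn resolve_label(label: &str, headers: &[&str]) -> Option<ColumnSelection> {
    // An exact physical match always wins.
    if let Some(index) = headers.iter().position(|h| *h == label) {
        return Some(ColumnSelection::Physical { index });
    }
    // Try longer headers first so that "a.b" is preferred over "a" as a prefix.
    let mut candidates: Vec<(usize, &str)> = headers.iter().copied().enumerate().collect();
    candidates.sort_by_key(|(_, h)| std::cmp::Reverse(h.len()));
    for (index, header) in candidates {
        let rest = match label.strip_prefix(header) {
            Some(rest) => rest,
            None => continue,
        };
        // The separator must be `.` or `#`, followed by a non-empty path.
        let path = match rest.strip_prefix('.').or_else(|| rest.strip_prefix('#')) {
            Some(path) if !path.is_empty() => path,
            _ => continue,
        };
        return Some(ColumnSelection::Virtual {
            index,
            path: path.split('.').map(str::to_string).collect(),
        });
    }
    None
}
```

With headers `id`, `meta`, and `meta.id`, the label `meta.id` resolves to the physical column, while `meta#id` resolves to the `id` path inside the `meta` column.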

Entry points for modification

Change parsing rules in src/data/csv_loader.rs, then update tests/csv_loader_integration.rs and workflow tests. Change virtual label grammar in src/data/json_fields.rs, then update JSON field integration tests and frontend selector tests.

Key source files

| File | Purpose |
| --- | --- |
| src/data/csv_loader.rs | Decode, parse, delimiter detection, duplicate headers, type detection. |
| src/data/json_fields.rs | Virtual JSON header discovery and extraction. |
| src/data/types.rs | Core CSV, column, mapping, result, and normalization types. |
| tests/csv_loader_integration.rs | CSV loading regression tests. |
| tests/json_virtual_fields_integration.rs | Virtual field behavior tests. |
