This repository was archived by the owner on Jun 13, 2026. It is now read-only.

Providers

benzsevern edited this page Mar 29, 2026 · 1 revision

Providers extract schema information from data sources. infermap auto-detects the right provider from the input: a file extension, a database URI scheme, or an in-memory object type.

Auto-Detection

| Input | Provider |
| --- | --- |
| .csv, .parquet, .xlsx | FileProvider |
| postgresql://, mysql://, sqlite://, duckdb:// | DBProvider |
| .yaml, .yml, .json | SchemaFileProvider |
| Polars/Pandas DataFrame, list[dict] | InMemoryProvider |
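
The mapping above can be read as a simple dispatch rule. The sketch below is not infermap's actual implementation; `guess_provider` is a hypothetical helper that only returns the provider name the Auto-Detection mapping implies, assuming string inputs are either URIs or file paths and anything else is in-memory data.

```python
from pathlib import PurePosixPath

# Extensions and URI prefixes copied from the Auto-Detection mapping.
FILE_EXTS = {".csv", ".parquet", ".xlsx"}
SCHEMA_EXTS = {".yaml", ".yml", ".json"}
DB_SCHEMES = ("postgresql://", "mysql://", "sqlite://", "duckdb://")


def guess_provider(source: object) -> str:
    """Hypothetical helper: name the provider implied by the input."""
    if isinstance(source, str):
        # Check URI schemes first so a path-like tail is not misread.
        if source.startswith(DB_SCHEMES):
            return "DBProvider"
        suffix = PurePosixPath(source).suffix.lower()
        if suffix in FILE_EXTS:
            return "FileProvider"
        if suffix in SCHEMA_EXTS:
            return "SchemaFileProvider"
        raise ValueError(f"unrecognized source: {source!r}")
    # Assumption: non-string inputs (DataFrames, list[dict]) are in-memory data.
    return "InMemoryProvider"


print(guess_provider("customers.csv"))
print(guess_provider("sqlite:///path/to/db"))
print(guess_provider([{"email": "a@example.com"}]))
```

Checking the URI scheme before the file extension keeps an input such as duckdb:///path/to/db.duckdb from being treated as a file path.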

Database Support

| DB | URI Format | Install |
| --- | --- | --- |
| SQLite | sqlite:///path/to/db | built-in |
| PostgreSQL | postgresql://user:pass@host:5432/db | pip install infermap[postgres] |
| DuckDB | duckdb:///path/to/db.duckdb | pip install infermap[duckdb] |
| MySQL | mysql://user:pass@host:3306/db | pip install infermap[mysql] |
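
As a rough illustration of how these URIs break down, the sketch below uses the standard library's `urllib.parse.urlsplit` to read the scheme and look up the matching install hint. `install_hint` and `INSTALL_HINTS` are hypothetical names introduced here, not part of infermap.

```python
from urllib.parse import urlsplit

# Install hints copied from the Database Support listing.
INSTALL_HINTS = {
    "sqlite": "built-in",
    "postgresql": "pip install infermap[postgres]",
    "duckdb": "pip install infermap[duckdb]",
    "mysql": "pip install infermap[mysql]",
}


def install_hint(uri: str) -> str:
    """Hypothetical helper: map a database URI to its install hint."""
    scheme = urlsplit(uri).scheme
    if scheme not in INSTALL_HINTS:
        raise ValueError(f"unsupported database scheme: {scheme!r}")
    return INSTALL_HINTS[scheme]


parts = urlsplit("postgresql://user:pass@host:5432/db")
print(parts.scheme, parts.hostname, parts.port, parts.path)
print(install_hint("sqlite:///path/to/db"))
```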

Schema Definition Files

fields:
  - name: email
    type: string
    aliases: [email_address, e_mail]
    required: true
  - name: phone
    type: string
    aliases: [tel, mobile]

Schema definition files add extra signals to the scorer pipeline; they don't bypass it.
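
One way to picture the alias signal is as a lookup from any declared alias to its canonical field name. The sketch below is an assumption about how such a lookup could be built, not how infermap's scorers consume aliases; `build_alias_index` is a hypothetical helper, and the dictionary mirrors the YAML example as it would look once parsed.

```python
# The schema definition from the YAML example, as a parsed dictionary.
schema = {
    "fields": [
        {
            "name": "email",
            "type": "string",
            "aliases": ["email_address", "e_mail"],
            "required": True,
        },
        {"name": "phone", "type": "string", "aliases": ["tel", "mobile"]},
    ]
}


def build_alias_index(schema: dict) -> dict[str, str]:
    """Hypothetical helper: map each name and alias to its canonical field."""
    index: dict[str, str] = {}
    for entry in schema["fields"]:
        canonical = entry["name"]
        index[canonical] = canonical
        # "aliases" is optional, so fall back to an empty list.
        for alias in entry.get("aliases", []):
            index[alias] = canonical
    return index


alias_index = build_alias_index(schema)
print(alias_index["e_mail"])
print(alias_index["mobile"])
```

A match found this way would be one input among several to a score, consistent with the note that schema files do not bypass the scorer pipeline.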

FieldInfo

Every provider produces FieldInfo objects:

| Field | Type | Description |
| --- | --- | --- |
| name | str | Column name |
| dtype | str | Normalized type (string/integer/float/boolean/date/datetime) |
| sample_values | list[str] | Sampled values (stringified) |
| null_rate | float | Fraction of nulls (0.0-1.0) |
| unique_rate | float | Fraction of unique values |
| value_count | int | Non-null count |
| metadata | dict | Provider-specific extras |
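
To make the field definitions concrete, here is a minimal sketch of a record with the same fields and one way to compute the statistics from a list of values. `FieldInfoSketch` and `profile_column` are hypothetical names, not infermap's classes, and the sketch assumes `unique_rate` is measured over non-null values, which the page does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class FieldInfoSketch:
    """Hypothetical stand-in mirroring the FieldInfo fields listed above."""
    name: str
    dtype: str
    sample_values: list[str]
    null_rate: float
    unique_rate: float
    value_count: int
    metadata: dict = field(default_factory=dict)


def profile_column(name: str, values: list, dtype: str = "string",
                   max_samples: int = 5) -> FieldInfoSketch:
    """Compute the summary statistics for one column of raw values."""
    non_null = [v for v in values if v is not None]
    total = len(values)
    null_rate = (total - len(non_null)) / total if total else 0.0
    # Assumption: uniqueness is measured among non-null values only.
    unique_rate = len(set(non_null)) / len(non_null) if non_null else 0.0
    return FieldInfoSketch(
        name=name,
        dtype=dtype,
        sample_values=[str(v) for v in non_null[:max_samples]],
        null_rate=null_rate,
        unique_rate=unique_rate,
        value_count=len(non_null),
    )


info = profile_column("email", ["a@x.com", None, "b@x.com", "a@x.com"])
print(info)
```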
