csvalchemy

A Python package for reading and writing CSV files using Pydantic models.

Overview

csvalchemy provides a clean interface for validating CSV data against Pydantic models, handling errors gracefully, and writing validated results back to CSV files. It integrates with dydactic for robust validation of data records.

Features

- CSV Reading: Read CSV files and validate each row against Pydantic models
- Error Handling: Continue processing even when individual rows fail validation
- Type Safety: Full type hints and validation using Pydantic
- CSV Writing: Write validated results back to CSV files
- Integration: Built on dydactic for reliable validation

Dependencies

- Python: 3.10 or higher
- pydantic: >=2.9.2 (data validation using Python type annotations)
- dydactic: >=0.2.0 (validation engine; requires Python 3.10+)
- python-dateutil: >=2.8.0 (datetime parsing)

Installation

pip install csvalchemy

Quick Start

from pydantic import BaseModel
from csvalchemy import read
from io import StringIO

# Define your model
class Person(BaseModel):
    name: str
    age: int
    email: str | None = None

# Sample CSV content
csv_content = """name,age,email
Alice,30,alice@example.com
Bob,25,bob@example.com
Charlie,35,charlie@example.com
"""

# Read and validate CSV
with StringIO(csv_content) as f:
    for result in read(f, Person):
        if result.error:
            print(f"Validation error: {result.error}")
        else:
            print(f"Valid person: {result.result.name}, age {result.result.age}")

Output:

Valid person: Alice, age 30
Valid person: Bob, age 25
Valid person: Charlie, age 35

Examples

Error Handling

csvalchemy continues processing even when individual rows fail validation:

from pydantic import BaseModel
from csvalchemy import read
from io import StringIO

class Person(BaseModel):
    name: str
    age: int
    email: str | None = None

# CSV with some invalid rows
csv_content = """name,age,email
Alice,30,alice@example.com
Bob,not_a_number,bob@example.com
Charlie,35,charlie@example.com
Diana,not_a_number,diana@example.com
"""

with StringIO(csv_content) as f:
    valid_count = 0
    error_count = 0

    # Number data rows from 1 so each message points at the row that failed
    for row_number, result in enumerate(read(f, Person), start=1):
        if result.error:
            error_count += 1
            print(f"Error on row {row_number}: {result.error}")
        else:
            valid_count += 1
            print(f"Valid: {result.result.name}")

    print(f"\nSummary: {valid_count} valid, {error_count} errors")

Output:

Valid: Alice
Error on row 2: 1 validation error for Person
age
  Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='not_a_number', input_type=str]
    For further information visit https://errors.pydantic.dev/2.12/v/int_parsing
Valid: Charlie
Error on row 4: 1 validation error for Person
age
  Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='not_a_number', input_type=str]
    For further information visit https://errors.pydantic.dev/2.12/v/int_parsing

Summary: 2 valid, 2 errors

Writing Validated CSV

Write only validated results back to CSV:

from pydantic import BaseModel
from csvalchemy import read
from io import StringIO

class Product(BaseModel):
    id: int
    name: str
    price: float
    in_stock: bool

# Input CSV
input_csv = """id,name,price,in_stock
1,Widget,19.99,True
2,Gadget,29.99,False
3,Invalid,not_a_number,True
4,Thing,39.99,True
"""

# Read and validate the input, writing valid rows to a new CSV
input_file = StringIO(input_csv)
output_file = StringIO()
validator = read(input_file, Product)
writer = validator.csv_writer(output_file)

# Consume writer to trigger CSV writing
for result in writer:
    if result.error:
        print(f"Skipped invalid row: {result.error}")
    else:
        print(f"Wrote: {result.result.name}")

# Show output CSV
output_file.seek(0)
print("\n=== Output CSV ===")
print(output_file.read())

Output:

Wrote: Widget
Wrote: Gadget
Skipped invalid row: 1 validation error for Product
price
  Input should be a valid number, unable to parse string as a number [type=float_parsing, input_value='not_a_number', input_type=str]
    For further information visit https://errors.pydantic.dev/2.12/v/float_parsing
Wrote: Thing

=== Output CSV ===
id,name,price,in_stock
1,Widget,19.99,True
2,Gadget,29.99,False
4,Thing,39.99,True

Using Validator Directly

Validate records that do not come from a CSV file:

from pydantic import BaseModel
from csvalchemy import Validator
import dydactic.options

class Person(BaseModel):
    name: str
    age: int
    email: str | None = None

# Data not from CSV
records = [
    {"name": "Alice", "age": "30", "email": "alice@example.com"},
    {"name": "Bob", "age": "not_a_number", "email": "bob@example.com"},
    {"name": "Charlie", "age": "35"},
]

# Standard validation
print("=== Using Validator directly ===")
validator = Validator(iter(records), Person)

for result in validator:
    if result.error:
        print(f"Error: {result.error}")
    else:
        print(f"Valid: {result.result.name}, age {result.result.age}")

# Skip invalid records
print("\n=== Using SKIP error option ===")
validator_skip = Validator(
    iter(records),
    Person,
    error_option=dydactic.options.ErrorOption.SKIP
)

valid_results = list(validator_skip)
print(f"Got {len(valid_results)} valid results (invalid ones skipped)")

Output:

=== Using Validator directly ===
Valid: Alice, age 30
Error: 1 validation error for Person
age
  Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='not_a_number', input_type=str]
    For further information visit https://errors.pydantic.dev/2.12/v/int_parsing
Valid: Charlie, age 35

=== Using SKIP error option ===
Got 2 valid results (invalid ones skipped)

Integration with dydactic

csvalchemy uses dydactic as its core validation engine. The Validator and ValidatorIterator classes wrap dydactic.validate() to provide a consistent API for CSV data validation.

How it works

1. CSV Reading: read() creates a CSVReaderValidator that reads CSV rows using Python's csv.DictReader
2. Validation: Each row is validated using dydactic.validate(), which handles Pydantic model validation
3. Error Handling: Validation errors are captured without stopping the iteration
4. Result Mapping: dydactic's result objects are mapped to csvalchemy's Result type for a consistent API

Benefits

- Leverages dydactic's robust validation handling
- Independent validation of each record (errors don't stop processing)
- Type-safe error handling with clear error messages
- Compatible with dydactic's validation strategies
- Configurable error handling (RETURN, RAISE, or SKIP)
- Support for strict validation and attribute-based validation

Configuration Options

The Validator class supports dydactic's configuration options:

- error_option: Controls how validation errors are handled:
  - RETURN (default): Errors are returned in Result.error
  - RAISE: Exceptions are raised immediately on validation errors
  - SKIP: Records with errors are skipped entirely
- strict: Enable strict Pydantic validation
- from_attributes: Validate from object attributes

Example:

from pydantic import BaseModel
from csvalchemy import Validator
import dydactic.options

class Person(BaseModel):
    name: str
    age: int

records = [
    {"name": "Alice", "age": "30"},
    {"name": "Bob", "age": "invalid"},
    {"name": "Charlie", "age": "35"},
]

# Default: RETURN errors
validator_return = Validator(iter(records), Person)
results_return = list(validator_return)
print(f"RETURN mode: {len(results_return)} results (including errors)")

# SKIP invalid records
validator_skip = Validator(
    iter(records),
    Person,
    error_option=dydactic.options.ErrorOption.SKIP
)
results_skip = list(validator_skip)
print(f"SKIP mode: {len(results_skip)} results (errors skipped)")

Output:

RETURN mode: 3 results (including errors)
SKIP mode: 2 results (errors skipped)

Architecture Notes

Casting and Validation

csvalchemy provides two approaches to validation:

1. Full Validation (Recommended): Use Validator or read(), which leverage dydactic's complete validation pipeline, including its casting functionality. This is the primary and recommended approach for CSV validation.
2. Standalone Casting: The cast.py module provides casting utilities similar to dydactic.cast. This module is kept for:
  - Standalone use cases that don't require full dydactic validation
  - Direct class instantiation without Pydantic models
  - Testing scenarios

Note: The main validation flow uses dydactic's casting internally, so cast.py is not used in the primary validation pipeline.
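
As a rough idea of what standalone casting without Pydantic models can look like, the sketch below converts CSV string values to the annotated field types of a plain dataclass. It does not reproduce the API of cast.py or dydactic.cast: `to_bool`, `CONVERTERS`, and `cast_row` are hypothetical names, and the standard library's `date.fromisoformat` is used for date parsing.

```python
from dataclasses import dataclass
from datetime import date
from typing import get_type_hints


@dataclass
class Product:
    id: int
    name: str
    price: float
    in_stock: bool
    released: date


def to_bool(text: str) -> bool:
    """bool("False") is True in Python, so booleans need an explicit parser."""
    lowered = text.strip().lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise ValueError(f"not a boolean: {text!r}")


# Hypothetical converter table: one callable per supported target type.
CONVERTERS = {
    int: int,
    float: float,
    str: str,
    bool: to_bool,
    date: date.fromisoformat,
}


def cast_row(cls, row: dict[str, str]):
    """Build `cls` from string values, converting each field by its annotation."""
    hints = get_type_hints(cls)
    return cls(**{name: CONVERTERS[hint](row[name]) for name, hint in hints.items()})


product = cast_row(
    Product,
    {
        "id": "2",
        "name": "Gadget",
        "price": "29.99",
        "in_stock": "False",
        "released": "2024-05-01",
    },
)
print(product)
```

A table-driven approach like this covers only the types listed in the table; anything richer (optional fields, nested models, constraints) is where the full validation pipeline is the better fit.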

Requirements

- Python 3.10+ (required by dydactic)
- See pyproject.toml for the complete dependency list
