A Python package for reading and writing CSV files using Pydantic models.
csvalchemy provides a clean interface for validating CSV data against Pydantic models, handling errors gracefully, and writing validated results back to CSV files. It integrates with dydactic for robust validation of data records.
Features:
- CSV Reading: Read CSV files and validate each row against Pydantic models
- Error Handling: Continue processing even when individual rows fail validation
- Type Safety: Full type hints and validation using Pydantic
- CSV Writing: Write validated results back to CSV files
- Integration: Built on dydactic for reliable validation
Requirements:
- Python: 3.10 or higher
- pydantic: >=2.9.2 (Data validation using Python type annotations)
- dydactic: >=0.2.0 (Validation engine - requires Python 3.10+)
- python-dateutil: >=2.8.0 (DateTime parsing)
Installation:
pip install csvalchemy

Quick start:
from pydantic import BaseModel
from csvalchemy import read
from io import StringIO

# Define your model
class Person(BaseModel):
    name: str
    age: int
    email: str | None = None

# Sample CSV content
csv_content = """name,age,email
Alice,30,alice@example.com
Bob,25,bob@example.com
Charlie,35,charlie@example.com
"""

# Read and validate CSV
with StringIO(csv_content) as f:
    for result in read(f, Person):
        if result.error:
            print(f"Validation error: {result.error}")
        else:
            print(f"Valid person: {result.result.name}, age {result.result.age}")

Output:
Valid person: Alice, age 30
Valid person: Bob, age 25
Valid person: Charlie, age 35
csvalchemy continues processing even when individual rows fail validation:
from pydantic import BaseModel
from csvalchemy import read
from io import StringIO
class Person(BaseModel):
    name: str
    age: int
    email: str | None = None
# CSV with some invalid rows
csv_content = """name,age,email
Alice,30,alice@example.com
Bob,not_a_number,bob@example.com
Charlie,35,charlie@example.com
Diana,not_a_number,diana@example.com
"""
with StringIO(csv_content) as f:
    valid_count = 0
    error_count = 0
    # Number data rows from 1 so each error points at the offending row
    for row_number, result in enumerate(read(f, Person), start=1):
        if result.error:
            error_count += 1
            print(f"Error on row {row_number}: {result.error}")
        else:
            valid_count += 1
            print(f"Valid: {result.result.name}")
    print(f"\nSummary: {valid_count} valid, {error_count} errors")

Output:
Valid: Alice
Error on row 2: 1 validation error for Person
age
  Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='not_a_number', input_type=str]
    For further information visit https://errors.pydantic.dev/2.12/v/int_parsing
Valid: Charlie
Error on row 4: 1 validation error for Person
age
  Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='not_a_number', input_type=str]
    For further information visit https://errors.pydantic.dev/2.12/v/int_parsing

Summary: 2 valid, 2 errors
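
When valid models and failures need to be handled separately, the counting loop above can be folded into a small helper. This is a minimal sketch and not part of csvalchemy: partition_results is a hypothetical name, and it assumes only what the examples here show, namely that each yielded item exposes .error and .result. SimpleNamespace objects stand in for real results so the snippet runs on its own.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List, Tuple


def partition_results(results: Iterable[Any]) -> Tuple[List[Any], List[Tuple[int, Any]]]:
    """Split results into valid models and (row_number, error) pairs.

    Hypothetical helper: relies only on the .error / .result attributes
    used in the examples above. Row numbers are 1-based data rows.
    """
    valid: List[Any] = []
    errors: List[Tuple[int, Any]] = []
    for row_number, item in enumerate(results, start=1):
        if item.error:
            errors.append((row_number, item.error))
        else:
            valid.append(item.result)
    return valid, errors


# Stand-ins for what read(f, Person) would yield (assumed shape).
fake_results = [
    SimpleNamespace(result="Alice", error=None),
    SimpleNamespace(result=None, error=ValueError("age: not an integer")),
    SimpleNamespace(result="Charlie", error=None),
]

valid, errors = partition_results(fake_results)
print(valid)                        # the two valid stand-in values
print([row for row, _ in errors])   # row numbers that failed
```

With real data, the same call would presumably be partition_results(read(f, Person)), since read() is iterated the same way in the examples above.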
Write only validated results back to CSV:
from pydantic import BaseModel
from csvalchemy import read
from io import StringIO
class Product(BaseModel):
    id: int
    name: str
    price: float
    in_stock: bool
# Input CSV
input_csv = """id,name,price,in_stock
1,Widget,19.99,True
2,Gadget,29.99,False
3,Invalid,not_a_number,True
4,Thing,39.99,True
"""
# Read and validate
input_file = StringIO(input_csv)
validator = read(input_file, Product)

# Write validated results to a new CSV
output_file = StringIO()
writer = validator.csv_writer(output_file)

# Consume the writer to trigger CSV writing
for result in writer:
    if result.error:
        print(f"Skipped invalid row: {result.error}")
    else:
        print(f"Wrote: {result.result.name}")

# Show output CSV
output_file.seek(0)
print("\n=== Output CSV ===")
print(output_file.read())

Output:
Wrote: Widget
Wrote: Gadget
Skipped invalid row: 1 validation error for Product
price
  Input should be a valid number, unable to parse string as a number [type=float_parsing, input_value='not_a_number', input_type=str]
    For further information visit https://errors.pydantic.dev/2.12/v/float_parsing
Wrote: Thing

=== Output CSV ===
id,name,price,in_stock
1,Widget,19.99,True
2,Gadget,29.99,False
4,Thing,39.99,True
Validate data not from CSV files:
from pydantic import BaseModel
from csvalchemy import Validator
import dydactic.options
class Person(BaseModel):
    name: str
    age: int
    email: str | None = None

# Data not from CSV
records = [
    {"name": "Alice", "age": "30", "email": "alice@example.com"},
    {"name": "Bob", "age": "not_a_number", "email": "bob@example.com"},
    {"name": "Charlie", "age": "35"},
]

# Standard validation
print("=== Using Validator directly ===")
validator = Validator(iter(records), Person)
for result in validator:
    if result.error:
        print(f"Error: {result.error}")
    else:
        print(f"Valid: {result.result.name}, age {result.result.age}")

# Skip invalid records
print("\n=== Using SKIP error option ===")
validator_skip = Validator(
    iter(records),
    Person,
    error_option=dydactic.options.ErrorOption.SKIP
)
valid_results = list(validator_skip)
print(f"Got {len(valid_results)} valid results (invalid ones skipped)")

Output:
=== Using Validator directly ===
Valid: Alice, age 30
Error: 1 validation error for Person
age
  Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='not_a_number', input_type=str]
    For further information visit https://errors.pydantic.dev/2.12/v/int_parsing
Valid: Charlie, age 35

=== Using SKIP error option ===
Got 2 valid results (invalid ones skipped)
csvalchemy uses dydactic as its core validation
engine. The Validator and ValidatorIterator classes wrap dydactic.validate() to provide
a consistent API for CSV data validation.
- CSV Reading: read() creates a CSVReaderValidator that reads CSV rows using Python's csv.DictReader
- Validation: Each row is validated using dydactic.validate(), which handles Pydantic model validation
- Error Handling: Validation errors are captured without stopping the iteration
- Result Mapping: dydactic's result objects are mapped to csvalchemy's Result type for a consistent API
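
The read, validate, capture, and map steps listed above can be illustrated with the standard library alone. The sketch below is not csvalchemy's implementation: RowResult, read_and_validate, and to_person are hypothetical names, and a hand-written converter stands in for dydactic.validate() and Pydantic so the snippet has no third-party dependencies.

```python
import csv
from dataclasses import dataclass
from io import StringIO
from typing import Any, Callable, Dict, Iterator, Optional, TextIO


@dataclass
class RowResult:
    """Hypothetical stand-in for csvalchemy's Result type."""
    result: Any = None
    error: Optional[Exception] = None


def read_and_validate(
    f: TextIO, validate: Callable[[Dict[str, str]], Any]
) -> Iterator[RowResult]:
    """Yield one RowResult per CSV row; a failing row never stops iteration."""
    for row in csv.DictReader(f):  # each row maps column name -> string value
        try:
            value = validate(row)
        except ValueError as exc:
            yield RowResult(error=exc)  # capture the error and keep going
        else:
            yield RowResult(result=value)


def to_person(row: Dict[str, str]) -> Dict[str, Any]:
    """Hand-written validator standing in for dydactic.validate()."""
    return {"name": row["name"], "age": int(row["age"])}


csv_content = "name,age\nAlice,30\nBob,not_a_number\nCharlie,35\n"
results = list(read_and_validate(StringIO(csv_content), to_person))
for item in results:
    print("error" if item.error else item.result["name"])
```

Because csv.DictReader hands every value over as a string, the conversion step is where type errors surface, which is why each row is validated independently.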
Benefits:
- Leverages dydactic's robust validation handling
- Independent validation of each record (errors don't stop processing)
- Type-safe error handling with clear error messages
- Compatible with dydactic's validation strategies
- Configurable error handling (RETURN, RAISE, or SKIP)
- Support for strict validation and attribute-based validation
The Validator class supports dydactic's configuration options:
- error_option: Control how validation errors are handled:
  - RETURN (default): Errors are returned in Result.error
  - RAISE: Exceptions are raised immediately on validation errors
  - SKIP: Records with errors are skipped entirely
- strict: Enable strict Pydantic validation
- from_attributes: Validate from object attributes
Example:
from pydantic import BaseModel
from csvalchemy import Validator
import dydactic.options
class Person(BaseModel):
    name: str
    age: int

records = [
    {"name": "Alice", "age": "30"},
    {"name": "Bob", "age": "invalid"},
    {"name": "Charlie", "age": "35"},
]

# Default: RETURN errors
validator_return = Validator(iter(records), Person)
results_return = list(validator_return)
print(f"RETURN mode: {len(results_return)} results (including errors)")

# SKIP invalid records
validator_skip = Validator(
    iter(records),
    Person,
    error_option=dydactic.options.ErrorOption.SKIP
)
results_skip = list(validator_skip)
print(f"SKIP mode: {len(results_skip)} results (errors skipped)")

Output:
RETURN mode: 3 results (including errors)
SKIP mode: 2 results (errors skipped)
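
The difference between RETURN, RAISE, and SKIP is easiest to see side by side. The following is a plain-Python sketch of the semantics described above, not dydactic's code: ErrorMode, validate_all, and parse_age are hypothetical names, and int() stands in for model validation.

```python
from enum import Enum
from typing import Any, Callable, Dict, Iterable, Iterator, Optional, Tuple


class ErrorMode(Enum):
    """Hypothetical mirror of the three options; not dydactic's own enum."""
    RETURN = "return"
    RAISE = "raise"
    SKIP = "skip"


def validate_all(
    records: Iterable[Dict[str, str]],
    validate: Callable[[Dict[str, str]], Any],
    mode: ErrorMode = ErrorMode.RETURN,
) -> Iterator[Tuple[Any, Optional[Exception]]]:
    """Yield (value, error) pairs according to the chosen mode."""
    for record in records:
        try:
            value = validate(record)
        except ValueError as exc:
            if mode is ErrorMode.RAISE:
                raise            # stop at the first invalid record
            if mode is ErrorMode.SKIP:
                continue         # drop the record entirely
            yield None, exc      # RETURN: hand the error back to the caller
        else:
            yield value, None


def parse_age(record: Dict[str, str]) -> int:
    """Stand-in validator: fails with ValueError on non-numeric ages."""
    return int(record["age"])


records = [{"age": "30"}, {"age": "invalid"}, {"age": "35"}]

returned = list(validate_all(records, parse_age, ErrorMode.RETURN))
skipped = list(validate_all(records, parse_age, ErrorMode.SKIP))
print(len(returned), len(skipped))

try:
    list(validate_all(records, parse_age, ErrorMode.RAISE))
except ValueError:
    raised = True
else:
    raised = False
print(raised)
```

RETURN keeps a one-to-one correspondence between input records and results, which is useful for error reports; SKIP trades that correspondence for a list that contains only valid items.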
csvalchemy provides two approaches to validation:
- Full Validation (Recommended): Use Validator or read(), which leverage dydactic's complete validation pipeline including dydactic's casting functionality. This is the primary and recommended approach for CSV validation.
- Standalone Casting: The cast.py module provides casting utilities similar to dydactic.cast. This module is kept for:
  - Standalone use cases that don't require full dydactic validation
  - Direct class instantiation without Pydantic models
  - Testing scenarios
Note: The main validation flow uses dydactic's casting internally, so cast.py is not used in the primary validation pipeline.
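
To make the idea of standalone casting concrete, here is a rough sketch of building a plain class from string values by applying each annotated type to its field. It does not show the cast.py or dydactic.cast API, whose functions are not documented here; cast_record and Point are hypothetical names used for illustration only.

```python
from dataclasses import dataclass
from typing import Any, Dict, get_type_hints


@dataclass
class Point:
    """Plain class with no Pydantic involved."""
    label: str
    x: int
    y: float


def cast_record(cls: type, record: Dict[str, str]) -> Any:
    """Build cls from string values by calling each annotated type on its field.

    Hypothetical helper: handles only simple callable types such as
    str, int, and float, and ignores keys that cls does not declare.
    """
    hints = get_type_hints(cls)
    kwargs = {
        name: hints[name](value)
        for name, value in record.items()
        if name in hints
    }
    return cls(**kwargs)


point = cast_record(Point, {"label": "origin", "x": "3", "y": "4.5"})
print(point)
```

A naive approach like this breaks down quickly: bool("False") is True in Python, and optional or date fields need dedicated parsing, which is one reason the full validation path is the recommended one.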
- Python 3.10+ (required by dydactic)
- See pyproject.toml for the complete dependency list