Coding-Dev-Tools/datamorph

DataMorph CLI

Batch data format converter with streaming support for files over 10GB.

Part of the Revenue Holdings developer tool ecosystem.

Why DataMorph?

Data format conversion shouldn't require a custom script every time. CSV to Parquet for analytics, JSON to YAML for configs, Avro to JSON for debugging: DataMorph handles each with one command. For large files, row-by-row streaming processes one record at a time, so the whole file never has to fit in memory.
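To make the row-by-row idea concrete, here is a minimal sketch of a streaming CSV to JSONL conversion using only the Python standard library. It is not DataMorph's implementation; the function names (`stream_csv_rows`, `write_jsonl`, `csv_to_jsonl`) are hypothetical and chosen for illustration only.

```python
import csv
import json
from typing import Iterable, Iterator, TextIO


def stream_csv_rows(src: TextIO) -> Iterator[dict]:
    """Yield one CSV record at a time as a dict keyed by header name."""
    # DictReader is lazy: it reads the next line only when asked for it.
    yield from csv.DictReader(src)


def write_jsonl(rows: Iterable[dict], dst: TextIO) -> int:
    """Write each row as one JSON line and return the number of rows written."""
    count = 0
    for row in rows:
        dst.write(json.dumps(row) + "\n")
        count += 1
    return count


def csv_to_jsonl(src_path: str, dst_path: str) -> int:
    """Hypothetical helper: convert a CSV file to JSONL one row at a time."""
    # newline="" is what the csv module recommends for file objects.
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        return write_jsonl(stream_csv_rows(src), dst)
```

Because only one row is held at a time, memory use in this sketch depends on the widest row rather than on the file size. Every value comes out as a string, since plain CSV carries no type information.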

Features

  • 7 formats: CSV, JSON, JSONL, YAML, Parquet, Avro, Protobuf
  • Streaming: Row-by-row processing for files >10GB
  • Schema inference: Auto-detect field types from data
  • CLI commands: convert, batch, schema, formats
  • Batch mode: Convert entire directories at once
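Schema inference from data can be sketched roughly as follows. This is an illustrative approach under stated assumptions, not DataMorph's actual algorithm: the type names (`null`, `bool`, `int`, `float`, `string`) and the functions `infer_value_type`, `merge_types`, and `infer_schema` are hypothetical, and the input is assumed to be rows of string values such as those read from a CSV file.

```python
from typing import Dict, Iterable, Mapping


def infer_value_type(text: str) -> str:
    """Guess the narrowest type name that can represent one string value."""
    if text == "":
        return "null"
    if text.lower() in ("true", "false"):
        return "bool"
    try:
        int(text)
        return "int"
    except ValueError:
        pass
    try:
        float(text)
        return "float"
    except ValueError:
        return "string"


def merge_types(a: str, b: str) -> str:
    """Combine two observed types into one that can hold both."""
    if a == b:
        return a
    if a == "null":
        return b
    if b == "null":
        return a
    if {a, b} == {"int", "float"}:
        return "float"
    # Any other mix (for example bool and int) falls back to string.
    return "string"


def infer_schema(rows: Iterable[Mapping[str, str]]) -> Dict[str, str]:
    """Scan rows once and return a field name to type name mapping."""
    schema: Dict[str, str] = {}
    for row in rows:
        for field, text in row.items():
            observed = infer_value_type(text)
            schema[field] = merge_types(schema.get(field, "null"), observed)
    return schema
```

The sketch makes a single pass over the rows and keeps only one type per field, so it fits the same streaming model as the conversion itself. A field whose values are all empty stays `null`.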

Installation

pip install datamorph

Quick Start

# Convert a single file
datamorph convert input.csv output.parquet
datamorph convert input.json output.csv
datamorph convert input.yaml output.json
datamorph convert input.parquet output.csv

# Batch convert all files in a directory
datamorph batch ./csv_data/ ./parquet_data/ --from csv --to parquet --recursive

# Inspect schema
datamorph schema data.parquet
datamorph schema data.csv --json-output

# List supported formats
datamorph formats
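
The batch command above maps every matching file in an input directory to a file in an output directory. The sketch below shows one way such a mapping could be planned in Python with `pathlib`. It is an assumption about the general technique, not a description of how `datamorph batch` works internally; `plan_batch` and its parameters are hypothetical names.

```python
from pathlib import Path
from typing import Iterator, Tuple


def plan_batch(
    src_dir: Path,
    dst_dir: Path,
    from_ext: str,
    to_ext: str,
    recursive: bool = False,
) -> Iterator[Tuple[Path, Path]]:
    """Yield (input_path, output_path) pairs for files ending in from_ext."""
    pattern = f"*.{from_ext}"
    # rglob descends into subdirectories; glob stays at the top level.
    matches = src_dir.rglob(pattern) if recursive else src_dir.glob(pattern)
    for src in sorted(matches):
        if not src.is_file():
            continue
        # Keep the relative layout so nested inputs get nested outputs.
        relative = src.relative_to(src_dir)
        yield src, (dst_dir / relative).with_suffix(f".{to_ext}")
```

Planning the pairs first, before converting anything, makes it straightforward to print a dry run or to create the output directories up front.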

Pricing

Tier    Price    Features
Free    $0       CLI only, 100 conversions/mo
Pro     $12/mo   Unlimited conversions, streaming, batch mode, all formats
Suite   $49/mo   All 8 Revenue Holdings tools

Get a license key at revenueholdings.dev/pricing.

Development

pip install -e ".[dev]"
pytest tests/ -v

License

MIT — Revenue Holdings
