
🤖 Parserator

The Structured Data Layer for AI Agents

npm version License: MIT EMA Compliant

Transform unstructured input into agent-ready JSON with 95% accuracy. Parserator uses a two-stage Architect-Extractor pattern and integrates with Google ADK, MCP, LangChain, and other agent frameworks.


🚀 Quick Start

Install & Use in 30 seconds

# Install globally
npm install -g parserator

# Parse any text instantly
parserator parse "Contact: John Doe, Email: john@example.com, Phone: 555-0123"

Output:

{
  "contact": "John Doe",
  "email": "john@example.com", 
  "phone": "555-0123"
}

Get API Access

# Get your free API key
curl -X POST https://parserator.com/api/keys/generate \
  -H "Content-Type: application/json" \
  -d '{"email": "your@email.com"}'

🔧 Agent Integrations

Google ADK

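# Assumes `agent`, `UserIntent`, and `parse_for_agent` are defined or imported elsewhere.
@agent.tool
def extract_user_intent(user_message: str) -> UserIntent:
    """Parse a raw user message into a structured UserIntent."""
    return parse_for_agent(
        text=user_message,
        schema=UserIntent,
        context="command_parsing"
    )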

MCP Server (Universal)

# Install MCP server for any agent
npm install -g parserator-mcp-server

# Use in any MCP-compatible agent
mcp://parserator/parse?schema=Contact&text=email_content

LangChain

from parserator import ParseChain

parser = ParseChain(api_key="your_key")
result = parser.parse(
    text="messy data here",
    output_schema={"name": "string", "age": "number"}
)

CrewAI

from parserator.integrations.crewai import ParseratorTool

parse_tool = ParseratorTool(
    name="extract_data",
    description="Parse unstructured text into JSON"
)

⚡ Browser Extensions

Transform web data instantly while browsing:

Chrome Extension

  • Status: Built and ready for Chrome Web Store submission
  • Use: Right-click any text → "Parse with Parserator" → Perfect JSON
  • Features: Auto-detect schemas, bulk export, local processing

VS Code Extension

  • Status: Built and packaged (parserator-1.0.0.vsix)
  • Use: Select messy data → Ctrl+Shift+P → Generate TypeScript types
  • Features: Schema templates, batch processing, framework integration

Extensions will be published to official stores once the API is finalized.


🧠 How It Works: Architect-Extractor Pattern

Single-pass LLM parsing spends tokens reasoning over the entire input at once. Parserator splits the work into two stages:

Stage 1: The Architect (Planning)

  • Input: Your schema + small data sample (~1K chars)
  • Job: Create detailed extraction plan
  • LLM: Gemini 1.5 Flash (optimized for reasoning)
  • Output: Structured search instructions

Stage 2: The Extractor (Execution)

  • Input: Full dataset + extraction plan
  • Job: Execute plan with minimal thinking
  • LLM: Gemini 1.5 Flash (optimized for following instructions)
  • Output: Clean, validated JSON
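
The two stages above can be sketched in a few lines. This is a minimal illustration of the pattern, not Parserator's implementation: the `LLM` callable, the prompt wording, and the names `architect`, `extractor`, and `parse` are all hypothetical, and the 1000-character sample size simply mirrors the "~1K chars" figure above.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical type: any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]

SAMPLE_CHARS = 1000  # mirrors the "~1K chars" sample size described above


@dataclass
class ParseResult:
    plan: str
    output: str


def architect(llm: LLM, schema: Dict[str, str], data: str) -> str:
    """Stage 1: plan the extraction from the schema and a small sample only."""
    sample = data[:SAMPLE_CHARS]
    prompt = (
        "Write step-by-step search instructions for each field.\n"
        f"Schema: {schema}\nSample: {sample}"
    )
    return llm(prompt)


def extractor(llm: LLM, plan: str, data: str) -> str:
    """Stage 2: apply the plan to the full input without re-planning."""
    prompt = f"Follow this plan exactly and return JSON.\nPlan: {plan}\nData: {data}"
    return llm(prompt)


def parse(llm: LLM, schema: Dict[str, str], data: str) -> ParseResult:
    """Run both stages and keep the plan alongside the extractor's raw output."""
    plan = architect(llm, schema, data)
    return ParseResult(plan=plan, output=extractor(llm, plan, data))
```

The point of the split is that only the first call sees the schema and has to reason about it, and it does so over a short sample; the second call sees the full input but only has to follow instructions.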

Results

  • 70% token reduction vs single-LLM approaches
  • 95% accuracy on complex data
  • Sub-3 second response times
  • No vendor lock-in - works with any LLM provider

📦 Installation Options

🔹 Node.js/TypeScript

npm install parserator

🔹 Python

pip install parserator

🔹 Browser Extensions

  • Chrome Extension: Built, pending Chrome Web Store submission
  • VS Code Extension: Built, pending VS Code Marketplace submission

🔹 Agent Frameworks (In Development)

# MCP Server - Coming soon
npm install -g parserator-mcp-server

# Framework integrations - In beta
# Quote the extras so shells such as zsh do not expand the brackets
pip install "parserator[langchain]"
pip install "parserator[crewai]"
pip install "parserator[adk]"

Contact us for early access to framework integrations.


🌟 Use Cases

For Developers

  • API Integration: Parse inconsistent API responses
  • Data Migration: Extract from legacy systems
  • ETL Pipelines: Intelligent data transformation
  • Web Scraping: Handle changing site layouts

For AI Agents

  • Email Processing: Extract tasks, contacts, dates
  • Document Analysis: Parse contracts, invoices, reports
  • User Commands: Convert natural language to structured actions
  • Research Workflows: Extract key info from papers, articles

For Data Teams

  • Log Analysis: Structure unstructured log files
  • Data Cleaning: Normalize messy datasets
  • Import Processing: Handle varied file formats
  • Quality Assurance: Validate data consistency

🏗️ API Reference

Core Endpoint

POST https://api.parserator.com/v1/parse
Content-Type: application/json
Authorization: Bearer YOUR_API_KEY

{
  "inputData": "Contact: John Doe, Email: john@example.com, Phone: 555-0123",
  "outputSchema": {
    "contact": "string",
    "email": "string", 
    "phone": "string"
  },
  "instructions": "Extract contact information"
}

Response

{
  "success": true,
  "parsedData": {
    "contact": "John Doe",
    "email": "john@example.com",
    "phone": "555-0123"
  },
  "metadata": {
    "confidence": 0.96,
    "tokensUsed": 1250,
    "processingTimeMs": 800
  }
}
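
As a rough sketch, the request and response above could be handled from Python with only the standard library. The helper names `build_parse_request` and `read_parse_response` are hypothetical, treating `instructions` as optional is an assumption, and the 0.9 confidence threshold is an illustrative choice rather than a documented default.

```python
import json
import urllib.request

API_URL = "https://api.parserator.com/v1/parse"  # endpoint shown above


def build_parse_request(api_key, input_data, output_schema, instructions=None):
    """Build (but do not send) a POST request for the parse endpoint."""
    body = {"inputData": input_data, "outputSchema": output_schema}
    if instructions:
        # Assumption: the field can be omitted when no instructions are given.
        body["instructions"] = instructions
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def read_parse_response(raw, min_confidence=0.9):
    """Decode a response body and reject failed or low-confidence parses."""
    payload = json.loads(raw)
    if not payload.get("success"):
        raise ValueError("parse request did not succeed")
    confidence = payload.get("metadata", {}).get("confidence", 0.0)
    if confidence < min_confidence:
        raise ValueError(f"confidence {confidence} is below {min_confidence}")
    return payload["parsedData"]
```

To send the request, pass it to `urllib.request.urlopen` and feed the decoded body to `read_parse_response`.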

SDK Examples

JavaScript/TypeScript

import { Parserator } from 'parserator';

const parser = new Parserator('your-api-key');

const result = await parser.parse({
  inputData: 'messy text here',
  outputSchema: { name: 'string', age: 'number' }
});

console.log(result.parsedData);

Python

from parserator import Parserator

parser = Parserator('your-api-key')

result = parser.parse(
    input_data='messy text here',
    output_schema={'name': 'string', 'age': 'number'}
)

print(result.parsed_data)

🏗️ Shared Core Architecture

Parserator uses a lean shared core architecture for maximum efficiency and maintainability:

Architecture Overview

┌────────────────────────────────────────┐
│     SHARED CORE (@parserator/core)     │
│   Types, Validation, HTTP Client       │
└────────────────────────────────────────┘
              │
     ┌────────┼────────┐
     │        │        │
┌────▼───┐ ┌──▼───┐ ┌──▼────────┐
│Node SDK│ │Python│ │Extensions │
│(50KB)  │ │ SDK  │ │ (Chrome,  │
│        │ │(50KB)│ │   VSCode) │
└────────┘ └──────┘ └───────────┘
     │        │        │
     └────────┼────────┘
              │
    ┌─────────▼─────────┐
    │  PRODUCTION API   │
    │ 95% Accuracy      │
    │ Architect-Extract │
    └───────────────────┘

Benefits

  • 75% smaller SDK bundles (250KB vs 1MB total)
  • Single source of truth for API logic
  • Consistent experience across all platforms
  • Faster maintenance and feature development
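
The layering idea can be illustrated with a small sketch: validation and request construction live in one place, and each platform client is a thin wrapper around it. The class names `Core` and `SdkClient` and the `Transport` callable are hypothetical and are not taken from `@parserator/core`.

```python
from typing import Any, Callable, Dict

# Hypothetical transport signature: takes a request body, returns a response body.
Transport = Callable[[Dict[str, Any]], Dict[str, Any]]


class Core:
    """Shared logic: request validation and dispatch, written once."""

    def __init__(self, transport: Transport):
        self._transport = transport

    def parse(self, input_data: str, output_schema: Dict[str, str]) -> Dict[str, Any]:
        if not input_data:
            raise ValueError("input_data must not be empty")
        if not output_schema:
            raise ValueError("output_schema must not be empty")
        return self._transport({"inputData": input_data, "outputSchema": output_schema})


class SdkClient:
    """Thin wrapper: adapts naming for one platform and delegates to Core."""

    def __init__(self, core: Core):
        self._core = core

    def parse(self, text: str, schema: Dict[str, str]) -> Dict[str, Any]:
        return self._core.parse(text, schema)["parsedData"]
```

Because the transport is injected, the same `Core` can be exercised in tests with a stub function instead of a live HTTP call.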

See SHARED_CORE_ARCHITECTURE.md for complete technical details.

🛡️ Exoditical Moral Architecture

Parserator is built on EMA principles - a revolutionary approach to ethical software development:

🔓 Digital Sovereignty

  • Your data is yours - We never store input/output content
  • No vendor lock-in - Export everything, switch anytime
  • Open standards - JSON, OpenAPI, Docker - universal compatibility
  • Transparent pricing - No hidden costs or usage surprises

🚪 The Right to Leave

  • Complete data export - All schemas, templates, usage history
  • Standard formats - Import into any compatible system
  • Migration tools - Seamless transition to other platforms
  • Zero retention - Data deleted immediately upon request

🌐 Universal Compatibility

  • Framework agnostic - Works with any agent development platform
  • LLM agnostic - Switch between OpenAI, Anthropic, Google, etc.
  • Deployment agnostic - Cloud, on-premise, or hybrid
  • Standard protocols - REST API, MCP, GraphQL support

"The ultimate expression of empowerment is the freedom to leave."


🧪 Beta Program

🚀 Beta Features in Development

  • Multi-LLM Support: Working on OpenAI, Anthropic, Google Gemini compatibility
  • Schema Validation: Type checking and constraint enforcement
  • Batch Processing: Handle multiple documents simultaneously
  • Custom Workflows: Chain parsing operations
  • Monitoring Dashboard: Parse analytics and performance metrics
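
Until schema validation ships, a client-side type check along these lines is one option. This is a sketch under stated assumptions: it covers only the two type names used in this README (`string` and `number`), and the `validate` helper and its message format are hypothetical.

```python
from numbers import Number
from typing import Any, Dict, List

# Maps the schema type names used in this README to Python checks.
# bool is excluded from "number" because Python treats bool as an int subclass.
CHECKS = {
    "string": lambda v: isinstance(v, str),
    "number": lambda v: isinstance(v, Number) and not isinstance(v, bool),
}


def validate(parsed: Dict[str, Any], schema: Dict[str, str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for field, type_name in schema.items():
        if field not in parsed:
            problems.append(f"{field}: missing")
        elif type_name not in CHECKS:
            problems.append(f"{field}: unknown type '{type_name}'")
        elif not CHECKS[type_name](parsed[field]):
            problems.append(f"{field}: expected {type_name}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every mismatched field in a single pass.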

Join the Beta

Contact us for beta access:

Beta Feedback: GitHub Issues | GitHub Discussions


📊 Pricing

Currently in beta - Contact us for early access pricing and custom solutions.


🤝 Community & Support

📚 Documentation

  • API Reference: Coming soon in docs/ directory
  • Integration Guides: Available in this repository
  • Examples: Check examples/ directory for framework integrations

💬 Community

  • GitHub Issues: Bug reports and feature requests
  • GitHub Discussions: Community questions and feedback
  • YouTube: @parserator - Tutorials and demos coming soon
  • LinkedIn: Company Page - Updates and announcements

🛠️ Support

  • Email: Gen-rl-millz@parserator.com
  • Response: We'll get back to you as soon as possible
  • Beta Support: Priority support for early adopters

🏆 Why Parserator?

| Feature | Parserator | Traditional Parsers | Single-LLM Solutions |
|---|---|---|---|
| Accuracy | 95% | 60-70% | 85% |
| Token Efficiency | 70% less | N/A | Baseline |
| Setup Time | <5 minutes | Hours/Days | 30 minutes |
| Maintenance | Zero | High | Medium |
| Vendor Lock-in | None | High | Medium |
| Schema Flexibility | Unlimited | Fixed | Limited |

📈 Development Status

Current Focus:

  • ✅ Core parsing engine - Two-stage Architect-Extractor pattern
  • ✅ Browser extensions - Chrome and VS Code extensions built
  • ✅ Agent integrations - LangChain, CrewAI, Google ADK support
  • 🚧 Documentation - API reference and integration guides in progress
  • 🚧 Beta testing - Gathering feedback from early adopters

Planned Features:

  • Multi-modal parsing (images, PDFs, audio)
  • Enhanced schema validation and templates
  • Enterprise deployment options
  • Additional framework integrations

Roadmap details will be updated based on community feedback and beta testing results.


📄 License

MIT License - see LICENSE file for details.

EMA Commitment: This project follows Exoditical Moral Architecture principles, ensuring your right to digital sovereignty and freedom to migrate.


🙏 Credits

Built with radical conviction by GEN-RL-MiLLz - "The Higher Dimensional Solo Dev"

"Grateful for your support as I grow Hooves & a Horn, taking pole position for the 2026 Agentic Derby."


🚀 Get Started • 📚 Documentation • 💬 Discord • 🐙 GitHub

Transform your messy data into agent-ready JSON today.
