Advanced MySQL SQL Parser with Visualization Component Support
- Field Type Classification: Automatically categorizes fields as column, aggregation, expression, or computed
- Aggregation Scope Tracking: Tracks which tables are involved in aggregation functions like COUNT(*)
- Visualization-Ready Output: Enhanced JSON format perfect for SQL diagram generation
- Advanced JOIN Detection: Handles complex nested JOINs and old-style comma-separated syntax
- Smart Alias Resolution: Context-aware alias mapping and resolution
- MySQL Compatibility: Full MySQL syntax support with normalization
- Comprehensive Metadata: Provides detailed parsing information for debugging and visualization
pip install sql-splitter

Or install from source:
git clone https://github.com/alexkwok22/sql-splitter.git
cd sql-splitter
pip install -e .

from sql_splitter import SQLParserAST
# Initialize parser
parser = SQLParserAST()
# Parse SQL query
sql = """
SELECT
users.name,
COUNT(*) as total_orders,
SUM(orders.amount) as total_revenue
FROM users
JOIN orders ON users.id = orders.user_id
WHERE users.status = 'active'
GROUP BY users.name
"""
result = parser.parse(sql)
print(result)

{
"success": true,
"fields": [
{
"table": "users",
"field": "users.name",
"alias": "name",
"fieldType": "column",
"involvedTables": ["users"]
},
{
"table": null,
"field": "COUNT(*)",
"alias": "total_orders",
"fieldType": "aggregation",
"aggregationScope": ["users", "orders"],
"involvedTables": ["users", "orders"]
},
{
"table": "orders",
"field": "SUM(orders.amount)",
"alias": "total_revenue",
"fieldType": "aggregation",
"involvedTables": ["orders"]
}
],
"tables": ["users", "orders"],
"joins": [
{
"type": "JOIN",
"leftTable": "users",
"leftField": "id",
"rightTable": "orders",
"rightField": "user_id",
"condition": "users.id = orders.user_id"
}
],
"whereConditions": ["users.status = 'active'"],
"parser": "sqlsplit",
"metadata": {
"aliasMapping": {},
"aggregationFields": ["total_orders", "total_revenue"],
"computedFields": [],
"unresolved": {
"aliases": [],
"fields": []
}
}
}

SQL Splitter automatically classifies fields into four types:
- column: Simple table columns (users.name)
- aggregation: Aggregate functions (COUNT(*), SUM(amount))
- expression: Complex expressions (DATE_FORMAT(created_at, '%Y-%m'))
- computed: Conditional logic (CASE WHEN status = 1 THEN 'active' END)
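As a minimal sketch of how a consumer might use these field types, the snippet below groups fields by their fieldType value. It does not call the library: the `result` dict is a hand-written stand-in trimmed from the example output above, and `group_by_field_type` is a hypothetical helper, not part of sql-splitter.

```python
from collections import defaultdict

# Hand-written stand-in for a parse result (assumption: same shape as the
# example output above, trimmed to the keys this sketch reads).
result = {
    "fields": [
        {"field": "users.name", "alias": "name", "fieldType": "column"},
        {"field": "COUNT(*)", "alias": "total_orders", "fieldType": "aggregation"},
        {"field": "SUM(orders.amount)", "alias": "total_revenue", "fieldType": "aggregation"},
    ]
}


def group_by_field_type(parsed):
    """Hypothetical helper: map each fieldType to the labels of its fields."""
    groups = defaultdict(list)
    for field in parsed.get("fields", []):
        # Use the alias as the label, falling back to the raw expression.
        label = field.get("alias") or field["field"]
        groups[field.get("fieldType", "unknown")].append(label)
    return dict(groups)


groups = group_by_field_type(result)
print(groups)
```

A diagram renderer could use such a grouping to style plain columns differently from aggregations or computed fields.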
For visualization components, aggregation functions include aggregationScope to show which tables are involved:
# COUNT(*) shows all tables in the query
{
"field": "COUNT(*)",
"fieldType": "aggregation",
"aggregationScope": ["users", "orders", "products"] # All related tables
}
# Specific aggregations show only relevant tables
{
"field": "SUM(orders.amount)",
"fieldType": "aggregation",
"aggregationScope": ["orders"] # Only orders table
}

from sql_splitter import MySQLCompatibleNormalizer
normalizer = MySQLCompatibleNormalizer()
normalized_sql, rules, errors = normalizer.normalize_query(sql)

Automatically converts old-style comma-separated JOINs:
-- Input: Old-style
SELECT * FROM users a, orders b WHERE a.id = b.user_id
-- Output: Modern JOIN
SELECT * FROM users a JOIN orders b ON a.id = b.user_id

Handles complex alias scenarios:
# Resolves aliases like 'u' -> 'users', 'o' -> 'orders'
"metadata": {
"aliasMapping": {
"u": "users",
"o": "orders"
}
}

- Quick Start Guide - Get started in 5 minutes
- API Documentation - Complete API reference
- Expected Format - JSON output specification
- Examples - Real-world usage examples
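For readers building a visualization component, here is a rough sketch of how the joins, aggregationScope, and aliasMapping keys described earlier might be turned into diagram edges. It works on a hand-written dict rather than calling the parser. `resolve_table` and `diagram_edges` are hypothetical helpers, and whether the parser emits aliases or resolved table names inside joins is an assumption; the lookup is a no-op for names that are already resolved.

```python
def resolve_table(name, alias_mapping):
    """Return the real table name for an alias, or the name unchanged."""
    return alias_mapping.get(name, name)


def diagram_edges(parsed):
    """Hypothetical helper: build (kind, source, target) edges for a diagram."""
    alias_mapping = parsed.get("metadata", {}).get("aliasMapping", {})
    edges = []
    # One edge per JOIN, between the two joined tables.
    for join in parsed.get("joins", []):
        left = resolve_table(join["leftTable"], alias_mapping)
        right = resolve_table(join["rightTable"], alias_mapping)
        edges.append(("join", left, right))
    # One edge from each aggregation field to every table in its scope.
    for field in parsed.get("fields", []):
        if field.get("fieldType") != "aggregation":
            continue
        label = field.get("alias") or field["field"]
        for table in field.get("aggregationScope", []):
            edges.append(("aggregates", label, resolve_table(table, alias_mapping)))
    return edges


# Hand-written stand-in for a parse result (assumption: aliases appear
# unresolved in joins and aggregationScope).
parsed = {
    "fields": [
        {"field": "u.name", "alias": "name", "fieldType": "column"},
        {"field": "COUNT(*)", "alias": "total_orders",
         "fieldType": "aggregation", "aggregationScope": ["u", "o"]},
    ],
    "joins": [
        {"type": "JOIN", "leftTable": "u", "leftField": "id",
         "rightTable": "o", "rightField": "user_id"},
    ],
    "metadata": {"aliasMapping": {"u": "users", "o": "orders"}},
}

edges = diagram_edges(parsed)
for edge in edges:
    print(edge)
```

Keeping alias resolution in one small function means the edge builder behaves the same whether the input carries aliases or full table names.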
# Run basic tests
python -m pytest tests/
# Run with coverage
python -m pytest tests/ --cov=sql_splitter --cov-report=html

- Python 3.7+
- No external dependencies (pure Python implementation)

- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Built for SQL visualization component developers
- Supports complex MySQL queries and edge cases
- Designed with performance and accuracy in mind
Made with ❤️ for the SQL visualization community