Skip to content

DBAPI-style adapter that translates SQL (JOIN/DDL/aggregations) into MongoDB operations

License

Notifications You must be signed in to change notification settings

kitfactory/dbapi-mongodb

Repository files navigation

dbapi-mongodb

DBAPI-style adapter that lets you execute a limited subset of SQL against MongoDB by translating SQL to Mongo queries. Built on pymongo (3.13.x for MongoDB 3.6 compatibility) and SQLGlot.

Purpose: let existing DB-API / SQLAlchemy Core / FastAPI code treat MongoDB as “just another dialect.”

  • PyPI package name: dbapi-mongodb
  • Module import name: mongo_dbapi

Release Policy (Join-first / MongoDB 4.4)

  • This release locks on practical JOIN compatibility first, targeting MongoDB 4.4.
  • Guaranteed JOIN scope: INNER/LEFT (up to 3 hops, composite ON, non-equi ON), RIGHT OUTER (single JOIN, equality ON), FULL OUTER (single JOIN, equality ON), JOIN with ORDER BY/LIMIT/OFFSET, and JOIN + GROUP BY/HAVING (A27 pattern: COALESCE(...) + COUNT(*)).
  • Other advanced SQL features remain available where implemented, but further hardening is treated as future work. Unsupported cases return [mdb][E2].

Features

  • DBAPI-like Connection/Cursor

  • SQL → Mongo: SELECT/INSERT/UPDATE/DELETE, CREATE/DROP TABLE/INDEX (ASC/DESC, UNIQUE, composite), WHERE (comparisons/AND/OR/IN/BETWEEN/LIKE/ILIKE/regex literal), ORDER BY, LIMIT/OFFSET, JOIN (INNER/LEFT up to 3 hops with composite/non-equi ON, RIGHT/FULL OUTER for single JOIN with equality ON), GROUP BY + aggregates + HAVING (including FULL OUTER A27 pattern), simple CASE aggregates (SUM(CASE WHEN ... THEN ... ELSE ... END)), UNION ALL/UNION (multi-branch), subqueries (WHERE IN/EXISTS/NOT EXISTS, scalar subquery in comparisons, FROM (SELECT ...)), non-recursive WITH CTE

  • %s positional and %(name)s named parameters; unsupported constructs raise Error IDs (e.g. [mdb][E2])

  • Error IDs for common failures: invalid URI, unsupported SQL, unsafe DML without WHERE, parse errors, connection/auth failures

  • DBAPI fields: rowcount, lastrowid, description (column order: explicit order, or alpha for SELECT *; JOIN uses left→right)

  • Transactions: begin/commit/rollback wrap Mongo sessions; MongoDB 3.6 and other unsupported envs are treated as no-op success

  • Async dialect (thread-pool backed) for Core CRUD/DDL/Index with FastAPI-friendly usage; minimal ORM CRUD for single-table entities (relationships out of scope)

  • Use cases

    • Swap in Mongo as “another dialect” for existing SQLAlchemy Core–based infra (Engine/Connection + Table/Column)
    • Point existing Core-based batch/report jobs at Mongo data with minimal changes
    • Minimal ORM CRUD for single-table entities (PK → _id)
    • Async dialect for FastAPI/async stacks (thread-pool implementation; native async later)

Requirements

  • Python 3.10+
  • MongoDB 3.6 (bundled mongodb-3.6 binary) or later (note: bundled binary is 3.6, so transactions are unsupported)
  • Virtualenv at .venv (already present); dependencies are managed via pyproject.toml

Installation

pip install dbapi-mongodb
# (optional) with a virtualenv: python -m venv .venv && . .venv/bin/activate && pip install dbapi-mongodb

Start local MongoDB (bundled 3.6)

# Default port 27017; override with PORT
PORT=27018 ./startdb.sh

Start local MongoDB 4.4 (replica set, bundled)

# Default port 27019; uses bundled libssl1.1. LD_LIBRARY_PATH is set inside the script for mongod.
PORT=27019 ./start4xdb.sh
# Run tests against 4.x
MONGODB_URI=mongodb://127.0.0.1:27019 MONGODB_DB=mongo_dbapi_test .venv/bin/pytest -q

Usage example

from mongo_dbapi import connect

conn = connect("mongodb://127.0.0.1:27018", "mongo_dbapi_test")
cur = conn.cursor()
cur.execute("INSERT INTO users (id, name) VALUES (%s, %s)", (1, "Alice"))

cur.execute("SELECT id, name FROM users WHERE id = %s", (1,))
print(cur.fetchall())  # [(1, 'Alice')]
print(cur.rowcount)    # 1

Supported SQL

  • Statements: SELECT, INSERT, UPDATE, DELETE, CREATE/DROP TABLE, CREATE/DROP INDEX
  • WHERE: comparisons (=, <>, >, <, <=, >=), AND, OR, IN, BETWEEN, LIKE (%/_$regex), ILIKE, regex literal /.../
  • JOIN: INNER/LEFT (composite ON, non-equi ON, up to 3 joins), RIGHT OUTER (single JOIN, equality ON), FULL OUTER (single JOIN, equality ON)
  • Aggregation: GROUP BY with COUNT/SUM/AVG/MIN/MAX and HAVING
  • Subqueries: WHERE IN/EXISTS/NOT EXISTS (correlated EXISTS in limited simple form), scalar non-correlated subqueries in comparisons, and FROM (SELECT ...)
  • Set ops: UNION ALL / UNION (supports 3+ branches; mixed UNION + UNION ALL is rejected)
  • CTE: non-recursive WITH
  • Window: ROW_NUMBER/RANK/DENSE_RANK on MongoDB 5.x+ ([mdb][E2] on 4.4)
  • ORDER/LIMIT/OFFSET
  • Unsupported (4.4 target): WITH RECURSIVE, RIGHT/FULL OUTER with non-equality ON, RIGHT/FULL join chains, complex FULL OUTER aggregate shapes (outside COALESCE(...) + COUNT(*)), complex correlated subqueries, mixed UNION + UNION ALL, ORM relationships

SQLAlchemy

  • DBAPI module attributes: apilevel="2.0", threadsafety=1, paramstyle="pyformat".
  • Scheme: mongodb+dbapi://... dialect provided (sync + async/thread-pool).
  • Scope: Core text()/Table/Column CRUD/DDL/Index、ORM 最小 CRUD(単一テーブル)、JOIN/UNION ALL/HAVING/subquery/ROW_NUMBER を実通信で確認済み。async dialect は Core CRUD/DDL/Index のラップで、ネイティブ async は今後検討。

Async (FastAPI/Core) - beta

  • Current implementation wraps the sync driver in a thread pool (native async driver is planned). Provided via mongo_dbapi.async_dbapi.connect_async. API mirrors sync Core: awaitable CRUD/DDL/Index, JOIN/UNION ALL/HAVING/IN/EXISTS/FROM subquery.
  • Transactions: effective on MongoDB 4.x+ only; 3.6 is no-op. Be mindful that MongoDB transactions differ from RDBMS in locking/perf; avoid heavy transactional workloads.
  • Window: ROW_NUMBER/RANK/DENSE_RANK are available on MongoDB 5.x+; earlier versions return [mdb][E2] Unsupported SQL construct: WINDOW_FUNCTION.
  • FastAPI example:
from fastapi import FastAPI, Depends
from sqlalchemy.ext.asyncio import create_async_engine, AsyncConnection
from sqlalchemy import text

engine = create_async_engine("mongodb+dbapi://127.0.0.1:27019/mongo_dbapi_test")
app = FastAPI()

async def get_conn() -> AsyncConnection:
    async with engine.connect() as conn:
        yield conn

@app.get("/users/{user_id}")
async def get_user(user_id: str, conn: AsyncConnection = Depends(get_conn)):
    rows = await conn.execute(text("SELECT id, name FROM users WHERE id = :id"), {"id": user_id})
    row = rows.fetchone()
    return dict(row) if row else {}
  • Limitations: async ORM/relationship and statement cache are out of scope; heavy concurrency uses a thread pool under the hood.

Support levels

  • Stable guarantee (Join-first): JOIN-heavy paths (INNER/LEFT up to 3 hops, RIGHT/FULL single JOIN constraints, JOIN + ORDER/LIMIT/OFFSET, JOIN + GROUP BY/HAVING in supported shapes) and single-collection CRUD.
  • Available as best effort: UNION/UNION ALL, non-recursive CTE, scalar subqueries, limited correlated EXISTS, MongoDB 5.x+ window functions.
  • Not supported / constraints: WITH RECURSIVE, RIGHT/FULL OUTER with non-equality ON, RIGHT/FULL join chains, complex FULL OUTER aggregate shapes, complex correlated subqueries, mixed UNION + UNION ALL, ORM relationships; async remains thread-pool based.

Running tests

PORT=27018 ./startdb.sh  # if 27017 is taken
MONGODB_URI=mongodb://127.0.0.1:27018 MONGODB_DB=mongo_dbapi_test .venv/bin/pytest -q

Tutorials

  • English: docs/tutorial.md
  • 日本語: docs/tutorial_ja.md

Notes

  • Transactions on MongoDB 3.6 are treated as no-op; 4.x+ (replica set) uses real sessions and the bundled 4.4 binary passes all tests.
  • Error messages are fixed strings per docs/spec.md. Keep logs at DEBUG only (default INFO is silent).

Roadmap (SQL support prioritization)

  1. Complex correlated subqueries and FULL OUTER aggregate generalization (beyond A27 shape)
  2. WITH RECURSIVE and mixed UNION + UNION ALL chains
  3. Additional window functions beyond ROW_NUMBER/RANK/DENSE_RANK (LAG/LEAD/NTILE, etc.)
  4. Performance guidance for large JOIN workloads (recommended indexes and slow-query thresholds, especially FULL OUTER)
    If you need one of these sooner, please open an issue and share your use case.

License

MIT License (see LICENSE). Provided as-is without warranty; commercial use permitted.

GitHub Sponsors

Maintained in personal time. If this helps you run MongoDB from DB-API/SQLAlchemy stacks, consider supporting via GitHub Sponsors to keep fixes and version updates coming.

About

DBAPI-style adapter that translates SQL (JOIN/DDL/aggregations) into MongoDB operations

Resources

License

Stars

Watchers

Forks

Packages

No packages published