-
Notifications
You must be signed in to change notification settings - Fork 6
Open
Labels
documentationImprovements or additions to documentationImprovements or additions to documentationduplicateThis issue or pull request already existsThis issue or pull request already existsenhancementNew feature or requestNew feature or requestgood first issueGood for newcomersGood for newcomershelp wantedExtra attention is neededExtra attention is neededquestionFurther information is requestedFurther information is requested
Description
Summary
Add translation support for common SQL window functions into MongoDB’s $setWindowFields (MongoDB ≥ 5.0). Cover ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE(n) (best-effort), running/tumbling SUM/AVG/MIN/MAX/COUNT, and partitioned order-aware windows. Provide graceful fallback/errors for unsupported server versions.
Motivation
- Window functions are standard in analytics SQL; many migrations stall when they hit
OVER (PARTITION BY … ORDER BY …). - MongoDB’s
$setWindowFieldsis the canonical equivalent; supporting it unlocks powerful reporting in-place without ETL.
Scope (v1)
Supported SQL patterns:
ROW_NUMBER() OVER (PARTITION BY … ORDER BY …)RANK()/DENSE_RANK()over partition+order- Running aggregates:
SUM/AVG/MIN/MAX/COUNTwithROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW - Basic frame variants:
CURRENT ROW,UNBOUNDED PRECEDING; (optional)N PRECEDINGwhere feasible NTILE(n)(emit bucket index via computed percentile approximation)
Out of scope (follow-ups):
- Window frames with
FOLLOWING,RANGE(value-based) frames, and complex expressions LAG/LEAD(separate issue)
API
from sql_mongo_converter import sql_to_pipeline
pipeline = sql_to_pipeline("""
SELECT
u.country,
u.user_id,
ROW_NUMBER() OVER (PARTITION BY u.country ORDER BY u.created_at) AS rn,
RANK() OVER (PARTITION BY u.country ORDER BY u.total_spend DESC) AS rnk,
SUM(u.total_spend) OVER (
PARTITION BY u.country
ORDER BY u.created_at
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_spend
FROM users u
WHERE u.status = 'ACTIVE';
""", options={
"mongoServerVersion": "6.0", # enable $setWindowFields
"emitCompatibilityNotes": True
})Example Output (abridged)
{
"collection": "users",
"pipeline": [
{"$match": {"status": "ACTIVE"}},
{"$setWindowFields": {
"partitionBy": "$country",
"sortBy": {"created_at": 1},
"output": {
"rn": {"$documentNumber": {}}, // ROW_NUMBER
"running_spend": {
"$sum": "$total_spend",
"window": {"documents": ["unbounded","current"]}
}
}
}},
{"$setWindowFields": {
"partitionBy": "$country",
"sortBy": {"total_spend": -1},
"output": {
"rnk": {"$rank": {}} // RANK (DENSE_RANK → $denseRank)
}
}},
{"$project": {"country": 1, "user_id": 1, "rn": 1, "rnk": 1, "running_spend": 1, "_id": 0}}
]
}Note: multiple
$setWindowFieldsstages may be emitted to accommodate differentORDER BYspecs per window.
Translation Rules
- PARTITION BY →
partitionBy: <expr> - ORDER BY →
sortBy: { … } - ROW_NUMBER →
"$documentNumber": {} - RANK →
"$rank": {}, DENSE_RANK →"$denseRank": {} - Running aggregates → accumulator +
window.documents: ["unbounded","current"] N PRECEDING(best-effort) →window.documents: [-N, "current"](warn when not supported)- NTILE(n) (approx): compute bucket index using
$rank/$denseRank+ derived math; document approximation caveats.
Dialect & Validation
- Gate feature behind
options.mongoServerVersion >= 5.0; otherwise raise a clear, actionable error with a suggestion to precompute or upgrade. - Validate that each window has a single
ORDER BYand compatible frame; emit warnings for unsupported frames.
Tests
- Row number by partition (stable order)
- Rank vs dense_rank with ties
- Running sum/avg by time order
- Mixed windows (different ORDER BYs) on same query
- Version guard (4.4 → friendly error)
- NTILE(n) sanity checks on uniform partitions
README / Docs
- New section: “Window Functions → $setWindowFields”
- Matrix table (SQL function → Mongo expression)
- Two full copy-paste examples + caveats (frames, server version)
- Performance note: ensure supporting indexes on
partitionByandsortBykeys
Acceptance Criteria
-
sql_to_pipeline()translates listed window functions into valid$setWindowFieldsstages. - Multiple windows with different
ORDER BYare supported (multiple stages emitted). - Version gating & helpful error/warning messages implemented.
- Comprehensive unit tests + README examples added.
- Backward compatibility preserved for existing APIs.
Metadata
Metadata
Assignees
Labels
documentationImprovements or additions to documentationImprovements or additions to documentationduplicateThis issue or pull request already existsThis issue or pull request already existsenhancementNew feature or requestNew feature or requestgood first issueGood for newcomersGood for newcomershelp wantedExtra attention is neededExtra attention is neededquestionFurther information is requestedFurther information is requested