[DOC] Callout the aggregation result may be approximate #4922
LantaoJin merged 5 commits into opensearch-project:main
Signed-off-by: Lantao Jin <ltjin@amazon.com>
Summary by CodeRabbit: Documentation
Walkthrough: Added documentation clarifying that OpenSearch bucket aggregation
Estimated code review effort: 1 (Trivial), ~3 minutes
Pre-merge checks: 5 passed
Signed-off-by: Lantao Jin <ltjin@amazon.com>
Signed-off-by: Lantao Jin <ltjin@amazon.com>
Actionable comments posted: 0
🧹 Nitpick comments (2)
docs/user/ppl/limitations/limitations.md (1)
105-115: Minor wording improvement for clarity.
The phrase starting with "A term that is globally infrequent..." could be tightened for readability. Consider simplifying it to reduce wordiness while preserving technical accuracy:
- A term that is globally infrequent might not appear as infrequent on every individual shard or might be entirely absent from the least frequent results returned by some shards. Conversely, a term that appears infrequently on one shard might be common on another. In both scenarios, rare terms can be missed during shard-level aggregation, resulting in incorrect overall results.
+ Rare terms may not be ranked consistently across shards. A term infrequent globally might rank higher on some shards or be absent from others. This shard-level inconsistency can cause rare terms to be missed during aggregation, resulting in incomplete results.
docs/user/ppl/cmd/stats.md (1)
68-79: Minor wording improvement for clarity.
The phrase in the second subsection could be simplified for readability. Consider the same change suggested for the parallel section in limitations.md:
- A term that is globally infrequent might not appear as infrequent on every individual shard or might be entirely absent from the least frequent results returned by some shards. Conversely, a term that appears infrequently on one shard might be common on another. In both scenarios, rare terms can be missed during shard-level aggregation, resulting in incorrect overall results.
+ Rare terms may not be ranked consistently across shards. A term infrequent globally might rank higher on some shards or be absent from others. This shard-level inconsistency can cause rare terms to be missed during aggregation, resulting in incomplete results.
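The shard-level effect that both versions of this paragraph describe can be seen in a small simulation. The sketch below is a toy model in Python, not OpenSearch's actual implementation: the function names, the `shard_size` parameter, and the per-shard URL counts are all assumptions made up for illustration. Each shard reports only its own least frequent terms, and the coordinator merges those partial lists.

```python
from collections import Counter


def shard_rarest(counts, shard_size):
    """Return the `shard_size` least frequent (term, count) pairs on one shard."""
    return sorted(counts.items(), key=lambda kv: (kv[1], kv[0]))[:shard_size]


def merged_rarest(shards, shard_size, size):
    """Approximate result: merge only what each shard chose to report."""
    merged = Counter()
    for counts in shards:
        for term, count in shard_rarest(counts, shard_size):
            merged[term] += count
    return sorted(merged.items(), key=lambda kv: (kv[1], kv[0]))[:size]


def exact_rarest(shards, size):
    """Exact result: sum every term over every shard before ranking."""
    total = Counter()
    for counts in shards:
        total.update(counts)
    return sorted(total.items(), key=lambda kv: (kv[1], kv[0]))[:size]


# Hypothetical per-shard document counts by URL.
shards = [
    {"/a": 1, "/b": 50, "/c": 3},
    {"/a": 40, "/b": 1, "/c": 3},
]

approx = merged_rarest(shards, shard_size=1, size=1)  # [('/a', 1)]
exact = exact_rarest(shards, size=1)                  # [('/c', 6)]
```

In this model `/c` is the globally rarest URL (6 documents), but it is never the rarest on any single shard, so it is dropped before the merge. The merged answer instead reports `/a` with a count of 1, although `/a` actually occurs 41 times.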
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
docs/user/ppl/cmd/stats.md (1 hunks)
docs/user/ppl/functions/aggregations.md (1 hunks)
docs/user/ppl/limitations/limitations.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/user/ppl/limitations/limitations.md
[style] ~115-~115: You can shorten this phrase to improve clarity and avoid wordiness.
Context: ...) as c by URL | sort + c | head 10 ``` A term that is globally infrequent might not appear as infrequent on every...
(NNS_THAT_ARE_JJ)
docs/user/ppl/cmd/stats.md
[style] ~78-~78: You can shorten this phrase to improve clarity and avoid wordiness.
Context: ...) as c by URL | sort + c | head 10 ``` A term that is globally infrequent might not appear as infrequent on every...
(NNS_THAT_ARE_JJ)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (24)
- GitHub Check: build-linux (21, unit)
- GitHub Check: build-linux (25, integration)
- GitHub Check: build-linux (21, doc)
- GitHub Check: build-linux (25, doc)
- GitHub Check: build-linux (25, unit)
- GitHub Check: build-linux (21, integration)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: security-it-linux (25)
- GitHub Check: security-it-linux (21)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: CodeQL-Scan (java)
🔇 Additional comments (4)
docs/user/ppl/functions/aggregations.md (1)
5-5: Accurate expansion of supported commands.
The addition of `streamstats` to the list of commands that use aggregation functions is correct and aligns with the related documentation updates. The change reads naturally in context.
docs/user/ppl/limitations/limitations.md (2)
91-115: Well-documented limitations with clear examples.
The new sections effectively document the approximate behavior of bucket aggregations and related limitations. The examples and explanations are clear and directly address the issue #4915 requirements. The placement in the limitations section is appropriate.
52-115: Note: Content duplication with stats.md.
The Limitations section added here (lines 91-115) appears to have identical content in `docs/user/ppl/cmd/stats.md` (lines 52-79). While this ensures both the general limitations and command-specific documentation are comprehensive, you may want to consider a single source of truth or cross-references to reduce maintenance burden. This is informational only.
docs/user/ppl/cmd/stats.md (1)
52-79: Appropriate command-level limitations documentation.
Adding limitations to the stats command documentation ensures users encounter this critical information in the relevant context. The examples effectively demonstrate the affected scenarios, and the explanations are clear.
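The approximate behavior of bucket aggregations that these comments refer to can also be sketched for the most-frequent-terms case. The following Python toy model is an assumption-laden illustration, not the engine's real algorithm: `top_terms`, `hot_shards`, and the counts are hypothetical, and `shard_size` here simply means how many terms each shard is allowed to report to the coordinator.

```python
from collections import Counter


def top_terms(shards, size, shard_size):
    """Merge per-shard top-`shard_size` lists and return the global top `size`."""
    merged = Counter()
    for counts in shards:
        # Each shard reports only its own `shard_size` most frequent terms.
        for term, count in Counter(counts).most_common(shard_size):
            merged[term] += count
    return sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))[:size]


# Hypothetical per-shard counts: "y" is second on both shards but first overall.
hot_shards = [
    {"x": 10, "y": 9, "z": 8},
    {"w": 10, "y": 9, "z": 1},
]

narrow = top_terms(hot_shards, size=1, shard_size=1)  # [('w', 10)]
wide = top_terms(hot_shards, size=1, shard_size=3)    # [('y', 18)]
```

With one term per shard, the merged result misses `y` entirely. Letting every shard report all three of its terms recovers the exact answer, which is why a larger per-shard candidate list trades extra work for accuracy in this model.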
Signed-off-by: Lantao Jin <ltjin@amazon.com>
Actionable comments posted: 0
🧹 Nitpick comments (2)
docs/user/ppl/limitations/limitations.md (1)
91-116: Documentation successfully clarifies approximate aggregation behavior.
The two new limitation sections effectively explain the approximation issues with bucket aggregations and sorting behavior, directly addressing issue #4915. Code examples are properly formatted with the `ignore` flag per doctest guidelines.
Minor clarity suggestion: Line 115 contains a wordy phrase. Consider shortening "A term that is globally infrequent might not appear as infrequent on every individual shard" to something more concise, such as "A globally infrequent term may not appear infrequent on every individual shard".
docs/user/ppl/cmd/stats.md (1)
52-80: Limitations section appropriately documents aggregation approximation behavior.
Placing the Limitations section within the stats command documentation is ideal for context awareness. Code examples follow doctest guidelines with proper `ignore` flags.
Same minor clarity suggestion as in limitations.md: Line 79 uses wordy phrasing. Shorten "A term that is globally infrequent might not appear as infrequent on every individual shard" to improve clarity and reduce cognitive load.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
docs/user/ppl/cmd/stats.md (1 hunks)
docs/user/ppl/limitations/limitations.md (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: For PPL command PRs, refer docs/dev/ppl-commands.md and verify the PR satisfies the checklist
Applied to files:
docs/user/ppl/cmd/stats.md
🪛 LanguageTool
docs/user/ppl/cmd/stats.md
[style] ~78-~78: You can shorten this phrase to improve clarity and avoid wordiness.
Context: ...) as c by URL | sort + c | head 10 ``` A term that is globally infrequent might not appear as infrequent on every...
(NNS_THAT_ARE_JJ)
docs/user/ppl/limitations/limitations.md
[style] ~115-~115: You can shorten this phrase to improve clarity and avoid wordiness.
Context: ...) as c by URL | sort + c | head 10 ``` A term that is globally infrequent might not appear as infrequent on every...
(NNS_THAT_ARE_JJ)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (27)
- GitHub Check: security-it-linux (25)
- GitHub Check: security-it-linux (21)
- GitHub Check: build-linux (21, unit)
- GitHub Check: build-linux (25, integration)
- GitHub Check: build-linux (25, unit)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: build-linux (21, integration)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: build-linux (21, doc)
- GitHub Check: build-linux (25, doc)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: CodeQL-Scan (java)
To backport manually, run these commands in your terminal:
# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-4922-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 90ee47c6f909d38f5ba12cef3c2bda8c5f23cce5
# Push it to GitHub
git push --set-upstream origin backport/backport-4922-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-dev
Then, create a pull request where the
* [DOC] Callout the aggregation result may be approximate
Signed-off-by: Lantao Jin <ltjin@amazon.com>
* add to limitation.rst
Signed-off-by: Lantao Jin <ltjin@amazon.com>
* revert
Signed-off-by: Lantao Jin <ltjin@amazon.com>
* add ignore format
Signed-off-by: Lantao Jin <ltjin@amazon.com>
---------
Signed-off-by: Lantao Jin <ltjin@amazon.com>
(cherry picked from commit 90ee47c)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* [DOC] Callout the aggregation result may be approximate
* add to limitation.rst
* revert
* add ignore format
---------
(cherry picked from commit 90ee47c)
Signed-off-by: Lantao Jin <ltjin@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Description
[DOC] Callout the aggregation result may be approximate
Related Issues
Resolves #4915
Check List
`--signoff` or `-s`.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.