Updates the Apriori prune() function to improve performance while preserving correctness. #13616
+4
−4
This PR updates the Apriori prune() function to improve performance while preserving correctness.
What’s included
Use a set for O(1) membership checks of (k–1)-subsets.
Retain the Counter-based multiplicity check (count < length − 1) for robustness and to match the existing doctest behavior.
Inline comments clarifying the rationale.
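For reviewers less familiar with the pruning step, here is a minimal sketch of the set-based membership idea. It is not the code in this PR: the function name `prune_candidates`, its signature, and the sample data are hypothetical, and it shows only the textbook rule that a k-item candidate survives when every one of its (k–1)-subsets is already known to be frequent.

```python
from itertools import combinations


def prune_candidates(frequent_prev, candidates):
    """Keep candidates whose (k-1)-subsets are all in frequent_prev.

    Hypothetical helper for illustration; not the repository's prune().
    """
    # Build the lookup once so each membership test is O(1) on average.
    frequent = {frozenset(itemset) for itemset in frequent_prev}
    kept = []
    for candidate in candidates:
        k = len(candidate)
        # Every (k-1)-subset must already be frequent (Apriori property).
        if all(
            frozenset(subset) in frequent
            for subset in combinations(candidate, k - 1)
        ):
            kept.append(candidate)
    return kept


# Made-up example data: frequent 2-itemsets and two 3-item candidates.
frequent_pairs = [["A", "B"], ["A", "C"], ["B", "C"], ["B", "D"]]
candidates = [["A", "B", "C"], ["B", "C", "D"]]
print(prune_candidates(frequent_pairs, candidates))  # [['A', 'B', 'C']]
```

`["B", "C", "D"]` is dropped because its subset `{"C", "D"}` is not among the frequent pairs. Using `frozenset` keys makes the lookup independent of item order, which is a design choice of this sketch rather than something the PR states.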
Why
Standard Apriori datasets (unique items per transaction) benefit from faster membership-only checks.
Keeping the multiplicity check ensures correctness for edge cases outside that assumption, such as repeated items.
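The difference between the two checks can be illustrated with a small sketch. It assumes the threshold described above (a count below length − 1 fails); the helper name `has_enough_support` and the sample data are hypothetical and are not taken from apriori_algorithm.py.

```python
from collections import Counter


def has_enough_support(subset, counts, length):
    """Return True when subset occurs at least (length - 1) times.

    Hypothetical helper mirroring the "< length - 1" rejection rule.
    """
    # Counter returns 0 for missing keys, so absent subsets fail for length >= 2.
    return counts[tuple(subset)] >= length - 1


# Made-up previous level in which ("A",) appears twice.
previous_level = [("A",), ("B",), ("A",)]

membership = set(previous_level)  # collapses duplicates: presence only
counts = Counter(previous_level)  # keeps how often each itemset occurs

print(("A",) in membership, counts[("A",)])
print(has_enough_support(("B",), counts, 3))
```

A set answers only "is it present?", while the Counter also records how often each itemset occurs. Here `("B",)` is present in the set but occurs once, so it fails the multiplicity threshold for `length = 3`.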
File changed:
[apriori_algorithm.py]
Tests:
Doctests (module): 11 passed, 0 failed.
Reference:
Apriori algorithm: https://en.wikipedia.org/wiki/Apriori_algorithm
Fixes #12943
Checklist:
[x] I have read CONTRIBUTING.md.
[x] This pull request is all my own work -- I have not plagiarized.
[x] I know that pull requests will not be merged if they fail the automated tests.
[x] This PR only changes one algorithm file. To ease review, please open separate PRs for separate algorithms.
[x] All new Python files are placed inside an existing directory. (No new files added.)
[x] All filenames are in all lowercase characters with no spaces or dashes.
[x] All functions and variable names follow Python naming conventions.
[x] All function parameters and return values are annotated with Python type hints.
[x] All functions have doctests that pass the automated testing.
[ ] All new algorithms include at least one URL that points to Wikipedia or another similar explanation. (Not a new algorithm.)
[x] If this pull request resolves one or more open issues then the description above includes the issue number(s) with a closing keyword: “Fixes #12943”.