Fpmax (#553)
* Add fpmax algorithm to frequent patterns module

* Refactor unit tests for frequent patterns

* Small fix to fpmax

* Add unit tests for fpmax

* Fix unit tests for apriori and growth plus more refactoring

* Change EOL to match rest of repo (LF instead of CRLF)

* Remove unittest parent class from frequent pattern tests as it is unneeded

* Improve valid val check performance in frequent patterns

* Fix some pep8 issues

* Fix pytest issues

* Refactor fpgrowth

* add boolean array to unit tests

* add documentation
harenbergsd authored and rasbt committed Jun 23, 2019
1 parent b20e57c commit cbe17d7
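
For readers unfamiliar with the term used in this commit: a maximal itemset is a frequent itemset that has no frequent proper superset. The block below is a minimal brute-force sketch of that definition in plain Python. It is not the FP-Max implementation added here, and the function names `frequent_itemsets` and `maximal_itemsets`, as well as the toy transactions, are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of itemsets meeting min_support (illustration only)."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            # Support = fraction of transactions that contain the candidate.
            support = sum(cset <= t for t in transactions) / n
            if support >= min_support:
                result[cset] = support
    return result


def maximal_itemsets(frequent):
    """Keep only frequent itemsets that have no frequent proper superset."""
    return {
        itemset: support
        for itemset, support in frequent.items()
        if not any(itemset < other for other in frequent)
    }


# Hypothetical toy data, not taken from the mlxtend docs or tests.
transactions = [
    frozenset({"milk", "bread", "eggs"}),
    frozenset({"milk", "bread"}),
    frozenset({"milk", "eggs"}),
    frozenset({"bread", "eggs"}),
    frozenset({"milk", "bread", "eggs"}),
]

frequent = frequent_itemsets(transactions, min_support=0.6)
maximal = maximal_itemsets(frequent)
```

With a minimum support of 0.6, the three single items and the three pairs are frequent, but only the pairs are maximal, because every single item is contained in a frequent pair.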
Showing 15 changed files with 1,270 additions and 453 deletions.
1 change: 1 addition & 0 deletions docs/mkdocs.yml
@@ -93,6 +93,7 @@ nav:
- user_guide/frequent_patterns/apriori.md
- user_guide/frequent_patterns/association_rules.md
- user_guide/frequent_patterns/fpgrowth.md
- user_guide/frequent_patterns/fpmax.md
- general concepts:
- user_guide/general_concepts/activation-functions.md
- user_guide/general_concepts/gradient-optimization.md
3 changes: 2 additions & 1 deletion docs/sources/CHANGELOG.md
@@ -21,8 +21,9 @@ The CHANGELOG for the current development version is available at
- Added optional `groups` parameter to `SequentialFeatureSelector` and `ExhaustiveFeatureSelector` `fit()` methods for forwarding to sklearn CV ([#537](https://github.com/rasbt/mlxtend/pull/537) via [arc12](https://github.com/qiaguhttps://github.com/arc12))
- Added a new `plot_pca_correlation_graph` function to the `mlxtend.plotting` submodule for plotting a PCA correlation graph. ([#544](https://github.com/rasbt/mlxtend/pull/544) via [Gabriel-Azevedo-Ferreira](https://github.com/qiaguhttps://github.com/Gabriel-Azevedo-Ferreira))
- Added a `zoom_factor` parameter to the `mlxten.plotting.plot_decision_region` function that allows users to zoom in and out of the decision region plots. ([#545](https://github.com/rasbt/mlxtend/pull/545))
- Added a function `fpgrowth` that implements the FP-Growth algorithm for mining frequent itemsets as a drop-in replacement of the existing `apriori` algorithm. ([#550](https://github.com/rasbt/mlxtend/pull/550) via [Steve Harenberg](https://github.com/harenbergsd))
- Added a function `fpgrowth` that implements the FP-Growth algorithm for mining frequent itemsets as a drop-in replacement for the existing `apriori` algorithm. ([#550](https://github.com/rasbt/mlxtend/pull/550) via [Steve Harenberg](https://github.com/harenbergsd))
- New `heatmap` function in `mlxtend.plotting`. ([#552](https://github.com/rasbt/mlxtend/pull/552))
- Added a function `fpmax` that implements the FP-Max algorithm for mining maximal itemsets as a drop-in replacement for the `fpgrowth` algorithm. ([#553](https://github.com/rasbt/mlxtend/pull/553) via [Steve Harenberg](https://github.com/harenbergsd))
- New `figsize` parameter for the `plot_decision_regions` function in `mlxtend.plotting`. ([#555](https://github.com/rasbt/mlxtend/pull/555) via [Mirza Hasanbasic](https://github.com/kazyka))

##### Changes
1 change: 1 addition & 0 deletions docs/sources/USER_GUIDE_INDEX.md
@@ -64,6 +64,7 @@
- [apriori](user_guide/frequent_patterns/apriori.md)
- [association_rules](user_guide/frequent_patterns/association_rules.md)
- [fpgrowth](user_guide/frequent_patterns/fpgrowth.md)
- [fpmax](user_guide/frequent_patterns/fpmax.md)

## `general concepts`
- [activation-functions](user_guide/general_concepts/activation-functions.md)
30 changes: 8 additions & 22 deletions docs/sources/user_guide/frequent_patterns/apriori.ipynb
@@ -41,7 +41,12 @@
"source": [
"## References\n",
"\n",
"[1] Agrawal, Rakesh, and Ramakrishnan Srikant. \"[Fast algorithms for mining association rules](https://www.it.uu.se/edu/course/homepage/infoutv/ht08/vldb94_rj.pdf).\" Proc. 20th int. conf. very large data bases, VLDB. Vol. 1215. 1994."
"[1] Agrawal, Rakesh, and Ramakrishnan Srikant. \"[Fast algorithms for mining association rules](https://www.it.uu.se/edu/course/homepage/infoutv/ht08/vldb94_rj.pdf).\" Proc. 20th int. conf. very large data bases, VLDB. Vol. 1215. 1994.\n",
"\n",
"## Related\n",
"\n",
"- [FP-Growth](../fpgrowth.md)\n",
"- [FP-Max](../fpmax.md)"
]
},
{
@@ -53,9 +58,7 @@
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"metadata": {},
"source": [
"The `apriori` function expects data in a one-hot encoded pandas DataFrame.\n",
"Suppose we have the following transaction data:"
@@ -923,23 +926,6 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Iteration: 1 | Sampling itemset size 2\r",
"Iteration: 2 | Sampling itemset size 2\r",
"Iteration: 3 | Sampling itemset size 2\r",
"Iteration: 4 | Sampling itemset size 2\r",
"Iteration: 5 | Sampling itemset size 2\r",
"Iteration: 6 | Sampling itemset size 2\r",
"Iteration: 7 | Sampling itemset size 2\r",
"Iteration: 8 | Sampling itemset size 2\r",
"Iteration: 9 | Sampling itemset size 2\r",
"Iteration: 10 | Sampling itemset size 2\r",
"Iteration: 11 | Sampling itemset size 3\r",
"Iteration: 12 | Sampling itemset size 3\r",
"Iteration: 13 | Sampling itemset size 3\r",
"Iteration: 14 | Sampling itemset size 3\r",
"Iteration: 15 | Sampling itemset size 3\r",
"Iteration: 16 | Sampling itemset size 3\r",
"Iteration: 17 | Sampling itemset size 3\n"
]
},
@@ -1176,5 +1162,5 @@
}
},
"nbformat": 4,
"nbformat_minor": 1
"nbformat_minor": 2
}
7 changes: 6 additions & 1 deletion docs/sources/user_guide/frequent_patterns/fpgrowth.ipynb
@@ -47,7 +47,12 @@
"\n",
"[1] Han, Jiawei, Jian Pei, Yiwen Yin, and Runying Mao. \"Mining frequent patterns without candidate generation. \"[A frequent-pattern tree approach.](https://link.springer.com/content/pdf/10.1023%2FB%3ADAMI.0000005258.31418.83.pdf)\" Data mining and knowledge discovery 8, no. 1 (2004): 53-87.\n",
"\n",
"[2] Agrawal, Rakesh, and Ramakrishnan Srikant. \"[Fast algorithms for mining association rules](https://www.it.uu.se/edu/course/homepage/infoutv/ht08/vldb94_rj.pdf).\" Proc. 20th int. conf. very large data bases, VLDB. Vol. 1215. 1994."
"[2] Agrawal, Rakesh, and Ramakrishnan Srikant. \"[Fast algorithms for mining association rules](https://www.it.uu.se/edu/course/homepage/infoutv/ht08/vldb94_rj.pdf).\" Proc. 20th int. conf. very large data bases, VLDB. Vol. 1215. 1994.\n",
"\n",
"## Related\n",
"\n",
"- [FP-Max](../fpmax.md)\n",
"- [Apriori](../apriori.md)"
]
},
{
