Apriori use sets #393

WLaney · 2018-06-11T00:09:42Z

Description

Changes apriori to return itemsets as sets instead of lists.

Related issues or pull requests

Pull Request Checklist

Added a note about the modification or contribution to the ./docs/sources/CHANGELOG.md file (if applicable)
Added appropriate unit test functions in the ./mlxtend/*/tests directories (if applicable)
Modified documentation in the corresponding Jupyter Notebook under mlxtend/docs/sources/ (if applicable)
Ran nosetests ./mlxtend -sv and made sure that all unit tests pass (for small modifications, it might be sufficient to only run the specific test file, e.g., nosetests ./mlxtend/classifier/tests/test_stacking_cv_classifier.py -sv)
Checked for style issues by running flake8 ./mlxtend

coveralls · 2018-06-11T00:25:00Z

Coverage increased (+0.01%) to 91.267% when pulling a7cdb41 on WLaney:apriori-use-sets into bcf3f44 on rasbt:master.

rasbt · 2018-06-12T02:18:31Z

Thanks for the PR!

I think for efficiency reasons, to avoid yet another for-loop, it would be better to do the transformation to sets earlier in the code. For example, by replacing the line itemsets = pd.Series([i for i in itemset_dict[k]]) with

itemsets = pd.Series([set(i) for i in itemset_dict[k]])

and then changing the line in the use_colnames branch as well, i.e., res_df['itemsets'] = res_df['itemsets'].apply(lambda x: {mapping[i] for i in x}).
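As a rough illustration of the suggestion, the sketch below builds the result frame with sets from the start. The contents of itemset_dict, support_dict, and columns are made-up stand-ins for apriori's internal state, not values from the actual code; only the two set-producing lines mirror the change discussed here.

```python
import pandas as pd

# Hypothetical stand-ins for apriori's internal state: itemsets of column
# indices grouped by itemset length k, with matching support values.
itemset_dict = {1: [[0], [1]], 2: [[0, 1]]}
support_dict = {1: [0.8, 0.6], 2: [0.5]}
columns = ["Eggs", "Milk"]  # assumed column names of the input DataFrame

all_res = []
for k in sorted(itemset_dict):
    support = pd.Series(support_dict[k])
    # Convert to sets while the Series is built, so no later pass is needed.
    itemsets = pd.Series([set(i) for i in itemset_dict[k]])
    all_res.append(pd.concat((support, itemsets), axis=1))

res_df = pd.concat(all_res)
res_df.columns = ["support", "itemsets"]
res_df = res_df.reset_index(drop=True)

# use_colnames=True branch: a set comprehension keeps each entry a set.
mapping = {idx: item for idx, item in enumerate(columns)}
named = res_df["itemsets"].apply(lambda x: {mapping[i] for i in x})
```

Because the conversion happens inside the loop that already exists, no additional loop over the finished DataFrame is required.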

It would be great if you could also add a short unit test checking that the type of each itemset is indeed "set" for both use_colnames=True and use_colnames=False.

Thanks!

WLaney · 2018-06-14T01:34:29Z

Thank you for the advice!

I tried to do the transformation earlier in the code on my first go, but I couldn't quite figure out where to do it without messing up the pandas data structure. I was able to get it this time with your help though!

For the unit test I'm just looping through the returned itemsets and checking the type of each element. I've never had to write unit tests before, but I think that test checks what we want. If you think it would be better to do something else, though, just let me know.
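The check described above could look roughly like the following sketch. The helper name check_itemsets_type and the two hand-built DataFrames are assumptions for illustration; they stand in for the results of apriori(df, use_colnames=False) and apriori(df, use_colnames=True) so that the example runs without mlxtend.

```python
import pandas as pd


def check_itemsets_type(result):
    """Assert that every entry in the 'itemsets' column is a set."""
    for itemset in result["itemsets"]:
        assert isinstance(itemset, set)


# Hypothetical stand-in for apriori(df, use_colnames=False): column indices.
res_colindice = pd.DataFrame(
    {"support": [0.8, 0.5], "itemsets": [{0}, {0, 1}]}
)

# Hypothetical stand-in for apriori(df, use_colnames=True): column names.
res_colnames = pd.DataFrame(
    {"support": [0.8, 0.5], "itemsets": [{"Eggs"}, {"Eggs", "Milk"}]}
)

check_itemsets_type(res_colindice)
check_itemsets_type(res_colnames)
```

Checking both variants matters because the use_colnames=True path rebuilds each itemset through the name mapping and could silently fall back to lists.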

rasbt · 2018-06-14T20:15:39Z

mlxtend/frequent_patterns/tests/test_apriori.py

+
+
+def test_itemsets_type():
+    res_colindice = apriori(df, use_colnames=False)  # This is defualt behavior


Could you change "defualt" to "default"? :)

rasbt · 2018-06-14T20:16:21Z

mlxtend/frequent_patterns/tests/test_apriori.py

+    for i in res_colindice['itemsets']:
+        assert isinstance(i, set) is True
+
+    res_colnames = apriori(df, use_colnames=True)  # This is defualt behavior


I think this one shouldn't be labeled as default behavior, because the line above with "use_colnames=False" is already marked as the default.

rasbt · 2018-06-14T20:17:09Z

That looks good, thanks! Could you maybe just do the little comment-fix that I mentioned above? Besides that, this looks good!

WLaney · 2018-06-15T01:24:14Z

Thanks! And yeah, no problem. Sorry about the comments, that was just carelessness on my part.

rasbt · 2018-06-15T01:43:58Z

That's great, thanks for the fix & PR

WLaney added 2 commits June 10, 2018 17:36

apriori.py now outputs itemsets as a set

2cffa5a

updated docs

cdf899c

WLaney added 3 commits June 13, 2018 20:44

itemsets now converted to sets in the loop that creates them

bc4ea52

this is for efficiency reasons

added unit tests to check apriori itemsets data type

b9cc516

formatting updates from flake8

175fc5c

rasbt reviewed Jun 14, 2018

small comment fixes

a7cdb41

rasbt merged commit 84fde97 into rasbt:master Jun 15, 2018

This was referenced Jun 15, 2018

Change itemsets generated via apriori from list to sets #344

Closed

use frozensets in apriori #394

Merged
