Entropy aggregation primitive #779
Hi @rwedge, I had issues associating my GitHub account with the commits in my previous PR, so I redid the PR. Previously, when running 'make html', the documentation for the Entropy aggregation primitive was not generated automatically. Is there an extra step that I am missing? In addition, I am now getting the following error when I run 'make html':
My mistake, you should add entropy to the Aggregation Primitives section of
That docs error is strange. Does your environment warn you when running this code:
There were some issues with the packages in the environment. I have a dedicated featuretools env now, so that should be sorted. Apologies for the silly questions, I am still getting the hang of contributing and getting everything set up. I am getting the following error when trying to make the docs: Traceback (most recent call last):
I also fail the "featuretools/tests/cli_tests/test_cli.py" test locally. However, when I install featuretools into my env, this issue (and the configuration error) disappears, but then the documentation does not build properly. I think it's because everything is imported from the version installed in the env and not from the source code I'm working on.
If you run
featuretools will get installed in "editable" mode, and changes you make to the source code will be reflected when you import featuretools again. I would try re-installing featuretools in the env in editable mode.
Then re-add twdobson to changelog
@rwedge, thanks for the help - editable mode did the trick.
@twdobson code looks good, just thinking about what range we would want to use for the scipy dependency
@rwedge, thanks. Would the oldest version of scipy that has entropy, with the required parameters, work? Alternatively, we could implement the function via numpy. The formula is simple: Entropy = -sum(pk * log(pk), axis=0), where pk is the probability of each category, i.e. the output of value_counts(normalize=True).
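For reference, the numpy alternative mentioned above could look roughly like the sketch below. It is not the code in this PR: the function name `categorical_entropy` and the `base` parameter are assumptions made for illustration, and `np.unique(..., return_counts=True)` stands in for pandas' `value_counts(normalize=True)` so that the snippet only needs numpy.

```python
import numpy as np


def categorical_entropy(values, base=None):
    """Shannon entropy of the empirical distribution of `values`.

    Hypothetical helper (not part of featuretools or scipy).
    Uses the natural logarithm unless `base` is given.
    """
    values = np.asarray(values)
    # Count occurrences of each distinct category.
    _, counts = np.unique(values, return_counts=True)
    # Normalize counts to probabilities; every pk is > 0, so log(pk) is finite.
    pk = counts / counts.sum()
    result = -np.sum(pk * np.log(pk))
    if base is not None:
        # Change of base: log_b(x) = ln(x) / ln(b).
        result /= np.log(base)
    return float(result)


# Two equally likely categories give ln(2) nats, i.e. 1 bit.
print(categorical_entropy(["a", "b", "a", "b"]))
print(categorical_entropy(["a", "b", "a", "b"], base=2))
```

A hand-rolled version like this would avoid pinning a scipy version, at the cost of maintaining the calculation in this project.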
@twdobson it looks like scipy 0.11.0 is when the
@rwedge, sounds good. I have updated the PR to reflect this. Two quick questions:
Let me know if there is anything else that needs to be updated.
PR looks good
As to your questions:
- I installed featuretools into a clean environment and checked the pip output to see which package required scipy. I did try pipdeptree after you mentioned it; it also reported scipy 0.13.3 as necessary for sklearn 0.20.4
- We don't test how new primitives impact model performance or feature importance
Entropy aggregation primitive
For details on entropy please see:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.entropy.html