Allow installation of additional primitives #326
@@ -1,14 +0,0 @@
[tox]
We can probably remove tox from dev-requirements.txt as well.
featuretools/primitives/install.py
Outdated
s3 = s3fs.S3FileSystem(anon=True)
remote_archive = s3.open(uri, 'rb')

f.write(remote_archive.read())
Instead of reading the whole archive at once, should we read it line by line or in chunks of some number of bytes?
I don't think it matters, does it? At least for now, I don't expect these archives to be too big.
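For reference, switching to a chunked copy would be a small change. Below is a minimal sketch, assuming the remote handle is any binary file-like object (which is what s3.open(uri, 'rb') returns). The helper name copy_in_chunks and the 1 MiB chunk size are illustrative and not part of this PR; an in-memory buffer stands in for the remote archive so the sketch runs without network access.

```python
import io
import shutil


def copy_in_chunks(remote_archive, local_file, chunk_size=1024 * 1024):
    """Stream one binary file-like object into another without loading it all into memory."""
    # shutil.copyfileobj reads at most chunk_size bytes per iteration.
    shutil.copyfileobj(remote_archive, local_file, chunk_size)


# Stand-in for the remote archive (hypothetical data, slightly over 3 MiB).
source = io.BytesIO(b"x" * (3 * 1024 * 1024 + 17))
target = io.BytesIO()
copy_in_chunks(source, target)
```

With small archives the two approaches behave the same; the chunked version only matters once an archive no longer fits comfortably in memory.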
…ools into install-primitives
This PR introduces a method to install additional primitives into an existing Featuretools installation.
Installation from command line
Now a user can access the primitives either by importing or by string
This returns
How it works
The installation command takes a directory or tar archive of primitive source files. If it is an archive, the installation script extracts it to a directory. The archive can also be remote, as in the example above, in which case it is first downloaded to a temporary directory. We use smart_open to handle the download, so primitives can be fetched from S3, HDFS, and HTTP/HTTPS.
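The directory-or-archive handling described above could look roughly like the following. This is a sketch under stated assumptions, not the code in this PR: the function name resolve_primitives_dir is hypothetical, the remote download step is left out, and the demo builds a throwaway archive locally.

```python
import os
import tarfile
import tempfile


def resolve_primitives_dir(path):
    """Return a directory of primitive source files for a directory or tar archive path."""
    if os.path.isdir(path):
        # Already a directory: use it as-is.
        return path
    if tarfile.is_tarfile(path):
        # Extract the archive into a fresh temporary directory.
        # Note: only extract archives from sources you trust.
        extract_dir = tempfile.mkdtemp()
        with tarfile.open(path) as archive:
            archive.extractall(extract_dir)
        return extract_dir
    raise ValueError("expected a directory or tar archive: %s" % path)


# Demo: build a small archive, then resolve it to a directory.
src_dir = tempfile.mkdtemp()
src_file = os.path.join(src_dir, "my_primitive.py")
with open(src_file, "w") as f:
    f.write("# primitive source would go here\n")

archive_path = os.path.join(tempfile.mkdtemp(), "primitives.tar.gz")
with tarfile.open(archive_path, "w:gz") as archive:
    archive.add(src_file, arcname="my_primitive.py")

extracted = resolve_primitives_dir(archive_path)
```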
Primitives are installed by copying their source files into a new directory, `installed`, within the `featuretools.primitives` submodule. The installation script only considers files with a `.py` extension. It detects whether a file contains a primitive by looking for an object that is a subclass of `PrimitiveBase`. Each of these files must contain exactly one primitive; otherwise the installation will throw an error. When `featuretools.primitives` is loaded, primitive source files in `installed/` are automatically imported.

To support the CLI, there is new configuration for `entry_points` in setup.py, based on the instructions here: https://chriswarrick.com/blog/2014/09/15/python-apps-the-right-way-entry_points-and-scripts/

Users can also install from a Python script using `ft.install_primitives(...)`.
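To illustrate the "exactly one primitive per file" check, here is a minimal sketch. It is an assumption about how such a check can be written, not the implementation in this PR: find_primitive is a hypothetical helper, and a local stand-in class replaces the real `PrimitiveBase` so the example runs without featuretools installed.

```python
import importlib.util
import inspect
import os
import tempfile


class PrimitiveBase:
    """Stand-in for featuretools' PrimitiveBase so the sketch is self-contained."""


def find_primitive(filepath, base=PrimitiveBase):
    """Load a .py file and return the single subclass of `base` that it defines."""
    module_name = os.path.splitext(os.path.basename(filepath))[0]
    spec = importlib.util.spec_from_file_location(module_name, filepath)
    module = importlib.util.module_from_spec(spec)
    # Expose the stand-in base class to the loaded file (demo convenience only;
    # a real primitive file would import its base class itself).
    module.PrimitiveBase = base
    spec.loader.exec_module(module)
    found = [
        obj for _, obj in inspect.getmembers(module, inspect.isclass)
        if issubclass(obj, base) and obj is not base
    ]
    if len(found) != 1:
        raise RuntimeError(
            "%s must define exactly one primitive, found %d" % (filepath, len(found))
        )
    return found[0]


# Demo files: one valid primitive file, and one that defines two primitives.
demo_dir = tempfile.mkdtemp()
good_file = os.path.join(demo_dir, "one_primitive.py")
with open(good_file, "w") as f:
    f.write("class MyPrimitive(PrimitiveBase):\n    name = 'my_primitive'\n")
bad_file = os.path.join(demo_dir, "two_primitives.py")
with open(bad_file, "w") as f:
    f.write("class A(PrimitiveBase):\n    pass\n\nclass B(PrimitiveBase):\n    pass\n")

primitive = find_primitive(good_file)
```

Calling find_primitive(bad_file) raises a RuntimeError, which mirrors the installation error described above for files that define more than one primitive.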
In this PR I also updated the structure of `featuretools.primitives` to be more organized. It doesn't change any of the external API, but there are now three submodules:
- `featuretools.primitives.standard` - all the primitives that come with featuretools
- `featuretools.primitives.installed` - all the primitives that are installed into featuretools
- `featuretools.primitives.base` - the base classes used by primitives, e.g. `PrimitiveBase`, `AggregationPrimitive`, etc.

Finally, I also removed our usage of tox. It wasn't providing any useful functionality after we separated each version into separate CircleCI jobs.
TODO before ready for review
Future Development