Showing 46 changed files with 15,746 additions and 7 deletions.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
# Sphinx build info version 1 | ||
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done. | ||
config: 19bbadb7a81fa723ab573d8e34479952 | ||
tags: 645f666f9bcd5a90fca523b33c5a78b7 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
.. keras-pandas documentation master file, created by | ||
sphinx-quickstart on Fri May 25 16:33:06 2018. | ||
You can adapt this file completely to your liking, but it should at least | ||
contain the root `toctree` directive. | ||
Welcome to keras-pandas's documentation! | ||
======================================== | ||
|
||
.. toctree:: | ||
:maxdepth: 2 | ||
:caption: Contents: | ||
|
||
.. automodule:: Automater | ||
:members: | ||
:undoc-members: | ||
:show-inheritance: | ||
|
||
|
||
Indices and tables | ||
================== | ||
|
||
* :ref:`genindex` | ||
* :ref:`modindex` | ||
* :ref:`search` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,126 @@ | ||
# Backlog | ||
|
||
- Interface: Need to determine options (SKLearn transformer, custom interface, etc) | ||
- Interface: Need to outline functionality | ||
- Boolean: Need to determine if it'll be handled as numerical or categorical | ||
- Pip installable: Need to determine level of effort | ||
|
||
# Interface: Need to determine options (SKLearn transformer, custom interface, etc) | ||
|
||
Options: | ||
|
||
- SKLearn transformer | ||
- Pandas-SKLearn style module | ||
- Custom module | ||
|
||
Requirements: | ||
|
||
- Be able to expand one column to many columns (datetime) | ||
- Ability to use SKLearn transformers | ||
|
||
## SKLearn transformer | ||
|
||
- Inputs would have to be Numpy arrays | ||
- Inputs can be Numpy arrays | ||
- Tighter integration to SKLearn infrastructure | ||
|
||
## Pandas-SKLearn style module | ||
|
||
- Can have pandas dataframe inputs | ||
- Can have multiple column inputs | ||
- Can apply multiple transformations to the same column (many to one relationship) | ||
- Can not apply transformations to some columns | ||
- Can use default transformer | ||
|
||
## Custom module | ||
|
||
- Can sit on top of Pandas-SKLearn | ||
- Can mimic SKLearn `fit` and `transform` interface | ||
|
||
# Boolean: Need to determine if it'll be handled as numerical or categorical | ||
|
||
## Numerical | ||
|
||
- Less compute time | ||
- Reduced complexity | ||
|
||
## Categorical | ||
|
||
- Embedding representing different values | ||
|
||
# Pip installable: Need to determine level of effort | ||
|
||
## Lit review | ||
|
||
- PMOTW: Setuptools | ||
- Common library: setuptools | ||
- PMOTW: distutils | ||
- Common library: distutils | ||
- Blog review | ||
|
||
## PMOTW: Setuptools | ||
|
||
- Unavailable | ||
|
||
## [Common library: setuptools](https://setuptools.readthedocs.io/en/latest/) | ||
|
||
- Designed to facilitate packaging Python projects | ||
- Enhancement to distutils | ||
|
||
Highlights | ||
|
||
- Dependency resolution & downloading | ||
- Create eggs | ||
- Automatically create wrapper scripts & .exe files | ||
- [Decent getting started guide](https://setuptools.readthedocs.io/en/latest/setuptools.html#basic-use) | ||
|
||
Superset of distutils | ||
|
||
## PMOTW: distutils | ||
|
||
- Unavailable | ||
|
||
## [Common library: distutils](https://docs.python.org/2/distutils/) | ||
|
||
[Intro](https://docs.python.org/2/distutils/introduction.html) | ||
|
||
- Setup script | ||
- Source distribution | ||
- Binary distributions | ||
|
||
Setup script | ||
|
||
- Handles packaging | ||
- Not aware of package managers | ||
|
||
[PyPi](https://docs.python.org/2/distutils/packageindex.html#pypi-overview) | ||
|
||
- Registering | ||
- Upload | ||
|
||
## Blogs | ||
|
||
- [Marthall](https://marthall.github.io/blog/how-to-package-a-python-app/) | ||
- [so](https://stackoverflow.com/questions/9411494/how-do-i-create-a-pip-installable-project) | ||
|
||
## [Marthall](https://marthall.github.io/blog/how-to-package-a-python-app/) | ||
|
||
- Strong, convenient walk through | ||
|
||
## [so](https://stackoverflow.com/questions/9411494/how-do-i-create-a-pip-installable-project) | ||
|
||
- Rambling | ||
|
||
## [Scott Torborg](http://python-packaging.readthedocs.io/en/latest/) | ||
|
||
- Strong advanced discussion | ||
- How to declare dependencies | ||
|
||
## [PyPI docs](https://packaging.python.org/tutorials/distributing-packages/#uploading-your-project-to-pypi) | ||
|
||
## [twine quickstart](https://packaging.python.org/tutorials/distributing-packages/) | ||
|
||
# Decisions | ||
|
||
- Interface: Will use custom interface, similar to SKLearn, with Pandas-SKLearn under the hood. | ||
- Pip installable: Will move forward w/ setuptools |
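The interface decision above (a custom module that mimics SKLearn's `fit` and `transform`, with per-column transformers in the Pandas-SKLearn style) could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `ColumnMapper` and `MinMaxScaler1D` are hypothetical names, and a plain dict of lists stands in for a pandas DataFrame so the sketch has no third-party dependencies.

```python
class MinMaxScaler1D:
    """Hypothetical per-column transformer with an SKLearn-like interface."""

    def fit(self, values):
        # Remember the range seen during fitting
        self.lo_, self.hi_ = min(values), max(values)
        return self

    def transform(self, values):
        # Guard against a constant column (zero span)
        span = (self.hi_ - self.lo_) or 1.0
        return [(v - self.lo_) / span for v in values]


class ColumnMapper:
    """Hypothetical wrapper: applies one transformer per named column."""

    def __init__(self, features):
        # features: list of (column_name, transformer) pairs
        self.features = features

    def fit(self, data):
        for column, transformer in self.features:
            transformer.fit(data[column])
        return self

    def transform(self, data):
        # Columns without a transformer pass through unchanged
        out = dict(data)
        for column, transformer in self.features:
            out[column] = transformer.transform(data[column])
        return out


# A dict of lists stands in for a DataFrame (assumption for this sketch)
train = {"age": [20.0, 30.0, 40.0], "name": ["a", "b", "c"]}
mapper = ColumnMapper([("age", MinMaxScaler1D())]).fit(train)
scaled = mapper.transform(train)
```

Keeping `fit` separate from `transform` means statistics learned on training data can be reused on new data, which is the main reason to mirror the SKLearn convention.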
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,48 @@ | ||
# Elevator pitch | ||
|
||
# Audience | ||
|
||
- Semi-technical | ||
- ML background, new to deep learning | ||
- Familiarity with Python data ecosystem (Pandas, SKLearn, numpy) | ||
- Interested in getting started in deep learning | ||
- Interested in quickly prototyping models | ||
|
||
# Goals | ||
|
||
- Easy to use interface | ||
- Sensible default actions | ||
- Low barrier to entry to create Keras input / output layers | ||
|
||
# Requirements | ||
|
||
## Backlog | ||
|
||
- Numerical inputs: Null handling, Z score normalization | ||
- Categorical inputs: Create embedding, handle unseen levels | ||
- Boolean inputs: Handle appropriately | ||
- Datetime: Extract categorical fields, treat as epoch time if possible. | ||
- Test run: train on random sample of data | ||
- Convenient interface | ||
- Logging | ||
- Unit tests | ||
- Appropriate exceptions | ||
- Pip installable | ||
|
||
## Prioritized backlog | ||
|
||
- Unit tests | ||
- Logging | ||
- Numerical inputs: Null handling, Z score normalization | ||
- Categorical inputs: Create embedding, handle unseen levels | ||
- Boolean inputs: Handle appropriately | ||
- Datetime: Extract categorical fields, treat as epoch time if possible. | ||
- Appropriate exceptions | ||
- Pip installable | ||
|
||
## POC items | ||
|
||
- Interface: Need to determine options (SKLearn transformer, custom interface, etc) | ||
- Interface: Need to outline functionality | ||
- Boolean: Need to determine if it'll be handled as numerical or categorical | ||
- Pip installable: Need to determine level of effort |
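Two of the backlog items above (numerical null handling with Z score normalization, and categorical inputs with unseen levels) can be illustrated with a small standard-library sketch. The function names are hypothetical, and two choices here are assumptions rather than decisions recorded in these notes: nulls are imputed with the training mean, and index `0` is reserved for unseen categorical levels.

```python
import math
import statistics

UNSEEN = 0  # assumption: index 0 is reserved for levels not seen during fit


def is_null(value):
    """Treat None and NaN as null."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def fit_numeric(values):
    """Learn mean and population standard deviation, ignoring nulls."""
    present = [v for v in values if not is_null(v)]
    mean = statistics.mean(present)
    std = statistics.pstdev(present) or 1.0  # avoid dividing by zero
    return mean, std


def transform_numeric(values, mean, std):
    """Z score normalize; nulls are imputed with the mean, so they map to 0.0."""
    return [((mean if is_null(v) else v) - mean) / std for v in values]


def fit_categorical(values):
    """Map each observed level to a positive integer index."""
    return {level: i + 1 for i, level in enumerate(sorted(set(values)))}


def transform_categorical(values, index):
    """Look up each level; unseen levels fall back to the reserved index."""
    return [index.get(v, UNSEEN) for v in values]


mean, std = fit_numeric([1.0, None, 3.0])
z_scores = transform_numeric([1.0, None, 3.0], mean, std)

level_index = fit_categorical(["red", "blue", "red"])
encoded = transform_categorical(["red", "green"], level_index)
```

Integer indices of this kind are the usual input to an embedding layer, with the reserved index giving unseen levels a shared representation.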
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,43 @@ | ||
# Public API | ||
|
||
## Methods | ||
|
||
- init | ||
- fit | ||
- transform | ||
- datetime_expansion_method_dict | ||
- constants | ||
- set_embedding_size_function | ||
- set_embedding_size_for_variable | ||
- get_transformers | ||
- get_transformer | ||
- list_default_transformation_pipelines | ||
- list_input_variables | ||
- list_output_variables | ||
|
||
## Attributes | ||
|
||
(no public attributes) | ||
|
||
|
||
# Internal API | ||
|
||
## Methods | ||
|
||
- check_variable_lists | ||
- check_input_dataframe_columns | ||
- check_output_dataframe_columns | ||
- datetime expansion | ||
- get_input_nub | ||
- get_output_nub | ||
|
||
## Attributes | ||
|
||
- sklearn_pandas_object | ||
- datetime_expansion_method_dict | ||
- embedding_size_function | ||
- variable_transformer_dict | ||
- input_variables | ||
- output_variables | ||
- input_nub | ||
- output_nub |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
Workflow for new variable type: | ||
|
||
- Add default transformation type | ||
- Add input layer handler | ||
- Add output layer handler |
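One way to picture the three-step workflow above is a registry that holds, per variable type, a default transformation plus input and output layer handlers. This is only a sketch under stated assumptions: `VariableTypeHandlers`, `REGISTRY`, and `register_variable_type` are hypothetical names, and the handlers return descriptive strings instead of building real Keras layers.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class VariableTypeHandlers:
    """Hypothetical bundle of the three pieces a new variable type needs."""
    default_transformation: Callable[[list], list]
    input_layer_handler: Callable[[str], str]
    output_layer_handler: Callable[[str], str]


# Hypothetical registry keyed by variable type name
REGISTRY: Dict[str, VariableTypeHandlers] = {}


def register_variable_type(name, handlers):
    """Add a variable type, refusing to silently overwrite an existing one."""
    if name in REGISTRY:
        raise ValueError(f"variable type already registered: {name}")
    REGISTRY[name] = handlers


# Example: a boolean type handled numerically (placeholder handlers only)
register_variable_type(
    "boolean",
    VariableTypeHandlers(
        default_transformation=lambda values: [1.0 if v else 0.0 for v in values],
        input_layer_handler=lambda variable: f"input layer for {variable}",
        output_layer_handler=lambda variable: f"output layer for {variable}",
    ),
)

boolean_handlers = REGISTRY["boolean"]
transformed = boolean_handlers.default_transformation([True, False, True])
```

Bundling the three pieces in one record makes it harder to add a variable type with a transformation but no matching layer handlers.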