Working sphiunx build
bjherger committed May 26, 2018
1 parent 0138371 commit b91f92f
Showing 46 changed files with 15,746 additions and 7 deletions.
Binary file added docs/_build/doctrees/environment.pickle
Binary file not shown.
Binary file added docs/_build/doctrees/index.doctree
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file added docs/_build/doctrees/planning/interface.doctree
Binary file not shown.
Binary file not shown.
4 changes: 4 additions & 0 deletions docs/_build/html/.buildinfo
@@ -0,0 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 19bbadb7a81fa723ab573d8e34479952
tags: 645f666f9bcd5a90fca523b33c5a78b7
24 changes: 24 additions & 0 deletions docs/_build/html/_sources/index.rst.txt
@@ -0,0 +1,24 @@
.. keras-pandas documentation master file, created by
   sphinx-quickstart on Fri May 25 16:33:06 2018.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to keras-pandas's documentation!
========================================

.. toctree::
   :maxdepth: 2
   :caption: Contents:

.. automodule:: Automater
   :members:
   :undoc-members:
   :show-inheritance:


Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
126 changes: 126 additions & 0 deletions docs/_build/html/_sources/planning/POC_notes.md.txt
@@ -0,0 +1,126 @@
# Backlog

- Interface: Need to determine options (SKLearn transformer, custom interface, etc)
- Interface: Need to outline functionality
- Boolean: Need to determine if it'll be handled as numerical or categorical
- Pip installable: Need to determine level of effort

# Interface: Need to determine options (SKLearn transformer, custom interface, etc)

Options:

- SKLearn transformer
- Pandas-SKLearn style module
- Custom module

Requirements:

- Be able to expand one column to many columns (datetime)
- Ability to use SKLearn transformers
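
The "one column to many columns" requirement can be illustrated with a minimal sketch. The function name `expand_datetime` and the chosen output fields are assumptions made for illustration, not part of the planned interface. Naive datetimes are treated as UTC here so the epoch value is deterministic.

```python
from datetime import datetime, timezone


def expand_datetime(value):
    """Expand a single datetime value into several derived columns.

    Hypothetical helper: the field names below are illustrative only.
    """
    # Treat naive datetimes as UTC so the epoch value does not depend
    # on the local timezone of the machine running the code.
    if value.tzinfo is None:
        value = value.replace(tzinfo=timezone.utc)
    return {
        "year": value.year,
        "month": value.month,
        "day_of_week": value.weekday(),  # Monday == 0
        "hour": value.hour,
        "epoch_seconds": value.timestamp(),
    }


row = expand_datetime(datetime(2018, 5, 26, 12, 0, 0))
```

A single input column becomes five output columns, which is the behavior a plain one-to-one transformer cannot express.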

## SKLearn transformer

- Inputs would have to be Numpy arrays
- Inputs can be Numpy arrays
- Tighter integration to SKLearn infrastructure

## Pandas-SKLearn style module

- Can have pandas dataframe inputs
- Can have multiple column inputs
- Can apply multiple transformations to the same column (many to one relationship)
- Can leave some columns untransformed
- Can use default transformer

## Custom module

- Can sit on top of Pandas-SKLearn
- Can mimic SKLearn `fit` and `transform` interface
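
As a rough sketch of what mimicking the `fit` and `transform` interface could look like, the example below wires per-column transformers behind a single object. The class names `ColumnPipeline` and `RangeScaler` are hypothetical, and a plain dict of lists stands in for a Pandas dataframe so the example has no dependencies.

```python
class RangeScaler:
    """Hypothetical per-column transformer: rescales values to [0, 1]."""

    def fit(self, values):
        self.low, self.high = min(values), max(values)
        return self

    def transform(self, values):
        # Guard against a zero span when all values are identical.
        span = (self.high - self.low) or 1.0
        return [(v - self.low) / span for v in values]


class ColumnPipeline:
    """Hypothetical wrapper exposing an SKLearn-like fit / transform pair."""

    def __init__(self, column_transformers):
        # Mapping of column name -> object with fit(values) / transform(values).
        self.column_transformers = column_transformers

    def fit(self, frame):
        for column, transformer in self.column_transformers.items():
            transformer.fit(frame[column])
        return self  # returning self allows chaining, as SKLearn does

    def transform(self, frame):
        return {
            column: transformer.transform(frame[column])
            for column, transformer in self.column_transformers.items()
        }


frame = {"age": [20, 30, 40], "name": ["a", "b", "c"]}
pipeline = ColumnPipeline({"age": RangeScaler()}).fit(frame)
result = pipeline.transform(frame)
```

Columns without a registered transformer (`name` above) are simply not emitted; a real implementation would need to decide whether to pass them through instead.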

# Need to determine if it'll be handled as numerical or categorical

## Numerical

- Less compute time
- Reduced complexity

## Categorical

- Embedding representing different values

# Pip installable: Need to determine level of effort

## Lit review

- PMOTW: Setuptools
- Common library: setuptools
- PMOTW: distutils
- Common library: distutils
- Blog review

## PMOTW: Setuptools

- Unavailable

## [Common library: setuptools](https://setuptools.readthedocs.io/en/latest/)

- Designed to facilitate packaging Python projects
- Enhancement to distutils

Highlights

- Dependency resolution & downloading
- Create eggs
- Automatically create wrapper scripts & .exe files
- [Decent getting started guide](https://setuptools.readthedocs.io/en/latest/setuptools.html#basic-use)

Superset of distutils

## PMOTW: distutils

- Unavailable

## [Common library: distutils](https://docs.python.org/2/distutils/)

[Intro](https://docs.python.org/2/distutils/introduction.html)

- Setup script
- Source distribution
- Binary distributions

Setup script

- Handles packaging
- Not aware of package managers

[PyPI](https://docs.python.org/2/distutils/packageindex.html#pypi-overview)

- Registering
- Upload

## Blogs

- [Marthall](https://marthall.github.io/blog/how-to-package-a-python-app/)
- [Stack Overflow](https://stackoverflow.com/questions/9411494/how-do-i-create-a-pip-installable-project)

## [Marthall](https://marthall.github.io/blog/how-to-package-a-python-app/)

- Strong, convenient walk through

## [Stack Overflow](https://stackoverflow.com/questions/9411494/how-do-i-create-a-pip-installable-project)

- Rambling

## [Scott Torborg](http://python-packaging.readthedocs.io/en/latest/)

- Strong advanced discussion
- How to declare dependencies

## [PyPI docs](https://packaging.python.org/tutorials/distributing-packages/#uploading-your-project-to-pypi)

## [twine quickstart](https://packaging.python.org/tutorials/distributing-packages/)

# Decisions

- Interface: Will use custom interface, similar to SKLearn, with Pandas-SKLearn under the hood.
- Pip installable: Will move forward w/ setuptools
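
For the setuptools decision, the sketch below shows the kind of arguments a `setup.py` might pass to `setuptools.setup`. Everything in it is an assumption for illustration: the version number, the package directory name `keras_pandas`, and the dependency list are placeholders, not the project's actual configuration.

```python
def build_setup_kwargs():
    """Return keyword arguments for setuptools.setup (illustrative values only)."""
    return {
        "name": "keras-pandas",
        "version": "0.0.1",  # placeholder version
        "packages": ["keras_pandas"],  # hypothetical package directory
        # Hypothetical dependency list; pin versions as the project requires.
        "install_requires": ["keras", "pandas", "sklearn-pandas"],
    }


# In an actual setup.py the arguments would be handed to setuptools:
#
#     from setuptools import setup
#     setup(**build_setup_kwargs())

kwargs = build_setup_kwargs()
```

Declaring dependencies in `install_requires` is what lets pip resolve and download them at install time.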
48 changes: 48 additions & 0 deletions docs/_build/html/_sources/planning/brainstorming.md.txt
@@ -0,0 +1,48 @@
# Elevator pitch

# Audience

- Semi-technical
- ML background, new to deep learning
- Familiarity with Python data ecosystem (Pandas, SKLearn, numpy)
- Interested in getting started in deep learning
- Interested in quickly prototyping models

# Goals

- Easy to use interface
- Sensible default actions
- Low barrier to entry to create Keras input / output layers

# Requirements

## Backlog

- Numerical inputs: Null handling, Z score normalization
- Categorical inputs: Create embedding, handle unseen levels
- Boolean inputs: Handle appropriately
- Datetime: Extract categorical fields, treat as epoch time if possible.
- Test run: train on random sample of data
- Convenient interface
- Logging
- Unit tests
- Appropriate exceptions
- Pip installable
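
The numerical-input item (null handling plus Z score normalization) could be sketched as follows. The class name `ZScoreNormalizer` and the choice to impute nulls with the training mean are assumptions; other imputation strategies would be equally valid.

```python
from statistics import mean, pstdev


class ZScoreNormalizer:
    """Hypothetical transformer: mean-impute nulls, then apply Z score scaling."""

    def fit(self, values):
        # Statistics are computed on observed (non-null) values only.
        observed = [v for v in values if v is not None]
        self.mean_ = mean(observed)
        # Fall back to 1.0 so constant columns do not divide by zero.
        self.std_ = pstdev(observed) or 1.0
        return self

    def transform(self, values):
        # Nulls are replaced by the training mean, so they map to 0.0.
        filled = [self.mean_ if v is None else v for v in values]
        return [(v - self.mean_) / self.std_ for v in filled]


normalizer = ZScoreNormalizer().fit([1.0, 2.0, None, 3.0])
scaled = normalizer.transform([1.0, None, 3.0])
```

Fitting on training data and reusing the stored mean and standard deviation at transform time keeps test-time inputs on the same scale.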

## Prioritized backlog

- Unit tests
- Logging
- Numerical inputs: Null handling, Z score normalization
- Categorical inputs: Create embedding, handle unseen levels
- Boolean inputs: Handle appropriately
- Datetime: Extract categorical fields, treat as epoch time if possible.
- Appropriate exceptions
- Pip installable
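
For the categorical item, one possible way to handle unseen levels is to reserve an index for them before the values reach an embedding layer. The sketch below assumes that approach; the class name `LevelEncoder` and the reserved index `0` are illustrative choices.

```python
class LevelEncoder:
    """Hypothetical encoder mapping categorical levels to integer indices."""

    UNSEEN_INDEX = 0  # reserved for levels not present during fit

    def fit(self, values):
        # Sort for a deterministic mapping; known levels start at 1.
        levels = sorted(set(values))
        self.index_ = {level: i for i, level in enumerate(levels, start=1)}
        return self

    def transform(self, values):
        return [self.index_.get(v, self.UNSEEN_INDEX) for v in values]

    @property
    def vocabulary_size(self):
        # Known levels plus the reserved unseen slot; an embedding layer
        # would need at least this many rows.
        return len(self.index_) + 1


encoder = LevelEncoder().fit(["red", "green", "red", "blue"])
codes = encoder.transform(["green", "purple", "blue"])
```

Because `purple` was never seen during `fit`, it falls back to the reserved index instead of raising an error.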

## POC items

- Interface: Need to determine options (SKLearn transformer, custom interface, etc)
- Interface: Need to outline functionality
- Boolean: Need to determine if it'll be handled as numerical or categorical
- Pip installable: Need to determine level of effort
43 changes: 43 additions & 0 deletions docs/_build/html/_sources/planning/interface.md.txt
@@ -0,0 +1,43 @@
# Public API

## Methods

- init
- fit
- transform
- datetime_expansion_method_dict
- constants
- set_embedding_size_function
- set_embedding_size_for_variable
- get_transformers
- get_transformer
- list_default_transformation_pipelines
- list_input_variables
- list_output_variables

## Attributes

(no public attributes)


# Internal API

## Methods

- check_variable_lists
- check_input_dataframe_columns
- check_output_dataframe_columns
- datetime_expansion
- get_input_nub
- get_output_nub

## Attributes

- sklearn_pandas_object
- datetime_expansion_method_dict
- embedding_size_function
- variable_transformer_dict
- input_variables
- output_variables
- input_nub
- output_nub
5 changes: 5 additions & 0 deletions docs/_build/html/_sources/planning/new_var_type.md.txt
@@ -0,0 +1,5 @@
Workflow for new variable type:

- Add default transformation type
- Add input layer handler
- Add output layer handler
Binary file added docs/_build/html/_static/ajax-loader.gif