This repository has been archived by the owner on Jan 9, 2024. It is now read-only.

Commit

Merge d71a2d5 into fe257b0
adithyabsk committed Jun 21, 2019
2 parents fe257b0 + d71a2d5 commit 63d49be
Showing 12 changed files with 376 additions and 98 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -114,3 +114,4 @@ pip-wheel-metadata/
 .DS_Store
 *.sublime-project
 *.sublime-workspace
+*.svg
3 changes: 2 additions & 1 deletion Makefile
@@ -1,10 +1,11 @@
 CWD=$(shell pwd)
 PKG=foreshadow
+TST=tests

 clean:
 	find ./$(PKG) -name "*.pyc" -exec rm -rfv {} \;

 test:
-	tox -r
+	poetry run tox -r

 .PHONY: test clean
29 changes: 20 additions & 9 deletions README.rst
@@ -1,20 +1,31 @@
-foreshadow
-==========
+Foreshadow: Simple Machine Learning Scaffolding
+===============================================

-|License| |BuildStatus| |Coverage| |Code style: black|
+|BuildStatus| |DocStatus| |Coverage| |CodeStyle| |License|

+Foreshadow is an automatic pipeline generation tool that makes creating, iterating,
+and evaluating machine learning pipelines a fast and intuitive experience allowing
+data scientists to spend more time on data science and less time on code.

-.. |License| image:: https://img.shields.io/badge/License-Apache%202.0-blue.svg
-:target: https://github.com/georgianpartners/foreshadow/blob/master/LICENSE
 .. |BuildStatus| image:: https://travis-ci.org/georgianpartners/foreshadow.svg?branch=master
-:target: https://travis-ci.org/georgianpartners/foreshadow
+:target: https://travis-ci.org/georgianpartners/foreshadow
+:alt: Build Status

+.. |DocStatus| image:: https://readthedocs.org/projects/foreshadow/badge/?version=latest
+:target: https://foreshadow.readthedocs.io/en/latest/?badge=latest
+:alt: Documentation Status

 .. |Coverage| image:: https://coveralls.io/repos/github/georgianpartners/foreshadow/badge.svg?branch=development
-:target: https://coveralls.io/github/georgianpartners/foreshadow
-.. |Code style: black| image:: https://img.shields.io/badge/code%20style-black-000000.svg
-:target: https://github.com/ambv/black
+:target: https://coveralls.io/github/georgianpartners/foreshadow
+:alt: Coverage

+.. |CodeStyle| image:: https://img.shields.io/badge/code%20style-black-000000.svg
+:target: https://github.com/ambv/black
+:alt: Code Style

+.. |License| image:: https://img.shields.io/badge/License-Apache%202.0-blue.svg
+:target: https://github.com/georgianpartners/foreshadow/blob/master/LICENSE
+:alt: License

Installing Foreshadow
---------------------
26 changes: 26 additions & 0 deletions doc/architecture.rst
@@ -0,0 +1,26 @@
.. _architecture:

Project Architecture
====================

.. note::
Open the diagram in a new tab to see the full details.

UML Class Diagram
-----------------

.. uml:: foreshadow_class.uml

UML Sequence Diagrams
---------------------

Main Sequence Diagram
^^^^^^^^^^^^^^^^^^^^^

.. uml:: foreshadow_sequence_main.uml


Hyperparameter Optimization Sequence Diagram
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. uml:: foreshadow_sequence_hyp.uml
9 changes: 9 additions & 0 deletions doc/conf.py
@@ -13,6 +13,7 @@
 # documentation root, use os.path.abspath to make it absolute, like shown here.
 #
 import os
+import platform
 import sys


@@ -59,6 +60,7 @@ def get_version():
     "sphinx.ext.ifconfig",
     "sphinx.ext.viewcode",
     "sphinx.ext.napoleon",
+    "sphinxcontrib.plantuml",
 ]

# Autodoc Settings
@@ -205,3 +207,10 @@ def get_version():

 # Additional Modifications
 add_module_names = False
+
+# Plant UML
+plantuml = "{} -Djava.awt.headless=true".format(
+    "/usr/local/bin/plantuml"
+    if platform.system() == "Darwin"
+    else "/usr/bin/plantuml"
+)
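
The `plantuml` setting above hard-codes one install location per platform. As a sketch only (it is not part of this commit), the executable could instead be looked up on `PATH`, falling back to the same two defaults. The helper name `resolve_plantuml` and its injectable `which`/`system` parameters are assumptions made for illustration.

```python
import platform
import shutil


def resolve_plantuml(which=shutil.which, system=platform.system):
    """Build the command string assigned to the `plantuml` Sphinx setting.

    Hypothetical helper: prefer whatever `plantuml` is found on PATH,
    otherwise fall back to the per-platform defaults used in the commit.
    """
    found = which("plantuml")
    if found is None:
        # Same fallbacks as the commit: Homebrew path on macOS, apt path elsewhere.
        found = (
            "/usr/local/bin/plantuml"
            if system() == "Darwin"
            else "/usr/bin/plantuml"
        )
    return "{} -Djava.awt.headless=true".format(found)


plantuml = resolve_plantuml()
```

Passing `which` and `system` as parameters keeps the lookup testable without touching the real environment.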
9 changes: 8 additions & 1 deletion doc/developers.rst
@@ -84,12 +84,19 @@ Install all the packages and commit hooks
 .. _pre-commit: https://pre-commit.com/

 .. code-block:: console
-(venv) $ poetry install -v
+(venv) $ export CC=gcc-5; export CXX=g++-5;
+(venv) $ poetry install -E dev
 (venv) $ poetry run pre-commit install
+Configure PlantUML
+
+.. code-block:: console
+(venv) $ brew install plantuml # MacOS (requires brew cask install adoptopenjdk)
+(venv) $ sudo apt install plantuml # Linux

Making sure everything works
1. Run pytest to make sure you're good to go

169 changes: 169 additions & 0 deletions doc/foreshadow_class.uml
@@ -0,0 +1,169 @@
@startuml

skinparam BackgroundColor transparent
skinparam Shadowing false

' Diagram setup
hide empty members
left to right direction
set namespaceSeparator none

package foreshadow.utils {
class check_df << (M,lemonchiffon) >>
}
package foreshadow.logging {
class ForeshadowLogger
}
package foreshadow.intents {
abstract class BaseIntent {
list engineering_pipeline
list preprocessing_pipeline
resolve_intent()
}

class DropIntent
class NumericalIntent
class CategoricalIntent
class TextIntent

BaseIntent <|-- DropIntent
BaseIntent <|-- NumericalIntent
BaseIntent <|-- CategoricalIntent
BaseIntent <|-- TextIntent

note "Config for intentless transformations (cleaner) are placed in\nBaseIntent's specification" as N1
}

together {
package foreshadow.transformers.core {
abstract class SmartTransformer {
bool fixed_fit
log_decision()
}

class ParallelProcessor
class SigCopy
class DropFeature

class wrap_transformer << (M,lemonchiffon) >>
wrap_transformer o-- SigCopy
}

package foreshadow.transformers.smart {
SmartTransformer <|-- Cleaner
SmartTransformer <|-- Engineerer
SmartTransformer <|-- Scaler
SmartTransformer <|-- Imputer
SmartTransformer <|-- CategoricalEncoder
SmartTransformer <|-- TextEncoder
SmartTransformer <|-- Reducer
}

package foreshadow.transformers.internal {
class FancyImpute
class UncommonRemover
class BoxCox

FancyImpute o-- Imputer
CategoricalEncoder o-- UncommonRemover
Scaler o-- BoxCox

class DaysSince
class NumericalFeatuerizer
class CategoricalFeatuerizer

Engineerer o-- DaysSince
Engineerer o-- NumericalFeatuerizer
Engineerer o-- CategoricalEncoder

class ToString
class SplitDate
class FinancialCleaner

Cleaner o-- ToString
Cleaner o-- SplitDate
Cleaner o-- FinancialCleaner

class Boruta
class Hypothesis

Reducer o-- Boruta
Reducer o-- Hypothesis
}

package foreshadow.transformers.external {
note "All sklearn transformers are mirrored\nand pandas wrapped here." as N3
}
}

package foreshadow.config {
class ConfigManager {
json framework_config
json user_config
json local_config
}
}

package foreshadow.tuners {
class WrappedTuner {
BaseEstimator tuner_type
}
}
package foreshadow.core {
abstract class BaseFeatureMapper {
split_columns()
join_columns()
}

class Foreshadow
class DataPreparer {
bool is_y_var
}
class FeatureCleaner
class IntentResolver
class FeatureEngineerer
class FeaturePreprocessor
class FeatureReducer

class ColumnInfoSharer

class SerializerMixin << (X,peru) >>

Foreshadow "0..2" o-- DataPreparer
Foreshadow "0..1" o-- sklearn.RandomizedSearchCV

DataPreparer o-- FeatureCleaner
DataPreparer o-- IntentResolver
DataPreparer "0..1" o-- FeatureEngineerer
DataPreparer o-- FeaturePreprocessor
DataPreparer "0..1" o-- FeatureReducer

SerializerMixin <|-- DataPreparer
SerializerMixin <|-- FeatureCleaner
SerializerMixin <|-- IntentResolver
SerializerMixin <|-- FeatureEngineerer
SerializerMixin <|-- FeaturePreprocessor
SerializerMixin <|-- FeatureReducer

BaseFeatureMapper <|-- DataPreparer
BaseFeatureMapper <|-- FeatureCleaner
BaseFeatureMapper <|-- IntentResolver
BaseFeatureMapper <|-- FeatureEngineerer
BaseFeatureMapper <|-- FeaturePreprocessor
BaseFeatureMapper <|-- FeatureReducer
}
package foreshadow.estimators {
class MetaEstimator
class AutoEstimator

MetaEstimator "0..1" o-- AutoEstimator
MetaEstimator o-- DataPreparer

Foreshadow "0..1" o-- MetaEstimator
}
package sklearn.base {
class TransformerMixin << (X,peru) >>
class BaseEstimator
}

@enduml
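
For readers who prefer code to UML, the `foreshadow.core` part of the class diagram can be approximated in Python. This is a rough sketch under stated assumptions, not the project's implementation: the class names and the `is_y_var` attribute come from the diagram, while the empty class bodies, the `engineer`/`reduce` flags, the `steps` attribute, and the step ordering are hypothetical.

```python
class SerializerMixin:
    """Stand-in for the mixin in the diagram; its interface is not shown there."""


class BaseFeatureMapper:
    """Stand-in for the abstract base; the diagram lists split_columns/join_columns."""


class FeatureCleaner(BaseFeatureMapper, SerializerMixin):
    pass


class IntentResolver(BaseFeatureMapper, SerializerMixin):
    pass


class FeatureEngineerer(BaseFeatureMapper, SerializerMixin):
    pass


class FeaturePreprocessor(BaseFeatureMapper, SerializerMixin):
    pass


class FeatureReducer(BaseFeatureMapper, SerializerMixin):
    pass


class DataPreparer(BaseFeatureMapper, SerializerMixin):
    def __init__(self, engineer=True, reduce=True, is_y_var=False):
        # `engineer` and `reduce` are hypothetical flags modelling the
        # "0..1" multiplicities on FeatureEngineerer and FeatureReducer.
        self.is_y_var = is_y_var
        steps = [FeatureCleaner(), IntentResolver()]
        if engineer:
            steps.append(FeatureEngineerer())
        steps.append(FeaturePreprocessor())
        if reduce:
            steps.append(FeatureReducer())
        # Assumed ordering: the order in which the diagram lists the steps.
        self.steps = steps


full = DataPreparer()
minimal = DataPreparer(engineer=False, reduce=False)
```

Here `full.steps` holds all five step objects and `minimal.steps` holds only the three mandatory ones.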
29 changes: 29 additions & 0 deletions doc/foreshadow_sequence_hyp.uml
@@ -0,0 +1,29 @@
@startuml

skinparam BackgroundColor transparent
skinparam Shadowing false

participant User

User -> Foreshadow: ~__init__()

note over Foreshadow
pipeline = Pipeline([
('dp', DataPreparer()),
('lr', LogisticRegression()),
])
end note

Foreshadow -> OptimizerWrapper: ~__init__(pipeline, RandomizedSearchCV)

User -> Foreshadow: fit(X, y)
Foreshadow -> OptimizerWrapper: fit(X, y)
OptimizerWrapper -> RandomizedSearchCV ++: fit_pipelines(X, y)
return best_pipeline
OptimizerWrapper -> RandomizedSearchCV ++: fit_params(X, y)
return best_pipeline_params

OptimizerWrapper --> Foreshadow: self
Foreshadow --> User: self

@enduml
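
The call order in the hyperparameter optimization diagram can be traced with a few stand-in classes. This is a sketch only: the participant and message names (`OptimizerWrapper`, `fit_pipelines`, `fit_params`) are taken from the diagram, but every class body here is a hypothetical stub that records calls. `StubSearch` merely plays the role labelled `RandomizedSearchCV` and does not use scikit-learn's API.

```python
# Records the order of calls so it can be compared with the diagram.
calls = []


class StubSearch:
    """Plays the role labelled RandomizedSearchCV in the diagram (not sklearn)."""

    def fit_pipelines(self, X, y):
        calls.append("search.fit_pipelines")
        return "best_pipeline"

    def fit_params(self, X, y):
        calls.append("search.fit_params")
        return "best_pipeline_params"


class OptimizerWrapper:
    def __init__(self, pipeline, search_cls):
        self.pipeline = pipeline
        self.search = search_cls()

    def fit(self, X, y):
        calls.append("optimizer.fit")
        # First pick a pipeline, then tune its parameters, as in the diagram.
        self.best_pipeline_ = self.search.fit_pipelines(X, y)
        self.best_params_ = self.search.fit_params(X, y)
        return self


class Foreshadow:
    def __init__(self):
        # Placeholder strings stand in for the real pipeline steps in the note.
        pipeline = [("dp", "DataPreparer()"), ("lr", "LogisticRegression()")]
        self.optimizer = OptimizerWrapper(pipeline, StubSearch)

    def fit(self, X, y):
        calls.append("foreshadow.fit")
        self.optimizer.fit(X, y)
        return self


model = Foreshadow().fit([[0.0], [1.0]], [0, 1])
```

After the final line, `calls` lists the four messages in the same top-to-bottom order as the diagram, and `fit` returns `self` at each level.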
