Add Support for Other ML Libraries #397

Closed
ndanielsen opened this issue May 11, 2018 · 18 comments · Fixed by #1103

Comments

@ndanielsen
Contributor

We currently have an experimental wrapper for StatsModels in #383.

It would be fantastic if we could add visualizer support for TensorFlow, Keras, and other ML libraries, as well as classification selection visualizers using ATM.

@mitevpi
Contributor

mitevpi commented Jun 5, 2018

I think there's some great low hanging fruit here in the short term, with even more exciting opportunities for the long term.

One thing I want to understand better: could you clarify how you were envisioning the integration of ATM? Are you thinking that ATM tuning would occur prior to visualization, or that visualization would be separate from any model tuning?

@bbengfort
Member

@mitevpi I think the initial idea with ATM was to create a multi-model comparison visualizer that could be generated either during tuning to show progress, or after tuning to demonstrate the scope of the autotuning and its findings (so both?). Because this visualization would be ATM-specific, and possibly require some interactivity/liveness, we haven't really had the opportunity to do a deep dive into what this might look like.

@rebeccabilbro
Member

CC: @carlomazzaferro

@mattharrison

Was just trying to create a ResidualsPlot with a TensorFlow regression model today. Since the Keras API has a fit method similar to sklearn's, this should be really low hanging fruit.

@bbengfort
Member

@mattharrison did you get an exception when you tried to use the ResidualsPlot with the Keras model? I agree that it is low hanging fruit, and we've had good experience in the past with things just working, like XGBoost.

@mattharrison

I did. Here's some code to create a model:

# Imports assumed from context (added for completeness); X and y are the
# training data, defined elsewhere.
import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt
from yellowbrick.regressor import ResidualsPlot

# Create the model
tf_model1 = keras.Sequential([
    # hidden layer 1
    keras.layers.Dense(64, activation=tf.nn.relu,
                       input_shape=(1,)),  # shape is num features
    # hidden layer 2
    keras.layers.Dense(64, activation=tf.nn.relu),
    keras.layers.Dense(1)
])
optimizer = tf.train.RMSPropOptimizer(0.001)
tf_model1.compile(loss='mse', optimizer=optimizer,
                  metrics=['mae'])
tf_model1.summary()

# Train the model
class TheDot(keras.callbacks.Callback):
    """Prints a dot every 10 epochs to show training progress."""
    def on_epoch_end(self, epoch, logs):
        if epoch % 10 == 0:
            print('.', end='')

history = tf_model1.fit(X, y, epochs=500, validation_split=.2,
                        verbose=0, callbacks=[TheDot()])


# Yellowbrick version
fig, ax = plt.subplots(figsize=(10, 10))
res_viz = ResidualsPlot(tf_model1)
res_viz.fit(X, y)
res_viz.score(X, y)
res_viz.poof()

And here is the error:

--------------------------------------------------------------------------
YellowbrickTypeError                      Traceback (most recent call last)
<ipython-input-117-51e29b0dcce8> in <module>
      1 # Yellowbrick version
      2 fig, ax = plt.subplots(figsize=(10, 10))
----> 3 res_viz = ResidualsPlot(tf_model2)
      4 res_viz.fit(X_train, y_train)
      5 res_viz.score(X_test, y_test)

~/.env/364/lib/python3.6/site-packages/yellowbrick/regressor/residuals.py in __init__(self, model, ax, hist, train_color, test_color, line_color, **kwargs)
    377                  test_color='g', line_color=LINE_COLOR, **kwargs):
    378 
--> 379         super(ResidualsPlot, self).__init__(model, ax=ax, **kwargs)
    380 
    381         # TODO: allow more scatter plot arguments for train and test points

~/.env/364/lib/python3.6/site-packages/yellowbrick/regressor/base.py in __init__(self, model, ax, **kwargs)
     46         if not isregressor(model):
     47             raise YellowbrickTypeError(
---> 48                 "This estimator is not a regressor; try a classifier or "
     49                 "clustering score visualizer instead!"
     50         )

YellowbrickTypeError: This estimator is not a regressor; try a classifier or clustering score visualizer instead!

@rebeccabilbro
Member

Ah, thank you @mattharrison! You know, I think @carlomazzaferro was working on this a while back -- perhaps he would be interested?

@rebeccabilbro
Member

See also #637!

@carlomazzaferro

Just as a heads up, you can make keras work with yellowbrick with minimal effort, like this:

from sklearn.datasets import make_regression
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from yellowbrick.regressor import ResidualsPlot
from sklearn.model_selection import train_test_split
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.base import BaseEstimator, RegressorMixin
import tensorflow as tf


class KerasClf(BaseEstimator, RegressorMixin, KerasRegressor):
    # RegressorMixin supplies the _estimator_type attribute that
    # Yellowbrick's isregressor() check looks for
    def __init__(self, build_fn, **kwargs):
        super(KerasRegressor, self).__init__(build_fn, **kwargs)


def keras_model():
    optimizer = tf.train.RMSPropOptimizer(0.001)
    model = Sequential()
    model.add(Dense(64, input_dim=2, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='mse', optimizer=optimizer,
                  metrics=['mae'])

    return model


if __name__ == '__main__':
    x, y = make_regression(n_features=2, n_informative=1)
    X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2)
    kf = KerasClf(keras_model, epochs=150, batch_size=10, verbose=0)
    r = ResidualsPlot(kf)
    r.fit(X_train, y_train)  
    r.score(X_test, y_test)
    r.poof()

Note that you need to import the sklearn wrapper for keras, which unfortunately is not fully compliant with the sklearn API and does not have the required attribute _estimator_type, hence the need to subclass BaseEstimator and RegressorMixin (or ClassifierMixin for a classifier).

The same pattern holds for a classifier (tested it against the ROCAUC API).
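
For reference, here is a minimal sketch of what that classifier variant could look like; this is not the original poster's code, and the class name, build function, and toy dataset are illustrative only:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.base import BaseEstimator, ClassifierMixin
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from yellowbrick.classifier import ROCAUC


class KerasBinaryClf(BaseEstimator, ClassifierMixin, KerasClassifier):
    # ClassifierMixin supplies _estimator_type = "classifier", which is
    # what Yellowbrick's isclassifier() check looks for
    def __init__(self, build_fn, **kwargs):
        super(KerasClassifier, self).__init__(build_fn, **kwargs)


def keras_clf_model():
    model = Sequential()
    model.add(Dense(16, input_dim=2, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='rmsprop',
                  metrics=['accuracy'])
    return model


if __name__ == '__main__':
    X, y = make_classification(n_features=2, n_informative=2, n_redundant=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
    clf = KerasBinaryClf(keras_clf_model, epochs=50, batch_size=10, verbose=0)
    viz = ROCAUC(clf)
    viz.fit(X_train, y_train)
    viz.score(X_test, y_test)
    viz.poof()
```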

@ndanielsen
Contributor Author

Thanks for sharing this!

@ndanielsen
Contributor Author

@carlomazzaferro thanks so much for this work. This is a clever implementation. I'm looking forward to playing with this.

You're also welcome to contribute this to contrib, where we are placing experimental and research-related projects like this.

@carlomazzaferro

Thanks @ndanielsen! I had worked on something similar for skorch as well, which plays a very similar role to keras' scikit-learn wrapper, but for pytorch.

I think these small examples could maybe be added to the docs, or at least to some sample notebooks? I know that adding full support for keras, torch, or tensorflow would be a massive undertaking, but some minimal working examples could be useful for people who are currently using the high-level wrappers.

Something along the lines of #637.

@ndanielsen
Contributor Author

@carlomazzaferro that's an even better idea. I think that we can likely add a reference in the docs to the example notebooks.

@carlomazzaferro

Cool! I'll work on getting some examples going and I'll do a pull request.

@ResidentMario

I independently had the same thought and took a shot at mating yellowbrick to keras myself. Here's a write-up, which is currently in draft: "Evaluating Keras neural network performance using Yellowbrick visualizations".

I'd be happy to try and figure out if broad compatibility is possible for a PR, if that's still of interest. The tricky bit is that for most of the interesting plot types, e.g. ROCAUC, yellowbrick calls (IIRC) model.clone(), which would be non-trivial to hack in, but might be possible.
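
To make the clone() concern concrete, here is a rough sketch (an assumption about the approach, not tested against Yellowbrick internals): sklearn.base.clone rebuilds an estimator from get_params(), which BaseEstimator derives from the constructor signature, so a thin wrapper that stores its constructor arguments verbatim and builds the network lazily in fit() can be cloned, while a raw Keras model cannot. The name LazyKerasRegressor and its parameters are hypothetical.

```python
from sklearn.base import BaseEstimator, RegressorMixin, clone


class LazyKerasRegressor(BaseEstimator, RegressorMixin):
    """Hypothetical wrapper: the network is only built inside fit()."""

    def __init__(self, build_fn=None, epochs=10, batch_size=32):
        # BaseEstimator.get_params() introspects these attributes, so
        # clone() can produce a fresh, unfitted copy of the wrapper
        self.build_fn = build_fn
        self.epochs = epochs
        self.batch_size = batch_size

    def fit(self, X, y):
        self.model_ = self.build_fn()
        self.model_.fit(X, y, epochs=self.epochs,
                        batch_size=self.batch_size, verbose=0)
        return self

    def predict(self, X):
        return self.model_.predict(X).ravel()


# clone() returns an unfitted copy built from the stored constructor args,
# which is what visualizers that re-fit the estimator internally need, e.g.:
# fresh = clone(LazyKerasRegressor(build_fn=keras_model, epochs=150))
```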

@lwgray
Contributor

lwgray commented Mar 19, 2019

@ResidentMario Thank you for this awesome work. I encourage you to open a PR. However, we won’t be able to review it for 2-3 weeks as we are on a short hiatus. Here is a link to our contributor’s guide: http://www.scikit-yb.org/en/latest/contributing.html

Here is an example of a wrapper for statsmodels: https://github.com/DistrictDataLabs/yellowbrick/blob/d6ebc391e0c2e7aeb57ab14396ccc11b67ee0790/yellowbrick/contrib/statsmodels/base.py
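
For anyone following along, a short usage sketch of that statsmodels wrapper; this assumes the StatsModelsWrapper signature in the linked base.py (it takes a partial that builds the model from (y, X)), and the Gaussian GLM and toy data below are just illustrative choices:

```python
from functools import partial

import statsmodels.api as sm
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from yellowbrick.contrib.statsmodels.base import StatsModelsWrapper
from yellowbrick.regressor import ResidualsPlot

X, y = make_regression(n_features=4, n_informative=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# The wrapper takes a partial that constructs the statsmodels model from (y, X)
glm_gaussian = partial(sm.GLM, family=sm.families.Gaussian())
model = StatsModelsWrapper(glm_gaussian)

viz = ResidualsPlot(model)
viz.fit(X_train, y_train)
viz.score(X_test, y_test)
viz.poof()
```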

@ResidentMario

FYI, the write-up is now published to the Towards Data Science mailer on Medium.

@bbengfort
Member

@ResidentMario such a great write-up, thank you so much! We would certainly be interested in a contrib model for Keras if you ever want to pursue it!

rebeccabilbro pushed a commit that referenced this issue Oct 5, 2020
This PR introduces a wrapper for estimators that implement the scikit-learn API but do not extend BaseEstimator. If the estimator is missing required properties (generally the learned attributes), a sensible error is raised. Includes documentation about how to use non-sklearn estimators with Yellowbrick. Closes #1098, closes #1099, closes #397.
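
For readers landing here later, a short usage sketch of that wrapper, based on the commit description; the import path and function name are taken from yellowbrick.contrib.wrapper and should be checked against the merged code, and third_party_model is a placeholder for any estimator that exposes fit/predict/score without extending BaseEstimator:

```python
from yellowbrick.contrib.wrapper import regressor
from yellowbrick.regressor import ResidualsPlot

# third_party_model stands in for, e.g., a Keras scikit-learn wrapper or any
# other estimator that follows the sklearn fit/predict/score API
# wrapped = regressor(third_party_model)  # mark the estimator as a regressor
# viz = ResidualsPlot(wrapped)
# viz.fit(X_train, y_train)
# viz.score(X_test, y_test)
# viz.show()
```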