fixup scikit-learn PolynomialFeatures #273

Closed
ksaur opened this issue Aug 30, 2020 · 12 comments · Fixed by #763
Labels
help wanted Extra attention is needed

Comments

@ksaur
Collaborator

ksaur commented Aug 30, 2020

In #269, the basics are implemented, but to be fully complete we need:
(1) degree larger than 2
(2) support for interaction_only
(3) more tests
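
For reference, items (1) and (2) can be pictured with a small pure-Python sketch of the term enumeration. This is not Hummingbird's converter; the names `poly_term_indices` and `poly_expand_row` are hypothetical and exist only for illustration. The sketch assumes scikit-learn's term ordering (by degree, then lexicographically), where `interaction_only=True` swaps combinations with replacement for plain combinations so that no input is raised to a power.

```python
from itertools import chain, combinations, combinations_with_replacement
from math import prod


def poly_term_indices(n_features, degree, interaction_only=False, include_bias=True):
    """Return one tuple of input-column indices per output feature.

    Hypothetical helper: each tuple lists the columns multiplied together,
    e.g. (0, 0, 1) stands for x0 * x0 * x1 and () stands for the bias term.
    """
    comb = combinations if interaction_only else combinations_with_replacement
    start = 0 if include_bias else 1
    return list(
        chain.from_iterable(
            comb(range(n_features), d) for d in range(start, degree + 1)
        )
    )


def poly_expand_row(row, degree, interaction_only=False, include_bias=True):
    """Expand a single sample (a list of numbers) into its polynomial terms."""
    terms = poly_term_indices(len(row), degree, interaction_only, include_bias)
    # prod(()) == 1, which yields the bias column.
    return [prod(row[i] for i in idx) for idx in terms]


if __name__ == "__main__":
    print(poly_expand_row([2, 3], degree=2))
    print(poly_expand_row([2, 3], degree=3))
    print(poly_expand_row([2, 3, 5], degree=3, interaction_only=True))
```

A converter that handles arbitrary degree could precompute these index tuples once at conversion time and then only gather and multiply columns at inference time; whether that fits Hummingbird's tensor-based design is left open here.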

ksaur added the "help wanted" label Aug 30, 2020
@Hemantr05

@ksaur working on the same.

@ksaur
Collaborator Author

ksaur commented Sep 10, 2020

Welcome! Please reach out with questions as necessary! :)

@Hemantr05

@ksaur will do.

@Hemantr05

Hemantr05 commented Sep 13, 2020

@ksaur

import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from hummingbird.ml import convert
X = np.arange(6).reshape(1,6)
y = np.random.randint(2, size=6)

poly = PolynomialFeatures(1)
poly_x = poly.fit_transform(X)
poly.fit(X, y)

poly_convert = convert(poly, 'pytorch')

This is the error I get when running the above code with degree 1.

Unable to find converter for model type <class 'numpy.ndarray'>.
It usually means the pipeline being converted contains a
transformer or a predictor with no corresponding converter implemented.
Please fill an issue at https://github.com/microsoft/hummingbird.

Is this the expected error?
If so, I'll fix it.
Otherwise, could you please guide me to the error that should be produced?

@ksaur
Collaborator Author

ksaur commented Sep 14, 2020

If you see this error, it's possible that you don't have the latest hummingbird (0.0.6) installed, or that you are feeding it the wrong data type.

For "(1)" above: the error I get when I run your code (and the error I expect to see) is:

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-7-84b752131820> in <module>
      9 poly.fit(X, y)
     10 
---> 11 poly_convert = convert(poly, 'pytorch')
     12 

~/hummingbird/hummingbird/ml/convert.py in convert(model, backend, test_input, device, extra_config)
    250         return _convert_onnxml(model, backend, test_input, device, extra_config)
    251 
--> 252     return _convert_sklearn(model, backend, test_input, device, extra_config)

~/hummingbird/hummingbird/ml/convert.py in _convert_sklearn(model, backend, test_input, device, extra_config)
     78 
     79     # Convert the Topology object into a PyTorch model.
---> 80     hb_model = topology_converter(topology, backend, device, extra_config=extra_config)
     81     return hb_model
     82 

~/hummingbird/hummingbird/ml/_topology.py in convert(topology, backend, device, extra_config)
     74             )
     75         except Exception as e:
---> 76             raise e
     77 
     78     operators = list(topology.topological_operator_iterator())

~/hummingbird/hummingbird/ml/_topology.py in convert(topology, backend, device, extra_config)
     66                 extra_config[constants.TREE_IMPLEMENTATION] = "tree_trav"
     67 
---> 68             operator_map[operator.full_name] = converter(operator, device, extra_config)
     69         except ValueError:
     70             raise MissingConverter(

~/hummingbird/hummingbird/ml/operator_converters/sklearn/poly_features.py in convert_sklearn_poly_features(operator, device, extra_config)
     75 
     76     if operator.raw_operator.degree != 2:
---> 77         raise NotImplementedError("Hummingbird currently only supports degree 2 for PolynomialFeatures")
     78     return PolynomialFeatures(
     79         operator.raw_operator.n_input_features_,

NotImplementedError: Hummingbird currently only supports degree 2 for PolynomialFeatures

Please let me know if I misunderstood your question! Hopefully that makes sense!

@Hemantr05

@ksaur, I have the updated version of hummingbird, was able to reproduce the error you got.
As you mentioned, the input was incorrect.

@ksaur
Collaborator Author

ksaur commented Oct 2, 2020

Hi @Hemantr05, just checking in to see if you have any questions. :)

@Hemantr05

Hi @ksaur , none as of now.
Will be creating a PR in a couple of days.

@Hemantr05

@ksaur Apologies for the delay.
Will resolve this by the end of the week

@ereide

ereide commented Nov 11, 2020

Just came across this library. Great initiative. However, I second that there is a bug in the polynomial transformer:

This code runs, but the check at the bottom does not evaluate to True; it looks like something is wrong with the polynomial features.

import numpy as np
from hummingbird.ml import convert
from sklearn import pipeline, preprocessing, linear_model

# Create some random data for binary classification
num_classes = 2
N = 1000
X = np.random.rand(N, 28)
y = np.random.randint(num_classes, size=N)

# Create and train a model (a scikit-learn pipeline in this case)
skl_model = pipeline.make_pipeline(
    preprocessing.StandardScaler(),
    preprocessing.PolynomialFeatures(),
    linear_model.LinearRegression(),
)

skl_model.fit(X, y)

y_pred_skl = skl_model.predict(X)

# Use Hummingbird to convert the model to PyTorch
model = convert(skl_model, 'pytorch')

# Run predictions on CPU and compare them against scikit-learn
print(np.allclose(model.predict(X), y_pred_skl))
print(model.predict(X) - skl_model.predict(X))

@Hemantr05

@ereide working on fixing the same.

@ksaur
Collaborator Author

ksaur commented Nov 11, 2020

Hi @ereide Welcome!

Thanks for reporting this! PolynomialFeatures is only partially implemented, so yes, it is very likely that there is a bug. Thanks for this example! We will use it in our testing.
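
When turning an example like this into a test, one cheap first check is the width of the expanded feature matrix, since a wrong column count points at the transformer before any numerical comparison is needed. The sketch below is only a sanity-check helper under stated assumptions: the name `expected_poly_width` is hypothetical, and the formulas are the standard counts of monomials of total degree at most `degree` in `n_features` variables.

```python
from math import comb


def expected_poly_width(n_features, degree=2, interaction_only=False, include_bias=True):
    """Number of columns a polynomial expansion should produce.

    Hypothetical helper for tests; it does not call scikit-learn or Hummingbird.
    """
    if interaction_only:
        # One term per subset of distinct inputs with size 0..degree.
        width = sum(comb(n_features, k) for k in range(degree + 1))
    else:
        # Monomials of total degree <= degree, bias term included.
        width = comb(n_features + degree, degree)
    return width if include_bias else width - 1


if __name__ == "__main__":
    # The example above uses 28 input features and the default degree of 2.
    print(expected_poly_width(28, degree=2))
```

A test could compare this number against the second dimension of the converted model's transformed output before checking values with a tolerance-based comparison.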
