When using Pandas DataFrames #3
Comments
Yep, at the moment boruta expects a numpy array for X, but this is made explicit in the docstring of fit(). If you feel this is an important issue, please add this to fit() and I'll review your changes. Oh, you did, that's wonderful, cheers!
The examples show pandas going in. I suppose it would be just as easy to update the user docs to show that only numpy should be passed in. I built a 'pandas check', but that has the unfortunate side effect of adding a dependency. It appears that's how sklearn handles it as well, though. Toss-up, I'll leave you to decide which you like better :)
Hi Mike, Yep, I wanted it to have a scikit-learn interface, so I kind of instinctively stuck with numpy input, as sklearn does. I added a warning to the examples as you recommended, and renamed boruta_py2 to boruta_py_plus. I also left in your sanity check for pandas DataFrames just in case. Pandas is pretty common now, so it's not a major dependency issue imo. Thanks again for your valuable input, really appreciate it! Cheers,
_add_shadows_get_imps() fails when X is pandas rather than numpy
A pandas DataFrame cannot be sliced as
x_cur = np.copy(X[:, x_cur_ind])
Either of these works instead:
x_cur = np.copy(X.as_matrix()[:, x_cur_ind])
OR
x_cur = np.copy(X.ix[:, x_cur_ind])
I'd recommend testing/casting dataframes to numpy arrays in _fit
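For anyone landing here later, a minimal sketch of the "cast once up front" idea, not the actual boruta code. `as_numpy_2d` is a hypothetical helper name, and the small DataFrame is made-up illustration data. Note that `as_matrix()` and `.ix` were removed in pandas 1.0, so the sketch uses `np.asarray`, which also avoids importing pandas inside the library itself:

```python
import numpy as np
import pandas as pd


def as_numpy_2d(X):
    """Hypothetical helper: return X as a 2-D NumPy array.

    np.asarray accepts DataFrames, nested lists, and arrays alike,
    so the library code needs no pandas import for this check.
    """
    X = np.asarray(X)
    if X.ndim != 2:
        raise ValueError(f"X must be 2-dimensional, got ndim={X.ndim}")
    return X


# Made-up example data, only for illustration.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0],
                   "b": [4.0, 5.0, 6.0],
                   "c": [7.0, 8.0, 9.0]})
x_cur_ind = [0, 2]

# NumPy-style positional slicing directly on the DataFrame fails.
# The exact exception type differs between pandas versions.
try:
    df[:, x_cur_ind]
    sliced_directly = True
except Exception:
    sliced_directly = False

# Casting once at the top of fit lets the original slicing line work unchanged.
X = as_numpy_2d(df)
x_cur = np.copy(X[:, x_cur_ind])
```

Doing the conversion once at the start of fitting means the internal slicing code never has to know whether the caller passed a DataFrame or an array. The trade-off is that column names are lost after the cast, so they would have to be kept separately if they are needed for reporting.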