Support for DataFrame based data and model formulas #13

Open
saosebastiao opened this issue Jun 19, 2014 · 2 comments

Comments

@saosebastiao

Hello, thanks for writing this. I've benchmarked its use against the default randomForest implementation in R and have found it to be amazingly fast.

I was hoping to use this library with DataFrames, including the model formula API. I know that DataFrames doesn't currently support categorical data columns, but I believe that support is planned.

I can try to help with this, but it would be nice if this project were merged into the JuliaStats organization first (I prefer to contribute to projects that are explicitly community owned).

@bensadeghi
Member

Thanks for trying out and benchmarking the package. I've been meaning to add support for DataFrames and model formula syntax, but just haven't gotten around to it. As for merging with JuliaStats, I feel that perhaps I should write a new consistent API first (via wrappers). Let me look into it. Also, take a look at MachineLearning.jl and RandomForests.jl. Cheers.

@cstjean
Collaborator

cstjean commented Nov 2, 2016

FWIW, I've been using DecisionTree.jl with DataFrames via ScikitLearn.DataFrameMapper and ScikitLearn.Pipeline. See here
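
For readers who just want to feed DataFrame columns to the package, here is a minimal sketch of a different and simpler route than the ScikitLearn one mentioned above: it converts the columns by hand and calls DecisionTree.jl's native `build_forest` / `apply_forest` functions. It assumes a reasonably recent DataFrames.jl, and the toy data and column names are made up for illustration.

```julia
using DataFrames
using DecisionTree

# Hypothetical toy data; the column names are illustrative only.
df = DataFrame(
    sepal_length = [5.1, 4.9, 6.3, 5.8, 7.1, 6.5],
    sepal_width  = [3.5, 3.0, 3.3, 2.7, 3.0, 3.0],
    species      = ["setosa", "setosa", "virginica",
                    "virginica", "virginica", "virginica"],
)

# The native API takes a feature matrix and a label vector,
# so extract both from the DataFrame manually.
features = Matrix{Float64}(df[:, [:sepal_length, :sepal_width]])
labels   = Vector{String}(df.species)

# Positional arguments: 2 features considered per split, 10 trees.
model = build_forest(labels, features, 2, 10)

# Predict on the training rows, just to show the call shape.
predictions = apply_forest(model, features)
```

This does not give you model formula syntax or any handling of categorical feature columns; those would still need to be encoded as numbers before building the matrix.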
