[Major refactor] Incorporate OO redesign #91
@tonyfast: I just remembered that we have this issue open to discuss the major refactor coming up after the 0.4 release. @teaearlgraycold, please link your WIP refactor here so we can take a look at it.
Don't try to run this code; at the moment it's just a structural layout.

Edit: Some of it will kinda work now.

Edit 2: You can actually do a fit_predict run now.

https://github.com/teaearlgraycold/tpot/tree/refactor

Code of interest is in tpot/operators and tpot/tpot.py#134
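To make the layout concrete, here is a minimal sketch of the kind of class hierarchy being described (a shared Operator base with Classifier and Preprocessor subclasses). All class and method names here are illustrative assumptions, not the actual code from the refactor branch:

```python
# Illustrative sketch only -- not the actual refactor branch code.

class Operator:
    """Base class holding shared plumbing (e.g. data-format handling),
    so format changes touch only this layer, not each wrapped estimator."""

    def __init__(self, **params):
        self.params = params

    def preprocess_args(self, **params):
        # Hook for subclasses to clamp/validate hyperparameters.
        return params


class Classifier(Operator):
    """Wraps an sklearn-style classifier class."""

    sklearn_class = None  # set by concrete subclasses

    def fit_predict(self, X, y, X_test):
        est = self.sklearn_class(**self.preprocess_args(**self.params))
        est.fit(X, y)
        return est.predict(X_test)


class Preprocessor(Operator):
    """Wraps an sklearn-style transformer class."""

    sklearn_class = None

    def transform(self, X):
        tr = self.sklearn_class(**self.preprocess_args(**self.params))
        return tr.fit_transform(X)
```

The point of the split is that concrete operators only declare which sklearn class they wrap; everything format-specific lives in the base classes.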
So with this setup, if you want to do something like refactor TPOT to use NumPy matrices instead of pandas DataFrames, you can edit the Operator, Classifier, and PreProcessor classes and leave all the actual classifiers and preprocessors untouched.

Edit: Something I'm interested in doing is largely forgoing the preprocess_args() method (from the refactor) as it is now, and instead implementing some general rules for arguments that are applied based on which argument names are used. For example: if you have a Classifier that takes the arguments 'max_features' and 'max_depth', there will be general rules saying that max_features should be between 1 and …

So when you add a new classifier or pre-processor you don't need to add extra code that thresholds values we've already determined reasonable limits for. You'd just declare the set of parameters you want, and if any of them have pre-defined limits those are used. Failing that, it would run any code you specify to limit the arguments. This assumes, however, that an argument's name can reliably be used to determine what kind of thresholding is useful for that argument. Doing this would also mean that instead of testing each operator with the extreme values covered in its argument-preprocessing code, you can test the general rules once.
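The name-keyed rule idea above could look roughly like the following. The rule table, clamp ranges, and function names are all illustrative assumptions, not anything from the branch:

```python
# Illustrative sketch of "general rules keyed by argument name".
# The specific limits here are made up for the example.

PARAM_RULES = {
    # argument name -> function clamping a raw value into a sensible range
    "max_features": lambda v: min(max(int(v), 1), 64),
    "max_depth": lambda v: min(max(int(v), 1), 10),
}

def apply_param_rules(params, custom_rules=None):
    """Clamp any argument with a known name; unknown names pass through.

    `custom_rules` is the per-operator fallback the comment mentions:
    extra code you specify when no pre-defined limit exists.
    """
    rules = dict(PARAM_RULES, **(custom_rules or {}))
    return {
        name: rules[name](value) if name in rules else value
        for name, value in params.items()
    }
```

A new classifier would then just list its parameter names; anything with a pre-defined rule gets clamped automatically, e.g. `apply_param_rules({"max_features": 9999, "max_depth": -5})` clamps both values into range while leaving unrecognized arguments alone.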
I started looking into a refactor myself to understand the project a little bit more. I haven't put this into scripts yet, but the idea is drawn out in the notebook. There are a few main opinions in it to study.
With a limited corpus of models so far, this gets all the way through. There is a problem with the scoring function at the moment. Are there reasons not to use a direct …
@tonyfast I pushed out a commit, so your line number is off. You were referring to the _apply_default_params() method though, correct? That method is largely there so that certain parameters can be blindly applied to all estimators (regardless of whether they're applicable), and it behaves differently from set_params. Also, my refactor branch is currently in a state where it can be run, albeit with only 2 classifiers and one preprocessor at the moment.
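For readers unfamiliar with the distinction: sklearn's set_params raises a ValueError on parameter names the estimator doesn't accept, whereas the "blindly apply" behavior described above would offer a default set to every estimator and keep only what applies. A hedged sketch of that idea, with an illustrative function name (not the branch's actual implementation):

```python
# Illustrative sketch -- not the actual _apply_default_params() code.
import inspect

def apply_default_params(estimator, defaults):
    """Offer `defaults` to any estimator, silently dropping those it
    doesn't accept, instead of erroring like sklearn's set_params."""
    accepted = inspect.signature(type(estimator).__init__).parameters
    applicable = {k: v for k, v in defaults.items() if k in accepted}
    estimator.set_params(**applicable)
    return estimator
```

This lets one shared defaults dict (say, a random_state) be pushed to every operator without each wrapper having to know which estimators support it.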
See #63 for an example of the new OO design on an old version of TPOT. This will require a large overhaul of TPOT.