Expand project unit tests and integration tests #41

Closed
rhiever opened this issue Dec 4, 2015 · 1 comment

rhiever commented Dec 4, 2015

Currently, there are only a few unit tests in tests.py. These are basic unit tests and don't cover a large portion of the project. We should expand the unit tests to cover more of the core TPOT functions.

We also need integration tests that test TPOT as a whole. This can be done with a small, fixed data set and a fixed random number generator seed over only a few generations, with a few different parameter settings.

@rhiever rhiever changed the title Expand project unit tests Expand project unit tests and integration tests Feb 22, 2016
@GJena GJena mentioned this issue Apr 16, 2016

rhiever commented May 16, 2016

@teaearlgraycold, here's the list of unit tests that @GJena and I brainstormed back when she was working on this issue. Some of them are already implemented.

Unit tests

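Run the tests with nose (https://nose.readthedocs.org/en/latest/). Nose or Nose2?

If the input data frame has no feature columns, return a copy of the data frame. Test this for every feature operator.
In that case the data frame has only 3 columns: class, guess, group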

File: tpot.py
Function: __init__

Assertion tests present; add more assertions

Cover the new parameters that have been added since the init() test was created

Function: fit
fit() can be tested in the integration tests

Function: pareto_eq
Cannot be tested because it is inside fit()

Function: predict
Test the case where the testing features are the same as the training features

Function: score
Use fixed pipeline, RNG, and data sets. Should output the same score each time.

Function: export
Should not be necessary to unit test because all of its functionality will be tested in export_utils.py

Function: decision_tree
Checking if output same as sklearn decision tree

Function: random_forest
Same as sklearn

Function: logistic_regression
Same as sklearn

Function: svc
Same as sklearn

Function: knnc
Same as sklearn

Function: xgradient_boosting
Same as XGBClassifier

Function: train_model_and_predict
Same as sklearn

Function: combine_dfs
Test the case where input_df1 is the same as input_df2

Function: _rfe
Compare with sklearn

Function: select_percentile
Compare with sklearn

Function: select_kbest
Compare with sklearn

Function: select_fwe
Compare with sklearn

Function: variance_threshold
Compare with sklearn

Function: standard_scaler
Compare with sklearn

Function: robust_scaler
Compare with sklearn

Function: polynomial_features
Compare with sklearn

Function: min_max_scaler
Check with features in range
Check with features out of range
If the training range is narrower than the testing range, the scaled testing values will not fall between 0 and 1. Check this for all feature scaling operators.
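
The in-range and out-of-range checks could look something like the sketch below. It uses a plain-Python stand-in for min-max scaling rather than TPOT's own operator; `fit_min_max` and `apply_min_max` are hypothetical helper names introduced only for illustration.

```python
def fit_min_max(train_column):
    """Return the (min, max) observed in a training column."""
    return min(train_column), max(train_column)


def apply_min_max(column, col_min, col_max):
    """Scale values using a range learned from the training data."""
    span = col_max - col_min
    if span == 0:
        # Constant training column: map everything to 0.0 to avoid dividing by zero.
        return [0.0 for _ in column]
    return [(value - col_min) / span for value in column]


train = [2.0, 4.0, 6.0]
col_min, col_max = fit_min_max(train)

# Testing values inside the training range stay within [0, 1].
in_range = apply_min_max([2.0, 3.0, 6.0], col_min, col_max)
assert in_range == [0.0, 0.25, 1.0]

# Testing values outside the training range land outside [0, 1].
out_of_range = apply_min_max([0.0, 8.0], col_min, col_max)
assert out_of_range == [-0.5, 1.5]
```

The same pair of checks would apply to any scaler that learns its range from the training data only.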

Function: max_abs_scaler
Check with features in range
Check with features out of range

Function: binarizer
Check with different thresholds

Function: pca
Compare with sklearn

Function: div
Remove params, no storing result
Underflow?
Divide by 0
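
A rough sketch of the divide-by-zero and underflow cases follows. `safe_div` is a hypothetical stand-in, and returning 0.0 on a zero denominator is an assumption made for this example, not necessarily what TPOT's div operator does.

```python
def safe_div(numerator, denominator):
    """Divide two numbers, returning 0.0 when the denominator is zero (assumed behavior)."""
    try:
        return float(numerator) / float(denominator)
    except ZeroDivisionError:
        return 0.0


# Ordinary division.
assert safe_div(6, 3) == 2.0

# Divide by 0: the assumed fallback value is returned instead of raising.
assert safe_div(1, 0) == 0.0

# Underflow: a quotient too small for a float silently rounds to 0.0.
assert safe_div(1e-300, 1e300) == 0.0
```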

More verbose comments

Function: evaluate_individual
Evaluate balanced accuracy on a balanced and imbalanced data set

Try with different scoring functions

Function: balanced_accuracy
Test with hard-coded values?
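
One way to test with hard-coded values is sketched below. It assumes balanced accuracy is the mean of per-class recall, which is a common definition; TPOT's own implementation may differ in detail, so the function here is only a stand-in.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (one common definition of balanced accuracy)."""
    recalls = []
    for label in sorted(set(y_true)):
        indices = [i for i, truth in enumerate(y_true) if truth == label]
        correct = sum(1 for i in indices if y_pred[i] == label)
        recalls.append(correct / len(indices))
    return sum(recalls) / len(recalls)


# Imbalanced data: always predicting the majority class is 80% accurate,
# but the balanced accuracy is only 0.5.
imbalanced_true = [0, 0, 0, 0, 1]
majority_guess = [0, 0, 0, 0, 0]
assert balanced_accuracy(imbalanced_true, majority_guess) == 0.5

# Balanced data: recall is 0.5 for class 0 and 1.0 for class 1.
balanced_true = [0, 0, 1, 1]
balanced_pred = [0, 1, 1, 1]
assert balanced_accuracy(balanced_true, balanced_pred) == 0.75
```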

Function: combined_selection_operator
Create a fixed population, find what this function currently outputs, then assert that the resulting population should be the same as what this function currently outputs. (We are assuming that the function, as it is implemented now, is correct.)

Do this with a few different population sizes
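
The "record the current output, then assert it stays the same" approach might look like the sketch below. `select_top` is a hypothetical, deterministic stand-in for the real selection operator, and the population entries are made up for illustration.

```python
def select_top(population, k):
    """Hypothetical stand-in: keep the k fittest, preferring smaller pipelines on ties."""
    ranked = sorted(population, key=lambda ind: (-ind["fitness"], ind["size"]))
    return [ind["name"] for ind in ranked[:k]]


# A fixed population that never changes between test runs.
POPULATION = [
    {"name": "a", "fitness": 0.90, "size": 5},
    {"name": "b", "fitness": 0.95, "size": 7},
    {"name": "c", "fitness": 0.90, "size": 3},
    {"name": "d", "fitness": 0.80, "size": 1},
]

# Outputs recorded from the current implementation, keyed by population size.
EXPECTED = {
    1: ["b"],
    2: ["b", "c"],
    4: ["b", "c", "a", "d"],
}


def check_selection_is_unchanged():
    """Fail if the selection output drifts from the recorded results."""
    for k, expected in EXPECTED.items():
        assert select_top(POPULATION, k) == expected


check_selection_is_unchanged()
```

A test like this only detects changes in behavior; it cannot show that the recorded behavior was correct in the first place.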

Function: random_mutation_operator
Fix the RNG seed, then apply mutations to a few different individuals
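
A minimal sketch of the fixed-seed idea, using only the standard library. `mutate` is a hypothetical stand-in that swaps one pipeline step at random; it is not TPOT's mutation operator, and the operator names are only used as labels.

```python
import random

OPERATORS = ["pca", "standard_scaler", "decision_tree", "random_forest"]


def mutate(individual, rng):
    """Hypothetical stand-in: replace one randomly chosen step with a random operator."""
    mutant = list(individual)  # copy so the original individual is left untouched
    position = rng.randrange(len(mutant))
    mutant[position] = rng.choice(OPERATORS)
    return mutant


def mutate_all(individuals, seed):
    """Mutate every individual using one RNG created from a fixed seed."""
    rng = random.Random(seed)
    return [mutate(individual, rng) for individual in individuals]


individuals = [
    ["pca", "decision_tree"],
    ["standard_scaler", "pca", "random_forest"],
]

first = mutate_all(individuals, seed=42)
second = mutate_all(individuals, seed=42)

# The same seed must reproduce exactly the same mutants.
assert first == second
# Mutation must not change pipeline length or modify the inputs in place.
assert [len(mutant) for mutant in first] == [2, 3]
assert individuals[0] == ["pca", "decision_tree"]
```

Passing an explicit `random.Random` instance keeps the test independent of any other code that touches the global RNG.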

Function: main
This can be tested as part of the integration tests

Function: positive_integer
Cannot be tested because it is encapsulated in the main() function

Function: float_range
Cannot be tested because it is encapsulated in the main() function

File: export_utils.py

Function: replace_mathematical_operators

Try with known values and results
Test with combination of operators too

Function: unroll_nested_function_calls

Try with known values and results

Function: generate_import_code

Try with known values and results

Function: replace_function_calls

Check for different params of learning rate, max_depth, n_estimators
For Scikit-learn functions

DEAP library
Don’t forget this file :-)

Integration tests

Fixed dataset: use MNIST data set from sklearn
Fixed random number generator seed
Fixed # of generations (5)
Few parameters
Training score should be >=0 by the end
Testing score should be >=0 by the end
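
The overall shape of such an integration test is sketched below. `run_search` is a hypothetical toy search over a single threshold that stands in for a full TPOT run; the data, settings, and function names are all assumptions made for illustration.

```python
import random


def score(threshold, rows):
    """Fraction of rows classified correctly by the rule `x > threshold`."""
    return sum(1 for x, label in rows if (x > threshold) == label) / len(rows)


def run_search(train, test, generations, population_size, seed):
    """Hypothetical stand-in for a full run: evolve a threshold classifier."""
    rng = random.Random(seed)
    population = [rng.uniform(0, 10) for _ in range(population_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda t: score(t, train), reverse=True)
        parents = ranked[: population_size // 2]
        children = [parent + rng.gauss(0, 0.5) for parent in parents]
        population = parents + children
    best = max(population, key=lambda t: score(t, train))
    return score(best, train), score(best, test)


# Small, fixed data set split into training and testing rows.
TRAIN = [(1.0, False), (2.0, False), (3.0, False), (6.0, True), (7.0, True), (9.0, True)]
TEST = [(0.5, False), (4.0, False), (5.5, True), (8.0, True)]

# A few different parameter settings, all with the same fixed seed.
SETTINGS = [{"population_size": 4}, {"population_size": 10}]

for settings in SETTINGS:
    first = run_search(TRAIN, TEST, generations=5, seed=42, **settings)
    second = run_search(TRAIN, TEST, generations=5, seed=42, **settings)
    assert first == second  # fixed seed gives a reproducible run
    train_score, test_score = first
    assert train_score >= 0
    assert test_score >= 0
```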

Helpful links
https://coveralls.io/github/rhiever/tpot?branch=master
https://travis-ci.org/rhiever/tpot
https://landscape.io/github/rhiever/tpot/211/messages/style
