I ran a couple of experiments on MNIST and observed that the code generation is a bit buggy at the moment. In the first example, the only operator that appears in the generated code is SelectPercentile:
import numpy as np
import pandas as pd
from sklearn.cross_validation import StratifiedShuffleSplit
from sklearn.feature_selection import SelectPercentile
from sklearn.feature_selection import f_classif
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
# NOTE: Make sure that the class is labeled 'class' in the data file
tpot_data = pd.read_csv('PATH/TO/DATA/FILE', sep='COLUMN_SEPARATOR')
training_indices, testing_indices = next(iter(StratifiedShuffleSplit(tpot_data['class'].values, n_iter=1, train_size=0.75, test_size=0.25)))
# Use Scikit-learn's SelectPercentile for feature selection
training_features = result2.loc[training_indices].drop('class', axis=1)
training_class_vals = result2.loc[training_indices, 'class'].values
if len(training_features.columns.values) == 0:
result3 = result2.copy()
else:
selector = SelectPercentile(f_classif, percentile=100)
selector.fit(training_features.values, training_class_vals)
mask = selector.get_support(True)
mask_cols = list(training_features.iloc[:, mask].columns) + ['class']
result3 = result2[mask_cols]
Problems with this output:
No indentation (the bodies of the if/else blocks are not indented)
result2 is not defined
optimized_pipeline_ contains _select_percentile, svc, _standard_scaler, but svc and _standard_scaler don't appear in the generated code
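In case it helps with reproducing this or writing a regression test, here is a minimal sketch of a check for the first two symptoms (missing indentation and names used before they are defined). It uses only the standard library's `ast` module and never executes or imports anything from the exported script. The helper names `check_generated_code`, `_scan` and `_names_in_header` are hypothetical and are not part of TPOT. It assumes a flat script like the ones above, so function definitions and comprehensions are not handled.

```python
import ast
import builtins


def _names_in_header(stmt):
    """Yield Name nodes of a statement, excluding its nested block statements."""
    for child in ast.iter_child_nodes(stmt):
        if isinstance(child, ast.stmt):
            continue  # statements inside if/for/while bodies are scanned separately
        for node in ast.walk(child):
            if isinstance(node, ast.Name):
                yield node


def _scan(statements, defined, problems):
    """Walk statements in source order, tracking which names have been bound."""
    for stmt in statements:
        if isinstance(stmt, (ast.Import, ast.ImportFrom)):
            for alias in stmt.names:
                # "import a.b" binds "a"; "import numpy as np" binds "np"
                defined.add((alias.asname or alias.name).split(".")[0])
            continue
        names = list(_names_in_header(stmt))
        # Reads are checked before this statement's own assignments are recorded,
        # because the right-hand side is evaluated first.
        for node in names:
            if isinstance(node.ctx, ast.Load) and node.id not in defined:
                problems.append(
                    "line {}: name '{}' is used before it is defined".format(
                        node.lineno, node.id
                    )
                )
        defined.update(n.id for n in names if isinstance(n.ctx, ast.Store))
        for field in ("body", "orelse", "finalbody"):
            _scan(getattr(stmt, field, []), defined, problems)


def check_generated_code(source):
    """Return a list of problem descriptions for a flat generated script."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:  # IndentationError is a subclass of SyntaxError
        return ["line {}: {}".format(exc.lineno, exc.msg)]
    problems = []
    _scan(tree.body, set(dir(builtins)), problems)
    return problems


# Symptom 1: an if block whose body is not indented
print(check_generated_code("if True:\nx = 1\n"))

# Symptom 2: result2 is read but never assigned
print(check_generated_code(
    "import pandas as pd\n"
    "training_features = result2.drop('class', axis=1)\n"
))
```

Running this on the exported file would report the indentation error first, since the script has to parse before names can be checked. The undefined-name check only becomes useful once the indentation is fixed.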
Another example with RobustScaler:
import numpy as np
import pandas as pd
from sklearn.cross_validation import StratifiedShuffleSplit
from sklearn.feature_selection import SelectPercentile
from sklearn.feature_selection import f_classif
from sklearn.preprocessing import RobustScaler
from sklearn.svm import SVC
# NOTE: Make sure that the class is labeled 'class' in the data file
tpot_data = pd.read_csv('PATH/TO/DATA/FILE', sep='COLUMN_SEPARATOR')
training_indices, testing_indices = next(iter(StratifiedShuffleSplit(tpot_data['class'].values, n_iter=1, train_size=0.75, test_size=0.25)))
# Use Scikit-learn's RobustScaler to scale the features
training_features = result3.loc[training_indices].drop('class', axis=1)
result4 = result3.copy()
if len(training_features.columns.values) > 0:
scaler = RobustScaler()
scaler.fit(training_features.values.astype(np.float64))
scaled_features = scaler.transform(result4.drop('class', axis=1).values.astype(np.float64))
for col_num, column in enumerate(result4.drop('class', axis=1).columns.values):
result4.loc[:, column] = scaled_features[:, col_num]
Problems with this output:
No indentation (the body of the if block and the for loop are not indented)
result3 is not defined
optimized_pipeline_ contains _robust_scaler, svc, svc, _select_percentile, but svc, svc and _select_percentile don't appear in the generated code