Sklearn10_1: Introduction to Machine Learning Pipelines

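**Case description**
> This case is the companion code for the Chapter 10 courseware of the course "Machine Learning Practice". Using a diabetes dataset as the example, it introduces how to use the machine learning pipeline class `Pipeline` in sklearn.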

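**Dataset**
> Local path: file_path = "./dataSets/data_chap10/india_diabetes.csv"

> The dataset comes from Kaggle. It contains 1000 records with 9 fields; the task is to build a model from patients' medical information that predicts whether a patient will develop diabetes.
>> |Field|Description|
|:--|--:|
|Pregnancies|Number of pregnancies|
|Glucose|Plasma glucose concentration (2-hour oral glucose tolerance test)|
|BloodPressure|Diastolic blood pressure (mm Hg)|
|SkinThickness|Triceps skinfold thickness (mm)|
|Insulin|2-hour serum insulin (mu U/ml)|
|BMI|Body mass index|
|DiabetesPedigreeFunction|Diabetes pedigree function|
|Age|Age (years)|
|Outcome|Whether the patient had diabetes within the past 5 years (target field; 0 = no, 1 = yes)|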


**Import the required libraries**

In [1]:
## Import libraries
import numpy as np
import pandas as pd

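# Workflow without Pipeline
> Preprocess the data, then build, train, and evaluate a K-nearest-neighbors model
## Exploratory data analysis (EDA)

In [3]:
## Read the data
file_path = "./dataSets/data_chap10/india_diabetes.csv"
diabetes = pd.read_csv(file_path)

## Show the first few rows
diabetes.head()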

| |Pregnancies|Glucose|BloodPressure|SkinThickness|Insulin|BMI|DiabetesPedigreeFunction|Age|Outcome|
|--:|--:|--:|--:|--:|--:|--:|--:|--:|--:|
|0|6|148|72|35|0|33.6|0.627|50|1|
|1|1|85|66|29|0|26.6|0.351|31|0|
|2|8|183|64|0|0|23.3|0.672|32|1|
|3|1|89|66|23|94|28.1|0.167|21|0|
|4|0|137|40|35|168|43.1|2.288|33|1|


In [4]:
## Basic information about the dataset
diabetes.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 9 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   Pregnancies               1000 non-null   int64  
 1   Glucose                   1000 non-null   int64  
 2   BloodPressure             1000 non-null   int64  
 3   SkinThickness             1000 non-null   int64  
 4   Insulin                   1000 non-null   int64  
 5   BMI                       1000 non-null   float64
 6   DiabetesPedigreeFunction  1000 non-null   float64
 7   Age                       1000 non-null   int64  
 8   Outcome                   1000 non-null   int64  
dtypes: float64(2), int64(7)
memory usage: 70.4 KB


In [5]:
## Summary statistics of the dataset
diabetes.describe()

| |Pregnancies|Glucose|BloodPressure|SkinThickness|Insulin|BMI|DiabetesPedigreeFunction|Age|Outcome|
|:--|--:|--:|--:|--:|--:|--:|--:|--:|--:|
|count|1000.0|1000.0|1000.0|1000.0|1000.0|1000.0|1000.0|1000.0|1000.0|
|mean|4.031|125.513|69.158|21.015|84.752|32.6875|0.498014|34.092|0.5|
|std|3.325221|32.130581|19.842701|16.139246|117.869885|7.507894|0.335205|11.361806|0.50025|
|min|0.0|0.0|0.0|0.0|0.0|0.0|0.078|21.0|0.0|
|25%|1.0|102.0|64.0|0.0|0.0|28.2525|0.256|25.0|0.0|
|50%|3.0|122.0|72.0|24.0|27.0|32.8|0.4055|31.0|0.5|
|75%|6.0|146.0|80.0|33.0|140.0|36.91|0.65615|41.0|1.0|
|max|17.0|199.0|122.0|99.0|846.0|67.1|2.42|81.0|1.0|


In [6]:
# Check the distribution of the target
diabetes['Outcome'].value_counts()

0    500
1    500
Name: Outcome, dtype: int64

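## Data preprocessing

### Separate the features X and the target y

In [7]:
# Separate the target from the features
X = diabetes.drop(['Outcome'], axis=1)
y = diabetes['Outcome']

### Split into training and test sets

In [8]:
## Split into training and test sets; the test set takes 20%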
from sklearn.model_selection import train_test_split

X_train,X_test,y_train,y_test = train_test_split(X, y, test_size=.2, random_state=10)

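### Standardization

In [9]:
# Standardize: fit the scaler on the training set only
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler().fit(X_train)

## Standardize the training set
X_train_scaled = scaler.transform(X_train)

## Apply the same standardization to the test set
X_test_scaled = scaler.transform(X_test)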
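
Why the scaler is fitted on the training set only can be illustrated with a minimal sketch. It uses synthetic data rather than the diabetes file, and the names `X_train_demo`, `X_test_demo`, and `scaler_demo` are hypothetical stand-ins introduced for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic stand-ins for X_train / X_test (hypothetical data, not the diabetes set)
rng = np.random.default_rng(10)
X_train_demo = rng.normal(loc=50.0, scale=5.0, size=(800, 3))
X_test_demo = rng.normal(loc=50.0, scale=5.0, size=(200, 3))

# Learn the per-feature mean and standard deviation from the training split only
scaler_demo = StandardScaler().fit(X_train_demo)

# Apply the same learned statistics to both splits
X_train_demo_scaled = scaler_demo.transform(X_train_demo)
X_test_demo_scaled = scaler_demo.transform(X_test_demo)

print(scaler_demo.mean_)                  # per-feature training means
print(X_train_demo_scaled.mean(axis=0))   # approximately 0 by construction
print(X_test_demo_scaled.mean(axis=0))    # close to 0, but generally not exactly 0
```

The test-set means are near zero rather than exactly zero because the statistics come from the training split, which mirrors how unseen data would be handled at prediction time.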

## K-nearest neighbors: modeling and training

In [10]:
from sklearn.neighbors import KNeighborsClassifier
# Build the model
model_knn = KNeighborsClassifier()

# Train the model
model_knn.fit(X_train_scaled, y_train)

KNeighborsClassifier()

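## Model prediction and evaluation

In [11]:
from sklearn.metrics import accuracy_score
## Predict with the model
y_pred = model_knn.predict(X_test_scaled)
# Evaluate the model
print("Test-set accuracy:", round(model_knn.score(X_test_scaled, y_test), 2))
print("Test-set accuracy:", round(accuracy_score(y_test,y_pred), 2))

Test-set accuracy: 0.77
Test-set accuracy: 0.77


**The steps above are the usual workflow. Is there a way to combine some of them and reduce the amount of code?**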

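# Pipeline: the pipeline estimator
**Theory:**
> ◆ The `Pipeline` class in sklearn's `pipeline` module
>> * Wraps and manages **all the steps of a machine learning workflow as one chained estimator**, which greatly reduces the amount of code
>> * Typical steps in a Pipeline:
>>> * preprocessing estimator ---> feature-selection estimator ---> predicting estimator
>>> * Every estimator except the last must provide a `transform` method, which is used to transform the data
![](./imgs/chap10/fig10_01.jpg)

> ◆ Parameters, methods, and attributes of the Pipeline class
>> |Parameter|Description|
|:---|---:|
|steps|List of estimators, given in order as a list of (name, estimator) tuples; the last one is the final estimator|

>>|Method|Description|
|:---|---:|
|fit(X, y)|Fit the model|
|fit_predict(X, y)|Fit the model, then predict (requires the final estimator to implement `fit_predict`)|
|fit_transform(X, y)|Fit the model, then transform with the last estimator|
|predict(X)|Predict|
|predict_log_proba(X)|Predict log-probabilities|
|predict_proba(X)|Predict probabilities|
|score(X, y)|Evaluate the model|
|set_params( )|Change the parameters of the estimators|

>>|Attribute|Description|
|:---|---:|
|named_steps|Access the steps by name; a dictionary-like object whose keys are the step names and whose values are the step estimators|

**Practice:**
> * Build the workflow above into a machine learning pipeline, as follows:
<img src="./imgs/chap10/fig10_02.png">
> * With Pipeline(), the previous steps need only a few lines of code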


## Build the pipeline

In [12]:
## Build the pipeline estimator
from sklearn.pipeline import Pipeline

# Build the pipeline
pipe = Pipeline(steps=[("scaler",StandardScaler()),
                       ("model",KNeighborsClassifier()),
                      ])

# Train
pipe.fit(X_train,y_train)

# Predict
y_pred_pipe = pipe.predict(X_test)

# Evaluate
print("Test-set accuracy:", round(pipe.score(X_test, y_test), 2))
print("Test-set accuracy:", round(accuracy_score(y_test,y_pred_pipe), 2))

Test-set accuracy: 0.77
Test-set accuracy: 0.77
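
The pipeline reproduces the accuracy of the manual workflow because it performs the same operations in the same order. The following sketch checks that equivalence on synthetic data; the data and the `*_demo` and `*_manual` names are hypothetical and only serve the illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical synthetic data standing in for the diabetes features
X_demo, y_demo = make_classification(n_samples=500, n_features=8, random_state=10)
Xtr, Xte, ytr, yte = train_test_split(X_demo, y_demo, test_size=0.2, random_state=10)

# Manual route: fit the scaler, scale both splits, then fit and predict
scaler_manual = StandardScaler().fit(Xtr)
knn_manual = KNeighborsClassifier().fit(scaler_manual.transform(Xtr), ytr)
pred_manual = knn_manual.predict(scaler_manual.transform(Xte))

# Pipeline route: the same two steps wrapped in one estimator
pipe_demo = Pipeline(steps=[("scaler", StandardScaler()),
                            ("model", KNeighborsClassifier())])
pred_pipe = pipe_demo.fit(Xtr, ytr).predict(Xte)

# Both routes should give identical predictions
print(np.array_equal(pred_manual, pred_pipe))
```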


## Inspect / modify the pipeline

### Inspect a specific step of the pipeline
> * pipe.named_steps[key]

In [13]:
# Inspect a specific step: "scaler"
pipe.named_steps['scaler']

StandardScaler()

In [14]:
# Inspect a specific step: "model"
pipe.named_steps['model']


KNeighborsClassifier()

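### Modify a parameter of one estimator
> * Set `weights='distance'` on the KNeighborsClassifier() estimator in the "model" step
* Use model__weights='distance'; note the double underscore

In [15]:
# Modify an estimator parameter; note the double underscore
pipe.set_params(model__weights='distance')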

Pipeline(steps=[('scaler', StandardScaler()),
                ('model', KNeighborsClassifier(weights='distance'))])

In [16]:
pipe.named_steps['model']

KNeighborsClassifier(weights='distance')
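
To find out which `step__parameter` keys a pipeline accepts, `get_params()` can be queried. The sketch below rebuilds a small pipeline under the hypothetical name `pipe_demo` so that it runs on its own.

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Stand-alone copy of the two-step pipeline (hypothetical name pipe_demo)
pipe_demo = Pipeline(steps=[("scaler", StandardScaler()),
                            ("model", KNeighborsClassifier())])

# Keys containing "__" address a parameter of a named step
nested_keys = sorted(k for k in pipe_demo.get_params() if "__" in k)
print(nested_keys)

# set_params accepts several step parameters at once
pipe_demo.set_params(model__weights="distance", model__n_neighbors=7)
print(pipe_demo.named_steps["model"])
```

The same keys are the ones that a grid-search parameter grid may use.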

# Progressively more complex pipelines

## Using a pipeline in grid search
> * model:`KNeighborsClassifier(n_neighbors=5, weights='uniform',metric='minkowski')`
* Tunable parameter 1: `n_neighbors`:[2,4,6,8,10]
* Tunable parameter 2: `weights`:['uniform', 'distance']

### Search with GridSearchCV(cv=5)

In [17]:
## Import grid search
from sklearn.model_selection import GridSearchCV

## Set the parameter grid (key: value); note the double underscore
param_grid = {"model__n_neighbors":[2,4,6,8,10],
              "model__weights":['uniform', 'distance']
             }
## Construct the grid search
grid_search = GridSearchCV(estimator=pipe,param_grid=param_grid,cv=5)

## Run the search
grid_search.fit(X_train,y_train)


GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('scaler', StandardScaler()),
                                       ('model',
                                        KNeighborsClassifier(weights='distance'))]),
             param_grid={'model__n_neighbors': [2, 4, 6, 8, 10],
                         'model__weights': ['uniform', 'distance']})

In [18]:
## Predict with the best model
y_best = grid_search.predict(X_test)

In [19]:
## Score on the test set
print("Score of the best model on the test set:",round(grid_search.score(X_test, y_test),2))
print("Score of the best model on the test set:",round(accuracy_score(y_test,y_best),2))

Score of the best model on the test set: 0.79
Score of the best model on the test set: 0.79


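### Inspect the best pipeline
> * Inspect the best parameters found by GridSearchCV
* Inspect the best pipeline
* Inspect an attribute of a step in the best pipeline, for example the best classifier

In [20]:
## Inspect the best parameters found by the search
grid_search.best_params_

{'model__n_neighbors': 6, 'model__weights': 'distance'}

Note: the best parameters are given as key: value pairs, where each key has the form "step name__parameter name", joined by a double underscore.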
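
Beyond the single best combination, the fitted search also stores the cross-validated score of every candidate in `cv_results_`. The following sketch, run on synthetic data with hypothetical names (`pipe_demo`, `grid_demo`, `search_demo`), shows one way to tabulate them.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical synthetic data, not the diabetes set
X_demo, y_demo = make_classification(n_samples=300, n_features=8, random_state=10)

pipe_demo = Pipeline(steps=[("scaler", StandardScaler()),
                            ("model", KNeighborsClassifier())])
grid_demo = {"model__n_neighbors": [2, 4, 6],
             "model__weights": ["uniform", "distance"]}
search_demo = GridSearchCV(estimator=pipe_demo, param_grid=grid_demo, cv=5)
search_demo.fit(X_demo, y_demo)

# One row per candidate; parameter columns are prefixed with "param_"
results = pd.DataFrame(search_demo.cv_results_)
cols = ["param_model__n_neighbors", "param_model__weights",
        "mean_test_score", "rank_test_score"]
print(results[cols].sort_values("rank_test_score"))
```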

In [21]:
## Inspect the best pipeline
grid_search.best_estimator_

Pipeline(steps=[('scaler', StandardScaler()),
                ('model',
                 KNeighborsClassifier(n_neighbors=6, weights='distance'))])

In [22]:
# Access a step of the best pipeline: the best classifier
grid_search.best_estimator_.named_steps['model']

KNeighborsClassifier(n_neighbors=6, weights='distance')

## Adding a choice of estimator to the grid search
> * For the `scaler` step, provide `scaler_selector=[StandardScaler(), MinMaxScaler()]`
* Run the grid search again with an extra grid entry: `'scaler': scaler_selector`
* Pipeline diagram:  <img src="./imgs/chap10/fig10_03.png " width=40%>

### Search with GridSearchCV(cv=5) (+scaler_selector)

In [23]:
## Add a choice of estimator to the grid search (scaler_selector)
from sklearn.preprocessing import StandardScaler,MinMaxScaler

# Candidate estimators to choose from
scaler_selector = [StandardScaler(),MinMaxScaler()]

# Set the parameter grid (key: value)
param_grid = {'scaler':scaler_selector,
              "model__n_neighbors":[2,4,6,8,10],
              "model__weights":['uniform', 'distance']
             }

# Construct the grid search
grid_search = GridSearchCV(estimator=pipe,
                           param_grid=param_grid,
                           cv=5
                          )

# Run the search
grid_search.fit(X_train,y_train)


GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('scaler', StandardScaler()),
                                       ('model',
                                        KNeighborsClassifier(weights='distance'))]),
             param_grid={'model__n_neighbors': [2, 4, 6, 8, 10],
                         'model__weights': ['uniform', 'distance'],
                         'scaler': [StandardScaler(), MinMaxScaler()]})

In [24]:
## Predict with the best model
y_best = grid_search.predict(X_test)

In [25]:
## Score on the test set
print("Score of the best model on the test set:",round(grid_search.score(X_test, y_test),2))
print("Score of the best model on the test set:",round(accuracy_score(y_test,y_best),2))

Score of the best model on the test set: 0.78
Score of the best model on the test set: 0.78


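### Inspect the best pipeline (+scaler_selector)
> * Inspect the best parameters found by GridSearchCV
* Inspect the best pipeline
* Inspect an attribute of a step in the best pipeline, for example the best scaler

In [26]:
## Inspect the best parameters found by GridSearchCV
grid_search.best_params_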

{'model__n_neighbors': 6,
 'model__weights': 'distance',
 'scaler': MinMaxScaler()}

In [27]:
## Inspect the best pipeline
grid_search.best_estimator_

Pipeline(steps=[('scaler', MinMaxScaler()),
                ('model',
                 KNeighborsClassifier(n_neighbors=6, weights='distance'))])

In [28]:
## Inspect a step of the best pipeline, for example the best scaler
grid_search.best_estimator_.named_steps['scaler']

MinMaxScaler()

### Search with GridSearchCV(cv=5) (+model_selector)
**Building on the above, make the search more complex still**
> * For the `model` step, provide `model_selector=[SVC(),LogisticRegression(random_state=10)]`
* Keep using grid search, adding the grid entry `'model': model_selector`
* Note: every hyperparameter in the grid must be shared by all candidate models.
* `LogisticRegression(C=1.0,class_weight=None,solver='lbfgs')`
* `SVC(C=1.0, kernel='rbf',degree=3,gamma='scale',class_weight=None)`
* Hyperparameter 1: `'model__class_weight':['balanced', None]`
* Hyperparameter 2: `'model__C':[0.01, 0.1, 0.2, 0.5, 1]`
* Pipeline diagram: <img src="./imgs/chap10/fig10_04.png " width=40%>
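
The shared-hyperparameter restriction applies when the grid is a single dictionary. `GridSearchCV` also accepts a list of dictionaries, in which case each dictionary is searched separately and can carry parameters that only its own model understands. The sketch below is an illustration on synthetic data; the names `pipe_demo`, `grid_demo`, and `search_demo` and the parameter values are assumptions, not part of the original notebook.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical synthetic data, not the diabetes set
X_demo, y_demo = make_classification(n_samples=300, n_features=8, random_state=10)

pipe_demo = Pipeline(steps=[("scaler", StandardScaler()),
                            ("model", KNeighborsClassifier())])

# One sub-grid per candidate model; each lists only parameters that model accepts
grid_demo = [
    {"model": [SVC()], "model__C": [0.1, 1], "model__kernel": ["linear", "rbf"]},
    {"model": [LogisticRegression(random_state=10)], "model__C": [0.1, 1]},
    {"model": [KNeighborsClassifier()], "model__n_neighbors": [4, 6]},
]

search_demo = GridSearchCV(estimator=pipe_demo, param_grid=grid_demo, cv=5)
search_demo.fit(X_demo, y_demo)

# 2*2 SVC candidates + 2 logistic-regression candidates + 2 KNN candidates = 8
print(len(search_demo.cv_results_["params"]))
print(search_demo.best_params_)
```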

In [32]:
## Add a choice of estimator to the grid search (model_selector)
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Candidate estimators to choose from (model_selector)
model_selector = [SVC(), LogisticRegression(random_state=10)]

# Set the parameter grid
param_grid = {'scaler':scaler_selector,
              'model': model_selector,
              'model__class_weight':['balanced', None],
              'model__C':[0.01, 0.1, 0.2, 0.5, 1]}

# Construct the grid search
grid_search = GridSearchCV(estimator=pipe,
                           param_grid=param_grid,
                           cv=5
                          )

# Run the search
grid_search.fit(X_train, y_train)

GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('scaler', StandardScaler()),
                                       ('model',
                                        KNeighborsClassifier(weights='distance'))]),
             param_grid={'model': [SVC(C=1, class_weight='balanced'),
                                   LogisticRegression(random_state=10)],
                         'model__C': [0.01, 0.1, 0.2, 0.5, 1],
                         'model__class_weight': ['balanced', None],
                         'scaler': [StandardScaler(), MinMaxScaler()]})

In [33]:
## Predict with the best model
y_best = grid_search.predict(X_test)

In [34]:
## Score on the test set
print("Score of the best model on the test set:",round(grid_search.score(X_test, y_test),2))
print("Score of the best model on the test set:",round(accuracy_score(y_test,y_best),2))

Score of the best model on the test set: 0.79
Score of the best model on the test set: 0.79


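### Inspect the best pipeline (+model_selector)
> * Inspect the best parameters found by GridSearchCV
* Inspect the best pipeline
* Inspect an attribute of a step in the best pipeline, for example the best model

In [35]:
## Inspect the best parameters found by GridSearchCV
grid_search.best_params_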

{'model': SVC(C=1, class_weight='balanced'),
 'model__C': 1,
 'model__class_weight': 'balanced',
 'scaler': StandardScaler()}

In [36]:
## Inspect the best pipeline
grid_search.best_estimator_

Pipeline(steps=[('scaler', StandardScaler()),
                ('model', SVC(C=1, class_weight='balanced'))])

In [38]:
## Inspect a step of the best pipeline, for example the best model
grid_search.best_estimator_.named_steps['model']

SVC(C=1, class_weight='balanced')

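## Adding RFECV feature selection to the pipeline and running grid search
> * Pipeline diagram: <img src="./imgs/chap10/fig10_05.png " width=70%>
* As the diagram shows, there are three steps between `Data` and `Evaluation`
* Compared with the earlier two-step pipeline (`scaler` and `model`), there is one extra step, RFECV
* It is named `selector` here, and it performs RFECV, i.e. feature selection
* `RFECV(estimator=DecisionTreeClassifier(random_state=10),cv=5)`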

In [40]:
## Build the three-step pipeline scaler-->selector-->model

# Import RFECV and the decision tree it uses as its estimator
from sklearn.feature_selection import RFECV
from sklearn.tree import DecisionTreeClassifier

# Build the pipeline
pipe_3 = Pipeline(steps=[('scaler',StandardScaler()),
                         ("selector",RFECV(estimator=DecisionTreeClassifier(random_state=10),cv=5)),
                         ("model",KNeighborsClassifier())
                        ])

# Set the parameter grid
param_grid = {'scaler':scaler_selector,
              'model': model_selector,
              'model__class_weight':['balanced', None],
              'model__C':[0.01, 0.1, 0.2, 0.5, 1]}

# Construct the grid search
grid_search = GridSearchCV(estimator=pipe_3, param_grid=param_grid, cv=5)

# Run the search
grid_search.fit(X_train, y_train)

GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('scaler', StandardScaler()),
                                       ('selector',
                                        RFECV(cv=5,
                                              estimator=DecisionTreeClassifier(random_state=10))),
                                       ('model', KNeighborsClassifier())]),
             param_grid={'model': [SVC(C=1, class_weight='balanced'),
                                   LogisticRegression(random_state=10)],
                         'model__C': [0.01, 0.1, 0.2, 0.5, 1],
                         'model__class_weight': ['balanced', None],
                         'scaler': [StandardScaler(), MinMaxScaler()]})

In [41]:
## Inspect the best pipeline
grid_search.best_estimator_

Pipeline(steps=[('scaler', StandardScaler()),
                ('selector',
                 RFECV(cv=5,
                       estimator=DecisionTreeClassifier(random_state=10))),
                ('model', SVC(C=1, class_weight='balanced'))])

In [42]:
## Access an attribute of the step (selector): the feature ranking
fea_selected = grid_search.best_estimator_.named_steps['selector'].ranking_
print(fea_selected)

## Build a Series
pd.Series(fea_selected,index=X_train.columns)

[1 1 1 1 2 1 1 1]


Pregnancies                 1
Glucose                     1
BloodPressure               1
SkinThickness               1
Insulin                     2
BMI                         1
DiabetesPedigreeFunction    1
Age                         1
dtype: int32
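
In `ranking_`, a value of 1 marks a feature that RFECV kept, so in the output above only `Insulin` (rank 2) was eliminated. The sketch below, on synthetic data with the hypothetical name `selector_demo`, shows how `ranking_` relates to the `support_` mask and to the shape of the transformed data.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.tree import DecisionTreeClassifier

# Hypothetical synthetic data, not the diabetes set
X_demo, y_demo = make_classification(n_samples=300, n_features=8,
                                     n_informative=3, random_state=10)

selector_demo = RFECV(estimator=DecisionTreeClassifier(random_state=10), cv=5)
selector_demo.fit(X_demo, y_demo)

print(selector_demo.ranking_)                  # 1 marks a kept feature
print(selector_demo.support_)                  # boolean mask, True where ranking_ == 1
print(selector_demo.n_features_)               # number of kept features
print(selector_demo.transform(X_demo).shape)   # only the kept columns remain
```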

In [43]:
## Predict with the best model
y_best = grid_search.predict(X_test)

## Score on the test set
print("Score of the best model on the test set:",round(grid_search.score(X_test, y_test),2))
print("Score of the best model on the test set:",round(accuracy_score(y_test,y_best),2))

Score of the best model on the test set: 0.77
Score of the best model on the test set: 0.77


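## Adding dimensionality reduction to the pipeline and running grid search
> Pipeline diagram: <img src="./imgs/chap10/fig10_06.png " width=70%>

> * As the diagram shows, there are three steps between `Data` and `Evaluation`
* Compared with the earlier two-step pipeline (`scaler` and `model`), there is one extra step, PCA
* It is named `decomposition` here, and it performs PCA(n_components=3), i.e. reduces the features to 3 dimensions
* `PCA(n_components=None,whiten=False,svd_solver='auto',random_state=None)`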

In [49]:
## Add dimensionality reduction to the pipeline and run grid search
from sklearn.decomposition import PCA

# Add PCA to the pipeline
pipe_3 = Pipeline(steps=[("scaler",StandardScaler()),
                         ("decomposition",PCA(n_components=3)),
                         ("model",KNeighborsClassifier())
                        ])

# Set the parameter grid (key: value)
param_grid = {"scaler":scaler_selector,
              "model":model_selector,
              "model__C":[0.01,0.1,0.2,0.5,1.0],
              "model__class_weight":['balanced',None]
             }

# Construct the grid search
grid_search = GridSearchCV(estimator=pipe_3,
                           param_grid=param_grid,
                           cv=5)

# Run the search
grid_search.fit(X_train,y_train)

GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('scaler', StandardScaler()),
                                       ('decomposition', PCA(n_components=3)),
                                       ('model', KNeighborsClassifier())]),
             param_grid={'model': [SVC(), LogisticRegression(random_state=10)],
                         'model__C': [0.01, 0.1, 0.2, 0.5, 1.0],
                         'model__class_weight': ['balanced', None],
                         'scaler': [StandardScaler(), MinMaxScaler()]})

In [51]:
# Inspect the best parameters
grid_search.best_params_

{'model': SVC(),
 'model__C': 1.0,
 'model__class_weight': None,
 'scaler': StandardScaler()}

In [50]:
# Inspect the best pipeline
grid_search.best_estimator_

Pipeline(steps=[('scaler', StandardScaler()),
                ('decomposition', PCA(n_components=3)), ('model', SVC())])

In [55]:
## Mean cross-validated score of the best model (computed on the training set)
grid_search.best_score_

0.75625

In [56]:
# Inspect the total explained variance ratio: 'explained_variance_ratio_.sum()'
grid_search.best_estimator_.named_steps["decomposition"].explained_variance_ratio_.sum()

0.6068026317734158

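Analysis: when PCA() reduces the 8 features to 3, only about 60% of the variance is explained, so the result is poor. Either keep more components or try another dimensionality-reduction method such as LDA or t-SNE (note that sklearn's `TSNE` has no `transform` method, so it cannot serve as an intermediate Pipeline step).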
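
One way to decide how many components to keep is to look at the cumulative explained-variance ratio, or to pass a float to `n_components` and let PCA pick the count. The following is a minimal sketch on synthetic data; the 0.90 threshold and the names `cum_ratio` and `pca_90` are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical synthetic data, not the diabetes set
X_demo, _ = make_classification(n_samples=300, n_features=8, random_state=10)
X_demo_scaled = StandardScaler().fit_transform(X_demo)

# Fit with all components and accumulate the explained-variance ratios
cum_ratio = np.cumsum(PCA().fit(X_demo_scaled).explained_variance_ratio_)
print(cum_ratio.round(3))

# A float in (0, 1) asks PCA for the fewest components reaching that share
pca_90 = PCA(n_components=0.90).fit(X_demo_scaled)
print(pca_90.n_components_, pca_90.explained_variance_ratio_.sum())
```

In a pipeline, the same idea can be searched over with a grid entry such as `decomposition__n_components`.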

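## Building several parallel pipelines
> * So far we have tried the following three pipelines and run GridSearchCV on each to find its best pipeline:
>> (1) pipe:  `scaler` --> `model`   <br> 
>> (2) pipe_3: `scaler` -->`selector` -->`model`, i.e. + RFECV feature selection    <br>
>> (3) pipe_3: `scaler` -->`decomposition` -->`model`, i.e. + PCA(3) dimensionality reduction    <br>

> * Question: feature selection and feature extraction both reduce dimensionality. Which of these pipelines works better, and can they be compared side by side?
> * Answer: build several pipelines and compare them.
* Pipeline diagram: <img src="./imgs/chap10/fig10_07.png " width=70%>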

In [60]:
## Build a more complex setup (several parallel pipelines)
from sklearn.pipeline import Pipeline

# Preprocessing
from sklearn.preprocessing import StandardScaler,MinMaxScaler

# Dimensionality reduction
from sklearn.decomposition import PCA

# Feature selection
from sklearn.feature_selection import RFECV
from sklearn.tree import DecisionTreeClassifier

# Models
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression


# Pipeline 1: pipe_rf
pipe_rf = Pipeline(steps=[('scaler1', StandardScaler()),
                          ('model1', RandomForestClassifier(n_estimators=100,random_state=10))
                         ])

# Pipeline 2: pipe_knn
pipe_knn = Pipeline(steps=[('scaler2', StandardScaler()),
                           ('decomposition1', PCA(n_components=6)),
                           ('model2', KNeighborsClassifier(n_neighbors=6,weights='distance'))
                          ])

# Pipeline 3: pipe_lr (the step named 'decomposition2' performs RFECV feature selection)
pipe_lr = Pipeline(steps=[('scaler3', MinMaxScaler()),
                          ('decomposition2',RFECV(DecisionTreeClassifier(random_state=10), cv=5)),
                          ('model3', LogisticRegression(random_state=10))
                         ])


In [63]:
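# Create a dictionary of pipelines
pipe_dic = {'Random forest':pipe_rf, 'Weighted KNN':pipe_knn, 'Logistic regression':pipe_lr}
y_pred_dict = {}

# Train, predict with, and evaluate each pipeline
for pipe_name,pipe in pipe_dic.items():
    pipe.fit(X_train, y_train)
    y_pred_dict[pipe_name] = pipe.predict(X_test)
    print('%s accuracy on the test set: %.3f' % (pipe_name, pipe.score(X_test, y_test)))


Random forest accuracy on the test set: 0.825
Weighted KNN accuracy on the test set: 0.780
Logistic regression accuracy on the test set: 0.750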
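
A single train/test split gives one number per pipeline, which can be sensitive to how the split happened to fall. As a complementary check, the pipelines can be compared with cross-validation. The sketch below uses synthetic data and two hypothetical pipelines in a dictionary named `candidates`; it is an illustration, not a rerun of the results above.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical synthetic data, not the diabetes set
X_demo, y_demo = make_classification(n_samples=300, n_features=8, random_state=10)

candidates = {
    "random forest": Pipeline([
        ("scaler", StandardScaler()),
        ("model", RandomForestClassifier(n_estimators=50, random_state=10)),
    ]),
    "weighted KNN": Pipeline([
        ("scaler", StandardScaler()),
        ("decomposition", PCA(n_components=6)),
        ("model", KNeighborsClassifier(n_neighbors=6, weights="distance")),
    ]),
}

# 5-fold cross-validated accuracy for each pipeline
cv_means = {}
for name, candidate in candidates.items():
    scores = cross_val_score(candidate, X_demo, y_demo, cv=5)
    cv_means[name] = scores.mean()
    print("%s: mean CV accuracy %.3f (std %.3f)" % (name, scores.mean(), scores.std()))
```

Because each fold refits the whole pipeline, the scaler and PCA never see the validation fold during fitting.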


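# FeatureUnion
> `from sklearn.pipeline import FeatureUnion`
> * FeatureUnion diagram: <img src="./imgs/chap10/fig10_08.png"   width=60%>
>>* Combines several transformer objects into a new transformer
>>* A FeatureUnion object takes a list of transformer objects as input
>>* During fitting, the transformers in the list are applied to the data in parallel, and their outputs are concatenated horizontally into one larger feature matrix
>>* Useful for combining several feature-extraction mechanisms into a single transformer

> * Parameters, methods, and attributes of the FeatureUnion class

>>|Parameter|Description|
|:---|---:|
|transformer_list|List of transformer objects to apply to the data|
|transformer_weights|Weight of each transformer; a dictionary whose keys are transformer names and whose values are weights|

>>|Method|Description|
|:---|---:|
|fit(X, y)|Fit all transformers|
|fit_transform(X, y)|Fit all transformers, then transform|
|get_params([deep])|Get the parameters of the transformers|
|set_params()|Change the parameters of the transformers|
|transform(X)|Transform the data|

>>|Attribute|Description|
|:---|---:|
|transformer_list|The list of transformer objects|
|transformer_weights|The weights of the transformers|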
## 构建FeatureUnion
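> Apply a FeatureUnion to the 8 features of the source data X:
>> * Use `PCA(3)` to obtain 3 features;
>> * Use `PolynomialFeatures(degree=2,include_bias=False)` to obtain 44 features;
>> * Concatenate them horizontally into 47 features

> * Pipeline diagram: <img src="./imgs/chap10/fig10_09.png"  width=50%>

>*  `from sklearn.preprocessing import PolynomialFeatures`   <br>
>*  `PolynomialFeatures(degree=2, interaction_only=False, include_bias=True, order='C')`
>>* Purpose: generate polynomial and interaction features <br>
>>* Parameters

|Parameter|Description|
|:---|---:|
|degree|Degree of the polynomial|
|interaction_only|Defaults to False; if True, only products of distinct features are produced, so terms such as a^2 are excluded|
|include_bias|Defaults to True; the bias is the constant column of 1s included in the polynomial, and setting it to False drops that column|
|order|"C" or "F": order of the output array in the dense case; "F" is faster to compute but may slow down subsequent estimators|

>>* Example: with two features (a, b) and `degree=2` (other parameters at their defaults), the quadratic polynomial features are (1, a, b, a^2, ab, b^2).
>>* Formula: with n features and `degree=d,interaction_only=False,include_bias=True`, the number of new features is:
>>> $$
N_{\text{new}}=C_{n+d}^{d}=\frac{(n+d) \times(n+d-1) \times \cdots \times(n+1)}{d \times(d-1) \times \cdots \times 1}
$$
>>* Example: for n=8, d=2:
$$
N_{\text{new}}=C_{8+2}^{2}=\frac{10 \times 9}{2 \times 1}=45
$$
>> * Note: with `include_bias=False`, the bias term is removed and the count decreases by 1.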
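
The counting formula can be checked numerically against what `PolynomialFeatures` reports. In this sketch, `poly_feature_count` is a hypothetical helper written for illustration, and `X_dummy` is a placeholder array that only supplies the number of input features.

```python
import math

import numpy as np
from sklearn.preprocessing import PolynomialFeatures


def poly_feature_count(n, d, include_bias=True):
    """Number of polynomial terms of total degree <= d in n variables."""
    total = math.comb(n + d, d)
    return total if include_bias else total - 1


# Only the number of columns matters when fitting PolynomialFeatures
X_dummy = np.zeros((1, 8))
with_bias = PolynomialFeatures(degree=2, include_bias=True).fit(X_dummy)
without_bias = PolynomialFeatures(degree=2, include_bias=False).fit(X_dummy)

print(poly_feature_count(8, 2), with_bias.n_output_features_)                         # 45 45
print(poly_feature_count(8, 2, include_bias=False), without_bias.n_output_features_)  # 44 44
```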

In [66]:
# Build the FeatureUnion

from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import PolynomialFeatures

# Combine PCA features and polynomial features
combined_features = FeatureUnion([('pca', PCA(n_components=3)),
                                  ('poly', PolynomialFeatures(degree=2, include_bias=False))])

X_features = combined_features.fit_transform(X_train)

# Check the shape of the transformed data
X_features.shape


(800, 47)
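
The 47 columns are simply the 3 PCA columns followed by the 44 polynomial columns. The following sketch, on synthetic data with hypothetical names (`union_demo`, `X_manual`), checks that a FeatureUnion matches a manual horizontal concatenation of the individual transformer outputs.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical synthetic data with 8 features, not the diabetes set
X_demo, _ = make_classification(n_samples=100, n_features=8, random_state=10)

union_demo = FeatureUnion([("pca", PCA(n_components=3)),
                           ("poly", PolynomialFeatures(degree=2, include_bias=False))])
X_union = union_demo.fit_transform(X_demo)

# Manual equivalent: run each transformer separately, then stack column-wise
X_pca = PCA(n_components=3).fit_transform(X_demo)
X_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X_demo)
X_manual = np.hstack([X_pca, X_poly])

print(X_union.shape)
print(np.allclose(X_union, X_manual))
```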

## Adding the FeatureUnion to a pipeline
> * The purpose of the FeatureUnion is to train a model, so it needs to be added to a pipeline.
> * Pipeline diagram: <img src="./imgs/chap10/fig10_10.png"  width=50%>

> **Build a two-step pipeline:**
>>  * step1: the step between `Data` and `New_data` is the FeatureUnion; name this step `features`
>>  * step2: the step between `New_data` and `Evaluation` is the model; name this step `lr`

In [68]:
## Add the FeatureUnion to a pipeline

# Build the pipeline
pipeline = Pipeline([("features", combined_features), ("lr", LogisticRegression(random_state=10))])

# Build the parameter grid (key: value)
# Note: 'l1' is not supported by the default lbfgs solver, so those candidates fail (see the traceback below)
param_grid = dict(features__pca__n_components=[1, 2, 3, 4, 5, 6],
                  lr__C=[0.1, 0.2, 0.5, 1],
                  lr__class_weight=[None, 'balanced'],
                  lr__penalty=['l1', 'l2'])

# Construct the grid search
grid_search = GridSearchCV(pipeline, param_grid=param_grid, cv=5)

# Run the search
grid_search.fit(X_train, y_train)


Traceback (most recent call last):
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\model_selection\_validation.py", line 593, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\pipeline.py", line 346, in fit
    self._final_estimator.fit(Xt, y, **fit_params_last_step)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\linear_model\_logistic.py", line 1306, in fit
    solver = _check_solver(self.solver, self.penalty, self.dual)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\linear_model\_logistic.py", line 444, in _check_solver
    "got %s penalty." % (solver, penalty))
ValueError: Solver lbfgs supports only 'l2' or 'none' penalties, got l1 penalty.

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
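
The `ValueError` arises because the default `lbfgs` solver does not support the `l1` penalty, and the convergence messages suggest scaling the data or raising `max_iter`. One possible remedy, sketched here on synthetic data, is to switch to the `liblinear` solver (which accepts both `l1` and `l2`) and to add a scaling step. Both choices are assumptions for illustration rather than the notebook's own fix, the `*_demo` names are hypothetical, and the sketch assumes a scikit-learn version in which `LogisticRegression` accepts the `penalty` parameter.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Hypothetical synthetic data, not the diabetes set
X_demo, y_demo = make_classification(n_samples=300, n_features=8, random_state=10)

features_demo = FeatureUnion([("pca", PCA(n_components=3)),
                              ("poly", PolynomialFeatures(degree=2, include_bias=False))])

# Assumption: 'liblinear' replaces the default 'lbfgs' because it accepts l1 and l2;
# the added scaler step follows the hint in the convergence message
pipeline_demo = Pipeline([("scaler", StandardScaler()),
                          ("features", features_demo),
                          ("lr", LogisticRegression(solver="liblinear", random_state=10))])

grid_demo = dict(features__pca__n_components=[1, 2, 3],
                 lr__C=[0.1, 1],
                 lr__penalty=["l1", "l2"])

# error_score="raise" makes any failing candidate stop the search immediately
search_demo = GridSearchCV(pipeline_demo, param_grid=grid_demo, cv=5, error_score="raise")
search_demo.fit(X_demo, y_demo)
print(search_demo.best_params_)
```

Another option would be to keep `lbfgs` and drop `'l1'` from the grid, which avoids the error without changing the solver.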
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logist

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logist

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
Traceback (most recent call last):
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\model_selection\_validation.py", line 593, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\pipeline.py", line 346, in fit
    self._final_estimator.fit(Xt, y, **fit_params_last_step)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\linear_model\_logistic.py", line 1306, in fit
    solver = _check_solver(self.solver, self.penalty, self.dual)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\linear_model\_logistic.py", line 444, in _check_solver
    "got %s penalty." % (solver, p

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logist

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logist

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
Traceback (most recent call last):
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\model_selection\_validation.py", line 593, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\pipeline.py", line 346, in fit
    self._final_estimator.fit(Xt, y, **fit_params_last_step)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\linear_model\_logistic.py", line 1306, in fit
    solver = _check_solver(self.solver, self.penalty, self.dual)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\linear_model\_logistic.py", line 444, in _check_solver
    "got %s penalty." % (solver, p

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logist

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logist

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
Traceback (most recent call last):
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\model_selection\_validation.py", line 593, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\pipeline.py", line 346, in fit
    self._final_estimator.fit(Xt, y, **fit_params_last_step)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\linear_model\_logistic.py", line 1306, in fit
    solver = _check_solver(self.solver, self.penalty, self.dual)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\linear_model\_logistic.py", line 444, in _check_solver
    "got %s penalty." % (solver, penalty))
ValueError: Solver lbfgs supports only 'l2' or 'none' penalties, got l1 penalty.

GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('features',
                                        FeatureUnion(transformer_list=[('pca',
                                                                        PCA(n_components=3)),
                                                                       ('poly',
                                                                        PolynomialFeatures(include_bias=False))])),
                                       ('lr',
                                        LogisticRegression(random_state=10))]),
             param_grid={'features__pca__n_components': [1, 2, 3, 4, 5, 6],
                         'lr__C': [0.1, 0.2, 0.5, 1],
                         'lr__class_weight': [None, 'balanced'],
                         'lr__penalty': ['l1', 'l2']})
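
The `ValueError` in the output above arises because the grid asks for `penalty='l1'` while `LogisticRegression` is left on its default `lbfgs` solver, which only handles `'l2'` or no penalty. A minimal sketch of one way to make every grid point fit, assuming a solver such as `liblinear` is acceptable for this binary problem. The data below is synthetic (`make_classification`), and the names `X_demo`, `y_demo`, `pipe_demo` and `search_demo` are illustrative only, not part of the original notebook:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for the 8 numeric diabetes features (assumption).
X_demo, y_demo = make_classification(
    n_samples=300, n_features=8, n_informative=5, random_state=10
)

# Same shape of feature union as in the notebook: PCA components + polynomial terms.
features_demo = FeatureUnion([
    ("pca", PCA(n_components=3)),
    ("poly", PolynomialFeatures(include_bias=False)),
])

# liblinear supports both 'l1' and 'l2', so no grid point is rejected by the solver check.
pipe_demo = Pipeline([
    ("features", features_demo),
    ("lr", LogisticRegression(solver="liblinear", random_state=10)),
])

param_grid_demo = {
    "features__pca__n_components": [1, 2, 3],
    "lr__C": [0.1, 1],
    "lr__penalty": ["l1", "l2"],
}

# error_score="raise" turns a failing fit into an immediate exception
# instead of a NaN score, which makes misconfigured grids easy to spot.
search_demo = GridSearchCV(pipe_demo, param_grid=param_grid_demo, cv=5,
                           error_score="raise")
search_demo.fit(X_demo, y_demo)

print(search_demo.best_params_)
print(np.isnan(search_demo.cv_results_["mean_test_score"]).any())
```

The `ConvergenceWarning` messages are a separate issue: they point to unscaled inputs or too few iterations, and the warning text itself suggests scaling the data or raising `max_iter`.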

In [69]:

# Best mean cross-validated score found by the search (computed on the training folds, not on the test set)
grid_search.best_score_


0.6925
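
`best_score_` is the mean cross-validated score of the best parameter combination, so it is measured on folds of the training data. A short sketch of how it differs from a held-out score, using synthetic data and illustrative names (`X_tr`, `X_te`, `search_cv` and so on are assumptions, not variables from the notebook):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data standing in for the real dataset.
X_all, y_all = make_classification(n_samples=400, n_features=8, random_state=10)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_all, y_all, test_size=0.2, random_state=10
)

pipe_cv = Pipeline([
    ("scaler", StandardScaler()),
    ("lr", LogisticRegression(random_state=10)),
])
search_cv = GridSearchCV(pipe_cv, param_grid={"lr__C": [0.1, 1, 10]}, cv=5)
search_cv.fit(X_tr, y_tr)

# Mean accuracy over the 5 validation folds for the best parameter setting.
cv_score = search_cv.best_score_

# Accuracy of the refitted best pipeline on data the search never saw.
test_score = search_cv.score(X_te, y_te)

print("cross-validated:", cv_score)
print("held-out test:  ", test_score)
```

The two numbers usually differ somewhat; the held-out score is the one to report as test-set performance.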

**Add one more step: `scaler`**

In [72]:
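## Add a scaler step to the pipeline

# Build the pipeline
pipeline_3 = Pipeline([("features", combined_features),
                       ("scaler", StandardScaler()),
                       ("lr", LogisticRegression(random_state=10)),
                      ])

# Build the parameter grid (parameter name -> candidate values)
# Note: 'l1' is not supported by the default lbfgs solver, so those fits fail
# with the ValueError shown in the output and are scored as NaN.
param_grid = dict(features__pca__n_components=[3, 4, 5, 6],
                  scaler=scaler_selector,
                  lr__C=[0.1, 0.2, 0.5, 1],
                  lr__class_weight=[None, 'balanced'],
                  lr__penalty=['l1', 'l2'])

# Build the grid search
grid_search = GridSearchCV(pipeline_3, param_grid=param_grid, cv=5)

# Run the search to find the best parameter combination
grid_search.fit(X_train, y_train)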
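
The grid key `scaler` has no double underscore, so it swaps the whole pipeline step rather than tuning one of its parameters. A minimal sketch of that mechanism on synthetic data; `scaler_candidates` is a hypothetical stand-in for `scaler_selector`, whose actual contents are defined elsewhere in the notebook and may differ:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, RobustScaler, StandardScaler

# Synthetic data standing in for X_train / y_train.
X_syn, y_syn = make_classification(n_samples=300, n_features=8, random_state=10)

# Hypothetical candidate list: each entry replaces the "scaler" step entirely.
# "passthrough" disables the step, which lets the search compare against no scaling.
scaler_candidates = [StandardScaler(), MinMaxScaler(), RobustScaler(), "passthrough"]

pipe_scaler = Pipeline([
    ("scaler", StandardScaler()),  # placeholder, overwritten by the grid
    ("lr", LogisticRegression(max_iter=1000, random_state=10)),
])

grid_scaler = {
    "scaler": scaler_candidates,  # step name only: substitute the whole step
    "lr__C": [0.1, 1],            # step__param: tune one parameter of a step
}

search_scaler = GridSearchCV(pipe_scaler, param_grid=grid_scaler, cv=5)
search_scaler.fit(X_syn, y_syn)

chosen_scaler = search_scaler.best_params_["scaler"]
print(chosen_scaler)
```

Searching over the step itself keeps the scaler choice inside cross-validation, so each candidate is fitted only on the training folds.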



Traceback (most recent call last):
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\model_selection\_validation.py", line 593, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\pipeline.py", line 346, in fit
    self._final_estimator.fit(Xt, y, **fit_params_last_step)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\linear_model\_logistic.py", line 1306, in fit
    solver = _check_solver(self.solver, self.penalty, self.dual)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\linear_model\_logistic.py", line 444, in _check_solver
    "got %s penalty." % (solver, penalty))
ValueError: Solver lbfgs supports only 'l2' or 'none' penalties, got l1 penalty.

Traceback (most recent call last):
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\model_selection\_validation.py", line 593, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "D:\virtualEnvs\MLenv1\lib\site-packag

Traceback (most recent call last):
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\model_selection\_validation.py", line 593, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\pipeline.py", line 346, in fit
    self._final_estimator.fit(Xt, y, **fit_params_last_step)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\linear_model\_logistic.py", line 1306, in fit
    solver = _check_solver(self.solver, self.penalty, self.dual)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\linear_model\_logistic.py", line 444, in _check_solver
    "got %s penalty." % (solver, penalty))
ValueError: Solver lbfgs supports only 'l2' or 'none' penalties, got l1 penalty.

Traceback (most recent call last):
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\model_selection\_validation.py", line 593, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "D:\virtualEnvs\MLenv1\lib\site-packag

Traceback (most recent call last):
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\model_selection\_validation.py", line 593, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\pipeline.py", line 346, in fit
    self._final_estimator.fit(Xt, y, **fit_params_last_step)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\linear_model\_logistic.py", line 1306, in fit
    solver = _check_solver(self.solver, self.penalty, self.dual)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\linear_model\_logistic.py", line 444, in _check_solver
    "got %s penalty." % (solver, penalty))
ValueError: Solver lbfgs supports only 'l2' or 'none' penalties, got l1 penalty.

Traceback (most recent call last):
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\model_selection\_validation.py", line 593, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "D:\virtualEnvs\MLenv1\lib\site-packag

Traceback (most recent call last):
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\model_selection\_validation.py", line 593, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\pipeline.py", line 346, in fit
    self._final_estimator.fit(Xt, y, **fit_params_last_step)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\linear_model\_logistic.py", line 1306, in fit
    solver = _check_solver(self.solver, self.penalty, self.dual)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\linear_model\_logistic.py", line 444, in _check_solver
    "got %s penalty." % (solver, penalty))
ValueError: Solver lbfgs supports only 'l2' or 'none' penalties, got l1 penalty.

Traceback (most recent call last):
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\model_selection\_validation.py", line 593, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "D:\virtualEnvs\MLenv1\lib\site-packag

Traceback (most recent call last):
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\model_selection\_validation.py", line 593, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\pipeline.py", line 346, in fit
    self._final_estimator.fit(Xt, y, **fit_params_last_step)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\linear_model\_logistic.py", line 1306, in fit
    solver = _check_solver(self.solver, self.penalty, self.dual)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\linear_model\_logistic.py", line 444, in _check_solver
    "got %s penalty." % (solver, penalty))
ValueError: Solver lbfgs supports only 'l2' or 'none' penalties, got l1 penalty.

Traceback (most recent call last):
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\model_selection\_validation.py", line 593, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "D:\virtualEnvs\MLenv1\lib\site-packag

Traceback (most recent call last):
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\model_selection\_validation.py", line 593, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\pipeline.py", line 346, in fit
    self._final_estimator.fit(Xt, y, **fit_params_last_step)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\linear_model\_logistic.py", line 1306, in fit
    solver = _check_solver(self.solver, self.penalty, self.dual)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\linear_model\_logistic.py", line 444, in _check_solver
    "got %s penalty." % (solver, penalty))
ValueError: Solver lbfgs supports only 'l2' or 'none' penalties, got l1 penalty.

Traceback (most recent call last):
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\model_selection\_validation.py", line 593, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "D:\virtualEnvs\MLenv1\lib\site-packag

Traceback (most recent call last):
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\model_selection\_validation.py", line 593, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\pipeline.py", line 346, in fit
    self._final_estimator.fit(Xt, y, **fit_params_last_step)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\linear_model\_logistic.py", line 1306, in fit
    solver = _check_solver(self.solver, self.penalty, self.dual)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\linear_model\_logistic.py", line 444, in _check_solver
    "got %s penalty." % (solver, penalty))
ValueError: Solver lbfgs supports only 'l2' or 'none' penalties, got l1 penalty.

Traceback (most recent call last):
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\model_selection\_validation.py", line 593, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "D:\virtualEnvs\MLenv1\lib\site-packag

Traceback (most recent call last):
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\model_selection\_validation.py", line 593, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\pipeline.py", line 346, in fit
    self._final_estimator.fit(Xt, y, **fit_params_last_step)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\linear_model\_logistic.py", line 1306, in fit
    solver = _check_solver(self.solver, self.penalty, self.dual)
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\linear_model\_logistic.py", line 444, in _check_solver
    "got %s penalty." % (solver, penalty))
ValueError: Solver lbfgs supports only 'l2' or 'none' penalties, got l1 penalty.

Traceback (most recent call last):
  File "D:\virtualEnvs\MLenv1\lib\site-packages\sklearn\model_selection\_validation.py", line 593, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "D:\virtualEnvs\MLenv1\lib\site-packag

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
     nan 0.76875 0.74        nan     nan 0.77375 0.745       nan     nan
 0.765   0.7425      nan     nan 0.77    0.75125     nan     nan 0.765
 0.7525      nan     nan 0.76375 0.755       nan     nan 0.76125 0.73
     nan     nan 0.76375 0.74125     nan     nan 0.76875 0.73875     nan
     nan 0.775   0.74375     nan     

GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('features',
                                        FeatureUnion(transformer_list=[('pca',
                                                                        PCA(n_components=3)),
                                                                       ('poly',
                                                                        PolynomialFeatures(include_bias=False))])),
                                       ('scaler', StandardScaler()),
                                       ('lr',
                                        LogisticRegression(random_state=10))]),
             param_grid={'features__pca__n_components': [3, 4, 5, 6],
                         'lr__C': [0.1, 0.2, 0.5, 1],
                         'lr__class_weight': [None, 'balanced'],
                         'lr__penalty': ['l1', 'l2'],
                         'scaler': [StandardScaler(), MinMaxScaler()]})

In [73]:
grid_search.best_estimator_

Pipeline(steps=[('features',
                 FeatureUnion(transformer_list=[('pca', PCA(n_components=4)),
                                                ('poly',
                                                 PolynomialFeatures(include_bias=False))])),
                ('scaler', StandardScaler()),
                ('lr',
                 LogisticRegression(C=0.2, class_weight='balanced',
                                    random_state=10))])

In [74]:
grid_search.best_score_

0.7750000000000001
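
The `ValueError` in the fit output above comes from the grid itself: `lr__penalty` includes `'l1'`, but the default `lbfgs` solver only accepts `'l2'` or `'none'`, so every `'l1'` candidate fails and is scored as `nan`. The sketch below shows one possible way to avoid this, by choosing a solver that supports both penalties (`liblinear`) and raising `max_iter`. It is an illustration rather than the course's own solution: the data is a synthetic stand-in generated with `make_classification` (the CSV is not assumed to be available), the grid is reduced to keep it quick, and the names `pipe`, `search`, and `n_failed` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import MinMaxScaler, PolynomialFeatures, StandardScaler

# Assumption: synthetic stand-in for the 8 diabetes features and the binary Outcome
X, y = make_classification(n_samples=300, n_features=8, n_informative=5,
                           random_state=10)

# Same pipeline layout as in the notebook: FeatureUnion -> scaler -> logistic regression
features = FeatureUnion([
    ("pca", PCA(n_components=3)),
    ("poly", PolynomialFeatures(include_bias=False)),
])
pipe = Pipeline([
    ("features", features),
    ("scaler", StandardScaler()),
    # liblinear supports both 'l1' and 'l2'; lbfgs (the default) rejects 'l1'
    ("lr", LogisticRegression(solver="liblinear", max_iter=1000, random_state=10)),
])

# Reduced grid (assumption) so the example runs in a few seconds
param_grid = {
    "features__pca__n_components": [3, 4],
    "lr__C": [0.1, 1],
    "lr__penalty": ["l1", "l2"],
    "scaler": [StandardScaler(), MinMaxScaler()],
}

search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)

# Count candidates whose mean CV score is nan, i.e. whose fits failed
n_failed = int(np.isnan(search.cv_results_["mean_test_score"]).sum())
print("failed candidates:", n_failed)
print("best params:", search.best_params_)
```

An alternative, if `lbfgs` should be kept for the `'l2'` runs, is to pass `param_grid` as a list of dictionaries so that each penalty is paired only with a compatible solver.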

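# Summary
> * `Pipeline`: chains the preprocessing steps and the final estimator into a single object, so the whole workflow can be fitted, cross-validated, and tuned as one unit.
> * `FeatureUnion`: applies several transformers to the same input in parallel and concatenates their outputs column-wise into one feature matrix.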
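
As a compact recap of the two components in the summary, the sketch below checks what `FeatureUnion` produces and how it nests inside a `Pipeline`. The input is random data with 8 columns (an assumption standing in for the 8 diabetes features), and the names `X_demo`, `union`, and `prep` are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Assumption: random stand-in for an 8-feature dataset
rng = np.random.default_rng(10)
X_demo = rng.normal(size=(100, 8))

# FeatureUnion: both transformers see the same input; outputs are stacked side by side
union = FeatureUnion([
    ("pca", PCA(n_components=3)),
    ("poly", PolynomialFeatures(include_bias=False)),
])
combined = union.fit_transform(X_demo)
# 3 PCA columns + 44 degree-2 polynomial columns (8 linear + 36 quadratic terms)
print(combined.shape)  # (100, 47)

# Pipeline: steps run in sequence; here the union feeds a scaler
prep = Pipeline([("features", union), ("scaler", StandardScaler())])
scaled = prep.fit_transform(X_demo)
print(scaled.shape)  # (100, 47)
```

Nested parameters are addressed with double underscores, which is why the grid above uses keys such as `features__pca__n_components`.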