# Benchmark with Sklearn and DeepFeatX
In this notebook we use the deepfeatx library to extract features from the images and scikit-learn to train a simple classifier on them. The goal is to provide a simple baseline against which more advanced strategies can be compared. By default, the deepfeatx feature extractor is based on the ResNet50 model.

In [1]:
!pip install deepfeatx --quiet

In [2]:
from deepfeatx.image import ImageFeatureExtractor
fe = ImageFeatureExtractor()

In [3]:
train = fe.extract_features_from_directory(
    '../input/labeled-chest-xray-images/chest_xray/train',
    classes_as_folders=True,
    export_class_names=True,
)
test = fe.extract_features_from_directory(
    '../input/labeled-chest-xray-images/chest_xray/test',
    classes_as_folders=True,
    export_class_names=True,
)

In [4]:
train.head()

In [5]:
test.head()

In [17]:
# Separate the feature columns from the file paths and the class labels.
X_train, y_train = train.drop(['filepaths', 'classes'], axis=1), train['classes']
X_test, y_test = test.drop(['filepaths', 'classes'], axis=1), test['classes']

In [18]:
from sklearn.linear_model import LogisticRegression
lr = LogisticRegression(solver='liblinear').fit(X_train, y_train)

In [19]:
lr.score(X_test, y_test)

In [21]:
from sklearn.metrics import roc_auc_score, classification_report, confusion_matrix
# Column 1 of predict_proba holds the probability of lr.classes_[1].
roc_auc_score(y_test, lr.predict_proba(X_test)[:, 1])
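The ROC AUC above relies on column 1 of `predict_proba` matching the class that is treated as positive. The following is a minimal sketch of how to check this, using synthetic data instead of the extracted features. The names `X_demo`, `y_demo`, `clf` and the label strings `NORMAL` and `PNEUMONIA` are placeholders assumed for illustration, not values taken from the notebook output.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for the extracted features (assumption: two string labels).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
noisy_signal = X_demo[:, 0] + 0.5 * rng.normal(size=200)
y_demo = np.where(noisy_signal > 0, "PNEUMONIA", "NORMAL")

clf = LogisticRegression(solver="liblinear").fit(X_demo, y_demo)

# classes_ is sorted, so column 1 of predict_proba belongs to classes_[1].
positive_label = clf.classes_[1]
positive_scores = clf.predict_proba(X_demo)[:, 1]

# Binarize the labels explicitly so the positive class is unambiguous.
auc_demo = roc_auc_score(y_demo == positive_label, positive_scores)
print(positive_label, round(auc_demo, 3))
```

In the notebook itself, printing `lr.classes_` should show which label the reported AUC treats as positive.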

In [22]:
print(classification_report(y_test, lr.predict(X_test)))

In [24]:
import seaborn as sns
cm = confusion_matrix(y_test, lr.predict(X_test))
sns.heatmap(cm, annot=True, fmt='g')
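The heatmap above shows only the integer indices 0 and 1 on its axes. The sketch below shows how the rows and columns of a scikit-learn confusion matrix map to class labels. It uses small hand-written arrays, and the label strings are placeholders assumed for illustration.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical labels and predictions, for illustration only.
labels_demo = ["NORMAL", "PNEUMONIA"]
y_true_demo = np.array(["NORMAL", "NORMAL", "PNEUMONIA", "PNEUMONIA", "PNEUMONIA"])
y_pred_demo = np.array(["NORMAL", "PNEUMONIA", "PNEUMONIA", "PNEUMONIA", "NORMAL"])

# Rows are true classes and columns are predicted classes,
# both in the order given by `labels`.
cm_demo = confusion_matrix(y_true_demo, y_pred_demo, labels=labels_demo)

for true_label, row in zip(labels_demo, cm_demo):
    print(true_label, row.tolist())
```

Passing the same label list to `sns.heatmap` through its `xticklabels` and `yticklabels` arguments would put the class names on the plot axes.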