# René Parlange, MSc
### 📚 Machine Learning Course, PhD in Computer Science
#### 🎓 Instructor: Juan Carlos Cuevas Tello, PhD
#### 🏛 Universidad Autónoma de San Luis Potosí (UASLP)

🔗 [GitHub Repository](https://github.com/parlange)

## Restricted Boltzmann Machines using scikit-learn
### Dataset: wine (UCI)
### Note: even though the dataset is continuous, the RBM is still Bernoulli-Bernoulli (binary visible and hidden units), so the features are min-max scaled to [0, 1] before training
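
The note above can be illustrated with a minimal sketch. It assumes that scaled feature values in [0, 1] are fed to `BernoulliRBM` in place of binary inputs, and that `transform` returns hidden-unit activation probabilities. The names `X_demo`, `rbm_demo`, and `H_demo`, and the settings `n_components=4` and `n_iter=5`, are hypothetical and chosen only for illustration; they are not the hyperparameters used in the notebook cell.

```python
import numpy as np
from sklearn import datasets
from sklearn.neural_network import BernoulliRBM
from sklearn.preprocessing import minmax_scale

# Scale every continuous wine feature into [0, 1] so it can stand in
# for a Bernoulli visible unit
X_demo = minmax_scale(datasets.load_wine().data, feature_range=(0, 1))

# Hypothetical settings, for illustration only
rbm_demo = BernoulliRBM(n_components=4, n_iter=5, learning_rate=0.06, random_state=0)

# fit_transform trains the RBM, then returns the hidden-unit
# activation probabilities for each sample
H_demo = rbm_demo.fit_transform(X_demo)

print("visible range:", X_demo.min(), X_demo.max())
print("hidden features shape:", H_demo.shape)
print("hidden values in [0, 1]:", bool(np.all((H_demo >= 0) & (H_demo <= 1))))
```

The hidden activations are what the logistic regression in the pipeline receives as input features, one column per hidden unit.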

In [None]:
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import minmax_scale
from sklearn import linear_model
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline
from sklearn import metrics
import time

# Load data
wine = datasets.load_wine()
X = np.asarray(wine.data, "float32")
Y = wine.target

# Scale data
X = minmax_scale(X, feature_range=(0, 1))

# Split data
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=0)

# Model components and pipeline
logistic = linear_model.LogisticRegression(solver='newton-cg', tol=1)
rbm = BernoulliRBM(random_state=0, verbose=True)
rbm_features_classifier = Pipeline(steps=[("rbm", rbm), ("logistic", logistic)])

# Set hyperparameters
rbm.learning_rate = 0.06
rbm.n_iter = 10
rbm.n_components = 10
logistic.C = 6000

# Train rbm_features_classifier and record its training time
start_time_rbm = time.time()
rbm_features_classifier.fit(X_train, Y_train)
elapsed_rbm = time.time() - start_time_rbm

# Train raw_data_classifier and record its training time
start_time_raw = time.time()
raw_data_classifier = linear_model.LogisticRegression(solver='newton-cg', tol=1, C=100.0)
raw_data_classifier.fit(X_train, Y_train)
elapsed_raw = time.time() - start_time_raw

# Evaluate models
Y_pred = rbm_features_classifier.predict(X_test)
print("\nLogistic regression using RBM features:\n%s\n" % metrics.classification_report(Y_test, Y_pred, zero_division=1))

Y_pred = raw_data_classifier.predict(X_test)
print("Logistic regression using raw data:\n%s\n" % metrics.classification_report(Y_test, Y_pred, zero_division=1))

# Report training times only (evaluation and the other model's training are excluded)
print("\nExecution time for rbm_features_classifier:", elapsed_rbm, "seconds")
print("\nExecution time for raw_data_classifier:", elapsed_raw, "seconds")


[BernoulliRBM] Iteration 1, pseudo-likelihood = -8.44, time = 0.00s
[BernoulliRBM] Iteration 2, pseudo-likelihood = -8.27, time = 0.00s
[BernoulliRBM] Iteration 3, pseudo-likelihood = -8.19, time = 0.00s
[BernoulliRBM] Iteration 4, pseudo-likelihood = -8.18, time = 0.00s
[BernoulliRBM] Iteration 5, pseudo-likelihood = -8.15, time = 0.00s
[BernoulliRBM] Iteration 6, pseudo-likelihood = -8.18, time = 0.00s
[BernoulliRBM] Iteration 7, pseudo-likelihood = -8.17, time = 0.00s
[BernoulliRBM] Iteration 8, pseudo-likelihood = -8.17, time = 0.00s
[BernoulliRBM] Iteration 9, pseudo-likelihood = -8.20, time = 0.00s
[BernoulliRBM] Iteration 10, pseudo-likelihood = -8.20, time = 0.00s

Logistic regression using RBM features:
              precision    recall  f1-score   support

           0       0.87      0.93      0.90        14
           1       0.67      0.88      0.76        16
           2       1.00      0.00      0.00         6

    accuracy                           0.75        36
   mac