<h1>Text Classification using Regularized Logistic Regression<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Regularization-Theory" data-toc-modified-id="Regularization-Theory-1">Regularization Theory</a></span></li><li><span><a href="#Machine-Learning-Project-Lifecycle:-Third-Iteration" data-toc-modified-id="Machine-Learning-Project-Lifecycle:-Third-Iteration-2">Machine Learning Project Lifecycle: Third Iteration</a></span><ul class="toc-item"><li><span><a href="#Problem-Statement" data-toc-modified-id="Problem-Statement-2.1">Problem Statement</a></span></li><li><span><a href="#Training-Data" data-toc-modified-id="Training-Data-2.2">Training Data</a></span></li><li><span><a href="#Preprocessing-+-Feature-Engineering" data-toc-modified-id="Preprocessing-+-Feature-Engineering-2.3">Preprocessing + Feature Engineering</a></span></li><li><span><a href="#Machine-Learning-Algorithm:-Logistic-Regression" data-toc-modified-id="Machine-Learning-Algorithm:-Logistic-Regression-2.4">Machine Learning Algorithm: Logistic Regression</a></span><ul class="toc-item"><li><span><a href="#Using-Sklearn-Implementation-for-Student-Loan-Class-Prediction" data-toc-modified-id="Using-Sklearn-Implementation-for-Student-Loan-Class-Prediction-2.4.1">Using Sklearn Implementation for Student Loan Class Prediction</a></span></li><li><span><a href="#Custom-Implementation-for-Student-Loan-Class-Prediction-using-l2-penalty" data-toc-modified-id="Custom-Implementation-for-Student-Loan-Class-Prediction-using-l2-penalty-2.4.2">Custom Implementation for Student Loan Class Prediction using l2 penalty</a></span></li><li><span><a href="#util-fns" data-toc-modified-id="util-fns-2.4.3">util fns</a></span></li><li><span><a href="#Evaluate-different-hyper-parameters-on-test-dataset" data-toc-modified-id="Evaluate-different-hyper-parameters-on-test-dataset-2.4.4">Evaluate different hyper parameters on test dataset</a></span></li></ul></li><li><span><a href="#Model-Evaluation" data-toc-modified-id="Model-Evaluation-2.5">Model Evaluation</a></span></li><li><span><a href="#Quality-Metrics" data-toc-modified-id="Quality-Metrics-2.6">Quality Metrics</a></span></li><li><span><a href="#Model-Evaluation-on-Test-Dataset" data-toc-modified-id="Model-Evaluation-on-Test-Dataset-2.7">Model Evaluation on Test Dataset</a></span></li></ul></li><li><span><a href="#Homework" data-toc-modified-id="Homework-3">Homework</a></span></li><li><span><a href="#Resources" data-toc-modified-id="Resources-4">Resources</a></span></li></ul></div>

<img src="../images/classification.png" alt="Classification" style="width: 700px;"/>

## Regularization Theory

## Machine Learning Project Lifecycle: Third Iteration

### Problem Statement

Classify the Financial Consumer Complaints into different Product Categories given consumer complaint text.

**Product Categories**

- Credit reporting, repair, or other
- Debt collection
- Student loan
- Money transfer, virtual currency, or money service
- Bank account or service

### Training Data

[Kaggle: Consumer Complaint Database](https://www.kaggle.com/selener/consumer-complaint-database)

In [1]:
import pandas as pd

In [2]:
complaints_training_dataset = pd.read_csv('../datasets/consumer_complaints_training_dataset.csv')

In [3]:
complaints_training_dataset.head()

Unnamed: 0,Product,Complaint_text
0,"Credit reporting, repair, or other","My name is XXXX XXXX XXXX , not XXXX X..."
1,"Credit reporting, repair, or other",I was shocked when I reviewed my credit report...
2,"Credit reporting, repair, or other",Equifax misused of credit file. Disputing acco...
3,"Credit reporting, repair, or other",I am disturbed that you continue to list the v...
4,"Credit reporting, repair, or other",I went to multiple different credit report web...


In [4]:
complaints_training_dataset.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20000 entries, 0 to 19999
Data columns (total 2 columns):
 #   Column          Non-Null Count  Dtype 
---  ------          --------------  ----- 
 0   Product         20000 non-null  object
 1   Complaint_text  20000 non-null  object
dtypes: object(2)
memory usage: 312.6+ KB


**Q) What is the distribution of complaints for each product type?**

In [5]:
complaints_training_dataset.Product.unique()

array(['Credit reporting, repair, or other', 'Debt collection',
       'Student loan',
       'Money transfer, virtual currency, or money service',
       'Bank account or service'], dtype=object)

In [6]:
complaints_training_dataset\
    .groupby('Product')\
    [['Complaint_text']]\
    .count()\
    .rename(columns={'Complaint_text': 'Count'})\
    .sort_values('Count', ascending=False)

Unnamed: 0_level_0,Count
Product,Unnamed: 1_level_1
Bank account or service,4000
"Credit reporting, repair, or other",4000
Debt collection,4000
"Money transfer, virtual currency, or money service",4000
Student loan,4000


**Q) Find out the Occurances of Duplicate Text messages if any?**

In [7]:
complaints_training_dataset['Complaint_text'].nunique()

19913

In [8]:
duplicate_complaints = complaints_training_dataset['Complaint_text']\
    .value_counts()\
    [complaints_training_dataset['Complaint_text'].value_counts() > 2].index

In [9]:
len(duplicate_complaints)

9

### Preprocessing + Feature Engineering

In [10]:
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split

RANDOM_STATE = 19

In [11]:
count_vectorizer = CountVectorizer(stop_words='english', max_features=5000)

In [12]:
X_train, X_test, y_train, y_test = train_test_split(
    complaints_training_dataset['Complaint_text'],
    complaints_training_dataset['Product'],
    test_size=.2,
    stratify=complaints_training_dataset['Product'],
    random_state=RANDOM_STATE)

In [13]:
X_train.shape, X_test.shape, y_train.shape, y_test.shape

((16000,), (4000,), (16000,), (4000,))

In [14]:
X_train_count_vectorizer = count_vectorizer.fit_transform(X_train)
X_test_count_vectorizer = count_vectorizer.transform(X_test)

In [15]:
len(count_vectorizer.get_feature_names())

5000

In [16]:
count_vectorizer.get_feature_names()[:10]

['00', '000', '10', '100', '1000', '10000', '100000', '1005', '11', '110']

In [17]:
list(count_vectorizer.vocabulary_.items())[:10]

[('xxxx', 4976),
 ('account', 322),
 ('listed', 2727),
 ('credit', 1279),
 ('report', 3828),
 ('experian', 1842),
 ('paid', 3234),
 ('closed', 1021),
 ('2007', 63),
 ('like', 2712)]

In [18]:
X_train_count_vectorizer.shape, X_test_count_vectorizer.shape

((16000, 5000), (4000, 5000))

### Machine Learning Algorithm: Logistic Regression

#### Using Sklearn Implementation for Student Loan Class Prediction

In [19]:
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

In [20]:
sklearn_binary_classifier = LogisticRegression(penalty='none',
                                               max_iter=101,
                                               random_state=RANDOM_STATE)

In [21]:
sklearn_binary_classifier.fit(X_train_count_vectorizer, y_train == 'Student loan')

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html.
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(


LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
                   intercept_scaling=1, l1_ratio=None, max_iter=101,
                   multi_class='auto', n_jobs=None, penalty='none',
                   random_state=19, solver='lbfgs', tol=0.0001, verbose=0,
                   warm_start=False)

In [22]:
sklearn_binary_classifier_predictions = sklearn_binary_classifier.predict(X_test_count_vectorizer)

In [23]:
sklearn_binary_classifier_score = accuracy_score(y_test == 'Student loan', sklearn_binary_classifier_predictions)
sklearn_binary_classifier_score

0.954

#### Custom Implementation for Student Loan Class Prediction using l2 penalty

**Estimating Conditional Probability using Link Function**

$ \displaystyle P(y_i = +1 | \mathbf{x}_i,\mathbf{w}) = \frac{1}{1 + \exp(-\mathbf{w}^T h(\mathbf{x}_i))} $

In [24]:
import numpy as np

def sigmoid(scores):
    return 1.0 / (1 + np.exp(-scores))

def predict_probability(feature_matrix, coefficients):
    scores = np.dot(feature_matrix, coefficients)
    predictions = sigmoid(scores)
    return predictions

In [25]:
dummy_feature_matrix = np.array([[1.,2.,3.], [1.,-1.,-1]])
dummy_coefficients = np.array([1., 3., -1.])

correct_scores      = np.array( [ 1.*1. + 2.*3. + 3.*(-1.),          1.*1. + (-1.)*3. + (-1.)*(-1.) ] )
correct_predictions = np.array( [ 1./(1+np.exp(-correct_scores[0])), 1./(1+np.exp(-correct_scores[1])) ] )

print('The following outputs must match ')
print('------------------------------------------------')
print('correct_predictions           =', correct_predictions)
print('output of predict_probability =', predict_probability(dummy_feature_matrix, dummy_coefficients))

The following outputs must match 
------------------------------------------------
correct_predictions           = [0.98201379 0.26894142]
output of predict_probability = [0.98201379 0.26894142]


**Compute derivative with respect to a single coefficient**

- coefficients($w_1 .. w_j$) with **L1 Penalty** will be derived using

$
\displaystyle \frac{\partial\ell}{\partial w_j} = \sum_{i=1}^N h_j(\mathbf{x}_i)\left(\mathbf{1}[y_i = +1] - P(y_i = +1 | \mathbf{x}_i, \mathbf{w})\right) \color{red}{-\lambda w_j }
$

- coefficients($w_1 .. w_j$) with **L2 Penalty** will be derived using

$
\displaystyle \frac{\partial\ell}{\partial w_j} = \sum_{i=1}^N h_j(\mathbf{x}_i)\left(\mathbf{1}[y_i = +1] - P(y_i = +1 | \mathbf{x}_i, \mathbf{w})\right) \color{red}{-2\lambda w_j }
$

- For intercept ($w0$) term

$
\displaystyle \frac{\partial\ell}{\partial w_0} = \sum_{i=1}^N h_j(\mathbf{x}_i)\left(\mathbf{1}[y_i = +1] - P(y_i = +1 | \mathbf{x}_i, \mathbf{w})\right)
$

We will now write a function that computes the derivative of log likelihood with respect to a single coefficient $w_j$. The function accepts two arguments:
* `errors` vector containing $\mathbf{1}[y_i = +1] - P(y_i = +1 | \mathbf{x}_i, \mathbf{w})$ for all $i$.
* `feature` vector containing $h_j(\mathbf{x}_i)$  for all $i$.
* `coefficient` containing the current value of coefficient $w_j$.
* `is_intercept` boolean value indicating whether the given coefficient is intercept ($w_0$) or not.
* `penalty_type` value represeting the type of regularization($l1 \: or \: l2$) to use .
* `penalty_value` constant penalty value $\lambda$.

In [45]:
def feature_derivative(errors, feature, coefficient=None, is_intercept=None,
                       penalty_type=None, penalty_value=None):
    derivative = np.dot(errors, feature)
    if penalty_type is not None and not is_intercept:
        if penalty_type == 'l2':
            # TODO: Check if 2 * penalty_value * coefficient is correct or penalty_value * (coefficient ** 2)
            derivative -= 2 * penalty_value * coefficient
        elif penalty_type == 'l1':
            derivative -= penalty_value * coefficient
    return derivative

**Compute Regularized log likelihood will be given by**

- For $l1$ norm regularization

$ \ell\ell(\mathbf{w}) = \sum_{i=1}^N \Big( (\mathbf{1}[y_i = +1] - 1)\mathbf{w}^T h(\mathbf{x}_i) - \ln\left(1 + \exp(-\mathbf{w}^T h(\mathbf{x}_i))\right) \Big) \color{red}{-\lambda\|\mathbf{w}\|}$

- For $l2$ regularization

$ \ell\ell(\mathbf{w}) = \sum_{i=1}^N \Big( (\mathbf{1}[y_i = +1] - 1)\mathbf{w}^T h(\mathbf{x}_i) - \ln\left(1 + \exp(-\mathbf{w}^T h(\mathbf{x}_i))\right) \Big) \color{red}{-\lambda\|\mathbf{w}\|_2^2}$

In [28]:
def compute_log_likelihood(feature_matrix, target_labels, target_label,
                           coefficients, penalty_value=None, penalty_type=0):
    indicator = (target_labels == target_label)
    scores = np.dot(feature_matrix, coefficients)
    if penalty_type == 'l2':
        likelihood = np.sum((indicator - 1) * scores - np.log(1 + np.exp(-scores))) \
            - penalty_value * np.sum(coefficients[1:] ** 2)
    elif penalty_type == 'l1':
        likelihood = np.sum((indicator - 1) * scores - np.log(1 + np.exp(-scores))) \
            - penalty_value * np.sum(coefficients[1:])
    else:
        likelihood = np.sum((indicator - 1) * scores - np.log(1 + np.exp(-scores)))
    return likelihood

In [29]:
dummy_feature_matrix = np.array([[1.,2.,3.], [1.,-1.,-1]])
dummy_coefficients = np.array([1., 3., -1.])
dummy_sentiment = np.array([-1, 1])

correct_indicators = np.array([ -1==+1, 1==+1])
correct_scores      = np.array( [ 1.*1. + 2.*3. + 3.*(-1.),  1.*1. + (-1.)*3. + (-1.)*(-1.) ] )
correct_first_term  = np.array( [ (correct_indicators[0]-1)*correct_scores[0],
                                 (correct_indicators[1]-1)*correct_scores[1] ] )
correct_second_term = np.array( [ np.log(1. + np.exp(-correct_scores[0])), 
                                 np.log(1. + np.exp(-correct_scores[1])) ] )

correct_ll          =      sum( [ correct_first_term[0]-correct_second_term[0],
                                 correct_first_term[1]-correct_second_term[1] ] ) 

print('The following outputs must match ')
print('------------------------------------------------')
print('correct_log_likelihood           =', correct_ll)
print('output of compute_log_likelihood =', compute_log_likelihood(dummy_feature_matrix,
                                                                   dummy_sentiment,
                                                                   1,
                                                                   dummy_coefficients))

The following outputs must match 
------------------------------------------------
correct_log_likelihood           = -5.331411615436032
output of compute_log_likelihood = -5.331411615436032


**Any Questions????**

**Train Binary Logistic Regression Classifier model using Gradient Ascent**

In [36]:
def train_binary_lr_classifier(
        features_matrix, target_labels, target_label,
        initial_coefficients, step_size, max_iterations,
        penalty_type=None, penalty_value=0, debug=False):
    coefficients = np.array(initial_coefficients)
    likelihood_values = []
    for iteration in range(max_iterations):
        predictions = predict_probability(features_matrix, coefficients)
        
        indicator = (target_labels == target_label)
        
        errors = indicator - predictions
        
        for j in range(len(coefficients)):
            is_intercept = (j == 0)
            derivative = feature_derivative(errors, features_matrix[:, j],
                                            coefficients[j], is_intercept,
                                            penalty_type, penalty_value)
            coefficients[j] += step_size * derivative

        lp = compute_log_likelihood(features_matrix, target_labels,
                                    target_label, coefficients,
                                    penalty_type, penalty_value)
        likelihood_values.append(lp)
        if debug:
            if (iteration <= 100 and iteration % 10 == 0)\
                or (iteration <= 1000 and iteration % 100 == 0)\
                or (iteration <= 10000 and iteration % 1000 == 0)\
                or iteration % 10000 == 0:
                print('----------------------------------')
                print(f'Iteration: {iteration} -> Likelihood value: {lp} for {target_label} classifier.')
                predicted_probabilities = predict_probability(features_matrix, coefficients)
                predicted_classes = predicted_probabilities > .5
                correct_predictions = predicted_classes == (target_labels == target_label)
                print(f'Minimum Probability:{predictions.min()},',
                      f'Maximum Probability:{predictions.max()},',
                      f'Current Accuracy: {correct_predictions.sum() / len(target_labels)}')
    if not debug:
        predicted_probabilities = predict_probability(features_matrix, coefficients)
        predicted_classes = predicted_probabilities > .5
        correct_predictions = predicted_classes == (target_labels == target_label)
        print(f'Minimum Probability:{predictions.min()},',
              f'Maximum Probability:{predictions.max()},',
              f'Current Accuracy: {correct_predictions.sum() / len(target_labels)}')
    return coefficients, likelihood_values

#### util fns

In [85]:
import pickle
import time

from sklearn.metrics import accuracy_score

LAST_RESULTS_FILE_NAME = None

def count_vectorized_features_to_features_matrix(count_vectorized_features):
    constant_feature = np.ones((count_vectorized_features.shape[0], 1))
    return np.hstack((constant_feature, count_vectorized_features.toarray()))

def save_experiment_results(results):
    file_name_prefix =  'lr_hyper_params_experiments_results'
    file_name = f'{file_name_prefix}_{time.monotonic_ns()}.pkl'
    print(file_name)
    with open(file_name, 'wb') as results_file:
        pickle.dump(results, results_file)
    return file_name
        
def classify(features_matrix, coeffs):
    prob_predictions = predict_probability(features_matrix, coeff)
    return prob_predictions > .5

def load_results(file_path):
    with open(file_path, 'rb') as file:
        return pickle.load(file)

In [86]:
hyper_params_experiments_results = load_results('lr_experiment_results_876800194845060.pkl')
    
print(len(hyper_params_experiments_results))

hyper_params_experiments_results.append(
    {'max_iterations': 101, 'penalty_type': 'l2', 'penalty_value': 100})
hyper_params_experiments_results.append(
    {'max_iterations': 501, 'penalty_type': 'l2', 'penalty_value': 100})
hyper_params_experiments_results.append(
    {'max_iterations': 1001, 'penalty_type': 'l2', 'penalty_value': 100})
hyper_params_experiments_results.append(
    {'max_iterations': 101, 'penalty_type': 'l1', 'penalty_value': 100})
hyper_params_experiments_results.append(
    {'max_iterations': 501, 'penalty_type': 'l1', 'penalty_value': 100})
hyper_params_experiments_results.append(
    {'max_iterations': 1001, 'penalty_type': 'l1', 'penalty_value': 100})

9


In [87]:
import datetime

# Hyper Params

step_size = 1e-5

# Target Variables
target_labels, target_label = y_train, 'Student loan'

# Feature Matrix & Initial Coefficients
X_train_features_matrix = count_vectorized_features_to_features_matrix(X_train_count_vectorizer)
X_test_features_matrix = count_vectorized_features_to_features_matrix(X_test_count_vectorizer)

initial_coefficients = np.zeros(X_train_features_matrix.shape[1])

print(X_train_features_matrix.shape, initial_coefficients.shape, X_test_features_matrix.shape)

for hyper_params in hyper_params_experiments_results:
    if 'coeffs' in hyper_params:
        continue
    print('---------------------------------')
    print(hyper_params)
    start_time = datetime.datetime.now()
    coeffs, likelihood_values = train_binary_lr_classifier(
        X_train_features_matrix,
        target_labels,
        target_label,
        initial_coefficients,
        step_size = step_size,
        max_iterations = hyper_params['max_iterations'],
        penalty_type = hyper_params['penalty_type'],
        penalty_value = hyper_params['penalty_value']
    )
    time_delta = datetime.datetime.now() - start_time
    hyper_params['coeffs'] = coeffs
    hyper_params['likelihood_values'] = likelihood_values
    hyper_params['time_delta'] = time_delta
    hyper_params['train_dataset_accuracy'] = accuracy_score(
        y_train == 'Student loan',
        classify(X_train_features_matrix, coeff))
    hyper_params['test_dataset_accuracy'] = accuracy_score(
        y_test == 'Student loan',
        classify(X_test_features_matrix, coeff))

LAST_RESULTS_FILE_NAME = save_experiment_results(hyper_params_experiments_results)

(16000, 5001) (5001,) (4000, 5001)
---------------------------------
{'max_iterations': 101, 'penalty_type': 'l2', 'penalty_value': 100}
Minimum Probability:1.5642499285839832e-21, Maximum Probability:1.0, Current Accuracy: 0.9500625
---------------------------------
{'max_iterations': 501, 'penalty_type': 'l2', 'penalty_value': 100}
Minimum Probability:9.356462064424788e-21, Maximum Probability:1.0, Current Accuracy: 0.9521875
---------------------------------
{'max_iterations': 1001, 'penalty_type': 'l2', 'penalty_value': 100}
Minimum Probability:4.66073532162223e-20, Maximum Probability:1.0, Current Accuracy: 0.9525
---------------------------------
{'max_iterations': 101, 'penalty_type': 'l1', 'penalty_value': 100}
Minimum Probability:1.8021828697023274e-22, Maximum Probability:1.0, Current Accuracy: 0.9506875
---------------------------------
{'max_iterations': 501, 'penalty_type': 'l1', 'penalty_value': 100}
Minimum Probability:8.788829242675276e-24, Maximum Probability:1.0, Curr

#### Evaluate different hyper parameters on test dataset

In [88]:
from sklearn.metrics import accuracy_score

X_train_features_matrix = count_vectorized_features_to_features_matrix(X_train_count_vectorizer)
X_test_features_matrix = count_vectorized_features_to_features_matrix(X_test_count_vectorizer)

for hyper_params_result in hyper_params_experiments_results:
    coeff = hyper_params_result['coeffs']
    hyper_params_result['train_dataset_accuracy'] = accuracy_score(
        y_train == 'Student loan',
        classify(X_train_features_matrix, coeff))
    hyper_params_result['test_dataset_accuracy'] = accuracy_score(
        y_test == 'Student loan',
        classify(X_test_features_matrix, coeff))
    print(f"max_iter: {hyper_params_result['max_iterations']},",
          f"penalty: {hyper_params_result['penalty_type']},",
          f"lambda: {hyper_params_result['penalty_value']},",
          f"train accuracy: {hyper_params_result['train_dataset_accuracy'] * 100},",
          f"test accuracy: {hyper_params_result['test_dataset_accuracy'] * 100}")

max_iter: 101, penalty: l2, lambda: 0, train accuracy: 95.13125, test accuracy: 94.69999999999999
max_iter: 501, penalty: l2, lambda: 0, train accuracy: 95.92500000000001, test accuracy: 95.575
max_iter: 1001, penalty: l2, lambda: 0, train accuracy: 96.34375, test accuracy: 95.75
max_iter: 101, penalty: l2, lambda: 10.0, train accuracy: 95.1375, test accuracy: 94.675
max_iter: 501, penalty: l2, lambda: 1000.0, train accuracy: 90.84375, test accuracy: 90.55


  return 1.0 / (1 + np.exp(-scores))


max_iter: 1001, penalty: l2, lambda: 100000.0, train accuracy: 80.0, test accuracy: 80.0
max_iter: 101, penalty: l1, lambda: 10.0, train accuracy: 95.1375, test accuracy: 94.69999999999999
max_iter: 501, penalty: l1, lambda: 1000.0, train accuracy: 93.33125, test accuracy: 92.77499999999999
max_iter: 1001, penalty: l1, lambda: 100000.0, train accuracy: 80.0, test accuracy: 80.0
max_iter: 101, penalty: l2, lambda: 100, train accuracy: 95.00625000000001, test accuracy: 94.5
max_iter: 501, penalty: l2, lambda: 100, train accuracy: 95.21875, test accuracy: 94.825
max_iter: 1001, penalty: l2, lambda: 100, train accuracy: 95.25, test accuracy: 94.675
max_iter: 101, penalty: l1, lambda: 100, train accuracy: 95.06875000000001, test accuracy: 94.675
max_iter: 501, penalty: l1, lambda: 100, train accuracy: 95.56875000000001, test accuracy: 95.19999999999999
max_iter: 1001, penalty: l1, lambda: 100, train accuracy: 95.76249999999999, test accuracy: 95.25


### Model Evaluation

In [36]:
from sklearn.model_selection import cross_val_score

In [37]:
cv_scores = cross_val_score(LogisticRegression(penalty='none', max_iter=101, random_state=RANDOM_STATE),
                            X_train_count_vectorizer,
                            y_train,
                            cv=5)

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html.
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html.
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html.
Please also refer to the documentation for alternative solver 

In [38]:
print(cv_scores.mean())
print(cv_scores)

0.8332499999999999
[0.82875   0.8365625 0.825625  0.841875  0.8334375]


### Quality Metrics

In [39]:
from sklearn.metrics import (accuracy_score,
                             confusion_matrix)

import seaborn as sns;
sns.set()

import matplotlib.pyplot as plt
%matplotlib inline

### Model Evaluation on Test Dataset

- Note: Retrain the model using full training [dataset](../datasets/consumer_complaints_training_dataset.csv) & test using the test [dataset](../datasets/consumer_complaints_test_dataset.csv).

## Homework

## Resources