# Binary Classification with Logistic Regression

In this module, you will learn how to perform binary classification in scikit-learn with logistic regression. We will work with the iris dataset, a classic benchmark for classification algorithms. Finally, we will use the accuracy score to evaluate how well the trained model performs.

<b>Functions and attributes in this lecture: </b>
- `pandas` - Pandas package with alias `pd`
  - `.value_counts()` - Count how often each value occurs in a pandas Series
  - `.corr()` - Get the correlation matrix for a pandas DataFrame
- `sklearn.linear_model` - Submodule for linear models
  - `LogisticRegression()` - The logistic regression model
    - `.fit()` - Training the model on the data
    - `.predict()` - Predicting on new data using the model
    - `.predict_proba()` - Get the predicted class probabilities for new data using the model
- `sklearn.metrics` - Submodule for metrics used to evaluate models
  - `accuracy_score()` - Finding the accuracy score for a set of predictions
- `sklearn.datasets` - Submodule of sklearn for toy datasets
  - `load_iris()` - A function for loading the iris dataset

In [None]:
# Non-sklearn packages
import numpy as np
import pandas as pd

# Sklearn modules & functions
from sklearn import datasets
from sklearn.model_selection import train_test_split

## Working with the Iris Dataset

Let us begin by importing the iris dataset and checking it out!

In [None]:
# Loading the Iris dataset

# Some info about the dataset


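One possible way to fill in the loading cell is sketched below. It assumes scikit-learn 0.23 or newer, where `load_iris` accepts `as_frame=True` and returns pandas objects; the variable names `iris`, `X`, and `y` are our own choice.

```python
from sklearn import datasets

# Load the iris dataset as pandas objects (as_frame=True is an assumption
# about the installed scikit-learn version, see the note above)
iris = datasets.load_iris(as_frame=True)

X = iris.data    # DataFrame holding the four measurements per flower
y = iris.target  # Series holding the class label of each flower

# Some info about the dataset
print(X.shape)
print(iris.feature_names)
print(iris.target_names)
```

The full text description of the dataset is also available as `iris.DESCR`.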
In [None]:
# Describe the features


In [None]:
# Check out the datatypes of the features


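The two inspection cells above could look roughly like this. This is a sketch that reloads the data so it runs on its own; `summary` is a name introduced here for illustration.

```python
from sklearn import datasets

# Reload the data so this snippet is self-contained
iris = datasets.load_iris(as_frame=True)
X = iris.data

# Describe the features: count, mean, std, min, quartiles, max per column
summary = X.describe()
print(summary)

# Check out the datatypes of the features
print(X.dtypes)
```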
In [None]:
# Checking the values that the output can take


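To see which values the output takes and how often, `.value_counts()` can be applied to the target Series. A minimal sketch, with `class_counts` as an illustrative name:

```python
from sklearn import datasets

iris = datasets.load_iris(as_frame=True)
y = iris.target

# Count how many flowers belong to each class label
class_counts = y.value_counts()
print(class_counts)

# Map the integer labels back to species names
print(dict(enumerate(iris.target_names)))
```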
In [None]:
# Selecting only the first two classes


In [None]:
# Collect all the variables


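Since logistic regression is used here for binary classification, the third class has to be dropped. The sketch below keeps labels 0 and 1 with a boolean mask and then collects features and label in one DataFrame; the names `mask` and `df` and the column name `target` are assumptions made for this example.

```python
import pandas as pd
from sklearn import datasets

iris = datasets.load_iris(as_frame=True)

# Selecting only the first two classes (labels 0 and 1)
mask = iris.target.isin([0, 1])
X = iris.data[mask]
y = iris.target[mask]

# Collect all the variables in a single DataFrame
df = pd.concat([X, y.rename("target")], axis=1)
print(df.shape)
print(df.head())
```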
In [None]:
# Checking the correlation


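A correlation check on the two-class data might be written as follows. The sketch uses `iris.frame`, which bundles the features with a `target` column when the data is loaded with `as_frame=True`; `corr` is an illustrative name.

```python
from sklearn import datasets

iris = datasets.load_iris(as_frame=True)

# Keep the first two classes; iris.frame holds features plus a "target" column
df = iris.frame[iris.frame["target"].isin([0, 1])]

# Pairwise Pearson correlation between all columns
corr = df.corr()
print(corr)

# Correlation of each feature with the label, strongest first
print(corr["target"].drop("target").sort_values(ascending=False))
```

Features with a correlation close to 1 or -1 with the label are likely to be useful for a linear classifier such as logistic regression.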
In [None]:
# A small visualization


## Logistic Regression

We will now train a logistic regression model to classify iris flowers into one of two classes.

In [None]:
# Dividing up into training sets and testing sets


In [None]:
# Checking the shape of the data


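The split and the shape check could be done as in the sketch below. The values `test_size=0.2` and `random_state=42` and the use of `stratify=y` are choices made for this example, not settings prescribed by the lecture.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris(as_frame=True)
mask = iris.target.isin([0, 1])
X, y = iris.data[mask], iris.target[mask]

# Dividing up into training sets and testing sets;
# stratify=y keeps the class balance equal in both parts
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Checking the shape of the data
print(X_train.shape, X_test.shape)
print(y_train.shape, y_test.shape)
```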
In [None]:
# Importing the logistic regression classifier


In [None]:
# Creating a logistic regression instance


In [None]:
# Fit the logistic regression on the training data


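Importing, creating, and fitting the classifier can be sketched as below. The model uses scikit-learn's default settings, and the data preparation repeats the illustrative split from earlier so the snippet runs on its own.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Prepare the two-class data (same illustrative split settings as before)
iris = datasets.load_iris(as_frame=True)
mask = iris.target.isin([0, 1])
X, y = iris.data[mask], iris.target[mask]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Create a logistic regression instance with default hyperparameters
model = LogisticRegression()

# Fit the logistic regression on the training data
model.fit(X_train, y_train)

# Learned parameters: one weight per feature plus an intercept
print(model.classes_)
print(model.coef_)
print(model.intercept_)
```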
In [None]:
# Predict a single new instance


In [None]:
# Can also predict the probability for each class


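Predicting for a single flower might look like the following sketch. Here the "new" instance is simply the first row of the test set, which is an assumption made for illustration; note the double brackets, which keep the input two-dimensional as scikit-learn expects.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

iris = datasets.load_iris(as_frame=True)
mask = iris.target.isin([0, 1])
X, y = iris.data[mask], iris.target[mask]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
model = LogisticRegression().fit(X_train, y_train)

# Take one row as a one-row DataFrame (double brackets keep it 2-D)
new_flower = X_test.iloc[[0]]

# Predict a single new instance: returns the class label
pred = model.predict(new_flower)
print(pred)

# Predict the probability for each class: columns follow model.classes_
proba = model.predict_proba(new_flower)
print(proba)
```

Each row of the `predict_proba` output sums to 1, and `predict` returns the class with the larger probability.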
## Evaluating the model: Accuracy Score

We need to evaluate our logistic regression model. A common starting point is the accuracy score: the fraction of predictions that match the true labels.

In [None]:
# Predict the labels


In [None]:
# Find the accuracy score manually


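Accuracy is the number of correct predictions divided by the total number of predictions, so it can be computed by hand with a boolean comparison. A sketch under the same illustrative split as before, with `y_pred` and `manual_accuracy` as names chosen here:

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

iris = datasets.load_iris(as_frame=True)
mask = iris.target.isin([0, 1])
X, y = iris.data[mask], iris.target[mask]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
model = LogisticRegression().fit(X_train, y_train)

# Predict the labels for the whole test set
y_pred = model.predict(X_test)

# Find the accuracy score manually: True counts as 1 and False as 0,
# so the mean of the comparison is the fraction of correct predictions
manual_accuracy = (y_pred == y_test.to_numpy()).mean()
print(manual_accuracy)
```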
In [None]:
# Use the accuracy score function
