#  **End-to-End ML Model Building - Choosing a Family Planning Option**

---
## *Table of Contents*
---

**1.** [**Introduction**](#Section1)<br>

**2.** [**Problem Statement**](#Section2)<br>

**3.** [**Importing Libraries**](#Section3)<br>

**4.** [**Loading data using Pandas**](#Section4)<br>

**5.** [**Separating data into Train and Test sets**](#Section5)<br>

**6.** [**Model Building**](#Section6)<br>

**7.** [**Model Evaluation**](#Section7)<br>

---
<a name = Section1></a>
# **1. Introduction**
---

### Which birth control method is right for a couple?

From condoms to pills to IUDs, you have many choices when it comes to birth control, but not all methods are right for everyone. Before you settle on one form of contraception, weigh all the types of birth control and consider the facts. By asking yourself these questions and talking with your doctor, you can find the option that works best for you.

1. How effective is it?
2. Is it reversible?
3. Are the side effects tolerable?
4. Does it fit your personality and lifestyle?
5. Are you in a monogamous relationship?
6. Do you have health conditions?
7. Can you afford it?


---
<a name = Section2></a>
# **2. Problem Statement**
---

---
<a name = Section3></a>
# **3. Importing Libraries**
---

In [1]:
import pandas as pd
import numpy as np

from sklearn.model_selection import train_test_split

import matplotlib.pyplot as plt

from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

from sklearn.model_selection import GridSearchCV

from sklearn.metrics import accuracy_score, precision_score, recall_score, confusion_matrix, classification_report

---
<a name = Section4></a>
# **4. Loading data using Pandas**
---

In [2]:
data = pd.read_excel("FPData.xlsx")
data.head()

|   | AG | HLAP | COF | CM |
|---|----|------|-----|----|
| 0 | 1  | 1    | 1   | 1  |
| 1 | 4  | 1    | 1   | 1  |
| 2 | 4  | 1    | 1   | 1  |
| 3 | 4  | 1    | 1   | 1  |
| 4 | 5  | 1    | 1   | 1  |


In [3]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1120 entries, 0 to 1119
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   AG      1120 non-null   int64
 1   HLAP    1120 non-null   int64
 2   COF     1120 non-null   int64
 3   CM      1120 non-null   int64
dtypes: int64(4)
memory usage: 35.1 KB


---
<a name = Section5></a>
# **5. Separating data into train and test sets**
---

<a id=section5></a>
### 5.1 Separating Independent and Dependent variables

In [4]:
X = data.drop('CM', axis = 1)
y = data['CM']

In [5]:
X.shape

(1120, 3)

In [6]:
y.shape

(1120,)

<a id=section5></a>
### 5.2 Splitting data into train and test sets

In [7]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.20, random_state = 1)
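
As a quick sanity check on the split, the sketch below reproduces the 80/20 partition on a synthetic stand-in for the data. The frame `demo` and its random values are assumptions made only so the snippet runs without `FPData.xlsx`; the column names and row count mirror the `data.info()` output above. The `stratify` argument is an optional variation that the original cell does not use.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for FPData.xlsx: same shape and column names, made-up values.
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.integers(1, 6, size=(1120, 4)),
                    columns=["AG", "HLAP", "COF", "CM"])

X_demo = demo.drop("CM", axis=1)
y_demo = demo["CM"]

# stratify=y_demo keeps the class proportions of CM similar in both subsets.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.20, random_state=1, stratify=y_demo
)

print(X_tr.shape, X_te.shape)
```

With 1120 rows and `test_size = 0.20`, the test set holds 224 rows and the training set the remaining 896.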

---
<a name = Section6></a>
# **6. Model Building**
---

In [8]:
# Decision Tree
decision_tree = DecisionTreeClassifier(max_depth = 3, random_state = 3,
                                       splitter = "best", criterion = "gini")
scores = cross_val_score(estimator=decision_tree, X=X_train, y=y_train, cv=3)
print(scores)
print("Mean", scores.mean())

[0.84949833 0.88963211 0.87583893]
Mean 0.8716564536523684
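
The tree depth above is fixed at 3. A minimal sketch of how `GridSearchCV` (imported earlier) could be used to compare several depths is shown here. The synthetic data from `make_classification` and the candidate depths in `param_grid` are illustrative assumptions, not values from this notebook.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for X_train / y_train (3 features, like AG, HLAP, COF).
X_demo, y_demo = make_classification(n_samples=600, n_features=3, n_informative=3,
                                     n_redundant=0, random_state=0)

# Candidate depths are illustrative, not taken from the notebook.
param_grid = {"max_depth": [2, 3, 4, 5]}

search = GridSearchCV(
    estimator=DecisionTreeClassifier(random_state=3, criterion="gini"),
    param_grid=param_grid,
    cv=3,                 # same number of folds as the cross_val_score call
    scoring="accuracy",
)
search.fit(X_demo, y_demo)

print("Best depth:", search.best_params_["max_depth"])
print("Best mean CV accuracy:", search.best_score_)
```

On the real data, `X_train` and `y_train` would take the place of `X_demo` and `y_demo`.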


In [9]:
# Random Forest
# max_features="sqrt" matches the old "auto" setting for classifiers,
# which recent scikit-learn releases no longer accept.
model_rf = RandomForestClassifier(n_estimators=1200, oob_score=True, n_jobs=-1,
                                  random_state=50, max_features="sqrt",
                                  max_leaf_nodes=30)
scores = cross_val_score(estimator=model_rf, X=X_train, y=y_train, cv=3)
print(scores)
print("Mean", scores.mean())

[0.84949833 0.87625418 0.87583893]
Mean 0.8671971448452336


In [10]:
# K-Nearest Neighbors classifier
from sklearn.neighbors import KNeighborsClassifier

# Minkowski distance with p = 2 is the Euclidean distance
knn_classifier = KNeighborsClassifier(n_neighbors = 5, metric = 'minkowski', p = 2)
knn_classifier.fit(X_train, y_train)
y_pred_knn = knn_classifier.predict(X_test)
accuracy_score(y_test, y_pred_knn)

0.8660714285714286
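
The decision tree and random forest above are scored with 3-fold cross-validation, while the KNN score comes from the test set, so the three numbers are not directly comparable. A possible way to put all three models on the same footing is sketched below. The synthetic data and the smaller forest (`n_estimators=100`, chosen only to keep the demo fast) are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for X_train / y_train (3 features, like AG, HLAP, COF).
X_demo, y_demo = make_classification(n_samples=600, n_features=3, n_informative=3,
                                     n_redundant=0, random_state=0)

candidates = {
    "decision_tree": DecisionTreeClassifier(max_depth=3, random_state=3),
    "random_forest": RandomForestClassifier(n_estimators=100, max_leaf_nodes=30,
                                            random_state=50),
    "knn": KNeighborsClassifier(n_neighbors=5),
}

# Score every candidate with the same 3-fold cross-validation.
mean_scores = {}
for name, model in candidates.items():
    scores = cross_val_score(estimator=model, X=X_demo, y=y_demo, cv=3)
    mean_scores[name] = scores.mean()
    print(name, scores, "Mean", scores.mean())
```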

###  **Saving the model as a Pickle file**

In [16]:
import pickle

In [18]:
# The with-block closes the file once the model has been written
with open("knn.pkl", "wb") as f:
    pickle.dump(knn_classifier, f)
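
To use the saved model later, the file is read back with `pickle.load`. The sketch below shows the full round trip on a small KNN model fitted to synthetic data. The data and the temporary directory are assumptions made so the snippet runs on its own; only the file name `knn.pkl` comes from the cell above.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the training data (3 features, like AG, HLAP, COF).
X_demo, y_demo = make_classification(n_samples=200, n_features=3, n_informative=3,
                                     n_redundant=0, random_state=0)
model = KNeighborsClassifier(n_neighbors=5).fit(X_demo, y_demo)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "knn.pkl")

    # Save the fitted model.
    with open(path, "wb") as f:
        pickle.dump(model, f)

    # Load it back into a new object.
    with open(path, "rb") as f:
        restored = pickle.load(f)

# The restored model should give the same predictions as the original.
same_predictions = np.array_equal(model.predict(X_demo), restored.predict(X_demo))
print("Predictions match:", same_predictions)
```

Pickle files should only be loaded from trusted sources, and ideally with the same scikit-learn version that was used to save them.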

---
<a name = Section7></a>
# **7. Model Evaluation**
---
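
One way to evaluate a fitted classifier on the held-out test set is sketched below, using the metric functions imported in Section 3. The synthetic data is an assumption made so the snippet runs without `FPData.xlsx`, so the printed numbers say nothing about the real dataset. The split and KNN settings mirror the cells above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the data: 1120 rows, 3 features.
X_demo, y_demo = make_classification(n_samples=1120, n_features=3, n_informative=3,
                                     n_redundant=0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.20,
                                          random_state=1)

model = KNeighborsClassifier(n_neighbors=5, metric="minkowski", p=2).fit(X_tr, y_tr)
y_pred = model.predict(X_te)

# Rows of the confusion matrix are true classes, columns are predicted classes.
acc = accuracy_score(y_te, y_pred)
cm = confusion_matrix(y_te, y_pred)

print("Accuracy:", acc)
print(cm)
print(classification_report(y_te, y_pred))
```

Accuracy equals the sum of the diagonal of the confusion matrix divided by the number of test rows, while the classification report adds per-class precision and recall.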