# Develop a NLP Model in Python & Deploy It with Flask
- [Reference](https://towardsdatascience.com/develop-a-nlp-model-in-python-deploy-it-with-flask-step-by-step-744f3bdd7776)

# ML Model Building

In [1]:
import pandas as pd
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import classification_report

In [2]:
df = pd.read_csv('https://storage.googleapis.com/kagglesdsdata/datasets%2F483%2F982%2Fspam.csv?GoogleAccessId=gcp-kaggle-com@kaggle-161607.iam.gserviceaccount.com&Expires=1592885382&Signature=RiR%2FZJOj1JnYfPGVZEXMVCPZuDL8BkHp40BiHieZ4wn%2BScPtRfgq7MeORoOrNXjkv3%2FvN6evVd9QPxPWwqkWqe3AeUkkUm%2B%2B2EUTYYCyzen5ZX1PqTa4gOPXIeLpyIabB05srx%2FjyOKrT%2Bw9oN5BBc2xu6%2BEPoHpViU%2BbF02gUPbXVQBCZcKj2fiEyaKSYv1ordeMbZ5yaz5iWNLOaN8hsES9JBTcQMYKRVWHfWttC92b7XLNeooL43pUpiXxjtA8Y3MAn44%2BJbT%2FCSies7XKREb2omD6L4Z9C33htkhQQoqLfpZjlTZVokU%2BYyRVR9N5g8lm0SunQ5N4mmm4q9Pbw%3D%3D', encoding="latin-1")

In [3]:
df.head()

|   | v1 | v2 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 |
|---|----|----|------------|------------|------------|
| 0 | ham | Go until jurong point, crazy.. Available only ... | | | |
| 1 | ham | Ok lar... Joking wif u oni... | | | |
| 2 | spam | Free entry in 2 a wkly comp to win FA Cup fina... | | | |
| 3 | ham | U dun say so early hor... U c already then say... | | | |
| 4 | ham | Nah I don't think he goes to usf, he lives aro... | | | |


In [4]:
df.rename(columns = {'v1':'class','v2':'message'}, inplace = True)

In [5]:
df.drop(['Unnamed: 2', 'Unnamed: 3', 'Unnamed: 4'], axis=1, inplace=True)
df['label'] = df['class'].map({'ham': 0, 'spam': 1})

In [6]:
X = df['message']
y = df['label']

In [7]:
cv = CountVectorizer()
X = cv.fit_transform(X)  # Learn the vocabulary and convert each message to a sparse vector of token counts
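
To make the vectorization step concrete, here is a minimal sketch of what `CountVectorizer` produces on a two-message toy corpus. The names `toy_messages`, `toy_cv`, and `toy_counts` are illustrative assumptions and are not part of the notebook; the real notebook applies the same call to `df['message']`.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus, standing in for df['message']
toy_messages = ["free entry win win", "ok see you later"]

toy_cv = CountVectorizer()
toy_counts = toy_cv.fit_transform(toy_messages)  # sparse document-term matrix

# vocabulary_ maps each token to its column index; sort tokens by that index
vocab = sorted(toy_cv.vocabulary_, key=toy_cv.vocabulary_.get)
print(vocab)
# ['entry', 'free', 'later', 'ok', 'see', 'win', 'you']

# One row per message, one column per token; "win" appears twice in message 0
print(toy_counts.toarray())
# [[1 1 0 0 0 2 0]
#  [0 0 1 1 1 0 1]]
```

Each row is the bag-of-words representation that the classifier later consumes. Word order is discarded; only the counts are kept.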

In [8]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

In [9]:
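# Naive Bayes classifier
clf = MultinomialNB()
clf.fit(X_train, y_train)
clf.score(X_test, y_test)  # Mean accuracy on the test set (not displayed, since it is not the last expression in the cell)
y_pred = clf.predict(X_test)
print(classification_report(y_test, y_pred))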

              precision    recall  f1-score   support

           0       0.99      0.99      0.99      1587
           1       0.93      0.92      0.92       252

    accuracy                           0.98      1839
   macro avg       0.96      0.95      0.96      1839
weighted avg       0.98      0.98      0.98      1839
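
As a reading aid for the report above, the sketch below shows how precision and recall for the positive (spam) class follow from the confusion matrix. The labels `y_true_toy` and `y_pred_toy` are made-up illustrative values, not results from the notebook.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical labels: 0 = ham, 1 = spam
y_true_toy = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred_toy = [0, 0, 1, 1, 1, 0, 1, 0]

# For binary labels, ravel() yields the counts in the order tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_true_toy, y_pred_toy).ravel()

precision = tp / (tp + fp)  # share of predicted spam that really is spam
recall = tp / (tp + fn)     # share of actual spam that was caught

print(tn, fp, fn, tp)     # 3 1 1 3
print(precision, recall)  # 0.75 0.75

# The manual values agree with scikit-learn's own metric functions
assert precision == precision_score(y_true_toy, y_pred_toy)
assert recall == recall_score(y_true_toy, y_pred_toy)
```

In the report, the row labelled `1` gives these two quantities for the spam class, and `support` is the number of true examples of each class in the test set.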



After training, it is good practice to save the model to disk so that it can be reused later without retraining.

In [11]:
import joblib  # sklearn.externals.joblib was deprecated in scikit-learn 0.21 and removed in 0.23
joblib.dump(clf, 'NB_spam_model.pkl')



['NB_spam_model.pkl']

In [12]:
# joblib.load accepts a file path directly, so no file handle is left open
clf = joblib.load('NB_spam_model.pkl')
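
Note that only `clf` is saved above. To classify new text, the reloaded classifier also needs the same fitted `CountVectorizer` (`cv`) that produced its training features. One possible way to handle this, sketched below under stated assumptions, is to bundle both steps in a scikit-learn `Pipeline` and persist that single object. The toy data, the file name `spam_pipeline.pkl`, and the temporary directory are assumptions for illustration; this is not the notebook's own approach.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy training set: 1 = spam, 0 = ham
toy_messages = [
    "win free prize now",
    "free entry win cash",
    "claim your free prize",
    "see you at lunch",
    "ok call me later",
    "are you coming home",
]
toy_labels = [1, 1, 1, 0, 0, 0]

# The pipeline vectorizes raw text and then classifies it
pipe = make_pipeline(CountVectorizer(), MultinomialNB())
pipe.fit(toy_messages, toy_labels)

# Save and reload the whole pipeline (hypothetical file name)
model_path = os.path.join(tempfile.mkdtemp(), "spam_pipeline.pkl")
joblib.dump(pipe, model_path)
loaded = joblib.load(model_path)

# The reloaded pipeline accepts raw strings directly
new_messages = ["win a free prize now", "see you later"]
predictions = loaded.predict(new_messages)
print(predictions)  # [1 0]
```

An alternative that stays closer to the notebook is to call `joblib.dump(cv, ...)` alongside the classifier and apply `cv.transform` to new messages before calling `clf.predict`.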