The SMS Spam Collection is a set of tagged SMS messages collected for SMS spam research. It contains one set of SMS messages in English, each tagged as either ham (legitimate) or spam. The challenge is to classify each message as spam or not.  
First, let's read the full dataset from the CSV file using pandas.

In [None]:
import numpy as np
import pandas as pd

df = pd.read_csv("../input/sms-spam-collection-dataset/spam.csv", encoding = 'latin-1', usecols=[0,1])
df.columns = ['category', 'message']

Initially, the target column has two values, 'spam' and 'ham'. Encode it as a numeric column.

In [None]:
# Encode the target: 'spam' -> 1, anything else ('ham') -> 0.
df['category'] = (df['category'] == 'spam').astype(int)

df.head()

Convert the message text to feature columns using CountVectorizer, which builds features from token occurrence counts.  
I also tried TfidfVectorizer, but accuracy didn't improve.  
I tried adding a 'length of message' feature as well, and it did not improve the final accuracy either.

In [None]:
from scipy.sparse import hstack
from sklearn.feature_extraction.text import CountVectorizer

vectorizer = CountVectorizer(encoding = "latin-1", strip_accents = "unicode", stop_words = "english")

features = vectorizer.fit_transform(df["message"])
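
To make "token occurrence counts" concrete, here is a minimal sketch on three made-up messages (not taken from the dataset). It also shows one possible way to append a message-length column with `hstack`. How the length feature was actually built in the experiment mentioned above is not shown in this notebook, so that part is an assumption.

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import CountVectorizer

# Toy messages (hypothetical, not from the dataset).
toy_messages = ["win cash now", "call me now now", "free cash prize"]

toy_vectorizer = CountVectorizer()
toy_counts = toy_vectorizer.fit_transform(toy_messages)  # sparse, one row per message

# Recover the column order from the fitted vocabulary (token -> column index).
vocab = sorted(toy_vectorizer.vocabulary_, key=toy_vectorizer.vocabulary_.get)
print(vocab)
print(toy_counts.toarray())  # "now" appears twice in the second message

# Assumed approach for a length feature: one extra column with the character count.
lengths = csr_matrix(np.array([[len(m)] for m in toy_messages]))
toy_with_length = hstack([toy_counts, lengths]).tocsr()
print(toy_with_length.shape)  # one more column than toy_counts
```

Raw character counts sit on a much larger scale than token counts, so a column like this may need scaling before it helps a linear model.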

Split the data into training and validation datasets. Use the stratify parameter to keep the proportion of Spam/Ham values the same in the training and validation data.

In [None]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(features, df["category"], stratify = df["category"], test_size = 0.2)
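
As a quick illustration of what `stratify` does, the sketch below splits a set of made-up labels and compares the spam share on both sides. The toy arrays and the fixed `random_state` are assumptions added for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy labels (hypothetical): 80 ham (0) and 20 spam (1), i.e. 20% spam.
toy_y = np.array([0] * 80 + [1] * 20)
toy_X = np.arange(len(toy_y)).reshape(-1, 1)  # placeholder features

X_tr, X_te, y_tr, y_te = train_test_split(
    toy_X, toy_y, stratify=toy_y, test_size=0.2, random_state=0
)

# With stratification both parts keep the original 20% spam share.
print("spam share in train:", y_tr.mean())
print("spam share in test: ", y_te.mean())
```

The split in the cell above does not fix a seed, so its exact rows (and the scores that follow) can change from run to run. Passing `random_state` is one way to make them repeatable.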

Try two classifiers that are well suited to text data: MultinomialNB and SGDClassifier.

In [None]:
from sklearn.naive_bayes import MultinomialNB
from sklearn import metrics
mn_clf = MultinomialNB().fit(X_train, y_train)

y_pred = mn_clf.predict(X_test)

print("Classifier %s:\n%s"
      % (mn_clf, metrics.classification_report(y_test, y_pred)))

In [None]:
from sklearn.linear_model import SGDClassifier
sgd_clf = SGDClassifier().fit(X_train, y_train)

y_pred = sgd_clf.predict(X_test)
print("Classifier %s:\n%s"
      % (sgd_clf, metrics.classification_report(y_test, y_pred)))
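
The reports printed above list precision and recall for each class. As a reading aid, here is a minimal pure-Python sketch of how those two numbers are derived for the spam class (label 1). The helper name `precision_recall` and the toy labels are illustrative only.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the class given by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy example (hypothetical): three spam messages, one of them missed.
toy_true = [1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 0, 0, 0]

p, r = precision_recall(toy_true, toy_pred)
print("precision:", p)  # no ham flagged as spam
print("recall:", r)     # one spam message slipped through
```

For a spam filter, high precision on the spam class means few legitimate messages get blocked, while recall measures how much spam is caught.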

SGDClassifier wins!  
Let's try to find the best hyperparameters using GridSearchCV.

In [None]:
from sklearn.model_selection import GridSearchCV

params = {
    "loss" : ["hinge"],
    "alpha" : [0.0001, 0.001, 0.01, 0.1],
    "penalty" : ["l2", "l1", None],  # None = no regularization; newer scikit-learn releases no longer accept the string "none"
}

sgd_clf = SGDClassifier()
gs_clf = GridSearchCV(sgd_clf, param_grid=params)

gs_clf.fit(X_train, y_train)
print(gs_clf.best_params_)
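
GridSearchCV evaluates every combination in the grid with cross-validation. The minimal sketch below enumerates those combinations using `ParameterGrid`. The grid mirrors the one above, with the no-penalty option written as `None`, and the fold count is an assumption based on the default in recent scikit-learn versions.

```python
from sklearn.model_selection import ParameterGrid

toy_params = {
    "loss": ["hinge"],
    "alpha": [0.0001, 0.001, 0.01, 0.1],
    "penalty": ["l2", "l1", None],
}

# Every combination of the listed values: 1 loss x 4 alphas x 3 penalties.
combinations = list(ParameterGrid(toy_params))

n_folds = 5  # assumed: the default cv in scikit-learn 0.22 and later
print(len(combinations), "combinations")
print(len(combinations) * n_folds, "cross-validation fits, plus one refit on the full training set")
```

After fitting, `gs_clf.best_score_` holds the mean cross-validated score of the winning combination, and `gs_clf.best_estimator_` is that model already refit on the training data.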

Get the final score with the best parameters.

In [None]:
sgd_clf = SGDClassifier(**gs_clf.best_params_).fit(X_train, y_train)

y_pred = sgd_clf.predict(X_test)
print("Best for classifier %s:\n%s"
      % (sgd_clf, metrics.classification_report(y_test, y_pred)))

Let's take a look at messages falsely recognized as Spam.

In [None]:
# .index returns row labels, so look the rows up with label-based .loc
false_pos = list(y_test[(y_test != y_pred) & (y_pred == 1)].index)
print("False spam:", df.loc[false_pos, 'message'].values)

No false Spam messages! This is an unexpected but great result.  
Let's find the messages falsely recognized as not Spam.

In [None]:
# .index returns row labels, so look the rows up with label-based .loc
false_neg = list(y_test[(y_test != y_pred) & (y_pred == 0)].index)
print("False ham:", df.loc[false_neg, 'message'].values)

Definitely, there is room for improvement.
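
The two error lists above come from a boolean mask over the test labels. Below is a minimal sketch of the same selection on made-up data, using an index that deliberately does not start at 0, to show how the row labels from `.index` pair with label-based `.loc`. All names prefixed with `toy_` are hypothetical.

```python
import numpy as np
import pandas as pd

toy_df = pd.DataFrame(
    {
        "category": [0, 1, 0, 1, 1],
        "message": ["hi", "win cash", "lunch?", "free prize", "call now"],
    },
    index=[10, 11, 12, 13, 14],  # row labels differ from row positions
)
toy_y_test = toy_df["category"]
toy_y_pred = np.array([1, 1, 0, 0, 1])

wrong = toy_y_test.to_numpy() != toy_y_pred
false_pos_labels = toy_y_test.index[wrong & (toy_y_pred == 1)]  # ham predicted as spam
false_neg_labels = toy_y_test.index[wrong & (toy_y_pred == 0)]  # spam predicted as ham

false_spam = toy_df.loc[false_pos_labels, "message"].tolist()
false_ham = toy_df.loc[false_neg_labels, "message"].tolist()
print("False spam:", false_spam)
print("False ham:", false_ham)
```

With a default 0..n-1 index, labels and positions coincide, so position-based lookups return the same rows. With any other index, only the label-based lookup is guaranteed to pick the intended messages.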