### Support Vector Machines
Using a linear support vector machine (SVM) to classify each sentence by its speech act

In [41]:
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score, classification_report

from preprocessing import get_sentences_labels

Generating the sentences and labels from the Excel sheet

In [42]:
sentences, labels = get_sentences_labels()

Sentences:  ['alpha, charlie. bravo check.', "alpha you're loud_and_clear.", 'charlie. good to me', 'charlie, charlie one, bravo radio check. ', 'yeah. charlie good to me. over']
I have sentences:  81
Correct Labels:  ['Request for Situation', 'Statement of Situation', 'Statement of Situation', 'Request for Situation', 'Statement of Situation']
I have labels:  81


## Preprocessing
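Vectorising the sentences into TF-IDF features. Note that the vectoriser is fitted on all sentences before the train-test split, so its vocabulary and IDF weights also reflect the test sentences.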

In [43]:
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(sentences)

Creating the train-test split (80% training, 20% test)

In [44]:
X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.20, random_state=8)
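A quick check of the split sizes, sketched on placeholder data of the same length (81 items). The `dummy_` names are hypothetical. With a fractional `test_size`, scikit-learn rounds the test set up, so 20% of 81 gives 17 test items and 64 training items, which matches the total support in the classification report further down.

```python
from sklearn.model_selection import train_test_split

# Placeholder data with the same number of items as the real data set
dummy_X = list(range(81))
dummy_y = [i % 2 for i in range(81)]

dummy_X_train, dummy_X_test, dummy_y_train, dummy_y_test = train_test_split(
    dummy_X, dummy_y, test_size=0.20, random_state=8
)

print("Train size:", len(dummy_X_train))
print("Test size:", len(dummy_X_test))
```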

Selecting the Linear Support Vector Classification model

In [45]:
classifier = LinearSVC()

Training the model

In [46]:
classifier.fit(X_train, y_train)
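One possible variation, not what this notebook does: split the raw sentences first and wrap the vectoriser and classifier in a single pipeline, so that the TF-IDF vocabulary and weights are learned from the training sentences only. The sketch below uses invented toy sentences and labels; every `toy_` name is an assumption made for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy data, not from the real data set
toy_sentences = [
    "confirm all fires extinguished",
    "can you see the barrier",
    "bravo radio check",
    "where is firetruck one",
    "you are loud and clear",
    "the fire is extinguished",
    "i am next to the barrier",
    "firetruck one is in position",
]
toy_labels = ["request"] * 4 + ["statement"] * 4

# Split the raw text, so the test sentences stay unseen by the vectoriser
toy_train_s, toy_test_s, toy_train_y, toy_test_y = train_test_split(
    toy_sentences, toy_labels, test_size=0.25, random_state=8
)

# fit() learns the TF-IDF vocabulary and the SVM from the training split only
toy_model = make_pipeline(TfidfVectorizer(), LinearSVC())
toy_model.fit(toy_train_s, toy_train_y)

toy_pred = toy_model.predict(toy_test_s)
print(list(zip(toy_test_s, toy_pred)))
```

The pipeline also accepts raw strings at prediction time, so there is no separate `transform` step to remember.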



Running the trained model on the test set

In [47]:
y_pred = classifier.predict(X_test)

Printing the accuracy and the per-class classification report

In [48]:
print(f"Accuracy: {accuracy_score(y_test, y_pred):.2f}")

print("Classification Report:\n", classification_report(y_test, y_pred, zero_division=0))

Accuracy: 0.65
Classification Report:
                          precision    recall  f1-score   support

         Not Classified       0.50      0.50      0.50         2
     Request for Action       1.00      1.00      1.00         1
  Request for Situation       1.00      1.00      1.00         1
    Statement of Intent       0.71      0.83      0.77         6
Statement of Prediction       0.00      0.00      0.00         2
 Statement of Situation       0.50      0.60      0.55         5

               accuracy                           0.65        17
              macro avg       0.62      0.66      0.64        17
           weighted avg       0.58      0.65      0.61        17
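To relate the per-class rows to the overall accuracy, the sketch below recovers the number of correct predictions per class as recall × support, using the values copied from the report above. The dictionary name `report_rows` is just a label for this example.

```python
# (recall, support) per class, copied from the classification report above
report_rows = {
    "Not Classified": (0.50, 2),
    "Request for Action": (1.00, 1),
    "Request for Situation": (1.00, 1),
    "Statement of Intent": (0.83, 6),
    "Statement of Prediction": (0.00, 2),
    "Statement of Situation": (0.60, 5),
}

# recall * support is the number of correctly predicted items in that class;
# round() undoes the two-decimal rounding of the printed recall
correct = sum(round(recall * support) for recall, support in report_rows.values())
total = sum(support for _, support in report_rows.values())

print(f"Correct: {correct} of {total}")
print(f"Accuracy: {correct / total:.2f}")
```

So 11 of the 17 test sentences are classified correctly, and with only one or two test items in several classes the per-class scores move a lot with each single prediction.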


Predicting the speech acts of new sentences now that the model has been trained
Expected, by sentence index: 0 = statement, 1 = request, 2 = request

In [49]:
new_sentences = ["i think you're on it rather next to it. you need to be next to it.",
                 "yeah, i'm putting up northwesterly barrier. you see where firetruck_one, firetruck_two and "
                 "firetruck_three are?",
                 "yeah. confirm all fires extinguished?"]
new_X = vectorizer.transform(new_sentences)
new_predictions = classifier.predict(new_X)

Print predictions for new sentences

In [50]:
for sentence, prediction in zip(new_sentences, new_predictions):
    print(f"Sentence: '{sentence}'\t Predicted Speech Act: {prediction}")

Sentence: 'i think you're on it rather next to it. you need to be next to it.'	 Predicted Speech Act: Statement of Situation
Sentence: 'yeah, i'm putting up northwesterly barrier. you see where firetruck_one, firetruck_two and firetruck_three are?'	 Predicted Speech Act: Request for Situation
Sentence: 'yeah. confirm all fires extinguished?'	 Predicted Speech Act: Request for Situation
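To see how close a prediction was, `LinearSVC.decision_function` returns one score per class for each sentence, and the predicted label is the class with the highest score. The sketch below shows this on invented toy data with three classes; the `demo_` names and sentences are hypothetical and not from the data set.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# Hypothetical toy data with three classes
demo_sentences = [
    "confirm all fires extinguished",
    "can you see the barrier",
    "the fire is extinguished",
    "i am next to the barrier",
    "i will put up the barrier",
    "i will move the firetruck",
]
demo_labels = ["request", "request", "situation", "situation", "intent", "intent"]

demo_vectorizer = TfidfVectorizer()
demo_clf = LinearSVC()
demo_clf.fit(demo_vectorizer.fit_transform(demo_sentences), demo_labels)

demo_new_X = demo_vectorizer.transform(["can you confirm", "i will move next"])

# Shape (n_sentences, n_classes); columns follow the order of demo_clf.classes_
demo_scores = demo_clf.decision_function(demo_new_X)
demo_best = demo_clf.classes_[demo_scores.argmax(axis=1)]

print("Classes:", list(demo_clf.classes_))
print("Scores:\n", demo_scores)
print("Highest-scoring class:", list(demo_best))
```

A small gap between the top two scores suggests the sentence sits near a decision boundary.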
