# Logistic Regression

**When to use**: When the target variable is categorical (here: short, medium, long). Logistic regression is a simple and efficient baseline for binary or multi-class classification.

**Advantage**: Fast to train, easy to implement, and its coefficients are straightforward to interpret.

**Applicable**: Since our problem has three classes (short, medium, long), we can use multinomial logistic regression.
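
To make the multinomial idea concrete, here is a minimal sketch of how such a model could turn one linear score per class into class probabilities via the softmax function. The score values are made up for illustration and do not come from our data; the class order follows the alphabetical ordering that `LabelEncoder` produces.

```python
import math

def softmax(scores):
    """Convert one raw score per class into probabilities that sum to 1."""
    # Subtracting the largest score improves numerical stability
    # and does not change the resulting probabilities.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical linear scores (w_k . x + b_k), one per class (illustrative values only)
classes = ["long", "medium", "short"]
scores = [2.0, 1.0, 0.1]

probs = softmax(scores)

# The predicted class is the one with the highest probability
predicted = classes[probs.index(max(probs))]
print({c: round(p, 3) for c, p in zip(classes, probs)}, "->", predicted)
```

The model learns one weight vector per class during training; at prediction time it only needs to compute these scores and pick the largest probability.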

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, accuracy_score

# Prepare the features (X) and the labels (y)
X = combined_df[['amplitude_max', 'spectral_centroid', 'dominant_frequency', 'zero_crossing_rate']]
y = combined_df['label']

# Encode the labels (short, medium, long) into integers.
# LabelEncoder sorts the classes alphabetically: long=0, medium=1, short=2
label_encoder = LabelEncoder()
y_encoded = label_encoder.fit_transform(y)

# Split the data into training and testing sets (70% training, 30% testing)
X_train, X_test, y_train, y_test = train_test_split(X, y_encoded, test_size=0.3, random_state=42)

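# Initialize the logistic regression model.
# With the default lbfgs solver, scikit-learn fits a multinomial model for multi-class targets;
# max_iter is raised from the default of 100 to give the solver room to converge.
log_reg = LogisticRegression(max_iter=1000)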

# Train the model
log_reg.fit(X_train, y_train)

# Make predictions on the test set
y_pred = log_reg.predict(X_test)

# Evaluate the model
print("Accuracy:", accuracy_score(y_test, y_pred))
print("\nClassification Report:\n", classification_report(y_test, y_pred, target_names=label_encoder.classes_))
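
The four features live on very different scales (a frequency in Hz versus a zero-crossing rate), which can slow down convergence and makes raw coefficients hard to compare. As a possible variant, the sketch below standardizes the features in a pipeline, uses a stratified split so each class keeps its proportion in the test set, and prints a confusion matrix and the per-class coefficients. Since `combined_df` is not available here, the data is a synthetic stand-in generated with NumPy; the names `X_demo`, `y_demo`, and `model` and all numeric values are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

feature_names = ["amplitude_max", "spectral_centroid",
                 "dominant_frequency", "zero_crossing_rate"]
class_names = ["long", "medium", "short"]  # alphabetical, as LabelEncoder would order them

# Synthetic stand-in for combined_df (assumption): 60 samples per class,
# with class-dependent means and very different feature scales.
rng = np.random.default_rng(42)
y_demo = np.repeat(np.arange(3), 60)
X_demo = rng.normal(size=(180, 4)) * np.array([0.1, 500.0, 300.0, 0.02])
X_demo += y_demo[:, None] * np.array([0.2, 800.0, 400.0, 0.03])

# Stratified 70/30 split keeps the class proportions equal in both sets
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42, stratify=y_demo
)

# Scale the features, then fit the logistic regression
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_tr, y_tr)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_te, model.predict(X_te), labels=[0, 1, 2])
print(cm)

# One row of coefficients per class; on standardized features,
# larger magnitudes indicate a stronger influence on that class's score
coefs = model.named_steps["logisticregression"].coef_
for name, row in zip(class_names, coefs):
    print(name, {f: round(float(c), 2) for f, c in zip(feature_names, row)})
```

To try this on the real data, `X_demo` and `y_demo` would be replaced by `X` and `y_encoded` from the cell above. Note that a stratified split changes which rows end up in the test set, so the reported accuracy may differ from the unstratified run.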
