# **Logistic Regression**

This example builds a logistic regression model on a real-world dataset of customer churn at a telecommunications company. The goal is to predict whether a customer is likely to churn (i.e., cancel their subscription) from features such as account information, usage patterns, and customer service interactions.

We'll use the pandas and scikit-learn libraries in Python to load and preprocess the data, build the model, and evaluate its performance.

First, we'll import the necessary libraries and load the dataset:

In [4]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Load the dataset from a local CSV file
path = 'Telco-Customer-Churn.csv'
df = pd.read_csv(path)


Next, we'll preprocess the data by converting categorical variables to binary indicators, dropping irrelevant columns, and splitting the data into training and test sets:

In [5]:
# Convert categorical variables to binary indicators
df = pd.get_dummies(df, columns=['gender', 'Partner', 'Dependents', 'PhoneService', 'MultipleLines',
                                 'InternetService', 'OnlineSecurity', 'OnlineBackup', 'DeviceProtection',
                                 'TechSupport', 'StreamingTV', 'StreamingMovies', 'Contract', 'PaperlessBilling',
                                 'PaymentMethod'])

# Drop irrelevant columns
df.drop(['customerID', 'TotalCharges'], axis=1, inplace=True)

# Split the data into training and test sets
X = df.drop('Churn', axis=1)
y = df['Churn']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
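
To make the encoding step concrete, here is a minimal sketch of what `pd.get_dummies` does to a small frame. The three rows are hypothetical values chosen for illustration; only the column names mirror the churn dataset.

```python
import pandas as pd

# Toy frame with hypothetical values; column names mirror the churn dataset
toy = pd.DataFrame({
    "tenure": [1, 34, 2],
    "Contract": ["Month-to-month", "One year", "Month-to-month"],
    "Partner": ["Yes", "No", "No"],
})

# Each listed column is replaced by one indicator column per category
encoded = pd.get_dummies(toy, columns=["Contract", "Partner"])

# Columns that were not listed (here, tenure) pass through unchanged
print(list(encoded.columns))
```

Note that a two-level column such as `Partner` yields two indicators that carry the same information. `pd.get_dummies` accepts `drop_first=True` to drop one level per column, which may be worth trying if redundant features are a concern.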


Now, we'll build the logistic regression model and fit it to the training data:

In [6]:
# Build the logistic regression model
model = LogisticRegression(max_iter=1000)

# Fit the model to the training data
model.fit(X_train, y_train)
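
For intuition about what the fitted model computes, the sketch below shows how a binary logistic regression turns a weighted sum of features into a probability and then a label. The coefficients, intercept, and customer values are hypothetical and are not the ones learned by `model` above.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients and intercept, NOT the values learned above
coefficients = {"tenure": -0.05, "MonthlyCharges": 0.02}
intercept = -0.5

# Hypothetical customer
customer = {"tenure": 12, "MonthlyCharges": 70.0}

# Linear score: intercept plus the weighted sum of the features
score = intercept + sum(coefficients[k] * customer[k] for k in coefficients)

# Probability of churn, then a label using a 0.5 threshold
probability = sigmoid(score)
label = "Yes" if probability >= 0.5 else "No"

print(round(probability, 3), label)
```

In scikit-learn, `model.predict_proba` returns the class probabilities and `model.predict` returns the label, so the threshold step here corresponds to what `predict` does for a two-class problem.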


Finally, we'll evaluate the performance of the model on the test data using accuracy as the evaluation metric:

In [7]:
# Make predictions on the test data
y_pred = model.predict(X_test)

# Calculate the accuracy of the model
accuracy = accuracy_score(y_test, y_pred)
print('Accuracy:', accuracy)


Accuracy: 0.8197303051809794
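
Accuracy can look good even when a model misses most churners, if churners are a minority of customers. As a rough illustration, the sketch below computes accuracy, precision, and recall by hand on ten hypothetical labels; these labels are made up and are not drawn from the test set above.

```python
# Hypothetical labels for ten customers, not taken from the dataset
y_true = ["No", "No", "No", "No", "No", "No", "No", "Yes", "Yes", "Yes"]
y_hat = ["No", "No", "No", "No", "No", "No", "No", "No", "No", "Yes"]

pairs = list(zip(y_true, y_hat))

# Confusion-matrix counts, treating "Yes" (churn) as the positive class
tp = sum(t == "Yes" and p == "Yes" for t, p in pairs)
fp = sum(t == "No" and p == "Yes" for t, p in pairs)
fn = sum(t == "Yes" and p == "No" for t, p in pairs)
tn = sum(t == "No" and p == "No" for t, p in pairs)

accuracy = (tp + tn) / len(pairs)
precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0

print(accuracy, precision, recall)
```

Here the toy predictions are 80% accurate yet find only one of the three churners. On the real test set, `precision_score` and `recall_score` from `sklearn.metrics` (with `pos_label='Yes'`) would give the corresponding figures.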


Next, we'll save the trained model to disk so it can be reused for predictions:

In [8]:
import pickle

# Save the model to disk
filename = 'TelcoCustomer_logreg_model.pkl'
with open(filename, 'wb') as file:
    pickle.dump(model, file)
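
One possible refinement, sketched here under assumptions, is to pickle the training column names together with the model, so that a later session can align new data without needing `X_train` in memory. The dictionary below uses a placeholder in place of the fitted estimator, and the file name and column list are hypothetical.

```python
import os
import pickle
import tempfile

# Placeholder standing in for the fitted model; in the notebook this would
# be `model`, and "columns" would be list(X_train.columns)
bundle = {
    "model": {"note": "placeholder for a fitted estimator"},
    "columns": ["tenure", "MonthlyCharges", "Contract_One year"],
}

# Hypothetical file name, written to a temporary directory
path = os.path.join(tempfile.mkdtemp(), "model_bundle.pkl")

# Save the model and its expected column order in one file
with open(path, "wb") as f:
    pickle.dump(bundle, f)

# Later, possibly in a new session: load both back together
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored["columns"])
```

As with any pickle file, this should only be loaded from a trusted source, since unpickling can execute arbitrary code.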

The following example loads new customer records from a CSV file and uses the saved model to make predictions:

In [12]:
import pandas as pd
import pickle

# Load the trained model from disk
with open('TelcoCustomer_logreg_model.pkl', 'rb') as file:
    trained_model = pickle.load(file)

# Load the input data from a CSV file
input_data = pd.read_csv('Input_data.csv')

# Convert categorical variables to binary indicators
input_data = pd.get_dummies(input_data, columns=['gender', 'Partner', 'Dependents', 'PhoneService', 'MultipleLines',
                                                 'InternetService', 'OnlineSecurity', 'OnlineBackup', 'DeviceProtection',
                                                 'TechSupport', 'StreamingTV', 'StreamingMovies', 'Contract', 'PaperlessBilling',
                                                 'PaymentMethod'])

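# Add indicator columns that were seen during training but are absent from the input
# (this relies on X_train still being in memory from the cells above)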
missing_cols = set(X_train.columns) - set(input_data.columns)
for c in missing_cols:
    input_data[c] = 0

# Ensure columns are in the same order as trained data
input_data = input_data[X_train.columns]

# Make the predictions
predictions = trained_model.predict(input_data)

# Print the predictions
print(predictions)


['Yes']
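
The column alignment step matters because a small input file rarely contains every category seen during training. The sketch below shows the effect on a single hypothetical row, using `DataFrame.reindex` as a compact alternative that adds missing columns and drops extra ones in one call. The `train_columns` list and the row values are made up for illustration.

```python
import pandas as pd

# Hypothetical column order from training (in the notebook: X_train.columns)
train_columns = ["tenure", "Contract_Month-to-month", "Contract_One year"]

# A single hypothetical input row that contains only one Contract category
new_row = pd.DataFrame({
    "customerID": ["0001"],
    "tenure": [5],
    "Contract": ["One year"],
})

# Encoding produces only the indicator for the category that is present
encoded = pd.get_dummies(new_row, columns=["Contract"])

# Add the missing indicator as 0, drop customerID, and fix the column order
aligned = encoded.reindex(columns=train_columns, fill_value=0)

print(list(aligned.columns))
```

Without this step, the model would receive a frame whose columns do not match the ones it was fitted on.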
