# Deep Mobile Prediction Model



1. Using existing client dataset
2. Predicting Has Mobile (Y/N) using a Deep Neural Network (DNN)
3. Input features: dp1 score, dp3 score, arrears, residency zone, occupancy style, tax band
4. Train/Test split, build model, evaluate performance

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

In [None]:


# Step 1: Load data
df = pd.read_csv("/Users/rg/ACADEMICS/Interview/Connected Data Comapany/MAY/Dataset/Modified/cleaned_connected_data_with_zones.csv")

# Step 2: Prepare features and target
df['Mobile Flag'] = df['Mobile Flag'].str.strip().str.upper()
df['Email Flag'] = df['Email Flag'].str.strip().str.upper()
df['Has_Mobile'] = (df['Mobile Flag'] == 'Y').astype(int)

# Features 
features = ['dp1 Score', 'dp3 Score', 'Arrears Balance']
zone_dummies = pd.get_dummies(df['Residency Zone'], prefix='Zone')
occupancy_dummies = pd.get_dummies(df['dp2 Occupancy Style'], prefix='Occupancy')
taxband_dummies = pd.get_dummies(df['dp2 Council Tax Band'], prefix='TaxBand')

X = pd.concat([df[features], zone_dummies, occupancy_dummies, taxband_dummies], axis=1)
y = df['Has_Mobile']

# Drop rows with missing feature values, then keep only the matching target rows
X = X.dropna()
y = y.loc[X.index]

# Step 3: Train/Test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Step 4: Standardize Features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Step 5: Build DNN Model
model = Sequential()
model.add(Dense(32, activation='relu', input_shape=(X_train_scaled.shape[1],)))
model.add(Dense(16, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

model.compile(optimizer=Adam(learning_rate=0.001), loss='binary_crossentropy', metrics=['accuracy'])

# Step 6: Model training
model.fit(X_train_scaled, y_train, epochs=50, batch_size=16, verbose=1)

# Step 7: Evaluation
y_pred_prob = model.predict(X_test_scaled)
y_pred = (y_pred_prob > 0.5).astype(int).flatten()

conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)
accuracy = accuracy_score(y_test, y_pred)

print("\n Confusion Matrix:\n", conf_matrix)
print("\n Classification Report:\n", class_report)
print("\n Accuracy Score:", round(accuracy * 100, 2), "%")




Epoch 1/50 ... Epoch 50/50 (progress lines only; no per-epoch loss or accuracy values were captured)

 Confusion Matrix:
 [[48 20]
 [14 31]]

 Classification Report:
               precision    recall  f1-score   support

           0       0.77      0.71      0.74        68
           1       0.61      0.69      0.65        45

    accuracy                           0.70       113
   macro avg       0.69      0.70      0.69       113
weighted avg       0.71      0.70      0.70       113


 Accuracy Score: 69.91 %

| Metric | Value | Insight |
|---|---|---|
| Final accuracy | 69.91% | Moderate-to-good predictive power for a simple DNN |
| Confusion matrix | [[48, 20], [14, 31]] | 34 of 113 test clients are misclassified (20 false positives, 14 false negatives), but the model is learning |
| Precision (Class 1) | 0.61 | 61% of predicted positives (Mobile Available) were correct |
| Recall (Class 1) | 0.69 | 69% of actual positives were captured |
| F1-score (Class 1) | 0.65 | Balanced precision/recall for Class 1 |
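
The Class 1 figures above can be re-derived by hand from the confusion matrix. The following is a minimal sketch, assuming scikit-learn's layout (rows are true classes, columns are predicted classes, so the matrix reads `[[TN, FP], [FN, TP]]`). The helper name `metrics_from_confusion` is hypothetical and not part of the notebook.

```python
def metrics_from_confusion(matrix):
    """Derive accuracy and class-1 metrics from a 2x2 matrix [[TN, FP], [FN, TP]]."""
    (tn, fp), (fn, tp) = matrix
    total = tn + fp + fn + tp
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are captured
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / total
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Confusion matrix reported in the evaluation output above
metrics = metrics_from_confusion([[48, 20], [14, 31]])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Rounded to two decimals, these values agree with the classification report: accuracy 0.70, precision 0.61, recall 0.69, F1 0.65.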



## 🧠 What does this mean for us?

70% accuracy is a reasonable result, about 10 points above the 60% obtained by always predicting "no mobile" (68 of the 113 test clients), considering:

- Small dataset (~1000 clients total)
- No advanced feature engineering yet
- Only simple inputs (arrears, scores, and one-hot encoded categories)

Recall is 69% for Mobile owners ➔ this is important! When a client actually has a mobile, the model catches about 7 out of 10 (31 of 45).

Precision is 61% for Mobile owners ➔ decent. Of all the clients we predict as "Yes Mobile", 61% are truly Yes.
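
One way to put the accuracy figure in context is to compare it with a majority-class baseline, that is, a model that always predicts the most frequent class. The sketch below uses only the support counts and confusion matrix reported above. The function name `majority_baseline_accuracy` is an illustrative assumption, not something defined in the notebook.

```python
def majority_baseline_accuracy(support):
    """Accuracy obtained by always predicting the most frequent class."""
    return max(support.values()) / sum(support.values())


# Support per class, taken from the classification report (test set of 113 clients)
test_support = {0: 68, 1: 45}

# Correct predictions on the diagonal of the confusion matrix: 48 + 31
model_accuracy = (48 + 31) / 113

baseline = majority_baseline_accuracy(test_support)
lift = model_accuracy - baseline

print(f"Baseline accuracy: {baseline:.2%}")
print(f"Model accuracy:    {model_accuracy:.2%}")
print(f"Lift over baseline: {lift * 100:.1f} percentage points")
```

On this test split the DNN sits a little under 10 percentage points above the baseline, which is a more conservative reading than the raw accuracy figure alone suggests.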