<h1> Predicting Employment Status by Self Esteem and Personality </h1>

> We intend to employ a logistic regression model.

<h2> Variables </h2>

1. Self-Esteem: Rosenberg Self-Esteem Scale
2. Personality: Big Five personality traits

In [14]:
# Import the necessary libraries
import gspread
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import sklearn
from sklearn.ensemble import RandomForestRegressor as reg
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import GridSearchCV


df = pd.read_csv('logistic_Reg.csv')

In [15]:
df.shape

(67, 7)

<b> Training and testing data set </b>

In [16]:
# Features: every column except the target; target: employment status (Empl)
X = df.drop('Empl', axis = 1)
y = df['Empl']

In [17]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)
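
As a quick check on the split above: with 67 rows and `test_size=0.20`, scikit-learn rounds the test share up, which gives the 14 test observations that appear as the total support in the classification report. A minimal arithmetic sketch (the variable names here are illustrative, not from the notebook):

```python
import math

n_samples = 67      # rows reported by df.shape
test_size = 0.20    # fraction passed to train_test_split

# train_test_split rounds the test count up and gives the remainder to training
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

print(n_train, n_test)
```

With only 14 test observations, each misclassification changes the reported accuracy by about 7 percentage points.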

<b> Model declaration </b>

In [18]:
# Fit the model on the training split and predict the held-out test split
model = LogisticRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Evaluate the predictions against the true test labels
accuracy = accuracy_score(y_test, y_pred)
report = classification_report(y_test, y_pred)
confusion = confusion_matrix(y_test, y_pred)

print(f"Accuracy: {accuracy}")
print(report)
print('Coefficients :  ', model.coef_,
      '\nintercepts : ', model.intercept_)


Accuracy: 0.9285714285714286
              precision    recall  f1-score   support

           0       0.86      1.00      0.92         6
           1       1.00      0.88      0.93         8

    accuracy                           0.93        14
   macro avg       0.93      0.94      0.93        14
weighted avg       0.94      0.93      0.93        14

Coefficients :   [[ 0.18700348  0.30388798  0.49201739  1.14810849 -0.46814533  0.41847319]] 
intercepts :  [-52.79911079]
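
The cell above computes `confusion` but never prints it. The matrix can be reconstructed from the report, assuming the rounded recall of 0.88 for class 1 corresponds to 7 of 8 correct. The counts below are a sketch inferred from the printed metrics, not the notebook's actual output; they reproduce the reported accuracy, precision, and recall.

```python
# Rows are true classes (0, 1); columns are predicted classes (0, 1).
# Counts are inferred from the printed support and recall values, not from the data.
confusion = [[6, 0],
             [1, 7]]

tn, fp = confusion[0]
fn, tp = confusion[1]

accuracy = (tn + tp) / (tn + fp + fn + tp)
precision_0 = tn / (tn + fn)   # of those predicted non-employed, share truly non-employed
recall_1 = tp / (tp + fn)      # of those truly employed, share predicted employed

print(round(accuracy, 4), round(precision_0, 2), round(recall_1, 2))
```

Under this reconstruction the single error is an employed individual predicted as non-employed.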


In [19]:

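# Candidate values for C, the inverse regularization strength
param_grid = {'C': [0.001, 0.01, 0.1, 1, 10, 100]}

# 5-fold cross-validated search over the grid, using the training split only
grid_search = GridSearchCV(LogisticRegression(), param_grid, cv=5)
grid_search.fit(X_train, y_train)

# Best value of C found by the search (stored here, not printed)
best_params = grid_search.best_params_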
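
The grid search above stores `best_params` but does not display it or re-evaluate the tuned model. The sketch below shows one way to inspect the result. It runs on synthetic data from `make_classification` as a stand-in for `logistic_Reg.csv`, and the scaling pipeline, the stratified split, and `max_iter=1000` are assumptions added for illustration, not part of the original analysis.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in with the same shape as the study data (67 rows, 6 features)
X, y = make_classification(n_samples=67, n_features=6, n_informative=4,
                           n_redundant=0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42, stratify=y)

# Scaling inside the pipeline is refit on each CV training fold,
# so no information from the validation fold leaks into it
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
param_grid = {'logisticregression__C': [0.001, 0.01, 0.1, 1, 10, 100]}

grid_search = GridSearchCV(pipe, param_grid, cv=5)
grid_search.fit(X_train, y_train)

best_params = grid_search.best_params_
print('Best parameters:', best_params)
print('Mean CV accuracy:', grid_search.best_score_)
print('Test accuracy:', grid_search.score(X_test, y_test))
```

Since `GridSearchCV` refits the best estimator on the full training split by default, `grid_search.score` evaluates the tuned model directly on the held-out data.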


## Logistic Regression Model Summary

- **Model Accuracy:** 92.86%

### Precision and Recall:

#### Class 0 (Non-Employment):
- Precision: 86%
- Recall: 100%

#### Class 1 (Employment):
- Precision: 100%
- Recall: 88%

- **F1-Score:**
  - Weighted Average F1-Score: 0.93

### Model Coefficients:

- Self-Esteem Coefficient: 0.1870
- Agreeableness Coefficient: 0.4920
- Conscientiousness Coefficient: 1.1481
- Extroversion Coefficient: 0.3039
- Neuroticism Coefficient: -0.4681
- Openness to Experience Coefficient: 0.4185

- **Intercept:** -52.7991

## Interpretation:

- The logistic regression model achieved an accuracy of 92.86% on the 14-observation test set (13 of 14 correct). With a test set this small, a single misclassification shifts accuracy by about 7 percentage points, so the estimate is imprecise.

- For non-employment (Class 0), the model has a precision of 86% and a recall of 100%. This means that when the model predicts an individual is not employed, it is correct 86% of the time, and it correctly identifies all non-employed individuals.

- For employment (Class 1), the model has a precision of 100% and a recall of 88%. When the model predicts an individual is employed, it is correct 100% of the time, and it correctly identifies 88% of the employed individuals.

- The weighted average F1-score is 0.93, providing a balanced measure of the model's performance.

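- The coefficients give each feature's effect on the log-odds of the positive class (employment), holding the other features fixed. Specifically:
   - Higher Self-Esteem, Agreeableness, Extroversion, and Openness to Experience are associated with a higher likelihood of employment.
   - Conscientiousness has the largest positive coefficient (1.1481). Because the code above does not standardize the features, coefficient sizes are directly comparable only if the traits are measured on the same scale.
   - Higher Neuroticism is associated with a lower likelihood of employment.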

- The intercept term (-52.7991) is the log-odds of employment when every feature equals zero. Its large magnitude likely reflects the unscaled feature scores rather than a meaningful baseline.
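
Because the coefficients act on the log-odds, exponentiating them gives odds ratios, which are often easier to read. The sketch below uses the coefficient values from the model summary. The pairing of names to values follows that summary list, since the notebook does not print the column order.

```python
import math

# Coefficients as reported in the model summary
coefficients = {
    'Self-Esteem': 0.1870,
    'Agreeableness': 0.4920,
    'Conscientiousness': 1.1481,
    'Extroversion': 0.3039,
    'Neuroticism': -0.4681,
    'Openness to Experience': 0.4185,
}

# exp(coefficient) is the multiplicative change in the odds of employment
# for a one-unit increase in the feature, holding the others fixed
odds_ratios = {name: math.exp(beta) for name, beta in coefficients.items()}

for name, ratio in sorted(odds_ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f'{name}: {ratio:.3f}')
```

An odds ratio above 1 raises the odds of employment and one below 1 lowers them. For example, a one-unit increase in Conscientiousness multiplies the odds by roughly 3.15.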

These results suggest that, on this small test set, the model predicts employment status well from self-esteem and personality traits. Conscientiousness has the largest positive coefficient, while Neuroticism is the only trait with a negative coefficient.

