<a href="https://colab.research.google.com/github/dibyarupnath/CodeClause_Personality_Prediction_System_Via_CV_Analysis/blob/main/Personality_Prediction_System_via_CV_Analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Scope of this project
I have used the ***Myers-Briggs Type Indicator (MBTI) Dataset*** from Kaggle to train a Logistic Regression model on posts labelled with one of the 16 MBTI personality types. The trained model is then applied to text extracted from resumes to predict a personality type for each one.

`ResumeDataset.csv`, also from Kaggle, supplies the resume text used for prediction.

I have also included code that accepts any text typed in by the user, for example the entire text copied from someone's CV.

The model will then attempt to classify the person's personality type based on that text.

#### ***However, the quality and reliability of the prediction depend on the length and type of the text the model is given.***

In [38]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [2]:
import os
os.chdir("/content/drive/MyDrive/Projects/CodeClause AI Internship/Personality Prediction System Via CV Analysis/Dataset/")

In [20]:
!pip install nltk

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [32]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

In [24]:
resume = pd.read_csv("ResumeDataset.csv")

In [25]:
mbti = pd.read_csv("mbti.csv")

In [26]:
resume.head()

Unnamed: 0,Category,Resume
0,Data Science,Skills * Programming Languages: Python (pandas...
1,Data Science,Education Details \r\nMay 2013 to May 2017 B.E...
2,Data Science,"Areas of Interest Deep Learning, Control Syste..."
3,Data Science,Skills â¢ R â¢ Python â¢ SAP HANA â¢ Table...
4,Data Science,"Education Details \r\n MCA YMCAUST, Faridab..."


In [31]:
mbti.head(2)

Unnamed: 0,type,posts
0,INFJ,'http://www.youtube.com/watch?v=qsXHcwe3krw|||...
1,ENTP,'I'm finding the lack of me in these posts ver...
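
The preview above suggests that each `posts` entry bundles several posts separated by `|||` and contains raw URLs. Below is a minimal sketch of a cleaning step, assuming that delimiter is used throughout the file. `clean_posts` is a hypothetical helper and is not part of the original notebook.

```python
import re

URL_PATTERN = re.compile(r"https?://\S+")

def clean_posts(raw: str) -> str:
    """Hypothetical helper: strip URLs and the '|||' post separator."""
    # Split first so a URL cannot swallow the separator that follows it
    posts = raw.split("|||")
    posts = [URL_PATTERN.sub(" ", post) for post in posts]
    text = " ".join(posts)
    # Collapse whitespace runs (including \r\n) into single spaces
    return re.sub(r"\s+", " ", text).strip()

# Made-up sample in the same shape as the preview above
sample = "http://www.youtube.com/watch?v=qsXHcwe3krw|||I enjoy quiet evenings\r\nand books"
cleaned = clean_posts(sample)
print(cleaned)
```

If this turned out to help, it could be applied to a column with `mbti['posts'].apply(clean_posts)`, and the same function could be reused on the resume text before vectorizing.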


## 1. Splitting the MBTI dataset into X and y. 
## 2. Dividing the dataset into Train and Test sets

In [None]:
# Split the Myers-Briggs dataset into features (X) and labels (y)
X = mbti['posts']
y = mbti['type']

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
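
With 16 classes, a purely random split may leave the rarer types under-represented in the test set. The sketch below shows the `stratify` argument of `train_test_split`, which keeps class proportions roughly equal in both splits. The texts and label counts are made up for illustration and do not reflect the real dataset.

```python
from collections import Counter
from sklearn.model_selection import train_test_split

# Toy, made-up data: three imbalanced classes standing in for the 16 MBTI types
texts = [f"post {i}" for i in range(100)]
labels = ["INFP"] * 60 + ["INTJ"] * 30 + ["ESTJ"] * 10

X_tr, X_te, y_tr, y_te = train_test_split(
    texts, labels, test_size=0.2, random_state=42, stratify=labels
)

# Each class keeps its 60/30/10 share in the 20-item test split
test_counts = Counter(y_te)
print(test_counts)
```

Stratification needs at least two samples of every class, so it is worth checking `mbti['type'].value_counts()` before switching it on.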

## Vectorizing

In [None]:
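# Initialize the CountVectorizer (bag-of-words token counts)
vectorizer = CountVectorizer()
# Learn the vocabulary from the training posts only, then encode them
X_train = vectorizer.fit_transform(X_train)
# Encode the test posts with that same vocabulary (no refitting)
X_test = vectorizer.transform(X_test)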
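
To make the fit/transform distinction concrete, here is a small sketch on made-up sentences. It illustrates that words absent from the training vocabulary are silently dropped at transform time, which matters later when resume text is fed to a vocabulary learned from MBTI posts.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Made-up documents for illustration only
train_docs = ["I love quiet evenings", "I love loud parties"]
test_docs = ["quiet parties and kubernetes"]

vec = CountVectorizer()
train_matrix = vec.fit_transform(train_docs)  # learns the vocabulary
test_matrix = vec.transform(test_docs)        # reuses it, no refit

vocab = sorted(vec.vocabulary_)
print(vocab)
# "and" and "kubernetes" were never seen during fit, so they get no column
print(test_matrix.toarray())
```

The default tokenizer also lowercases the text and ignores single-character tokens such as "I".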

## Model Training

In [33]:
# Train the model
model = LogisticRegression()
model.fit(X_train, y_train)

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
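
The warning above indicates that the solver stopped at its iteration limit (`max_iter` defaults to 100) before converging. A minimal sketch of the two adjustments the warning itself suggests follows: raising `max_iter`, and scaling the features, here by swapping in `TfidfVectorizer`, which is a substitution and not what the notebook uses. It also scores the model on the held-out split, a step the notebook does not show. The corpus is made up, so the accuracy it prints says nothing about the real data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Toy, made-up corpus with two labels
texts = ["love quiet books reading alone"] * 10 + ["love loud parties dancing friends"] * 10
labels = ["INTJ"] * 10 + ["ESFP"] * 10

X_tr, X_te, y_tr, y_te = train_test_split(
    texts, labels, test_size=0.2, random_state=42, stratify=labels
)

# Vectorizer and classifier in one pipeline, with a higher iteration cap
clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(X_tr, y_tr)

acc = accuracy_score(y_te, clf.predict(X_te))
print(f"held-out accuracy: {acc:.2f}")
```

On the real data, `model.score(X_test, y_test)` with the notebook's own variables would give the equivalent held-out accuracy.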


## Making Predictions from Resumes in the ResumeDataset.csv

In [35]:
# Predict personality types for resumes
resumes = resume['Resume']
resumes = vectorizer.transform(resumes)
predicted_types = model.predict(resumes)

# Add predicted personality types to the resume dataset
resume['Predicted Type'] = predicted_types

# Print the first 5 rows of the resume dataset with predicted personality types
print(resume.head(5))

       Category                                             Resume  \
0  Data Science  Skills * Programming Languages: Python (pandas...   
1  Data Science  Education Details \r\nMay 2013 to May 2017 B.E...   
2  Data Science  Areas of Interest Deep Learning, Control Syste...   
3  Data Science  Skills â¢ R â¢ Python â¢ SAP HANA â¢ Table...   
4  Data Science  Education Details \r\n MCA   YMCAUST,  Faridab...   

  Predicted Type  
0           INTJ  
1           INTJ  
2           INTP  
3           ENTJ  
4           INTP  
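
`model.predict` returns only the single most likely label per resume. The sketch below, on made-up training sentences, shows how `predict_proba` together with `classes_` could expose how confident the model is in each candidate type.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy, made-up training data standing in for the MBTI posts
train_texts = [
    "quiet books reading alone",
    "quiet study plan logic",
    "loud parties dancing friends",
    "friends fun parties music",
]
train_labels = ["INTJ", "INTJ", "ESFP", "ESFP"]

vec = CountVectorizer()
clf = LogisticRegression(max_iter=1000)
clf.fit(vec.fit_transform(train_texts), train_labels)

resume_texts = ["python developer enjoys quiet reading"]
proba = clf.predict_proba(vec.transform(resume_texts))

# Rank the classes by probability for each resume
for row in proba:
    ranked = sorted(zip(clf.classes_, row), key=lambda pair: pair[1], reverse=True)
    print([(label, round(float(p), 3)) for label, p in ranked])
```

A top probability only slightly above the runner-up is a hint that the predicted type should not be taken too seriously.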


## Predicting the Personality Type from Input Text

##### The input text can be the entire text extracted from a CV or any other text written by the person. The reliability of the prediction depends on how much text is provided: a longer input generally gives the model more words to work with.

## I entered the "Summary" section of my own CV, and the model predicted my personality type as ESFP :)

In [30]:
# Take user input
user_input = input("Enter a text: ")

# Transform user input using the trained vectorizer
user_input_transformed = vectorizer.transform([user_input])

# Predict the personality type for user input
predicted_type = model.predict(user_input_transformed)

# Print the predicted personality type
print("Predicted Personality Type:", predicted_type)



Enter a text: I am a Computer Science and Engineering undergraduate, with a keen interest in ML/AI and Cloud Technology. I have hands-on experience with the Google Cloud Platform, and have dabbled in Azure and AWS too. Not only that, but I also have some experience in machine learning through platforms like Kaggle. Furthermore, I am looking for internship opportunities in these fields for an in-depth industrial experience in them.
Predicted Personality Type: ['ESFP']
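
One way to make the caveat about input length concrete is to count how many of the input's tokens the fitted vectorizer actually recognises. This is a sketch on a made-up one-sentence vocabulary, and `vocabulary_coverage` is a hypothetical helper that is not in the original notebook.

```python
from sklearn.feature_extraction.text import CountVectorizer

def vocabulary_coverage(vectorizer, text):
    """Hypothetical helper: return (known_tokens, total_tokens) for `text`."""
    analyzer = vectorizer.build_analyzer()  # same tokenisation the vectorizer uses
    tokens = analyzer(text)
    known = [t for t in tokens if t in vectorizer.vocabulary_]
    return len(known), len(tokens)

# Made-up training sentence, for illustration only
vec = CountVectorizer().fit(["I love machine learning and long walks"])

known, total = vocabulary_coverage(vec, "Hands-on experience with machine learning on Azure")
print(f"{known} of {total} tokens are in the vocabulary")
```

With the notebook's own `vectorizer`, a low ratio for a given CV would suggest the prediction rests on very few words.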


## Brief Description of Each Personality Type in the **Myers-Briggs Type Indicator (MBTI)**

#### The **Myers-Briggs Type Indicator (MBTI)** categorizes personality along four dimensions: ***extraversion (E) vs. introversion (I)***, ***sensing (S) vs. intuition (N)***, ***thinking (T) vs. feeling (F)***, and ***judging (J) vs. perceiving (P)***.

#### Combining one preference from each dimension gives ***16 distinct personality types***, which are the labels used in the MBTI dataset. Here's a brief overview of each type:

1. ***ISTJ (Introversion, Sensing, Thinking, Judging)***: ISTJs are practical, responsible, and dependable. They prefer structure, follow rules, and focus on details.

2. ***ISFJ (Introversion, Sensing, Feeling, Judging)***: ISFJs are warm, caring, and reliable. They are attentive to others' needs, value harmony, and are detail-oriented.

3. ***INFJ (Introversion, Intuition, Feeling, Judging)***: INFJs are insightful, compassionate, and idealistic. They have a deep understanding of others, seek meaning, and are driven by their values.

4. ***INTJ (Introversion, Intuition, Thinking, Judging)***: INTJs are strategic, independent, and logical. They have a long-term vision, excel in analyzing complex problems, and value competence.

5. ***ISTP (Introversion, Sensing, Thinking, Perceiving)***: ISTPs are observant, analytical, and adaptable. They enjoy hands-on activities, value practicality, and are skilled problem solvers.

6. ***ISFP (Introversion, Sensing, Feeling, Perceiving)***: ISFPs are artistic, gentle, and empathetic. They appreciate beauty, hold to their personal values, and enjoy expressing themselves creatively.

7. ***INFP (Introversion, Intuition, Feeling, Perceiving)***: INFPs are compassionate, imaginative, and individualistic. They seek authenticity, value personal growth, and are driven by their ideals.

8. ***INTP (Introversion, Intuition, Thinking, Perceiving)***: INTPs are logical, curious, and innovative. They have a thirst for knowledge, enjoy theoretical thinking, and are skilled problem solvers.

9. ***ESTP (Extraversion, Sensing, Thinking, Perceiving)***: ESTPs are outgoing, energetic, and action-oriented. They are adaptable, enjoy new experiences, and excel in fast-paced environments.

10. ***ESFP (Extraversion, Sensing, Feeling, Perceiving)***: ESFPs are enthusiastic, friendly, and spontaneous. They appreciate sensory experiences, value social connections, and seek joy in the present moment.

11. ***ENFP (Extraversion, Intuition, Feeling, Perceiving)***: ENFPs are enthusiastic, creative, and empathetic. They value personal growth, enjoy exploring possibilities, and are skilled communicators.

12. ***ENTP (Extraversion, Intuition, Thinking, Perceiving)***: ENTPs are inventive, quick-witted, and intellectually curious. They enjoy debates, value creativity, and excel in generating new ideas.

13. ***ESTJ (Extraversion, Sensing, Thinking, Judging)***: ESTJs are organized, practical, and efficient. They value structure, follow traditions, and excel in leadership roles.

14. ***ESFJ (Extraversion, Sensing, Feeling, Judging)***: ESFJs are warm, outgoing, and supportive. They value harmony, prioritize the needs of others, and excel in nurturing relationships.

15. ***ENFJ (Extraversion, Intuition, Feeling, Judging)***: ENFJs are charismatic, empathetic, and influential. They value harmony, are skilled at understanding others' emotions, and excel in leadership and helping roles.

16. ***ENTJ (Extraversion, Intuition, Thinking, Judging)***: ENTJs are strategic, assertive, and efficient. They have strong leadership skills, value competence, and excel in organizational settings.

These brief descriptions capture some general characteristics associated with each personality type, but individuals within a type still vary.
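
Since every type code is one letter from each of the four dimensions, a predicted label can be expanded mechanically. The sketch below does that and confirms that the combinations add up to 16. `decode_type` is a hypothetical helper that is not part of the original notebook.

```python
from itertools import product
from typing import List

# One mapping per dimension, in the order the letters appear in a type code
DIMENSIONS = [
    {"E": "Extraversion", "I": "Introversion"},
    {"S": "Sensing", "N": "Intuition"},
    {"T": "Thinking", "F": "Feeling"},
    {"J": "Judging", "P": "Perceiving"},
]

def decode_type(code: str) -> List[str]:
    """Hypothetical helper: expand a four-letter code such as 'ESFP'."""
    code = code.strip().upper()
    if len(code) != 4:
        raise ValueError(f"expected a four-letter code, got {code!r}")
    names = []
    for letter, options in zip(code, DIMENSIONS):
        if letter not in options:
            raise ValueError(f"unexpected letter {letter!r} in {code!r}")
        names.append(options[letter])
    return names

# Every combination of one letter per dimension
all_types = ["".join(letters) for letters in product(*DIMENSIONS)]

print(decode_type("ESFP"))
print(len(all_types))
```

Because `model.predict` returns an array, the helper could be called as `decode_type(predicted_type[0])` on the notebook's own prediction.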