This project applies machine learning and data analysis to classify individuals into MBTI (Myers-Briggs Type Indicator) personality types based on behavioral or textual data. It explores how AI can understand personality traits through language and data-driven modeling.
The goal is to predict one of the 16 MBTI personality types (e.g., INFP, ENTJ) using text or survey data. The notebook includes:

- Data preprocessing and cleaning
- Exploratory Data Analysis (EDA)
- Feature extraction (TF-IDF, CountVectorizer, etc.)
- Model training and evaluation (Logistic Regression, Random Forest, or other classifiers)
- Accuracy and insights visualization
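The preprocessing and cleaning step named above could look roughly like the following. This is a minimal sketch using only the standard library, not the notebook's actual code: the `clean_text` function, the regular expressions, and the tiny `STOP_WORDS` set are all illustrative assumptions. A real run would more likely take its stop words from NLTK or spaCy.

```python
import re

# Tiny illustrative stop-word set (an assumption for this sketch); the
# notebook itself would probably use the NLTK or spaCy stop-word lists.
STOP_WORDS = {"a", "an", "the", "is", "it", "and", "or", "to", "of", "in", "i"}

URL_RE = re.compile(r"https?://\S+")      # matches http/https links
NON_LETTER_RE = re.compile(r"[^a-z\s]")   # anything except a-z and whitespace


def clean_text(text: str) -> str:
    """Lowercase, drop URLs and non-letters, split on whitespace, remove stop words."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = NON_LETTER_RE.sub(" ", text)
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


if __name__ == "__main__":
    print(clean_text("Check https://example.com NOW, it is great!!!"))
```

Cleaning before vectorization keeps links and punctuation from inflating the vocabulary, though how aggressive the cleaning should be depends on the dataset.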
Tools and libraries:

- Python 3
- Pandas, NumPy, Matplotlib, Seaborn
- scikit-learn
- NLTK or spaCy (for text preprocessing)
- Jupyter Notebook
1. Load Dataset: MBTI dataset with user posts and corresponding personality types.
2. Preprocess Text: Clean, tokenize, and remove stop words.
3. Vectorize: Convert text into numerical form using TF-IDF or CountVectorizer.
4. Train Model: Fit ML algorithms to predict personality type.
5. Evaluate: Measure performance with metrics like accuracy and F1-score.

Use Cases

- Personality classification from social media data
- Chatbot personality modeling
- HR and psychology data analysis

Future Improvements

- Use deep learning (LSTM, BERT) for better accuracy
- Add visual personality dashboards
- Integrate with APIs for live predictions
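The vectorize, train, and evaluate steps of the workflow described earlier can be sketched with scikit-learn as follows. This is an illustration under stated assumptions rather than the notebook's implementation: the `TOY_TEXTS` and `TOY_LABELS` data are made up, only two of the 16 types appear for brevity, and `train_and_evaluate` is a hypothetical helper name. TF-IDF with Logistic Regression is one of the combinations the project lists, chosen here only as an example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Made-up toy posts (an assumption for this sketch); the real dataset holds
# user posts labeled with one of the 16 MBTI types.
TOY_TEXTS = [
    "i love quiet evenings writing poetry alone",
    "daydreaming about stories and feelings again",
    "my journal is full of ideas and emotions",
    "i prefer reading alone to loud parties",
    "imagining other worlds helps me relax",
    "writing songs about how i feel today",
    "we need a plan and a deadline for this project",
    "i organized the team and set clear goals",
    "leading the meeting went efficiently today",
    "strategy and execution matter more than talk",
    "i set targets and push the team to deliver",
    "let us schedule the launch and assign tasks",
]
TOY_LABELS = ["INFP"] * 6 + ["ENTJ"] * 6


def train_and_evaluate(texts, labels, seed=0):
    """Fit a TF-IDF + Logistic Regression pipeline; return (model, accuracy, macro F1)."""
    x_train, x_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.25, random_state=seed, stratify=labels
    )
    model = Pipeline([
        ("tfidf", TfidfVectorizer()),                  # text -> TF-IDF features
        ("clf", LogisticRegression(max_iter=1000)),    # linear classifier
    ])
    model.fit(x_train, y_train)
    predictions = model.predict(x_test)
    accuracy = accuracy_score(y_test, predictions)
    # Macro F1 averages per-class F1, so rare types count as much as common ones.
    macro_f1 = f1_score(y_test, predictions, average="macro", zero_division=0)
    return model, accuracy, macro_f1


if __name__ == "__main__":
    model, accuracy, macro_f1 = train_and_evaluate(TOY_TEXTS, TOY_LABELS)
    print(f"accuracy={accuracy:.2f} macro_f1={macro_f1:.2f}")
    print(model.predict(["i like planning goals for the team"]))
```

Wrapping the vectorizer and classifier in a single `Pipeline` means the vocabulary is learned from the training split only, which avoids leaking test data into the features. Scores on a toy set this small say nothing about real performance.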
Developed by Gresa Hisa, AI Machine Learning Engineer & Cybersecurity Engineer