In this code, the function "predict_personality(text)" takes a text as input and returns the predicted personality type. The variable "model_address" holds the path to the saved model. Change it to match the location where you stored the model.

In [29]:
# Connecting the notebook to Google Drive to read the saved files
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [30]:
# Importing required libraries for this assignment
import pandas as pd
import numpy as np
import string, re
import nltk
import sklearn
import pickle
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from nltk.stem import WordNetLemmatizer
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from catboost import CatBoostClassifier # For building catboost model
from xgboost import XGBClassifier
from sklearn.feature_extraction.text import TfidfVectorizer # For vectorizing text data
from sklearn.metrics import accuracy_score

model_address = '/content/drive/MyDrive/ENSF 619 - Group Project/Project/Phase 1/modelCatBoost600.pkl'
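
Since "predict_personality" unpacks the pickle at "model_address" into a model and a vectorizer, the file is expected to hold a two-element tuple in that order. The following is a minimal sketch of that file layout, not the code that produced the actual model: the dictionaries are stand-ins for the trained classifier and the fitted vectorizer, and the file name `demo_model.pkl` is hypothetical.

```python
import pickle
import tempfile
from pathlib import Path

# Stand-ins for the trained classifier and the fitted TfidfVectorizer.
# Plain dicts are used only so the sketch runs without catboost or scikit-learn.
stand_in_model = {"kind": "classifier"}
stand_in_vectorizer = {"kind": "vectorizer"}

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "demo_model.pkl"  # hypothetical file name

    # Save both objects together as one (model, vectorizer) tuple.
    with open(demo_path, "wb") as f:
        pickle.dump((stand_in_model, stand_in_vectorizer), f)

    # Fail early with a clear message if the path is wrong.
    if not demo_path.is_file():
        raise FileNotFoundError(f"No model file at {demo_path}")

    # Load them back in the same order; the context manager closes the file.
    with open(demo_path, "rb") as f:
        loaded_model, loaded_vectorizer = pickle.load(f)

print(loaded_model["kind"], loaded_vectorizer["kind"])
```

Checking the path before loading makes a wrong "model_address" easier to diagnose than a bare traceback from `open`.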

In [4]:
# Downloading required packages
nltk.download('vader_lexicon')
nltk.download('punkt')
nltk.download('stopwords')
nltk.download('wordnet')
!pip3 install catboost

[nltk_data] Downloading package vader_lexicon to /root/nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


Collecting catboost
  Downloading catboost-1.2.2-cp310-cp310-manylinux2014_x86_64.whl (98.7 MB)
Installing collected packages: catboost
Successfully installed catboost-1.2.2


In [31]:
# Lowercasing the posts, then replacing URLs and non-alphanumeric characters with spaces
def preprocess_dataframe(df):
    df['Preprocessed_posts'] = df['posts'].str.lower()
    df['Preprocessed_posts'] = df['Preprocessed_posts'].str.replace(r'https?://[^\s<>"]+|www\.[^\s<>"]+', ' ', flags=re.MULTILINE, regex=True)
    df['Preprocessed_posts'] = df['Preprocessed_posts'].str.replace(r'[^0-9a-z]', ' ', regex=True)
    return df

# Removing stopwords and applying lemmatizer
def remove_stopwords_lemmatizer(post):
    tokens = word_tokenize(post)
    # A set gives fast membership tests; the name avoids shadowing the imported stopwords corpus
    stop_words = set(stopwords.words('english'))
    filtered_tokens = [token for token in tokens if token not in stop_words]
    lemmatizer = WordNetLemmatizer()
    lemmatized_tokens = [lemmatizer.lemmatize(token) for token in filtered_tokens]
    processed_text = ' '.join(lemmatized_tokens)
    return processed_text

# Encoding the VADER compound score as a label: 2 = neutral, 1 = positive, 0 = negative
def add_sentiment_column(post):
    sentiment_analyzer = SentimentIntensityAnalyzer()
    scores = sentiment_analyzer.polarity_scores(post)
    if scores["compound"] == 0:
        sentiment = 2
    elif scores["compound"] > 0:
        sentiment = 1
    else:
        sentiment = 0
    return sentiment
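
To see what the cleaning and sentiment-encoding steps do to a single post, here is a small trace that uses only the standard library. It is a sketch, not a replacement for the functions above: `clean_text` and `encode_compound` are hypothetical helper names, the sample post is made up, and the compound scores are passed in directly rather than computed by VADER.

```python
import re

# Same URL pattern as in preprocess_dataframe.
URL_PATTERN = r'https?://[^\s<>"]+|www\.[^\s<>"]+'

def clean_text(raw):
    """Lowercase, drop URLs, and replace every non-alphanumeric character with a space."""
    lowered = raw.lower()
    no_urls = re.sub(URL_PATTERN, ' ', lowered)
    return re.sub(r'[^0-9a-z]', ' ', no_urls)

def encode_compound(compound):
    """Map a compound score to the notebook's codes: 2 neutral, 1 positive, 0 negative."""
    if compound == 0:
        return 2
    return 1 if compound > 0 else 0

sample = "Watch THIS: https://example.com/a?b=1 |||So good!"
print(clean_text(sample).split())                          # ['watch', 'this', 'so', 'good']
print([encode_compound(c) for c in (-0.4, 0.0, 0.7)])      # [0, 2, 1]
```

Note that the "|||" post separators disappear in the second substitution, so all posts from one user end up as a single stream of tokens.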

In [32]:
def predict_personality(text):

    # Preprocessing the text input (cleaning, stopword removal, lemmatization)
    data = [text]
    df = pd.DataFrame(data, columns=['posts'])
    df = preprocess_dataframe(df)
    df['Preprocessed_posts'] = df['Preprocessed_posts'].apply(remove_stopwords_lemmatizer)
    # Dropping the row if nothing is left after preprocessing
    df = df.drop(df[df['Preprocessed_posts'] == ''].index)

    if df.empty:
        return "The text length was not enough for processing!"

    # Adding the sentiment label as an extra feature
    df['sentiment'] = df['Preprocessed_posts'].apply(add_sentiment_column)

    ##############################################################################################################################

    # Loading the model and vectorizer; the context manager closes the file afterwards
    with open(model_address, 'rb') as model_file:
        pickled_model, loaded_vectorizer = pickle.load(model_file)

    ##############################################################################################################################

    #Vectorizing the text data
    vectors_tfidf = loaded_vectorizer.transform(df['Preprocessed_posts'])

    ##############################################################################################################################

    # Combining the vectorizer's output with the sentiment column
    vectors_tfidf_array = vectors_tfidf.toarray()
    input_text = np.column_stack((vectors_tfidf_array, df['sentiment'].to_numpy()))

    ##############################################################################################################################

    #Running the model and getting prediction
    input_text_pred = pickled_model.predict(input_text)

    # The prediction holds one row per input text; take the class index from the single row
    label_predicted = int(input_text_pred[0][0])
    labels = ['ENFJ', 'ENFP', 'ENTJ', 'ENTP', 'ESFJ', 'ESFP', 'ESTJ', 'ESTP', 'INFJ', 'INFP', 'INTJ', 'INTP', 'ISFJ', 'ISFP', 'ISTJ', 'ISTP']

    return labels[label_predicted]
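
The last two steps of "predict_personality" append the sentiment code as one extra feature column and then turn the predicted class index into a label. The sketch below mirrors both steps with plain lists so that it runs without numpy or a trained model. `append_sentiment` and `decode_prediction` are hypothetical helper names, the TF-IDF weights are made up, and the decoding assumes the model was trained on integer class indices in the same alphabetical order as the label list.

```python
LABELS = ['ENFJ', 'ENFP', 'ENTJ', 'ENTP', 'ESFJ', 'ESFP', 'ESTJ', 'ESTP',
          'INFJ', 'INFP', 'INTJ', 'INTP', 'ISFJ', 'ISFP', 'ISTJ', 'ISTP']

def append_sentiment(tfidf_rows, sentiment_codes):
    """Append each post's sentiment code as the last feature of its row."""
    return [list(row) + [code] for row, code in zip(tfidf_rows, sentiment_codes)]

def decode_prediction(prediction):
    """Turn a prediction shaped like [[class_index]] into its label."""
    return LABELS[int(prediction[0][0])]

tfidf_rows = [[0.0, 0.5, 0.1]]   # made-up TF-IDF weights for one post
features = append_sentiment(tfidf_rows, [1])
print(features)                   # [[0.0, 0.5, 0.1, 1]]
print(decode_prediction([[11]]))  # INTP
```

Because the sentiment code is always the last column, the input at prediction time has to be built in the same column order that was used during training.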


In [35]:
text = "'Your comment screams INTJ, bro. Especially the useless part.|||Thanks for the information. Doesn't interfere with anything I've ever experienced (with INFJs). Plus, your signature is the lyrics from one of my favorite bands (Tool).  That song (Reflection) was...|||Aren't ESTPs the kings/queens of saying things without thinking/without wanting to think? No offense, you guys can be fucking magnificent|||Ooh, so dangerous. All these scary words and such.  No, it's not dangerous. I already knew I was unhealthy before I entered PerC. PerC can't make me more unhealthy.|||LOL! What the fuck is that in your signature? Some obscure inspirational quote gone wrong?|||Why would that be implied?|||There's too many. I watch a lot of tv shows, although i hardly ever watch them on tv.   -how i met your mother -scrubs -x files -futurama -the office (really getting into it recently, the...|||Who's to say that Sensors will give birth to sensors? Or intuitives to intuitives?   The only reason why we think MBTI is great is not because of its system. The test is awful and people are very...|||If you had the ability to time-travel to 1933 and kill Hitler, would you -John Smith, The Dead Zone by Stephen King  Why kill him, though, if you have the ability to turn him into a normal...|||Another interesting thing to mention is that mental issues/disorders can cause physical pain, and vice versa.  For instance, long term severe pain can lead to insanity, and depression is known to...|||If you think physical pain is worse, you haven't suffered much mental pain.  I personally do not think they're comparable. Of course physical pain is worse, it implies irreversible damage and...|||Of course Yoda would prefer Si.|||It's all homestuck threads, I'm guessing.  Oooh, Dirk x John or some shit|||His logic is pathetic and it fails to hold up so much of the time. INTP is definitely a possibility, but ENTP and maybe even ENFP are as well. 
Mostly because I see ENTPs failing logically much more...|||If I were to actually pay attention to how I walk, then I would end up walking normally. So answering this question would be paradoxical.|||I think INTPs and ISFJs have it the worst. And I'm pretty sure I've got undiagnosed Generalized Anxiety Disorder|||Ew... Buzzfeed...|||Don't really have many friends, but I can relate. With an ESTP, at first I acted very ESTP-like, but it was fun. Acting stupid, having fun, saying stupid shit to each other... But as the friendship...|||I would as well, except I would do it metaphorically.|||I read books all the time as a kid... Don't anymore. It doesn't matter if something is logical or not if it is boring as SHIT!|||Yeah, it's best because the first time you watch it you think it's just a demon bunny. And how he reacts to it, just amazing.|||Man I love that profile picture. When Donnie gets that sickeningly satisfied face, I feel the same way, so giddy. I'm a bit of a sadist and I love finding pleasure in dark things... That's how I...|||Usually logic and Ti has very little to do with answering Trivia questions, as it mostly has to do with memory. However, if I don't know the answer I can use my Ti to get an educated guess, which is...|||Wouldn't it be more economical to use a regular phone in that case?|||I support this. My phone has affected my life negatively more than I'd like to admit.|||I want to once again say that I do not agree with this 24 type nonsense. I definitely think you can organize a type into more specific types through answering questions (MBTI Style), but this just...|||My intuition usually allows me to remember such answers in games like that even if I do not consciously know anything about the subject being asked. It's things the subconscious has picked up on,...|||Lol @ ISTP, INFJ, and ESTP GIFs.   Nice profile picture, by the way. 
:laughing:|||It has to do with the amount of time we have to think and respond, lack of social stigmas, no body language, no external factors/variables.  But I agree, if you can do it online, you can do it in...|||Yes. They make me angry for some reason.  In my High School there's one that says Speak Your Mind, which I find ironic.|||Yeah, I'm probably quite less healthy than most INTPs, and that's saying something. I've tried to get better. But Aspergers, ADD, Anxiety, Alexithymia... (damn those A words!) They make it hard for...|||Avoided Death Note? Why? It's meant for Rationals!  Mirai Nikki isn't that popular, surprisingly. Maybe because it takes a while to pick up speed (not unlike Death Note) and our world of sensors...|||I suppose, but it's not the same. The likelihood of me finding an individual who I like on the internet is indeed much higher, but the amount of contact and relations I can have with someone here is...|||I got 26 out of 37.|||The only good thing about SOA is the opening theme. Holy shit the first few seconds are amazing though.  Edit: I think the only times I've been emotional over an anime is the end of Death Note and...|||He's pretty rad, but socially awkward and obsessed with logical correctness. Like more than me. Debating (literally) his E-ness. But idk, he creates arguments and plays Devil's Advocate which is an...|||I just watched a TableTop video on YouTube, and I had the good fortune of seeing Allison Scagliotti for the first time. I think I'm in love. Anyways, I'd say she's an ENTP because of, well,...|||more like *sniffle* you plastic funbag|||I saw that earlier. I'm not the biggest fan of Markiplier (spelling? lol), a lot of his commentary and such seems to be mostly facial expressions and noises. However, I'm sympathetic of his past. And...|||I'd say water over rock. Maybe electric or dragon type. 
Fairy if we're fortunate.|||I'm sure a lot of immature INTPs would like to believe they have diminished emotions, but emotions are a human thing*.   (not including anti-social personalities)|||Since your MBTI is based completely off how you answer the questions, I would say yes. And I think you misunderstand mental disorders. Most people who are afflicted are still capable of most things...|||I'm confused. All it asks for is my birth date and location and then it tells me my theme or whatever. I didn't even answer any questions.|||I wish I had an INTP friend :P I have not met a single rational at this school besides the ones I already know. Which is 2. (INTJ who is too cool for me right now, and ENTP)|||For INTPs it serves us quite well. Remembering important factoids and info/knowledge, but never hygenial things...  I feel like it most has to do with our subconscious/intuition, not exactly...|||You're right. There's almost always a solution, and suicide shouldn't be first on the list.|||I play it in spurts every now and again. I'd say I'm okay.  My greatest weakness is that I can't think too many steps ahead. I can think of the best logical move for the board as it is, maybe even...|||226530   http://media.giphy.com/media/Be9IVnTa8GOoU/giphy.gif   Topher Brink of Dollhouse|||Like I said, everybody's different. And not having a close friend does not equal being alone, although it can seem like it. You still have human contact, which is really all that's necessary.|||Did you make an account specifically with this in mind to ask? lol'"

In [36]:
predict_personality(text)

'INTP'