Mental health chatbot
Chiraz BOUDERBALI and Meriem SELAMA

First, we install the modules needed to build our chatbot.
Among these modules:


*   Streamlit: a library for building a simple web UI for our chatbot.
*   Pyngrok: a Python wrapper for ngrok, a tool that exposes a locally running app through a public URL, so we can share the chatbot without configuring hosting ourselves.
*   scikit-learn: an open-source machine learning library; we use it to compute cosine similarity.

In [None]:
!pip install streamlit
!pip install pyngrok
!pip install scikit-learn
!pip install sentence-transformers



In [None]:
import re
import spacy
import pandas as pd

In [None]:
nlp = spacy.load("en_core_web_sm")

The text_preprocessing function cleans the text, tokenizes it, lemmatizes each token, and removes stop words.

In [None]:
def text_preprocessing(text):
    # Lowercase, then keep only letters and whitespace
    text = text.lower()
    text = re.sub(r'[^a-zA-Z\s]', '', text)
    # Collapse runs of whitespace into a single space
    text = re.sub(r'\s+', ' ', text)
    # Tokenize with spaCy, then keep the lemma of every token that is not a stop word
    doc = nlp(text)
    lemmas = [token.lemma_ for token in doc if not token.is_stop]
    return " ".join(lemmas)
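
To see what the cleaning steps do before spaCy is involved, here is a minimal sketch that isolates the lowercasing and the two regular expressions. The helper name clean_text and the sample sentence are made up for illustration; only the three cleaning lines come from the function above.

```python
import re

def clean_text(text: str) -> str:
    """Hypothetical helper: only the cleaning steps of text_preprocessing."""
    text = text.lower()
    # Remove everything that is not a letter or whitespace (digits and punctuation go)
    text = re.sub(r'[^a-zA-Z\s]', '', text)
    # Collapse the gaps left behind into single spaces
    text = re.sub(r'\s+', ' ', text)
    return text

print(clean_text("Is 1 in 5  adults affected?"))  # is in adults affected
```

Note that digits are dropped along with punctuation, so a figure such as "1 in 5" does not survive preprocessing.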

In [None]:
df = pd.read_csv('/content/Mental_Health_FAQ.csv')
df.head()

Unnamed: 0,Question_ID,Questions,Answers
0,1590140,What does it mean to have a mental illness?,Mental illnesses are health conditions that di...
1,2110618,Who does mental illness affect?,It is estimated that mental illness affects 1 ...
2,6361820,What causes mental illness?,It is estimated that mental illness affects 1 ...
3,9434130,What are some of the warning signs of mental i...,Symptoms of mental health disorders vary depen...
4,7657263,Can people with mental illness recover?,"When healing from mental illness, early identi..."


In [None]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 98 entries, 0 to 97
Data columns (total 3 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   Question_ID  98 non-null     int64 
 1   Questions    98 non-null     object
 2   Answers      98 non-null     object
dtypes: int64(1), object(2)
memory usage: 2.4+ KB


df.info() shows the structure of our dataset: 98 rows and 3 columns, with no missing values.

In [None]:
df.drop('Question_ID', axis=1, inplace=True)

Here, we drop the Question_ID column because we do not need it.

In [None]:
df.head()

Unnamed: 0,Questions,Answers
0,What does it mean to have a mental illness?,Mental illnesses are health conditions that di...
1,Who does mental illness affect?,It is estimated that mental illness affects 1 ...
2,What causes mental illness?,It is estimated that mental illness affects 1 ...
3,What are some of the warning signs of mental i...,Symptoms of mental health disorders vary depen...
4,Can people with mental illness recover?,"When healing from mental illness, early identi..."


In [None]:
df['Question'] = df['Questions'].apply(text_preprocessing)
df['Answer'] = df['Answers'].apply(text_preprocessing)

We now apply the preprocessing function to the Questions and Answers columns.

In [None]:
print(df[['Answers', 'Answer']].head())
df[['Questions', 'Question']].head()

                                             Answers  \
0  Mental illnesses are health conditions that di...   
1  It is estimated that mental illness affects 1 ...   
2  It is estimated that mental illness affects 1 ...   
3  Symptoms of mental health disorders vary depen...   
4  When healing from mental illness, early identi...   

                                              Answer  
0  mental illness health condition disrupt person...  
1  estimate mental illness affect adult america a...  
2  estimate mental illness affect adult america a...  
3  symptom mental health disorder vary depend typ...  
4  heal mental illness early identification treat...  


Unnamed: 0,Questions,Question
0,What does it mean to have a mental illness?,mean mental illness
1,Who does mental illness affect?,mental illness affect
2,What causes mental illness?,cause mental illness
3,What are some of the warning signs of mental i...,warning sign mental illness
4,Can people with mental illness recover?,people mental illness recover


These are the results of applying text preprocessing to our dataset.

In this step, we create a Python file, app.py, that runs our chatbot in a Streamlit interface, and we deploy it using pyngrok.
Before this, we decided to use a pretrained Sentence-BERT (SBERT) model to improve the quality of the chatbot's answers.
The TF-IDF approach did not give us satisfactory results.
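
The notebook does not show the TF-IDF attempt, so the following is only a hypothetical sketch of what such a baseline might look like, using scikit-learn's TfidfVectorizer. The two questions come from the dataset preview; the user questions and the helper name tfidf_scores are invented for illustration. It suggests one plausible reason TF-IDF falls short: it can only match words that literally appear in the dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Two questions taken from the dataset preview
questions = [
    "What causes mental illness?",
    "Can people with mental illness recover?",
]

vectorizer = TfidfVectorizer()
question_vectors = vectorizer.fit_transform(questions)

def tfidf_scores(user_question):
    """Hypothetical helper: cosine similarity of the user question to each dataset question."""
    user_vector = vectorizer.transform([user_question])
    return cosine_similarity(user_vector, question_vectors)[0]

# Shares vocabulary with the first question, so it scores highest there
shared_words = tfidf_scores("What causes mental illness in teenagers?")
# A paraphrase with no word in common with the dataset scores zero everywhere
paraphrase = tfidf_scores("Why do I feel depressed?")

print(shared_words)
print(paraphrase)
```

A sentence-embedding model such as SBERT compares meaning rather than exact word overlap, which is why it can still return a non-zero score for a paraphrase.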

In [None]:
%%writefile app.py
import streamlit as st
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity
import pandas as pd

df = pd.read_csv('/content/Mental_Health_FAQ.csv')

# Use SBERT model
model = SentenceTransformer('all-MiniLM-L6-v2')

# Encode dataset questions
corpus_embeddings = model.encode(df["Questions"].tolist())

# Streamlit interface
st.title("Mental Health Chatbot")
user_question = st.text_input("Type your question:")

if user_question:
    # Encode user question
    user_embedding = model.encode([user_question])

    # Calculate cosine similarity between the user question and every dataset question
    similarities = cosine_similarity(user_embedding, corpus_embeddings)

    # Find the most similar question
    best_match_idx = similarities.argmax()
    best_match_score = similarities[0, best_match_idx]

    # Retrieve the best-matching question and its answer
    question_similaire = df["Questions"].iloc[best_match_idx]
    reponse_similaire = df["Answers"].iloc[best_match_idx]

    # Results
    # Only show the answer if the best match clears a similarity threshold of 0.5
    if best_match_score > 0.5:
        st.write(f"**User question:** {user_question}")
        st.write(f"**Most similar question found:** {question_similaire}")
        st.write(f"**Corresponding answer:** {reponse_similaire}")
        st.write(f"**Similarity score:** {best_match_score:.2f}")
    else:
        st.write("I'm sorry, I can't find a relevant answer to your question. Please try again.")


Writing app.py
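
The matching logic in app.py (score every dataset question, take the highest score, answer only above 0.5) can be illustrated without loading the model. The sketch below is a pure-Python stand-in: the three-dimensional vectors are toy values, not real SBERT embeddings, and the names cosine and best_match are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def best_match(query_vec, corpus_vecs, threshold=0.5):
    """Return (index, score) of the closest vector, or (None, score) below the threshold."""
    scores = [cosine(query_vec, v) for v in corpus_vecs]
    idx = max(range(len(scores)), key=scores.__getitem__)
    if scores[idx] > threshold:
        return idx, scores[idx]
    return None, scores[idx]

# Toy "embeddings" standing in for two dataset questions
corpus = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

close_idx, close_score = best_match([0.9, 0.1, 0.0], corpus)    # close to the first vector
unrelated_idx, unrelated_score = best_match([0.0, 0.0, 1.0], corpus)  # orthogonal to both

print(close_idx, round(close_score, 2))
print(unrelated_idx, unrelated_score)
```

The threshold trades coverage for precision: a lower value answers more questions but risks returning an unrelated answer, which matters for a mental health FAQ.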


In [None]:
from pyngrok import ngrok

# Configure the authentication token (replace the placeholder with your own ngrok token)
ngrok.set_auth_token("YOUR_NGROK_AUTH_TOKEN")

# Open an ngrok tunnel to port 8501, Streamlit's default port
public_url = ngrok.connect(8501)
print("Streamlit is accessible at:", public_url)
# Launch Streamlit in the background
!streamlit run app.py &>/content/logs.txt &

Streamlit is accessible at: NgrokTunnel: "https://a4f2-35-221-250-77.ngrok-free.app" -> "http://localhost:8501"
