# Reddit LFQA

## Dependencies

### Haystack Dependencies

#### Library

In [None]:
# Install the latest master of Haystack
!pip install --upgrade pip
!pip install git+https://github.com/deepset-ai/haystack.git#egg=farm-haystack[colab]

# For FAISS DocumentStore
# !pip install git+https://github.com/deepset-ai/haystack.git#egg=farm-haystack[colab,faiss]


#### ElasticSearch DocumentStore

In [None]:
# In Colab / No Docker environments: Start Elasticsearch from source
! wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.9.2-linux-x86_64.tar.gz -q
! tar -xzf elasticsearch-7.9.2-linux-x86_64.tar.gz
! chown -R daemon:daemon elasticsearch-7.9.2

import os
from subprocess import Popen, PIPE, STDOUT

es_server = Popen(
    ["elasticsearch-7.9.2/bin/elasticsearch"], stdout=PIPE, stderr=STDOUT, preexec_fn=lambda: os.setuid(1)  # as daemon
)
# wait until ES has started
! sleep 30
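
A fixed `sleep 30` can be too short on a slow machine and needlessly long on a fast one. A minimal sketch of polling instead, assuming Elasticsearch listens on its default address `http://localhost:9200`; the helper names `wait_until_ready` and `es_is_up` and their parameters are illustrative and not part of Haystack:

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(probe, timeout=60.0, interval=2.0):
    """Call `probe` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def es_is_up(url="http://localhost:9200"):
    """Return True if an HTTP GET on `url` succeeds (assumed default ES address)."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

Calling `wait_until_ready(es_is_up)` right after the `Popen` call would then block only as long as the server actually needs to come up.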

### Dataset & Data Format

In [None]:
!pip install -q feather-format

**Download dataset**

In [None]:
!pip install --upgrade --no-cache-dir gdown
import gdown
gdown.download("https://drive.google.com/uc?id=1npViq1AMdGAQTwNcjDVRvmEQ8UzfLJ43")
gdown.download('https://drive.google.com/uc?id=1ElI3fYdTVHE7TzVH4U293-GlgasSbzTm')


Downloading...
From: https://drive.google.com/uc?id=1npViq1AMdGAQTwNcjDVRvmEQ8UzfLJ43
To: /content/cleaned_dataset.feather
100%|██████████| 11.2M/11.2M [00:00<00:00, 34.2MB/s]
Downloading...
From: https://drive.google.com/uc?id=1ElI3fYdTVHE7TzVH4U293-GlgasSbzTm
To: /content/questions.txt
100%|██████████| 309k/309k [00:00<00:00, 38.9MB/s]


'questions.txt'

## Libraries

In [None]:
import os
import pandas as pd
import numpy as np
from tqdm.auto import trange, tqdm
# Document Store
from haystack.utils import (launch_es,
                            print_answers,
                            print_documents,
                            convert_files_to_dicts,
                            clean_wiki_text)

# from haystack.document_stores import FAISSDocumentStore
from haystack.document_stores import ElasticsearchDocumentStore


# Nodes
## Preprocessing
from haystack.nodes import PreProcessor

## Retriever
from haystack.nodes import DensePassageRetriever

## Reader/Generator
from haystack.nodes import Seq2SeqGenerator

# Pipeline
from haystack.pipelines import (DocumentSearchPipeline,
                                GenerativeQAPipeline)

INFO - haystack.modeling.model.optimization -  apex not found, won't use it. See https://nvidia.github.io/apex/


## Hardware Dependencies

In [None]:
try:
  from google.colab import drive
  import os
  if not os.path.isdir('/content/drive'):
    drive.mount('/content/drive')
  try:
    os.chdir('/content/drive/MyDrive/Work/LFQA')
  except Exception:
    os.chdir('/content/drive/MyDrive/LFQA')
except Exception:
  print("You're not in Google Colab")

Get the number of CPU cores, which sets the parallelism available for faster hyperparameter tuning.

In [None]:
import multiprocessing
N_JOBS = multiprocessing.cpu_count()
N_JOBS

2

Check the GPU, if one is available.

In [None]:
!nvidia-smi

Sun Mar 20 23:29:30 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   45C    P0    28W / 250W |      0MiB / 16280MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

## Preprocessing

In [None]:
df = pd.read_feather('dataset/cleaned_dataset.feather')

In [None]:
df.sort_values(by='subreddit', inplace=True)

In [None]:
df.head()

Unnamed: 0,id,subreddit,thread_score,comment_score,title,content,comment
21673,swf1yy,anxiety,24,1,What anxiety meds are good for an as needed basis.,I am going to the doctor in a few days and i do not want a medicine that i w...,I was having anxiety issues and panic attacks for several years. Lorazepam w...
22488,svbebz,anxiety,113,11,Quitting weed.,So i have to quit smoking which i have done for years on the daily its do or...,Try hemp if you still like the ritual of rolling up high cod compounded with...
22487,svbhal,anxiety,13,1,Anxiety makes me stupid.,Does anyone feel that their anxiety has made them dumb? Like i cannot proces...,Its like you described me. My memory is crazy bad.
22486,svbhal,anxiety,13,1,Anxiety makes me stupid.,Does anyone feel that their anxiety has made them dumb? Like i cannot proces...,Removed.
22485,svbhal,anxiety,13,1,Anxiety makes me stupid.,Does anyone feel that their anxiety has made them dumb? Like i cannot proces...,"Meditation, positive thinking sports or any physical activity reading and cl..."


In [None]:
df.drop(index=22486, inplace=True)

In [None]:
df.reset_index(drop=True, inplace=True)

In [None]:
df['title'] = df['title'].str.replace(r'(\si\s)', ' I ', regex=True)
df['content'] = df['content'].str.replace(r'(\si\s)', ' I ', regex=True)
df['comment'] = df['comment'].str.replace(r'(\si\s)', ' I ', regex=True)
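
The pattern `(\si\s)` only matches a lowercase `i` that has whitespace on both sides, so an `i` at the very start of a string or directly before an apostrophe stays lowercase. The sketch below reproduces that behavior on a plain string with the standard `re` module and contrasts it with a word-boundary pattern. `\bi\b` is an alternative suggested here, not what the notebook uses, and it would also capitalize abbreviations such as `i.e.`:

```python
import re

ORIGINAL_PATTERN = r"(\si\s)"   # pattern used in the cell above
BOUNDARY_PATTERN = r"\bi\b"     # assumed alternative, not from the notebook


def capitalize_original(text):
    """Mimic the notebook's replacement on a single string."""
    return re.sub(ORIGINAL_PATTERN, " I ", text)


def capitalize_boundary(text):
    """Capitalize every standalone lowercase 'i', including at string edges."""
    return re.sub(BOUNDARY_PATTERN, "I", text)


sample = "i think i am fine, but i'm tired"
print(capitalize_original(sample))
print(capitalize_boundary(sample))
```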

In [None]:
df

Unnamed: 0,id,subreddit,thread_score,comment_score,title,content,comment
0,swf1yy,anxiety,24,1,What anxiety meds are good for an as needed basis.,I am going to the doctor in a few days and I do not want a medicine that I w...,I was having anxiety issues and panic attacks for several years. Lorazepam w...
1,svbebz,anxiety,113,11,Quitting weed.,So I have to quit smoking which I have done for years on the daily its do or...,Try hemp if you still like the ritual of rolling up high cod compounded with...
2,svbhal,anxiety,13,1,Anxiety makes me stupid.,Does anyone feel that their anxiety has made them dumb? Like I cannot proces...,Its like you described me. My memory is crazy bad.
3,svbhal,anxiety,13,1,Anxiety makes me stupid.,Does anyone feel that their anxiety has made them dumb? Like I cannot proces...,"Meditation, positive thinking sports or any physical activity reading and cl..."
4,svbhal,anxiety,13,1,Anxiety makes me stupid.,Does anyone feel that their anxiety has made them dumb? Like I cannot proces...,"Nope, me, too. I say the most ridiculous things and cannot even carry a norm..."
...,...,...,...,...,...,...,...
28622,srquof,selfimprovement,4,0,"I feel I am always loud and obnoxious, and it is like I cannot stop talking ...",I hate it so much I can see when I am way too loud and annoying everyone aro...,Of people just want to talk they dont care about opinions or input. If you w...
28623,srqh0g,selfimprovement,4,1,Have you guys ever underestimated how much it would take to improve yourselv...,"Really specific question. But hear me out, I have years old, and sometime ag...","Everyone operates on a different time line. My friend, what took someone yea..."
28624,srqh0g,selfimprovement,4,1,Have you guys ever underestimated how much it would take to improve yourselv...,"Really specific question. But hear me out, I have years old, and sometime ag...",I did not take self improvement seriously until my late chill a while and en...
28625,srt7bv,selfimprovement,21,1,How do you retain the will to live and bravery against all the odds in life?,"Hey guys, I am a year old, man. In my final year in university, you could sa...","Exercise, deep breathing and being with nature help me. Alot anxiety medicat..."


In [None]:
df.loc[28623,'title']

'Have you guys ever underestimated how much it would take to improve yourselves and got frustrated because you thought you should have improved faster'

In [None]:
print("\n\n".join([x+'\n'+y+'\n'+z for x, y, z in list(zip(*[["Title: "+value for value in df['title']],
                                              ["Thread\n"+value for value in df['content']],
                                              ["Comments\n"+value for value in df['comment']]]))][0:1]))

Title: What anxiety meds are good for an as needed basis.
Thread
I am going to the doctor in a few days and I do not want a medicine that I would have to take everyday like an sriconfusion. I have gathered a list so far of medicines to research lorazepam xanax probably cant go with xanax because it might react with my birth controlconfusion propranolol. I dont think I will go with this one because I dont want to risk the side effect of hair lossconfusion. If you have any experience with the ones above, please share, as well as if you have any other suggestions. I want it to be something that I can take in the moment before. Also, can you drive under these meds? It would be nice so I could take them before work
Comments
I was having anxiety issues and panic attacks for several years. Lorazepam was my go to medication. It sure helped, but I was never easy taking it because of its addictive potential. Then I found out about cod and gave it a try. It took me some time to figure out how to 

In [None]:
len("\n\n".join([x+'\n'+y+'\n'+z for x, y, z in list(zip(*[["Title: "+value for value in df['title']],
                                              ["Thread\n"+value for value in df['content']],
                                              ["Comments\n"+value for value in df['comment']]]))[0:1]]).split())

228
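
The nested `zip` expression above builds one text entry per row with a `Title:` line, a `Thread` block, and a `Comments` block. A more readable sketch of the same layout follows, using hypothetical helpers `format_entry` and `format_entries` (not part of the notebook) and plain strings instead of DataFrame columns:

```python
def format_entry(title, content, comment):
    """Lay out one row as the Title / Thread / Comments text used above."""
    return f"Title: {title}\nThread\n{content}\nComments\n{comment}"


def format_entries(titles, contents, comments):
    """Join all rows with a blank line between entries."""
    return "\n\n".join(
        format_entry(t, c, m) for t, c, m in zip(titles, contents, comments)
    )


text = format_entries(["Quitting weed."], ["So I have to quit."], ["Try hemp."])
print(text)
print(len(text.split()))  # whitespace-separated word count, as in the cell above
```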

### Save the dataset into multiple documents 

In [None]:
# Create the output directory once, before writing any files
os.makedirs('dataset/reddit_documents', exist_ok=True)

for subreddit in df['subreddit'].unique():
  df_w = df.loc[df['subreddit'] == subreddit]
  # One text file per subreddit; each entry holds the title, thread body, and comment
  entries = [f"Title: {title}\nThread\n{content}\nComments\n{comment}"
             for title, content, comment in zip(df_w['title'], df_w['content'], df_w['comment'])]
  with open(f'dataset/reddit_documents/{subreddit}_document.txt', 'w') as file:
    file.write("\n\n".join(entries))

## Preprocess the Documents

In [None]:
# convert files to dicts of Documents
docs = convert_files_to_dicts(dir_path='dataset/reddit_documents')

INFO - haystack.utils.preprocessing -  Converting dataset/reddit_documents/anxiety_document.txt
INFO - haystack.utils.preprocessing -  Converting dataset/reddit_documents/changemyview_document.txt
INFO - haystack.utils.preprocessing -  Converting dataset/reddit_documents/depression_document.txt
INFO - haystack.utils.preprocessing -  Converting dataset/reddit_documents/mentalhealth_document.txt
INFO - haystack.utils.preprocessing -  Converting dataset/reddit_documents/relationship_advice_document.txt
INFO - haystack.utils.preprocessing -  Converting dataset/reddit_documents/self_document.txt
INFO - haystack.utils.preprocessing -  Converting dataset/reddit_documents/selfimprovement_document.txt


In [None]:
processor = PreProcessor(
    clean_empty_lines=True,
    clean_whitespace=True,
    clean_header_footer=True,
    split_by="word",
    split_length=256,
    split_respect_sentence_boundary=True,
    split_overlap=0
)
processed_docs = processor.process(docs)

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.


100%|██████████| 7/7 [00:18<00:00,  2.61s/docs]
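
With `split_by="word"`, `split_length=256`, and `split_respect_sentence_boundary=True`, each document is cut into passages of roughly 256 words without breaking a sentence in two. The following is a simplified pure-Python sketch of that idea, not Haystack's actual implementation. The function name and the naive punctuation-based sentence splitter are assumptions made for illustration (Haystack relies on NLTK's `punkt`, as the log above shows):

```python
import re


def split_respecting_sentences(text, split_length=256):
    """Group whole sentences into chunks of at most `split_length` words.

    A single sentence longer than `split_length` becomes its own chunk.
    """
    # Naive sentence split: break after ., ! or ? followed by whitespace
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, count = [], [], 0
    for sentence in sentences:
        if not sentence:
            continue
        n_words = len(sentence.split())
        # Start a new chunk if adding this sentence would exceed the limit
        if current and count + n_words > split_length:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks


demo = "One two three. Four five. Six seven eight nine."
for chunk in split_respecting_sentences(demo, split_length=5):
    print(chunk)
```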


## Document Store

In [None]:
len(processed_docs)

41810

In [None]:
# Initialize DocumentStore and index documents
launch_es()
document_store = ElasticsearchDocumentStore(embedding_dim=128)
document_store.delete_documents()
document_store.write_documents(processed_docs)




In [None]:
# document_store = FAISSDocumentStore(similarity='dot_product',
#                                     embedding_dim=128,
#                                     faiss_index_factory_str="Flat")

# # Writing the contents into the store
# document_store.write_documents(processed_docs)

## Creating a Dense Retriever using DPR

In [None]:
retriever = DensePassageRetriever(
    document_store = document_store,
    query_embedding_model="vblagoje/dpr-question_encoder-single-lfqa-wiki",
    passage_embedding_model="vblagoje/dpr-ctx_encoder-single-lfqa-wiki"
)

## Update the store with the embedded values
document_store.update_embeddings(retriever, update_existing_embeddings=False)

INFO - haystack.modeling.utils -  Using devices: CUDA:0
INFO - haystack.modeling.utils -  Number of GPUs: 1


INFO - haystack.modeling.model.language_model -  LOADING MODEL
INFO - haystack.modeling.model.language_model -  Could not find vblagoje/dpr-question_encoder-single-lfqa-wiki locally.
INFO - haystack.modeling.model.language_model -  Looking on Transformers Model Hub (in local cache and online)...


INFO - haystack.modeling.model.language_model -  Loaded vblagoje/dpr-question_encoder-single-lfqa-wiki


The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'DPRQuestionEncoderTokenizer'. 
The class this function is called from is 'DPRContextEncoderTokenizerFast'.
INFO - haystack.modeling.model.language_model -  LOADING MODEL
INFO - haystack.modeling.model.language_model -  Could not find vblagoje/dpr-ctx_encoder-single-lfqa-wiki locally.
INFO - haystack.modeling.model.language_model -  Looking on Transformers Model Hub (in local cache and online)...


INFO - haystack.modeling.model.language_model -  Loaded vblagoje/dpr-ctx_encoder-single-lfqa-wiki
INFO - haystack.document_stores.elasticsearch -  Updating embeddings for 34288 docs without embeddings ...


Updating embeddings:   0%|          | 0/34288 [00:00<?, ? Docs/s]

Create embeddings:   0%|          | 0/10000 [00:00<?, ? Docs/s]

Create embeddings:   0%|          | 0/10000 [00:00<?, ? Docs/s]

Create embeddings:   0%|          | 0/10000 [00:00<?, ? Docs/s]

Create embeddings:   0%|          | 0/4288 [00:00<?, ? Docs/s]

### Testing our retriever (a.k.a. `DPR`) before applying the **Generator**

In [None]:
p_retrieval = DocumentSearchPipeline(retriever)
res = p_retrieval.run(query="How do you retain the will to live and bravery against all the odds in life?", params={"Retriever": {"top_k": 10}})
print_documents(res, max_text_len=512)


Query: How do you retain the will to live and bravery against all the odds in life?

{   'content': 'Here, my viewpoint, it does not matter if free will exists or '
               'not. There is, however, a psychological benefit to believing '
               'that it does. Those who believe in free will are more likely '
               'to report, feeling happier more often than those that don. So '
               'from a utility standpoint, believing in free will is '
               'beneficial toMe and therefore I have to conclude that my '
               'actions matter. Title: Civ. Life is just random chance. '
               'Thread\n'
               'There is no god, no afterlife, no plan, no purpose, no '
               'meaning, no justice, nothing wheter or...',
    'name': 'changemyview_document.txt'}

{   'content': 'Title: How do you retain the will to live and bravery against '
               'all the odds in life? Thread\n'
               'Hey guys, I am a year old, man.

In [None]:
res['documents'][1].content

'Title: How do you retain the will to live and bravery against all the odds in life? Thread\nHey guys, I am a year old, man. In my final year in university, you could say I had a pretty sheltered upbringing and really have not been taking life seriously till now. I realized this is a coping mechanism, because I do not like facing reality or taking charge of my life. I am scared to live. Adulthood just seems to be filled with horrors with very little hope. Even the tiniest mistake can wreck your life. So many things are decided from your birth itself. What family you are born into, the class, you belong to the race you belong to and your genetics. All of these have a massive impact on your life. Making mistakes in adulthood is permanent. You mess up bad, and it will follow you all your life without money, power or influence. I do not know how an average person lives. How do you not live in constant fear? It just seems like in today, society, we are at the mercy of other people, the syst
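
DPR ranks passages by comparing the query embedding with each stored passage embedding and keeping the `top_k` highest scores. The toy sketch below illustrates that ranking step with dot-product similarity on hand-written vectors. The function name and the tiny 3-dimensional vectors are made up for illustration (the store above uses 128-dimensional embeddings), and the similarity function actually configured in the document store may differ:

```python
def top_k_by_dot_product(query_vec, passage_vecs, top_k=10):
    """Return (index, score) pairs for the top_k passages, best first."""
    scores = [
        (index, sum(q * p for q, p in zip(query_vec, vec)))
        for index, vec in enumerate(passage_vecs)
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


query = [1.0, 0.0, 2.0]
passages = [
    [0.5, 0.1, 0.0],
    [1.0, 1.0, 1.0],
    [0.0, 0.0, 2.0],
]
print(top_k_by_dot_product(query, passages, top_k=2))
```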

## Creating our Reader/Generator
We're going to use **BART LFQA** (`vblagoje/bart_lfqa`) to generate the answers.

In [None]:
generator = Seq2SeqGenerator(model_name_or_path="vblagoje/bart_lfqa")

INFO - haystack.modeling.utils -  Using devices: CUDA
INFO - haystack.modeling.utils -  Number of GPUs: 1


## Creating our pipeline

With a Haystack `Pipeline` you can combine your building blocks into a search pipeline.
Under the hood, `Pipelines` are Directed Acyclic Graphs (DAGs) that you can easily customize for your own use cases.
To speed things up, Haystack also comes with a few predefined Pipelines. One of them is the `GenerativeQAPipeline` that combines a retriever and a reader/generator to answer our questions.
You can learn more about `Pipelines` in the [docs](https://haystack.deepset.ai/docs/latest/pipelinesmd).
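
Conceptually, a generative QA pipeline is a two-node chain: the retriever turns a query into documents, and the generator turns the query plus those documents into answers. The toy sketch below shows that data flow with plain functions. Every name in it is hypothetical, and it neither uses nor mirrors Haystack's classes:

```python
def run_chain(query, nodes):
    """Pass a shared state dict through each node in order (a linear DAG)."""
    state = {"query": query}
    for node in nodes:
        state.update(node(state))
    return state


def toy_retriever(state, corpus=("Sleep helps anxiety.", "Cats purr.")):
    """Keep documents sharing at least one lowercase word with the query."""
    words = set(state["query"].lower().split())
    hits = [doc for doc in corpus if words & set(doc.lower().rstrip(".").split())]
    return {"documents": hits}


def toy_generator(state):
    """Produce a placeholder answer that reports how many documents were used."""
    return {"answers": [f"Based on {len(state['documents'])} document(s)."]}


result = run_chain("what helps anxiety", [toy_retriever, toy_generator])
print(result["answers"][0])
```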

In [None]:
# SearchSummarizationPipeline
# FAQPipeline (FASTER)
# QuestionGenerationPipeline (Generating Questions from the documents)
# QuestionAnswerGenerationPipeline (Generating Questions from documents, it can answer these questions using Reader Model)

pipeline = GenerativeQAPipeline(generator=generator, retriever=retriever)

## Testing Run

In [None]:
predictions = pipeline.run(
  # query="What is the benefit of life?",
  query="How do you retain the will to live and bravery against all the odds in life?",
  params={"Retriever": {"top_k": 20}, "Generator": {"top_k": 1}})

In [None]:
print_answers(predictions)


Query: How do you retain the will to live and bravery against all the odds in life?
Answers:
[   <Answer {'answer': 'I don\'t know if this is what you\'re looking for, but I\'ll give it a shot. When I was a kid, I was in a car accident. I didn\'t know what was going to happen to me, and I had no way of knowing what would happen to my family or friends. I was told that I would die, and that I had to get out of the car as fast as I could. So I did. I went to the hospital, and they told me that my family and friends would die if I stayed in the car. I said, "I\'m not going to die, I\'m going to fight for my family." I went back home, and my family came and picked me up. They took me to my parents\' house, where I lived for the rest of my life.', 'type': 'generative', 'score': None, 'context': None, 'offsets_in_document': None, 'offsets_in_context': None, 'document_id': None, 'meta': {'doc_ids': ['9c8741de3cce7dbf8bd252460724c09f', '14d28edaa30e79915e6171608d96edd3', '76e1c6240c6716da455e

In [None]:
print_documents(predictions)


Query: How do you retain the will to live and bravery against all the odds in life?

{   'content': 'Here, my viewpoint, it does not matter if free will exists or '
               'not. There is, however, a psychological benefit to believing '
               'that it does. Those who believe in free will are more likely '
               'to report, feeling happier more often than those that don. So '
               'from a utility standpoint, believing in free will is '
               'beneficial toMe and therefore I have to conclude that my '
               'actions matter. Title: Civ. Life is just random chance. '
               'Thread\n'
               'There is no god, no afterlife, no plan, no purpose, no '
               'meaning, no justice, nothing wheter or not. You get to live. A '
               'good life has nothing to do with how good you are. No, your '
               'life has been decided from the very beginning. Thanks to many '
               'circumstances you cann

In [None]:
[answer.to_dict()['answer'] for answer in predictions['answers']]

['I don\'t know if this is what you\'re looking for, but I\'ll give it a shot. When I was a kid, I was in a car accident. I didn\'t know what was going to happen to me, and I had no way of knowing what would happen to my family or friends. I was told that I would die, and that I had to get out of the car as fast as I could. So I did. I went to the hospital, and they told me that my family and friends would die if I stayed in the car. I said, "I\'m not going to die, I\'m going to fight for my family." I went back home, and my family came and picked me up. They took me to my parents\' house, where I lived for the rest of my life.']

In [None]:
print([answer.to_dict()['answer'] for answer in predictions['answers']][0])

I don't know if this is what you're looking for, but I'll give it a shot. When I was a kid, I was in a car accident. I didn't know what was going to happen to me, and I had no way of knowing what would happen to my family or friends. I was told that I would die, and that I had to get out of the car as fast as I could. So I did. I went to the hospital, and they told me that my family and friends would die if I stayed in the car. I said, "I'm not going to die, I'm going to fight for my family." I went back home, and my family came and picked me up. They took me to my parents' house, where I lived for the rest of my life.


## Generate Answer(s) using the previous pipeline

In [None]:
questions_list = None
with open('dataset/questions.txt', 'r') as questions:
  questions_list = [question.strip('\n') for question in questions.readlines()]

In [None]:
questions_list = [text for text in questions_list if text != '']

In [None]:
from collections import defaultdict
QA = defaultdict(list)

for question in tqdm(questions_list, desc='Generating Answer(s)'):
  QA['Question'].append(question)
  prediction = pipeline.run(
    query=question,
    params={"Retriever": {"top_k": 10}, "Generator": {"top_k": 1}})
  # Keep only the first (top-ranked) generated answer
  QA['Answer'].append(prediction['answers'][0].to_dict()['answer'])

Generating Answer(s):   0%|          | 0/6241 [00:00<?, ?it/s]
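
The loop above indexes `[0]` into the answer list, which raises `IndexError` if a query ever comes back with no answers. A defensive sketch, assuming each answer object exposes `to_dict()` with an `'answer'` key as used above; the helper name `first_answer_text` and the stand-in class `FakeAnswer` are hypothetical:

```python
def first_answer_text(prediction, default=""):
    """Return the text of the first answer, or `default` if there is none."""
    answers = prediction.get("answers") or []
    if not answers:
        return default
    return answers[0].to_dict()["answer"]


class FakeAnswer:
    """Stand-in for a pipeline answer object, for demonstration only."""

    def __init__(self, text):
        self.text = text

    def to_dict(self):
        return {"answer": self.text}


print(first_answer_text({"answers": [FakeAnswer("Take a walk.")]}))
print(first_answer_text({"answers": []}, default="(no answer)"))
```

With over six thousand questions in the run, a guard like this lets one empty result be recorded as a placeholder instead of aborting the whole loop.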

### Save the Dictionary of QA

In [None]:
import json

with open('dataset/QA.json', 'w') as qa:
  json.dump(dict(QA), qa, indent=2)

## The Caveat of the process

In [None]:
pd.set_option('display.max_colwidth', 1000)

# Use a context manager so the file handle is closed after loading
with open('dataset/QA.json', 'r') as qa:
  df_QA = pd.DataFrame(json.load(qa))

In [None]:
df_QA

Unnamed: 0,Question,Answer
0,"Given the choice of anyone in the world, whom would you want as a dinner guest?","I don't know if this counts as a question, but I'd like to know who you would like to have as a dinner guest."
1,Would you like to be famous?,"I'd like to be famous, but I don't think I'd be able to do anything with it."
2,In what way?,"I'm not sure what you mean by ""in what way"". If you mean in what way do you define ""good"" and ""bad"" in the same way?"
3,What would constitute a “perfect” day for you?,"It depends on what you mean by ""perfect"". For me, a perfect day would be a day where I didn't feel like I had to do anything, where I was free to do whatever I wanted, and where I felt like I was in charge of my own destiny."
4,When did you last sing to yourself?,"It's been a while since I last sang to myself, but I've been singing to myself a lot lately."
...,...,...
6236,What is Self-Expression and How to Foster It?,"Self-expression is the ability to express yourself in a way that makes you feel good about yourself. For example, if you want to feel better about yourself, you can look at yourself in the mirror and say, ""I'm a good person, and I'm not a bad person."" This is self-expression. If you don't feel like you're doing a good job of expressing yourself, then you can go to a therapist and ask for help."
6237,What is Self-Esteem?,"Self-Esteem is a measure of how you feel about yourself. It can be measured in a number of ways, but the most important one is how confident you are in yourself. If you are confident in yourself, you are more likely to do things that make you feel good about yourself, and you are less likely to feel bad about yourself if you don't do those things."
6238,What is Self-Actualization?,"Self-Actualization is the belief that you are capable of doing anything you put your mind to. For example, if I tell you that you can do anything you want, and you do it, you will feel good about yourself. However, if you tell me that I can't do anything, and I do it anyway, I will feel bad about myself. This is self-actualization."
6239,"What do I do, how do I feel when the circle I am drawing doesn’t include him?","I don't know what you're talking about, but I've had this happen to me since I was a little kid. I'm not sure if it's the same for everyone, but for me, it's when I'm drawing a circle and it doesn't include the person I want to include."
