#### Creating a dataframe

Given the time crunch, one possible deliverable is a high-quality dataframe with enough processed data to run sentiment analysis with an LLM. I will spend a bit of time working on this.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [2]:
df = pd.read_csv('../data/highly_relevant_posts_descending_threshold_50.csv')

In [3]:
df.columns

Index(['title', 'selftext', 'created_utc', 'over_18', 'subreddit',
       'date_created', 'combined_text', 'processed_text',
       'relevance_probability'],
      dtype='object')

In [4]:
# Preview the frame without these columns; drop() returns a copy, so df itself is unchanged
df.drop(['subreddit','created_utc','over_18'], axis=1)

Unnamed: 0,title,selftext,date_created,combined_text,processed_text,relevance_probability
0,"Informal survey: What (legal) drugs, if any, '...","hi! so i'm officially diagnosed, tried dbt but...",2020-10-17 16:53:14,"Informal survey: What (legal) drugs, if any, '...",informal survey legal drug worked hi im offici...,1.000000
1,What combination of meds finally worked for yo...,right now i’m on:\nprozac 40mg\nwellbutrin xl ...,2020-06-25 14:47:27,What combination of meds finally worked for yo...,combination med finally worked hasn’t worked r...,1.000000
2,Does anyone have any experience(s) with any an...,"i abused benzodiazepines, so i cannot trust my...",2015-10-12 02:36:25,Does anyone have any experience(s) with any an...,anyone experience anxiolytic medicine besides ...,1.000000
3,My experience with lamictal/lamotrigine,this is going to be a very long post because i...,2021-08-12 14:13:55,My experience with lamictal/lamotrigine this i...,experience lamictallamotrigine going long post...,1.000000
4,How did antidepressants modify your behavior a...,"hello, i'd like to know :\n\n1) if you've been...",2020-01-31 18:47:16,How did antidepressants modify your behavior a...,antidepressant modify behavior cognition hello...,1.000000
...,...,...,...,...,...,...
2551,Meds seem to not be working.,i keep adding another flare but when i post an...,2021-09-13 22:10:20,Meds seem to not be working. i keep adding ano...,med seem working keep adding another flare pos...,0.500361
2552,Do you ever feel like your medication is actua...,i've been on lamotrigine for 7 months now. i w...,2020-05-08 03:11:07,Do you ever feel like your medication is actua...,ever feel like medication actually making feel...,0.500314
2553,Meds question,hi i was just wondering is anyone on mood stab...,2019-01-14 13:16:56,Meds question hi i was just wondering is anyon...,med question hi wondering anyone mood stabiliz...,0.500182
2554,"I feel much happier, but also far more depress...",i started dating my first boyfriend a couple m...,2021-01-08 05:41:46,"I feel much happier, but also far more depress...",feel much happier also far depressed boyfriend...,0.500148


In [1]:
#pip install drug-named-entity-recognition

Defaulting to user installation because normal site-packages is not writeable
Collecting drug-named-entity-recognition
  Obtaining dependency information for drug-named-entity-recognition from https://files.pythonhosted.org/packages/82/9c/36f8d3e85eeb22283310120064da53c526c028d3aa6600c44508eeb0843c/drug_named_entity_recognition-1.0.3-py3-none-any.whl.metadata
  Downloading drug_named_entity_recognition-1.0.3-py3-none-any.whl.metadata (17 kB)
Downloading drug_named_entity_recognition-1.0.3-py3-none-any.whl (1.1 MB)
Installing collected packages: drug-named-entity-recognition
Successfully installed drug-named-entity-recognition-1.0.3

In [5]:
from drug_named_entity_recognition import find_drugs

[({'name': 'Acetaminophen', 'synonyms': {'Acetamidophenol', 'Datril', 'Tylenol', 'Acetominophen', 'Acetaco', 'Actamin', 'Ofirmev', 'Paracetamol', 'Acetaminophen', 'Hydroxyacetanilide', 'Algotropyl', 'Acamol', 'Acenol', 'Panadol', 'Paracetamolum', 'Acephen'}, 'medline_plus_id': 'a621016', 'generic_names': ['Acetaminophen'], 'mesh_id': 'D058633', 'drugbank_id': 'DB00316', 'wikipedia_url': 'https://en.wikipedia.org/wiki/Paracetamol'}, 2, 2)]


In [6]:
def get_drug_list(text):
    """Return the sorted, de-duplicated drug names mentioned in text."""
    # find_drugs takes a list of tokens and returns (drug_info, start, end) tuples
    drug_tuples = find_drugs(text.split(" "), is_ignore_case=True)
    drug_names = [drug_info.get('name') for drug_info, *_ in drug_tuples]
    return sorted(set(drug_names))

In [7]:
df['drug_list'] = df['processed_text'].apply(get_drug_list)

#### Building a dictionary for searching posts by medication

An index from each medication name to the rows that mention it makes it quick to look up posts by medication. A dictionary of lists works well for this.

In [8]:
from collections import defaultdict


def build_medication_index(df, column_name):
    med_index = defaultdict(list)
    for idx, row in df.iterrows():
        for med in row[column_name]:
            med_index[med].append(idx)
    return med_index

def find_medication_with_index(med_index, df, medication_name):
    if medication_name in med_index:
        matching_indices = med_index[medication_name]
        return df.loc[matching_indices]
    else:
        return pd.DataFrame()  # Return empty DataFrame if medication not found

# Build the index
med_index = build_medication_index(df, 'drug_list')

# Call the function with a specific medication name
medication_name = 'Lamotrigine'
result = find_medication_with_index(med_index, df, medication_name)
print(result)


                                                  title  \
0     Informal survey: What (legal) drugs, if any, '...   
3               My experience with lamictal/lamotrigine   
8     What medication(s) work best for when you have...   
12    My first time Inpatient: how I learned to advo...   
14    Tomorrow I am seeing my doctor pertaining to m...   
...                                                 ...   
2494                 Lamictal experiences/side effects?   
2495                                    Lamictal Online   
2544                          Paranoia is getting worse   
2545           TW - have i took enough to cause damage?   
2552  Do you ever feel like your medication is actua...   

                                               selftext  created_utc  over_18  \
0     hi! so i'm officially diagnosed, tried dbt but...   1602953594    False   
3     this is going to be a very long post because i...   1628777635    False   
8     i am formally diagnosed with severe genera

In [9]:
def get_synonyms(medication):
    """Return one synonym set per drug recognised in the medication string."""
    drug_tuples = find_drugs(medication.split(" "), is_ignore_case=True)
    return [drug_info.get('synonyms') for drug_info, *_ in drug_tuples]

In [10]:
get_synonyms(medication_name)

[{'Crisomet',
  'Labileno',
  'Lamictal',
  'Lamiktal',
  'Lamotrigina',
  'Lamotrigine',
  'Lamotriginum'}]
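
The index built earlier is keyed on the canonical name, and `find_medication_with_index` matches keys exactly, so a query such as 'lamictal' would come back empty. A minimal sketch of one way around this is a lookup table from every lowercased synonym to its canonical name. The names `build_synonym_lookup` and `canonical_name` are hypothetical, and the synonym set is copied by hand from the output above so that the example runs without the drug library.

```python
# Synonyms copied from the get_synonyms('Lamotrigine') output above.
LAMOTRIGINE_SYNONYMS = {
    'Crisomet', 'Labileno', 'Lamictal', 'Lamiktal',
    'Lamotrigina', 'Lamotrigine', 'Lamotriginum',
}


def build_synonym_lookup(canonical_to_synonyms):
    """Map every lowercased synonym to its canonical medication name."""
    lookup = {}
    for canonical, synonyms in canonical_to_synonyms.items():
        lookup[canonical.lower()] = canonical
        for synonym in synonyms:
            lookup[synonym.lower()] = canonical
    return lookup


def canonical_name(lookup, query):
    """Return the canonical name for a query, or None if it is unknown."""
    return lookup.get(query.strip().lower())


synonym_lookup = build_synonym_lookup({'Lamotrigine': LAMOTRIGINE_SYNONYMS})
print(canonical_name(synonym_lookup, 'lamictal'))
```

The canonical name returned here could then be passed to `find_medication_with_index` in place of the raw query.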

#### Trying to use an LLM to generate a consensus report

In [13]:
pip install ollama

Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.


In [14]:
import ollama
from ollama import generate, chat, pull

In [15]:
ollama.pull('dolphin-mistral')


KeyboardInterrupt: 

In [None]:
ollama.pull('mistral')
ollama.pull('phi3')
ollama.pull('tinyllama')
ollama.pull('all-minilm')

In [24]:
#Getting a joke to see if things are working

ollama_response = ollama.chat(model='dolphin-mistral', messages=[
  {
    'role': 'user',
    'content': 'Tell me a joke.',
  },
],
options = {
  #'temperature': 1.5, # very creative
  #'temperature': 0.1 # very conservative (good for coding and correct syntax)
})

# Printing the response from ollama.
print(ollama_response['message']['content'])

Why don't scientists trust atoms? Because they make up everything!


#### Testing models

The idea is to run an LLM on every post that discusses a given medication to determine the poster's sentiment. Getting reasonable results will require careful model selection and prompt engineering. The rest of this section records a series of experiments aimed at improving output quality and speed.
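
Since the experiments in this section rebuild the prompt by hand for each medication and post, one option is to wrap it in a small helper. This is only a sketch: `build_prompt` and `FENCE` are hypothetical names, and the wording is a condensed version of the prompts tried in the cells that follow rather than a tuned final prompt.

```python
FENCE = '`' * 3  # three backticks, used to delimit the post inside the prompt


def build_prompt(medication, text):
    """Build the sentiment prompt for one medication and one post."""
    instructions = (
        f'The passage enclosed in {FENCE} {FENCE} contains some discussion '
        f'about the medication {medication}. The response should only consider '
        'this medication and none others. Please format the output as a '
        'dictionary with the following keys, "sentiment", "reason". '
        'For "sentiment," respond with \'1\' if the poster had a positive '
        'experience with the medication, \'-1\' if the experience was negative, '
        '\'0\' if it was mixed, and \'-\' if there is not enough information. '
        '"Reason" should be a short sentence about the experience the poster '
        'had with the medication, including effectiveness, side effects, '
        'interactions with other medications or addictions.'
    )
    # Wrap the post in the delimiter so the model can tell instructions from content.
    return f'{instructions} {FENCE}{text}{FENCE}'


prompt = build_prompt(
    'Lamotrigine',
    'i can feel the lamotrigine making my mood changes much better.',
)
print(prompt)
```

Keeping the wording in one place makes it easier to change the prompt once and rerun every model against the same text.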

In [80]:
text = """

so far i've been on celexa, lexapro, paxil, zoloft, adderall, vyvanse, klonopin, xanax, gabapentin, clonidine, seroquel, wellbutrin, perphenazine. 

ones that worked: adderall, vyvanse, klonopin, xanax, gabapentin

unfortunately i can't get those (except gabapentin) anymore bc i was abusing a lot of drugs and had to go to rehab so my dr won't give me them.

right now i just got off clomipramine. trying to find a substitute for it because i have no energy, motivation, etc. my dr mentioned snris but i'm scared to try.

what meds are you on? which ones worked?"""

message = 'The passage enclosed in ``` ``` contains a discussion about the medication ' +  medication_name + ' (possibly in terms of one of its synonyms, which can be found in the following list: ' + str(get_synonyms(medication_name)).strip('[,],{,}') + '). Ignore all the other medications and treatments listed in the post and respond with two outputs. The first output should be 1 if the poster had a positive experience with the medication, -1 if the experience was negative, 0 if the experience was mixed and blank if there is not enough information to determine outcome. The second output should be single sentence describing the experience the poster had with this particular medication. ```' + text + '```'

In [102]:
message = 'The passage enclosed in ``` ``` contains some discussion about the medication ' +  'adderall' + """. The response should only consider this medication and none others. Please format the output as a dictionary with the following keys, "sentiment", "reason". 
            For "sentiment," respond with '1' if the poster had a positive experience with the medication. Respond with  '-1' if the poster had a negative experience with the medication. Respond with '0' if the poster had a mixed experience and respond with '-' if there is not enough information to determine this information.
            "Reason" should be a short sentence about the experience the poster had with the medication, including whether it was effective, had side effects, interacted with other medications or they had addiction issues. ```""" + text + '```'

In [79]:
print(message)

The passage enclosed in ``` ``` contains some discussion about the medication Perphenazine. Ignore all the other medications and treatments listed in the post. Please format the output as a dictionary with the following keys: "sentiment", "reason". 
            "Sentiment" should be value indicating whether the user had a positive or negative experience with the medication. Use '1' for a positive experience, '-1' for negative, '0' for mixed and '-' if there is not enough information to determine.
            "Reason" should be a string explaining the experience the poster had. ```what worked and what didn't?

so far i've been on celexa, lexapro, paxil, zoloft, adderall, vyvanse, klonopin, xanax, gabapentin, clonidine, seroquel, wellbutrin, perphenazine. 

ones that worked: adderall, vyvanse, klonopin, xanax, gabapentin

unfortunately i can't get those (except gabapentin) anymore bc i was abusing a lot of drugs and had to go to rehab so my dr won't give me them.

right now i just got of

In [67]:
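def analyze_text_with_llama(texts, message):
    """Send the instruction prefix plus each post to the model; return the replies."""
    responses = []
    for text in texts:
        ollama_response = ollama.chat(
            model='dolphin-mistral',
            messages=[
                {
                    'role': 'user',
                    # Append the current post, wrapped in triple backticks, to the instructions
                    'content': message + ' ```' + text + '```',
                },
            ],
            options={
                #'temperature': 1.5, # very creative
                #'temperature': 0.1 # very conservative (good for coding and correct syntax)
            })
        # The reply text lives under ['message']['content'], as in the joke test above
        responses.append(ollama_response['message']['content'].strip())
    return responses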

In [73]:
ollama_response = ollama.chat(model='dolphin-mistral', messages=[
            {
                'role': 'user',
                'content': message
                },
                    ],
                options = {
                    #'temperature': 1.5, # very creative
                    #'temperature': 0.1 # very conservative (good for coding and correct syntax)
                    })
print(ollama_response['message']['content'])

The passage discusses the medication Perphenazine (synonyms: 'Perphenazin', 'Perfenazine', 'Chlorpiprazine', 'Ethaperazine', 'Trilafon', 'Etaperazine', 'Etaperazin', 'Perphenazine', 'Perphenazinum', 'Perfenazina'). The poster has a mixed experience with the medication. They mention that they took Perphenazine as part of their previous treatment, but they were unable to get it anymore due to drug abuse issues. The poster also mentions trying other medications like celexa, lexapro, paxil, zoloft, adderall, vyvanse, klonopin, xanax, gabapentin, clonidine, seroquel, wellbutrin, and clomipramine. The meds that worked for them were Adderall, Vyvanse, Klonopin, Xanax, and Gabapentin. They are currently looking for a substitute for Clomipramine due to low energy levels.


The above response is accurate but took 10 minutes to process a single post, and it does not focus on the medication of interest. We need to speed this up drastically through a combination of model selection and prompt engineering. I started by trying tinyllama. Those runs were faster but returned garbage. The prompt engineering did improve the output, though, so I returned to a larger model and tried to make the instructions clearer to improve both quality and speed.
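
To compare models and prompts on speed, it helps to record the wall-clock time of each call rather than estimate it. A minimal sketch, assuming nothing beyond the standard library: `timed_call` is a hypothetical helper, and `fake_model` is a stand-in so the example runs without Ollama installed.

```python
import time


def timed_call(func, *args, **kwargs):
    """Run func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def fake_model(prompt):
    """Stand-in for a model call so the sketch runs without Ollama."""
    time.sleep(0.01)
    return '{"sentiment": 1, "reason": "placeholder"}'


reply, seconds = timed_call(fake_model, 'Tell me a joke.')
print(f'{seconds:.3f} s')
```

In the notebook, `fake_model` would be replaced by a function that wraps the `ollama.chat` call for the model under test.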

In [103]:
ollama_response = ollama.chat(model='dolphin-mistral', messages=[
            {
                'role': 'user',
                'content': message
                },
                    ],
                options = {
                    #'temperature': 1.5, # very creative
                    #'temperature': 0.1 # very conservative (good for coding and correct syntax)
                    })
print(ollama_response['message']['content'])

"sentiment": 1, "reason": "The poster had a positive experience with Adderall and Vyvanse, mentioning that they work for them."


I would really like to see a mixed result here, since the medication helped their symptoms but they developed dependency issues. That is an extremely subtle distinction, though. Let's use this prompt with the smaller model to see what happens.

In [104]:
ollama.pull('medllama2')

{'status': 'success'}

In [100]:
ollama_response = ollama.chat(model='tinyllama', messages=[
            {
                'role': 'user',
                'content': message
                },
                    ],
                options = {
                    #'temperature': 1.5, # very creative
                    #'temperature': 0.1 # very conservative (good for coding and correct syntax)
                    })
print(ollama_response['message']['content'])

Response:

Sentiment: 1
Reason: Positive experience with medication Adderall

Sentiment: -1
Reason: Negative experience with medication Additional side effects, interactions, and dependency issues.

Reason: Mixed experience with medication Gabapeintin as I abused a lot of drug during my rehab process.

Sentiment: 0
Reason: No enough information to determine if there is not enough information about the experience with medication Additional side effects, interactions, and dependency issues.


That's not useful. Let's try a medical LLM.

In [106]:
ollama_response = ollama.chat(model='medllama2', messages=[
            {
                'role': 'user',
                'content': message
                },
                    ],
                options = {
                    #'temperature': 1.5, # very creative
                    #'temperature': 0.1 # very conservative (good for coding and correct syntax)
                    })
print(ollama_response['message']['content'])

"Sentiment": "1", as the poster had a positive experience with Adderall. 
"Reason": "Adderall was effective in controlling my ADHD symptoms and helped me focus." However, they also mention that they were abusing other drugs and cannot currently get those medications due to this history.


The model assumed the poster had ADHD, which was not stated in the post. Therefore, I don't think we should use this model.

In [107]:
ollama_response = ollama.chat(model='phi3', messages=[
            {
                'role': 'user',
                'content': message
                },
                    ],
                options = {
                    #'temperature': 1.5, # very creative
                    #'temperature': 0.1 # very conservative (good for coding and correct syntax)
                    })
print(ollama_response['message']['content'])

{
  "sentiment": "-1",
  "reason": "The poster had a negative experience with Adderall and other medications due to abuse leading to rehab."
}
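
The models above returned the requested dictionary in several shapes: bare key-value pairs without braces, values wrapped in prose, and a proper JSON object with the sentiment as a string. Aggregating results across posts would therefore need a tolerant parser. The following is a rough sketch, not a complete solution: `parse_reply` is a hypothetical helper that tries JSON first and falls back to regular expressions, treating '-' or anything unrecognised as missing.

```python
import json
import re

SENTIMENT_RE = re.compile(r'"?sentiment"?\s*:\s*["\']?(-1|0|1|-)', re.IGNORECASE)
REASON_RE = re.compile(r'"?reason"?\s*:\s*"([^"]*)"', re.IGNORECASE)


def parse_reply(reply):
    """Return {'sentiment': int or None, 'reason': str} from a model reply."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        data = None
    if isinstance(data, dict):
        # Well-formed JSON object: normalise key case and read the two fields.
        lowered = {str(key).lower(): value for key, value in data.items()}
        sentiment = str(lowered.get('sentiment', '')).strip()
        reason = str(lowered.get('reason', ''))
    else:
        # Fall back to pattern matching on loosely formatted replies.
        match = SENTIMENT_RE.search(reply)
        sentiment = match.group(1) if match else ''
        match = REASON_RE.search(reply)
        reason = match.group(1) if match else ''
    # '-' and anything unrecognised mean "not enough information".
    value = int(sentiment) if sentiment in {'-1', '0', '1'} else None
    return {'sentiment': value, 'reason': reason}


print(parse_reply('"sentiment": 1, "reason": "worked"'))
print(parse_reply('{"sentiment": "-1", "reason": "abuse"}'))
```

A reply that contains several sentiment values, like the tinyllama output, would only yield the first one, so such replies are probably better discarded than parsed.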


Let's try a less ambiguous post.

In [109]:
text2="""okay, right now i am currently on 200mg lamotrigine a day. no side effects other than extremely tiredness, almost somnolence in the evening. which is no problem, because i have had insomnia for years. i do wake up two hours after falling asleep, then i need to go to my bathroom and take a promethazine anti-histamine i was prescribed to fall asleep, and then i sleep until about 6 in the morning (going to bed between 9:30 and waking up 06:00 in the morning.

i can feel the lamotrigine making my mood changes much better. it's still not regulated totally, i still cry from time to time, but now it's like i can't cry. which relieves me alot because i used to cry every day.

but my mood is still changing at at stage where it's still frustrating me and i'm bored to death. still got the feeling the everything i do is meaningless because we're all going to die at one time or another.
__________________________
but, the question is, anybody with bpd that get's a higher dose than 200mg a day? i take 100mg in the morning at 06:00 am and another at 06:00 pm. 12 hours between to keep blood values stable.

and anyone that just needed another medicine like an ssri or something together with lamotrigine, i read that someone needed that. but ssri's have always made me tired, fat and inactive. any ssri's with least side effects?

i also read that an anti-psychotic would work nice together with lamotrigine, especially abilify.
i am not happy about anti-psychotics because it think that i am a victim of long term side effects such as inactive libido, but maybe abilify is better than quetiapin i got for two years and became almost impossible to get an erection.

i also read that lithium does wonders for some people with unstable mood?

can anyone confirm any of this and tell me their "wonder combination" to get my motivation and happiness back in life?

looking forward to get some answers.
thank you in advance.

- angerfox"""

In [115]:
message = 'The passage enclosed in ``` ``` contains some discussion about the medication ' +  'Lamotrigine' + """. The response should only consider this medication and none others. Please format the output as a dictionary with the following keys, "sentiment", "reason". 
            For "sentiment," respond with '1' if the poster had a positive experience with the medication. Respond with  '-1' if the poster had a negative experience with the medication. Respond with '0' if the poster had a mixed experience and respond with '-' if there is not enough information to determine this information.
            "Reason" should be a short sentence about the experience the poster had with the medication, including effectiveness, side effects, interactions with other medications or addictions. ```""" + text2 + '```'

In [111]:
ollama_response = ollama.chat(model='dolphin-mistral', messages=[
            {
                'role': 'user',
                'content': message
                },
                    ],
                options = {
                    #'temperature': 1.5, # very creative
                    #'temperature': 0.1 # very conservative (good for coding and correct syntax)
                    })
print(ollama_response['message']['content'])

{
  "sentiment": 1,
  "reason": "Lamotrigine seems to be effective for the user's mood changes with minimal side effects."
}


That's pretty good. It's still slow, but that's to be expected. Let me try putting two posts together to see the run time.

In [112]:
text3= """hey guys,

little bit of a back story for context:

i've been admitted to a psychiatric hospital due a recent suicide attempt.

before i was admitted i was taking 150mg venlafaxine (effexor) 200mg lamotrigine (lamictal) and 25-50 quetiapine (seroquel prn)

since my admission my doses have been increased to 225mg venlafaxine, 300 lamotrigine and 50ir seroquel and 50xr seroquel note as well as prn 10mg valium. my psych knows that i am recovering from an eating disorder so i am quite picky and choosy about what type of medications i am on, luckily i haven't gained weight since taking the prn seroquel for the past year (it can be on and off, sometimes one or two a night but then i can go two weeks without) she knows that i am concerned that i have increased the seroquel regularly at note, which she reassures me is a short term thing.

so, anyway - the reason for this post is that i have been started on 2.5mg of abilify, creating a large cocktail of medications. does anybody have any experience being on this combination and what is your story?

thankyou in advance"""

In [113]:
message = 'The passages enclosed in ``` ``` contains some discussion about the medication ' +  'Lamotrigine' + """. The response should only consider this medication and none others. For each passage, please format the output as a dictionary with the following keys, "sentiment", "reason". 
            For "sentiment," respond with '1' if the poster had a positive experience with the medication. Respond with  '-1' if the poster had a negative experience with the medication. Respond with '0' if the poster had a mixed experience and respond with '-' if there is not enough information to determine this information.
            "Reason" should be a short sentence about the experience the poster had with the medication, including whether it was effective, had side effects, interacted with other medications or they had addiction issues. ```""" + text2 + '```, ``` '+ text3 + '```'

In [114]:
ollama_response = ollama.chat(model='dolphin-mistral', messages=[
            {
                'role': 'user',
                'content': message
                },
                    ],
                options = {
                    #'temperature': 1.5, # very creative
                    #'temperature': 0.1 # very conservative (good for coding and correct syntax)
                    })
print(ollama_response['message']['content'])

KeyboardInterrupt: 

I terminated this after half an hour. It is running too slowly.
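
Rather than interrupting the kernel by hand, one option is to send posts one at a time and stop once a time budget is used up, keeping whatever has finished. This is a sketch under that assumption: `run_with_budget` is a hypothetical helper, `analyze` stands for any function that takes one post and returns a reply, and the check only happens between posts, so it cannot cut short a call that is already running.

```python
import time


def run_with_budget(posts, analyze, budget_seconds):
    """Analyze posts one at a time until the time budget is spent.

    Returns the results gathered so far and the number of posts left over.
    """
    results = []
    start = time.perf_counter()
    for position, post in enumerate(posts):
        if time.perf_counter() - start > budget_seconds:
            return results, len(posts) - position
        results.append(analyze(post))
    return results, 0


# str.upper stands in for the model call so the example runs anywhere.
done, remaining = run_with_budget(['post one', 'post two'], str.upper, 60)
print(done, remaining)
```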

In [116]:
ollama_response = ollama.chat(model='tinyllama', messages=[
            {
                'role': 'user',
                'content': message
                },
                    ],
                options = {
                    #'temperature': 1.5, # very creative
                    #'temperature': 0.1 # very conservative (good for coding and correct syntax)
                    })
print(ollama_response['message']['content'])

KeyboardInterrupt: 