One natural approach is to use a large language model (LLM) to decide whether each post is relevant. This is what the Aware-NLP project did last semester, and it seems to work well on our data too. Unfortunately, it runs very slowly, so it probably won't be practical at this step.

In [4]:
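# Import the module itself (later cells call ollama.pull and ollama.chat)
# as well as the bare chat and generate functions used below.
import ollama
from ollama import chat, generate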

In [6]:
ollama.pull('dolphin-mistral')
ollama.pull('mistral')

{'status': 'success'}

In [7]:
## Testing the model to make sure it is working.

messages = [
  {
    'role': 'user',
    'content': 'Tell me a joke.',
  },
]

response = chat('mistral', messages=messages)
print(response['message']['content'])

 Sure, here's a classic math joke for you:

Why was the math book sad?

Because it had too many problems!


In [9]:
response = generate('mistral', 'Tell me a joke.')
print(response['response'])

 Of course! Here's a classic one:

Why don't scientists trust atoms?

Because they make up everything!

Hope that brought a smile to your face. If you need more, just ask!


Note that generate takes a lot more time to respond than chat. I've found that 'dolphin-mistral' tends to be a bit faster than 'mistral', so I'm going to use that from here.
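Timing comparisons like the one above can be made repeatable with a small helper. The following is a minimal sketch, not the code used to produce the timings in this notebook: `time_call` and `fake_model` are hypothetical names, and `fake_model` is only a stand-in so the example runs without an Ollama server.

```python
import time

def time_call(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed

# Hypothetical stand-in for a model call (assumption: not part of ollama).
def fake_model(prompt):
    time.sleep(0.01)
    return {'response': prompt.upper()}

result, seconds = time_call(fake_model, 'Tell me a joke.')
print(f"{seconds:.3f} s")
```

With a running server, the same helper should wrap the real calls, for example `time_call(chat, 'mistral', messages=messages)` or `time_call(generate, 'mistral', 'Tell me a joke.')`.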

In [10]:
ollama_response = ollama.chat(model='dolphin-mistral', messages=[
  {
    'role': 'user',
    'content': 'What is a Riemannian manifold?',
  },
],
options = {
  #'temperature': 1.5, # very creative
  'temperature': 0.1 # very conservative (good for coding and correct syntax)
})

# Printing the response from ollama.
print(ollama_response['message']['content'])

A Riemannian manifold is a generalization of the concept of Euclidean space to higher dimensions and more complex geometries. It is a mathematical object that can be thought of as a smooth, curved surface embedded in a higher-dimensional Euclidean space. In other words, it is a differentiable manifold equipped with a Riemannian metric, which assigns a positive definite inner product to each pair of tangent vectors at each point on the manifold.

The Riemannian metric allows us to measure distances and angles between curves on the manifold, and it also determines the curvature of the manifold. This curvature can be either constant (as in Euclidean space) or variable, depending on the specific properties of the manifold.

Riemannian manifolds are important in various fields of mathematics and physics, such as differential geometry, general relativity, and statistical mechanics. They provide a framework for studying geometric structures and their associated properties, such as curvature, 

The answer is correct, but required 3.5 minutes to generate.

#### Running the model on data from r/bpd

Last semester, the Aware-NLP project used ollama to determine whether Reddit posts were relevant. We can try their method to see if it works.

In [11]:
prompt_dict = {
    0: """<s>[INST] <<SYS>>
            The following statement (delimited by ```) provided below is a post from the subreddit r/bpd. 
            The statement should be taken as is. It cannot be used to further a dialogue with the poster. 
            
            Please format the output as a dictionary with the following keys: "relevant", "reason". 
            Relevant should be a boolean value indicating whether the post discusses the medical management of borderline personality disorder.
            Reason should be a string explaining why and how the statement is or is not relevant.
            <</SYS>>
            ```
            Statement: {statement}
            ```
            [/INST]""",
    1: """<s>[INST] <<SYS>>
            The following statement (delimited by ```) provided below is a post from the subreddit r/bpd. 
            The statement should be taken as is. It cannot be used to further a dialogue with the poster.
            The statement should be labeled as relevant (true) if it provides an account (positive, negative, or neutral) 
            of their experience using a certain medication or therapy in order to treat borderline personality disorder. 

            Please format the output as a dictionary with the following keys: "relevant", "reason". 
            Relevant should be a boolean value indicating whether the statement is relevant to the question.
            Reason should be a string explaining why and how the statement is or is not relevant.
            <</SYS>>
            ```
            Statement: {statement}
            ```
            [/INST]"""

}
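The templates above contain a `{statement}` placeholder, so one way to fill them is `str.format`. The sketch below uses a shortened, hypothetical template rather than the full prompts, and `build_prompt`, `DELIM`, and `example_post` are illustrative names that do not appear elsewhere in this notebook.

```python
DELIM = "`" * 3  # the triple-backtick delimiter used in the prompts

# Hypothetical, shortened template; the placeholder name matches prompt_dict.
template = (
    "The following statement (delimited by {delim}) is a post from r/bpd.\n"
    "{delim}\n"
    "Statement: {statement}\n"
    "{delim}"
)

def build_prompt(template, statement, delim=DELIM):
    """Substitute the post text (and delimiter) into the template."""
    return template.format(statement=statement, delim=delim)

example_post = "cbt does not work for me and never has."
prompt = build_prompt(template, example_post)
print(prompt)
```

Because `str.format` ignores unused keyword arguments, the same function should also work on the full templates, e.g. `build_prompt(prompt_dict[1], statement)`.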

In [12]:
statement = '"i’ve been in and out of therapy since i was ten or eleven. i have a therapist now who i’ve known for almost 12 years and i like him a lot. but cbt does not work for me and never has. i don’t want to stop seeing him, though. i did look into dbt therapists in my county and none of them so far take my insurance. i don’t even know if my insurance would pay for the sessions because many don’t when you have a personality disorder. i also have a psychiatrist (i’ve had quite a few). i’ve lost count of how many medications i’ve been on. i had my monthly session with him yesterday and he was visibly frustrated. not at me, but because no medication i’ve tried or am currently on is working like it should. he really wants me to continue therapy because he told me, “i don’t know how else to help you”. i know that he means medication can only do so much and therapy is what will help the best. but it still hurt. he also told me that bpd is extremely hard to treat (which i already know). it really me a lot and now i feel more hopeless and helpless than i ever have, i think. i don’t even want to make another appointment with therapist. if my current one can’t adjust his techniques, or i can’t find a dbt therapist, what am i supposed to do? what if i *do* find a dbt therapist and don’t get anything out of that either? why won’t my medications just work?? i have so many unanswered questions and i’m so angry i just want to give up on treatment altogether."'

In [14]:
ollama_response = ollama.chat(model='dolphin-mistral', messages=[
  {
    'role': 'user',
    'content':  f""" The following statement (delimited by ```) provided below is a post from the subreddit r/bpd. 
            The statement should be taken as is. It cannot be used to further a dialogue with the poster. 
            
            Please format the output as a dictionary with the following keys: "relevant", "reason". 
            Relevant should be a boolean value indicating whether the post discusses the poster's experience with therapy and medications for the treatment of borderline personality disorder.
            Reason should be a string explaining why and how the statement is or is not relevant.
            <</SYS>>
            ```
            Statement: {statement}
            ```, """
  },
],
options = {
  #'temperature': 1.5, # very creative
  #'temperature': 0.1 # very conservative (good for coding and correct syntax)
})

# Printing the response from ollama.
print(ollama_response['message']['content'])

{
    "relevant": true,
    "reason": "The statement discusses the poster's experience with therapy and medications for the treatment of borderline personality disorder. The poster mentions trying various therapies (CBT and Dialectical Behavior Therapy), not finding a suitable therapist, frustration over unsuccessful medications, and feeling hopeless."
}
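Since the model is asked to return a dictionary, its reply can be turned into Python values before further analysis. This is a sketch under the assumption that the reply contains one JSON object, possibly surrounded by extra text; `parse_relevance` is a hypothetical helper, and the `reply` string is an invented example rather than real model output.

```python
import json

def parse_relevance(text):
    """Return (relevant, reason) from a JSON-like reply, or (None, text) on failure."""
    # Keep only the span from the first '{' to the last '}'.
    start, end = text.find('{'), text.rfind('}')
    if start == -1 or end <= start:
        return None, text
    try:
        data = json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return None, text
    relevant = data.get('relevant')
    if not isinstance(relevant, bool):
        return None, text
    return relevant, data.get('reason', '')

# Invented example reply with some chatter before the JSON object.
reply = 'Sure! {"relevant": true, "reason": "Discusses therapy and medication."}'
relevant, reason = parse_relevance(reply)
print(relevant, reason)
```

Returning `None` for unparseable replies keeps them distinguishable from posts the model actually labeled as not relevant.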


This worked very well, but it took 2 minutes to analyze a single post. Unless we find a way to run the model much faster, this won't be practical for classifying the full dataset. However, the tool might still be useful for summarizing the consensus across a large number of posts.
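To make the cost concrete, the per-post time can be scaled up to a whole dataset. The sketch below assumes posts are processed one at a time at a constant 2 minutes each; the dataset sizes are hypothetical, since the actual number of posts is not stated here, and `estimated_hours` is an illustrative name.

```python
def estimated_hours(n_posts, seconds_per_post=120):
    """Sequential wall-clock time in hours at a fixed per-post cost."""
    return n_posts * seconds_per_post / 3600

for n in (100, 1_000, 10_000):  # hypothetical dataset sizes
    print(f"{n:>6} posts: {estimated_hours(n):.1f} hours")
```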