## Conversation Ending
### Goal

Can I use a model to tell me when a conversation has ended?

### Method

Start gathering up non-threaded messages in a channel. Use a UDF to call a model to determine if the conversation is continuing or new.

Use [Ray remote Actors](https://docs.ray.io/en/latest/ray-core/actors.html) to gather messages and call the LLM. Only include messages that are part of the current conversation.

#### Install the tools, initiate the things

In [None]:
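# openai is pinned below 1.0 because this notebook uses `openai.ChatCompletion`, which was removed in 1.0
%pip install -q kaskada==0.6.0a4 "openai<1" llama-cpp-python ipywidgets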

In [None]:
%pip install -U "ray[all]"

In [None]:
# Note that we set OPENAI_API_KEY as a Ray runtime environment variable instead of
# initializing the openai library directly, as we did in the other examples.

import ray, getpass, openai

runtime_env = {
    "env_vars": {"OPENAI_API_KEY": getpass.getpass('OpenAI: API Key')}
}

ray.init(runtime_env=runtime_env)

In [None]:
import pandas as pd
import kaskada as kd

# Initialize Kaskada with a local execution context.
kd.init_session()

# set pandas to display all floats with 6 decimal places
pd.options.display.float_format = '{:.6f}'.format

#### Pull in the user list, create a `format_user()` function

In [None]:
users_df = pd.read_json("slack-generation.users.json")

columns_to_keep = ["id", "team_id", "name", "deleted", "real_name", "is_bot", "updated"]

users_df.drop(columns=users_df.columns.difference(columns_to_keep), inplace=True)

users = {}
for user in users_df.to_dict(orient='index').values():
    users[user["id"]] = user

In [None]:
def get_user(user_id):
    # dict.get returns None when the user_id is unknown
    return users.get(user_id)

def format_user(user_id):
    user = get_user(user_id)
    return f"{user['name']} ({user_id})" if user else f"({user_id})"

format_user("UBB9D2B01")
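
As a quick check of the fallback behaviour, here is a minimal standalone sketch. The `users` dict below is a hypothetical one-entry stand-in for the dict built from `slack-generation.users.json`; the name `userb` is taken from the few-shot examples later in this notebook.

```python
# Hypothetical stand-in for the `users` dict built from the Slack export.
users = {"UBB9D2B01": {"id": "UBB9D2B01", "name": "userb"}}

def format_user(user_id):
    # Unknown IDs yield None from dict.get, so we fall back to the bare ID
    user = users.get(user_id)
    return f"{user['name']} ({user_id})" if user else f"({user_id})"

known = format_user("UBB9D2B01")    # user present in the dict
unknown = format_user("U00000000")  # user missing from the dict
print(known)
print(unknown)
```

Keeping the user ID in the output even for known users means the LLM can still tell apart two users who share a display name.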

#### Load the slack data, clean the message text, format message users

In [None]:
# Load events from a Parquet file
#
# If you want to load your own Slack data, change this to the path of your output file from 1.1 above.
# Otherwise continue with `slack-generation.parquet`, which contains generated Slack data for
# example purposes. See the `slack-generation/notebook.ipynb` notebook for more info.
input_file = "slack-generation.parquet"

# Use the "ts" column as the time associated with each row,
# and the "channel" column as the entity associated with each row.
raw_msgs = await kd.sources.Parquet.create(
    input_file,
    time_column = "ts",
    key_column = "channel",
    time_unit = "s"
)
raw_msgs.preview(5)

In [None]:
import json

@kd.udf("f<N: any>(x: N) -> string")
def format_users(batch: pd.Series):
    # Apply to each row in the batch
    return batch.map(format_user)

In [None]:
# Clean Text
import re

def strip_code_blocks(line):
    # re.DOTALL lets `.` match newlines, so multi-line code blocks are stripped too
    return re.sub(r"```.*?```", '', line, flags=re.DOTALL)

def user_repl(match_obj):
    user_id = match_obj.group(1)
    return format_user(user_id)

def update_users(line):
    return re.sub(r"<@(.*?)>", user_repl, line)

def clean_message(text):
    text = strip_code_blocks(update_users(text)).strip()
    return None if text == "" else text

@kd.udf("f<N: any>(x: N) -> string")
def clean_text(batch: pd.Series):
    # Apply to each row in the batch
    return batch.map(clean_message)
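
To see what the cleaning step does to a single message, here is a minimal standalone sketch outside of Kaskada. `SAMPLE_USERS` and the sample strings are made up for illustration, and the sketch assumes `re.DOTALL` so that code blocks spanning several lines are removed as well.

```python
import re

# Hypothetical stand-in for the notebook's user lookup.
SAMPLE_USERS = {"UFB3DA5BF": "userc"}

# Build the three-backtick marker programmatically to keep this example readable.
FENCE = "`" * 3
CODE_BLOCK = re.compile(FENCE + r".*?" + FENCE, flags=re.DOTALL)
MENTION = re.compile(r"<@(.*?)>")

def format_user(user_id):
    name = SAMPLE_USERS.get(user_id)
    return f"{name} ({user_id})" if name else f"({user_id})"

def clean_message(text):
    # 1. Rewrite `<@USERID>` mentions as `name (USERID)`
    text = MENTION.sub(lambda m: format_user(m.group(1)), text)
    # 2. Drop code blocks, then trim surrounding whitespace
    text = CODE_BLOCK.sub("", text).strip()
    # 3. Messages that were only code become None
    return text or None

sample = "Thanks <@UFB3DA5BF>, try this:\n" + FENCE + "\nprint('hi')\n" + FENCE
cleaned = clean_message(sample)
only_code = clean_message(FENCE + "x = 1" + FENCE)
print(cleaned)
print(only_code)
```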

In [None]:
formatted_msgs = raw_msgs.extend({
    "text": raw_msgs.col("text").pipe(clean_text),
    "user": raw_msgs.col("user").pipe(format_users)
})
formatted_msgs.preview(5)

In [None]:
thread_ts = formatted_msgs.col("thread_ts")

# split messages into two subgroups: threads and non-threads
non_threads = formatted_msgs.filter(thread_ts.is_null())

# hack in a new, empty string column: `is_new`
non_threads = non_threads.extend({"is_new": non_threads.col("text").substring(0,0)})

@kd.udf("f<N: any>(x: N) -> string")
def format_message(batch: pd.Series):
    def formatter(raw):
        return f"{raw['user']} --> {raw['text']}" # --> {raw['reactions']}"
    return batch.map(formatter)

# prefix message with user
non_threads = non_threads.extend({"text": non_threads.select("user", "text").pipe(format_message)})

non_threads.preview(5)

#### Setup instructions and few-shot learning examples for the LLM

Note: These are identical to the instructions in the `ConversationEnding_v1.ipynb` file.

In [None]:
system = """
You are a helpful assistant. You will be passed an existing conversation and a next
line. Your job is to determine if the next line is part of the existing conversation
or the start of a new conversation. You should respond `yes` if you think the line
is part of a new conversation or `no` otherwise. No explanation is needed.

Lines of the conversation, and the next line, will be passed in plain text, where the
user and their text is separated by an arrow like this: `-->`.

The user field contains a username and a user_id in parentheses, like this:
`name (U1292934)`. The username is lowercase and could match names in the conversation
text in a case-insensitive way.

Inside a conversation, `---` characters on their own line indicate that the next line will
contain the text from the next user in the conversation. Conversations may contain no
lines. When this is the case, the next line should always be a new conversation.

The conversation will be prefixed by `Conversation:` on its own line and the next line
will be prefixed by `Next Line:` on its own line.
"""

user_empty_convo = """
Conversation:


Next Line:
userc (UFB3DA5BF) --> Risk mitigation is indeed essential, especially when we are relying
on real-time data for our inventory tracking and resource allocation. Let's prioritize
this aspect and design our system to be resilient.
"""

assistant_empty_convo = "yes"

user_existing_convo = """
Conversation:
userc (UFB3DA5BF) --> Risk mitigation is indeed essential, especially when we are relying
on real-time data for our inventory tracking and resource allocation. Let's prioritize
this aspect and design our system to be resilient.
---
userf (UEA27BBFF) --> Scaling our system effectively will be crucial for accommodating
future growth. We should also keep an eye on performance metrics and fine-tune
our resource allocation strategies as needed.


Next Line:
userb (UBB9D2B01) --> Sounds good, UserC. A short break will be refreshing. I'll be
back in 15 minutes with more ideas for the next steps.
"""

assistant_existing_convo = "no"

user_new_convo = """
Conversation:
userc (UFB3DA5BF) --> Risk mitigation is indeed essential, especially when we are relying
on real-time data for our inventory tracking and resource allocation. Let's prioritize
this aspect and design our system to be resilient.
---
userb (UBB9D2B01) --> Sounds good, UserC. A short break will be refreshing. I'll be
back in 15 minutes with more ideas for the next steps.


Next Line:
userb (UBB9D2B01) --> Good afternoon, everyone! The topic of multi-cloud strategies
is intriguing. I'm excited to explore how it can help us achieve high availability
for our inventory tracking tool.
"""

assistant_new_convo = 'yes'

ai_prompt_messages = [
    {"role": "system", "content": system},
    {"role": "user", "content": user_empty_convo},
    {"role": "assistant", "content": assistant_empty_convo},
    {"role": "user", "content": user_existing_convo},
    {"role": "assistant", "content": assistant_existing_convo},
    {"role": "user", "content": user_new_convo},
    {"role": "assistant", "content": assistant_new_convo},
]
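
The few-shot examples above all share one layout: a `Conversation:` header, the conversation lines separated by `---`, two blank lines, then a `Next Line:` header. A minimal sketch of a helper that produces this layout for live messages might look like the following; `build_user_text` is a hypothetical name used only for illustration.

```python
def build_user_text(conversation, next_line):
    # Join conversation lines with `---` on its own line.
    # An empty list gives an empty body, which the system prompt
    # tells the model to treat as "next line starts a new conversation".
    body = "\n---\n".join(conversation)
    return f"Conversation:\n{body}\n\n\nNext Line:\n{next_line}"

empty_prompt = build_user_text([], "userc (UFB3DA5BF) --> Hello")
two_line_prompt = build_user_text(
    ["userc (UFB3DA5BF) --> Hello", "userf (UEA27BBFF) --> Hi there"],
    "userb (UBB9D2B01) --> Good afternoon, everyone!",
)
print(two_line_prompt)
```

Keeping the live prompt byte-for-byte consistent with the few-shot layout is the point of the helper: the model has only seen this one format in its examples.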

#### Create a Ray remote Actor to determine if the next message is part of a new conversation

* We will declare a new instance of this class for each Slack channel in our data set
* The class will keep a list of recent messages it believes are part of the current conversation
* For each new message, the class will ask the LLM if the new message is part of the current conversation or a new conversation
  * If part of the current conversation, we add the message to the message list for the next call
  * If not part of the current conversation, we reset the message list to only contain the current message
* We return the result to the calling function as a boolean: `True` if the LLM answers `yes`, and `False` if it answers `no` or if the ChatCompletion API keeps failing and the helper gives up with `timeout`
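
The bookkeeping described in the list above can be tried without Ray or OpenAI. The sketch below is an illustration only: `ConversationTracker` and `stub_classifier` are hypothetical names, and the stub's greeting rule is a placeholder for the LLM call, not a suggestion for how to classify messages.

```python
from typing import Callable, List

class ConversationTracker:
    """Keeps the lines of the current conversation for one channel."""

    def __init__(self, classify: Callable[[List[str], str], str]):
        self.classify = classify          # stands in for the LLM call
        self.messages: List[str] = []     # lines of the current conversation

    def is_new_conversation(self, next_line: str) -> bool:
        answer = self.classify(self.messages, next_line)
        if answer == "yes":
            # New conversation: start over with only this line
            self.messages = [next_line]
            return True
        # Same conversation (or classifier failure): keep accumulating
        self.messages.append(next_line)
        return False

def stub_classifier(messages, next_line):
    # Placeholder rule, NOT the LLM: an empty history or a greeting means "new"
    text = next_line.split(" --> ", 1)[-1].lower()
    return "yes" if not messages or text.startswith("good ") else "no"

tracker = ConversationTracker(stub_classifier)
lines = [
    "userc (UFB3DA5BF) --> Risk mitigation is indeed essential.",
    "userb (UBB9D2B01) --> Sounds good, UserC.",
    "userb (UBB9D2B01) --> Good afternoon, everyone!",
]
flags = [tracker.is_new_conversation(line) for line in lines]
print(flags)
```

Passing the classifier in as a callable keeps the state handling testable on its own, separate from the cost and latency of real API calls.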

In [None]:
import time

@ray.remote
class MessageAnalysis:
    def __init__(self):
        self.messages = []

    def is_new_conversation(self, next_line) -> bool:
        # normalize the reply so that answers like "Yes" or " yes " still match
        result = self._ask_open_ai(next_line).strip().lower()
        if result == "yes":
            self.messages = [next_line]
            return True
        else: # `timeout` or `no`
            self.messages.append(next_line)
            return False

    def _ask_open_ai(self, next_line) -> str:
        user_text = "Conversation:\n" + "\n---\n".join(self.messages) + "\n\n\nNext Line:\n" + next_line

        prompt = ai_prompt_messages.copy()
        prompt.append({"role": "user", "content": user_text})

        attempts = 0
        while True:
            try:
                attempts += 1
                completion = openai.ChatCompletion.create(
                    # model choices: gpt-4, gpt-4-32k, gpt-3.5-turbo, gpt-3.5-turbo-16k
                    model="gpt-3.5-turbo",
                    messages=prompt,
                    temperature=0
                )
                return completion.choices[0].message.content
            except Exception as exp:
                # retry a few times to ride out OpenAI rate limits; give up after 3 attempts
                print(exp)
                if attempts >= 3:
                    return "timeout"
                time.sleep(attempts * 5)
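
The retry loop in `_ask_open_ai` is a retry with linearly growing waits. Read in isolation, the pattern might be sketched as below; `call_with_retries`, `max_attempts`, `base_delay`, and the injectable `sleep` are hypothetical names, and the flaky function only simulates rate-limit errors.

```python
import time

def call_with_retries(call, max_attempts=3, base_delay=5, sleep=time.sleep):
    """Run `call`; on failure wait attempt * base_delay seconds and retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception as exc:
            print(f"attempt {attempt} failed: {exc}")
            if attempt == max_attempts:
                # Out of attempts: report a sentinel instead of raising
                return "timeout"
            sleep(attempt * base_delay)

# Simulated API that fails twice, then answers.
outcomes = iter([RuntimeError("rate limit"), RuntimeError("rate limit"), "no"])

def flaky():
    item = next(outcomes)
    if isinstance(item, Exception):
        raise item
    return item

delays = []  # record the waits instead of actually sleeping
result = call_with_retries(flaky, sleep=delays.append)
print(result, delays)
```

Injecting `sleep` keeps the example instant to run; in the actor itself the real `time.sleep` blocks only that actor, so other channels keep making progress.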

##### Create actors and run them on the messages from Slack

For now, quit after the first 10 messages. (We are just trying to demonstrate this method, no need to spend more money at OpenAI.)

In [None]:
actors = {}
results = []

async for row in non_threads.run_iter(kind="row"):
    channel = row["channel"]
    next_line = row["text"]
    if channel not in actors:
        # create a remote actor
        actors[channel] = MessageAnalysis.remote()
    # create a future to determine if the conversation is new
    results.append(actors[channel].is_new_conversation.remote(next_line))
    if len(results) >= 10:
        break

# await and get results of all the futures
ray.get(results)
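
Once the boolean flags are available, they can be turned into conversation groups. The sketch below assumes the rows were also collected into a list during the loop, which the cell above does not do; `group_conversations`, `rows`, and the sample data are hypothetical.

```python
from collections import defaultdict

def group_conversations(rows, flags):
    """Split each channel's messages into conversations using the new-conversation flags."""
    conversations = defaultdict(list)
    for row, is_new in zip(rows, flags):
        channel_convos = conversations[row["channel"]]
        # Open a new group on a `True` flag, or for the first message of a channel
        if is_new or not channel_convos:
            channel_convos.append([])
        channel_convos[-1].append(row["text"])
    return dict(conversations)

# Made-up rows and flags, in the same order as they would be submitted to the actors.
rows = [
    {"channel": "C1", "text": "a"},
    {"channel": "C1", "text": "b"},
    {"channel": "C2", "text": "x"},
    {"channel": "C1", "text": "c"},
]
flags = [True, False, True, True]
grouped = group_conversations(rows, flags)
print(grouped)
```

A conversation's end is then simply the last message of each group, which answers the question posed in the goal.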