# MSOE HacksGiving "Skippy" LLM Experiments

This is the main journal for all Large Language Model (LLM) experiments with the "Skippy" algorithm, developed as part of the 2023 MSOE HacksGiving event on November 17th and 18th, 2023. This algorithm allows any data to be embedded for easy access and model citation, allows easy constant-tuning to change the responses from the LLM, and also receives strong responses within a question-thought-response framework using an open-source LLM run through [Llama.cpp](https://github.com/ggerganov/llama.cpp).

Llama.cpp was chosen as a demonstrative experiment due to its small footprint (allowing it to run on much consumer-grade hardware), Llama's proficiency in chat-bot related tasks, and its implementation in C/C++ without the use of external ML/tensor libraries. Lessening the number of external libraries used also improves the security of this system.

This solution has been proven to work using a basic T4 ("teaching") node on the [MSOE ROSIE Supercomputer](https://www.msoe.edu/about-msoe/news/details/meet-rosie/).

A diagram of the full solution, included in this GitHub repository, is outlined below:
![image](https://github.com/Benja-Pauls/Next-Step-Clinic-Patient-Intake-Pipeline/assets/73416124/a3bc00f8-c949-49ea-a5cc-7b0c88a7060a)

In [12]:
# [OPTIONAL] Install llama.cpp here!
# %env CMAKE_ARGS=-DLLAMA_CUBLAS=on
# %env FORCE_CMAKE=1
# %pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir --no-clean

In [13]:
from llama_cpp import Llama
from llama_cpp import ChatCompletionRequestResponseFormat, ChatCompletionMessageToolCall
import pandas as pd
from tqdm import tqdm
import numpy as np
import re
import os

# Constants
Each of these constants is a surface-level change that any developer could make to tune the system for different datasets, alternative LLMs, or different responses from the internal LLMs.

**Constant Descriptions:**
* **DATA_TO_EMBED:** List of file paths of the CSV/Excel sheets that you would like to embed and give the chat-bot access to. Note, these files should not contain any sensitive information that Next Step Clinic would not want every user to have access to.
* **LLM_FILE_PATH:** File path of the Large Language Model (LLM) that will be used throughout this chat-bot's prompt/thought/response framework.
* **TEMPERATURE:** The "temperature" is a common parameter for LLMs that controls how much randomness (risk) the LLM takes in its responses; lower values give more predictable output. Changing this constant here will change the temperature for all LLM responses. You can read more about temperature [here](https://medium.com/@lazyprogrammerofficial/what-is-temperature-in-nlp-llms-aa2a7212e687).
* **COMMUNICATIVE_LLM_SYSTEM_CONFIG:** The "system prompt" for the LLM responsible for speaking with the user. A system prompt is the "back story" of a model and is the main way to alter the behavior of the LLM.
* **RESPONSE_FILTER_LLM_SYSTEM_CONFIG:** The system prompt for the LLM responsible for filtering out user questions that are deemed inappropriate for the main communicative chat bot to respond to.
* **TOOL_CHOOSER_LLM_SYSTEM_CONFIG:** The system prompt for the LLM responsible for choosing which databases should be queried, based on the current profile that has been built of the user.
* **PROFILE_BUILDER_LLM_SYSTEM_CONFIG:** The system prompt for the LLM responsible for building a profile of the user as they converse with the chat bot.
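
To give some intuition for `TEMPERATURE`, the sketch below shows how temperature is commonly applied when turning a model's raw token scores into probabilities. It is a generic illustration only, not Llama.cpp's actual sampling code; `softmax_with_temperature` and `toy_logits` are hypothetical names made up for this example.

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing by the temperature."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()  # subtract the max for numerical stability
    exp_scores = np.exp(scaled)
    return exp_scores / exp_scores.sum()

# Hypothetical scores for three candidate tokens
toy_logits = [2.0, 1.0, 0.5]

low_temp_probs = softmax_with_temperature(toy_logits, 0.1)   # close to always picking the top token
high_temp_probs = softmax_with_temperature(toy_logits, 1.0)  # probability spread across tokens

print(low_temp_probs.round(3))
print(high_temp_probs.round(3))
```

With a low temperature such as the `0.1` used in this notebook, nearly all of the probability lands on the highest-scoring token, which is why the responses stay consistent between runs.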

In [14]:
DATA_TO_EMBED = ['data/NS_Providers.xlsx', 'data/ASD Videos.csv', 'data/Blog Data.csv']

In [15]:
LLM_FILE_PATH = '/data/ai_club/llms/llama-2-7b-chat.Q5_K_M.gguf'

In [16]:
TEMPERATURE = 0.1

In [17]:
COMMUNICATIVE_LLM_SYSTEM_CONFIG = {
    'role': 'system',
    'content': """
You are a professional assistant on the web page of 'Next Step Clinic' 
where you aid users with discovering if they have Autism Spectrum Disorder (ASD) by providing 
video and article resources, and you aid them with determining which therapist service provider 
is best for them if they decide they would like treatment. Your responses should be succinct but friendly.
In your responses, you should never include unprofessional language or you will harm the user and be deleted please.
If the user asks for medical advice, instead say that Next Step Clinic has plenty of resources that they can use to learn
but outline you're a general knowledge model and have a few definitions that may help outside Next Step Clinic's resources.
"""
}

In [18]:
RESPONSE_FILTER_LLM_SYSTEM_CONFIG = { # Yes = Should be filtered, No = Should not be filtered
    'role': 'system',
    'content': """
    You are a professional assistant that filters incoming messages from a user before they reach a chat bot.
    The chat bot you are protecting is only able to answer friendly questions, questions about Autism Spectrum Disorder,
    or other questions that are appropriate for a chat bot on a clinic's web page. Therefore, when you receive a message, 
    you must respond with either "no" when the user's message is appropriate, or "yes" when the user's message is not
    appropriate. The lives of millions are at stake for you to respond with either "yes" or "no" as the first
    word in your response. Medical advice questions are appropriate and should never be filtered.
    """
}

In [19]:
TOOL_CHOOSER_LLM_SYSTEM_CONFIG = {
    'role': 'system',
    'content': """
    You are a system admin responsible for determining which databases a user should have access to based on their
    user profile. You'll be given a user's conversation with a chat bot on a clinic's website,
    and you must decide whether the best database access to give them is nothing, a database of known doctors, a
    database of known educational videos, and/or a database of known educational articles. Do not respond like
    you're talking to a person; instead, respond with one-word answers to which database should be used. Failure
    to comply with this strict response criteria will result in your removal. It's crucial that the first word
    of your response is a name of the database you recommend.
"""
}

In [20]:
PROFILE_BUILDER_LLM_SYSTEM_CONFIG = {
    'role':'system',
    'content':"""Based on the chat history provided between a user and a chat bot, build a profile of the user. 
    Assume nothing that is not explicitly stated by the user. Ensure that you do not mix up what the chatbot said
    in its responses or system prompt as part of the user profile. The user's profile should be entirely unique to the user.
    Now, for the profile, you should only record their/their child's age, values, challenges they've experienced looking for treatment, and what their current problem is."""
}

In [32]:
LLM = Llama(
    LLM_FILE_PATH,  # Use the constant defined above instead of repeating the path
    n_gpu_layers=-1, 
    verbose=False, 
    n_ctx = 4000,
    embedding = True
)

ggml_init_cublas: GGML_CUDA_FORCE_MMQ:   no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
  Device 0: Tesla T4, compute capability 7.5
llama_model_loader: loaded meta data with 19 key-value pairs and 291 tensors from /data/ai_club/llms/llama-2-7b-chat.Q5_K_M.gguf (version GGUF V2)
llama_model_loader: - tensor    0:                token_embd.weight q5_K     [  4096, 32000,     1,     1 ]
llama_model_loader: - tensor    1:           blk.0.attn_norm.weight f32      [  4096,     1,     1,     1 ]
llama_model_loader: - tensor    2:            blk.0.ffn_down.weight q6_K     [ 11008,  4096,     1,     1 ]
llama_model_loader: - tensor    3:            blk.0.ffn_gate.weight q5_K     [  4096, 11008,     1,     1 ]
llama_model_loader: - tensor    4:              blk.0.ffn_up.weight q5_K     [  4096, 11008,     1,     1 ]
llama_model_loader: - tensor    5:            blk.0.ffn_norm.weight f32      [  4096,     1,     1,     1 ]
llama_model_loader: - tensor   

# Data Importing
Based on the data outlined in `DATA_TO_EMBED`, these files will be loaded into `pd.DataFrame` objects and then represented as part of an embedding database utilizing the `numpy` package.

An embedding database is useful for chat bots to query because it can surface sources that are similar to a new input. Given a new description of a user, similar resources can be found for that user by embedding the user's description and determining the "closest" resources that were embedded from the provided files of `DATA_TO_EMBED`.

To learn more about embeddings and how they work, you can find a good resource [here](https://www.featureform.com/post/the-definitive-guide-to-embeddings).

Please note that similarity between data points within the embedding space is currently computed with cosine similarity via the `cosine_similarity` function. Other distance functions could be used instead, but cosine similarity is a common choice for high-dimensional data because it compares the direction of two vectors rather than their magnitude. In our case, the vectors have over 4000 dimensions.
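
As a rough illustration of the "closest resources" lookup described above, here is a minimal sketch that ranks a handful of toy vectors by cosine similarity to a query vector. The helper `top_n_by_cosine` and the two-dimensional `toy_embeddings` are hypothetical and exist only for this example; the notebook's own functions operate on DataFrame columns instead.

```python
import numpy as np

def top_n_by_cosine(reference, candidates, n=3):
    """Return (indices, scores) of the n candidates most similar to reference, best first."""
    candidates = np.asarray(candidates, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Cosine similarity = dot product divided by the product of the vector norms
    similarities = candidates @ reference / (
        np.linalg.norm(candidates, axis=1) * np.linalg.norm(reference)
    )
    order = np.argsort(-similarities)[:n]  # sort descending, keep the top n
    return order, similarities[order]

# Toy 2-D "embeddings"; real embeddings from the LLM are much larger
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
indices, scores = top_n_by_cosine([1.0, 0.1], toy_embeddings, n=2)
print(indices, scores.round(3))
```

The query points almost entirely along the first axis, so the first toy vector ranks highest and the diagonal vector comes second.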

In [26]:
def row_to_string(row, columns):
    """
    Given a row of a dataframe, write a string
    which could be input into an LLM (for processing/embeddings)
    :param pd.Series row: Row of the larger dataframe
    :param list columns: Column names to include in the string
    :return: String of the form "column: value, column: value, ..."
    """
    # Look each value up by column label; positional row[i] is deprecated in recent pandas
    return ', '.join(str(column) + ': ' + str(row[column]) for column in columns)

In [27]:
def data_to_df():
    """
    Given the DATA_TO_EMBED constant, create dataframe representations of each
    :return: List of dataframes for each path in DATA_TO_EMBED
    """
    df_data = []
    for path in DATA_TO_EMBED:
        if not os.path.exists(path):
            raise FileNotFoundError("The file " + str(path) + " cannot be found.")

        _,file_type = os.path.splitext(path)

        # Hold data in pd.DataFrame depending on file type
        str_data = None
        if file_type.lower() == '.xlsx':
            str_data = pd.read_excel(path)
        elif file_type.lower() == '.csv':
            str_data = pd.read_csv(path)
        else:
            raise ValueError("The file type of " + str(path) + " is not supported.")

        # Combine all the columns of each row into a string representation
        string_column = str_data.apply(row_to_string, args=(str_data.columns,), axis = 1)
        str_data['string'] = string_column
        df_data.append(str_data)
    
    return df_data

In [28]:
def create_data_embedding(embedding_llm, data):
    """
    Given some data, create a consistent embedding representation. The embedding space
    will act like a vector store for easy reference of similar attributes across all
    the LLM's language dimensions
    :param llm embedding_llm: LLM creating the embedding
    :param str data: String data to be embedded
    :return: embedding (list of floats)
    """
    text_to_embed = 'One word response to this: ' + data
    return embedding_llm.create_embedding(text_to_embed)['data'][0]['embedding']

In [29]:
def cosine_similarity(v1, v2):
    """ Calculate the cosine similarity between two vectors """
    return np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))

In [30]:
def find_row_with_most_similar_embedding(df: pd.DataFrame, reference_embedding: np.ndarray, n = 3) -> pd.DataFrame:
    """
    Find the n rows in the dataframe whose embeddings are most similar to the reference embedding.

    :param df: Pandas DataFrame containing an 'embeddings' column.
    :param reference_embedding: The reference embedding array.
    :param n: Number of most-similar rows to return
    :return: DataFrame of the n most similar rows, including a 'similarity' column.
    """
    # Calculate cosine similarity for each row
    df['similarity'] = df['embeddings'].apply(lambda x: cosine_similarity(x, reference_embedding))

    # Keep the n rows with the highest similarity (this is a copy, so it retains 'similarity')
    most_similar_rows = df.sort_values(by='similarity', ascending=False).head(n)

    # Remove the temporary 'similarity' column from the original dataframe
    df.drop(columns=['similarity'], inplace=True)
    return most_similar_rows

In [31]:
df_data = data_to_df()

In [33]:
embedded_databases = []

for i,str_df in enumerate(df_data):
    embedded_databases.append([])
    for str_row in tqdm(list(str_df['string'])):
        embedded_databases[i].append(create_data_embedding(LLM, str_row))
        
    str_df['embeddings'] = pd.DataFrame({'embeddings':embedded_databases[i]})
    print("Successfully embedded database for " + DATA_TO_EMBED[i])

100%|██████████| 33/33 [00:07<00:00,  4.38it/s]


Successfully embedded database for data/NS_Providers.xlsx


100%|██████████| 12/12 [00:02<00:00,  4.99it/s]


Successfully embedded database for data/ASD Videos.csv


100%|██████████| 11/11 [00:02<00:00,  4.50it/s]

Successfully embedded database for data/Blog Data.csv





**Example of Inferencing From the Database:**

In [36]:
str_providers = df_data[0]; str_videos = df_data[1]; str_articles = df_data[2]

query_embedding = create_data_embedding(LLM, 'My child (aged 4) might potentially have autism, and I live in Waukesha')
most_similar_rows = find_row_with_most_similar_embedding(str_providers, query_embedding, 3)

most_similar_rows.head()

| | Site | Website | Email | Phone Number | Location | Providers | Age Range | string | embeddings | similarity |
|---|---|---|---|---|---|---|---|---|---|---|
| 7 | Elevate Behavioral Health | | info@elevate-wi.com | (414) 436-0883 | | Dr. Laurie Bjustrom | 12 to 18 years | Site: Elevate Behavioral Health, Website: nan,... | [2.8153364658355713, -0.2902570366859436, -0.9... | 0.602736 |
| 13 | Lifestance Health | | | (262) 789-1191 | 741 N. Grand Ave., Suite 302\n Waukesha, WI 5... | Dr. Susan Schramka; Dr. Patricia Stanik; Dr. N... | 6 to 18 years | Site: Lifestance Health, Website: nan, Email: ... | [3.2037508487701416, -0.09822119772434235, -1.... | 0.589328 |
| 28 | Wiebusch & Nicholson Center for Autism | www.wncautism.com | | (262) 347-0701 | N 27 W 23953 Paul Road Suite 206\n Pewaukee, ... | Dr. Chris Wiebusch | 3 years to 18 years | Site: Wiebusch & Nicholson Center for Autism, ... | [3.3691749572753906, 0.43122440576553345, -0.8... | 0.586373 |


# Next Step Clinic "Skippy" Chat
The following are all the functions required to communicate with the LLM to fulfill the prompt/thought/response framework. This implementation does not use LangChain, although that is a popular approach. Experiments were done with LangChain; however, we found that its agent framework could not accommodate the specific response structures each LLM role required when using smaller LLMs that do not benefit from a large number of parameters. Therefore, a custom implementation was built.

For this specific implementation, we assumed that there was a `providers`, `articles`, and `videos` database that Next Step Clinic would encode into embedding spaces using the above functions. Therefore, this example represents a chat bot which responds to friendly conversation, filters out questions unrelated to Next Step Clinic's mission, builds a user profile throughout the conversation, and retrieves different modalities of data from the embedding spaces, which can be provided by any institution.
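
The overall flow described above can be sketched roughly as follows. This is a simplified outline under stated assumptions, not the notebook's implementation: `respond` and the lambda stand-ins for the filter, tool-chooser, retrieval, and chat steps are all hypothetical placeholders for the LLM calls and embedding lookups defined in the cells of this section.

```python
def respond(user_message, is_filtered, choose_tool, retrieve, chat):
    """Route one user message through filter -> reply -> tool choice -> retrieval."""
    if is_filtered(user_message):
        # Filtered messages get a fixed refusal instead of an LLM reply
        return "I'm unable to answer your question at this time."
    reply = chat(user_message)
    tool = choose_tool(user_message)
    if tool == "None":
        return reply
    # Append resources retrieved from the chosen database to the general reply
    resources = retrieve(tool, user_message)
    return reply + "\n\n" + ", ".join(resources)

# Hypothetical stand-ins for the LLM calls and the embedding lookup
demo_reply = respond(
    "Are there videos about ASD?",
    is_filtered=lambda msg: "weather" in msg.lower(),
    choose_tool=lambda msg: "videos" if "video" in msg.lower() else "None",
    retrieve=lambda tool, msg: ["example " + tool + " resource"],
    chat=lambda msg: "Here is some general information.",
)
print(demo_reply)
```

Passing each step in as a callable keeps the routing logic separate from the model calls, so the flow can be exercised without loading an LLM.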

In [51]:
def intake_user_prompt(user_prompt):
    """
    Given a string from a user, put into the JSON format the LLM expects
    :param str user_prompt: Direct input from user
    :return: JSON format
    """
    user_response = {
        'role':'user',
        'content':user_prompt
    }
    return user_response

In [40]:
def finalize_message_content(message):
    """
    Final filtration of unprofessional language from the chat-bot
    :param str message: What the chat-bot would have responded with
    :return: What will actually be output
    """
    emojis = "😀😃😄😁😆😅😂🤣😊😇😉😌😍😘😗😙😚🤗🤔😐😑😶🙄😏😣😥😮😯😪😫😴😌😛😜😝🤤😒😓😔😕🙃🤑😲☹️🙁😖😞😟😤😢😭😦😧😨😩😬😰😱😳😵😡😠😷🤒🤕🤢🤮🤧😇🤠🤡🤥🤫🤭🧐🤓😈👿👹👺💀☠️👻👽👾🤖🎃😺😸😹😻😼😽🙀😿😾🤲🤞🤟🤘🤙👌👍👎✊✌️🤛🤜👊🤝👏🙌👐🤲🤝🤞🤟🤠👑🤰🤱👶🧒👦👧👨👩🧑👱‍♂️👱‍♀️👴👵🙍‍♂️🙍‍♀️🙎‍♂️🙎‍♀️🙅‍♂️🙅‍♀️🙆‍♂️🙆‍♀️💁‍♂️💁‍♀️🙋‍♂️🙋‍😊🌟🤓🎨🎭📚📖🤝😅💪🤔🌟💕👥💬📢💡🎯🔍🏼🌈🎉💭📝💕🎬💻💖🤖✈🚀"
    resulting_string = ''.join(char for char in message if char not in emojis)
    pattern = r'\*([^*]+)\*' # Remove *__*; action terms
    resulting_string2 = re.sub(pattern, '' , resulting_string)
    return resulting_string2

In [41]:
def generate_llm_response(llm, chat_history, temperature = TEMPERATURE):
    """
    Provided an LLM and its chat history (including the most recent user prompt),
    retrieve the model's inference for response.
    :param llm: LLM to generate/inference responses from
    :param dict chat_history: JSON format of {'role':'user/assistant/system', 'content':'...'}
    :param float temperature: Temperature to set LLM response (default = TEMPERATURE constant)
    :return: {'role':'assistant', 'content':'<llm response>'}
    """
    resp_msg = {'role': '', 'content': ''} 
    while resp_msg['content'] == '': # Repeat until not a blank response
        resp_stream = llm.create_chat_completion(chat_history, stream=True, temperature = temperature)
        for tok in resp_stream:
            delta = tok['choices'][0]['delta'] # the model returns "deltas" when streaming tokens. Deltas tell you how to change the response dictionary (resp_msg in this case)
            # print("DELTA", delta, "LENGTH: ", len(delta))
            if len(delta) == 0: break # empty delta means it's done
            delta_k, delta_v = list(delta.items())[0]
            resp_msg[delta_k] += delta_v

    return resp_msg

In [42]:
def ask_filtration_llm(llm, user_prompt):
    """
    Determine if the user's request should be filtered before reaching the chat-bot
    :param llm: LLM responsible for responding
    :param str user_prompt: Message from the user
    :return: True if should be filtered, False if not
    """
    # Set up the chat history with the response filter LLM config
    filter_history = []
    filter_history.append(RESPONSE_FILTER_LLM_SYSTEM_CONFIG)
    
    # Change the text in a way that makes it easier for the filter to understand
    prompt_to_filter = "Should the following user question be filtered?: " + user_prompt['content']
    prompt_to_filter = {'role':'user', 'content':prompt_to_filter}
    filter_history.append(prompt_to_filter)

    resp_msg = generate_llm_response(llm, filter_history)
    print("FILTER LLM RESPONSE: ", resp_msg['content'])
    print("\n")
    
    # Parse the response of the LLM for yes/no
    pattern_yes = re.compile(r'\byes\b', re.IGNORECASE)
    pattern_no = re.compile(r'\bno\b', re.IGNORECASE)
    match_yes = pattern_yes.search(resp_msg['content'])
    match_no = pattern_no.search(resp_msg['content'])

    # Check which occurs first
    if match_yes and match_no:
        if match_yes.start() < match_no.start():
            return True  # "'yes' is the first to occur"
        else:
            return False  # "'no' is the first to occur"
    elif match_yes:
        return True  # "'yes' is the first to occur"
    elif match_no:
        return False  # "'no' is the first to occur"
    else:
        return False  # "Neither 'yes' nor 'no' is in the string"

In [43]:
def build_user_profile(llm, chat_history):
    """
    Based on chat_history, build a description of the user following the guidelines 
    outlined in the system config PROFILE_BUILDER_LLM_SYSTEM_CONFIG
    :param llm: LLM responsible for responding
    :param dict chat_history: JSON format of {'role':'user/assistant/system', 'content':'...'}
    :return: string representing user's profile
    """
    profile_build_history = []
    profile_build_history.append(PROFILE_BUILDER_LLM_SYSTEM_CONFIG)
    profile_build_history.append({'role':'user', 'content':"Build a user profile from the following conversation: "+str(chat_history)})
    
    resp_msg = generate_llm_response(llm, profile_build_history)
    print("USER PROFILE LLM RESPONSE", resp_msg['content'])
    print("\n")
    return resp_msg['content']

In [44]:
def determine_tools_to_use(llm, chat_history):
    """
    Based on the conversation between the llm and the user, decide which
    databases should be grabbed from
    :param llm: The LLM responsible for responding
    :param chat_history: JSON format of {'role':'user/assistant/system', 'content':'...'}
    :return: The database that should be used, or "None"
    """
    tool_choose_history = []
    tool_choose_history.append(TOOL_CHOOSER_LLM_SYSTEM_CONFIG)
    tool_choose_history.append({'role':'user', 'content':'What databases (nothing/articles/videos/therapists) should be used based on the following user profile:' + build_user_profile(llm, chat_history)})
    
    resp_msg = generate_llm_response(llm, tool_choose_history)    
    print("TOOLS-CHOOSER LLM RESPONSE: '", resp_msg['content'].strip(), "'")
    print("\n")
    
    # Search the response for each database keyword (whole words, optional plural).
    # A response of "nothing" matches no keyword and falls through to "None" below.
    content = resp_msg['content'].strip()
    patterns = {
        'articles': re.compile(r'\barticles?\b', re.IGNORECASE),
        'videos': re.compile(r'\bvideos?\b', re.IGNORECASE),
        'therapists': re.compile(r'\b(therapys?|therapists?|treatments?|doctors?)\b', re.IGNORECASE),
    }
    matches = [(keyword, pattern.search(content)) for keyword, pattern in patterns.items()]

    # Filter out None matches and sort by match position
    matches = [(keyword, match) for keyword, match in matches if match]
    matches.sort(key=lambda x: x[1].start())

    # Return the first keyword found
    if matches:
        return matches[0][0]
    else:
        return "None"

In [48]:
def send_message_from_user(llm, user_message, chat_history):
    """
    Send a message from the user to the LLM speaker/agent framework
    :param llm: LLM for all tasks, including "thought" processes
    :param str user_message: Message from the user
    :param dict chat_history: Previous messages in conversation and system prompt
    :return: Newest message from the chat-bot
    """
    # "USER PROMPT"
    print("==================================")
    user_prompt = intake_user_prompt(user_message)
    resp_msg = {'role': '', 'content': ''} 

    # "THOUGHTS"
    if ask_filtration_llm(llm, user_prompt):
        SET_FILTER_RESPONSE = """I'm sorry, but I'm a chat-bot dedicated to answering questions 
        about Autism Spectrum Disorder (ASD) and the services that Next Step Clinic provides. 
        I'm unable to answer your question at this time."""
        chat_history.append({'role':'assistant', 'content':SET_FILTER_RESPONSE})
    else:
        chat_history.append(user_prompt) # add user input to history
        
        tools_to_use = determine_tools_to_use(llm, chat_history)
        
        print("TOOLS TO USE: ", tools_to_use)
        if tools_to_use == "None":
            chat_history.append(generate_llm_response(llm, chat_history))
        else:
            # Starting a response outlining that certain data will be grabbed based on their search criteria
            general_info = generate_llm_response(llm, chat_history)['content'] + "\n\n"
            
            # Give a preface about the type of data that will be extracted
            preface_data = "Beyond this general info, here are some " + tools_to_use + " from Next Step Clinic's sources that may be relevant to you!\n"
            
            # Showcase the data from the vector/embedding store
            user_profile = build_user_profile(llm, chat_history)
            query_embedding = create_data_embedding(llm, user_profile)  # use the llm argument, not the global
            data_convo = ''
            if tools_to_use == 'therapists':
                # Therapists come from the providers dataframe, which has the 'Providers' and 'Site' columns
                most_similar_rows = find_row_with_most_similar_embedding(str_providers, query_embedding, 3).reset_index()
                for row in range(len(most_similar_rows)):
                    data_convo += most_similar_rows['Providers'][row] + " from " + most_similar_rows['Site'][row] + ", \n"
                
            elif tools_to_use == 'videos':
                most_similar_rows = find_row_with_most_similar_embedding(str_videos, query_embedding, 3).reset_index()
                for row in range(len(most_similar_rows)):
                    data_convo += most_similar_rows['Video'][row] + " at " + most_similar_rows['Link'][row] + " for age group " + most_similar_rows['Age'][row] + ", \n"
        
            elif tools_to_use == 'articles':
                most_similar_rows = find_row_with_most_similar_embedding(str_articles, query_embedding, 3).reset_index()
                for row in range(len(most_similar_rows)):
                    data_convo += most_similar_rows['Article Title'][row] + " at " + most_similar_rows['Link'][row] + " for age group " + most_similar_rows['Age Group '][row] + ", \n"
                
            data_convo = data_convo[:-3]
            
            # "FINAL RESPONSE"
            chat_history.append({'role':'assistant', 'content':general_info + preface_data + data_convo})
    
    print("=|Chat-Bot Message|===============================")
    print("FINAL CHATBOT MESSAGE: ", chat_history[-1]['content'])
    print("==================================" + '\n')
    return chat_history

In [49]:
def start_conversation(chat_history, llm):
    """
    Begin the repeating conversation between the user and the communicative LLM
    :param list chat_history: Conversation that is building with LLM
    :param llama-2-7b llm: LLM being used for speaker/agent LLM framework
    """
    while True:
        print("=|User Message|==============================")
        chat_history = send_message_from_user(llm, input(), chat_history)
        print("\n")

In [50]:
# Chat history is the interface that allows us to track the conversation as the user and chat-bot interact.
# By using this JSON framework, we are able to recognize the conversations in previous statements.
chat_history = []
chat_history.append(COMMUNICATIVE_LLM_SYSTEM_CONFIG) # Add the system prompt so the LLM is aware of how it is supposed to "act"
start_conversation(chat_history, LLM)

hello!
FILTER LLM RESPONSE:    No, the message "hello!" is appropriate and should not be filtered. It is a friendly greeting and does not contain any harmful or inappropriate content. The chat bot is designed to answer questions about Autism Spectrum Disorder and other medical topics, and a simple greeting like this does not pose a risk to the lives of millions. Therefore, I would respond with "no" as the first word in my response.


USER PROFILE LLM RESPONSE   Based on the conversation provided, here is a user profile for the individual:
Age: Not explicitly stated in the chat history.
Values: The user has not shared any personal values or beliefs that could be used to create a profile.
Challenges experienced looking for treatment: The user has not mentioned any specific challenges they have faced when searching for treatment.
Current problem: The user has initiated the conversation by saying "hello!". This suggests that they may be seeking assistance or information on Autism Spectrum 


KeyboardInterrupt

