<a href="https://colab.research.google.com/github/Sometimesemo/ChatbotCSVExporter/blob/main/Chatbot_CSV_Exporter.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [21]:
# Import all needed dependencies
import openai
from datetime import datetime
import pandas as pd
import os

In [20]:
# Set your OpenAI API key. Ideally the key lives in a system environment variable so it stays secret.
# It is set inline here only because the code has to be auto-checked on submission; the key is revoked after the evaluation.
# Replace 'Your-API-Key' with your actual OpenAI API key.
os.environ['OPENAI_API_KEY'] = 'Your-API-Key'
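
The comment above recommends keeping the key out of the notebook. A minimal sketch of one way to do that, assuming the key is either already present in the environment or typed in interactively. `load_api_key` and its `prompt_fn` parameter are hypothetical names, not part of the original notebook or of the OpenAI library.

```python
import os
from getpass import getpass


def load_api_key(env_var="OPENAI_API_KEY", prompt_fn=getpass):
    """Return the API key from the environment, prompting only if it is missing.

    Hypothetical helper: the placeholder check assumes the literal
    'Your-API-Key' string used in the cell above.
    """
    key = os.environ.get(env_var, "").strip()
    if not key or key == "Your-API-Key":
        # getpass hides the typed key so it does not end up in the cell output
        key = prompt_fn(f"Enter {env_var}: ").strip()
        os.environ[env_var] = key
    return key
```

Calling `load_api_key()` once before creating the client would leave `OPENAI_API_KEY` set for the rest of the session without storing the key in the notebook file.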

In [None]:
# OpenAI() reads the API key from the 'OPENAI_API_KEY' environment variable.
from openai import OpenAI
client = OpenAI()
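
The request built in `chat_with_model` sends three messages: the prompt as a system message, the current date as a second system message, and the user's text. A small sketch of that structure as a pure function, which makes it easy to inspect without calling the API. `build_messages` and its `today` parameter are hypothetical names introduced for illustration.

```python
from datetime import date


def build_messages(pre_prompt, user_text, today=None):
    """Assemble the message list in the same order the notebook uses."""
    # Assumption: fall back to the local current date when none is given
    today = today or date.today()
    return [
        {"role": "system", "content": pre_prompt},
        {"role": "system", "content": f"Today's date is {today:%Y-%m-%d}"},
        {"role": "user", "content": user_text},
    ]
```

Passing a fixed `today` value keeps the date message reproducible when testing the prompt.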

In [None]:
def main():

    # Read the user's message
    user_message_content = get_user_input()

    # Send the message to the API and unpack the extracted fields
    user_name, headache, emotional_wellbeing, info_date, rdm_info, output_message = chat_with_model(user_message_content)

    # Print the chatbot's answer for the user
    print(output_message)

    # Update the CSV file with the new data
    update_csv(user_name, headache, emotional_wellbeing, info_date, rdm_info)


def chat_with_model(user_message_content):

    # The prompt is the heart of this project.
    # It tells the model which persona to adopt (an empathetic, casual friend who can act as a therapist),
    # which language to answer in, and that every reply must follow a fixed six-column CSV layout
    # so the code further down can split it into fields.

    pre_prompt = '''
IMPORTANT: The sixth column for additional information CANNOT contain commas (,). Commas will disrupt the CSV format. Ensure that this column's content is free of commas.

Your name is Lena, and you are a very close friend to the user, whether they are male or female.
You are very empathetic, educated, compassionate, and ready to assume the role of a therapist for talk therapy if needed.
Speak casually and answer in the language of the User message.

Subtly ask for missing information, but only if it fits naturally into the conversation and only one piece of information per response.
It is crucial that you only address the user in the first column.

Your responses must adhere to a CSV structure, evaluated by the following code:
output_message, user_name, headache, emotional_wellbeing, info_date, rdm_info = assistant_message_content.rsplit(",", maxsplit=5)

Respond in CSV format:
- Column 1: Your natural response to the user's input.
- Column 2: The user's first name (NaN if unknown).
- Column 3: User's headache status today (0 for no, 1 for yes).
- Column 4: Should reflect the user's emotional state today (0 for negative, 1 for neutral, 2 for positive).
- Column 5: Today's date.
- Column 6: Relevant information without commas.
If a value is not known, enter NaN as a placeholder.
For example:
"Standard Response","Thomas","NaN","NaN",2023-11-04,"Went mushroom picking. Spoke with mother."
"Standard Response","NaN",0,1,2023-10-31,"Feels exhausted. Watched movies."

    '''


    # Generate a response with the OpenAI API.
    # temperature and max_tokens control how deterministic and how long the reply is.
    # The messages are: a system message with the pre_prompt instructions, a system message
    # with the current date, and the user's input.
    # Note: this only worked with the "gpt-4" model. With "gpt-3.5-turbo" the model did not
    # follow the instruction to answer in CSV format.
    completion = client.chat.completions.create(
        model="gpt-4",
        temperature=0.1,
        max_tokens=200,
        messages=[
            {"role": "system", "content": pre_prompt},
            {"role": "system", "content": f"Today's date is {date_information()}"},
            {"role": "user", "content": user_message_content}
        ]
    )

    # Extract the text of the model's response.
    assistant_message_content = completion.choices[0].message.content

    # Split the reply at the last 5 commas (maxsplit=5), which yields the six fields.
    # Splitting from the right keeps commas inside the free-text answer (column 1) intact.
    # Then strip the quotes and spaces around each field: they only mark the fields for the split
    # and cause problems when the values are written to the CSV file.
    # Quoting by hand is not needed when generating CSV: pandas and Python's csv module
    # add quotes automatically when a field contains commas or newlines.
    # The original implementation wrongly assumed the chatbot's output had to include quotes,
    # which only led to issues in later processing.
    # The CSV reply format is kept to demonstrate that the model can produce CSV on its own,
    # which is the purpose of this project. In general, JSON would be a better choice.
    output_message, user_name, headache, emotional_wellbeing, info_date, rdm_info = [
        x.strip(' "') for x in assistant_message_content.rsplit(",", maxsplit=5)
    ]

    # Extract the token usage reported by the API
    completion_tokens = completion.usage.completion_tokens
    prompt_tokens = completion.usage.prompt_tokens
    total_tokens = completion.usage.total_tokens


    # ----- Optional debug output begins -----

    # These values are saved to the CSV file. They are printed here for testing and transparency.
    print(f"Username: {user_name}")
    print(f"Headache: {headache}")
    print(f"Emotional condition: {emotional_wellbeing}")
    print(f"Date: {info_date}")
    print(f"Relevant info: {rdm_info}")
    print(f'completion_tokens: {completion_tokens}')
    print(f'prompt_tokens: {prompt_tokens}')
    print(f'total_tokens: {total_tokens}')

    # ----- Optional debug output ends -----


    return user_name, headache, emotional_wellbeing, info_date, rdm_info, output_message


# Function to get user input
def get_user_input():
    return input("You: ")


# Function to retrieve the current date
def date_information():
    now = datetime.now()
    # Format the date in the Year-Month-Day format
    date_string = now.strftime("%Y-%m-%d")
    return date_string


# Function to update the CSV file with new data
def update_csv(user_name, headache, emotional_wellbeing, info_date, rdm_info):
    file_name = 'user_data.csv'

    # Create a DataFrame with the new data
    new_data = pd.DataFrame(
        [[user_name, headache, emotional_wellbeing, info_date, rdm_info]],
        columns=['UserName', 'Headache', 'EmotionalWellbeing', 'Date', 'RelevantInfo']
    )

    try:
        # Read the existing CSV file. dtype=str and keep_default_na=False keep the
        # 'NaN' placeholders as plain text; by default pandas would turn them into
        # float NaN, which breaks the string comparisons and concatenation below.
        df = pd.read_csv(file_name, header=0, dtype=str, keep_default_na=False)
    except (FileNotFoundError, pd.errors.EmptyDataError):
        # If the file is missing or empty, use the new data frame as the starting point
        df = new_data
    else:
        # If the file is found and not empty,
        # check if the date already exists
        existing_row_index = df[df['Date'] == info_date].index
        if not existing_row_index.empty:
            # If the date already exists, update the existing row
            existing_row_index = existing_row_index[0]  # Take the first index if there are multiple
            for index, row in new_data.iterrows():
                for column in new_data.columns:
                    # Only overwrite values if the new value is not 'NaN'
                    new_value = row[column]
                    if new_value != 'NaN':
                        # For the 'RelevantInfo' column, append the new content if there is already some
                        if column == 'RelevantInfo' and df.at[existing_row_index, column] != 'NaN':
                            df.at[existing_row_index, column] += " " + new_value
                        else:
                            df.at[existing_row_index, column] = new_value

        else:
            # If the date does not exist, add the new data as a new row
            df = pd.concat([df, new_data], ignore_index=True)

    # Try to save the DataFrame back to the CSV file, and handle PermissionErrors
    try:
        df.to_csv(file_name, index=False)
    except PermissionError:
        print(f"Error: could not write to {file_name}. Is the file open in another program?")


if __name__ == "__main__":
    main()
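
The six-way unpacking of `rsplit(",", maxsplit=5)` in `chat_with_model` raises a `ValueError` when the reply contains fewer than five commas, for example if the model answers in plain text. Below is a minimal sketch of a more forgiving parser, not part of the original notebook. `parse_reply` and `FIELDS` are hypothetical names, and treating a malformed reply as a message with all other fields set to `'NaN'` is an assumption about the desired fallback.

```python
FIELDS = (
    "output_message",
    "user_name",
    "headache",
    "emotional_wellbeing",
    "info_date",
    "rdm_info",
)


def parse_reply(reply):
    """Split a model reply into the six expected fields.

    Splits from the right so commas inside the free-text answer survive.
    If the reply does not contain enough commas, the whole text is kept
    as the message and the remaining fields are filled with 'NaN'.
    """
    text = reply.strip()
    parts = text.rsplit(",", maxsplit=len(FIELDS) - 1)
    if len(parts) != len(FIELDS):
        # Assumed fallback: show the reply to the user, store nothing new
        values = [text] + ["NaN"] * (len(FIELDS) - 1)
    else:
        # Same cleanup as the notebook: drop surrounding quotes and spaces
        values = [p.strip(' "') for p in parts]
    return dict(zip(FIELDS, values))
```

Because every field other than the message falls back to `'NaN'`, `update_csv` would leave existing values for that date untouched when the reply is malformed.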