<a href="https://colab.research.google.com/github/UrologyUnbound/SIOP_ML_2024_Discord/blob/main/colabs/Interview.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Generating Interview Responses Notebook
This notebook is designed to tackle the challenge of generating plausible responses to interview questions.  

## Challenge Description
Job candidates have responded to 5 common interview questions. We are given the text of 4 question and response pairs. Our task is to generate a likely text response for the 5th question based on the previous responses.  


In [None]:
!pip install pandas langchain langchain_openai

In [2]:
import pandas as pd
import os
from langchain import FewShotPromptTemplate, PromptTemplate
from langchain_openai import ChatOpenAI
from google.colab import userdata

In [3]:
interview_train_data = pd.read_csv("https://raw.githubusercontent.com/UrologyUnbound/SIOP_ML_2024_Discord/main/data/train/interview_train.csv")
interview_dev_data = pd.read_csv("https://raw.githubusercontent.com/UrologyUnbound/SIOP_ML_2024_Discord/main/data/dev/interview_val_public.csv")

In [4]:
# Manually add env key to the `api_key` argument
llm = ChatOpenAI(api_key= userdata.get('OPENAI_API_KEY'), temperature=0.3)

In [5]:
instructions = "Your task is to analyze a dataset of interview questions and responses. Based on the content, tone, and details of the provided answers, your task is to generate a plausible answer for the fourth interview question."

In [6]:
def extract_questions_answers(text):
    # Split the text into parts based on "Question:" and "Response:" delimiters
    parts = text.split("Question:")[1:]
    qas = []

    last_question = ""

    for part in parts:
        q_a = part.split("Response:")
        question = q_a[0].strip()
        answer = q_a[1].strip() if len(q_a) > 1 else ""
        qas.append({"question": question, "answer": answer})
        last_question = question

    # return qas minus the last question, and the last question separately
    return qas[:-1], last_question

def create_examples(dataset_row):
    # Extract questions and answers from each row of the dataset
    # examples = extract_questions_answers(dataset_row)

    examples, _ = extract_questions_answers(dataset_row)

    return examples


def create_example_prompt():
    # Create a formatter for the examples
    example_prompt = PromptTemplate(
        input_variables=["question", "answer"],
        template="Question: {question}\n{answer}"
    )

    return example_prompt


def create_template(dataset_row):
    # Generate a few shot prompt template
    examples, last_question = extract_questions_answers(dataset_row)

    template = FewShotPromptTemplate(
        # examples=create_examples(dataset_row),
        examples=examples,
        example_prompt=create_example_prompt(),
        suffix="Question: {input}",
        input_variables=["input"],
    )

    return template

In [7]:
# Create templates and fetch the last question for each row
# prompt_templates = []
formatted_prompts = []

for i in range(len(interview_train_data)):
    examples, last_question = extract_questions_answers(interview_train_data.loc[i, "questions_answers"])
    prompt_template = create_template(interview_train_data.loc[i, "questions_answers"])
    # prompt_templates.append(prompt_template)

    formatted_prompt = prompt_template.format(input=last_question)
    formatted_prompts.append(formatted_prompt)

In [8]:
# Output the response
llm.invoke(formatted_prompts[0])

AIMessage(content='When motivating my team, I like to focus on setting clear goals and expectations, providing regular feedback and recognition, and fostering a positive and collaborative work environment. I believe in empowering team members to take ownership of their work and encouraging open communication.\n\nOne instance where this approach was particularly effective was when my team was working on a tight deadline for a project. I made sure to communicate the importance of the project, set clear milestones and deadlines, and provided support and resources to help them succeed. I also recognized and praised their hard work and dedication throughout the process. As a result, my team was motivated to work together efficiently and effectively to meet the deadline successfully.', response_metadata={'token_usage': {'completion_tokens': 131, 'prompt_tokens': 364, 'total_tokens': 495}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_3bc1b5746c', 'finish_reason': 'stop', 'logprobs