# Practical Exam: Automating Customer Support with OpenAI API

You work as an AI Engineer at ChatSolveAI, a company that provides automated customer support solutions. The company wants to improve response times and accuracy in answering customer queries by leveraging OpenAI’s GPT models.

Your task is to build a chatbot that classifies customer queries, retrieves relevant responses, and logs interactions in a structured way. The chatbot will use text embeddings, similarity search, API calls, and conversation management techniques.


**Please note:** 

1. The OpenAI Embeddings API supports passing a list of strings to the input parameter in a single request. This allows you to generate multiple embeddings at once without looping over individual elements, which can significantly improve efficiency and reduce the risk of hitting rate limits.

2. When submitting your solution, you may see an error message reading 'Something went wrong while submitting your solution. Please try again.' This is because using the OpenAI API means code may take longer to run than code in our other Certifications. Please ignore this message while your code is still running.
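
A minimal sketch of the batching described in note 1, for the case where the list of texts is too long to send comfortably in one request. The helper names `chunked` and `embed_in_batches` and the batch size of 100 are illustrative assumptions, not part of the exam; `client` is assumed to be an `OpenAI()` client, and `client.embeddings.create(input=..., model=...)` is the same call the solution cells use.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def embed_in_batches(client, texts: Sequence[str],
                     model: str = "text-embedding-3-small",
                     batch_size: int = 100) -> List[List[float]]:
    """Embed `texts` with one API request per batch instead of one per string."""
    vectors: List[List[float]] = []
    for batch in chunked(texts, batch_size):
        # A list of strings is passed as `input`, so each request embeds a whole batch.
        response = client.embeddings.create(input=batch, model=model)
        vectors.extend(item.embedding for item in response.data)
    return vectors
```

With 501 documents and a hypothetical `batch_size` of 100, this would issue six requests rather than 501.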

In [43]:
# Run this cell before running your solution

# Import necessary modules
import os
from openai import OpenAI
import pandas as pd
import json
import time
import datetime
import numpy as np

# Define the model to use
model = "gpt-3.5-turbo"

# Define the client
client = OpenAI()

# Task 1

ChatSolveAI has provided a knowledge base (`knowledge_base.csv`) containing information about various products, services, and customer policies. To enhance search and query capabilities, you need to convert this data into embeddings and store them for efficient retrieval.

- Load the dataset (`knowledge_base.csv`).
- Generate text embeddings using OpenAI’s embedding model (`text-embedding-3-small`). Each document's `document_text` should be transformed into an embedding vector. 
- Store the generated embeddings in `knowledge_embeddings.json`, using the format shown below.
- Store the embedded data and associated metadata for retrieval.  

### Format to store generated embeddings:
```json
[
    {
       "document_id": 1,
       "document_text": "Example document text.",
       "embedding_vector": [0.123, 0.456, ...],
       "metadata": "Additional document info"
    }
]
```

### Data description: 

| Column Name       | Criteria                                                |
|-------------------|---------------------------------------------------------|
| document_id       | Integer. Unique identifier for each document. No missing values. |
| document_text     | String. Text content of the knowledge base. Preprocessed and embedded. |
| embedding_vector  | List. Embedding representation of the `document_text`. |
| metadata          | String. Metadata for additional information. |
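
A small sketch of how records matching the table above could be assembled and serialized. It assumes the rows are available as plain dictionaries (for example from `df.to_dict("records")`) and that the vectors are in the same order as the rows; `build_records` is a hypothetical helper name, and the sample row reuses the placeholder values from the format example.

```python
import json
from typing import Any, Dict, List


def build_records(rows: List[Dict[str, Any]],
                  vectors: List[List[float]]) -> List[Dict[str, Any]]:
    """Pair each row with its embedding and coerce fields to the documented types."""
    if len(rows) != len(vectors):
        raise ValueError("rows and vectors must have the same length")
    records = []
    for row, vector in zip(rows, vectors):
        records.append({
            "document_id": int(row["document_id"]),            # Integer
            "document_text": str(row["document_text"]),        # String
            "embedding_vector": [float(x) for x in vector],    # List of floats
            "metadata": str(row["metadata"]),                  # String
        })
    return records


# Placeholder data taken from the format example, with a shortened vector.
rows = [{"document_id": 1,
         "document_text": "Example document text.",
         "metadata": "Additional document info"}]
records = build_records(rows, [[0.123, 0.456]])
payload = json.dumps(records, indent=4)
```

Converting values to built-in `int`, `float`, and `str` up front avoids `json.dumps` failing on NumPy scalar types.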


In [44]:
# 1. Load the dataset
try:
    df = pd.read_csv('knowledge_base.csv')
except FileNotFoundError:
    print("Error: knowledge_base.csv not found.")
    raise

# Extract the list of texts so they can be embedded in one batch (efficiency)
texts_to_embed = df['document_text'].tolist()
embedding_model = "text-embedding-3-small"

print(f"Generating {len(texts_to_embed)} embeddings with {embedding_model} in a single API call...")

# 2. Generate text embeddings with the OpenAI API
try:
    response = client.embeddings.create(
        input=texts_to_embed,
        model=embedding_model
    )
    
    # Extract the embedding vectors (the order matches the input list)
    embedding_vectors = [item.embedding for item in response.data]

except Exception as e:
    print(f"Error while generating embeddings: {e}")
    # Re-raise so that execution stops here if the API call fails.
    raise

# Add the embedding vectors to the DataFrame
df['embedding_vector'] = embedding_vectors

# 3. and 4. Structure and store the generated embeddings
# Build the list of dictionaries in the required format
knowledge_embeddings = []
for index, row in df.iterrows():
    knowledge_embeddings.append({
        "document_id": int(row['document_id']), # Ensure a plain integer
        "document_text": row['document_text'],
        "embedding_vector": row['embedding_vector'],
        "metadata": row['metadata']
    })

# Save the structured data to knowledge_embeddings.json
output_file = 'knowledge_embeddings.json'
with open(output_file, 'w', encoding='utf-8') as f:
    json.dump(knowledge_embeddings, f, indent=4)

print(f"\nSuccess: saved {output_file} with {len(knowledge_embeddings)} documents.")
# Optionally print the first element to confirm the format; the grader checks the file itself.
print("\nStructure of the first element for verification:")
#print(json.dumps(knowledge_embeddings[0], indent=4))

Generating 501 embeddings with text-embedding-3-small in a single API call...

Success: saved knowledge_embeddings.json with 501 documents.

Structure of the first element for verification:


# Task 2

ChatSolveAI receives customer queries that need to be classified and matched with appropriate responses. Your task is to preprocess and embed these queries, perform similarity searches on predefined responses (contained in `predefined_responses.json`), and retrieve the most relevant responses.

- Load the dataset (`processed_queries.csv`).
- Retrieve responses by using cosine similarity to perform a similarity search against predefined responses in `predefined_responses.json`.
- Structure API requests properly and implement error handling, including retry mechanisms to handle rate limits.
- Format model responses as JSON to maintain consistency in output.
- Compute confidence scores for retrieved responses, scaled to 0-1.
- Store the structured responses in a JSON file (`query_responses.json`), suitable for integration with other applications. Your JSON file should be structured as follows:

| Column Name       | Criteria                                                   |
|-------------------|------------------------------------------------------------|
| query_id         | Integer. Unique identifier for each query. No missing values. |
| query_text       | String. Preprocessed query text. |
| top_responses    | List. Top 3 most relevant responses retrieved. |
| confidence_scores | List. Model-based confidence score for the top 3 responses. |
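
The similarity search and the 0-1 confidence requirement can be illustrated with a dependency-free sketch. It assumes one reasonable reading of "scaled to 0-1": cosine similarity lies in [-1, 1], and negative values are clipped to 0. Mapping with `(s + 1) / 2` would be another valid choice. The function names and the toy two-dimensional vectors are illustrative only.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return (a . b) / (||a|| * ||b||), or 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_matches(query_vec: Sequence[float],
                  response_vecs: Sequence[Sequence[float]],
                  responses: Sequence[str],
                  k: int = 3) -> Tuple[List[str], List[float]]:
    """Return the k most similar responses and their scores clipped to [0, 1]."""
    scored = [(cosine_similarity(query_vec, vec), text)
              for vec, text in zip(response_vecs, responses)]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # best match first
    best = scored[:k]
    texts = [text for _, text in best]
    scores = [min(1.0, max(0.0, score)) for score, _ in best]
    return texts, scores


# Toy example: "A" points the same way as the query, "C" points the opposite way.
texts, scores = top_k_matches(
    [1.0, 0.0],
    [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [1.0, 1.0]],
    ["A", "B", "C", "D"],
    k=3,
)
```

Here `texts` comes back as `["A", "D", "B"]`; the opposite-pointing response "C" ranks last and would receive a clipped score of 0.0 if `k` were 4.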

In [45]:
from openai import APIError, RateLimitError  # Exception classes caught by the retry logic

# --- Initialization and constants ---
# Create the client directly so that a configuration error surfaces instead of being silently ignored.
client = OpenAI()

EMBEDDING_MODEL = "text-embedding-3-small"
MAX_RETRIES = 5
RETRY_DELAY = 5  # Seconds
TOP_K = 3  # Retrieve the 3 most relevant responses

# Helper that calls the embeddings API with retry handling (required by the task)
def get_embeddings_with_retry(texts, model=EMBEDDING_MODEL, max_retries=MAX_RETRIES):
    """
    Generate embeddings for a list of texts, retrying on API errors.
    """
    if not texts:
        return []
    
    for attempt in range(max_retries):
        try:
            # Pass a list of strings as input so the whole batch is embedded in one request
            response = client.embeddings.create(input=texts, model=model)
            return [item.embedding for item in response.data]
        except (RateLimitError, APIError) as e:
            if attempt < max_retries - 1:
                print(f"Error ({type(e).__name__}). Retrying in {RETRY_DELAY}s...")
                time.sleep(RETRY_DELAY)
            else:
                print(f"Failed after {max_retries} attempts. Final error: {e}")
                raise

# --- Task 2: main process ---

try:
    # 1. Load the query dataset and the predefined responses
    queries_df = pd.read_csv('processed_queries.csv')
    query_texts = queries_df['query_text'].tolist()
    
    # Load the predefined responses from predefined_responses.json
    with open('predefined_responses.json', 'r', encoding='utf-8') as f:
        response_data_raw = json.load(f)

    # Normalize the file contents, whichever of the supported structures it uses
    response_data = []
    if isinstance(response_data_raw, dict):
        # A dictionary: take its values, assuming they are the response texts
        response_texts = list(response_data_raw.values())
        # Build a consistent working structure: a list of dictionaries
        response_data = [{'response_text': text} for text in response_texts]
    elif isinstance(response_data_raw, list):
        response_data = response_data_raw
        if response_data and isinstance(response_data[0], str):
            # A list of strings
            response_texts = response_data
            response_data = [{'response_text': text} for text in response_texts]
        elif response_data and isinstance(response_data[0], dict) and 'response_text' in response_data[0]:
            # A list of dictionaries with a 'response_text' key
            response_texts = [item['response_text'] for item in response_data]
        else:
            raise ValueError("Unsupported structure in predefined_responses.json.")
    else:
        raise TypeError("predefined_responses.json must contain a JSON array or object.")

    
    print(f"Loaded {len(response_texts)} predefined responses.")

    # 2. Generate embeddings
    print(f"Generating embeddings for {len(query_texts)} queries...")
    query_embeddings = get_embeddings_with_retry(query_texts)
    
    # The predefined responses must be embedded too, so they can be compared with the queries
    print(f"Generating embeddings for {len(response_texts)} responses...")
    response_embeddings = get_embeddings_with_retry(response_texts)
    
    # Attach each embedding to its entry in the response data
    for i, embedding in enumerate(response_embeddings):
        response_data[i]['embedding_vector'] = embedding
        
    kb_vectors = np.array([item['embedding_vector'] for item in response_data])
    kb_texts = [item['response_text'] for item in response_data] # Final list of response texts used in the output
    
    # 3. Search and score (cosine similarity)
    
    query_vectors = np.array(query_embeddings)
    final_results = []
    
    for i, (idx, row) in enumerate(queries_df.iterrows()):
        query_text = row['query_text']
        query_id = int(row['query_id'])
        query_vector_np = query_vectors[i] 

        # Compute the cosine similarity against every response vector
        # similarity = (A . B) / (||A|| * ||B||)
        similarities = np.dot(kb_vectors, query_vector_np) / (
            np.linalg.norm(kb_vectors, axis=1) * np.linalg.norm(query_vector_np)
        )

        # Indices of the TOP_K most relevant responses, best first
        top_indices = np.argsort(similarities)[::-1][:TOP_K]

        # Collect the responses and their confidence scores.
        # Cosine similarity lies in [-1, 1], so clip it to the required 0-1 range.
        confidence_scores = np.clip(similarities[top_indices], 0.0, 1.0).tolist()
        top_responses_list = [kb_texts[j] for j in top_indices]

        # 4. Store the structured result
        final_results.append({
            "query_id": query_id,
            "query_text": query_text,
            "top_responses": top_responses_list,
            "confidence_scores": [float(score) for score in confidence_scores] 
        })

    # 5. Format and save
    output_file = 'query_responses.json'
    with open(output_file, 'w', encoding='utf-8') as f:
        json.dump(final_results, f, indent=4)
        
    print(f"\nSuccessfully processed and matched {len(final_results)} queries.")
    print(f"Results saved to '{output_file}'.")
    # Print the first element for verification
    print(json.dumps(final_results[0], indent=4))

except Exception as e:
    # Catch any error not handled above, report it, and re-raise
    print(f"An unexpected error occurred during Task 2. Review environment setup and file paths. Error: {str(e)}")
    raise

Loaded 19 predefined responses.
Generating embeddings for 501 queries...
Generating embeddings for 19 responses...

Successfully processed and matched 501 queries.
Results saved to 'query_responses.json'.
{
    "query_id": 1,
    "query_text": "How can I contact customer support?",
    "top_responses": [
        "You can contact our customer support via email or live chat on our website.",
        "If you received a damaged product, please contact support with images for a replacement.",
        "Track your order by logging into your account and checking the 'Orders' section."
    ],
    "confidence_scores": [
        0.6672946268827012,
        0.37270208279765044,
        0.330629818694895
    ]
}
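
The retry helper in the cell above waits a fixed `RETRY_DELAY` between attempts. A common alternative for rate limits is exponential backoff with jitter. The sketch below is one way this could look, not the approach the exam requires; `call_with_backoff` and its parameters are hypothetical names, and the `sleep` argument exists only so the waiting can be replaced in tests.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(fn: Callable[[], T],
                      retry_on: Tuple[Type[BaseException], ...],
                      max_retries: int = 5,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `fn`, retrying on the given exceptions with exponentially growing waits."""
    for attempt in range(max_retries):
        try:
            return fn()
        except retry_on:
            if attempt == max_retries - 1:
                raise  # out of attempts: surface the last error
            # Wait base_delay * 2**attempt, plus up to 10% random jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, 0.1 * delay))
    raise RuntimeError("max_retries must be at least 1")
```

In the notebook this could wrap the embeddings request, for example `call_with_backoff(lambda: client.embeddings.create(input=texts, model=EMBEDDING_MODEL), retry_on=(RateLimitError,))`.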


# Task 3

To provide seamless customer service, ChatSolveAI wants to develop a chatbot that can respond to customer queries efficiently by searching for relevant responses and generating new ones when necessary.

- Develop a chatbot that:
    - Accepts customer queries via text input.
    - Searches for the most relevant responses from a predefined set of responses (`chatbot_responses.json`).
    - Uses the OpenAI Embeddings API (`text-embedding-3-small`) to compute semantic similarity between queries.
    - If no relevant response is found from the predefined set, generates a new response using GPT-3.5-turbo.
- Store conversation history, including:
    - Query text
    - Retrieved response
    - Timestamp of the interaction
    - Confidence score of the response
- Include one open-ended query not in the predefined responses (e.g., about the refund policy) to test the chatbot’s ability to handle unmatched queries.
- Include one paraphrased query about support hours (e.g., “When can I talk to someone from support?”) to test semantic similarity matching.
- Store structured chatbot responses in a JSON file (`sample_chatbot_responses.json`). Make sure they follow this format:
```json
[
    {
        "query_text": "How do I reset my password?",
        "retrieved_response": "You can reset your password by clicking 'Forgot Password' on the login page.",
        "timestamp": "2025-04-02T14:30:00Z",
        "confidence_score": 0.92
    },
    {
        "query_text": "What are your business hours?",
        "retrieved_response": "Our support team is available from 9 AM to 5 PM, Monday to Friday.",
        "timestamp": "2025-04-02T14:35:00Z",
        "confidence_score": 0.87
    }
]
```
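
A sketch of the retrieve-or-generate decision and of a log entry in the format above. The names `utc_timestamp` and `route_query`, the threshold of 0.6, and the convention of logging a confidence of 0.0 for generated answers are assumptions for illustration; `generate` stands in for whatever function calls the chat model.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, Optional, Union


def utc_timestamp(now: Optional[datetime] = None) -> str:
    """Format a timezone-aware time as e.g. 2025-04-02T14:30:00Z (second precision)."""
    now = now or datetime.now(timezone.utc)
    return now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def route_query(query_text: str,
                best_response: str,
                best_score: float,
                generate: Callable[[str], str],
                threshold: float = 0.6,
                now: Optional[datetime] = None) -> Dict[str, Union[str, float]]:
    """Use the predefined response if it is similar enough, otherwise generate one."""
    if best_score >= threshold:
        response, confidence = best_response, float(best_score)
    else:
        # No sufficiently similar predefined response: fall back to the generator.
        response, confidence = generate(query_text), 0.0
    return {
        "query_text": query_text,
        "retrieved_response": response,
        "timestamp": utc_timestamp(now),
        "confidence_score": confidence,
    }
```

The threshold decides how often the fallback is used, so it is worth checking it against a few queries that should and should not match.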

In [46]:
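# Define the client and model directly; swallowing an error here would leave GPT_MODEL undefined below.
client = OpenAI()
GPT_MODEL = "gpt-3.5-turbo"

EMBEDDING_MODEL = "text-embedding-3-small"
# Minimum cosine similarity for returning a predefined response.
# Note: 0.1 is very permissive, so most queries are likely to match a predefined response.
SIMILARITY_THRESHOLD = 0.1

# Simplified helper to get an embedding (no retry logic)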
def get_embedding(text):
    """Generate the embedding for a single text."""
    # Note: the 'input' parameter receives a single-element list
    response = client.embeddings.create(input=[text], model=EMBEDDING_MODEL)
    return np.array(response.data[0].embedding)

def get_gpt_response(query_text, model=GPT_MODEL):
    """Generate a response with the GPT-3.5-turbo model."""
    # System prompt for the customer support role
    system_prompt = "You are a helpful and friendly customer support agent for ChatSolveAI. Provide concise and accurate answers."
    
    # Short optional pause before the call to reduce the chance of hitting rate limits
    time.sleep(1)
    
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": query_text}
        ]
    )
    return response.choices[0].message.content.strip()

# --- Knowledge base and retrieval functions ---

# 1. Load the predefined responses (chatbot_responses.json)
try:
    with open('chatbot_responses.json', 'r', encoding='utf-8') as f:
        chatbot_responses = json.load(f)
except FileNotFoundError:
    print("Error: chatbot_responses.json not found. It is required for retrieval.")
    raise

# 2. Generate embeddings for the predefined queries (knowledge-base preprocessing)
print(f"Generating embeddings for {len(chatbot_responses)} predefined queries...")

kb_query_texts = [item['query_text'] for item in chatbot_responses]

# Generate all embeddings in a single batch for efficiency
batch_response = client.embeddings.create(input=kb_query_texts, model=EMBEDDING_MODEL)
kb_vectors = np.array([item.embedding for item in batch_response.data])

# Extract the associated responses (the 'retrieved_response' values in the file)
kb_response_texts = [item['retrieved_response'] for item in chatbot_responses]


def find_best_match(query_vector):
    """Return the most relevant response and its cosine similarity."""
    
    # 3. Compute similarities
    # Vectorized with numpy as a normalized dot product
    # cosine similarity = (A . B) / (||A|| * ||B||)
    similarities = np.dot(kb_vectors, query_vector) / (
        np.linalg.norm(kb_vectors, axis=1) * np.linalg.norm(query_vector)
    )

    # Find the index of the most similar entry and its score
    max_index = np.argmax(similarities)
    max_similarity = similarities[max_index]
    
    return kb_response_texts[max_index], float(max_similarity)

def process_query(query_text):
    """Process a query: return a retrieved response if relevant, otherwise generate one with GPT."""
    
    # 1. Generate the embedding for the incoming query
    query_vector = get_embedding(query_text)
    
    # 2. Find the best match in the knowledge base
    best_response, max_similarity = find_best_match(query_vector)
    
    timestamp = datetime.datetime.now(datetime.timezone.utc).isoformat().replace('+00:00', 'Z')

    if max_similarity >= SIMILARITY_THRESHOLD:
        # 4. Retrieval: use the response found in the knowledge base
        retrieved_response = best_response
        confidence_score = max_similarity 
        
    else:
        # 5. Generation: no relevant response was found, so call GPT
        retrieved_response = get_gpt_response(query_text)
        confidence_score = 0.0 # Generated responses are logged with a confidence of 0.0
    
    # 6. Structure the interaction record
    return {
        "query_text": query_text,
        "retrieved_response": retrieved_response,
        "timestamp": timestamp,
        "confidence_score": confidence_score
    }

# --- Execution and logging ---

# Required test queries
test_queries = [
    # 1. PARAPHRASED query (should match a predefined response with high confidence)
    "When can I talk to someone from support?",
    # 2. OPEN-ENDED query (intended to fall below the threshold and use GPT;
    #    with SIMILARITY_THRESHOLD = 0.1 it can still match a predefined response)
    "What is the policy for getting a refund on a defective product?"
]

conversation_history = []

for query in test_queries:
    response_data = process_query(query)
    
    # Store the interaction in the history
    conversation_history.append(response_data)
    
    # Print for verification (optional, but useful in the notebook)
    print("-" * 50)
    print(f"QUERY: {response_data['query_text']}")
    print(f"RESPONSE TYPE: {'RAG' if response_data['confidence_score'] > 0.0 else 'GPT GENERATION'}")
    print(f"RESPONSE: {response_data['retrieved_response'][:70]}...")
    print(f"CONFIDENCE: {response_data['confidence_score']:.4f}")

# 7. Save the history to the JSON file
output_file = 'sample_chatbot_responses.json'
with open(output_file, 'w', encoding='utf-8') as f:
    json.dump(conversation_history, f, indent=4)

print("-" * 50)
print(f"Success: saved the conversation history to {output_file}.")
# Index 1 is the second test query
print("Conversation structure for the second query:")
print(json.dumps(conversation_history[1], indent=4))

Generating embeddings for 19 predefined queries...
--------------------------------------------------
QUERY: When can I talk to someone from support?
RESPONSE TYPE: RAG
RESPONSE: Our support team is available from 9 AM to 5 PM, Monday to Friday....
CONFIDENCE: 0.6821
--------------------------------------------------
QUERY: What is the policy for getting a refund on a defective product?
RESPONSE TYPE: RAG
RESPONSE: Yes, refunds are available within 30 days of purchase if you meet our ...
CONFIDENCE: 0.4964
--------------------------------------------------
Success: saved the conversation history to sample_chatbot_responses.json.
Conversation structure for the second query:
{
    "query_text": "What is the policy for getting a refund on a defective product?",
    "retrieved_response": "Yes, refunds are available within 30 days of purchase if you meet our refund policy criteria.",
    "timestamp": "2025-10-22T17:23:01.359952Z",
    "confidence_score": 0.49636265