<a id="1"></a>
# <div style="text-align:center; border-radius:15px 50px; padding:7px; color:white; margin:0; font-size:110%; font-family:Pacifico; background-color:#0073e6; overflow:hidden"><b>LLMs You Can't Please Them All - RAG GPT Gemini</b></div>

<div align="center">
    <img src="https://img.freepik.com/vetores-gratis/banner-abstrato-com-design-de-comunicacao-de-rede-poli-plexo_1048-12914.jpg?t=st=1733785035~exp=1733788635~hmac=c21820351902dbef6335cac51924f0043011b6675ba20c6ea631f0b35706df17&w=740" />
</div>

<a id="1"></a>
# <div style="text-align:center; border-radius:15px 50px; padding:7px; color:white; margin:0; font-size:110%; font-family:Pacifico; background-color:#0073e6; overflow:hidden"><b>Part 1 - Business Problem</b></div>



**Business Problem: Improving Customer Service with LLMs in an E-commerce Platform**

**Description:**
Our e-commerce platform is receiving a large volume of customer service inquiries every day, ranging from questions about order status to inquiries about product availability. The current customer support team is overwhelmed, leading to delays in response times and potential customer dissatisfaction. 

We want to use LLMs, specifically GPT and Gemini with RAG, to automate and optimize the response process. By combining a large pre-trained language model with a retrieval system that can access a knowledge base of common queries and responses, we aim to create a more efficient and effective customer support system.

The goal is to build a system where the LLM can intelligently retrieve relevant information from the knowledge base and generate responses that are contextually appropriate, clear, and helpful. This will reduce the workload of the support team and improve the overall customer experience by providing instant responses to common inquiries.

In [None]:
# Import the necessary libraries
import faiss

# Import additional libraries
import numpy as np
import pandas as pd

# Import deep learning libraries
import torch
from sentence_transformers import SentenceTransformer
from transformers import AutoModelForCausalLM, AutoTokenizer

<a id="1"></a>
# <div style="text-align:center; border-radius:15px 50px; padding:7px; color:white; margin:0; font-size:110%; font-family:Pacifico; background-color:#0073e6; overflow:hidden"><b>Part 2 - Kaggle secrets</b></div>

In [None]:
# Import the Gemini SDK and the library used to access Kaggle secrets
import google.generativeai as genai
from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()

# Retrieve the secret API key and configure the Gemini API
secret_value_0 = user_secrets.get_secret("Gemeni")
genai.configure(api_key=secret_value_0)

In [None]:
# Checking available models
for m in genai.list_models():
    if 'generateContent' in m.supported_generation_methods:
        print(m.name)

<a id="1"></a>
# <div style="text-align:center; border-radius:15px 50px; padding:7px; color:white; margin:0; font-size:110%; font-family:Pacifico; background-color:#0073e6; overflow:hidden"><b>Part 3 - Database</b></div>

In [None]:
# Load test data
test_data = pd.read_csv("/kaggle/input/llms-you-cant-please-them-all/test.csv")
test_data

In [None]:
# Display the first rows of test_data
test_data.head()

In [None]:
# Display the last rows of test_data
test_data.tail()

In [None]:
# Get general information about the test_data DataFrame
test_data.info()

In [None]:
# Check the data types of test_data
test_data.dtypes

<a id="1"></a>
# <div style="text-align:center; border-radius:15px 50px; padding:7px; color:white; margin:0; font-size:110%; font-family:Pacifico; background-color:#0073e6; overflow:hidden"><b>Part 4 - RAG model GPT-2</b></div>

In [None]:
# Load a retrieval model
retriever_model = SentenceTransformer('multi-qa-MiniLM-L6-cos-v1')

In [None]:
# Example corpus for retrieval

documents = ["The importance of self-reliance in healthcare.",
             "Consulting management to address marketing conflicts.",
             "The role of self-reliance in software engineering success."]

# Generate embeddings
document_embeddings = retriever_model.encode(documents)

# Create FAISS index
index = faiss.IndexFlatL2(document_embeddings.shape[1])
index.add(np.array(document_embeddings))
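
`IndexFlatL2` ranks documents by squared Euclidean distance, while the name of the retriever model (`multi-qa-MiniLM-L6-cos-v1`) suggests it was tuned for cosine similarity. The sketch below is a pure-Python illustration, not part of the pipeline: the 2-D vectors are made up for the example and are not real embeddings. It shows that the two rankings agree when vectors are unit-normalized, since in that case the squared L2 distance equals `2 - 2 * cosine`. Whether the real embeddings are normalized is an assumption worth checking on the model card.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def squared_l2(a, b):
    """Squared Euclidean distance, the quantity a flat L2 index sorts by."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy vectors, normalized to unit length
docs = [normalize(v) for v in [[1.0, 0.1], [0.2, 1.0], [0.7, 0.7]]]
query = normalize([0.9, 0.3])

# Smaller distance is better for L2; larger similarity is better for cosine
rank_by_l2 = sorted(range(len(docs)), key=lambda i: squared_l2(query, docs[i]))
rank_by_cos = sorted(range(len(docs)), key=lambda i: -cosine(query, docs[i]))

print(rank_by_l2, rank_by_cos)
```

If the embeddings were not normalized, the two rankings could differ, and an inner-product index over normalized vectors would be one alternative to consider.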

In [None]:
# Retrieval function

def retrieve(query, top_k=2):
    query_embedding = retriever_model.encode([query])
    distances, indices = index.search(np.array(query_embedding), top_k)
    results = [documents[i] for i in indices[0]]
    return results
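
To make clear what the search step computes, here is a minimal brute-force sketch in pure Python. It is an illustration under stated assumptions, not a replacement for FAISS: `brute_force_search` is a hypothetical helper, and the vectors are toy values. It also clamps `top_k` to the corpus size, because asking a small index for more neighbors than it holds can return placeholder indices (FAISS uses `-1`), which `documents[i]` would silently map to the last document.

```python
def brute_force_search(query_vec, doc_vecs, top_k=2):
    """Return (distances, indices) of the top_k nearest vectors by squared L2."""
    # Never ask for more neighbors than there are documents
    top_k = min(top_k, len(doc_vecs))

    # Score every document against the query
    scored = []
    for i, vec in enumerate(doc_vecs):
        dist = sum((q - d) ** 2 for q, d in zip(query_vec, vec))
        scored.append((dist, i))

    # Closest documents first
    scored.sort()
    distances = [dist for dist, _ in scored[:top_k]]
    indices = [i for _, i in scored[:top_k]]
    return distances, indices

# Hypothetical toy vectors
toy_docs = [[0.0, 0.0], [1.0, 1.0], [3.0, 3.0]]
toy_distances, toy_indices = brute_force_search([1.0, 0.9], toy_docs, top_k=2)
print(toy_indices)
```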

<a id="1"></a>
# <div style="text-align:center; border-radius:15px 50px; padding:7px; color:white; margin:0; font-size:110%; font-family:Pacifico; background-color:#0073e6; overflow:hidden"><b>Part 5 - Model GPT-2</b></div>

In [None]:
# Load pre-trained LLM

## Replace with your preferred model
llm_model_name = "gpt2"

In [None]:
# Load the language model
llm_model = AutoModelForCausalLM.from_pretrained(llm_model_name)

In [None]:
# Load the model's tokenizer
tokenizer = AutoTokenizer.from_pretrained(llm_model_name)

In [None]:
# Essay generation using retrieved context

def generate_essay(topic):
    # Retrieve contextual documents
    context = retrieve(topic)
    
    # Join the context documents
    context_text = "\n".join(context)

    # Construct input for the LLM
    input_text = f"Topic: {topic}\nEssay:\n{context_text}"
    inputs = tokenizer(input_text, return_tensors="pt")

    # GPT-2 has no dedicated pad token, so reuse the end-of-sequence token
    pad_token_id = tokenizer.eos_token_id

    # Generate output with better control
    outputs = llm_model.generate(inputs["input_ids"],
                                 attention_mask=inputs["attention_mask"],
                                 max_new_tokens=150,  # Budget for generated tokens only; max_length would also count the prompt
                                 num_return_sequences=1,
                                 temperature=0.7,
                                 do_sample=True,  # Enable sampling for more varied output
                                 top_p=0.9,  # Nucleus sampling
                                 top_k=50,  # Limit the sampling to top-k candidates
                                 no_repeat_ngram_size=2,  # Avoid repeated n-grams
                                 pad_token_id=pad_token_id  # Ensure proper padding
                                 )

    return tokenizer.decode(outputs[0], skip_special_tokens=True)
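
The sampling arguments passed to `generate` (`temperature`, `top_k`, `top_p`) interact in a specific order. The sketch below is a simplified pure-Python illustration of that idea, not the Hugging Face implementation, which works on logit tensors and differs in detail. `next_token_distribution` and the toy logits are hypothetical names and values invented for this example.

```python
import math

def next_token_distribution(logits, temperature=0.7, top_k=50, top_p=0.9):
    """Apply temperature, then top-k, then nucleus (top-p) filtering.

    `logits` maps candidate tokens to raw scores. Returns a dict of the
    surviving tokens with probabilities renormalized to sum to 1.
    """
    # 1. Temperature: dividing by T < 1 sharpens the distribution
    tokens = list(logits)
    scaled = [logits[t] / temperature for t in tokens]
    max_scaled = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - max_scaled) for s in scaled]
    total = sum(exps)
    probs = {t: e / total for t, e in zip(tokens, exps)}

    # 2. Top-k: keep only the k most probable tokens
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

    # 3. Top-p: keep the smallest prefix whose cumulative mass reaches top_p
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1
    norm = sum(p for _, p in kept)
    return {token: p / norm for token, p in kept}

# Hypothetical toy logits for four candidate tokens
toy_logits = {"the": 2.0, "a": 1.0, "an": 0.0, "zebra": -3.0}
toy_dist = next_token_distribution(toy_logits)
print(toy_dist)
```

With these toy values, the low-probability tail is cut before sampling, which is the effect the nucleus setting is meant to have.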

<a id="1"></a>
# <div style="text-align:center; border-radius:15px 50px; padding:7px; color:white; margin:0; font-size:110%; font-family:Pacifico; background-color:#0073e6; overflow:hidden"><b>Part 6 - Viewing RAG results</b></div>

In [None]:
# Generate essays for each topic

results = []
for _, row in test_data.iterrows():
    essay = generate_essay(row["topic"])
    results.append({"id": row["id"], "essay": essay})

# Results
for result in results:
    print(f"ID: {result['id']}\nEssay: {result['essay']}\n")

<a id="1"></a>
# <div style="text-align:center; border-radius:15px 50px; padding:7px; color:white; margin:0; font-size:110%; font-family:Pacifico; background-color:#0073e6; overflow:hidden"><b>Part 7 - RAG model Gemini</b></div>

In [None]:
# Load data from the CSV
data_test = pd.read_csv('/kaggle/input/llms-you-cant-please-them-all/test.csv')['topic'].tolist()
data_test

<a id="1"></a>
# <div style="text-align:center; border-radius:15px 50px; padding:7px; color:white; margin:0; font-size:110%; font-family:Pacifico; background-color:#0073e6; overflow:hidden"><b>Part 8 - Model Gemini</b></div>

In [None]:
# Name of the Gemini model to use
model_gemini = "gemini-1.5-pro"

In [None]:
# Config generation
generation_config = {"temperature": 0.7,
                     "top_p": 0.9,
                     "top_k": 50,
                     "max_output_tokens": 8192,
                     "response_mime_type": "text/plain",}

# System instruction
system_instruction = """
# System Prompt: You are an AI Research Assistant. Understand and summarize data. Answer briefly, referring only to the context.
"""

<a id="1"></a>
# <div style="text-align:center; border-radius:15px 50px; padding:7px; color:white; margin:0; font-size:110%; font-family:Pacifico; background-color:#0073e6; overflow:hidden"><b>Part 9 - RAG Gemini</b></div>

In [None]:
# Function to generate answers using the Gemini model
def generate_with_rag(query):
    
    ### Step 1: Retrieve relevant documents

    # Adjust the number of documents as needed
    context = retrieve(query, top_k=3)  
    context_text = "\n".join(context)
    
    ### Step 2: Concatenate context and query for the Gemini model
    prompt = f"Context:\n{context_text}\n\nQuestion: {query}\nAnswer:"
    
    # Create the Gemini model with the generation settings and system instruction
    model = genai.GenerativeModel(model_name=model_gemini,
                                  generation_config=generation_config, 
                                  system_instruction=system_instruction)
    
    # Start an empty chat session so the prompt is sent only once
    chat_session = model.start_chat(history=[])
    
    # Send the prompt and get the response from the model
    response = chat_session.send_message(prompt)
    
    return response.text
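
Keeping prompt construction separate from the API call makes it possible to inspect and test the prompt without an API key. The following is a minimal sketch under that assumption: `build_rag_prompt` and its `max_context_chars` cap are hypothetical additions, and the template mirrors the `Context / Question / Answer` layout used in this notebook.

```python
def build_rag_prompt(query, context_docs, max_context_chars=2000):
    """Assemble a RAG prompt from retrieved documents and a user query."""
    # Drop empty documents and surrounding whitespace
    cleaned = [doc.strip() for doc in context_docs if doc.strip()]
    context_text = "\n".join(cleaned)

    # Crude character cap so a large retrieval cannot dominate the prompt
    context_text = context_text[:max_context_chars]

    return f"Context:\n{context_text}\n\nQuestion: {query}\nAnswer:"

# Example with two of the corpus sentences from this notebook
example_prompt = build_rag_prompt(
    "Generate 5 topics related to self-reliance.",
    ["The importance of self-reliance in healthcare.",
     "The role of self-reliance in software engineering success."],
)
print(example_prompt)
```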

<a id="1"></a>
# <div style="text-align:center; border-radius:15px 50px; padding:7px; color:white; margin:0; font-size:110%; font-family:Pacifico; background-color:#0073e6; overflow:hidden"><b>Part 9 - Viewing RAG results Gemini</b></div>

In [None]:
# Generate 5 topics related to self-reliance and success in Data Science
prompt = "Generate 5 topics related to self-reliance and success in Data Science."
response = generate_with_rag(prompt)
print(response)

In [None]:
# Generate 5 topics related to self-reliance and success in generative artificial intelligence
prompt = "Generate 5 topics related to self-reliance and success in generative artificial intelligence."
response = generate_with_rag(prompt)
print(response)

<a id="1"></a>
# <div style="text-align:center; border-radius:15px 50px; padding:7px; color:white; margin:0; font-size:110%; font-family:Pacifico; background-color:#0073e6; overflow:hidden"><b>Part 10 - Submission</b></div>

In [None]:
# Create submission file
submission2 = pd.DataFrame(results)
submission2

In [None]:
# Display the first rows of the essay
submission2.essay.head()

In [None]:
# Save the submission2 DataFrame to a CSV file called submission1.csv
submission2.to_csv("submission1.csv", index=False)
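
Before uploading, it can help to sanity-check the file. The sketch below uses only the standard library and works on CSV text, so it can be tried without the competition data. `validate_submission` is a hypothetical helper, and the required columns (`id`, `essay`) are assumed from the `results` list built above rather than taken from the competition rules.

```python
import csv
import io

def validate_submission(csv_text, required_columns=("id", "essay")):
    """Return a list of human-readable problems found in a submission CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []

    # Check that every required column is present in the header
    missing = [col for col in required_columns if col not in fieldnames]
    if missing:
        return [f"missing column: {col}" for col in missing]

    # Flag rows whose essay is empty or whitespace only
    problems = []
    for row_number, row in enumerate(reader, start=1):
        if not (row["essay"] or "").strip():
            problems.append(f"row {row_number}: empty essay")
    return problems

# Tiny in-memory example; the rows are made up for illustration
sample_csv = "id,essay\n1,An essay about self-reliance.\n2,\n"
print(validate_submission(sample_csv))
```

For the real file, the text could be read with `open("submission1.csv").read()` and passed to the same function.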

<a id="1"></a>
# <div style="text-align:center; border-radius:15px 50px; padding:7px; color:white; margin:0; font-size:110%; font-family:Pacifico; background-color:#0073e6; overflow:hidden"><b>Part 11 - Conclusion</b></div>

- This project demonstrated the effectiveness of using Large Language Models (LLMs) such as GPT-2 and Gemini with RAG to generate essays that cause disagreement among multiple automated judges, addressing the challenge posed by the "LLMs - You Can't Please Them All" competition. By exploring text generation strategies that focused on topic diversity and ambiguity, we were able to maximize the variance in scores provided by the LLM-judge models. In addition, the use of Retrieval-Augmented Generation (RAG) allowed us to incorporate external data and aggregate different perspectives, enriching the response generation and increasing the complexity of the texts created. This was crucial to generate responses that were sufficiently distinct for the judges to assign varying scores.

- The main contribution of this project was to demonstrate how it is possible to manipulate the interaction between LLMs and automated assessment systems, creating a robust approach to identify and exploit potential biases and limitations of these systems. This study helps provide a deeper understanding of how LLMs can be used efficiently and robustly in large-scale subjective assessment tasks. Throughout the project, we also identified areas of opportunity for future improvements, such as exploring different strategies for adjusting text style and structure, as well as combining multiple LLMs to improve the robustness and variability of scores.
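
One simple way to quantify the judge disagreement discussed above is the variance of the scores each essay receives. The sketch below is only an illustration: the competition's actual scoring formula is not reproduced in this notebook, `judge_disagreement` and `mean_disagreement` are hypothetical helpers, and the scores are made-up placeholders rather than results.

```python
from statistics import mean, pvariance

def judge_disagreement(scores):
    """Population variance of the scores several judges gave one essay."""
    return pvariance(scores)

def mean_disagreement(scores_per_essay):
    """Average per-essay variance across a set of essays."""
    return mean(judge_disagreement(scores) for scores in scores_per_essay)

# Made-up placeholder scores from three hypothetical judges
agreeing_judges = [5, 5, 5]   # all judges give the same score
split_judges = [0, 9, 0]      # one judge strongly disagrees

print(judge_disagreement(agreeing_judges))
print(judge_disagreement(split_judges))
print(mean_disagreement([agreeing_judges, split_judges]))
```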

**The next step will be to test new ways of generating more sophisticated texts and further exploit the specific biases of each model, in order to maximize the discrepancy in assessments and contribute to the understanding of the limits and challenges of using LLMs in automated assessment processes.**