# Code Demo

## Basic setup

In [3]:
import os 

from openai import OpenAI

client = OpenAI(
     api_key=os.environ["OPENAI_API_KEY"],
)

## Method 1: Provide info in the prompt

In Method 1, we put all of the product information directly into the system prompt and let the model generate the answer.

Our test question is:
```
I want to buy a new model, which is good for my family. 
I have 2 children and always travel with my wife and her parents, 
so I want to make sure that the model is suitable for all of them.
```

In [96]:
def generate_base_prompt():
    prompt = \
    """
    You are a customer service chatbot. 
    use following models' information to assist the user:\n
    """
    model_list = [f"model_{n}" for n in range(1, 11)]
    for model in model_list:
        with open(f"data/stark_spaceships/{model}.txt", "r") as f:
            content = f.read()
        prompt += f"- {content}\n"
    prompt += "\nHow can I assist you today? please provide your ship's model."
    return prompt


def get_gpt_response(prompt, user_input):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": prompt},
            {"role": "user", "content": user_input},
        ]
    )
    return response.choices[0].message.content

def main():
    prompt = generate_base_prompt()
    user_input = """
    I want to buy a new model, which is good for my family. 
    I have 2 children and always travel with my wife and her parents, 
    and I want to make sure that the model is suitable for all of them.
    """
    print(get_gpt_response(prompt, user_input))
    
main()

Based on your requirements, I recommend the Stark 4. It is a family-friendly spaceship with ample space and comfort for long voyages. Here are some details that make it suitable for your family:

- **Capacity**: Accommodates up to 12 passengers, which is more than sufficient for your family of six.
- **Size**: 35 meters in length, providing ample space for comfort.
- **Compatible Accessories**: Includes options for an Entertainment System and Extra Sleeping Pods, ensuring a comfortable journey for all family members.
- **Notes**: Ensure you check the oxygen recycling system functionality before any journey exceeding 10 days, ensuring safety during longer trips.

The Stark 4 is priced at $8,000,000, featuring an Ocean Blue color that adds a serene touch to your travels. Let me know if you need any more information or assistance!


Method 1 simply places the product information in the prompt. It has the following pros and cons:
- pros:
  - Simple Implementation: You can include the information directly in the prompt without extra data processing or adjustments.

  - Immediate Update: Product information can be quickly changed or updated as needed.
- cons:
  - Context Limitations: The length of the prompt is limited; including too much information may exceed the token limit, requiring simplification or risking truncation.

  - Increased Cost: Longer prompts increase the cost of API calls, as pricing is often based on the number of tokens used.

  - Redundancy: If some information isn't always needed, it might unnecessarily increase processing time and response latency.

  - Reduced Flexibility: Fixed information may not easily adapt to specific queries, lacking dynamic adjustment based on the particular question.
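
To make the context and cost concerns above more concrete, here is a minimal sketch that estimates how many input tokens a prompt of this shape might use. It relies on a rough rule of thumb of about four characters per token, which is only an approximation; real counts depend on the model's tokenizer. The helper names, the stand-in product sheets, and the price are all assumptions for illustration, not values from this demo.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate; the real count depends on the tokenizer."""
    return max(1, round(len(text) / chars_per_token))


def estimate_prompt_cost(prompt: str, price_per_million_tokens: float) -> float:
    """Estimated input cost in dollars for sending `prompt` once."""
    return estimate_tokens(prompt) * price_per_million_tokens / 1_000_000


# Hypothetical product sheets standing in for the files in data/stark_spaceships/
sheets = [f"Model {n}: " + "spec " * 200 for n in range(1, 11)]
prompt = "You are a customer service chatbot.\n" + "\n".join(f"- {s}" for s in sheets)

tokens = estimate_tokens(prompt)
# The price below is a placeholder, not an actual published rate
cost = estimate_prompt_cost(prompt, price_per_million_tokens=2.50)
print(f"~{tokens} input tokens per request, ~${cost:.4f} per request")
```

Because the whole catalogue is sent with every request, this estimate grows linearly with the number of product sheets, whether or not a given question needs them.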

----

## Method 2: Use RAG to provide information

When our documents are too numerous to fit into a prompt all at once, we typically need to implement a search method to retrieve the documents that most closely match the user's query. We then provide the content of these documents to GPT for integration and summarization.

In this example, we use the sentence-transformers model all-MiniLM-L6-v2 (a compact BERT-style encoder) to embed both the documents and the user's query, and we retrieve the documents whose embeddings are closest to the query's.
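
The retrieval step boils down to ranking documents by cosine similarity against the query vector. The sketch below shows that idea in plain Python with tiny hand-made vectors; the vectors and the helper names are hypothetical, and real embeddings come from the model and have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_n_documents(query_vec, doc_vecs, top_n=1):
    """Return the names of the top_n documents, most similar first."""
    scored = sorted(
        doc_vecs.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in scored[:top_n]]


# Hypothetical 3-dimensional "embeddings" used only for illustration
doc_vecs = {
    "model 1": [0.9, 0.1, 0.0],
    "model 2": [0.1, 0.8, 0.3],
    "model 3": [0.0, 0.2, 0.9],
}
query_vec = [0.2, 0.9, 0.2]

print(top_n_documents(query_vec, doc_vecs, top_n=2))  # ['model 2', 'model 3']
```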

In [97]:
import os

import numpy as np

from sentence_transformers import SentenceTransformer, util


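# Load the sentence-embedding model used for encoding
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

def load_product_data():
    """Read every .txt product sheet into a dict keyed by a readable model name."""
    docs = {}
    data_dir = "data/stark_spaceships"
    for file in sorted(os.listdir(data_dir)):
        if not file.endswith(".txt"):
            continue
        # Use a context manager so each file handle is closed promptly
        with open(os.path.join(data_dir, file), "r") as f:
            docs[file.replace(".txt", "").replace("_", " ")] = f.read()
    return docs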

def create_embeddings(document_texts):
    """Encode all document texts into BERT embeddings"""
    document_embeddings = model.encode(
        list(document_texts.values()), convert_to_tensor=True)
    return document_embeddings


def find_most_relevant_documents(query, document_texts, document_embeddings, top_n=1):
    """Return the names of the top_n documents most similar to the query."""
    # Encode the query into an embedding
    query_embedding = model.encode(query, convert_to_tensor=True)

    # Compute cosine similarity between the query and each document
    similarities = util.pytorch_cos_sim(
        query_embedding, document_embeddings)[0]

    # Get the indices of the top N most similar documents, best match first
    # (move the tensor to CPU so NumPy can sort it)
    most_relevant_indices = np.argsort(similarities.cpu().numpy())[::-1][:top_n]

    # Retrieve document names
    doc_names = list(document_texts.keys())
    most_relevant_docs = [doc_names[index] for index in most_relevant_indices]

    return most_relevant_docs

def generate_response(query, docs):
    """put docs into a prompt and generate response"""
    
    prompt = f"Answer the following question based on the provided documents:\n\n{query}\n\nDocuments:\n"
    for doc in docs:
        prompt += f"{doc}\n"
    prompt += "Answer:"
    
    # call gpt-4o
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": prompt},
            {"role": "user", "content": query}
        ]
    )
    return response.choices[0].message.content

def main():
    # user query
    query = "What spacecraft is reliable for adventurers and has extra fuel tanks?"
    # TODO: sometimes the query should first be rewritten into more specific keywords; we skip that step here.

    # Create embeddings
    docs = load_product_data()
    embeddings = create_embeddings(docs)
    
    # Find most relevant documents
    relevant_docs_key = find_most_relevant_documents(query, docs, embeddings)
    print("Most Relevant Documents:", relevant_docs_key)
    
    # generate response with docs
    relevant_docs = [docs[index] for index in relevant_docs_key]
    print("Response:", generate_response(query, relevant_docs))

if __name__ == "__main__":
    main()

Most Relevant Documents: ['model 2']
Response: The Stark 2 is a reliable spacecraft for adventurers and is compatible with extra fuel tanks.


In Method 2, we use RAG so that the chatbot can answer questions about Stark Spaceships. Its pros and cons are:

- Pros
  - Contextual Completeness: RAG allows a model to answer questions based on specific, detailed context from retrieved documents, improving accuracy and relevance.

  - Scalability: Efficiently handles large datasets by retrieving only relevant portions, which helps in scaling to large corpora of information.

  - Dynamic Information Update: As the document database is updated, the retrieval component can access the most up-to-date information without needing to retrain the model.

  - Cost-Effective: Reduces the need for extensive fine-tuning or retraining by leveraging existing powerful models with up-to-date information retrieval.

  - Domain Adaptability: Easily integrate information from various domains simply by updating the underlying document corpus, without needing modifications to the generation model.

- Cons
  - Complex Implementation: Setting up a RAG system involves implementing both a retrieval mechanism and a generation model, which can be complex and require significant engineering work.

  - Latency: Adding a retrieval step can increase response time, which may impact performance in real-time applications.

  - Dependency on Document Quality: The effectiveness of RAG is highly dependent on the quality and relevance of the documents in the retrieval database.



----

## Method 3: Fine-Tuning

We can also use fine-tuning to improve the model's performance on a specific task. In this case, we fine-tune the model to answer politely when customers complain about their experience with our product and to ask them to send their information to our customer service team.
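
The training file uploaded below is a JSONL file in which each line is one chat-formatted example with a `messages` list. The contents of `data/fine-tuning-job-example.jsonl` are not shown in this demo, so the following is only a sketch of how such records could be built; the system prompt, the complaint/reply pairs, the placeholder email address, and the helper names are all hypothetical.

```python
import json

# Hypothetical system prompt; the demo's actual training data is not shown
SYSTEM_PROMPT = "You are a polite customer service assistant."


def to_training_record(complaint: str, reply: str) -> dict:
    """Build one chat-format training example."""
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": complaint},
            {"role": "assistant", "content": reply},
        ]
    }


def to_jsonl(pairs) -> str:
    """Serialize (complaint, reply) pairs as one JSON object per line."""
    return "\n".join(json.dumps(to_training_record(c, r)) for c, r in pairs)


# Hypothetical examples; support@example.com is a placeholder address
pairs = [
    ("My ship arrived with a scratched hull.",
     "We are very sorry about this. Please email support@example.com with your order details."),
    ("The delivery was two weeks late!",
     "We apologize for the delay. Please email support@example.com so we can look into it."),
]

jsonl_text = to_jsonl(pairs)
print(jsonl_text)
```

Writing `jsonl_text` to a `.jsonl` file would produce something in the shape the upload step expects; a real training set would need many more examples than the two shown here.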

In [4]:
import json
import time

def upload_file(file_name, purpose):
    with open(file_name, "rb") as file_fd:
        response = client.files.create(file=file_fd, purpose=purpose)
    return response.id


def run_fine_tuning_job(file_id):
    """
    Create a fine-tuning job, poll until it finishes, and return the
    fine-tuned model name (None if the job did not succeed).
    """
    # Keep the job returned by create() so we poll exactly this job,
    # rather than whichever job happens to be first in the list
    job = client.fine_tuning.jobs.create(
        training_file=file_id,
        model="gpt-4o-mini-2024-07-18",
    )
    task_id = job.id

    while True:
        status = client.fine_tuning.jobs.retrieve(task_id).status
        # The fine-tuning API reports a canceled job as "cancelled"
        if status in ("failed", "cancelled"):
            print(f"Fine-tuning job failed with status: {status}")
            break
        elif status == "succeeded":
            print("Fine-tuning job succeeded.")
            break
        else:
            print(f"Fine-tuning job status: {status}")
            time.sleep(30)

    fine_tuned_model = client.fine_tuning.jobs.retrieve(task_id).fine_tuned_model
    return fine_tuned_model
        
def delete_model(model_id):
    client.models.delete(model_id)

def chat_completion(user_input, model_id):
    completion = client.chat.completions.create(
        model=model_id,
        messages=[
            {"role": "user", "content": user_input},
        ]
    )
    return completion.choices[0].message.content


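def main(user_input):
    file_id = upload_file("data/fine-tuning-job-example.jsonl", "fine-tune")
    fine_tuned_model_id = run_fine_tuning_job(file_id)
    # The model name is None unless the job succeeded, so skip the chat call in that case
    if fine_tuned_model_id is None:
        print("No fine-tuned model is available; skipping the chat completion.")
        return
    print(chat_completion(user_input, fine_tuned_model_id))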
    

if __name__ == "__main__":
    user_input = "I received the wrong model! How could this happen?"
    main(user_input)

Fine-tuning job status: validating_files
Fine-tuning job status: validating_files
Fine-tuning job status: validating_files
Fine-tuning job status: running
Fine-tuning job status: running
Fine-tuning job status: running
Fine-tuning job status: running
Fine-tuning job status: running
Fine-tuning job status: running
Fine-tuning job status: running
Fine-tuning job status: running
Fine-tuning job status: running
Fine-tuning job status: running
Fine-tuning job status: running
Fine-tuning job status: running
Fine-tuning job status: running
Fine-tuning job succeeded.
This is concerning, and we deeply regret the error. Please email customer_service@stack_spaceship.com with your order details, and we will arrange for the correct model to be sent to you.
