## Embeddings and RAG Visual Demonstrations
This notebook visually demonstrates how embeddings work in a Retrieval-Augmented Generation (RAG) pipeline. 

### Install Dependencies
Before running the code, ensure you have the following libraries installed:

In [None]:
!pip install openai
!pip install numpy
!pip install scikit-learn
!pip install python-dotenv
!pip install langchain

### Installing Manim

Manim is a powerful Python library for creating mathematical and scientific animations, including 2D and 3D visualizations. In this demonstration, we use Manim to visualize text embeddings and their semantic relationships in a 3D space.

To run the visualization code in this notebook, you need to install the Manim library. Follow the official installation instructions for your operating system:

- **Installation Guide:** [Manim Installation](https://docs.manim.community/en/stable/installation.html)


### Embedding Visualization

In [None]:
from manim import *

In [None]:
%%manim -pql TextToVector

class TextToVector(Scene):
    def construct(self):
        # Step 1: Display the input text
        text = Text("Investing in small-cap stocks is risky but can be rewarding.", font_size=24)
        text.to_edge(UP)
        self.play(Write(text))
        self.wait(1)

        # Step 2: Create a vertical list of words, positioned on the left
        words = ["Investing", "in", "small-cap", "stocks", "is", "risky", "but", "can", "be", "rewarding"]
        word_list = VGroup(*[Text(word, font_size=20) for word in words])
        word_list.arrange(DOWN, aligned_edge=LEFT, buff=0.5)  # Arrange words vertically
        word_list.to_edge(LEFT, buff=1)  # Move the list to the far left

        # Animate the appearance of the word list
        self.play(FadeIn(word_list))
        self.wait(1)

        # Step 3: Highlight the transition to the vector representation
        vector = Matrix(
            [["0.022"], ["-0.019"], ["..."], ["-0.054"]],  # Example random numbers and ellipsis
            include_background_rectangle=True
        )
        vector.scale(0.8)
        vector.move_to([0, 0, 0])  # Center the vector on the screen

        arrow = Arrow(word_list.get_right(), vector.get_left(), buff=0.1, color=YELLOW)

        # Animate the conversion to the vector
        self.play(Create(arrow))
        self.play(Transform(word_list, vector))
        self.wait(2)

        # Clean up the scene
        # self.play(FadeOut(text, word_list, arrow, vector))

### Demonstration: Semantic Embeddings 

In this section, we’re going to embed a few sentences using a text embedding model and then reduce the dimensionality of these embeddings to visualize their semantic relationships.

- Two of the sentences are related to each other, and the third is completely unrelated to both.
- By converting each sentence into an embedding (a high-dimensional vector), we capture the semantic meaning of the text.
- We then use Principal Component Analysis (PCA) to reduce the embeddings to just three dimensions, making it easier to visualize their relative positions.
- Sentences with similar topics (small-cap investing) should end up closer together in this 3D space, while the unrelated physics sentence should appear farther away, illustrating how embeddings group similar meanings together.
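
The idea that related sentences end up "closer together" can be sketched with cosine similarity on a few made-up vectors. The four-dimensional values below are illustrative assumptions, not output from an embedding model, and `cosine_similarity` is a helper written for this sketch.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up 4-dimensional "embeddings" (illustrative values, not model output)
toy_embeddings = {
    "small_cap_a": [0.9, 0.8, 0.1, 0.0],
    "small_cap_b": [0.8, 0.9, 0.2, 0.1],
    "physics": [0.1, 0.0, 0.9, 0.8],
}

related = cosine_similarity(toy_embeddings["small_cap_a"], toy_embeddings["small_cap_b"])
unrelated = cosine_similarity(toy_embeddings["small_cap_a"], toy_embeddings["physics"])
print(f"related pair:   {related:.3f}")
print(f"unrelated pair: {unrelated:.3f}")
```

A value near 1 means the vectors point in almost the same direction; a value near 0 means they are close to orthogonal.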

To proceed, you'll need an OpenAI API key and a `.env` file to store it securely:

1. **Obtain an OpenAI API Key:**
   - Visit the [OpenAI API Keys page](https://platform.openai.com/account/api-keys) to generate your key.

2. **Create a `.env` File:**
   - Refer to [this guide](https://www.geeksforgeeks.org/how-to-create-and-use-env-files-in-python/) for instructions on creating and using `.env` files in Python.

Ensure your `.env` file is in the project directory and includes your OpenAI API key in the following format:

`OPENAI_API_KEY=your_api_key_here`
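
A missing key usually only surfaces later as an authentication error, so it can help to check for it up front. The helper below is a hypothetical convenience for this notebook, not part of `python-dotenv` or the OpenAI SDK; it only reads the environment.

```python
import os

def require_api_key(name="OPENAI_API_KEY"):
    """Return the value of an environment variable or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file and call load_dotenv() first."
        )
    return value

# Typical use, after load_dotenv() has run:
#     api_key = require_api_key()
```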

In [None]:
# Embedding and Dimensionality Reduction
from openai import OpenAI
import numpy as np
from sklearn.decomposition import PCA
from dotenv import load_dotenv
import os

load_dotenv()

client = OpenAI()

# Define sentences: two related to small-cap stock investing and one unrelated
sentences = [
    "Investing in smaller companies is riskier.",
    "Smaller company investments come with higher risks.",
    "Physics explores particle behavior.",
    
]

# Embed sentences
embeddings = []
for sentence in sentences:
    response = client.embeddings.create(input=sentence, model="text-embedding-3-small")
    embeddings.append(response.data[0].embedding)

# Convert embeddings to numpy array
embeddings = np.array(embeddings)

# Reduce dimensions to 3D using PCA 
pca = PCA(n_components=3)
embeddings_3d = pca.fit_transform(embeddings)

# Print 3D coordinates
for sentence, coords in zip(sentences, embeddings_3d):
    print(f"Sentence: {sentence}\n3D Coordinates: {coords}\n")

# 3D coordinates for visualization
embedding_coordinates = embeddings_3d.tolist()
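
One detail worth knowing about this cell: with only three sentences, the centered data spans at most two independent directions, so the third principal component carries essentially no variance and the three points lie in a plane. The sketch below reproduces that effect with NumPy's SVD on random stand-in vectors (the 16-dimensional toy data is an assumption; real embeddings are much larger).

```python
import numpy as np

def pca_explained_variance_ratio(X):
    """Fraction of total variance captured by each principal component."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # Squared singular values of the centered data are proportional to component variances
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()

rng = np.random.default_rng(0)
toy = rng.normal(size=(3, 16))  # 3 made-up "embeddings" with 16 dimensions
ratios = pca_explained_variance_ratio(toy)
print(np.round(ratios, 6))
```

With scikit-learn, the same ratios are available as `pca.explained_variance_ratio_` after fitting.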

### Visualizing Embeddings in 3D Space

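In this section, we plot the three sentence embeddings in 3D space. The two related sentences are drawn in blue and the unrelated sentence in green, and the camera rotates once around the scene.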

In [None]:
%%manim -pql EmbeddingVisualization

class EmbeddingVisualization(ThreeDScene):
    def construct(self):
        axes = ThreeDAxes()
        self.add(axes)

        scale_factor = 5

        colors = [BLUE, BLUE, GREEN]

        for i, coords in enumerate(embedding_coordinates[:3]):
            scaled_coords = [coord * scale_factor for coord in coords]
            dot = Dot3D(point=scaled_coords, color=colors[i], radius=0.1)

            self.add(dot)

        # Set initial camera orientation
        self.set_camera_orientation(phi=75 * DEGREES, theta=-45 * DEGREES)

        # Rotate camera by 360 degrees around the scene, slower and smoother
        self.move_camera(theta=-45 * DEGREES + TAU, run_time=30, rate_func=linear)
        
        self.wait(2)
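
The fixed `scale_factor` above works when the PCA coordinates happen to fall in a convenient range. As an alternative, the factor could be derived from the data. This is a sketch only: `fit_to_extent` and the `target_extent=3.0` default are assumptions chosen for illustration, not values from the original scene.

```python
def fit_to_extent(points, target_extent=3.0):
    """Uniformly scale points so the largest absolute coordinate equals target_extent."""
    largest = max(abs(c) for p in points for c in p)
    if largest == 0:
        return [list(p) for p in points]  # all points at the origin; nothing to scale
    factor = target_extent / largest
    return [[c * factor for c in p] for p in points]

toy_points = [[0.2, -0.1, 0.0], [-0.4, 0.3, 0.1], [0.1, 0.05, -0.2]]
scaled = fit_to_extent(toy_points)
print(scaled)
```

Because every coordinate is multiplied by the same factor, relative distances between the points are preserved.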

### Embedding Transcripts Visual

In [None]:
%%manim -pql TranscriptChunksToMatrix

from manim import *

class TranscriptChunksToMatrix(Scene):
    def construct(self):
        # Step 1: Create the Transcript Icon
        transcript = SVGMobject("../../Retrieval Augmented Generation/Visualizations/media/images/Visualizations/svg/document-text-svgrepo-com.svg").scale(0.7).shift(LEFT * 4)
        transcript_label = Text("Transcript", font_size=24).next_to(transcript, DOWN)
        transcript_group = VGroup(transcript, transcript_label)

        # Add transcript to the scene
        self.play(FadeIn(transcript_group))
        self.wait(2)

        # Step 2: Define chunks and real embeddings (rounded)
        chunks = ["Chunk 1", "Chunk 2", "Chunk n"]
        vectors = [
            [0.038, -0.010, "...", 0.083],
            [0.022, 0.011, "...", 0.049],
            [0.045, 0.003,"...", 0.007]
        ]
        rows = [[chunks[i], f"[{vectors[i][0]}, {vectors[i][1]}, {vectors[i][2]}, {vectors[i][3]}]"] for i in range(len(chunks))]

        # Step 3: Create the matrix (chunks with vectors)
        chunk_matrix = Table(
            rows,
            row_labels=None,
            col_labels=[Text("Chunk"), Text("Embeddings")],
            top_left_entry=None
        ).scale(0.7).shift(LEFT * 2)

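        # Animate transcript transforming into chunks.
        # ReplacementTransform puts chunk_matrix itself into the scene,
        # so the cleanup step below can fade it out.
        self.play(ReplacementTransform(transcript_group, chunk_matrix))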
        self.wait(2)

        # Step 4: Add Database Icon to Scene
        database_icon = SVGMobject("../../Retrieval Augmented Generation/Visualizations/media/images/Visualizations/svg/database-svgrepo-com.svg").scale(0.7).shift(RIGHT * 4)
        database_label = Text("Vector Database", font_size=18).next_to(database_icon, DOWN)

        # Step 5: Add arrow connecting the matrix to the database
        arrow = Arrow(start=chunk_matrix.get_right(), end=database_icon.get_left(), buff=0.2, color=YELLOW)

        # Animate the arrow and database appearance
        self.play(GrowArrow(arrow), FadeIn(database_icon), Write(database_label))
        self.wait(3)

        # Step 6: Clean up the scene
        self.play(FadeOut(chunk_matrix, arrow, database_icon, database_label))
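
The animation shows a transcript being split into chunks before each chunk is embedded. A minimal sketch of one common way to do that split is overlapping word windows. The function name, `chunk_size`, and `overlap` are assumptions for illustration; the notebook does not specify how its chunks were produced.

```python
def chunk_words(text, chunk_size=50, overlap=10):
    """Split text into word windows of chunk_size words; consecutive windows share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks

sample = " ".join(f"w{i}" for i in range(12))
for chunk in chunk_words(sample, chunk_size=5, overlap=2):
    print(chunk)
```

The overlap keeps a sentence that straddles a boundary at least partly intact in both neighboring chunks.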

### Demonstrating Retrieval-Augmented Generation (RAG) with Sentence Embeddings

In this section, we demonstrate how a Retrieval-Augmented Generation (RAG) pipeline works using a simple query and a set of text chunks. This approach combines embeddings, nearest neighbor search, and visualization to retrieve semantically similar information.

#### Workflow Overview:
1. **Query and Chunks:**  
   - A query (e.g., *"What are some of the market risks that could potentially impact near-term EBITDA?"*) is provided.  
   - Several chunks of text are defined: a few that are relevant to the query and several that are unrelated.

2. **Embedding Sentences:**  
   - Each sentence (query and chunks) is converted into a high-dimensional vector representation using a text embedding model.  
   - These embeddings encode the semantic meaning of the sentences.

3. **Reducing Dimensions:**  
   - To make the high-dimensional embeddings easier to visualize, we reduce their dimensions to 3D using Principal Component Analysis (PCA).  
   - This allows us to plot the embeddings in a 3D space.

4. **Finding Similar Chunks (k-Nearest Neighbors):**  
   - Using the query’s embedding, we search for the `k=2` nearest neighbors (the most semantically similar chunks) among the chunk embeddings.  
   - Nearest neighbors are identified with the Euclidean distance metric, computed on the original high-dimensional embeddings rather than the 3D PCA coordinates.

5. **Visualization:**  
   - The query is plotted as a red dot.  
   - The two closest chunks are highlighted in yellow.  
   - All other chunks are plotted in blue.  
   - Labels are added to make it easy to understand which point corresponds to which sentence.
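
The nearest-neighbor step can be sketched without any external service. The example below uses random unit vectors as stand-ins for real embeddings (an assumption) and compares Euclidean ranking with cosine ranking. For unit-length vectors, which is how OpenAI's documentation describes its embeddings, the two rankings agree because the squared distance equals 2 minus twice the dot product.

```python
import math
import random

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def top_k_euclidean(query, chunks, k):
    """Indices of the k chunks closest to the query (smallest distance first)."""
    order = sorted(range(len(chunks)), key=lambda i: math.dist(query, chunks[i]))
    return order[:k]

def top_k_cosine(query, chunks, k):
    """Indices of the k most similar chunks (largest dot product first).

    For unit vectors the dot product is the cosine similarity.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    order = sorted(range(len(chunks)), key=lambda i: -dot(query, chunks[i]))
    return order[:k]

random.seed(0)
# Made-up unit vectors standing in for real embeddings
query_vec = normalize([random.gauss(0, 1) for _ in range(8)])
chunk_vecs = [normalize([random.gauss(0, 1) for _ in range(8)]) for _ in range(9)]

print(top_k_euclidean(query_vec, chunk_vecs, k=2))
print(top_k_cosine(query_vec, chunk_vecs, k=2))
```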


In [None]:
from sklearn.neighbors import NearestNeighbors

# Define query and chunks
query = "what are some of the market risks that could potentially impact near-term EBITDA?"

relevant = ["Most commodities are going to move down in price while NAND and DRAM increased during the course of the September quarter, and we expect them to increase during the December quarter.",
    "If you look at how we did for the quarter in China, we were relatively flat year over year, and a key component of that improvement relative to the year-over-year performance that we had been achieving is that there was a sequential improvement in foreign exchange"]

irrelevant = ["And we love celebrating the craft of great storytellers who know how to put on a show.",
    "I had an incredible time during launch day in September alongside our team at Apple Fifth Avenue where energy and enthusiasm filled the air.",
    "Today, users choose Apple Pay for purchases across tens of millions of retailers worldwide.",
    "In honor of World Teachers' Day, Apple was proud to share new resources for teachers to engage their students in ways that aim to make learning easy and fun.",
    "With AirPods 4, we’ve broken new ground in comfort and design with our best-ever open-ear headphones available for the first time with active noise cancellation",
    "The iPhone active installed base grew to a new all-time high in total and in every geographic segment",
    "The latest reports from 451 Research indicated customer satisfaction of 96% for Watch in the U.S."]

chunks = relevant + irrelevant

# Create labels: "Q" for query, and "A", "B", ... for the chunks
labels = ["Q"] + [chr(ord('A') + i) for i in range(len(chunks))]

# Embed sentences (Assume `client` is already initialized)
all_sentences = [query] + chunks
embeddings = []
for sentence in all_sentences:
    response = client.embeddings.create(
        input=sentence,
        model="text-embedding-3-small"
    )
    embeddings.append(response.data[0].embedding)

# Convert embeddings to numpy array
embeddings = np.array(embeddings)

# Reduce dimensions to 3D for visualization
pca = PCA(n_components=3)
embeddings_3d = pca.fit_transform(embeddings)

# Print 3D coordinates for debugging
for label, sentence, coords in zip(labels, all_sentences, embeddings_3d):
    print(f"Label: {label}\nSentence: {sentence}\n3D Coordinates: {coords}\n")

# Find k=2 nearest neighbors for the query
query_embedding = embeddings[0].reshape(1, -1)  # Query is at index 0
chunk_embeddings = embeddings[1:]  # Chunks start from index 1

knn = NearestNeighbors(n_neighbors=2, metric='euclidean')
knn.fit(chunk_embeddings)
distances, indices = knn.kneighbors(query_embedding)

# Map nearest neighbor indices back to labels
closest_chunks = [labels[i+1] for i in indices[0]]  # +1 because query is index 0
print("The 2 nearest neighbors to the query are:", closest_chunks)

# Prepare 3D embedding coordinates for visualization
embedding_coordinates = embeddings_3d.tolist()
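
The loop above makes one API request per sentence. The embeddings endpoint also accepts a list of strings as `input`, so the same result can be obtained in a single request. The sketch below wraps that in a helper; `embed_texts` and the `FakeEmbeddings` stub are hypothetical names introduced here so the example runs offline, and in the notebook you would pass the real `client` instead.

```python
from types import SimpleNamespace

def embed_texts(client, texts, model="text-embedding-3-small"):
    """Embed a list of texts with one API call and return the vectors in input order."""
    response = client.embeddings.create(input=list(texts), model=model)
    # Sort by the `index` field so the output order matches the input order
    ordered = sorted(response.data, key=lambda item: item.index)
    return [item.embedding for item in ordered]

class FakeEmbeddings:
    """Hypothetical stand-in for client.embeddings, used only for offline testing."""
    def create(self, input, model):
        data = [SimpleNamespace(index=i, embedding=[float(len(t)), 0.0])
                for i, t in enumerate(input)]
        return SimpleNamespace(data=data)

fake_client = SimpleNamespace(embeddings=FakeEmbeddings())
vectors = embed_texts(fake_client, ["query", "chunk one"])
print(vectors)
```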

### RAG Embeddings Visualization

In [None]:
%%manim -pql RAGEmbeddingVisualization

class RAGEmbeddingVisualization(ThreeDScene):
    def construct(self):
        axes = ThreeDAxes()
        self.add(axes)

        scale_factor = 5

        # Plot the query in RED
        query_coords = embedding_coordinates[0]
        query_point = [coord * scale_factor for coord in query_coords]
        query_dot = Dot3D(point=query_point, color=RED, radius=0.06)
        query_label = Text(labels[0], font_size=24).move_to(query_dot.get_center())
        self.add_fixed_orientation_mobjects(query_label)
        self.add(query_dot, query_label)

        # Plot the chunks
        for i, coords in enumerate(embedding_coordinates[1:], start=1):
            scaled_coords = [coord * scale_factor for coord in coords]
            dot_color = YELLOW if labels[i] in closest_chunks else BLUE
            dot = Dot3D(point=scaled_coords, color=dot_color, radius=0.05)
            chunk_label = Text(labels[i], font_size=24).move_to(dot.get_center())
            self.add_fixed_orientation_mobjects(chunk_label)
            self.add(dot, chunk_label)

        # Initial camera orientation
        self.set_camera_orientation(phi=75 * DEGREES, theta=-45 * DEGREES)

        # Rotate camera 360 degrees slowly and linearly
        self.move_camera(theta=-45 * DEGREES + TAU, run_time=30, rate_func=linear)
        
        self.wait(2)

### GPT-4o Response Without Context

In [33]:
from langchain.prompts import ChatPromptTemplate

template = "You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. {context}"

system_prompt = ChatPromptTemplate.from_template(template)
    

final_prompt = system_prompt.format(context="\n".join(relevant))

In [34]:
final_prompt

'Human: You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. Most commodities are going to move down in price while NAND and DRAM increased during the course of the September quarter, and we expect them to increase during the December quarter.\nIf you look at how we did for the quarter in China, we were relatively flat year over year, and a key component of that improvement relative to the year-over-year performance that we had been achieving is that there was a sequential improvement in foreign exchange'

In [None]:
import openai

# Initialize the OpenAI client
client = openai.OpenAI()
# Define the query
question = "What are some of the market risks that could potentially impact Apple's near-term EBITDA? Give your answer in two sentences."

# Function to get a response from GPT-4o
def ask_gpt4(system_prompt, question):
    try:
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": question},
            ],
            max_tokens=500,
            temperature=0.7,
        )
        return response.choices[0].message.content
    except Exception as e:
        return f"An error occurred: {e}"

# Get and print the response
response = ask_gpt4(final_prompt,question)
print("GPT-4 Response:")
print(response)

#### With Context

In [103]:
system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context from the 4th quarter 2024 earnings call to answer the question. "
    "If you don't know the answer, say that you don't know.\n\n"
    + " ".join(relevant)
)

In [101]:
import openai

# Initialize the OpenAI client
client = openai.OpenAI()
# Define the query
query = "What are some of the market risks that could potentially impact Apple's near-term EBITDA? Give your answer in two sentences."

# Function to get a response from GPT-4o
# (uses the module-level `system_prompt` defined in the previous cell)
def ask_gpt4(question):
    try:
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": question},
            ],
            max_tokens=500,
            temperature=0.7,
        )
        return response.choices[0].message.content
    except Exception as e:
        return f"An error occurred: {e}"

# Get and print the response
response = ask_gpt4(query)
print("GPT-4 Response:")
print(response)

GPT-4 Response:
Some market risks that could potentially impact Apple's near-term EBITDA include fluctuations in foreign exchange rates, which can affect revenue and profitability, particularly in international markets like China. Additionally, rising costs for NAND and DRAM components could increase expenses and impact margins.
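
To compare answers with and without retrieved context side by side, the message list could be built by one helper that takes the context as an optional argument. This is a sketch under assumptions: `build_messages` is a hypothetical name, and the prompt wording is adapted from the cells above rather than copied from a library.

```python
def build_messages(question, context_chunks=None):
    """Build chat messages, optionally adding retrieved context to the system prompt."""
    system = "You are an assistant for question-answering tasks."
    if context_chunks:
        system += (
            " Use the following pieces of retrieved context to answer the question."
            " If you don't know the answer, say that you don't know.\n\n"
            + "\n".join(context_chunks)
        )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

baseline = build_messages("What is EBITDA?")
grounded = build_messages("What is EBITDA?", ["Chunk one.", "Chunk two."])
print(baseline[0]["content"])
print(grounded[0]["content"])
```

The returned list has the same shape as the `messages` argument used in the cells above, so either variant can be passed to `client.chat.completions.create`.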
