# LLM + embeddings + RAG

In [1]:
!pip install nomic

Collecting nomic
  Downloading nomic-3.4.1.tar.gz (49 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 49.5/49.5 kB 2.0 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting click (from nomic)
  Downloading click-8.1.8-py3-none-any.whl.metadata (2.3 kB)
Collecting jsonlines (from nomic)
  Downloading jsonlines-4.0.0-py3-none-any.whl.metadata (1.6 kB)
Collecting loguru (from nomic)
  Downloading loguru-0.7.3-py3-none-any.whl.metadata (22 kB)
Collecting rich (from nomic)
  Downloading rich-14.0.0-py3-none-any.whl.metadata (18 kB)
Collecting pydantic (from nomic)
  Downloading pydantic-2.11.3-py3-none-any.whl.metadata (65 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 65.2/65.2 kB 1.5 MB/s eta 0:00:00
Collecting pyarrow (from nomic)
  Downloading pyarrow

In [2]:
!pip install gpt4all

Collecting gpt4all
  Downloading gpt4all-2.8.2-py3-none-macosx_10_15_universal2.whl.metadata (4.8 kB)
Downloading gpt4all-2.8.2-py3-none-macosx_10_15_universal2.whl (6.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.6/6.6 MB 1.7 MB/s eta 0:00:00
Installing collected packages: gpt4all
Successfully installed gpt4all-2.8.2


## 1 Test the LLM

In [41]:
from gpt4all import GPT4All
model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf") # downloads / loads a 4.66GB LLM
with model.chat_session():
    print(model.generate("Who is Zhuodi?", max_tokens=200))

Zhuodi Huang, also known as Zhuodi, is a Chinese-American writer and humorist. He was born in 1992 in China and moved to the United States with his family at a young age.

Zhuodi gained popularity on social media platforms such as Twitter and TikTok for his humorous takes on cultural differences between East Asia (specifically China) and Western cultures, particularly those of the United States. His witty observations often revolve around food, language, customs, and societal norms that are unique to each culture.

His online presence has led him to become a sought-after speaker and writer, with articles published in prominent outlets like The New Yorker, The Atlantic, and Vox. Zhuodi's humor is known for being lighthearted yet insightful, offering readers a fresh perspective on the complexities of cultural exchange.

Would you like me to share some examples of his humorous takes or writings?


## 2 Test the LLM + nomic embeddings + RAG (Retrieval-Augmented Generation)

In [42]:
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
from nomic import embed
from gpt4all import GPT4All

model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")

data = [
    "Zhuodi is Zoe's legal name.",
    "Machine learning is a branch of artificial intelligence!",
    "Sun burns my skin."
]

# embed the documents with nomic
output = embed.text(
    texts=data,
    model='nomic-embed-text-v1.5',
    task_type='search_document'
)
embeddings = np.array(output['embeddings'])

# https://docs.nomic.ai/atlas/embeddings-and-retrieval/guides/pdf-rag-with-nomic-embed-multimodal#setting-up-retrieval
def retrieve(query: str, k: int = 3) -> list:
    """Return the k items in data most similar to the query, best match first."""
    # embed the query with the 'search_query' task type; the documents above
    # were embedded with 'search_document'
    query_output = embed.text(
        texts=[query],
        model='nomic-embed-text-v1.5',
        task_type='search_query'
    )
    query_embedding = np.array(query_output['embeddings'][0])

    # normalize the query embedding (optional: cosine similarity ignores vector length)
    query_embedding = query_embedding / np.linalg.norm(query_embedding)
    # cosine similarity between the query and every document embedding
    cos_sim = cosine_similarity([query_embedding], embeddings)[0]
    # indices sorted by similarity, in descending order
    idx_sorted_by_cosine_sim = np.argsort(cos_sim)[::-1]

    # return the top k most similar items
    sorted_data = [data[i] for i in idx_sorted_by_cosine_sim]
    return sorted_data[:k]

# generate text / answer with RAG
def answer_with_rag(query: str):
    # retrieve relevant contexts
    relevant_contexts = retrieve(query, k=2)
    context_text = "\n".join(relevant_contexts)
    
    prompt = f"""
    Answer the question according to the information:
    Information: {context_text}
    Question: {query}
    Answer:
    """
    
    # generate response
    with model.chat_session():
        response = model.generate(prompt, max_tokens=200)
        return response

user_query = "Who is Zhuodi?"
answer = answer_with_rag(user_query)
print(answer)

According to the given information, Zhuodi is Zoe's legal name.

So, answering your question:

Who is Zhuodi?

Answer: Zhuodi is Zoe!
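
The ranking step inside `retrieve` can be tried without downloading any model. Below is a minimal sketch that uses plain NumPy instead of scikit-learn. The helper name `top_k_by_cosine` and the 3-dimensional toy vectors are made up for illustration and are not real nomic embeddings.

```python
import numpy as np

def top_k_by_cosine(query_vec, doc_vecs, k=2):
    """Return the indices of the k rows of doc_vecs closest to query_vec, plus all scores."""
    query_vec = np.asarray(query_vec, dtype=float)
    doc_vecs = np.asarray(doc_vecs, dtype=float)
    # cosine similarity = dot product divided by the product of the vector norms
    sims = doc_vecs @ query_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec)
    )
    # argsort is ascending, so reverse it to put the best matches first
    order = np.argsort(sims)[::-1][:k]
    return order.tolist(), sims

# hypothetical toy "embeddings" (assumption: invented values, not model output)
toy_docs = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
])
toy_query = np.array([2.0, 0.1, 0.0])

top, sims = top_k_by_cosine(toy_query, toy_docs, k=2)
print(top)

# rescaling the query leaves the ranking unchanged
top_scaled, _ = top_k_by_cosine(toy_query * 10, toy_docs, k=2)
print(top_scaled == top)
```

Because the score divides by both norms, only the direction of each vector matters, so the order of the results does not depend on whether the query vector is normalized first.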
