An embedding represents words, concepts, and sentences as numbers so that computers can work with them. Embeddings are fundamental to LLMs, and they also underpin things like semantic search, Netflix recommendations, and Google Translate. I want to play around with embeddings a bit to see how they work and how they can be used.

## Using local LLMs
I'm more interested in using local LLMs than in using ChatGPT or Claude, mostly because I work with client data that can't be sent out into the ether. I use [Ollama](https://ollama.com/), free software that lets you run large language models like Llama, Phi, Mistral, and Gemma on your local machine.

After installing Ollama, open the terminal and type:

```` {.command-line .wrap}
$ ollama pull llama3.1
$ ollama run llama3.1
>>> in one sentence, what is the meaning of life?
The meaning of life is a subjective and often debated concept, but it can be distilled to finding purpose, happiness, and fulfillment through personal growth, relationships, and contributions that bring value to oneself and others.
````

Seems about right. We now have an LLM running on our machine.

## Embedding
When embedding language into a representation machines understand, we turn the words into vectors. 

For example, if we were using two dimensional vectors, we could visualise them in a 2D graph, shown below:

![](vectors.svg){fig-align="center"}

Even though "arrow" and "sparrow" are spelled similarly and sound similar, their 2D vectors are further apart than those of "sparrow" and "eagle". This closeness is usually measured with **cosine similarity**: the cosine of the angle between two vectors, which is close to 1 when they point in the same direction and falls towards 0 (or below) as they diverge.
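
To make the measure concrete, here is a minimal sketch that computes cosine similarity by hand. The 2D coordinates are made up for illustration (they are not read off the figure), and `cosine_sim` and `toy_vectors` are names I've invented for this example.

```python
import math

def cosine_sim(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up 2D coordinates, for illustration only (an assumption, not real embeddings)
toy_vectors = {
    "sparrow": [0.9, 0.4],
    "eagle": [0.8, 0.6],
    "arrow": [0.1, 0.9],
}

bird_pair = cosine_sim(toy_vectors["sparrow"], toy_vectors["eagle"])
spelling_pair = cosine_sim(toy_vectors["sparrow"], toy_vectors["arrow"])
print(f"sparrow vs eagle: {bird_pair:.3f}")
print(f"sparrow vs arrow: {spelling_pair:.3f}")
```

With these toy numbers the two birds score higher than the two similarly spelled words, because their vectors point in nearly the same direction.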

We are not going to embed them in 2 dimensions, though; we are going to use 4,096.

## Embedding with Ollama and LangChain

The easiest way I could find to play around with embeddings in Ollama was [LangChain](https://www.langchain.com/), a toolkit meant to make application development with LLMs easier.

```python
from langchain_ollama import OllamaEmbeddings

embeddings = OllamaEmbeddings(
    model="llama3.1",
)
```

We are using Llama 3.1 because it has a version with 8 billion parameters that runs acceptably fast on my Mac.

Now, let's embed some words.

```python
sparrow_vector = embeddings.embed_query("sparrow")
arrow_vector = embeddings.embed_query("arrow")
eagle_vector = embeddings.embed_query("eagle")
```

Here is what the first 10 elements of the sparrow vector look like.

```python
sparrow_vector[:10]
```

```
[-0.0054991185,
 -0.026986344,
 0.022912376,
 0.014657578,
 0.009402687,
 0.0089849755,
 -0.016890066,
 0.0144533245,
 0.017101986,
 0.0026423668]
```

Next, we use scikit-learn's `cosine_similarity` to compare the vectors we've just created.

```python
from sklearn.metrics.pairwise import cosine_similarity

# Labels for the vectors, in the same order the vectors are passed below
the_words = ["sparrow", "arrow", "eagle"]
# Calculate cosine similarity between each pair
similarity_matrix = cosine_similarity([sparrow_vector, arrow_vector, eagle_vector])

def compare_words(words, matrix):
    """Print the similarity score for each unique pair of words."""
    for i in range(len(words)):
        for j in range(i + 1, len(words)):  # Start from i+1 to skip self-pairs and duplicates
            word1, word2 = words[i], words[j]
            similarity_score = matrix[i, j]
            print(f'Similarity: "{word1}" and "{word2}": {similarity_score:.4f}')

compare_words(the_words, similarity_matrix)
```

```
Similarity: "sparrow" and "arrow": 0.3617
Similarity: "sparrow" and "eagle": 0.7927
Similarity: "arrow" and "eagle": 0.3538
```


As expected, the two birds are similar while the arrow is not. We can take this even further.

```python
sentences = [
    "The man with the tie ran for office.",
    "The man with the tie ran for a bus.",
    "The woman in the dress became a politician."
]
# Embed each sentence, keeping the same order as the list above
vectors = [embeddings.embed_query(s) for s in sentences]
similarity_matrix_v2 = cosine_similarity(vectors)
# Short labels stand in for the full sentences in the printed output
compare_words(["Man runs for office", "Man runs for a bus", "Woman becomes politician"], similarity_matrix_v2)
```

```
Similarity: "Man runs for office" and "Man runs for a bus": 0.8682
Similarity: "Man runs for office" and "Woman becomes politician": 0.9074
Similarity: "Man runs for a bus" and "Woman becomes politician": 0.8527
```


Again, the man running for office is closer to the woman who became a politician than to the man running for a bus, although the margin is small (0.9074 versus 0.8682). Using our embeddings to find sentences with similar meanings works!
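
The same idea extends to a very small semantic search: score every candidate against a query vector and sort. The sketch below is only an illustration under stated assumptions. `rank_by_similarity` is a hypothetical helper, and the 3D vectors are invented stand-ins so the example runs without Ollama. In practice you would pass in the lists returned by `embeddings.embed_query(...)`.

```python
import math

def cosine_sim(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vector, candidates):
    """Return (label, score) pairs sorted from most to least similar.

    `candidates` maps a label to its vector.
    """
    scored = [(label, cosine_sim(query_vector, vec)) for label, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Invented 3D stand-ins for real embeddings (an assumption for illustration)
toy_candidates = {
    "ran for a bus": [0.9, 0.1, 0.2],
    "became a politician": [0.2, 0.9, 0.3],
}
toy_query = [0.3, 0.8, 0.4]  # stand-in for "ran for office"

ranking = rank_by_similarity(toy_query, toy_candidates)
for label, score in ranking:
    print(f"{label}: {score:.4f}")
```

Sorting by score is fine for a handful of sentences. For large collections, a vector database or approximate nearest-neighbour index is the usual choice.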

:::{.callout-note}
I usually get ChatGPT to write code for me, but for getting local LLMs to do embeddings through Ollama, it was useless. I had to use Google (gasp) and find the LangChain documentation on [Ollama embeddings](https://python.langchain.com/docs/integrations/text_embedding/ollama/).
:::