# Embedding Techniques: Using Ollama for Local Embeddings

This Jupyter Notebook focuses on **Ollama**, a powerful tool that allows you to run large language models (LLMs) and embedding models locally. This is particularly useful for building Retrieval-Augmented Generation (RAG) applications without relying on cloud-based APIs, enhancing privacy and reducing costs.

Ollama's support for embedding models makes it a versatile choice for combining text prompts with existing documents or other data for retrieval tasks.

## What You Will Learn:

1.  **Introduction to Ollama Embeddings**: How to leverage Ollama for generating embeddings locally.
2.  **Setting up Ollama (Prerequisites)**: Instructions for installing Ollama and pulling necessary models.
3.  **Generating Document Embeddings**: Creating vector representations for multiple text documents.
4.  **Generating Query Embeddings**: Obtaining a vector for a search query.
5.  **Exploring Different Ollama Embedding Models**: How to use other powerful embedding models available through Ollama.

## Key Concepts:

* **Ollama**: An open-source tool for running LLMs and embedding models locally on your machine.
* **Local Embeddings**: Generating text embeddings on your own hardware, providing more control and privacy.
* **Retrieval-Augmented Generation (RAG)**: A common use case for embeddings, in which they are used to find relevant information in a knowledge base so that it can augment an LLM's response.
* **`OllamaEmbeddings` (LangChain Integration)**: LangChain's class for interacting with Ollama's embedding capabilities.
* **`embed_documents`**: Method to embed a list of texts (e.g., your document chunks).
* **`embed_query`**: Method to embed a single query text for retrieval.
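* **Embedding Models**: Models that convert text into numerical vectors. Ollama offers dedicated embedding models (e.g., `mxbai-embed-large`) and can also return embeddings from general-purpose LLMs such as `gemma:2b` and `llama3`, which are the models used in this notebook.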

## Prerequisites:
Before running the code in this notebook, ensure you have:
1.  **Ollama Installed**: Download and install Ollama from [ollama.com](https://ollama.com/).
2.  **Required Models Pulled**: Open your terminal or command prompt and run the following commands to pull the models used in this notebook:
    ```bash
    ollama pull gemma:2b
    ollama pull llama3
    ```
    These commands download both models to your local Ollama instance. Both are general-purpose LLMs; this notebook uses them to produce embeddings.
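Before running the cells, it can help to confirm that the Ollama server is reachable and that both models are present. The sketch below is one way to do that, assuming the server listens on the default `http://localhost:11434` and that its `/api/tags` endpoint returns a JSON object with a `models` list whose entries carry a `name` field. The helper names (`installed_model_names`, `missing_models`, `check_ollama`) are made up for this example.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed default address; adjust if yours differs
REQUIRED_MODELS = ["gemma:2b", "llama3"]  # the models pulled in the prerequisites


def installed_model_names(tags_payload: dict) -> set:
    """Extract model names from a parsed /api/tags response."""
    return {entry["name"] for entry in tags_payload.get("models", [])}


def missing_models(installed: set, required: list) -> list:
    """Return the required models that do not appear in `installed`.

    A pull without a tag is listed as '<name>:latest', so 'llama3'
    is treated as satisfied by 'llama3:latest'.
    """
    missing = []
    for name in required:
        candidates = {name} if ":" in name else {name, f"{name}:latest"}
        if not candidates & installed:
            missing.append(name)
    return missing


def check_ollama(base_url: str = OLLAMA_URL) -> None:
    """Print whether the server is reachable and which models still need pulling."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
            payload = json.load(resp)
    except (OSError, ValueError) as exc:  # connection problems or an unexpected body
        print(f"Could not query Ollama at {base_url}: {exc}")
        return
    missing = missing_models(installed_model_names(payload), REQUIRED_MODELS)
    if missing:
        print("Missing models, run `ollama pull <name>` for:", ", ".join(missing))
    else:
        print("Ollama is running and all required models are available.")
```

Calling `check_ollama()` in a notebook cell prints either a connection error, the list of models that still need `ollama pull`, or a confirmation.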

### Ollama
Ollama supports embedding models, making it possible to build retrieval augmented generation (RAG) applications that combine text prompts with existing documents or other data.

In [1]:
# Import OllamaEmbeddings from LangChain's community embeddings module
from langchain_community.embeddings import OllamaEmbeddings

In [2]:
# Initialize the OllamaEmbeddings model.
# Name the model explicitly: if `model` is omitted, the class falls back to
# its own default ('llama2'), which may not be pulled locally.
# Ensure 'gemma:2b' is pulled locally via `ollama pull gemma:2b`
embeddings = OllamaEmbeddings(model="gemma:2b")



In [3]:
# Display the initialized embeddings object.
# This shows its configuration (server URL, model name, instruction prefixes);
# it does not by itself verify that the Ollama server is running.
embeddings

OllamaEmbeddings(base_url='http://localhost:11434', model='gemma:2b', embed_instruction='passage: ', query_instruction='query: ', mirostat=None, mirostat_eta=None, mirostat_tau=None, num_ctx=None, num_gpu=None, num_thread=None, repeat_last_n=None, repeat_penalty=None, temperature=None, stop=None, tfs_z=None, top_k=None, top_p=None, show_progress=False, headers=None, model_kwargs=None)

### Generating Embeddings for Multiple Documents
We can embed a list of documents in a single call. `embed_documents` returns one vector per input string, in the same order, which is convenient when processing the chunks of a knowledge base.

In [4]:
# Embed a list of document strings.
# The 'embed_documents' method takes a list of strings and returns a list of embedding vectors.
r1 = embeddings.embed_documents(
    [
        "Alpha is the first letter of Greek alphabet",
        "Beta is the second letter of Greek alphabet",
    ]
)

# Print the length (dimensionality) of the first embedding vector.
# This tells us how many numerical values represent each text.
print(f"Length of the first embedding vector: {len(r1[0])}")

# Print the second embedding vector.
# This shows the numerical representation of "Beta is the second letter of Greek alphabet".
print(f"Second embedding vector: {r1[1]}")

Length of the first embedding vector: 2048
Second embedding vector: [-2.368281126022339, -0.8418557643890381, -0.2629905641078949, 2.588660955429077, -0.015446571633219719, 0.8151750564575195, -0.39831769466400146, -0.4982002377510071, ...]
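Because `embed_documents` returns vectors in input order, each vector can be stored next to the text it came from. The sketch below shows one possible way to build such a list and check that every vector has the same length. `build_index` and `toy_embed_documents` are hypothetical names; the toy embedder is a stand-in that lets the example run without an Ollama server and produces values with no semantic meaning.

```python
from typing import Callable, Dict, List, Sequence


def build_index(
    texts: Sequence[str],
    embed_documents: Callable[[List[str]], List[List[float]]],
) -> List[Dict[str, object]]:
    """Pair each text with its embedding and check the vectors are consistent."""
    vectors = embed_documents(list(texts))
    if len(vectors) != len(texts):
        raise ValueError("expected one vector per input text")
    if len({len(v) for v in vectors}) > 1:
        raise ValueError("all vectors should have the same dimensionality")
    return [{"text": t, "vector": v} for t, v in zip(texts, vectors)]


def toy_embed_documents(texts: List[str]) -> List[List[float]]:
    """Stand-in for embeddings.embed_documents: three arbitrary numbers per text."""
    return [
        [float(len(t)), float(t.count(" ")), float(sum(map(ord, t)) % 97)]
        for t in texts
    ]


docs = [
    "Alpha is the first letter of Greek alphabet",
    "Beta is the second letter of Greek alphabet",
]
index = build_index(docs, toy_embed_documents)
print(len(index), "records,", len(index[0]["vector"]), "values per vector")
```

In the notebook itself, passing `embeddings.embed_documents` in place of `toy_embed_documents` would store the real vectors instead.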

### Generating Embedding for a Query
When performing a retrieval operation, your search query also needs to be embedded into the same vector space as your documents, which means using the same embedding model for both.

In [6]:
# Embed a single query string.
# The 'embed_query' method takes a single string and returns its embedding vector.
query_embedding = embeddings.embed_query("What is the second letter of Greek alphabet?")

# Print the embedding vector for the query.
print(f"Query embedding: {query_embedding}")

# Print the length of the query embedding to confirm its dimensionality matches document embeddings.
print(f"Length of query embedding: {len(query_embedding)}")

Query embedding: [-2.741905927658081, -0.4098430573940277, -2.141615867614746, -0.2556373178958893, -0.7570009827613831, -0.03953747823834419, -1.4404830932617188, -1.1047275066375732, ...]
Length of query embedding: 2048
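The notebook stops at producing the vectors. A common next step, though not the only one, is to score each document against the query with cosine similarity and sort by score. The sketch below uses small made-up vectors so it runs without Ollama; `cosine_similarity` and `rank_documents` are illustrative helper names, not part of LangChain.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must come from the same model (same length)")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs, best match first."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Made-up 2-dimensional vectors standing in for real embeddings.
toy_query = [1.0, 0.0]
toy_docs = [[0.0, 1.0], [2.0, 0.1]]
ranking = rank_documents(toy_query, toy_docs)
print(ranking[0][0])  # index of the best-matching toy document
```

With the notebook's variables, `rank_documents(query_embedding, r1)` would apply the same scoring to the two Greek-alphabet documents.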

### Using Other Ollama Embedding Models

Ollama supports a variety of embedding models, each with different strengths and embedding dimensions. You can explore more at [ollama.com/blog/embedding-models](https://ollama.com/blog/embedding-models). Dedicated embedding models such as `mxbai-embed-large` can be used in the same way once pulled; the cell below uses `llama3`, which was pulled in the prerequisites.

**Remember to pull this model locally first:** `ollama pull llama3`

In [10]:
# Initialize OllamaEmbeddings with a different model: "llama3".
# llama3 is a general-purpose LLM; Ollama can also return embeddings from it.
# Ensure 'llama3' is pulled locally via `ollama pull llama3`
embeddings = OllamaEmbeddings(model="llama3")

# Define a sample text for embedding with the new model.
text = "This is a test document."

# Embed the text using the 'llama3' model.
query_result = embeddings.embed_query(text)

# Print the resulting embedding vector.
print(f"Embedding vector (llama3): {query_result}")

# Print the dimensionality of the embedding vector for 'llama3'.
# In this run it is 4096, compared with 2048 for 'gemma:2b' above.
print(f"Length of embedding vector (llama3): {len(query_result)}")

Embedding vector (llama3): [-0.7096767425537109, -5.377601146697998, 0.9107652902603149, -1.0854154825210571, 1.256837248802185, -3.561976671218872, -0.5734334588050842, -1.192711591720581, ...]
Length of embedding vector (llama3): 4096

In [11]:
len(query_result)

4096
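The two runs in this notebook returned 2048 values per text for `gemma:2b` and 4096 for `llama3`, so the choice of model directly affects how much space a vector index needs. The sketch below is a rough estimate only. It assumes each value is stored as a 4-byte float and ignores any overhead from the storage layer, and `index_size_bytes` is a hypothetical helper.

```python
def index_size_bytes(num_vectors: int, dimensions: int, bytes_per_value: int = 4) -> int:
    """Raw storage for `num_vectors` embeddings, assuming fixed-width values."""
    return num_vectors * dimensions * bytes_per_value


# Dimensions observed in this notebook's outputs.
observed_dimensions = {"gemma:2b": 2048, "llama3": 4096}

for model_name, dims in observed_dimensions.items():
    size_mib = index_size_bytes(10_000, dims) / 2**20
    print(f"{model_name}: about {size_mib:.0f} MiB for 10,000 vectors")
```

Vectors from different models also cannot be compared with each other, so switching models means re-embedding every stored document.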