### Embeddings
Embeddings are numerical vector representations of text that capture its meaning and context. <br> They let a computer compare texts by meaning: texts with similar meanings produce similar vectors.

Think of embeddings as turning words or sentences into coordinates in a high-dimensional space (e.g. 768-dimensional),<br> where closer vectors mean more similar meaning.
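
To make "closer vectors mean more similar meaning" concrete, here is a minimal sketch using cosine similarity, one common way to compare embeddings. The three vectors are made up for illustration and are not output from any real model; real embeddings have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up 3-dimensional "embeddings" (assumption: illustrative values only).
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

sim_close = cosine_similarity(cat, kitten)
sim_far = cosine_similarity(cat, invoice)
print(f"cat vs kitten:  {sim_close:.3f}")   # close to 1: similar direction
print(f"cat vs invoice: {sim_far:.3f}")     # close to 0: unrelated direction
```

A score near 1 means the vectors point in almost the same direction, while a score near 0 means they are nearly orthogonal.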

### Ollama embeddings
Ollama supports embedding models, making it possible to build retrieval-augmented generation (RAG) applications that combine text prompts with existing documents or other data.

### Load and split

In [24]:
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

docs = TextLoader("embedded_techqniues.txt").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=150, chunk_overlap=20)
chunks = splitter.split_documents(docs)
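
To show what `chunk_size=150` and `chunk_overlap=20` mean, the sketch below cuts a string into fixed-size character windows that share 20 characters with their neighbour. This is a simplified illustration, not the library's algorithm: `RecursiveCharacterTextSplitter` also tries to break on paragraph, line, and word boundaries, so its chunks are usually shorter than the limit. The function name `sliding_window_chunks` and the sample text are hypothetical.

```python
def sliding_window_chunks(text, chunk_size=150, chunk_overlap=20):
    """Split text into fixed-size character windows that overlap.

    Simplified illustration only: unlike RecursiveCharacterTextSplitter,
    it ignores paragraph, line, and word boundaries.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    windows = []
    for start in range(0, len(text), step):
        windows.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return windows

# 400 placeholder characters standing in for a document.
sample = "".join(chr(ord("a") + i % 26) for i in range(400))
demo_chunks = sliding_window_chunks(sample)
for i, chunk in enumerate(demo_chunks):
    print(i, len(chunk))

# The last 20 characters of one chunk reappear at the start of the next.
print(demo_chunks[0][-20:] == demo_chunks[1][:20])
```

The overlap keeps a sentence that straddles a boundary from being lost entirely to either chunk.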

### Embed the chunks
What these vectors mean:<br>
Each chunk is converted to a vector (e.g. a 768-dimensional list of floats; the exact size depends on the embedding model).<br>
The vectors capture the semantic meaning of each chunk.<br>
Similar chunks have vectors that are close together in vector space.


In [None]:
from langchain_community.embeddings import OllamaEmbeddings

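# gemma:2b is a general-purpose LLM; a dedicated embedding model
# (see "Other embedding models" below) may give better retrieval results.
embedding_model = OllamaEmbeddings(model="gemma:2b")
# One vector (a list of floats) per chunk
vectors = embedding_model.embed_documents([doc.page_content for doc in chunks])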
vectors

### Similarity Search
You can now search for "How does LangChain work?" and retrieve the chunks whose embeddings are closest to the query.

In [29]:
from langchain_community.vectorstores import FAISS

vector_store = FAISS.from_documents(chunks, embedding_model)
results = vector_store.similarity_search("How does LangChain work?", k=3)
print(results[0].page_content)
len(results)

LangChain continues to evolve as new models and tools emerge, making it a powerful toolkit for anyone working in the LLM application space.


3

In [30]:
# Reuse vector_store from the previous cell; the index does not need rebuilding.
results = vector_store.similarity_search("Mount Everest", k=3)
for i, r in enumerate(results):
    print(f"\n--- Result {i+1} ---\n{r.page_content}")


--- Result 1 ---
Mount Everest is the tallest mountain

--- Result 2 ---
developers to chain together various components such as prompt templates, document loaders, vector stores, and retrieval mechanisms.

--- Result 3 ---
- Customizable memory, tool usage, and agent behavior.
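
The "Mount Everest" query returns one relevant chunk and two unrelated ones because a top-k search always returns the k nearest chunks, however far away they are. A minimal sketch of that behaviour, assuming Euclidean distance and brute-force comparison: the function `top_k`, the toy 2-dimensional vectors, and the `max_distance` cutoff value are all made up for illustration.

```python
import math

def top_k(query_vec, doc_vecs, k=3, max_distance=None):
    """Return (index, distance) pairs for the k nearest vectors, closest first.

    If max_distance is given, drop matches farther away than that cutoff.
    """
    scored = [(i, math.dist(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    nearest = scored[:k]
    if max_distance is not None:
        nearest = [(i, d) for i, d in nearest if d <= max_distance]
    return nearest

# Made-up 2-dimensional vectors standing in for chunk embeddings.
toy_vecs = [
    [0.9, 0.1],  # chunk 0: about mountains
    [0.1, 0.9],  # chunk 1: about software
    [0.2, 0.8],  # chunk 2: about software
]
toy_query = [1.0, 0.0]  # a query about mountains

unfiltered = top_k(toy_query, toy_vecs, k=3)
filtered = top_k(toy_query, toy_vecs, k=3, max_distance=0.5)
print(unfiltered)  # always k results, relevant or not
print(filtered)    # only matches within the cutoff
```

With a real vector store, `similarity_search_with_score` returns each document together with its score, which can be inspected the same way to decide on a cutoff.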


### Other embedding models
The Ollama blog lists models built specifically for embeddings: https://ollama.com/blog/embedding-models