### Embedding Techniques
Converting text into embeddings.

In [1]:
from dotenv import load_dotenv
load_dotenv()

True

In [2]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings
embeddings = GoogleGenerativeAIEmbeddings(model="models/gemini-embedding-exp-03-07")
embeddings

GoogleGenerativeAIEmbeddings(client=<google.ai.generativelanguage_v1beta.services.generative_service.client.GenerativeServiceClient object at 0x739022ae6de0>, model='models/gemini-embedding-exp-03-07', task_type=None, google_api_key=SecretStr('**********'), credentials=None, client_options=None, transport=None, request_options=None)

In [3]:
text = "The quick brown fox jumps over the lazy dog."
embeddings.embed_query(text)

[-0.013846100308001041,
 0.016324089840054512,
 -0.0018839917611330748,
 -0.04919512942433357,
 0.009635014459490776,
 0.011142290197312832,
 -0.0064404308795928955,
 -0.0022762261796742678,
 0.014301184564828873,
 0.021092839539051056,
 0.005054806359112263,
 -0.00621303403750062,
 -0.027760490775108337,
 -0.012257987633347511,
 0.12779031693935394,
 -0.034758150577545166,
 -0.00571020320057869,
 -0.014043462462723255,
 -0.003353928914293647,
 -0.031228605657815933,
 0.002126370556652546,
 0.0141511345282197,
 -0.00010113617463503033,
 -0.017075276002287865,
 -0.005789401941001415,
 -0.012568271718919277,
 0.013391894288361073,
 -0.010408706963062286,
 0.01630803942680359,
 0.009302783757448196,
 0.00327321863733232,
 0.015893133357167244,
 0.012875708751380444,
 0.0026258656289428473,
 0.007797658443450928,
 -0.002875681035220623,
 0.005835418123751879,
 0.010193485766649246,
 0.0017339298501610756,
 0.009309710003435612,
 -0.005652696825563908,
 -0.012275195680558681,
 -0.0067368978

In [4]:
from langchain_google_genai import GoogleGenerativeAI

llm = GoogleGenerativeAI(model = "gemini-2.0-flash")
response = llm.invoke("Hola, Como Estas")
print(response)

¡Hola! Estoy bien, gracias por preguntar. ¿Y tú, cómo estás?


- `ChatGoogleGenerativeAI` returns a message object (an `AIMessage` whose text is in `.content`), which suits chatbots that track conversation history.
- `GoogleGenerativeAI` returns the response as a plain string, which suits direct, single-turn text generation.

In [5]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.messages import HumanMessage, AIMessage

llm = ChatGoogleGenerativeAI(model = "gemini-2.0-flash")

history = [HumanMessage(content="Hola, Como Estas")]
first_response = llm.invoke(history)

history.append(AIMessage(content=first_response.content))
history.append(HumanMessage(content="What was my first question?"))

message = llm.invoke(history)
print(message.content)

Your first question was "Hola, Como Estas"


In [6]:
from langchain_community.document_loaders import TextLoader

loader = TextLoader("speech.txt")


In [7]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

textsplitter = RecursiveCharacterTextSplitter(chunk_size = 500, chunk_overlap = 50)
final_documents = textsplitter.split_documents(loader.load())
final_documents

[Document(metadata={'source': 'speech.txt'}, page_content='Good morning everyone,'),
 Document(metadata={'source': 'speech.txt'}, page_content='Thank you all for being here today. It’s an honor to stand before you and share a few thoughts about the future of technology and innovation. Over the past decade, we’ve witnessed remarkable changes. Artificial Intelligence is no longer a distant dream—it’s transforming our lives in real time. From personalized recommendations to autonomous vehicles, AI is everywhere.\nBut with great power comes great responsibility. (Sí, como dijo el Tío Ben — "With great power comes great responsibility.")'),
 Document(metadata={'source': 'speech.txt'}, page_content='We must ensure that our innovations are ethical, inclusive, and sustainable. Let’s use technology not just to build smarter machines, but to create a better world for everyone.'),
 Document(metadata={'source': 'speech.txt'}, page_content='Together, with passion and purpose, we can shape a brighte
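
The effect of `chunk_size` and `chunk_overlap` can be illustrated with a simplified sketch. This is not how `RecursiveCharacterTextSplitter` works internally (it splits on separators such as paragraph and line breaks first); the hypothetical `chunk_text` helper below only uses fixed-size character windows to show how consecutive chunks share text.

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper for illustration only; it ignores separators.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = chunk_text(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk starts with the last three characters of the previous one, so a sentence cut at a chunk boundary still appears intact in at least one chunk when the overlap is large enough.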

In [9]:
from langchain_community.vectorstores import Chroma

db = Chroma.from_documents(final_documents, embeddings)
db

<langchain_community.vectorstores.chroma.Chroma at 0x738fdc35fcb0>

In [17]:
query = "Technology has become an integral part of our everyday lives. From smart devices to intelligent assistants, we are surrounded by innovations that enhance convenience and productivity. However, as we move forward, it’s important to design systems that benefit all of humanity. Ethical considerations, data privacy, and inclusiveness must be at the heart of every breakthrough. The future isn't just about faster machines — it's about smarter decisions."
retrieved_response = db.similarity_search(query)
print(retrieved_response[0].page_content)

We must ensure that our innovations are ethical, inclusive, and sustainable. Let’s use technology not just to build smarter machines, but to create a better world for everyone.
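
Conceptually, a similarity search embeds the query and returns the stored chunks whose vectors are closest to it. The sketch below shows that idea with brute-force cosine similarity over toy 3-dimensional vectors. The metric and index that Chroma actually uses are not shown in this notebook, so `cosine_similarity` and `top_k` are illustrative helpers, not Chroma's implementation.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], doc_vecs: list[list[float]], k: int = 1) -> list[int]:
    """Return the indices of the k document vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), i) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


# Toy vectors standing in for real embeddings.
doc_vecs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
]
query_vec = [0.9, 0.1, 0.0]

best = top_k(query_vec, doc_vecs, k=2)
print(best)  # [0, 2]
```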


## Ollama

In [1]:
from langchain_ollama import OllamaEmbeddings

In [2]:
embeddings = OllamaEmbeddings(model="gemma:2b")

In [3]:
embeddings

OllamaEmbeddings(model='gemma:2b', base_url=None, client_kwargs={}, async_client_kwargs={}, sync_client_kwargs={}, mirostat=None, mirostat_eta=None, mirostat_tau=None, num_ctx=None, num_gpu=None, keep_alive=None, num_thread=None, repeat_last_n=None, repeat_penalty=None, temperature=None, stop=None, tfs_z=None, top_k=None, top_p=None)

In [4]:
r1 = embeddings.embed_documents(
    [
        "Alpha is the first letter of the Greek alphabet.",
        "Beta is the second letter of the Greek alphabet.",
        "Gamma is the third letter of the Greek alphabet.",
    ]
)

r1

[[-0.031693187,
  -0.018111488,
  -0.008113414,
  0.013009349,
  0.0040150243,
  0.009474773,
  -0.012357328,
  0.0038102418,
  -0.014287226,
  -0.008132841,
  0.015432558,
  0.012275529,
  0.020941772,
  0.0055126455,
  -0.015035088,
  -0.0028032907,
  0.11915471,
  -0.0057956656,
  -0.0037201752,
  -0.018149404,
  0.0003325461,
  -0.00045814217,
  0.004979599,
  0.002803938,
  -0.028005183,
  -0.021382222,
  0.009474901,
  0.002324021,
  0.004327283,
  -0.025715055,
  -0.027889771,
  0.023694783,
  -0.004694938,
  0.013867592,
  -0.021188708,
  0.010472415,
  -0.0054469206,
  -0.0006073264,
  -0.0056324014,
  -0.016632514,
  0.008869673,
  -0.008067814,
  0.004336068,
  -0.0090454705,
  -0.0062782955,
  -0.0050888243,
  -0.010118669,
  -0.010455603,
  -0.009930856,
  -0.005206376,
  -0.24084732,
  -0.26258317,
  -0.012337161,
  0.009186785,
  -0.002754317,
  -0.0024428798,
  -0.031090016,
  0.01562951,
  -0.0014764634,
  -0.009140482,
  -0.0006723853,
  0.013945418,
  -0.016028719,
 

In [5]:
r1[0]

[-0.031693187,
 -0.018111488,
 -0.008113414,
 0.013009349,
 0.0040150243,
 0.009474773,
 -0.012357328,
 0.0038102418,
 -0.014287226,
 -0.008132841,
 0.015432558,
 0.012275529,
 0.020941772,
 0.0055126455,
 -0.015035088,
 -0.0028032907,
 0.11915471,
 -0.0057956656,
 -0.0037201752,
 -0.018149404,
 0.0003325461,
 -0.00045814217,
 0.004979599,
 0.002803938,
 -0.028005183,
 -0.021382222,
 0.009474901,
 0.002324021,
 0.004327283,
 -0.025715055,
 -0.027889771,
 0.023694783,
 -0.004694938,
 0.013867592,
 -0.021188708,
 0.010472415,
 -0.0054469206,
 -0.0006073264,
 -0.0056324014,
 -0.016632514,
 0.008869673,
 -0.008067814,
 0.004336068,
 -0.0090454705,
 -0.0062782955,
 -0.0050888243,
 -0.010118669,
 -0.010455603,
 -0.009930856,
 -0.005206376,
 -0.24084732,
 -0.26258317,
 -0.012337161,
 0.009186785,
 -0.002754317,
 -0.0024428798,
 -0.031090016,
 0.01562951,
 -0.0014764634,
 -0.009140482,
 -0.0006723853,
 0.013945418,
 -0.016028719,
 -0.007958728,
 -0.032850854,
 -0.0026104653,
 -0.0030115098,
 -

In [6]:
embeddings.embed_query("What is the first letter of the Greek alphabet?")

[-0.043838747,
 0.0018128542,
 -0.021582406,
 0.008804858,
 -0.00859326,
 0.0036322407,
 -0.019257573,
 0.00068630086,
 -0.003023709,
 -0.01767674,
 0.0013818479,
 0.008645776,
 0.009941112,
 0.009035704,
 -0.02115961,
 0.003810581,
 0.11235438,
 -0.032018986,
 -0.012030506,
 -0.007933738,
 0.009514628,
 -0.014304362,
 0.034034334,
 -0.0027163464,
 -0.02927371,
 -0.0026851716,
 -0.006687573,
 -0.015495842,
 -0.0027863702,
 -0.009197699,
 -0.009950085,
 0.02004832,
 0.022912,
 0.015200971,
 -0.014199729,
 0.00034572728,
 -0.018422324,
 0.005402434,
 0.008423737,
 0.015095861,
 -0.0036620481,
 0.02879682,
 -0.027956419,
 -0.03782756,
 -0.024398275,
 0.0010966345,
 0.026606845,
 0.0106247235,
 -0.012218376,
 0.019471364,
 -0.14986673,
 -0.093702026,
 -0.0020616718,
 0.03399903,
 -0.0026215918,
 -0.015566694,
 -0.017236654,
 0.008660747,
 0.0029903466,
 -0.013281023,
 -0.022553565,
 0.01726858,
 -0.020908667,
 0.0016145523,
 -0.063541785,
 -0.001930371,
 -0.003832014,
 -0.0061762645,
 0.00
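
A quick way to inspect vectors like the ones above is to compute their L2 norm. Whether a given model returns unit-length vectors is not shown in this notebook, so the sketch below uses a toy vector and hypothetical helpers `l2_norm` and `normalize`. For unit-length vectors, cosine similarity reduces to a plain dot product.

```python
import math


def l2_norm(vector: list[float]) -> float:
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in vector))


def normalize(vector: list[float]) -> list[float]:
    """Scale a vector to unit length."""
    norm = l2_norm(vector)
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


vec = [3.0, 4.0]          # toy vector, not a real embedding
unit = normalize(vec)

print(l2_norm(vec))       # 5.0
print(l2_norm(unit))      # approximately 1.0
```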

### Other Embedding models

In [None]:
# !ollama run mxbai-embed-large:335m

pulling manifest

In [13]:
embeddings = OllamaEmbeddings(model="mxbai-embed-large:335m")
text =  "Alpha is the first letter of the Greek alphabet"
query_result = embeddings.embed_query(text)
query_result

[0.011147285,
 0.009991051,
 0.038230386,
 -0.010137957,
 -0.07900512,
 -0.008085214,
 0.033673298,
 -0.005651113,
 -0.024464462,
 0.022896897,
 0.014534159,
 0.016254775,
 -0.013956931,
 0.03114964,
 -0.056940984,
 -0.0011663206,
 -0.04495646,
 -0.014329747,
 -0.04134198,
 -0.014936148,
 0.042844724,
 0.0027167394,
 -0.047798812,
 -0.025085308,
 -0.0044460823,
 0.04654652,
 0.018243501,
 0.00772536,
 0.047800407,
 0.04735449,
 -0.0072356914,
 0.0053535276,
 0.0558162,
 -0.06938301,
 -0.0026456446,
 0.018783115,
 0.038494546,
 -0.031034483,
 -0.032771766,
 -0.05631676,
 0.05014859,
 -0.010784003,
 0.0071722465,
 -0.0081650745,
 -0.052930575,
 0.035684094,
 -0.0058512148,
 -0.029905945,
 -0.036124535,
 -0.012254079,
 -0.0050042956,
 -0.02573168,
 0.05321252,
 -0.014066589,
 -0.058885016,
 -0.006158439,
 0.0007793406,
 -0.029994769,
 -0.035890896,
 0.033721182,
 0.011437192,
 -0.009053563,
 0.01643756,
 -0.029630587,
 -0.013114105,
 0.0077606575,
 0.001983044,
 0.021115871,
 0.010408758,

In [14]:
len(query_result)

1024

# Embedding Using HuggingFace

In [2]:
import os
from dotenv import load_dotenv
load_dotenv()

HUGGINGFACE_API_KEY = os.getenv("HUGGINGFACE_API_KEY")

#### Sentence Transformers on Hugging Face

Hugging Face sentence-transformers is a Python framework for state-of-the-art sentence, text, and image embeddings. The `HuggingFaceEmbeddings` class uses one of its embedding models. We have also added an alias, `SentenceTransformerEmbeddings`, for users who are more familiar with using that package directly.

In [3]:
from langchain_huggingface import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings(
    model_name = "all-MiniLM-L6-v2"
) 




In [4]:
text = "this is text document"
result = embeddings.embed_query(text)
result

[-0.03798231855034828,
 0.13750401139259338,
 -0.02907208353281021,
 0.0246743131428957,
 0.02032499760389328,
 0.015891078859567642,
 0.03281731158494949,
 0.057405199855566025,
 0.07043181359767914,
 0.018121639266610146,
 0.03220419958233833,
 0.08738584071397781,
 0.0018578292801976204,
 -0.007599346339702606,
 -0.0839097797870636,
 0.018029581755399704,
 -0.011931544169783592,
 -0.043120622634887695,
 -0.004539924673736095,
 0.029833834618330002,
 0.053476009517908096,
 0.12267288565635681,
 0.020740391686558723,
 -0.0004214741929899901,
 -0.0007334337569773197,
 0.11296288669109344,
 -0.09424044191837311,
 0.026861777529120445,
 0.05638423189520836,
 -0.019715890288352966,
 -0.009732157923281193,
 0.05525549128651619,
 0.13107256591320038,
 0.0340469665825367,
 0.023509535938501358,
 0.010479920543730259,
 -0.001881239702925086,
 0.004373219795525074,
 0.01156526431441307,
 0.03669370710849762,
 -0.04740484058856964,
 -0.10123660415410995,
 -0.024273626506328583,
 0.0185706224292

In [5]:
len(result)

384
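
The two lengths reported in this notebook (1024 for `mxbai-embed-large:335m`, 384 for `all-MiniLM-L6-v2`) mean vectors from the two models cannot be compared with each other, so a vector store should be queried with the same model that built it. The sketch below uses a hypothetical `TinyVectorIndex` class (not part of LangChain or Chroma) to show a dimension check of that kind.

```python
class TinyVectorIndex:
    """Hypothetical in-memory index that enforces a single embedding dimension."""

    def __init__(self) -> None:
        self.dim: int | None = None
        self.vectors: list[list[float]] = []

    def add(self, vector: list[float]) -> None:
        if self.dim is None:
            self.dim = len(vector)  # the first vector fixes the dimension
        elif len(vector) != self.dim:
            raise ValueError(f"expected dimension {self.dim}, got {len(vector)}")
        self.vectors.append(list(vector))


index = TinyVectorIndex()
index.add([0.0] * 1024)       # same length as the mxbai-embed-large vector above

try:
    index.add([0.0] * 384)    # same length as the all-MiniLM-L6-v2 vector above
    mismatch_rejected = False
except ValueError as exc:
    mismatch_rejected = True
    print(exc)                # expected dimension 1024, got 384
```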