https://osanseviero.github.io/hackerllama/blog/posts/sentence_embeddings/ (ref)

In [2]:
!pip install sentence_transformers

Successfully installed nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-8.9.2.26 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.19.3 nvidia-nvjitlink-cu12-12.4.99 nvidia-nvtx-cu12-12.1.105 sentence_transformers-2.6.0


In [8]:
from sentence_transformers import SentenceTransformer, util

In [15]:
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

In [5]:
sentences = ["The weather today is beautiful", "It's raining!", "Dogs are awesome"]
embeddings = model.encode(sentences)
embeddings.shape

(3, 384)

In [12]:
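# Embed a query sentence and compare it with each sentence embedded above
first_embedding = model.encode('today is sunny day')
for embedding, sentence in zip(embeddings, sentences):
  # Cosine similarity lies in [-1, 1]; a higher score means the sentences are closer in meaning
  cos_score = util.pytorch_cos_sim(first_embedding, embedding)
  print(f"score:{cos_score} ({sentence})")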

score:tensor([[0.7190]]) (The weather today is beautiful)
score:tensor([[0.3898]]) (It's raining!)
score:tensor([[0.1043]]) (Dogs are awesome)
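
The scores above are cosine similarities between the query embedding and each sentence embedding. As a minimal sketch of what that measure computes, the snippet below works on toy 3-dimensional vectors instead of real 384-dimensional embeddings. The function name `cosine_similarity` and the vectors are made up for illustration; they are not part of `sentence_transformers`.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors (illustrative only, not model output)
query = [1.0, 0.0, 1.0]
same_direction = [2.0, 0.0, 2.0]  # parallel to query, different length
orthogonal = [0.0, 1.0, 0.0]      # shares no component with query

print(cosine_similarity(query, same_direction))  # close to 1: same direction
print(cosine_similarity(query, orthogonal))      # 0: unrelated directions
```

Because the measure divides by both vector lengths, only direction matters, which is why the parallel vector scores close to 1 even though it is twice as long.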


In [10]:
faq = {
    "How do I get a replacement Medicare card?": "If your Medicare card was lost, stolen, or destroyed, you can request a replacement online at Medicare.gov.",
    "How do I sign up for Medicare?": "If you already get Social Security benefits, you do not need to sign up for Medicare. We will automatically enroll you in Original Medicare (Part A and Part B) when you become eligible. We will mail you the information a few months before you become eligible.",
    "What are Medicare late enrollment penalties?": "In most cases, if you don’t sign up for Medicare when you’re first eligible, you may have to pay a higher monthly premium. Find more information at https://faq.ssa.gov/en-us/Topic/article/KA-02995",
    "Will my Medicare premiums be higher because of my higher income?": "Some people with higher income may pay a larger percentage of their monthly Medicare Part B and prescription drug costs based on their income. We call the additional amount the income-related monthly adjustment amount.",
    "What is Medicare and who can get it?": "Medicare is a health insurance program for people age 65 or older. Some younger people are eligible for Medicare including people with disabilities, permanent kidney failure and amyotrophic lateral sclerosis (Lou Gehrig’s disease or ALS). Medicare helps with the cost of health care, but it does not cover all medical expenses or the cost of most long-term care.",
}

In [11]:
# Embed only the FAQ questions (the dict keys); answers are looked up later by index
corpus_embeddings = model.encode(list(faq.keys()))
print(corpus_embeddings.shape)

(5, 384)


In [13]:
user_question = "Do I need to pay more after a raise?"
user_question_embedding = model.encode(user_question)

In [14]:
# Rank all 5 FAQ questions by similarity to the user's question (cosine similarity by default)
similarities = util.semantic_search(user_question_embedding, corpus_embeddings, top_k=5)
similarities

[[{'corpus_id': 3, 'score': 0.4642062783241272},
  {'corpus_id': 4, 'score': 0.11628524214029312},
  {'corpus_id': 2, 'score': 0.09916316717863083},
  {'corpus_id': 1, 'score': 0.09463591873645782},
  {'corpus_id': 0, 'score': 0.07962210476398468}]]
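
The result above is a list of hits per query, sorted from best to worst score. A rough sketch of that ranking step in plain Python follows, assuming cosine similarity as the score. The helper names `cosine_similarity` and `top_k_matches` and the 2-dimensional toy vectors are hypothetical; this is not how `util.semantic_search` is implemented internally, only an illustration of the idea.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_matches(query_vec, corpus_vecs, top_k=5):
    """Score every corpus vector against the query and keep the best top_k."""
    scored = [
        {"corpus_id": i, "score": cosine_similarity(query_vec, vec)}
        for i, vec in enumerate(corpus_vecs)
    ]
    # Highest similarity first, mirroring the ordering of the output above
    scored.sort(key=lambda hit: hit["score"], reverse=True)
    return scored[:top_k]

# Toy data (illustrative only, not model output)
toy_corpus = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_query = [1.0, 0.1]

hits = top_k_matches(toy_query, toy_corpus, top_k=2)
print([hit["corpus_id"] for hit in hits])  # [0, 2]
```

Each hit keeps its `corpus_id`, so the position in the original corpus list can be used to look up the matching question and answer.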

In [19]:
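# Build the lookup lists once; corpus_id indexes into the order the questions were embedded in
questions = list(faq.keys())
answers = list(faq.values())
for result in similarities[0]:
  corpus_id = result['corpus_id']
  question = questions[corpus_id]
  answer = answers[corpus_id]
  score = result['score']
  print(f"score:{score}\nquestion:{question}\nanswer:{answer}\n")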

score:0.4642062783241272
question:Will my Medicare premiums be higher because of my higher income?
answer:Some people with higher income may pay a larger percentage of their monthly Medicare Part B and prescription drug costs based on their income. We call the additional amount the income-related monthly adjustment amount.

score:0.11628524214029312
question:What is Medicare and who can get it?
answer:Medicare is a health insurance program for people age 65 or older. Some younger people are eligible for Medicare including people with disabilities, permanent kidney failure and amyotrophic lateral sclerosis (Lou Gehrig’s disease or ALS). Medicare helps with the cost of health care, but it does not cover all medical expenses or the cost of most long-term care.

score:0.09916316717863083
question:What are Medicare late enrollment penalties?
answer:In most cases, if you don’t sign up for Medicare when you’re first eligible, you may have to pay a higher monthly premium. Find more information