# LLM-Driven Search in Python

Check out the code on GitHub: [github.com/AWeirdScratcher/large](https://github.com/AWeirdScratcher/large).

If you have a GPU (a T4, for example), please use it. Computing the embeddings takes a long time without one.

Pro tip: run this outside the usual US daily routine to get those sweet off-peak perks.

<br />

<details>
  <summary>...After you read the post</summary>

Yup, I have no idea what I was saying in the article because I was just having fun with `ctransformers` and embeddings, then thought it'd be fun to make this Medium post...

I still have no idea what *cosine similarity* is in the way "math people" would understand it. But hey, we've got the concept explained with balls (or marbles, yeah).

</details>

In [1]:
%%capture
!pip install ctransformers ctransformers[cuda]

## Generating Text

It'll take a while to download the model (if it's the first time running in this environment).

Ask whatever you want.

Note that it's uncensored, so be appropriate, guys.

In [2]:
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Mistral-7B-Instruct-v0.2-GGUF",
    model_type="llama",
    gpu_layers=50,  # layers to offload to the GPU; use 0 to stay on the CPU
)
print(llm("Hello, I'm"))




 having trouble getting my new S3 to sync with my Google account. I have a Gmail account and I've set up the account on my phone, but when I go to Settings > Accounts > Add account > Google, it says "This device is already associated with this account."

I can still see the 3 gmail accounts that were previously synced on my previous Samsung S4.

How do I get my new S3 to sync with my primary Google account? Thanks!

The solution here seems simple: Remove your existing accounts from the phone and add them again, this time entering your full Gmail address (user@gmail.com) and password. The reason for the problem is that the accounts are already associated with your Samsung account, which is tied to your old device. When you add a new device, it will not recognize the existing accounts and instead display the "already associated" message.

To remove the existing accounts from your phone:
1. Go to Settings > Accounts and backup > Backup my data (tap on it).
2. Tap on Backup access informat

### Bonus: Chat Template & Streaming

The chat template looks like this:

```markdown
<s> [INST] <text> [/INST]
```

The model responds after the `[/INST]` block and keeps generating until it emits the `</s>` token.
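
If you'd rather not type the template by hand every time, here's a minimal sketch of a helper that wraps a message in it. `build_prompt` is a made-up name for illustration (it is not part of `ctransformers`), and it only covers a single user message, matching the template shown above.

```python
def build_prompt(text: str) -> str:
    """Wrap one user message in the [INST] template from this section."""
    # Trailing space mirrors the prompt used in the streaming cell.
    return f"<s> [INST] {text.strip()} [/INST] "


prompt = build_prompt("  Do you like cheese?  ")
print(prompt)
```

The returned string can be passed straight to `llm(...)` in place of a hand-written prompt.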

`ctransformers` also supports streaming, so you can watch the text arrive chunk by chunk and get a feel for how fast the model runs.

In [3]:
response = llm("""<s> [INST] Do you like cheese? [/INST] """, stream=True)

for chunk in response:
  print(chunk, end="", flush=True)

Yes, I do! However, I'm an artificial intelligence and don't have the ability to eat or have preferences. But if I were a human, I would enjoy cheese because of its delicious taste and various health benefits.

## Embeddings

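Embeddings are just an alias for "dayum, lots of numbers": a long list of floats that stands for what a piece of text means, so texts about similar things end up with similar numbers.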

In [4]:
# generate an embedding
llm.embed("Who doesn't love salad?")[:10] # peek at the first 10 elements

[4.361037254333496,
 -3.859196662902832,
 2.4968769550323486,
 -6.219364643096924,
 2.3137431144714355,
 0.06731189787387848,
 -0.6585761904716492,
 0.8220970034599304,
 0.005442762281745672,
 2.353245735168457]

In [5]:
import requests

raw = requests.get("https://raw.githubusercontent.com/AWeirdScratcher/large/main/llm-driven-search/news.txt").text

articles_embedding = {}
articles = {}

for article in raw.strip().split('\n\n'): # strip() avoids an empty trailing chunk
  headline, content = article.split(': ', 1) # split only once
  content = content.rstrip()
  articles_embedding[headline] = llm.embed(content)
  articles[headline] = content # for later use

articles_embedding['Tech Giant Unveils New Smartphone'][:10] # peek at the first 10 elements

[3.9915895462036133,
 -0.1599380075931549,
 -1.5899579524993896,
 -4.616470813751221,
 3.012430191040039,
 2.807080030441284,
 -7.527080535888672,
 4.350589275360107,
 1.337077260017395,
 -2.4558181762695312]
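
The parsing part of that cell is easy to miss, so here's a small sketch of it on its own, with no model involved. It assumes the same layout as the cell above (`Headline: content`, articles separated by a blank line). The `sample` text is made up for illustration and is not the real `news.txt`.

```python
# Hypothetical two-article sample in the "Headline: content" layout.
sample = (
    "First Headline: Body of the first article.\n\n"
    "Second Headline: Body with a colon: still one article.\n"
)

parsed = {}
for block in sample.strip().split("\n\n"):
    # maxsplit=1 keeps any later ": " inside the content
    headline, content = block.split(": ", 1)
    parsed[headline] = content.rstrip()

print(parsed)
```

Splitting only once matters: without the `1`, any article whose body contains `": "` would break the two-name unpacking.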

In [6]:
query = "local bakery has won a national award" # try changing it to 'bread'
query_embedding = llm.embed(query)

query_embedding[:10] # peek at the first 10 elements

[7.658850193023682,
 -0.482549250125885,
 -1.4958086013793945,
 -6.7047038078308105,
 4.035302639007568,
 -14.987481117248535,
 5.255115032196045,
 -1.1684120893478394,
 -0.49455752968788147,
 2.9078805446624756]

## Cosine Similarities

C'mon, I spent two days drawing those pictures (and asking ChatGPT how to explain cosine similarities to a 10-year-old).
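
If pictures aren't your thing, here's a rough sketch of the same idea in plain Python, using tiny made-up vectors instead of real embeddings. The function name `cosine_similarity` is just for illustration. It divides the dot product of two vectors by the product of their lengths, which is the standard definition.

```python
import math


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors, not real embeddings.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
perpendicular = cosine_similarity([1.0, 0.0], [0.0, 1.0])
opposite = cosine_similarity([1.0, 2.0], [-1.0, -2.0])

print(same_direction, perpendicular, opposite)
```

Vectors pointing the same way score close to 1, perpendicular ones score 0, and opposite ones score close to -1. Length doesn't matter, only direction.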

In [7]:
import torch

def cos_sim(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
  """Pairwise cosine similarity between the rows of a and the rows of b."""
  # promote 1-D vectors to 1 x N matrices
  if a.dim() == 1:
    a = a.unsqueeze(0)

  if b.dim() == 1:
    b = b.unsqueeze(0)

  # scale every row to unit length, so the dot products are the cosines
  a_norm = torch.nn.functional.normalize(a, p=2, dim=1)
  b_norm = torch.nn.functional.normalize(b, p=2, dim=1)
  return torch.mm(a_norm, b_norm.transpose(0, 1))

In [8]:
em_tensor_query = torch.tensor(query_embedding)

sim_scores = {
  title: cos_sim(
    em_tensor_query,
    torch.tensor(embed)
  )
  for title, embed in articles_embedding.items()
}

sim_scores

{'Tech Giant Unveils New Smartphone': tensor([[0.1618]]),
 'Local Bakery Wins National Award': tensor([[0.3040]]),
 'Scientists Discover New Species': tensor([[0.2108]]),
 'New Study Highlights Benefits of Exercise': tensor([[0.1611]]),
 'City Plans New Public Park': tensor([[0.1682]])}

In [9]:
results = sorted(sim_scores.items(), key=lambda pair: pair[1].item(), reverse=True) # sort by the plain float score

results

[('Local Bakery Wins National Award', tensor([[0.3040]])),
 ('Scientists Discover New Species', tensor([[0.2108]])),
 ('City Plans New Public Park', tensor([[0.1682]])),
 ('Tech Giant Unveils New Smartphone', tensor([[0.1618]])),
 ('New Study Highlights Benefits of Exercise', tensor([[0.1611]]))]
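
The score-then-sort steps can also be bundled into one helper. This is only a sketch under a few assumptions: it uses plain Python lists and made-up 3-number "embeddings" rather than the real ones from `llm.embed`, and `top_k`, `toy_docs`, and `toy_query` are hypothetical names.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def top_k(query_vec, doc_vecs, k=3):
    """Return the k (title, score) pairs most similar to query_vec, best first."""
    scored = [(title, cosine_similarity(query_vec, vec)) for title, vec in doc_vecs.items()]
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Toy data standing in for real embeddings.
toy_docs = {
    "bakery": [0.9, 0.1, 0.0],
    "smartphone": [0.0, 0.2, 0.9],
    "park": [0.3, 0.9, 0.1],
}
toy_query = [1.0, 0.2, 0.0]

ranking = top_k(toy_query, toy_docs, k=2)
for title, score in ranking:
    print(title, f"{score:.3f}")
```

`heapq.nlargest` avoids sorting the whole collection when you only want a handful of hits, which starts to matter once there are more than five articles.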

In [10]:
for (title, value) in results:
  percentage = value.item() * 100 # the score is a 1x1 tensor

  print(title, f"({percentage:.1f}%)")
  print(articles[title])
  print()

Local Bakery Wins National Award (30.4%)
The Sweet Treats Bakery, a local favorite, has won a national award for its delicious pastries. The bakery, which has been in operation for over a decade, was recognized for its innovative flavor combinations and commitment to using high-quality ingredients.

Scientists Discover New Species (21.1%)
A team of scientists from the University of ABC has discovered a new species of frog in the Amazon rainforest. The frog, which has a unique blue coloration, was found during a recent research expedition. The discovery highlights the rich biodiversity of the Amazon and the importance of conservation efforts.

City Plans New Public Park (16.8%)
The city council has approved plans for a new public park in the downtown area. The park, which will feature walking trails, picnic areas, and a playground, is expected to open next summer. The city hopes the park will provide a new recreational space for residents and help to revitalize the downtown area.

Tech 