# Alzabo Recommender

1. User intent + Uqbar docs -> Recommender⭐ => Recommendations
1. User intent + Recommendations -> Planner => Plan
1. Plan -> Alzabo => a real action in userspace (sign txn, send message, etc)

⭐ You are here.

The Recommender maps user input (such as "I want to message ~dev and ~rus") to the API capabilities most likely to satisfy it. These recommendations are then passed to the Planner.

## Recommender

Given some string of user intent ("I want to order a burger on the blockchain"), search our embeddings for the nearest neighbor API capabilities.
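
As a rough illustration of what "nearest neighbor" means here, the sketch below ranks a few candidates by cosine distance to a query vector. The three-dimensional vectors and the helper names (`cosine_distance`, `nearest_indices`) are invented for this example; real embeddings come from the embedding model and have far more dimensions.

```python
import math
from typing import List


def cosine_distance(a: List[float], b: List[float]) -> float:
    """Return 1 - cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest_indices(query: List[float], candidates: List[List[float]]) -> List[int]:
    """Return candidate indices sorted from nearest to farthest."""
    distances = [cosine_distance(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: distances[i])


# Toy vectors (assumption: made up for illustration, not real embeddings)
toy_capabilities = {
    "send message": [1.0, 0.0, 0.0],
    "start call": [0.0, 1.0, 0.0],
    "create group chat": [0.7, 0.7, 0.0],
}
toy_query = [0.9, 0.1, 0.0]

names = list(toy_capabilities)
ranking = nearest_indices(toy_query, [toy_capabilities[n] for n in names])
ranked_names = [names[i] for i in ranking]
print(ranked_names)
```

With these toy values the query points mostly along the first axis, so "send message" ranks first and "start call" last.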

### 1. Imports

In [2]:
import json
import os
import re
import requests
from datetime import datetime
from pathlib import Path
import pandas as pd
import pickle
from typing import List

from openai.embeddings_utils import (
    get_embedding,
    distances_from_embeddings,
    tsne_components_from_embeddings,
    chart_from_components,
    indices_of_nearest_neighbors_from_distances,
)

# constants
EMBEDDING_MODEL = "text-embedding-ada-002"

### 2. Load data

In [3]:
# load data
dataset_path = "data/capabilities.csv"
df = pd.read_csv(dataset_path)

# print dataframe
n_examples = 5
df.head(n_examples)

| Unnamed: 0 | app | name | description |
|---|---|---|---|
| 0 | Pongo | Send message to user | Send a direct message to another ship |
| 1 | Pongo | React to message from user | Add a reaction to a direct message from anothe... |
| 2 | Pongo | Call user | Start a one to one video call with another ship |
| 3 | Pongo | Create group chat | Create a group chat with one or more other shi... |
| 4 | Pongo | Invite user to group chat | Invite a user to an existing group chat |


In [4]:
# print the app, name, and description of each example
for idx, row in df.head(n_examples).iterrows():
    print("")
    print(f"App: {row['app']}")
    print(f"Name: {row['name']}")
    print(f"Description: {row['description']}")


App: Pongo
Name: Send message to user
Description: Send a direct message to another ship

App: Pongo
Name: React to message from user
Description: Add a reaction to a direct message from another ship

App: Pongo
Name: Call user
Description: Start a one to one video call with another ship

App: Pongo
Name: Create group chat
Description: Create a group chat with one or more other ships. The chat can belong to an organization and it may be open or closed to other users.

App: Pongo
Name: Invite user to group chat
Description: Invite a user to an existing group chat


### 3. Build cache to save embeddings
Before requesting embeddings, let's set up a cache to save the ones we generate. Saving embeddings lets you reuse them later; otherwise you pay for an API call every time you recompute them.

The cache is a dictionary that maps tuples of `(text, model)` to an embedding, which is a list of floats. The cache is saved as a Python pickle file.
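
The idea can be sketched without touching the API. In the example below, `fake_embed` is a stand-in for the real embedding call, and `cached_embedding` and the temporary file path are hypothetical names chosen for illustration; only the `(text, model)` key shape and the pickle format are taken from the description above.

```python
import os
import pickle
import tempfile
from typing import Callable, Dict, List, Tuple

CacheKey = Tuple[str, str]  # (text, model)


def cached_embedding(
    text: str,
    model: str,
    cache: Dict[CacheKey, List[float]],
    embed_fn: Callable[[str, str], List[float]],
    cache_path: str,
) -> List[float]:
    """Return the cached embedding, computing and persisting it on a miss."""
    key = (text, model)
    if key not in cache:
        cache[key] = embed_fn(text, model)
        with open(cache_path, "wb") as f:
            pickle.dump(cache, f)
    return cache[key]


calls: List[str] = []


def fake_embed(text: str, model: str) -> List[float]:
    """Stand-in for a paid API call (assumption: not a real embedding)."""
    calls.append(text)
    return [float(len(text)), 0.0]


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "cache.pkl")
    cache: Dict[CacheKey, List[float]] = {}
    first = cached_embedding("hello", "toy-model", cache, fake_embed, path)
    second = cached_embedding("hello", "toy-model", cache, fake_embed, path)
    # reload from disk to confirm the pickle round-trips the tuple keys
    with open(path, "rb") as f:
        reloaded = pickle.load(f)

print(len(calls))
```

The second lookup is served from the dictionary, so the stand-in embedder is called only once.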

In [8]:
# establish a cache of embeddings to avoid recomputing
# cache is a dict of tuples (text, model) -> embedding, saved as a pickle file

# set path to embedding cache
embedding_cache_path = "data/recommendations_embeddings_cache.pkl"

# load the cache if it exists, and save a copy to disk
try:
    embedding_cache = pd.read_pickle(embedding_cache_path)
except FileNotFoundError:
    embedding_cache = {}
with open(embedding_cache_path, "wb") as embedding_cache_file:
    pickle.dump(embedding_cache, embedding_cache_file)
with open(embedding_cache_path+".plain.text", "w") as ecp:
    ecp.write(json.dumps({str(key): value for key, value in embedding_cache.items()}))
    
# define a function to retrieve embeddings from the cache if present, and otherwise request via the API
def embedding_from_string(
    string: str,
    model: str = EMBEDDING_MODEL,
    embedding_cache=embedding_cache,
    save_embedding=True
) -> list:
    """Return embedding of given string, using a cache to avoid recomputing."""
    if save_embedding:
        if (string, model) not in embedding_cache.keys():
            embedding_cache[(string, model)] = get_embedding(string, model)
            with open(embedding_cache_path, "wb") as embedding_cache_file:
                pickle.dump(embedding_cache, embedding_cache_file)
            with open(embedding_cache_path+".plain.text", "w") as ecp:
                ecp.write(json.dumps({str(key): value for key, value in embedding_cache.items()}))
        return embedding_cache[(string, model)]
    else:
        return get_embedding(string, model)

Let's check that it works by getting an embedding.

Note that by passing in `save_embedding=False` to `embedding_from_string`, you can query the dataset without expanding the cache. This is useful when a user wants to get recommendations without putting their search term in the dataset.

In [5]:
# as an example, take the first description from the dataset
example_string = df["description"].values[0]
print(f"\nExample string: {example_string}")

# print the first 10 dimensions of the embedding
example_embedding = embedding_from_string(example_string)
print(f"\nExample embedding: {example_embedding[:10]}...")

# get an embedding for an unsaved string (e.g. a user query) and confirm the cache size is unchanged
print(f"\nLength of embeddings cache before getting unsaved embeddings: {len(embedding_cache)}")
unsaved_embedding = embedding_from_string("can you believe they let just anyone spend money on gpus", save_embedding=False)
print(f"\nUnsaved embedding: {unsaved_embedding[:10]}...")
print(f"\nLength of embeddings cache after getting unsaved embeddings: {len(embedding_cache)}")


Example string: Send a direct message to another ship

Example embedding: [-0.02452314831316471, -0.007993806153535843, -0.024113548919558525, -0.002607896691188216, -0.007055690512061119, 0.012988284230232239, -0.012254967354238033, 0.010114477016031742, -0.00027396128280088305, -0.005476748570799828]...

Length of embeddings cache before getting unsaved embeddings: 22

Unsaved embedding: [0.00706616323441267, -0.021757779642939568, -0.02269902639091015, -0.07104358822107315, -0.023462936282157898, -0.0011219921289011836, -0.022235224023461342, -0.0009139631874859333, -0.01369580626487732, -0.020066266879439354]...

Length of embeddings cache after getting unsaved embeddings: 22


### 4. Recommend similar capabilities based on embeddings
To find similar capabilities, let's follow a three-step plan:

1. Get the embeddings of all the capability descriptions
1. Calculate the distance between a source description and all other descriptions
1. Print out the capabilities closest to the source description
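
The third step, picking the closest entries while skipping the source itself, can be sketched in isolation. The helper `top_k_neighbors` is a hypothetical name and the distances below are invented for illustration; the sketch returns the matches instead of printing them, which may be more convenient when handing results to a later stage.

```python
from typing import List, Tuple


def top_k_neighbors(
    query_string: str,
    strings: List[str],
    distances: List[float],
    k: int = 1,
) -> List[Tuple[str, float]]:
    """Return up to k (string, distance) pairs, nearest first, skipping exact matches."""
    order = sorted(range(len(strings)), key=lambda i: distances[i])
    results: List[Tuple[str, float]] = []
    for i in order:
        if strings[i] == query_string:
            continue  # the source string is trivially its own nearest neighbor
        if len(results) >= k:
            break
        results.append((strings[i], distances[i]))
    return results


# Assumption: made-up distances, with index 0 acting as the source string
example_strings = [
    "Send a direct message to another ship",
    "Call user",
    "Invite a user to an existing group chat",
]
example_distances = [0.0, 0.3, 0.2]

neighbors = top_k_neighbors(example_strings[0], example_strings, example_distances, k=2)
print(neighbors)
```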


In [6]:
def print_recommendations_from_strings(
    strings: List[str],
    index_of_source_string: int,
    k_nearest_neighbors: int = 1,
    model=EMBEDDING_MODEL,
) -> List[int]:
    """Print out the k nearest neighbors of a given string."""
    # get embeddings for all strings
    embeddings = [embedding_from_string(string, model=model) for string in strings]
    # get the embedding of the source string
    query_embedding = embeddings[index_of_source_string]
    # get distances between the source embedding and other embeddings (function from embeddings_utils.py)
    distances = distances_from_embeddings(query_embedding, embeddings, distance_metric="cosine")
    # get indices of nearest neighbors (function from embeddings_utils.py)
    indices_of_nearest_neighbors = indices_of_nearest_neighbors_from_distances(distances)

    # print out source string
    query_string = strings[index_of_source_string]
    print(f"Source string: {query_string}")
    # print out its k nearest neighbors
    k_counter = 0
    for i in indices_of_nearest_neighbors:
        # skip any strings that are identical matches to the starting string
        if query_string == strings[i]:
            continue
        # stop after printing out k neighbors
        if k_counter >= k_nearest_neighbors:
            break
        k_counter += 1

        # print out the similar strings and their distances
        print(
            f"""
        --- Recommendation #{k_counter} (nearest neighbor {k_counter} of {k_nearest_neighbors}) ---
        String: {strings[i]}
        Distance: {distances[i]:0.3f}"""
        )

    return indices_of_nearest_neighbors

def query_recommendations_by_string(
    query_string: str, 
    strings: List[str],
    k_nearest_neighbors: int = 1,
    model=EMBEDDING_MODEL,
) -> List[int]:
    """
    !!! WARNING: SPENDS REAL MONEY !!!
    Print out the k nearest neighbors of an external query string.
    """
    embeddings = [embedding_from_string(s, model=model) for s in strings]
    # embed the query with the same model as the dataset, without saving it to the cache
    query_embedding = embedding_from_string(query_string, model=model, save_embedding=False)
    distances = distances_from_embeddings(query_embedding, embeddings, distance_metric="cosine")
    indices_of_nearest_neighbors = indices_of_nearest_neighbors_from_distances(distances)
    # print out source string
    print(f"Source string: {query_string}")
    # print out its k nearest neighbors
    k_counter = 0
    for i in indices_of_nearest_neighbors:
        # skip any strings that are identical matches to the starting string
        if query_string == strings[i]:
            continue
        # stop after printing out k neighbors
        if k_counter >= k_nearest_neighbors:
            break
        k_counter += 1

        # print out the similar strings and their distances
        print(
            f"""
        --- Recommendation #{k_counter} (nearest neighbor {k_counter} of {k_nearest_neighbors}) ---
        String: {strings[i]}
        Distance: {distances[i]:0.3f}"""
        )

    return indices_of_nearest_neighbors

### 5. Example recommendations

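`print_recommendations_from_strings` finds neighbors of an entry that is already in the list, so it takes that entry's index. `query_recommendations_by_string` takes a free-form query string instead, which is what we use for user intents below.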

In [7]:
capability_descriptions = df["description"].tolist()

# print("Pongo capabilities:")
# pongo_capabilities = print_recommendations_from_strings(
#     strings=capability_descriptions,  # let's base similarity off of the capability description
#     index_of_source_string=0,  # let's look at capabilities similar to the first one about Pongo
#     k_nearest_neighbors=5,  # let's look at the 5 most similar capabilities
# )

send_a_msg_intent = "I want to send a message to ~rabsef"
send_a_msg_recs = query_recommendations_by_string(query_string=send_a_msg_intent,k_nearest_neighbors=5,strings=capability_descriptions)

new_pokur_game_intent = "invite ~tex, ~rus, and ~fyr to a new pokur game"
new_pokur_game_recs = query_recommendations_by_string(query_string=new_pokur_game_intent,k_nearest_neighbors=5,strings=capability_descriptions)

call_new_group_chat_intent = "call ~rus, ~fur, ~dev, and ~nec from a new group chat"
call_new_group_chat_recs = query_recommendations_by_string(query_string=call_new_group_chat_intent,k_nearest_neighbors=5,strings=capability_descriptions)


Source string: I want to send a message to ~rabsef

        --- Recommendation #1 (nearest neighbor 1 of 5) ---
        String: Send a direct message to another ship
        Distance: 0.211

        --- Recommendation #2 (nearest neighbor 2 of 5) ---
        String: Send a message to a Pokur table with two or more players
        Distance: 0.220

        --- Recommendation #3 (nearest neighbor 3 of 5) ---
        String: Start a one to one video call with another ship
        Distance: 0.239

        --- Recommendation #4 (nearest neighbor 4 of 5) ---
        String: Send a message to a group chat. You can send text by default. You can also change your nickname; send tokens; or send a link to an app
        Distance: 0.239

        --- Recommendation #5 (nearest neighbor 5 of 5) ---
        String: Start a video call with all members of group chat
        Distance: 0.246
Source string: invite ~tex, ~rus, and ~fyr to a new pokur game

        --- Recommendation #1 (nearest neighbor 1 of

### 6. Feeding recommendations into an LM 

Once we have recommendations, we'll give them to an LM along with a system prompt and retrieve a Plan.
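
Before spending anything, it can help to assemble and inspect the request payload offline. The sketch below builds the message list only and makes no API call. The helper name `build_plan_messages` and the placeholder system prompt are hypothetical, and sending the summary under the `user` role is an assumption of this sketch rather than something the notebook prescribes.

```python
from typing import Dict, List


def build_plan_messages(
    system_prompt: str,
    intent: str,
    recs: List[str],
) -> List[Dict[str, str]]:
    """Build chat messages from a system prompt, a user intent, and recommendations."""
    lines = [f"User intent: {intent}", "Recommendations:"]
    lines += [f"{i}. {rec}" for i, rec in enumerate(recs, start=1)]
    return [
        {"role": "system", "content": system_prompt},
        # assumption: the summary is sent as a user message
        {"role": "user", "content": "\n".join(lines) + "\n"},
    ]


messages = build_plan_messages(
    system_prompt="(contents of PROMPT.md would go here)",
    intent="I want to send a message to ~rabsef",
    recs=[
        "Send a direct message to another ship",
        "Start a one to one video call with another ship",
    ],
)
print(messages[1]["content"])
```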

#### Spend ⚠️REAL MONEY⚠️ to consult the thinking sand

In [8]:
# Note: you need to be using OpenAI Python v0.27.0 for the code below to work
import openai

with open("PROMPT.md", 'r') as prompt:
    prompt_text = prompt.read()

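def generate_recommendation_summary(intent: str, recs: List[str]) -> str:
    """Format the user intent and a numbered list of recommendations as prompt text."""
    out = f"User intent: {intent}\nRecommendations:\n"
    for i, rec in enumerate(recs, start=1):
        out += f"{i}. {rec}\n"
    return out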

send_a_msg_summary = generate_recommendation_summary(send_a_msg_intent, [capability_descriptions[i] for i in send_a_msg_recs][:5])

response = openai.ChatCompletion.create(
  model="gpt-4",
  messages=[
        {"role": "system", "content": prompt_text},
        {"role": "assistant", "content": send_a_msg_summary}
    ]
)

print(response['choices'][0]['message']['content'])

Here's an action plan to send a direct message to ~rabsef:

1. Create a DM (direct message) conversation with ~rabsef
2. Send a text message in the created conversation

```ts
[
  { 'make-conversation': {
      name: 'DM with ~rabsef',
      'conversation-metadata': {
        dm: {
          members: new Set(['~rabsef'])
        }
      }
    }
  },
  { 'send-message': {
      identifier: 1, // Mocked value
      'conversation-id': 1, // Mocked value from the previous API call response
      'message-kind': 'text',
      content: 'Hello ~rabsef!',
      reference: null,
      mentions: new Set(['~rabsef'])
    }
  }
]
```
