<a href="https://colab.research.google.com/github/aswinaus/AzureOpenAI/blob/main/Module_1a_Advanced_LLMs_semantic_cache_from_scratch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**If you use our code, please cite:**

@misc{2024<br>
  title = {Semantic Cache from Scratch},<br>
  author = {Hamza Farooq, Darshil Modi, Kanwal Mehreen, Nazila Shafiei},<br>
  keywords = {Semantic Cache},<br>
  year = {2024},<br>
  copyright = {APACHE 2.0 license}<br>
}

In [None]:
!pip install -U faiss-cpu sentence_transformers transformers

Collecting faiss-cpu
  Downloading faiss_cpu-1.9.0.post1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.4 kB)
Collecting sentence_transformers
  Downloading sentence_transformers-3.3.1-py3-none-any.whl.metadata (10 kB)
Collecting transformers
  Downloading transformers-4.47.0-py3-none-any.whl.metadata (43 kB)
Collecting tokenizers<0.22,>=0.21 (from transformers)
  Downloading tokenizers-0.21.0-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.7 kB)
Downloading faiss_cpu-1.9.0.post1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (27.5 MB)
Downloading sentence_transformers-3.3.1-py3-none-any.whl (268 kB)

In [None]:
import faiss
import sqlite3
from sentence_transformers import SentenceTransformer
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
import numpy as np
from pprint import pprint



# Traversaal Ares API Overview

Traversaal Ares API returns real-time search results for user queries. It uses Large Language Models (LLMs) with live internet access to produce factual answers together with relevant reference URLs, typically within 3-4 seconds. It is currently free during the beta phase, with paid plans coming soon.

## Key Features:
- **Real-time Search Results:** Ares API offers unparalleled speed in generating search results.
- **Internet Connectivity:** Connects to the internet to fetch the latest and most accurate information.
- **Lightning-Fast Response:** Delivers search results with URLs in 3-4 seconds.
- **Free Beta Access:** Free for the first 100 calls.
- **Factual and Accurate:** Answers are backed by relevant references, though the API can still make mistakes.

## Getting Started:
To access the Ares API, sign up at [api.traversaal.ai](https://api.traversaal.ai) and refer to the usage documentation at [docs.traversaal.ai](https://docs.traversaal.ai/docs/intro).

Experience the future of AI-driven search with Traversaal Ares API!


In [None]:
from google.colab import userdata
import requests

def make_prediction(data):
    url = "https://api-ares.traversaal.ai/live/predict"
    headers = {
        "x-api-key": userdata.get('ARES_KEY'),
        "content-type": "application/json"
    }

    payload = {"query": data}

    try:
        response = requests.post(url, json=payload, headers=headers)

        if response.status_code == 200:
            # The request was successful
            print("Request was successful.")
            # If the response contains JSON data, you can parse it using response.json()
            try:
                json_data = response.json()
                #print("Parsed JSON data:", json_data)
                return json_data
            except ValueError:
                print("No JSON data in the response.")
                return None
        else:
            # The request was not successful, handle the error
            print(f"Request failed with status code {response.status_code}.")
            return None
    except requests.exceptions.RequestException as e:
        print(f"Error during request: {e}")
        return None

# Example usage



In [None]:
response=make_prediction(['Events happening in London this week. '])

Request was successful.


In [None]:
response

{'data': {'response_text': 'Here are some events happening in London this week:\n\n1. **Covent Garden Christmas** - Experience the festive atmosphere with decorations and events in Covent Garden.\n   - **Link**: [TimeOut](https://www.timeout.com/london/things-to-do/things-to-do-in-london-this-week)\n\n2. **The Importance of Being Earnest** - A classic play by Oscar Wilde, showcasing wit and humor.\n   - **Link**: [TimeOut](https://www.timeout.com/london/things-to-do/things-to-do-in-london-this-week)\n\n3. **Aladdin** - A musical adaptation of the beloved Disney film, featuring vibrant performances and enchanting music.\n   - **Link**: [TimeOut](https://www.timeout.com/london/things-to-do/things-to-do-in-london-this-week)\n\n4. **Various Events on Eventbrite** - Browse through a variety of activities and interests to plan your perfect day out in London.\n   - **Link**: [Eventbrite](https://www.eventbrite.com/d/united-kingdom--london/events--this-week/)\n\n5. **Highgate International Cha

In [None]:
pprint(response['data']['response_text'])

('Here are some events happening in London this week (October 5-11, 2024):\n'
 '\n'
 "1. **Coriolanus** - A theatrical performance showcasing Shakespeare's "
 'classic tragedy.\n'
 '   - **Location**: Various theatres across London.\n'
 '\n'
 '2. **Monet and London** - An exhibition exploring the relationship between '
 'Claude Monet and the city.\n'
 '   - **Location**: National Gallery, Trafalgar Square, London WC2N 5DN.\n'
 '\n'
 '3. **Sesta** - A unique performance that combines various art forms.\n'
 '   - **Location**: Check local listings for specific venues.\n'
 '\n'
 '4. **The Outrun** - A theatrical adaptation based on the acclaimed memoir.\n'
 '   - **Location**: Various theatres across London.\n'
 '\n'
 'For more detailed listings and to explore additional events, you can visit:\n'
 '\n'
 '- [Time Out '
 'London](https://www.timeout.com/london/things-to-do/things-to-do-in-london-this-week)\n'
 '- [Eventbrite - Events this week in '
 'London](https://www.eventbrite.com/d/uni

In [None]:
response['data']['web_url']

['https://www.visitlondon.com/things-to-do/whats-on/special-events/london-events-calendar',
 'https://www.timeout.com/london',
 'https://www.eventbrite.com/d/united-kingdom--london/events--this-week/',
 'https://www.timeout.com/london/things-to-do/things-to-do-in-london-this-week',
 'https://www.eventbrite.com/d/united-kingdom--london/events--this-weekend/',
 'https://londonist.com/things-to-do-in-london-this-weekend',
 'https://londontheinside.com/whats-on-this-week/',
 'https://www.londontourism.ca/events',
 'https://www.conventionbureau.london/major-events/events-calendar',
 'https://www.londontourism.ca/events/events-this-week']

Instead of calling an LLM endpoint directly, we use the Ares API for both retrieval and generation. You can replace it with your own RAG function inside the `generate_answer` method.
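
As a rough illustration of what such a replacement might look like, the sketch below wraps a hypothetical `my_rag` function so that it returns the same `(raw_result, response_text)` pair the cache stores on a miss. Both `my_rag` and `generate_answer_with` are made-up names for this example, and the stub answer stands in for a real retrieval and generation pipeline.

```python
from typing import Callable, Tuple


def my_rag(question: str) -> str:
    """Hypothetical stand-in for your own retrieval + generation pipeline."""
    return f"(stub answer for: {question})"


def generate_answer_with(rag_fn: Callable[[str], str], question: str) -> Tuple[dict, str]:
    """Return (raw_result, response_text), the pair the cache stores on a miss."""
    response_text = rag_fn(question)
    # Mirror the layout of the Ares response so downstream code can stay unchanged
    result = {"data": {"response_text": response_text, "web_url": []}}
    return result, response_text


result, text = generate_answer_with(my_rag, "What is the capital of France?")
print(text)
```

Inside the caching class, the body of `generate_answer` could then call your own function in place of `make_prediction`, as long as it still returns both values.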

In [None]:
import faiss
import json
import numpy as np
from sentence_transformers import SentenceTransformer
from transformers import AutoTokenizer, AutoModelForCausalLM
import time

class SemanticCaching:
    def __init__(self, json_file='cache.json'):
        # Initialize a flat (exact) FAISS index; IndexFlatL2 reports squared Euclidean distances
        self.index = faiss.IndexFlatL2(768)  # 768 = embedding size of all-mpnet-base-v2
        if self.index.is_trained:
            print('Index trained')

        # Initialize Sentence Transformer model
        self.encoder = SentenceTransformer('all-mpnet-base-v2')


        # Uncomment the following lines to use DialoGPT for question generation
        # self.tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-large")
        # self.model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-large")

        # Maximum (squared) Euclidean distance for a cached answer to count as a match
        self.euclidean_threshold = 0.3
        self.json_file = json_file
        self.load_cache()

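    def load_cache(self):
        # Load cache from JSON file, creating an empty cache if the file is not found
        try:
            with open(self.json_file, 'r') as file:
                self.cache = json.load(file)
            # Re-add stored embeddings so FAISS row ids line up with the cache lists
            if self.cache['embeddings']:
                self.index.add(np.array(self.cache['embeddings'], dtype='float32'))
        except FileNotFoundError:
            self.cache = {'questions': [], 'embeddings': [], 'answers': [], 'response_text': []}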

    def save_cache(self):
        # Save the cache to the JSON file
        with open(self.json_file, 'w') as file:
            json.dump(self.cache, file)

    def ask(self, question: str) -> str:
        # Method to retrieve an answer from the cache or generate a new one
        start_time = time.time()
        try:
            embedding = self.encoder.encode([question])

            # Search for the nearest neighbor in the index
            D, I = self.index.search(embedding, 1)

            # I is -1 when the index is empty; D holds squared L2 distances (always >= 0)
            if I[0][0] != -1 and D[0][0] <= self.euclidean_threshold:
                row_id = int(I[0][0])
                # 1 - squared distance: a rough similarity indicator, not a cosine score
                print(f'Found cache in row: {row_id} with score {1 - D[0][0]}')
                elapsed_time = time.time() - start_time
                print(f"Time taken: {elapsed_time} seconds")
                return self.cache['response_text'][row_id]

            # Handle the case when there are not enough results or Euclidean distance is not met
            answer, response_text = self.generate_answer(question)

            self.cache['questions'].append(question)
            self.cache['embeddings'].append(embedding[0].tolist())
            self.cache['answers'].append(answer)
            self.cache['response_text'].append(response_text)

            self.index.add(embedding)
            self.save_cache()
            end_time = time.time()
            elapsed_time = end_time - start_time
            print(f"Time taken: {elapsed_time} seconds")

            return response_text
        except Exception as e:
            # Chain the original exception so the full traceback is preserved
            raise RuntimeError(f"Error during 'ask' method: {e}") from e

    def generate_answer(self, question: str) -> tuple:
        # Generate an answer via make_prediction; returns (raw_result, response_text)
        try:
            result = make_prediction([question])
            # make_prediction returns None on failure, which raises here and is re-raised below
            response_text = result['data']['response_text']

            return result, response_text
        except Exception as e:
            raise RuntimeError(f"Error during 'generate_answer' method: {e}") from e
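
The 0.3 threshold is easier to interpret once it is related to cosine similarity. The sketch below is a standalone illustration rather than part of the class: it assumes the encoder returns unit-length embeddings (worth verifying for the model you use) and relies on the fact that a flat L2 FAISS index reports squared distances. Under those assumptions, squared L2 distance and cosine similarity are linked by ||a - b||^2 = 2 - 2·cos(a, b). The helper names are hypothetical.

```python
import math


def squared_l2(a, b):
    """Squared Euclidean distance, the quantity a flat L2 index reports."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def threshold_to_cosine(sq_dist_threshold):
    """Cosine similarity equivalent to a squared-L2 threshold (unit vectors only)."""
    return 1.0 - sq_dist_threshold / 2.0


# Two toy unit vectors separated by an angle of 0.5 radians
a = (1.0, 0.0)
b = (math.cos(0.5), math.sin(0.5))

# Identity for unit vectors: ||a - b||^2 = 2 - 2 * cos(a, b)
print(squared_l2(a, b), 2 - 2 * cosine(a, b))

# Under the unit-length assumption, a squared-distance threshold of 0.3
# corresponds to requiring a cosine similarity of roughly 0.85 or more
print(threshold_to_cosine(0.3))
```

One consequence is that the `score` printed on a cache hit (`1 - D`) is not itself a cosine similarity; under the same assumption the cosine would be `1 - D / 2`.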


In [None]:
cache = SemanticCaching()



Index trained


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



In [None]:
question1 = "What is the capital of France?"
answer1 = cache.ask(question1)
print(answer1)

# Question not seen before, generates answer from LLM

question2 = "Who is the CEO of Apple?"
answer2 = cache.ask(question2)
print(answer2)

# Stores question2, embedding and answer2 in cache

question3 = "Who is the CEO of Facebook?"
answer3 = cache.ask(question3)
print(answer3)

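# Worded like question2 but about a different company; in the run shown it was
# stored as a new entry (row 2) rather than served from the cache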


Request was successful.
Time taken: 5.528501987457275 seconds
The capital of France is Paris. 

- Paris is the most populous city in France, with an estimated population of 12,271,794 inhabitants as of January 2023.
- It is located along the Seine River in the north-central part of the country.
- Paris has been the capital of France since its liberation in 1944.
- The city is renowned for its cultural significance and is often referred to as the "Fashion capital of the World."
- Notable landmarks include the Eiffel Tower, which is a symbol of the city.

For more information, you can visit:
- [Paris - Wikipedia](https://en.wikipedia.org/wiki/Paris)
- [Paris, France - Intercultural cities programme - The Council of Europe](https://www.coe.int/en/web/interculturalcities/paris)
- [France | History, Maps, Flag, Population, Cities, Capital, & Facts](https://www.britannica.com/place/France)
Request was successful.
Time taken: 4.013334512710571 seconds
The CEO of Apple is Tim Cook.

### Additi

In [None]:
answer4 = cache.ask("What is the Capital of India")
print(answer4)

Request was successful.
Time taken: 5.206113338470459 seconds
The capital of India is New Delhi.

- New Delhi is part of the National Capital Territory of Delhi (NCT).
- It serves as the seat of all three branches of the Government of India.
- Geographically, New Delhi is located in the north-central part of India, on the west bank of the Yamuna River.
- New Delhi is one of the 11 districts of the city of Delhi and is often used interchangeably with the term "Delhi."
- It is also classified as a union territory.

For more information, you can visit:
- [New Delhi - Wikipedia](https://en.wikipedia.org/wiki/New_Delhi)
- [New Delhi, India - Image of the Week - ESA Earth Online](https://earth.esa.int/web/earth-watching/image-of-the-week/content/-/article/new-delhi-india/)
- [New Delhi | History, Population, Map, & Facts - Britannica](https://www.britannica.com/place/New-Delhi)


In [None]:
answer4 = cache.ask("Can you tell me what is the Capital of India")
print(answer4)

Found cache in row: 3 with score 0.8059847354888916
Time taken: 0.1754007339477539 seconds
The capital of India is New Delhi.

- New Delhi is part of the National Capital Territory of Delhi (NCT).
- It serves as the seat of all three branches of the Government of India.
- Geographically, New Delhi is located in the north-central part of India, on the west bank of the Yamuna River.
- New Delhi is one of the 11 districts of the city of Delhi and is often used interchangeably with the term "Delhi."
- It is also classified as a union territory.

For more information, you can visit:
- [New Delhi - Wikipedia](https://en.wikipedia.org/wiki/New_Delhi)
- [New Delhi, India - Image of the Week - ESA Earth Online](https://earth.esa.int/web/earth-watching/image-of-the-week/content/-/article/new-delhi-india/)
- [New Delhi | History, Population, Map, & Facts - Britannica](https://www.britannica.com/place/New-Delhi)


In [None]:
print(cache.ask('Who is the CEO of Facebook?'))

Found cache in row: 2 with score 1.0
Time taken: 0.11047935485839844 seconds
The CEO of Facebook is Mark Zuckerberg. 

- **Full Name**: Mark Elliot Zuckerberg
- **Date of Birth**: May 14, 1984
- **Position**: He is the founder, chairman, and chief executive officer of Meta Platforms, which was originally founded as Facebook in 2004.
- **Responsibilities**: Mark Zuckerberg is responsible for setting the overall direction of the company.

For more information, you can visit:
- [Mark Zuckerberg - Wikipedia](https://en.wikipedia.org/wiki/Mark_Zuckerberg)
- [Meta Executive Profile](https://about.meta.com/media-gallery/executives/mark-zuckerberg/)
- [Meta Investor Relations](https://investor.fb.com/leadership-and-governance/)


In [None]:
print(cache.ask('Who is the current CEO of Google?'))

Request was successful.
Time taken: 3.733494997024536 seconds
The current CEO of Google is Sundar Pichai. 

- **Full Name**: Pichai Sundararajan
- **Date of Birth**: June 10, 1972
- **Nationality**: Indian-born American
- **Position**: Chief Executive Officer of Alphabet Inc. and its subsidiary Google
- **Background**: Sundar Pichai has been instrumental in leading Google and has focused on developing innovative products and services.

For more information, you can visit his [Wikipedia page](https://en.wikipedia.org/wiki/Sundar_Pichai) or his [LinkedIn profile](https://www.linkedin.com/in/sundarpichai).


In [None]:
print(cache.ask('Is Sundar Pichai the CEO of Google?'))

Request was successful.
Time taken: 2.832247734069824 seconds
Yes, Sundar Pichai is the CEO of Google. 

- Sundar Pichai, also known as Pichai Sundararajan, was appointed CEO of Google in August 2015.
- He later became the CEO of Alphabet Inc., Google's parent company, in December 2019.
- Under his leadership, Google has focused on organizing the world's information and making it universally accessible and useful.

For more information, you can visit his [Wikipedia page](https://en.wikipedia.org/wiki/Sundar_Pichai) or his [LinkedIn profile](https://www.linkedin.com/in/sundarpichai).


In [None]:
print(cache.ask('Best local food spots in Edinburgh for a couple?'))

Request was successful.
Time taken: 25.43631362915039 seconds
Here are some of the best romantic local food spots in Edinburgh for couples:

1. **The Witchery By The Castle**
   - A luxurious dining experience with a gothic ambiance, perfect for a romantic evening.
   - Location: Near Edinburgh Castle.

2. **The Voodoo Rooms**
   - Known for its vibrant atmosphere and creative cocktails, offering a unique dining experience.
   - Location: 19a West Register Street.

3. **The Kitchin**
   - A Michelin-starred restaurant that focuses on seasonal Scottish produce, providing a cozy yet upscale setting.
   - Location: 78 Commercial Quay, Leith.

4. **Chaophraya**
   - A stunning Thai restaurant with a rooftop terrace, ideal for enjoying a meal with a view.
   - Location: 33 Castle Street.

5. **Hawksmoor**
   - Renowned for its steaks and elegant setting, perfect for meat lovers.
   - Location: 1a Renfield Street.

6. **La Garrigue**
   - A charming French bistro that offers a warm atmospher

In [None]:
print(cache.ask('Best local food spots in Edinburgh?'))

Found cache in row: 6 with score 0.8507777154445648
Time taken: 0.09261965751647949 seconds
Here are some of the best romantic local food spots in Edinburgh for couples:

1. **The Witchery By The Castle**
   - A luxurious dining experience with a gothic ambiance, perfect for a romantic evening.
   - Location: Near Edinburgh Castle.

2. **The Voodoo Rooms**
   - Known for its vibrant atmosphere and creative cocktails, offering a unique dining experience.
   - Location: 19a West Register Street.

3. **The Kitchin**
   - A Michelin-starred restaurant that focuses on seasonal Scottish produce, providing a cozy yet upscale setting.
   - Location: 78 Commercial Quay, Leith.

4. **Chaophraya**
   - A stunning Thai restaurant with a rooftop terrace, ideal for enjoying a meal with a view.
   - Location: 33 Castle Street.

5. **Hawksmoor**
   - Renowned for its steaks and elegant setting, perfect for meat lovers.
   - Location: 1a Renfield Street.

6. **La Garrigue**
   - A charming French bistr

In [None]:
print(cache.ask('Best local food spots in London?'))

Request was successful.
Time taken: 5.6548097133636475 seconds
Here are some of the best local food spots in London:

1. **Clipstone**  
   - Location: Fitzrovia  
   - Known for its seasonal dishes and a cozy atmosphere.

2. **Barrafina**  
   - Location: Covent Garden  
   - A popular spot for authentic Spanish tapas.

3. **The Quality Chop House**  
   - Location: Farringdon  
   - Famous for its traditional British meat dishes.

4. **Gökyüzü**  
   - Location: Harringay  
   - Renowned for its delicious Turkish cuisine.

5. **Casa Pastor**  
   - Location: King's Cross  
   - Offers a taste of Mexico with its vibrant dishes.

6. **Tayyabs**  
   - Location: Whitechapel  
   - A beloved spot for Pakistani food, especially its grilled meats.

7. **Oklava**  
   - Location: Shoreditch  
   - Celebrated for its modern take on Turkish cuisine.

8. **Bright**  
   - Location: Hackney  
   - Known for its fresh, seasonal menu and relaxed vibe.

Additional recommendations include:

- **Lyl

In [None]:
print(cache.ask('Best local food spots in London?'))

Found cache in row: 7 with score 1.0
Time taken: 0.10187292098999023 seconds
Here are some of the best local food spots in London:

1. **Clipstone**  
   - Location: Fitzrovia  
   - Known for its seasonal dishes and a cozy atmosphere.

2. **Barrafina**  
   - Location: Covent Garden  
   - A popular spot for authentic Spanish tapas.

3. **The Quality Chop House**  
   - Location: Farringdon  
   - Famous for its traditional British meat dishes.

4. **Gökyüzü**  
   - Location: Harringay  
   - Renowned for its delicious Turkish cuisine.

5. **Casa Pastor**  
   - Location: King's Cross  
   - Offers a taste of Mexico with its vibrant dishes.

6. **Tayyabs**  
   - Location: Whitechapel  
   - A beloved spot for Pakistani food, especially its grilled meats.

7. **Oklava**  
   - Location: Shoreditch  
   - Celebrated for its modern take on Turkish cuisine.

8. **Bright**  
   - Location: Hackney  
   - Known for its fresh, seasonal menu and relaxed vibe.

Additional recommendations inc