In [1]:
!pip install sentence-transformers numpy



In [2]:
data = [
    {
        "page": "Home",
        "url": "https://atdigital.io/",
        "text": "AT Digital provides web development, AI solutions, and digital marketing services to help businesses grow online."
    },
    {
        "page": "About Us",
        "url": "https://atdigital.io/about",
        "text": "We are a team of developers, designers, and AI specialists committed to delivering innovative digital solutions."
    },
    {
        "page": "Services - AI/ML",
        "url": "https://atdigital.io/services/ai-ml",
        "text": "Our AI/ML services include custom models for automation, predictive analytics, and chatbot integration."
    },
    {
        "page": "Services - Web Development",
        "url": "https://atdigital.io/services/web-development",
        "text": "We build responsive websites using React, Next.js, and headless CMS solutions like Prismic."
    },
    {
        "page": "Services - Digital Marketing",
        "url": "https://atdigital.io/services/digital-marketing",
        "text": "Our digital marketing services cover SEO, social media campaigns, content creation, and performance analytics."
    },
    {
        "page": "Blog - SEO Tips",
        "url": "https://atdigital.io/blog/seo-tips",
        "text": "Improving website SEO involves optimizing images, metadata, performance, and content for better ranking."
    },
    {
        "page": "Contact",
        "url": "https://atdigital.io/contact",
        "text": "You can reach us via email at contact@atdigital.io or through our contact form on the website."
    }
]

In [3]:
# Build one string per page (page title + body text) to feed to the embedding model
documents = [f"{item['page']} - {item['text']}".strip() for item in data]
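
As a quick check of what the model will actually embed, the sketch below builds the string for a single record the same way the comprehension above does. The names `sample` and `doc` are illustrative only and do not appear in the notebook; the record is copied from the first entry of `data`.

```python
# Illustrative record, copied from the first entry of `data`.
sample = {
    "page": "Home",
    "url": "https://atdigital.io/",
    "text": "AT Digital provides web development, AI solutions, and digital marketing services to help businesses grow online.",
}

# Same formatting as the list comprehension: "<page> - <text>".
doc = f"{sample['page']} - {sample['text']}".strip()
print(doc)
```

Note that the URL is not part of the embedded string; it is only used later when printing the source of an answer.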

In [4]:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

# Generate one embedding per document; encode() returns a NumPy array
# of shape (len(documents), 384) for this model
embeddings = model.encode(documents)



In [5]:
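# Similarity search helpers
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (1.0 means same direction)."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def find_best_match(query_emb, doc_embs):
    """Return (index, score) of the document embedding most similar to the query."""
    sims = [cosine_similarity(query_emb, doc_emb) for doc_emb in doc_embs]
    idx = int(np.argmax(sims))  # plain int so it can index the `data` list directly
    return idx, sims[idx]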
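
`find_best_match` always returns exactly one document, even when the question has nothing to do with any page. One possible variation, sketched here under assumptions, computes all similarities in a single matrix product and returns the top `k` matches above a minimum score. The function name `top_k_matches`, its `k` and `min_score` parameters, and the toy 2-D vectors are hypothetical and not part of the notebook; a sensible threshold would have to be tuned on real queries.

```python
import numpy as np

def top_k_matches(query_emb, doc_embs, k=3, min_score=0.0):
    """Return up to k (index, score) pairs, best cosine similarity first."""
    doc_embs = np.asarray(doc_embs, dtype=float)
    query_emb = np.asarray(query_emb, dtype=float)

    # Scale every vector to unit length so a dot product equals cosine similarity.
    doc_norms = np.linalg.norm(doc_embs, axis=1, keepdims=True)
    unit_docs = doc_embs / np.clip(doc_norms, 1e-12, None)
    unit_query = query_emb / max(np.linalg.norm(query_emb), 1e-12)

    sims = unit_docs @ unit_query        # shape: (n_docs,)
    order = np.argsort(sims)[::-1][:k]   # indices of the k highest scores

    # Drop matches below the (assumed) relevance threshold.
    return [(int(i), float(sims[i])) for i in order if sims[i] >= min_score]

# Toy 2-D vectors standing in for real sentence embeddings.
toy_docs = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
toy_query = np.array([1.0, 0.1])

matches = top_k_matches(toy_query, toy_docs, k=2)
print(matches)
```

With real data the call would look like `top_k_matches(q_emb, embeddings, k=3, min_score=0.3)`, and an empty result could be treated as "no relevant page found" instead of printing the closest but unrelated answer.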

In [6]:
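# Interactive question loop
while True:
    query = input("Ask a question (or 'exit'): ")
    # Ignore surrounding whitespace and case when checking for the exit command
    if query.strip().lower() == "exit":
        break
    # Encode the query into the same vector space as the documents
    q_emb = model.encode([query], convert_to_numpy=True)[0]
    # score is the cosine similarity of the best match (not printed here)
    idx, score = find_best_match(q_emb, embeddings)
    print(f"Answer: {data[idx]['text']}")
    print(f"Source: {data[idx]['page']} ({data[idx]['url']})\n")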


Ask a question (or 'exit'): What services does AT Digital offer?
Answer: AT Digital provides web development, AI solutions, and digital marketing services to help businesses grow online.
Source: Home (https://atdigital.io/)

Ask a question (or 'exit'): Tell me about your AI/ML capabilities
Answer: Our AI/ML services include custom models for automation, predictive analytics, and chatbot integration.
Source: Services - AI/ML (https://atdigital.io/services/ai-ml)

Ask a question (or 'exit'): Do you build websites?
Answer: We build responsive websites using React, Next.js, and headless CMS solutions like Prismic.
Source: Services - Web Development (https://atdigital.io/services/web-development)

Ask a question (or 'exit'): What digital marketing services do you provide?
Answer: Our digital marketing services cover SEO, social media campaigns, content creation, and performance analytics.
Source: Services - Digital Marketing (https://atdigital.io/services/digital-marketing)

Ask a question 