# FAQ Matching with SentenceTransformer

This notebook demonstrates how to use a **SentenceTransformer** model to match user queries to frequently asked questions (FAQs) by semantic similarity.

## Objective
The goal of this notebook is to:
1. Encode a set of predefined FAQ questions and user queries using a SentenceTransformer model.
2. Calculate the semantic similarity between queries and FAQs.
3. Identify and return the most relevant FAQ and its corresponding answer for each query.

## Workflow
1. **Define FAQs and Queries**: A dictionary of FAQs and a list of user queries are provided.
2. **Load the Model**: The pre-trained `all-MiniLM-L6-v2` SentenceTransformer model is used for encoding.
3. **Encode Text**: The FAQ questions and the queries are converted into vector embeddings.
4. **Calculate Similarity**: Cosine similarity is computed to find the best-matching FAQ for each query.
5. **Display Results**: Each query is printed with its best-matching FAQ and the corresponding answer in a human-readable format.
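
Step 4 relies on cosine similarity: the dot product of two vectors divided by the product of their lengths. The following is a minimal pure-Python sketch of that calculation. The function name `cosine_similarity` and the 3-dimensional toy vectors are illustrative assumptions, not part of the notebook, which uses the library's own `util.cos_sim` on real embeddings.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors for illustration only; real all-MiniLM-L6-v2 embeddings have 384 dimensions.
query_vec = [1.0, 2.0, 0.0]
faq_same_direction = [2.0, 4.0, 0.0]  # points the same way as query_vec
faq_orthogonal = [0.0, 0.0, 3.0]      # shares no direction with query_vec

print(cosine_similarity(query_vec, faq_same_direction))  # close to 1
print(cosine_similarity(query_vec, faq_orthogonal))      # 0.0
```

Because the score depends only on direction, not on vector length, a scaled copy of a vector scores approximately 1 and an orthogonal vector scores 0.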

## Requirements
- Install the `sentence-transformers` library:
  ```bash
  pip install sentence-transformers
  ```

In [13]:
!pip install sentence-transformers

In [14]:
from sentence_transformers import SentenceTransformer, util

In [15]:
# Load the SentenceTransformer model
model = SentenceTransformer('all-MiniLM-L6-v2')

In [24]:
# Define FAQs and queries
faqs = {
    "How do I reset my password?": "To reset your password, go to the settings page.",
    "Where can I find the user manual?": "The user manual is available on our website.",
    "What are your working hours?": "We are open from 9 AM to 5 PM, Monday to Friday.",
}

queries = [
    "I forgot my password, what should I do?",
    "Can I download the manual online?",
    "What time does the office close?",
    "Where is the office?"
]

In [25]:
# Extract FAQ questions
faq_questions = list(faqs.keys())

In [26]:
# Encode the FAQ questions and queries
faq_embeddings = model.encode(faq_questions, convert_to_tensor=True)
query_embeddings = model.encode(queries, convert_to_tensor=True)

In [28]:
# Find the most similar FAQ for each query
for query, query_embedding in zip(queries, query_embeddings):
    # Compute cosine similarity against every FAQ question;
    # cos_sim returns a 1 x N matrix, so take row 0
    similarities = util.cos_sim(query_embedding, faq_embeddings)[0]

    # Find the index of the most similar FAQ
    best_match_idx = similarities.argmax().item()
    best_match_question = faq_questions[best_match_idx]
    best_match_answer = faqs[best_match_question]

    # Print the results
    print(f"Query: {query}")
    print(f"Best Match: {best_match_question}")
    print(f"Answer: {best_match_answer}")
    print("-" * 50)

Query: I forgot my password, what should I do?
Best Match: How do I reset my password?
Answer: To reset your password, go to the settings page.
--------------------------------------------------
Query: Can I download the manual online?
Best Match: Where can I find the user manual?
Answer: The user manual is available on our website.
--------------------------------------------------
Query: What time does the office close?
Best Match: What are your working hours?
Answer: We are open from 9 AM to 5 PM, Monday to Friday.
--------------------------------------------------
Query: Where is the office?
Best Match: What are your working hours?
Answer: We are open from 9 AM to 5 PM, Monday to Friday.
--------------------------------------------------
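
The last query above, "Where is the office?", is matched to the working-hours FAQ because `argmax` always returns some index, even when no FAQ is a good fit. One possible guard is to reject matches whose best score falls below a cutoff. The sketch below is an assumption-laden illustration: the helper `best_match_or_none`, the `0.5` threshold, and the score lists are all hypothetical and not taken from the model's real output.

```python
def best_match_or_none(scores, threshold=0.5):
    """Return the index of the highest score, or None if no score reaches the threshold."""
    if not scores:
        return None
    best_idx = max(range(len(scores)), key=lambda i: scores[i])
    if scores[best_idx] < threshold:
        return None
    return best_idx

# Made-up similarity scores for one query against three FAQs (not real model output).
confident_scores = [0.82, 0.10, 0.15]
weak_scores = [0.12, 0.20, 0.41]

print(best_match_or_none(confident_scores))  # 0: the first FAQ clears the threshold
print(best_match_or_none(weak_scores))       # None: no score reaches 0.5
```

In the notebook's loop, `similarities.tolist()` could supply the scores. A suitable threshold would need to be tuned on real queries rather than assumed.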


In [29]:
# Expand the FAQ set with more specific questions (e.g. office location, closing time)
faqs = {
    "How do I reset my password?": "To reset your password, go to the settings page.",
    "Where can I find the user manual?": "The user manual is available on our website for download.",
    "What are your working hours?": "We are open from 9 AM to 5 PM, Monday to Friday.",
    "Can I download the user manual?": "Yes, the user manual is available for download on our website.",
    "What time does your office close?": "Our office closes at 5 PM, Monday to Friday.",
    "How can I contact customer support?": "You can reach our customer support team at support@example.com or call us at 123-456-7890.",
    "Do you offer online support?": "Yes, online support is available via our chat system on the website.",
    "Where is your office located?": "Our office is located at 123 Main Street, Springfield.",
    "How do I create an account?": "To create an account, click on the 'Sign Up' button on the homepage and fill in your details.",
    "What payment methods do you accept?": "We accept credit cards, debit cards, and PayPal.",
    "Do you offer refunds?": "Yes, we offer refunds within 30 days of purchase. Please refer to our refund policy for details.",
    "Is there a mobile app available?": "Yes, our mobile app is available for download on iOS and Android devices.",
    "How do I update my profile information?": "To update your profile information, log in to your account and go to the profile settings page.",
    "Can I track my order online?": "Yes, you can track your order using the tracking ID sent to your email after purchase.",
    "Do you provide international shipping?": "Yes, we provide international shipping. Shipping charges and delivery times vary by location.",
    "How do I unsubscribe from your newsletter?": "To unsubscribe, click the 'Unsubscribe' link at the bottom of any of our emails.",
    "Are there any discounts for new customers?": "Yes, new customers receive a 10% discount on their first purchase.",
    "What is your return policy?": "Our return policy allows returns within 30 days of receiving the product. Items must be in original condition.",
    "How do I apply a discount code?": "You can apply a discount code at checkout by entering it in the designated field."
}

In [30]:
# Extract FAQ questions
faq_questions = list(faqs.keys())

In [31]:
# Re-encode the expanded FAQ questions; the query embeddings from earlier are reused
faq_embeddings = model.encode(faq_questions, convert_to_tensor=True)

In [32]:
# Find the most similar FAQ for each query
for query, query_embedding in zip(queries, query_embeddings):
    # Compute cosine similarity
    similarities = util.cos_sim(query_embedding, faq_embeddings)[0]

    # Find the index of the most similar FAQ
    best_match_idx = similarities.argmax().item()
    best_match_question = faq_questions[best_match_idx]
    best_match_answer = faqs[best_match_question]

    # Print the results
    print(f"Query: {query}")
    print(f"Best Match: {best_match_question}")
    print(f"Answer: {best_match_answer}")
    print("-" * 50)

Query: I forgot my password, what should I do?
Best Match: How do I reset my password?
Answer: To reset your password, go to the settings page.
--------------------------------------------------
Query: Can I download the manual online?
Best Match: Can I download the user manual?
Answer: Yes, the user manual is available for download on our website.
--------------------------------------------------
Query: What time does the office close?
Best Match: What time does your office close?
Answer: Our office closes at 5 PM, Monday to Friday.
--------------------------------------------------
Query: Where is the office?
Best Match: Where is your office located?
Answer: Our office is located at 123 Main Street, Springfield.
--------------------------------------------------
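
The matching loop appears twice in this notebook, once per FAQ set. A small helper could hold that logic in one place. The sketch below assumes a precomputed similarity matrix with one row per query and one column per FAQ question. The function name `match_queries` and the toy scores are hypothetical, chosen only to keep the example runnable without downloading the model.

```python
def match_queries(queries, faq_questions, faqs, similarity_matrix):
    """Return a (query, best_question, best_answer) tuple for each query."""
    results = []
    for query, row in zip(queries, similarity_matrix):
        # Index of the highest similarity score in this query's row
        best_idx = max(range(len(row)), key=row.__getitem__)
        best_question = faq_questions[best_idx]
        results.append((query, best_question, faqs[best_question]))
    return results

# Toy data: two FAQs from the notebook and made-up scores (not real model output).
toy_faqs = {
    "How do I reset my password?": "To reset your password, go to the settings page.",
    "Where is your office located?": "Our office is located at 123 Main Street, Springfield.",
}
toy_questions = list(toy_faqs.keys())
toy_queries = ["I forgot my password, what should I do?", "Where is the office?"]
toy_scores = [
    [0.9, 0.1],  # first query vs. each FAQ
    [0.2, 0.8],  # second query vs. each FAQ
]

for query, question, answer in match_queries(toy_queries, toy_questions, toy_faqs, toy_scores):
    print(f"Query: {query}")
    print(f"Best Match: {question}")
    print(f"Answer: {answer}")
    print("-" * 50)
```

With the real model, `util.cos_sim(query_embeddings, faq_embeddings).tolist()` would produce a matrix of this shape in a single call.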
