<a href="https://colab.research.google.com/github/salmabenhassin/QAA/blob/main/ChatbotWithoutModelButwithSimilaritySearch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

* Prepare an FAQ dataset: Create a list of frequently asked questions and their corresponding answers.
* Text preprocessing: Use SpaCy or NLTK for tokenization, lemmatization, stop word removal, and other preprocessing tasks to clean and standardize the text.
* Similarity matching: No model is trained or evaluated on a test dataset here. Instead, a similarity search compares the user's input question with the existing FAQ questions, using a measure such as cosine similarity.
* Generate a response: Based on the closest match, provide the corresponding answer.
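To make the similarity step concrete, here is a minimal sketch of cosine similarity computed by hand on raw word counts, using only the standard library. The function name `cosine_similarity_counts` is hypothetical and not part of this notebook; the notebook itself uses TF-IDF weights from scikit-learn rather than raw counts, so the scores will differ.

```python
import math
from collections import Counter


def cosine_similarity_counts(a: str, b: str) -> float:
    """Cosine similarity between two texts, using raw word counts as vectors.

    Illustrative helper only (an assumption, not the notebook's method):
    the notebook weights words with TF-IDF instead of plain counts.
    """
    va = Counter(a.lower().split())
    vb = Counter(b.lower().split())

    # Dot product over the words the two texts share
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())

    # Euclidean length of each count vector
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))

    # An empty text has no direction, so treat it as sharing nothing
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

For example, `cosine_similarity_counts("return policy", "return time")` gives 0.5: the texts share one of two words each, so the dot product is 1 and each vector has length √2.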

In [None]:
!pip install spacy scikit-learn
!python -m spacy download en_core_web_sm


Collecting en-core-web-sm==3.7.1
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.7.1/en_core_web_sm-3.7.1-py3-none-any.whl (12.8 MB)
✔ Download and installation successful
You can now load the package via spacy.load('en_core_web_sm')
⚠ Restart to reload dependencies
If you are in a Jupyter or Colab notebook, you may need to restart Python in
order to load all the package's dependencies. You can do this by selecting the
'Restart kernel' or 'Restart runtime' option.


In [None]:
faq_data = [
    ("What is the return policy?", "Our return policy lasts 30 days."),
    ("How long does shipping take?", "Shipping usually takes 5-7 business days."),
    ("Do you ship internationally?", "Yes, we ship to most countries."),
    ("Can I change my order?", "Yes, you can change your order within 24 hours of placing it."),
]


In [None]:
import spacy

# Load SpaCy model for English
nlp = spacy.load("en_core_web_sm")

def preprocess(text):
    """Lowercase the text, then return its lemmas with stop words and punctuation removed."""
    doc = nlp(text.lower())
    tokens = [token.lemma_ for token in doc if not token.is_stop and not token.is_punct]
    return " ".join(tokens)


In [None]:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def get_most_similar_question(user_question, faq_data):
    # Preprocess all FAQ questions
    preprocessed_questions = [preprocess(q) for q, _ in faq_data]

    # Add the user's question to the list for comparison
    preprocessed_questions.append(preprocess(user_question))

    # Use TF-IDF to convert text to vectors
    vectorizer = TfidfVectorizer()
    vectors = vectorizer.fit_transform(preprocessed_questions)

    # Calculate cosine similarity between the user's question and all FAQ questions
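    similarity = cosine_similarity(vectors[-1], vectors[:-1])
    most_similar_idx = similarity.argmax()  # Get index of the most similar FAQ question

    # With no word overlap every score is 0 and argmax returns index 0,
    # which would answer with the first FAQ entry; fall back explicitly instead
    if similarity[0, most_similar_idx] == 0:
        return "Sorry, I couldn't find an answer to that question."

    return faq_data[most_similar_idx][1]  # Return the corresponding answer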
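
The function above refits the TF-IDF vectorizer on every call. A possible variant, sketched below, fits once on the FAQ questions and only transforms each incoming question. The class name `FaqMatcher`, its `preprocess` parameter, and the `min_score` cutoff of 0.2 are all assumptions for illustration, not part of the notebook. Note one behavioral difference: here the IDF weights come from the FAQ questions alone, so scores will not match the function above exactly.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


class FaqMatcher:
    """Hypothetical helper: fit TF-IDF once, then score each new question."""

    def __init__(self, faq_pairs, preprocess=str.lower, min_score=0.2):
        # preprocess defaults to lowercasing; the notebook's spaCy-based
        # preprocess function could be passed in instead
        self.preprocess = preprocess
        self.min_score = min_score  # assumed cutoff, tune for real data
        self.answers = [answer for _, answer in faq_pairs]
        self.vectorizer = TfidfVectorizer()
        self.faq_vectors = self.vectorizer.fit_transform(
            [preprocess(question) for question, _ in faq_pairs]
        )

    def answer(self, user_question):
        """Return the best-matching answer, or None if nothing scores high enough."""
        query = self.vectorizer.transform([self.preprocess(user_question)])
        scores = cosine_similarity(query, self.faq_vectors)[0]
        best = scores.argmax()
        if scores[best] < self.min_score:
            return None
        return self.answers[best]
```

A caller would build it once, for example `matcher = FaqMatcher(faq_data, preprocess=preprocess)`, and then call `matcher.answer(user_input)` inside the chat loop, printing a fallback message when the result is `None`.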


In [None]:
def faq_chatbot():
    print("Hello! I'm here to help. Ask me a question about our product or service. Type 'quit' to exit.")
    while True:
        user_input = input("You: ")
        if user_input.lower() == 'quit':
            print("Goodbye!")
            break

        # Get the most similar FAQ answer
        response = get_most_similar_question(user_input, faq_data)
        print("Chatbot:", response)

if __name__ == "__main__":
    faq_chatbot()


Hello! I'm here to help. Ask me a question about our product or service. Type 'quit' to exit.
You: "What is the return policy?
Chatbot: Our return policy lasts 30 days.
You: quit
Goodbye!
