In [1]:
# Cell 1: Install dependencies
!pip -q install torch --index-url https://download.pytorch.org/whl/cu121
!pip -q install transformers accelerate sentence-transformers faiss-cpu flask pyngrok bitsandbytes pyyaml

[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m31.4/31.4 MB[0m [31m45.0 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m60.1/60.1 MB[0m [31m11.0 MB/s[0m eta [36m0:00:00[0m
[?25h

In [2]:
# Cell 2: Import libraries and setup
import os
import json
from typing import List, Dict, Any, Tuple

import faiss
import numpy as np
from sentence_transformers import SentenceTransformer
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline, BitsAndBytesConfig
import torch

from flask import Flask, request, jsonify
from pyngrok import ngrok
from getpass import getpass
import threading
import time

import requests
import json

In [3]:
# Cell 3: Knowledge base data as python dictionaries
KNOWLEDGE_BASE = [
    {
        "title": "Document 1: User Registration and Account Management",
        "content": "Creating a Shoplite account is the first step for buyers and sellers to access the platform's full range of features. Registration is designed to be quick, intuitive, and secure. Buyers can sign up with a valid email address, make a strong password, and optionally add a phone number for two-factor authentication. After they sign up, they get an email with a link to confirm their registration. They have 24 hours to click the link to finish the onboarding process.\n\nUsers can log in and go to their account dashboard to change their personal information, like their shipping address, saved payment methods, and communication preferences. Customers can also turn on notifications for sales, price drops, and updates on deliveries. Accounts support single sign-in (SSO) options through Google and Apple IDs for convenience. This makes it easy to get to access an account without having to enter any credentials over and over again.\n\nFor sellers to register on Shoplite, they must provide registered business proof, a valid tax ID, and bank account details, with verification typically taking two to three business days. Once approved, they can create sub-accounts for employees with role-based permissions. Shoplite ensures security with hashed passwords (SHA-256), JWT-based authentication, and temporary account locks after multiple failed login attempts. Developers using the Shoplite API for authentication must follow OAuth2.0 standards, utilizing refresh tokens for secure access.\n\nAccount recovery is also prioritized. Users who forget their passwords can reset them through an emailed recovery link that remains valid for only 15 minutes. For enhanced protection, recovery requests are logged and monitored by Shoplite's fraud detection system.\n\nBy balancing ease of use with robust safeguards, Shoplite ensures that account registration and management provide a seamless yet secure experience for all users.",
        "id": "doc1"
    },
    {
        "title": "Document 2: Product Search and Filtering Features",
        "content": "The Shoplite search system makes it easy for customers to find the items they need from a large selection of items sold by many different vendors. The platform's main feature is a powerful keyword-based search bar that can handle both natural language queries and exact phrases. For instance, if a customer types \"running shoes under $100,\" they will get results that fit their price and category filters right away. As users type, auto-suggestions show up thanks to predictive text algorithms that look at popular searches and the user's own browsing history.\n\nFiltering options let shoppers have full control over the results of their searches. Price range, product category, brand, seller rating, shipping speed, and stock availability are all examples of core filters. Extra advanced filters make it easy to find \"Shoplite Verified Sellers,\" eco-friendly products, or items that can be delivered the same day. Results can be sorted by how relevant they are, how low the price is, how high the rating is, or how new they are. A comparison tool displays several items next to each other, which helps making an informed choice without leaving the search results page.\n\nThe technical architecture of Shoplite's search engine utilizes Elasticsearch for real-time indexing of product data, applying Boolean filters to streamline searches. Caching mechanisms allow rapid retrieval of frequently searched terms, even under high traffic, while machine learning enhances personalized recommendations. Developers can enhance functionality through Shoplite's API at the `/products/search` endpoint, which supports keyword and filter parameters, returning JSON results suitable for integration. Rate limits maintain system stability, with support for bulk operations via pagination tokens.\n\nFrom a customer perspective, these tools make finding products efficient and enjoyable. From a technical perspective, the modular search and filter system guarantees scalability as Shoplite continues to expand its product ecosystem, ensuring that both end-users and developers benefit from a robust, future-ready infrastructure.",
        "id": "doc2"
    },
    {
        "title": "Document 3: Shopping Cart and Checkout Process",
        "content": "The Shoplite shopping cart is built to handle multi-seller transactions, making it possible for customers to add products from different vendors into a single order. Items can be added, removed, or saved for later, and quantity adjustments automatically recalculate totals in real time. Promotional codes, loyalty discounts, and shipping fees are also applied directly in the cart view to give users full price transparency before checkout.\n\nThere are three main steps in the checkout process: picking a shipping address and delivery method, picking a payment method, and confirming the order summary. Customers can choose between standard, express, or same-day delivery if it's available. Shoplite accepts cards, wallets, and regional payment methods. When the order is confirmed, it creates a unique ID and sends notifications to both the customer and the seller.\n\nTechnically, Checkout is powered by a microservices achitechture. The Cart Service keeps track of items and discounts, the Payment Service handles safe transactions, and the Order Service completes the purchase by adding it to Shoplite's order database. APIs also allow for external integrations; third-party systems can sign up for webhooks to get real-time updates on order confirmation and payment status.\n\nThe shopping cart and checkout process make it easy for customers to finish their purchases with as little trouble as possible while giving sellers accurate, up-to-date order information.",
        "id": "doc3"
    },
    {
        "title": "Document 4: Payment Methods and Security",
        "content": "Shoplite accepts a variety of payment methods to meet the needs of customers around the world. These include major credit and debit cards, PayPal, Apple Pay, Google Pay, regional e-wallets, and cash on delivery in some markets. PCI DSS-compliant providers encrypt and tokenise sensitive data so that customers can safely save their payment information for faster checkouts in the future.\n\nShoplite's AI-powered fraud detection system flags any suspicious activity, and all transactions are protected by TLS 1.3 encryption. Two-factor authentication may be needed for high-value orders, which adds an extra step to the verification process.\n\nShoplite gives developers RESTful payment APIs that use OAuth2.0 for authorisation. Every request for a transaction is checked, logged, and given a unique reference number for auditing. When a transaction fails, it gives detailed error codes that tell both customers and developers how to fix the problem.\n\nShoplite lowers the number of people who leave their carts by offering easy-to-use payment options and enterprise-level security. This builds trust between buyers and sellers.",
        "id": "doc4"
    },
    {
        "title": "Document 5: Order Tracking and Delivery",
        "content": "Customers get a unique tracking number and an estimated delivery time once their order is confirmed on Shoplite. There are six stages of order progress: processing, packaging, shipping, in transit, out for delivery, and delivery. Each stage sends automatic updates via email and mobile push notifications, so buyers don't have to keep checking the app to stay up to date.\n\nCustomers can view live status updates directly in their account dashboard. For convenience, delivery preferences such as \"leave at front desk\" or \"require signature\" can be set during checkout. Options range from standard shipping (3-7 business days) to premium express services, including same-day delivery in select urban areas. In-store pickup is also available where supported by sellers.\n\nOn the backend, Shoplite connects to many logistics partners through APIs that keep track of shipment status in real time. The platform's smart routing algorithm picks the best courier for each order based on location, weight, and delivery speed, as well as cost and reliability. If there are delays, the system automatically lets customers know and offers other solutions, like refunds or store credits.\n\nThe `/orders/{id}/tracking` endpoint gives developers programmatic access to tracking data in JSON format, which makes it possible to connect with custom dashboards or external CRMs. Bulk upload tools help sellers register shipment details right from the Seller Dashboard or through API calls.\n\nShoplite makes sure that customers receives transparency by providing real-time updates and flexible delivery options. Sellers benefit from a logistics ecosystem that is designed to cut down on mistakes and speed up delivery.",
        "id": "doc5"
    },
    {
        "title": "Document 6: Return and Refund Policies",
        "content": "Shoplite offers a customer-friendly return framework to balance convenience with seller protection. Most products come with a 30-day return window, though exceptions apply for perishable, intimate, or customized items. Customers initiate returns through their dashboard by selecting the relevant order, choosing a reason from pre-defined categories, and generating a digital return authorization (RMA) slip. Prepaid shipping labels are provided for eligible returns.\n\nOnce the returned item passes inspection, refunds are usually processed within 7 to 10 business days. Refunds are sent back to the original source or given as Shoplite store credit, depending on how the payment was made. Customers are notified by email or push notifications at every step of the return process.\n\nSellers must follow Shoplite's return policy. If they don't honor returns or give refunds on time, they may face penalties, lower seller ratings, or even account suspension. Sellers can see how many returns they have and dispute false claims right from their dashboards.\n\nFrom a technical perspective, returns are managed by a dedicated Return Service that communicates with both the Order and Payment services to synchronize refund actions. API endpoints enable sellers to automate approval workflows, while customers benefit from clear status updates integrated into the mobile app and website.\n\nThis structured but flexible method makes sure everyone is treated fairly. Buyers can shop with confidence, knowing they are safe, and sellers can protect themselves from false or abusive return claims.",
        "id": "doc6"
    },
    {
        "title": "Document 7: Product Reviews and Ratings",
        "content": "Product reviews and ratings on Shoplite play an essential role in building trust between buyers and sellers. Customers who have purchased and received a product are invited to leave feedback, ensuring that all reviews are verified and authentic. Reviews consist of a 1-5 star rating, optional written feedback, and the ability to upload images or short videos showcasing the product. This system allows future buyers to make informed decisions based on real experiences.\n\nShoplite's interface makes it easy to see reviews. Customers can sort reviews on each product page by rating, date, or \"most helpful.\" Products that get a lot of good reviews may get a \"Shoplite Recommended\" badge, which makes them show up first in search results and category listings. Sellers should respond to reviews in public to show that they care about service quality and to have an open conversation with their customers.\n\nShoplite uses automated moderation tools to find spam, bad language, and content that isn't relevant behind the scenes. Machine learning algorithms also look for patterns of fraud, like \"review bombing\" or sellers trying to change ratings with fake accounts. Reviews that look suspicious are flagged for human review, which makes sure that the system is fair and trustworthy.\n\nTechnically, there is a separate microservice that stores reviews in their own database so that they can be indexed and retrieved quickly. Developers can use APIs to pull reviews into outside dashboards or CRM systems so they can be analysed. To make sure that the reviews are real, each one has a timestamp, is connected to a specific order ID, and can't be changed after it is submitted.\n\nShoplite makes sure that reviews are useful for both buyers and sellers by making the site open, safe, and easy to use. This keeps the shopping environment safe and maintained.",
        "id": "doc7"
    },
    {
        "title": "Document 8: Seller Account Setup and Management",
        "content": "Setting up an account for businesses that want to sell on Shoplite is meant to be both thorough and quick. The first step is to register, which means that sellers give their legal business information, a valid tax ID, and their banking information for payments. They need to upload supporting documents like trade licenses or certificates of incorporation so they can be checked. This process usually takes 2 to 3 business days. After that, approved sellers can use the Seller Dashboard fully.\n\nThe Seller Dashboard serves as the control center for managing all aspects of an online store. Sellers can upload product listings one at a time or all at once using CSV files or APIs. They can also set prices and start marketing campaigns. Shoplite lets owners set up sub-accounts for staff with limited access, like only managing the catalogue or helping customers. This is called role-based permissions.\n\nAt every step, safety comes first. Sellers get alerts when someone logs in from a new device or IP address. They can also turn on multi-factor authentication for extra security. Shoplite's compliance team also checks sellers from time to time to make sure they follow the rules of the platform, such as setting fair prices and giving accurate product descriptions.\n\nFrom a technical standpoint, Shoplite provides APIs for real-time synchronization with external ERP and inventory management systems. Sellers can automate stock updates, receive order notifications instantly, and pull sales performance reports for advanced analytics. All seller activities are logged for transparency and dispute resolution.\n\nBy offering a blend of flexibility, control, and accountability, Shoplite makes seller account management simple yet robust, ensuring businesses of all sizes can scale effectively while maintaining compliance and security.",
        "id": "doc8"
    },
    {
        "title": "Document 9: Inventory Management for Sellers",
        "content": "Effective inventory management is critical for sellers on Shoplite, and the platform provides a set of tools to ensure products remain in stock and available to customers without overselling. Businesses can see live stock counts, track items by SKU or batch, and check the availability of their warehouses in real time through the Seller Dashboard. The system sends out low-stock alerts by email and through the dashboard, which helps sellers avoid missing out on sales.\n\nShoplite also lets sellers upload large amounts of inventory at once using CSV or JSON files. This makes it easier for sellers with big catalogues to update hundreds of products at once. Shoplite offers RESTful APIs like `/inventory/update` that keep stock levels in sync across platforms right away for businesses that use external warehouse or ERP systems. This reduces errors caused by manual input and ensures data consistency.\n\nShoplite's advanced features include demand forecasting, which looks at past sales speed and seasonal patterns to suggest the best times to restock. Sellers can filter inventory views by warehouse location, fulfilment type, or product category. This gives them a lot of control over their operations. When orders are placed, automatic stock deductions happen. This lowers the risk of overselling and makes sure that all sales channels are fair.\n\nFrom a technical standpoint, inventory data is stored in a dedicated service optimized for speed and reliability. Each transaction—whether a stock addition, deduction, or adjustment—is logged with timestamps for auditing. Sellers integrating with Shoplite APIs must authenticate via OAuth2, and rate limits are applied to prevent accidental system overloads during batch updates.\n\nBy combining automation, predictive analytics, and secure integrations, Shoplite's inventory management ensures sellers can scale their operations while maintaining accuracy and efficiency.",
        "id": "doc9"
    },
    {
        "title": "Document 10: Commission and Fee Structure",
        "content": "Shoplite uses a clear commission-based model that makes sure sellers can make money while keeping the platform going. The percentage of commission fees depends on the category. For example, electronics usually have lower percentages (around 5%) because they have small margins. On the other hand, clothing, accessories, and lifestyle goods may have higher rates of up to 8-10%. Once a deal is done, these commissions are automatically taken out of the seller's payout.\n\nShoplite charges transaction fees for some payment gateways and optional listing fees for premium placement in search results or featured product slots in addition to commissions. Sellers can choose to join subscription tiers, which lower commission rates and give them access to benefits like better analytics dashboards, early access to promotional campaigns, and priority support.\n\nSellers can see a full breakdown of their fees in the Seller Dashboard. A Fee Statement is created for each order. It shows the sale price of the product, the commission charged, the transaction fees, and the net payout. This data can also be exported in CSV format or accessed through Shoplite's `/fees/report` API for integration into accounting software.\n\nFrom a technical perspective, all fee calculations are handled by a dedicated billing microservice that ensures accuracy and compliance. Each calculation is timestamped, auditable, and securely stored in Shoplite's financial database. Developers integrating with financial systems can use webhooks to automatically receive notifications about fee adjustments or monthly summaries.\n\nThis transparent, auditable structure allows sellers to plan their pricing strategies effectively while giving Shoplite the resources to continuously improve its ecosystem. By clearly outlining costs, the platform fosters trust and encourages long-term seller growth.",
        "id": "doc10"
    },
    {
        "title": "Document 11: Customer Support Procedures",
        "content": "Shoplite is dedicated to providing fast and dependable customer service, making sure that both buyers and sellers get help when they need it. Support is available 24/7 through multiple channels: live chat, email, phone, and a self-service help center. Customers can open support tickets directly from their account dashboard, while sellers can escalate buyer disputes through the Seller Dashboard.\n\nAI-driven classifiers automatically sort tickets by looking at the type of problem, like delays in orders, payment mistakes, or requests for returns, and sending them to the right support team. Cases that are very important, like payment disputes or suspected fraud, are eascalated right away. Customers get a ticket ID that lets them see how their problem is being handled in real time.\n\nThe self-service help center provides FAQs, troubleshooting guides, and policy explanations, reducing dependency on live agents. Sellers can get more help, such as training materials on how to follow the rules, how to list products correctly, and how to settle disputes.\n\nFrom a technical standpoint, Shoplite's support system integrates with a ticketing microservice connected to both Order and Payment systems. This ensures that agents can access relevant transaction data without requiring manual verification. All support conversations are logged for compliance and quality assurance. Developers can also use the Support API to automatically create or update tickets from third-party applications, such as CRM tools.\n\nShoplite keeps track of response times, resolution times, and customer satisfaction scores (CSAT) to make sure the quality of service stays satisfactory. Feedback is gathered on a regular basis, and the information is used to make workflows more efficient. Shoplite's support system combines automation with human expertise to make sure that things run smoothly while still being personal.",
        "id": "doc11"
    },
    {
        "title": "Document 12: Mobile App Features",
        "content": "The Shoplite mobile app has all the same features as the website, plus extra tools that make simpler to use on mobile devices. With just a few taps, shoppers can look through products, use filters, add items to their cart, and finish their purchases. Customers get updates on their orders, flash sales, and personalised recommendations through push notifications.\n\nOne of the best features in it is that it lets users quickly find products by scanning physical labels in stores. The app also supports biometric authentication, like Face ID and fingerprint login. This makes sure that accessing accounts or making payments is both easy and secure. Offline cart access allows users to save items without an internet connection, syncing automatically once online.\n\nSellers also benefit from mobile features, it lets them keep track on orders, change prices, and answer customer messages while they're on the go. Real-time performance analytics help sellers stay flexible when running their businesses.\n\nTechnically, the Shoplite mobile app is built using React Native, which ensures a consistent user experience across iOS and Android devices. It communicates with the same RESTful APIs as the web platform, meaning all data is synchronized instantly across devices. Developers can extend app functionality by integrating deep links, enabling direct navigation to product pages from marketing campaigns or external apps.\n\nBy combining usability, speed, and secure integrations, the Shoplite app offers a seamless shopping experience for buyers while giving sellers the flexibility to manage their businesses anytime, anywhere.",
        "id": "doc12"
    },
    {
        "title": "Document 13: API Documentation for Developers",
        "content": "Shoplite has a strong developer portal that lets third-party systems, apps, and custom solutions work with the platform without any problems. The API suite is RESTful, and all of its endpoints use JSON payloads and are protected by OAuth2.0 authentication. This makes sure that external integrations are both safe and reliable.\n\nSome important API endpoints are:\n`/products/search` lets developers add Shoplite's product search to other sites or apps. It works with keyword, category, and filter parameters.\n`/inventory/update` lets sellers change the amount of stock they have in real time, which stops them from selling too much on more than one platform.\n`/orders/{id}/tracking` provides detailed tracking information, like updates from the courier and estimates for when the package will be delivered.\n`/fees/report` makes financial summaries to help with accounting and reconciliation.\n\nDevelopers benefit from sandbox environments where requests can be tested without affecting live transactions. To keep things stable, there are rate limits (default: 1,000 requests per minute), but business partners can ask for higher limits. Standardised error codes make things clear, like 400 Bad Request for invalid payloads or 401 Unauthorised for failed authentication.\n\nThere are also code samples in Python, JavaScript, and Java in the documentation, which makes it easier for developers from various backgrounds to integrate. Shoplite also supports webhooks, which let systems get automatic notifications when things occur, like new orders, refunds, or changes to inventory.\n\nShoplite gives developers the tools they need to add features to the platform, build third-party apps, and make custom experiences that fit their business needs by giving them a clear, structured, and secure API framework.",
        "id": "doc13"
    },
    {
        "title": "Document 14: Security and Privacy Policies",
        "content": "Shoplite was developed with security priority from the start, so user data, transactions, and seller information are always safe. The platform follows important global rules, such as GDPR in Europe and CCPA in California. This gives customers control over and access to their personal data.\n\nShoplite uses TLS 1.3 encryption for all data transfers, SHA-256 hashing for storing passwords, and role-based access controls to limit what staff have access to. To reduce the risk of downtime, databases are encrypted when they are not being used, and backups are made every day with redundancy across multiple regions.\n\nShoplite does regular penetration testing and vulnerability scanning to protect itself from cyberattacks. Fraud detection systems that use machine learning keep an eye on strange login attempts, strange transaction patterns, and possible data breaches. In cases with a lot of risk, accounts may be locked for a brief period until verification is finished.\n\nShoplite also puts a lot of focus on privacy rights. Through their account settings, customers can ask for their data to be removed, choose not to receive advertisements, or export their personal data. A clear privacy notice explains cookies and tracking tools, and users must give their permission before their marketing preferences are turned on.\n\nAPI integrations must follow strict rules for compliance for developers. Access tokens don't last long and need to be refreshed, and sensitive endpoints are logged for auditing. In case of misused data or broken privacy rules, the API credentials may be suspended.\n\nShoplite is secure for both buyers and sellers to use because it follows regulations, has strong security measures, and makes it clear on exactly how it protects customers' privacy.",
        "id": "doc14"
    },
    {
        "title": "Document 15: Promotional Codes and Discounts",
        "content": "Shoplite provides sellers with flexible tools for creating promotional campaigns that boost sales and attract new customers. Promotional codes can be structured as percentage discounts (e.g., 20% off), fixed-amount reductions (e.g., $10 off), or free shipping vouchers. Sellers can set rules like a minimum order value, limits on the types of products that can be bought, or a limited time for availability.\n\nFrom the consumer's standpoint, codes are used in a separate field during checkout, and the totals are updated immediately. Clear error messages are shown for invalid or expired codes, which keeps things clear. Customers can also save promotional codes in their account wallets to use later on.\n\nOn the backend, Shoplite uses a dedicated Discount Service to validate and apply promotions in real time. Each code has unique identifiers, usage limits (single-use or multi-use), and expiration dates. Sellers can generate bulk codes for loyalty programs or personalized single-use codes for targeted marketing campaigns.\n\nFor developers, the `/discounts/apply` API endpoint provides programmatic validation of codes, enabling integration with custom checkout systems or external CRMs. Webhooks notify sellers when promotions are redeemed, helping them measure campaign success.\n\nAnalytics dashboards let sellers stay updated on how well they're doing by showing them things like redemption rates, revenue impact, and customer acquisition metrics. This helps companies improve their advertising plans over time.\n\nShoplite makes sure that promotional codes and discounts not only boost sales but also build long-term customer loyalty by being flexible, easy to use, and having strong technical support.",
        "id": "doc15"
    },
    {
        "title": "Document 16: Analytics and Reporting Tools",
        "content": "Shoplite has a full set of analytics tools that lets both buyers and sellers see how they are using the platform. Buyers can look at their purchase history, see how much they've spent in different categories, and get recommendations based on what they've bought in the past. Sellers, on the other hand, can see detailed reporting dashboards that show things like sales performance, product trends, customer demographics, and return rates.\n\nOne of the key strengths of Shoplite analytics is its real-time updating. Data is updated almost instantly after each sale, so sellers can make quick decisions based on the data. Reports can be filtered by time period, product line, or region, which makes it easy to analyse the data. Predictive analytics also show how demand changes with the seasons, which helps businesses get ready for busy times like the holidays.\n\nExport options are available in CSV, Excel, and JSON formats, which makes it easy to move data into third-party BI tools like Tableau or Power BI. Shoplite has dedicated API endpoints for technical users, such as `/analytics/sales` and `/analytics/customers`, which provide raw data for integrations into custom dashboards. Rate limits apply, but enterprise sellers can request expanded access.\n\nAll analytics data is anonymised at the customer level to protect their privacy and follow rules like GDPR. Sellers can see general information about their customers, such as their age range or where they live, but they can't see the names of individual customers unless the buyers decide to reveal them during checkout.\n\nShoplite's analytics system gives businesses the tools they need to improve their pricing, marketing campaigns, and inventory planning by making it easy to use, adding advanced filtering, and providing developer-friendly APIs. This makes the ecosystem stronger, which helps both sellers and buyers by making things more efficient and personalised.",
        "id": "doc16"
    },
    {
        "title": "Document 17: Fraud Prevention and Risk Management",
        "content": "Maintaining trust across the Shoplite marketplace requires strong defenses against fraud. To achieve this, Shoplite employs a multi-layered risk management framework that protects both buyers and sellers. Suspicious activities—such as repeated failed login attempts, sudden changes in buying patterns, or large-volume orders from new accounts—are flagged in real time by AI-driven monitoring systems.\n\nFor buyers, safeguards include two-factor authentication on high-value purchases, automated alerts for unusual account activity, and a dispute resolution channel for fraudulent charges. Refunds in confirmed fraud cases are processed quickly, with Shoplite absorbing costs when sellers are not at fault.\n\nRisk scoring on incoming orders helps sellers find orders that might be fake before they ship them. For instance, a manual review may be triggered by billing and shipping addresses that don't match, a high number of refunds, or payment methods that have been flagged. If fraud is suspected, sellers should wait to fulfil the order until the issue is settled.\n\nShoplite's fraud detection works with both the Payment Service and the Order Service on the backend. This makes sure that problems are found early in the purchase flow. All flagged transactions are recorded with timestamps, and detailed audit trails are kept for compliance reasons. There are also APIs that let enterprise sellers get fraud alerts directly in their ERP systems.\n\nThe system keeps getting enhanced with regular security audits, partnerships with payment gateways, and machine learning models that learn from new types of fraud. This proactive approach lowers the risk of wasting money and boosts buyer confidence.\n\nShoplite makes sure that its marketplace is a safe and trusted place for every transaction by using advanced detection technology and clear resolution policies.",
        "id": "doc17"
    }
]

In [None]:
# Cell 4: Assisted Prompts Configuration

PROMPTS = {
    "base_retrieval_prompt": {
        "role": "You are a helpful Shoplite customer service assistant.",
        "goal": "Provide accurate and clear answers when exactly one document is retrieved.",
        "context_guidelines": [
            "Only respond to questions asked in English.",
            "If a question is in another language, politely ask the user to ask in English.",
            "Only use information from the single retrieved document.",
            "Do not speculate or add outside knowledge.",
            "Keep answers concise (3-5 sentences).",
            "Always cite the document title used."
        ],
        "response_format": "Answer: [Your response based on 1 document]\nSources: [Single document title]"
    },

    "multi_doc_prompt": {
        "role": "You are a synthesis assistant that combines information from multiple documents.",
        "goal": "Provide answers when 2 to 4 documents are retrieved and need to be merged.",
        "context_guidelines": [
            "Only respond to questions asked in English.",
            "If a question is in another language, politely ask the user to ask in English.",
            "Integrate evidence from all relevant documents into one clear answer.",
            "Explain how the documents connect and resolve conflicts.",
            "Use bullet points if step-by-step guidance is needed.",
            "Prioritize official policies when conflicts occur."
        ],
        "response_format": "Answer: [Synthesis answer across 2-4 documents]\nSources: [All document titles used]"
    },

    "no_context_refusal_prompt": {
        "role": "You are a cautious assistant that refuses politely when no relevant documentation is found.",
        "goal": "Prevent giving false or speculative answers if retrieval returns nothing.",
        "context_guidelines": [
            "Only respond to questions asked in English.",
            "If a question is in another language, politely ask the user to ask in English.",
            "Never invent or assume details.",
            "Politely state that information on that topic is not in the Shoplite knowledge base.",
            "Remind the user that your purpose is to answer questions about Shoplite.",
            "Suggest asking about a relevant Shoplite topic like 'payments' or 'order tracking'."
        ],
        "response_format": "Answer: [Polite refusal + suggestion to ask about Shoplite]\nSources: none"
    },

    "clarification_prompt": {
        "role": "You are a proactive assistant that seeks clarification when user queries are ambiguous.",
        "goal": "Make sure the request is fully understood before answering.",
        "context_guidelines": [
            "Only respond to questions asked in English.",
            "If a question is in another language, politely ask the user to ask in English.",
            "Ask one clarifying question only.",
            "Offer examples or options related to Shoplite to guide the user.",
            "For example, if a user asks 'how do I check the status?', ask 'Are you referring to an order status or a return status on Shoplite?'",
            "Wait for clarification before answering."
        ],
        "response_format": "Answer: [Clarifying question for the user related to Shoplite]\nSources: none"
    },

    "overflow_doc_prompt": {
        "role": "You are an assistant that handles scenarios where more than 4 documents are retrieved.",
        "goal": "Summarize, filter, and condense the information to avoid overwhelming the user.",
        "context_guidelines": [
            "Only respond to questions asked in English.",
            "If a question is in another language, politely ask the user to ask in English.",
            "Focus on the most relevant points across all retrieved documents.",
            "Group related information together.",
            "Do not list every detail; highlight key themes.",
            "Cite the most important document titles only."
        ],
        "response_format": "Answer: [Condensed summary from many documents]\nSources: [Key document titles]"
    },

    "chitchat_prompt": {
        "role": "You are a friendly Shoplite assistant.",
        "goal": "Respond to greetings and social interactions politely while staying on-brand.",
        "context_guidelines": [
            "Only respond to messages in English.",
            "If a message is in another language, politely ask the user to communicate in English.",
            "Your only purpose is to answer questions about Shoplite.",
            "Respond warmly to greetings, thanks, and goodbyes.",
            "Do not answer off-topic questions.",
            "Keep responses brief and friendly.",
            "Redirect to Shoplite topics if appropriate."
        ],
        "response_format": "Answer: [Friendly response]"
    }
}


# Metadata
PROMPTS_METADATA = {
    "version": "1.0",
    "date": "2025-09-29",
    "author": "Charbel Moussallem",
    "note": "RAG (and No RAG) assisted Prompts Configurations"
}

In [5]:
# Cell 5: LLM loading and setup
print("Loading model... This may take several minutes.")

# Configure 4-bit quantization to fit in Colab GPU memory
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

# Using Qwen 2.5 7B - publicly accessible, comparable to Llama 3.1 8B
MODEL_NAME = "Qwen/Qwen2.5-7B-Instruct"

print(f"Loading {MODEL_NAME}...")

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True
)

llm_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer
)

print(f"✓ Model loaded: {MODEL_NAME}")
print(f"✓ Device: {model.device}")
print(f"✓ Memory allocated: {torch.cuda.memory_allocated()/1e9:.2f} GB")

Loading model... This may take several minutes.
Loading Qwen/Qwen2.5-7B-Instruct...


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


tokenizer_config.json: 0.00B [00:00, ?B/s]

vocab.json: 0.00B [00:00, ?B/s]

merges.txt: 0.00B [00:00, ?B/s]

tokenizer.json: 0.00B [00:00, ?B/s]

config.json:   0%|          | 0.00/663 [00:00<?, ?B/s]

model.safetensors.index.json: 0.00B [00:00, ?B/s]

Fetching 4 files:   0%|          | 0/4 [00:00<?, ?it/s]

model-00004-of-00004.safetensors:   0%|          | 0.00/3.56G [00:00<?, ?B/s]

model-00001-of-00004.safetensors:   0%|          | 0.00/3.95G [00:00<?, ?B/s]

model-00003-of-00004.safetensors:   0%|          | 0.00/3.86G [00:00<?, ?B/s]

model-00002-of-00004.safetensors:   0%|          | 0.00/3.86G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/4 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/243 [00:00<?, ?B/s]

Device set to use cuda:0


✓ Model loaded: Qwen/Qwen2.5-7B-Instruct
✓ Device: cuda:0
✓ Memory allocated: 5.56 GB


In [6]:
# Cell 6: RAG pipeline implementation

# --- Global RAG components ---
embedding_model = None
faiss_index = None
doc_id_map = {}

def build_rag_pipeline():
    """Initializes the embedding model and builds the FAISS index."""
    global embedding_model, faiss_index, doc_id_map
    print("Building RAG pipeline...")

    # 1. Load Embedding Model
    print("Loading embedding model...")
    embedding_model = SentenceTransformer('all-MiniLM-L6-v2', device='cuda')

    # 2. Create Embeddings
    doc_contents = [doc['content'] for doc in KNOWLEDGE_BASE]
    embeddings = embedding_model.encode(doc_contents, convert_to_tensor=True)

    # 3. Build FAISS Index
    embedding_dim = embeddings.shape[1]
    faiss_index = faiss.IndexFlatL2(embedding_dim)
    faiss_index.add(embeddings.cpu().numpy())

    # 4. Create a mapping from index ID to document
    for i, doc in enumerate(KNOWLEDGE_BASE):
        doc_id_map[i] = doc

    print(f"✓ FAISS index ready: {faiss_index.ntotal} documents indexed")
    print("✓ RAG pipeline ready!")

# In Cell 6, replace the old retrieve_docs with this NEW version
def retrieve_docs(query: str, k: int = 4) -> List[Dict[str, Any]]:
    """
    Retrieves the top-k most relevant documents for a given query,
    but returns an empty list if no document meets the relevance threshold.
    """
    query_embedding = embedding_model.encode([query], convert_to_tensor=True)
    distances, indices = faiss_index.search(query_embedding.cpu().numpy(), k)

    # If no documents are found at all, return empty list
    if len(indices[0]) == 0:
        return []

    # This is the key change: Check the distance of the BEST match.
    # If it's too high (meaning not very similar), consider it irrelevant
    # and return nothing. This prevents feeding bad context to the LLM.
    RELEVANCE_THRESHOLD = 1.0  # L2 distance threshold; lower is stricter.

    if distances[0][0] > RELEVANCE_THRESHOLD:
        print(f"⚠️ Top document distance ({distances[0][0]:.2f}) exceeds threshold. No relevant context found.")
        return []

    # Only return documents that are within the threshold
    relevant_indices = [idx for idx, dist in zip(indices[0], distances[0]) if dist < RELEVANCE_THRESHOLD]
    return [doc_id_map[i] for i in relevant_indices]

# In Cell 6, make sure format_prompt is this ORIGINAL version
def format_prompt(query: str, retrieved_docs: List[Dict[str, Any]]) -> str:
    """Selects and formats the appropriate prompt based on retrieval results."""
    num_docs = len(retrieved_docs)

    if num_docs == 0:
        prompt_config = PROMPTS["no_context_refusal_prompt"]
        context_str = "No relevant documents found."
    elif num_docs == 1:
        prompt_config = PROMPTS["base_retrieval_prompt"]
    elif 2 <= num_docs <= 4:
        prompt_config = PROMPTS["multi_doc_prompt"]
    else: # num_docs > 4
        prompt_config = PROMPTS["overflow_doc_prompt"]

    # Build context string from documents
    if num_docs > 0:
        context_str = "\\n\\n---\\n\\n".join(
            [f"Document Title: {doc['title']}\\nContent: {doc['content']}" for doc in retrieved_docs]
        )

    guidelines_str = "\\n".join([f"- {g}" for g in prompt_config['context_guidelines']])

    final_prompt = f"""
<|im_start|>system
{prompt_config['role']}
Goal: {prompt_config['goal']}

Guidelines:
{guidelines_str}

Response Format:
{prompt_config['response_format']}
<|im_end|>
<|im_start|>user
CONTEXT:
{context_str}

QUESTION:
{query}
<|im_end|>
<|im_start|>assistant
"""
    return final_prompt.strip()


def generate_response(prompt: str) -> str:
    """Generates a response from the LLM using the formatted prompt."""
    response = llm_pipeline(
        prompt,
        max_new_tokens=512,
        do_sample=True,
        temperature=0.7,
        top_p=0.95,
        return_full_text=False
    )
    return response[0]['generated_text'].strip()


# Initialize the RAG pipeline on startup
build_rag_pipeline()

Building RAG pipeline...
Loading embedding model...


modules.json:   0%|          | 0.00/349 [00:00<?, ?B/s]

config_sentence_transformers.json:   0%|          | 0.00/116 [00:00<?, ?B/s]

README.md: 0.00B [00:00, ?B/s]

sentence_bert_config.json:   0%|          | 0.00/53.0 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/612 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/90.9M [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/350 [00:00<?, ?B/s]

vocab.txt: 0.00B [00:00, ?B/s]

tokenizer.json: 0.00B [00:00, ?B/s]

special_tokens_map.json:   0%|          | 0.00/112 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/190 [00:00<?, ?B/s]

✓ FAISS index ready: 17 documents indexed
✓ RAG pipeline ready!


In [12]:
# Cell 7: Flask API setup (with Query Refinement)
from flask import Flask, request, jsonify
from typing import List, Dict, Any

# --- Helper Functions (Copied from Cell 6 to make this cell self-contained) ---

def build_prompt(prompt_key: str, context: str = "", query: str = "") -> str:
    if prompt_key not in PROMPTS: raise ValueError(f"Invalid prompt key: {prompt_key}")
    prompt_config = PROMPTS[prompt_key]
    guidelines_str = "\n".join([f"- {g}" for g in prompt_config['context_guidelines']])
    prompt = f"""<|im_start|>system
{prompt_config['role']}
Goal: {prompt_config['goal']}
Guidelines:
{guidelines_str}
Response Format:
{prompt_config['response_format']}
<|im_end|>
<|im_start|>user"""
    if context: prompt += f"\nCONTEXT:\n{context}\n"
    if query: prompt += f"\nQUESTION:\n{query}"
    else: prompt += f"\n{context}"
    prompt += "\n<|im_end|>\n<|im_start|>assistant\n"
    return prompt.strip()

def retrieve_docs(query: str, k: int = 4) -> List[Dict[str, Any]]:
    query_embedding = embedding_model.encode([query], convert_to_tensor=True)
    distances, indices = faiss_index.search(query_embedding.cpu().numpy(), k)
    if len(indices[0]) == 0: return []
    RELEVANCE_THRESHOLD = 1.0
    if distances[0][0] > RELEVANCE_THRESHOLD:
        print(f"⚠️ Top document distance ({distances[0][0]:.2f}) exceeds threshold. No relevant context found.")
        return []
    relevant_indices = [idx for idx, dist in zip(indices[0], distances[0]) if dist < RELEVANCE_THRESHOLD]
    return [doc_id_map[i] for i in relevant_indices]

def format_prompt(query: str, retrieved_docs: List[Dict[str, Any]]) -> str:
    num_docs = len(retrieved_docs)
    if num_docs == 0: prompt_key, context_str = "no_context_refusal_prompt", "No relevant documents found."
    elif num_docs == 1: prompt_key, context_str = "base_retrieval_prompt", f"Document Title: {retrieved_docs[0]['title']}\nContent: {retrieved_docs[0]['content']}"
    else:
        prompt_key = "multi_doc_prompt" if 2 <= num_docs <= 4 else "overflow_doc_prompt"
        context_str = "\n\n---\n\n".join([f"Document Title: {doc['title']}\nContent: {doc['content']}" for doc in retrieved_docs])
    return build_prompt(prompt_key, context=context_str, query=query)

def generate_response(prompt: str) -> str:
    response = llm_pipeline(prompt, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.95, return_full_text=False)
    return response[0]['generated_text'].strip()

def calculate_confidence(query: str, retrieved_docs: List[Dict[str, Any]]) -> tuple:
    if len(retrieved_docs) == 0: return ("Low", 0.0)
    query_embedding = embedding_model.encode([query], convert_to_tensor=True)
    similarities = [torch.nn.functional.cosine_similarity(query_embedding, embedding_model.encode([doc['content']], convert_to_tensor=True), dim=1).item() for doc in retrieved_docs]
    avg_similarity = sum(similarities) / len(similarities)
    normalized_score = max(0.0, min(1.0, (avg_similarity - 0.2) / 0.8))
    doc_count = len(retrieved_docs)
    doc_factor = 1.0 if 2 <= doc_count <= 3 else (0.9 if doc_count == 1 else 0.85)
    final_score = normalized_score * doc_factor
    label = "High" if final_score >= 0.5 else ("Medium" if final_score >= 0.3 else "Low")
    return (label, final_score)

# --- Flask App Definition ---
app = Flask(__name__)

@app.route("/health")
def health():
    return jsonify({"status": "ok", "model": MODEL_NAME}), 200

@app.route("/ping", methods=["POST"])
def ping():
    data = request.get_json()
    if not data or "query" not in data: return jsonify({"error": "Missing 'query' in request body"}), 400
    query = data["query"]
    prompt = build_prompt("chitchat_prompt", context=query)
    try:
        response_text = generate_response(prompt)
        if "Answer:" in response_text: response_text = response_text.split("Answer:")[1].strip()
        return jsonify({"response": response_text}), 200
    except Exception as e:
        return jsonify({"error": f"Model inference failed: {str(e)}"}), 500

@app.route("/chat", methods=["POST"])
def chat():
    data = request.get_json()
    if not data or "query" not in data: return jsonify({"error": "Missing 'query' in request body"}), 400
    query = data["query"]

    # =======================================================
    # ==  THIS NEW SNIPPET IS FOR QUERY REFINING  ==
    # =======================================================
    if "shoplite" not in query.lower() and len(query.split()) > 2:
        refined_query = query + " on shoplite"
        print(f"INFO: Query refined to: '{refined_query}'") # Added for visibility
        query = refined_query
    # =======================================================

    try:
        retrieved_docs = retrieve_docs(query)
        confidence_label, _ = calculate_confidence(query, retrieved_docs)
        prompt = format_prompt(query, retrieved_docs)
        response_text = generate_response(prompt)
        parts = response_text.split("Sources:")
        answer = parts[0].replace("Answer:", "").strip()
        sources = parts[1].strip() if len(parts) > 1 else "none"
        return jsonify({"response": answer, "sources": sources, "confidence": confidence_label}), 200
    except Exception as e:
        return jsonify({"error": f"RAG pipeline failed: {str(e)}"}), 500

In [14]:
# Cell 8: ngrok token input and tunnel creation

def run_flask_app():
    # use_reloader=False is important for notebook environments
    app.run(host="0.0.0.0", port=5001, use_reloader=False)

# Start Flask in a background thread
flask_thread = threading.Thread(target=run_flask_app, daemon=True)
flask_thread.start()
time.sleep(3) # Give the server a moment to start

# ngrok Tunnel Creation
print("=" * 60)
print("NGROK SETUP - Expose Flask API to external access")
print("=" * 60)
try:
    ngrok_token = getpass("Enter your ngrok authtoken (input hidden): ")
    ngrok.set_auth_token(ngrok_token)
    public_url = ngrok.connect(5001)
    print(f"\n✓ ngrok tunnel established!")
    print(f"📡 Public URL: {public_url}")
    print("\nYour API is now accessible at:")
    print(f"  • Health: {public_url}/health")
    print(f"  • Ping:   {public_url}/ping")
    print(f"  • Chat:   {public_url}/chat")
    print("\nIMPORTANT: Leave this cell running to maintain the tunnel.")
except Exception as e:
    print(f"\n❌ An error occurred during ngrok setup: {e}")

 * Serving Flask app '__main__'
 * Debug mode: off


 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:5001
 * Running on http://172.28.0.12:5001
INFO:werkzeug:[33mPress CTRL+C to quit[0m


NGROK SETUP - Expose Flask API to external access
Enter your ngrok authtoken (input hidden): ··········

✓ ngrok tunnel established!
📡 Public URL: NgrokTunnel: "https://georgine-undecrepit-sondra.ngrok-free.dev" -> "http://localhost:5001"

Your API is now accessible at:
  • Health: NgrokTunnel: "https://georgine-undecrepit-sondra.ngrok-free.dev" -> "http://localhost:5001"/health
  • Ping:   NgrokTunnel: "https://georgine-undecrepit-sondra.ngrok-free.dev" -> "http://localhost:5001"/ping
  • Chat:   NgrokTunnel: "https://georgine-undecrepit-sondra.ngrok-free.dev" -> "http://localhost:5001"/chat

IMPORTANT: Leave this cell running to maintain the tunnel.


In [13]:
# Cell 9: API Testing and Validation (Prints Query and Confidence)

NGROK_URL = input("Please paste your public ngrok URL here: ").strip()

if not NGROK_URL:
    print("\n❌ No URL provided. Please restart and enter the ngrok URL.")
else:
    print("=" * 70)
    print(f"Testing API at: {NGROK_URL}")
    print("=" * 70)
    try:
        # Test 1: Health Check
        print("\n[1/5] Testing /health endpoint...")
        response = requests.get(f"{NGROK_URL}/health", timeout=10)
        if response.status_code == 200: print("✅ Health Check PASSED")
        else: print(f"❌ Health Check FAILED (Status: {response.status_code})")
        print("-" * 70)

        # Test 2: Chitchat via /ping
        print("\n[2/5] Testing /ping - Chitchat Handling...")
        query = "Hello!"
        print(f"    Query: '{query}'")
        response = requests.post(f"{NGROK_URL}/ping", json={"query": query}, timeout=120)
        if response.status_code == 200:
            print("✅ Chitchat Query PASSED")
            print(f"    Answer: {response.json().get('response', 'N/A')}")
        else:
            print(f"❌ Chitchat Query FAILED (Status: {response.status_code})")
        print("-" * 70)

        # Test 3: Single-Document RAG Query
        print("\n[3/5] Testing /chat - Single Document Retrieval...")
        query = "What filters can I use to refine a product search?"
        print(f"    Query: '{query}'")
        response = requests.post(f"{NGROK_URL}/chat", json={"query": query}, timeout=120)
        if response.status_code == 200:
            data = response.json()
            print("✅ Single-Doc Query PASSED")
            print(f"    Confidence: {data.get('confidence', 'N/A')}")
        else:
            print(f"❌ Single-Doc Query FAILED (Status: {response.status_code})")
        print("-" * 70)

        # Test 4: Multi-Document RAG Query
        print("\n[4/5] Testing /chat - Multi-Document Synthesis...")
        query = "What analytics tools and fee structures help sellers optimize performance?"
        print(f"    Query: '{query}'")
        response = requests.post(f"{NGROK_URL}/chat", json={"query": query}, timeout=120)
        if response.status_code == 200:
            data = response.json()
            print("✅ Multi-Doc Query PASSED")
            print(f"    Confidence: {data.get('confidence', 'N/A')}")
        else:
            print(f"❌ Multi-Doc Query FAILED (Status: {response.status_code})")
        print("-" * 70)

        # Test 5: Edge Case - Off-Topic Query
        print("\n[5/5] Testing /chat - Off-Topic Query Handling...")
        query = "What is Shoplite's headquarters address?"
        print(f"    Query: '{query}'")
        response = requests.post(f"{NGROK_URL}/chat", json={"query": query}, timeout=120)
        if response.status_code == 200:
            data = response.json()
            print("✅ Off-Topic Query PASSED")
            print(f"    Confidence: {data.get('confidence', 'N/A')}")
        else:
            print(f"❌ Off-Topic Query FAILED (Status: {response.status_code})")
        print("-" * 70)

        print("\n🎉 All tests completed!")
        print("=" * 70)

    except requests.exceptions.RequestException as e:
        print(f"\n❌ Connection error: {e}")

Please paste your public ngrok URL here: https://georgine-undecrepit-sondra.ngrok-free.dev
Testing API at: https://georgine-undecrepit-sondra.ngrok-free.dev

[1/5] Testing /health endpoint...


INFO:werkzeug:127.0.0.1 - - [02/Oct/2025 22:24:05] "GET /health HTTP/1.1" 200 -


✅ Health Check PASSED
----------------------------------------------------------------------

[2/5] Testing /ping - Chitchat Handling...
    Query: 'Hello!'


INFO:werkzeug:127.0.0.1 - - [02/Oct/2025 22:24:08] "POST /ping HTTP/1.1" 200 -


✅ Chitchat Query PASSED
    Answer: : Hello there! How can I assist you with your shopping needs today at Shoplite?
----------------------------------------------------------------------

[3/5] Testing /chat - Single Document Retrieval...
    Query: 'What filters can I use to refine a product search?'


INFO:werkzeug:127.0.0.1 - - [02/Oct/2025 22:24:15] "POST /chat HTTP/1.1" 200 -


✅ Single-Doc Query PASSED
    Confidence: Medium
----------------------------------------------------------------------

[4/5] Testing /chat - Multi-Document Synthesis...
    Query: 'What analytics tools and fee structures help sellers optimize performance?'


INFO:werkzeug:127.0.0.1 - - [02/Oct/2025 22:24:38] "POST /chat HTTP/1.1" 200 -


✅ Multi-Doc Query PASSED
    Confidence: High
----------------------------------------------------------------------

[5/5] Testing /chat - Off-Topic Query Handling...
    Query: 'What is Shoplite's headquarters address?'
⚠️ Top document distance (1.10) exceeds threshold. No relevant context found.


INFO:werkzeug:127.0.0.1 - - [02/Oct/2025 22:24:43] "POST /chat HTTP/1.1" 200 -


✅ Off-Topic Query PASSED
    Confidence: Low
----------------------------------------------------------------------

🎉 All tests completed!
