<a href="https://colab.research.google.com/github/im-nandha/LLM-Powered-Booking-Analytics-QA-System/blob/main/llm_based_question_answering_fast_api.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
%pip install fastapi uvicorn transformers sentence-transformers faiss-cpu nest_asyncio pandas



In [4]:
import nest_asyncio
import uvicorn
from fastapi import FastAPI
from pydantic import BaseModel
import pandas as pd
import faiss
import numpy as np
from transformers import pipeline
from sentence_transformers import SentenceTransformer

# Patch asyncio so uvicorn can start inside the notebook's already-running event loop
nest_asyncio.apply()

# Initialize FastAPI app
app = FastAPI()

# Load dataset
df = pd.read_csv("hotel_bookings1.csv")

# Convert relevant columns to a single text format for retrieval
df["combined_text"] = df.apply(lambda row:
    f"Hotel: {row['hotel']}, Canceled: {row['is_canceled']}, Lead Time: {row['lead_time']} days, "
    f"Arrival: {row['arrival_date_month']} {row['arrival_date_year']}, Price: ${row['adr']}, "
    f"Country: {row['country']}", axis=1)

# Load Sentence Transformer model for text embeddings
embedding_model = SentenceTransformer("all-MiniLM-L6-v2")

# Generate embeddings for dataset
embeddings = embedding_model.encode(df["combined_text"].tolist(), convert_to_tensor=False)

# Store embeddings in a flat (exact-search) FAISS L2 index; FAISS expects float32 input
embeddings = np.asarray(embeddings, dtype="float32")
dimension = embeddings.shape[1]
faiss_index = faiss.IndexFlatL2(dimension)
faiss_index.add(embeddings)

# Load an extractive question-answering model (RoBERTa fine-tuned on SQuAD 2.0)
qa_pipeline = pipeline("question-answering", model="deepset/roberta-base-squad2")

# Define request model
class QueryRequest(BaseModel):
    question: str

@app.get("/")
def home():
    return {"message": "Hotel Booking QA API is running!"}

@app.post("/ask")
def ask_question(query: QueryRequest):
    # Convert the question to a float32 embedding of shape (1, dimension)
    query_embedding = np.asarray(embedding_model.encode([query.question]), dtype="float32")

    # Find the index of the single nearest row (k=1)
    _, closest_doc_idx = faiss_index.search(query_embedding, k=1)

    # Retrieve the best-matching text from the dataset
    best_match = df.iloc[int(closest_doc_idx[0][0])]["combined_text"]

    # Use the retrieved text as context; the QA model extracts its answer span from it
    response = qa_pipeline(question=query.question, context=best_match)

    return {"question": query.question, "answer": response["answer"], "context": best_match}

# Start the server from the notebook; this call blocks the cell until the server stops
uvicorn.run(app, host="127.0.0.1", port=8000)


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Device set to use cpu
INFO:     Started server process [303]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [303]
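
The log above shows the server listening on http://127.0.0.1:8000 until it was shut down. Because `uvicorn.run` blocks the cell, a request has to come from another process on the same machine while the server is still up. The following is a minimal sketch of such a client using only the standard library. The helper names `build_ask_request` and `ask` are hypothetical and not part of the notebook; the JSON body mirrors the `QueryRequest` model defined above.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # host and port from the uvicorn.run call


def build_ask_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request whose JSON body matches the QueryRequest model."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, base_url: str = BASE_URL, timeout: float = 30.0) -> dict:
    """Send the question to /ask and decode the JSON reply."""
    request = build_ask_request(question, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, calling `ask(...)` with a question string should return a dict with the keys `question`, `answer`, and `context`, matching the return statement of `ask_question`.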

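The `/ask` handler above relies on `faiss.IndexFlatL2` with `k=1`, which performs an exact search: the query vector is compared against every stored vector and the one with the smallest squared Euclidean distance wins. The sketch below reproduces that idea in plain Python to make the retrieval step easier to follow. It is an illustration, not the FAISS implementation; the function names and the toy 3-dimensional vectors are made up for this example (the real embeddings from `all-MiniLM-L6-v2` have 384 dimensions).

```python
from typing import List, Sequence, Tuple


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance, the quantity an exact L2 index ranks by."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_row(query: Sequence[float], rows: List[Sequence[float]]) -> Tuple[int, float]:
    """Return (index, distance) of the stored vector closest to the query (k=1)."""
    if not rows:
        raise ValueError("index is empty")
    distances = [squared_l2(query, row) for row in rows]
    best = min(range(len(rows)), key=distances.__getitem__)
    return best, distances[best]


# Toy vectors standing in for row embeddings (hypothetical values).
toy_rows = [
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
]
toy_query = [0.9, 0.1, 0.0]

idx, dist = nearest_row(toy_query, toy_rows)
print(idx, round(dist, 2))  # prints: 1 0.02
```

Since only the single nearest row is passed to the QA model as context, the answer can only be a span taken from that one booking record. Questions that need information from several rows, such as an average price, cannot be answered from this context.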