### 📂 Load CTSE Lecture Notes (PDFs)

This segment loads every CTSE lecture-note PDF in the `ctse_lectures` folder with `PyPDFLoader` from `langchain_community`. It iterates over the files in the folder, loads each PDF (producing one document per page), and collects the results in a single list, `docs`, for further processing.


In [1]:
import os
from langchain_community.document_loaders import PyPDFLoader

# Load all PDFs from the folder
folder_path = "./ctse_lectures"
docs = []

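# Sort for a reproducible load order; accept the .pdf extension in any letter case
for filename in sorted(os.listdir(folder_path)):
    if filename.lower().endswith(".pdf"):
        loader = PyPDFLoader(os.path.join(folder_path, filename))
        docs.extend(loader.load())  # one Document per PDF page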
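
A quick sanity check after loading is to count how many page-level documents came from each file. The sketch below assumes each document carries a `metadata` dict with a `source` entry, as `PyPDFLoader` documents normally do. The helper name `pages_per_source` and the sample file names are hypothetical, and stand-in objects are used so the snippet runs without the real PDFs.

```python
from collections import Counter
from types import SimpleNamespace


def pages_per_source(documents):
    """Count how many page-level documents came from each source file."""
    return Counter(doc.metadata.get("source", "unknown") for doc in documents)


# Stand-in objects (hypothetical file names) so the sketch runs on its own.
# In the notebook, pass the real `docs` list instead.
sample_docs = [
    SimpleNamespace(metadata={"source": "./ctse_lectures/lecture_a.pdf", "page": 0}),
    SimpleNamespace(metadata={"source": "./ctse_lectures/lecture_a.pdf", "page": 1}),
    SimpleNamespace(metadata={"source": "./ctse_lectures/lecture_b.pdf", "page": 0}),
]

counts = pages_per_source(sample_docs)
for source, n_pages in sorted(counts.items()):
    print(f"{source}: {n_pages} page(s)")
```

A file that reports zero or unexpectedly few pages is usually worth opening by hand before moving on to chunking.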


### 🔗 Chunk Lecture Content and Create Vector Store

This section splits the loaded documents into smaller, overlapping chunks using `RecursiveCharacterTextSplitter`. It then embeds the chunks using `OpenAIEmbeddings` and stores them in a FAISS vector store, which enables fast semantic search and retrieval of relevant content during question answering.
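
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a minimal sliding-window sketch. It is a simplification, not the algorithm `RecursiveCharacterTextSplitter` uses (that splitter prefers to break on separators such as paragraph and line breaks); the function name `chunk_with_overlap` is hypothetical.

```python
def chunk_with_overlap(text, chunk_size, chunk_overlap):
    """Split text into fixed-size windows that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = chunk_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
```

Each chunk begins with the last three characters of the previous one. That shared text is what lets a sentence that straddles a chunk boundary still appear whole in at least one chunk.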


In [2]:
# Imports (OpenAIEmbeddings is imported from langchain_openai; the langchain_community version is deprecated)
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from dotenv import load_dotenv
import os

# Load environment variables from .env file
load_dotenv()

# Retrieve OpenAI API Key from environment variable
openai_api_key = os.getenv("OPENAI_API_KEY")

# Split documents into chunks
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
split_docs = splitter.split_documents(docs)

# Create vector store
embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)
vectorstore = FAISS.from_documents(split_docs, embeddings)
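
Conceptually, retrieval from the vector store means finding the stored chunk vectors closest to the query vector. The toy sketch below does this by brute force with Euclidean distance on hand-written 2-D vectors. It is only an illustration: real embeddings have far more dimensions, FAISS uses optimized index structures rather than a Python loop, and the choice of Euclidean distance here is an assumption about the default configuration. The name `nearest_chunks` is hypothetical.

```python
import math


def nearest_chunks(query_vector, chunk_vectors, k=2):
    """Return the indices of the k vectors closest to query_vector."""
    distances = [
        (math.dist(query_vector, vector), index)
        for index, vector in enumerate(chunk_vectors)
    ]
    # Sorting the (distance, index) pairs puts the closest vectors first
    return [index for _, index in sorted(distances)[:k]]


# Made-up vectors standing in for the embeddings of three chunks
toy_vectors = [
    (1.0, 0.0),  # chunk 0
    (0.0, 1.0),  # chunk 1
    (0.9, 0.1),  # chunk 2
]

top = nearest_chunks((1.0, 0.05), toy_vectors, k=2)
print(top)
```

Chunks 0 and 2 point in nearly the same direction as the query, so they are returned ahead of chunk 1.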



### 💬 Set Up Chatbot and Query the Lecture Notes

This segment initializes a `ChatOpenAI` LLM, connects it to the FAISS retriever through `RetrievalQA`, and runs a query against the lecture notes. The answer is wrapped with `textwrap.fill` so that it displays cleanly. Change the query text to ask other lecture-related questions.


In [3]:
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI
import textwrap
from dotenv import load_dotenv  # Import dotenv to load environment variables
import os

# Load environment variables from .env file
load_dotenv()

# Retrieve OpenAI API Key from environment variable
openai_api_key = os.getenv("OPENAI_API_KEY")

llm = ChatOpenAI(openai_api_key=openai_api_key)
retriever = vectorstore.as_retriever()

qa = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)

# Example query; invoke() replaces the deprecated run() and returns a dict
response = qa.invoke({"query": "What is the CAP THEOREM?"})
print(textwrap.fill(response["result"], width=100))




The CAP Theorem, also known as Brewer's Theorem, is a fundamental concept in distributed systems. It
states that in a distributed system, it is impossible to simultaneously achieve all three of the
following properties: consistency, availability, and partition tolerance. Instead, a system can only
have at most two out of these three properties at any given time.
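
When checking an answer against the lecture notes, it helps to see which pages it was drawn from. The sketch below assumes the chain was built with `return_source_documents=True`, in which case the result dict is expected to contain a `source_documents` list next to `result`. The helper `format_answer` is hypothetical, and a stubbed result with a made-up file name is used so the snippet runs without an API key.

```python
import textwrap
from types import SimpleNamespace


def format_answer(result, width=100):
    """Wrap the answer text and list the distinct (source, page) pairs behind it."""
    answer = textwrap.fill(result["result"], width=width)
    sources = sorted({
        (doc.metadata.get("source", "unknown"), doc.metadata.get("page", -1))
        for doc in result.get("source_documents", [])
    })
    lines = [answer, "", "Sources:"]
    lines += [f"- {source}, page index {page}" for source, page in sources]
    return "\n".join(lines)


# Stubbed result (hypothetical file name) shaped like the chain's output dict.
stub_result = {
    "result": "The CAP Theorem concerns consistency, availability, and partition tolerance.",
    "source_documents": [
        SimpleNamespace(metadata={"source": "./ctse_lectures/lecture_a.pdf", "page": 3}),
        SimpleNamespace(metadata={"source": "./ctse_lectures/lecture_a.pdf", "page": 3}),
        SimpleNamespace(metadata={"source": "./ctse_lectures/lecture_a.pdf", "page": 4}),
    ],
}

formatted = format_answer(stub_result)
print(formatted)
```

In the notebook, this would be used as `print(format_answer(qa.invoke({"query": "..."})))` once the chain is created with `return_source_documents=True`. Page indices are reported as stored in the metadata, which for `PyPDFLoader` are typically zero-based.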
