In [None]:
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI

Difference between a library, a dependency, and a component.

Library ---> A collection of pre-written code you can use. Here, LangChain is the main library.
Dependency ---> Any external library your project relies on to work. Here, LangChain, FAISS, OpenAI's API, etc. are dependencies.
Component/Module ---> A specific part (tool, class, or function) inside a library. TextLoader, CharacterTextSplitter, RetrievalQA, etc. are components from LangChain.

Example from your code:
from langchain.document_loaders import TextLoader

LangChain → the library (and dependency).
TextLoader → a component (or module) inside the LangChain library.

Quick Analogy:
Imagine LangChain is a car factory (library).
Your project depends on it to build a car (it's a dependency).
Inside the factory, you pick individual car parts (components) like engines, tires, or steering wheels — that’s like TextLoader or FAISS.
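
The same idea can be tried with Python's standard library, which needs no installation. This is only a small sketch: `collections` and `Counter` were picked for illustration and are not part of the notebook's code.

```python
# "collections" is a library (a module that ships with Python).
# "Counter" is one component (a class) inside that library.
from collections import Counter

words = ["chunk", "vector", "chunk", "query"]
counts = Counter(words)  # use the component we imported

print(counts["chunk"])
```

The import line has the same shape as the LangChain imports above: the library comes after `from`, the component after `import`.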


1. from langchain.document_loaders import TextLoader

What it does:
Loads plain text documents from files so they can be processed by the AI.
Simple terms:
Think of this like reading the content of a .txt file so the AI can use it.
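
To make the idea concrete, here is a rough sketch of what "loading a text file" boils down to: read the file and keep the text together with a note of where it came from. `SimpleDocument` and `load_text_file` are made-up names for this sketch; LangChain's real loader returns its own Document objects and may handle more cases (such as encodings).

```python
import os
import tempfile
from dataclasses import dataclass, field


@dataclass
class SimpleDocument:
    """Hypothetical stand-in for a loaded document: text plus metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def load_text_file(path: str) -> list[SimpleDocument]:
    """Read one text file and wrap its content in a single-item list."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    return [SimpleDocument(page_content=text, metadata={"source": path})]


# Demo with a temporary file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.txt")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("Refunds are processed within 14 days.")
    loaded = load_text_file(demo_path)

print(loaded[0].page_content)
```

Returning a list, even for one file, is what lets several files be combined with `extend` into one list of documents.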

2. from langchain.text_splitter import CharacterTextSplitter

What it does:
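Splits large blocks of text into smaller chunks whose size is measured in characters (for example, at most about 1000 characters each). By default it cuts at a separator such as a blank line rather than at an exact character count.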
Simple terms:
AI models work better with short pieces of text. This tool breaks long documents into bite-sized pieces.
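
A simplified sketch of chunking with overlap is shown below. It cuts at fixed character positions, which is an assumption made to keep the example short; the real CharacterTextSplitter works with separators, so its chunks will look different. The function name `split_by_characters` is hypothetical.

```python
def split_by_characters(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window moves forward
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
pieces = split_by_characters(sample, chunk_size=10, chunk_overlap=3)
print(pieces)
```

Each piece starts with the last three characters of the previous one. That overlap is there so a sentence cut at a chunk boundary still appears whole in at least one chunk.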

3. from langchain.vectorstores import FAISS

What it does:
Stores and searches vector representations (numerical form) of text using FAISS, a fast library for similarity search.
Simple terms:
It helps the AI remember and quickly find the most relevant pieces of your document when you ask a question.
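
The "find the most relevant pieces" step can be sketched as a nearest-neighbour search: store each text next to its vector, then return the texts whose vectors are closest to the query vector. The two-number vectors and the `search` function below are invented for illustration; FAISS does the same kind of comparison much faster and with far longer vectors.

```python
def squared_distance(a: list[float], b: list[float]) -> float:
    """Squared straight-line distance between two vectors (smaller means more similar)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search(index: list[tuple[list[float], str]], query_vector: list[float], k: int = 2) -> list[str]:
    """Return the texts of the k stored vectors closest to query_vector."""
    ranked = sorted(index, key=lambda item: squared_distance(item[0], query_vector))
    return [text for _, text in ranked[:k]]


# Hypothetical toy index: (vector, text) pairs.
toy_index = [
    ([1.0, 0.0], "Refund policy"),
    ([0.9, 0.1], "Return window"),
    ([0.0, 1.0], "Team practice schedule"),
]

nearest = search(toy_index, query_vector=[1.0, 0.05], k=2)
print(nearest)
```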

4. from langchain.embeddings import OpenAIEmbeddings

What it does:
Turns text into vectors (numbers) using OpenAI's embedding models.
Simple terms:
This is like converting sentences into a "language of numbers" that the AI can compare and understand.
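
Here is a toy version of "text into numbers". It counts words from a tiny, made-up vocabulary, which is far cruder than OpenAI's embedding models, but it shows why vectors are useful: texts about the same topic end up with similar numbers. `VOCABULARY`, `toy_embed` and `cosine_similarity` are names assumed for this sketch.

```python
import math

VOCABULARY = ["refund", "policy", "football", "team"]  # hypothetical tiny vocabulary


def toy_embed(text: str) -> list[float]:
    """Count how often each vocabulary word appears in the text."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCABULARY]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Similarity of two vectors by direction; 0.0 means nothing in common."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


question = toy_embed("what is the refund policy")
business = toy_embed("refund policy for customers")
sports = toy_embed("football team rules")

# The question is closer to the business text than to the sports text.
print(cosine_similarity(question, business) > cosine_similarity(question, sports))
```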

5. from langchain.chains import RetrievalQA

What it does:
Creates a question-answering system that retrieves relevant information from documents before answering.
Simple terms:
It lets you ask questions, and it will look through your loaded documents to give the best answer.
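
The "retrieve first, then answer" idea can be sketched without any AI model: pick the most relevant chunk, then place it in the prompt together with the question. Word overlap stands in for vector search here, and the prompt wording, `retrieve` and `build_prompt` are assumptions for this sketch, not LangChain's actual prompt or functions.

```python
def retrieve(chunks: list[str], question: str, k: int = 1) -> list[str]:
    """Rank chunks by how many words they share with the question (crude stand-in for vector search)."""
    question_words = set(question.lower().split())

    def overlap(chunk: str) -> int:
        return len(question_words & set(chunk.lower().split()))

    return sorted(chunks, key=overlap, reverse=True)[:k]


def build_prompt(context_chunks: list[str], question: str) -> str:
    """Put the retrieved chunks and the question into one prompt string."""
    context = "\n".join(context_chunks)
    return (
        "Use the context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


chunks = [
    "Refunds are issued within 14 days of purchase.",
    "Practice sessions are held every Tuesday.",
]
question = "When are refunds issued?"
prompt = build_prompt(retrieve(chunks, question), question)
print(prompt)
```

Only the refund chunk reaches the prompt, so the language model answers from the relevant text instead of from everything in the files.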

6. from langchain.llms import OpenAI

What it does:
Connects to OpenAI's text-completion language models to generate answers or responses. (Chat models such as GPT-4 are normally used through LangChain's separate ChatOpenAI class.)
Simple terms:
This is the brain of the system — the actual AI model that gives intelligent responses.

Summary:

You're building (or using) a system where:

You load a document.
Split it into chunks.
Convert the chunks into a searchable format.
Store them in a system that can quickly find relevant ones.
Let an OpenAI language model read the relevant parts.
And answer questions about them.

In [None]:
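# Install LangChain plus the packages the later cells rely on (FAISS, OpenAI client, tokenizer)
%pip install -U langchain langchain-community openai faiss-cpu tiktoken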

In [None]:
# List the text files to load
files = ["business-policy.txt", "lorem-ipsum.txt", "sports-policy.txt"]

In [None]:
files

In [None]:
# Load every file and collect the resulting documents in one list
documents = []
for file in files:
    loader = TextLoader(file)
    documents.extend(loader.load())

A for loop repeats a block of code once for each item in a sequence, such as a list, going through the items one by one. In the cell above, the loop body runs once for each file name in files.

You use a for loop when you want to do something for every item in a group.
For example:
Print every name in a list.
Go through all lines in a file.
Add up all numbers in a list.
Do something repeatedly a known number of times.
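
The first, third and fourth cases in the list can be written like this. The names and numbers are made up for the example.

```python
# Print every name in a list.
names = ["Asha", "Ben", "Chen"]
for name in names:
    print(name)

# Add up all numbers in a list.
numbers = [3, 5, 7]
total = 0
for number in numbers:
    total += number
print(total)

# Do something a known number of times (here: 3).
attempts = 0
for _ in range(3):
    attempts += 1
print(attempts)
```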

In [None]:
# Split the documents into chunks of up to about 500 characters, with 50 characters of overlap
splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=50)
docs = splitter.split_documents(documents)

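In [None]:
# Set the API key first: OpenAIEmbeddings and OpenAI look for it when they are created
import os
os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder; use your own key and keep it out of shared notebooks

In [None]:
# Create the embeddings and the FAISS vector store
embedding = OpenAIEmbeddings()
db = FAISS.from_documents(docs, embedding)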

In [None]:
# Create the retrieval-based QA chain
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    retriever=db.as_retriever()
)

In [None]:
# Ask a question about our files
query = " "  # put your question here
answer = qa.run(query)
print("Answer:", answer)