### Indexing
1. **Load** - Use Document Loaders to load the data
2. **Split** - Use Text Splitters to break large documents into smaller chunks
3. **Store** - Use an Embeddings Model and a Vector Store to embed, store, and index the chunks
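
To make the **Split** step concrete, here is a minimal sketch of what chunk size and chunk overlap mean. It uses a plain fixed-width sliding window, which is a deliberate simplification and not the separator-aware logic of `RecursiveCharacterTextSplitter`; the helper name `split_with_overlap` and the sample string are hypothetical.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into fixed-width windows; neighbors share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"  # illustrative input only
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=4)
print(chunks)
```

The overlap repeats the tail of one chunk at the head of the next, so a sentence that straddles a boundary still appears whole in at least one chunk.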

### Retrieval and generation
1. **Retrieve** - Based on the user input, retrieve the relevant chunks from the Vector Store using a Retriever. Similarity search typically ranks chunks by the cosine similarity between the query embedding and each chunk embedding
2. **Generate** - Pass the retrieved chunks along with the user query to the LLM/ChatModel to generate the response
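
The **Retrieve** step can be sketched without any library. The example below ranks a few chunks by cosine similarity against a query vector. The three-dimensional vectors and chunk labels are made up for illustration (real embeddings have hundreds or thousands of dimensions), and `top_k` is a hypothetical helper, not a LangChain or Chroma function. Which distance metric a given vector store uses by default depends on its configuration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k=2):
    """Return the texts of the k chunks most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy index of (chunk text, embedding) pairs; all values are invented
indexed = [
    ("chunk about authentication", [0.9, 0.1, 0.0]),
    ("chunk about middleware", [0.1, 0.9, 0.0]),
    ("chunk about validation", [0.5, 0.2, 0.7]),
]
query_vec = [1.0, 0.0, 0.0]  # pretend embedding of the user question

results = top_k(query_vec, indexed, k=2)
print(results)
```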


![alt text](images/image.png)

In [1]:
from dotenv import load_dotenv
from langchain_chroma import Chroma
from langchain_community.document_loaders import TextLoader
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

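# Load environment variables such as OPENAI_API_KEY from a local .env file
load_dotenv()

# Embedding model used to vectorize both the chunks and the query
embeddings = OpenAIEmbeddings()
# Chat model used to generate the final answer
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0.7)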


In [2]:
loader = TextLoader("./technical.txt")
docs = loader.load()

In [6]:
docs

[Document(metadata={'source': './technical.txt'}, page_content=".Net & .Net Core Web API:\n.NET Framework vs .NET Core: .NET Core is cross-platform, open-source, and optimized for microservices and high-performance applications.\nDependency Injection (DI): Built-in DI in .NET Core simplifies testability and decouples components.\nMiddleware: Custom middleware in .NET Core is used to handle HTTP requests and responses.\nVersioning in Web API: Best practices for versioning APIs include URL path versioning, query string versioning, and header versioning.\nAsynchronous Programming: Use of async/await for non-blocking I/O operations to enhance performance.\n\n\nFastAPI:\nFastAPI Core Strengths: High performance, type hints for request validation, and auto-generated OpenAPI documentation.\nDependency Injection in FastAPI: Simplifies code testing and component decoupling.\nData Validation: Use Pydantic for automatic request body validation, type checking, and schema generation.\nConcurrency: 

In [7]:
len(docs)

1

In [5]:
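# Split into chunks of roughly 200 characters, with up to 40 characters shared between neighbors
text_splitter = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=40)
chunks = text_splitter.split_documents(docs)
# Embed each chunk and index it in a Chroma collection
vector_store = Chroma.from_documents(documents=chunks, embedding=embeddings)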

In [8]:
from langchain_core.prompts import ChatPromptTemplate
template = """You are a helpful assistant. Answer the question using the context below.
    Question: {question}
    Context: {context}
    Answer:
"""
prompt = ChatPromptTemplate.from_template(template)

In [9]:
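from langchain import hub

# Note: this reassigns `prompt`, so the chain below uses the "rlm/rag-prompt"
# pulled from LangChain Hub instead of the custom template defined above.
prompt = hub.pull("rlm/rag-prompt")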

In [11]:
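# Expose the vector store as a retriever that returns the most similar chunks for a query
retriever = vector_store.as_retriever()

def format_docs(docs):
    # Join the retrieved chunks into a single context string
    return "\n\n".join(doc.page_content for doc in docs)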

In [12]:
from langchain_core.runnables import RunnablePassthrough, RunnableParallel
from langchain_core.output_parsers import StrOutputParser

rag_chain = (
    # Build the prompt inputs: retrieved context plus the unchanged question
    RunnableParallel({"context": retriever | format_docs, "question": RunnablePassthrough()})
    | prompt
    | llm
    | StrOutputParser()  # extract the reply text from the chat message
)
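
To show how data flows through the chain above, here is a rough plain-Python equivalent with no LangChain calls. Every function in it (`fake_retriever`, `join_chunks`, `fake_prompt`, `fake_llm`, `rag_pipeline`) is a hypothetical stub that stands in for the real retriever, prompt, and model, so it only illustrates the wiring, not the actual behavior of those components.

```python
def fake_retriever(question: str) -> list[str]:
    # Stub: a real retriever would return the chunks most similar to the question
    return ["chunk A", "chunk B"]


def join_chunks(chunks: list[str]) -> str:
    # Same role as format_docs: merge the chunks into one context string
    return "\n\n".join(chunks)


def fake_prompt(inputs: dict) -> str:
    # Stub: fill a template with the two keys produced by the parallel step
    return f"Context:\n{inputs['context']}\n\nQuestion: {inputs['question']}"


def fake_llm(prompt_text: str) -> str:
    # Stub: a real chat model would generate an answer from the prompt
    return "STUB ANSWER <- " + prompt_text


def rag_pipeline(question: str) -> str:
    # Parallel step: both branches receive the same input (the question)
    inputs = {
        "context": join_chunks(fake_retriever(question)),  # retriever | format_docs
        "question": question,                               # passthrough
    }
    # Sequential steps: prompt, then model; the stub already returns a string
    return fake_llm(fake_prompt(inputs))


result = rag_pipeline("What is chunking?")
print(result)
```

The key point is that the question is used twice: once to fetch context and once, unchanged, as the question slot in the prompt.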

In [13]:
rag_chain.invoke("What options available for Authentication in .Net or Python FastAPI?")

'In FastAPI, the options available for authentication include OAuth2 and JWT token-based authentication. These features help in verifying the identity of end-users based on authentication performed by an authorization server. FastAPI also supports data validation using Pydantic for automatic request body validation and schema generation.'