### Vector Store
A database that stores text as vectors (embeddings) so that text can be found by meaning rather than by exact keywords, e.g. FAISS, Chroma.
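
As a rough illustration of what finding text by meaning involves, the sketch below keeps (text, vector) pairs in memory and ranks them by cosine similarity to a query vector. `ToyVectorStore` and the hand-written 2-D vectors are assumptions made for this example; real stores such as FAISS or Chroma hold high-dimensional embeddings produced by a model and use optimized indexes.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical in-memory store: a list of (text, vector) pairs."""

    def __init__(self):
        self.items = []

    def add(self, text, vector):
        self.items.append((text, vector))

    def similarity_search(self, query_vector, k=2):
        # Rank every stored text by similarity to the query, best first.
        ranked = sorted(
            self.items,
            key=lambda item: cosine_similarity(query_vector, item[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore()
# Hand-made vectors standing in for real embeddings.
store.add("cats are small pets", [0.9, 0.1])
store.add("dogs are loyal pets", [0.8, 0.3])
store.add("stocks fell on Monday", [0.1, 0.9])

top = store.similarity_search([1.0, 0.0], k=2)
print(top)
```

The two pet sentences point in nearly the same direction as the query vector, so they rank ahead of the unrelated sentence.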


### Retriever
Takes your question, converts it to an embedding, searches the vector store, and returns the top k relevant documents. It does not generate an answer; it is the retrieval step used inside RAG.
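
The retriever steps described above (embed the question, search, return the top k documents) can be sketched as a thin wrapper that never generates text. This is illustrative only: `make_retriever`, `toy_embed`, and the keyword vocabulary are hypothetical stand-ins, and a real retriever would call an embedding model and a vector store instead.

```python
def toy_embed(text, vocabulary=("learning", "intelligence", "science")):
    """Crude stand-in for an embedding model: counts vocabulary words."""
    words = text.lower().split()
    return [words.count(term) for term in vocabulary]


def make_retriever(docs, embed, k=2):
    """Return a function that maps a question to its k best-matching docs."""
    indexed = [(doc, embed(doc)) for doc in docs]

    def retrieve(question):
        query = embed(question)
        # Score each document by dot product with the question vector.
        scored = sorted(
            indexed,
            key=lambda pair: sum(q * d for q, d in zip(query, pair[1])),
            reverse=True,
        )
        return [doc for doc, _ in scored[:k]]  # documents only, no answer

    return retrieve


docs = [
    "machine learning is a branch of computer science",
    "the goal of artificial intelligence is to mimic human intelligence",
    "bread is baked in an oven",
]
retrieve = make_retriever(docs, toy_embed, k=1)
result = retrieve("what is machine learning")
print(result)
```

The returned value is a list of documents; turning them into an answer is left to the language model later in the chain.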

### Documents
LangChain implements a Document abstraction with two attributes:
- page_content: a string holding the actual text.
- metadata: a dict of details about the text, such as source, author, creation_date, etc.

In [3]:
from langchain_core.documents import Document
documents=[
    Document(
        page_content="machine learning is branch of computer science",
        metadata={"source":"AI-doc"}
    ),
    Document(
        page_content="AI is branch of computer science",
        metadata={"source":"Artificial-doc"}
    ),
    Document(
        page_content="The goal of Arificial intelligence is to mimic human intelligence",
        metadata={"source":"intelligence-doc"}
    )
]

In [4]:
documents

[Document(metadata={'source': 'AI-doc'}, page_content='machine learning is branch of computer science'),
 Document(metadata={'source': 'Artificial-doc'}, page_content='AI is branch of computer science'),
 Document(metadata={'source': 'intelligence-doc'}, page_content='The goal of Arificial intelligence is to mimic human intelligence')]

In [9]:
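import os
from dotenv import load_dotenv
from langchain_groq import ChatGroq

load_dotenv()  # read GROQ_API_KEY and HUGGINGFACE_TOKEN from a local .env file
groq_api_key=os.getenv("GROQ_API_KEY")
hf_token=os.getenv("HUGGINGFACE_TOKEN")
if hf_token:  # os.environ only accepts strings, so skip the assignment when the token is unset
    os.environ["HF_TOKEN"]=hf_token

llm=ChatGroq(model="llama-3.1-8b-instant",groq_api_key=groq_api_key)
llm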

ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x000001280EADFFD0>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x000001280EADFD30>, model_name='llama-3.1-8b-instant', model_kwargs={}, groq_api_key=SecretStr('**********'))

In [None]:
from langchain_huggingface import HuggingFaceEmbeddings
# all-MiniLM-L6-v2 is a sentence-transformers model that maps sentences and paragraphs to a 384-dimensional dense vector space
embeddings=HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")


In [12]:
# Vector store: embed the documents and index them in Chroma
from langchain_chroma import Chroma
vectorstore=Chroma.from_documents(documents,embedding=embeddings)
vectorstore

<langchain_chroma.vectorstores.Chroma at 0x1281c435ab0>

In [14]:
vectorstore.similarity_search("machine learning")

[Document(id='d438af11-3351-44c5-9716-80731d195231', metadata={'source': 'AI-doc'}, page_content='machine learning is branch of computer science'),
 Document(id='0520c20c-e389-4994-a6ed-86dbe23ddc9f', metadata={'source': 'Artificial-doc'}, page_content='AI is branch of computer science'),
 Document(id='187f176b-9cc4-420f-a18a-0e48bad26aaa', metadata={'source': 'intelligence-doc'}, page_content='The goal of Arificial intelligence is to mimic human intelligence')]

In [15]:
vectorstore.similarity_search_with_score("machine learning")

[(Document(id='d438af11-3351-44c5-9716-80731d195231', metadata={'source': 'AI-doc'}, page_content='machine learning is branch of computer science'),
  0.7148762941360474),
 (Document(id='0520c20c-e389-4994-a6ed-86dbe23ddc9f', metadata={'source': 'Artificial-doc'}, page_content='AI is branch of computer science'),
  1.2551571130752563),
 (Document(id='187f176b-9cc4-420f-a18a-0e48bad26aaa', metadata={'source': 'intelligence-doc'}, page_content='The goal of Arificial intelligence is to mimic human intelligence'),
  1.6337151527404785)]
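
The scores above increase as the documents become less relevant, which suggests they are distances rather than similarities: smaller means closer. As a minimal sketch, assuming the store reports squared Euclidean (L2) distance and the embeddings are unit-length (both are assumptions about this setup, not something the output confirms), distance and cosine similarity are linked by d = 2 - 2*cos. The vectors below are made up for illustration.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_l2(a, b):
    """Squared Euclidean distance: 0 for identical vectors, larger when further apart."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity of two unit-length vectors (their dot product)."""
    return sum(x * y for x, y in zip(a, b))


# Made-up vectors standing in for a query and two documents.
query = normalize([1.0, 0.2, 0.0])
near = normalize([0.9, 0.3, 0.1])
far = normalize([0.0, 0.2, 1.0])

for name, vec in [("near", near), ("far", far)]:
    d = squared_l2(query, vec)
    c = cosine(query, vec)
    # For unit vectors, squared L2 distance equals 2 - 2 * cosine similarity.
    print(f"{name}: distance={d:.3f} cosine={c:.3f} 2-2cos={2 - 2 * c:.3f}")
```

Under these assumptions a distance near 0 means the texts are almost identical in meaning, and the value grows toward 2 as the vectors become unrelated.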

In [16]:
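# Retriever: takes a question and returns the top k relevant chunks
retriver=vectorstore.as_retriever()
# Pass search_kwargs={"k": 2} to as_retriever() to change how many chunks are returned (the default is 4)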


In [19]:
from langchain_core.prompts import ChatPromptTemplate
prompt=ChatPromptTemplate.from_template("""
use the context below to answer the question.
Context:{context}
Question:{question}
""")
# The template expects two variables: {context} (documents from the retriever) and {question} (the user's question)

In [20]:
from langchain_core.runnables import RunnableLambda
get_question=RunnableLambda(lambda x:x['question'])

In [21]:
get_context=RunnableLambda(lambda x: retriver.invoke(x['question']))  # takes the question, runs the retriever, returns the matching documents

In [22]:
from langchain_core.runnables import RunnableParallel
# Run both branches on the same input and collect the results into one dict
rag_input=RunnableParallel({
    "context":get_context,
    "question":get_question
})

In [23]:
rag_chain=rag_input|prompt|llm
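
To make the data flow of `rag_input|prompt|llm` easier to follow, here is a plain-Python sketch of the same shape with no LangChain involved. `run_parallel`, `fake_retriever`, `fill_prompt`, and `fake_llm` are hypothetical stand-ins invented for this example: the parallel step hands the same input dict to every branch and collects the results in a new dict, and each later step receives the previous step's output.

```python
def run_parallel(branches, value):
    """Call every branch with the same input and collect results by key."""
    return {key: fn(value) for key, fn in branches.items()}


def fake_retriever(question):
    # Stand-in for the retriever: always returns the same context.
    return ["AI is branch of computer science"]


def fill_prompt(inputs):
    # Stand-in for the prompt template: fills {context} and {question}.
    return (
        "use the context below to answer the question.\n"
        f"Context:{inputs['context']}\n"
        f"Question:{inputs['question']}"
    )


def fake_llm(prompt_text):
    # Stand-in for the chat model: reports what it was given.
    return f"(model received {len(prompt_text.splitlines())} prompt lines)"


branches = {
    "context": lambda x: fake_retriever(x["question"]),
    "question": lambda x: x["question"],
}

step1 = run_parallel(branches, {"question": "what is AI"})
step2 = fill_prompt(step1)
step3 = fake_llm(step2)
print(step1)
print(step2)
print(step3)
```

Printing each intermediate value shows that the prompt step only ever sees the dict built by the parallel step, which is why its keys must match the template variables.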

In [26]:
response=rag_chain.invoke({"question":"what is artificial intelligence"})
print(response.content)

Based on the given context, there are two pieces of information related to the definition of Artificial Intelligence:

1. From Document(id='0520c20c-e389-4994-a6ed-86dbe23ddc9f', metadata={'source': 'Artificial-doc'}, page_content='AI is branch of computer science'): 
   This document states that Artificial Intelligence is a branch of computer science.

2. From Document(id='187f176b-9cc4-420f-a18a-0e48bad26aaa', metadata={'source': 'intelligence-doc'}, page_content='The goal of Artificial intelligence is to mimic human intelligence'): 
   This document states that the goal of Artificial Intelligence is to mimic human intelligence.

These two pieces of information provide a comprehensive understanding of Artificial Intelligence. It's a branch of computer science that aims to mimic human intelligence.
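
The answer above quotes raw `Document(id=..., metadata=...)` text because the retrieved documents are placed into `{context}` as Python objects. One common remedy is to join only the text of each document before filling the prompt. The sketch below shows the idea with a small stand-in class; `Doc` and `format_docs` are hypothetical names for this example, and `Doc` only mimics the `page_content` and `metadata` attributes described earlier.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in with the same two attributes as a LangChain Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs):
    """Join the text of each document, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs)


retrieved = [
    Doc("AI is branch of computer science", {"source": "Artificial-doc"}),
    Doc("machine learning is branch of computer science", {"source": "AI-doc"}),
]
context = format_docs(retrieved)
print(context)
```

In the notebook, a helper like this could be applied to the result of `retriver.invoke(...)` inside `get_context`, so the prompt receives plain text instead of object representations.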
