#### Core resources
- LangChain: https://www.langchain.com/
- LangChain Google Gemini RAG: https://python.langchain.com/docs/tutorials/rag/
- LangChain PDF RAG: https://python.langchain.com/docs/tutorials/pdf_qa/
- Google embedding model: https://python.langchain.com/docs/integrations/text_embedding/google_generative_ai/
- Chroma vector DB: https://python.langchain.com/docs/integrations/vectorstores/chroma/
- RAG prompt: https://smith.langchain.com/hub/rlm/rag-prompt

#### Additional resources
- How to use Google Colab: https://colab.research.google.com/drive/16pBJQePbqkz3QFV54L4NIkOn1kwpuRrj
- Adding memory: https://python.langchain.com/docs/how_to/message_history/
- Different ways to retrieve data from documents: https://python.langchain.com/docs/concepts/retrieval/
- Adding references to the information source: https://python.langchain.com/docs/how_to/qa_sources/

### Enter your Google API key

In [None]:
import getpass
import os

if "GOOGLE_API_KEY" not in os.environ:
    os.environ["GOOGLE_API_KEY"] = getpass.getpass("Enter your Google AI API key: ")

### Install the required components

In [None]:
%pip install --quiet --upgrade langchain langchain-community langchain-chroma langchain-google-genai pypdf

### Configure the model

In [None]:
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(
    model="gemini-1.5-flash-001",
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
)

In [None]:
import bs4
from langchain import hub
from langchain_chroma import Chroma

from langchain_community.document_loaders import WebBaseLoader
from langchain_community.document_loaders import PyPDFLoader

from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

### Download the documents

In [None]:
%pip install --quiet gdown

In [None]:
!mkdir -p /example

In [None]:
import gdown

url = 'https://drive.google.com/uc?id=1AaXVGHdUjU8mK96DV9g_myQiAs62HzIm'
gdown.download(url, '/example/some_text.pdf', quiet=False)

### Process the documents

##### From the web

In [None]:
# Load from web
loader_web = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs_web = loader_web.load()

##### From PDF

In [None]:
# Load from PDF
file_path = "/example/some_text.pdf"
loader_pdf = PyPDFLoader(file_path)

docs_pdf = loader_pdf.load()

In [None]:
docs = docs_web + docs_pdf

### Convert the document content to vectors

In [None]:
# Split the documents into overlapping chunks of up to 1000 characters.
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)

# Embed each chunk and store the vectors in a Chroma vector store.
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
vectorstore = Chroma.from_documents(documents=splits, embedding=embeddings)

# Create a retriever over the stored chunks (web page and PDF) and pull the RAG prompt.
retriever = vectorstore.as_retriever()
prompt = hub.pull("rlm/rag-prompt")
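
To make the `chunk_size` and `chunk_overlap` settings above more concrete, here is a minimal plain-Python sketch of fixed-size chunking with overlap. It is a simplification: `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraphs and sentences, so its real chunks will differ. The helper name `sliding_chunks` and the sample string are hypothetical and not part of LangChain.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper for illustration only; not LangChain's algorithm.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    # Each new window starts this many characters after the previous one.
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
for chunk in sliding_chunks(sample, chunk_size=10, chunk_overlap=4):
    print(chunk)
```

With these toy values, each chunk repeats the last four characters of the previous one, which is the same idea as the 200-character overlap used in the cell above: text near a chunk boundary still appears with some of its surrounding context.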


### Build the document chatbot

In [None]:
def format_docs(docs):
    """Join the retrieved chunks into one context string for the prompt."""
    return "\n\n".join(doc.page_content for doc in docs)


rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
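
The chain above sends the question to the retriever, joins the retrieved chunks with `format_docs`, and passes the result to the prompt as `context`. The sketch below mimics only the retrieval step in plain Python, ranking chunks by cosine similarity over tiny hand-made vectors. The vectors, the sample texts, and the `top_k` helper are all hypothetical; in the notebook, the embeddings come from the Google embedding model and the search is handled by Chroma.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k=2):
    """Return the texts of the k chunks most similar to the query vector."""
    ranked = sorted(
        indexed,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Hypothetical (text, vector) pairs standing in for embedded chunks.
indexed_chunks = [
    ("Agents plan tasks step by step.", [0.9, 0.1, 0.0]),
    ("Memory stores past interactions.", [0.1, 0.9, 0.0]),
    ("Tools let agents call external APIs.", [0.7, 0.0, 0.3]),
]
query_vector = [1.0, 0.0, 0.1]  # hypothetical embedding of the question

# Same joining rule as format_docs: chunks separated by blank lines.
context = "\n\n".join(top_k(query_vector, indexed_chunks, k=2))
print(context)
```

The joined string plays the role of the `context` value that the prompt receives, while the original question is passed through unchanged.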

### Ask a question

In [None]:
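def query_llm(question):
    """Run the RAG chain on the question and print the answer."""
    # StrOutputParser makes the chain return a plain string.
    answer = rag_chain.invoke(question)
    print(answer)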

In [None]:
question = input("Please enter your question: ")
print(f"Question: {question}")

query_llm(question)