In [None]:
!pip install --user --upgrade langchain python-dotenv google-generativeai langchain-google-genai langchain-community youtube-transcript-api chromadb

In [1]:
!rm -rf chroma_db/chroma_db_pdf_file
!mkdir -p chroma_db/chroma_db_pdf_file

In [2]:
GOOGLE_API_KEY='INSERT THE GOOGLE API KEY'
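
Hard-coding the key in a cell makes it easy to leak when the notebook is shared. A minimal sketch of reading it from the environment instead, assuming the key is stored in a variable named `GOOGLE_API_KEY`. The helper name `read_api_key` is hypothetical and not part of any library; `python-dotenv`, installed above, is used only if it is importable.

```python
import os

def read_api_key(name="GOOGLE_API_KEY", env=None):
    """Return the API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable first")
    return key

# Optional: load a local .env file when python-dotenv is available.
try:
    from dotenv import load_dotenv
    load_dotenv()
except ImportError:
    pass  # fall back to variables already set in the shell
```

With this in place, the assignment above could become `GOOGLE_API_KEY = read_api_key()`.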

# Set Up Environment

## Import Packages

In [3]:
import os
from operator import itemgetter
from langchain.chains import LLMChain
from langchain.load import dumps, loads
from langchain.vectorstores import Chroma
from langchain.schema import StrOutputParser
from langchain.schema.runnable import RunnablePassthrough
from langchain_community.document_loaders import PyPDFLoader
from langchain.prompts import PromptTemplate, ChatPromptTemplate
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.docstore.document import Document as LangchainDocument
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings

## Load & Process Dataset, Then Create Retriever Object

In [4]:
def create_retriever_from_pdf_file(pdf_filename, k):
    loader = PyPDFLoader(pdf_filename)
    pages = loader.load_and_split()
    
    # split the pages into smaller documents, each holding one chunk of the text
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    docs = text_splitter.split_documents(pages)
    for d in docs:
        d.page_content = d.page_content.replace('\n', ' ')
    
    # embed the documents and store them in a vector database
    gemini_embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001", google_api_key=GOOGLE_API_KEY)

    vectorstore = Chroma.from_documents(
                     documents=docs,                 # Data
                     embedding=gemini_embeddings,    # Embedding model
                     persist_directory="chroma_db/chroma_db_pdf_file" # Directory to save data
                     )
    vectorstore_disk = Chroma(
                        persist_directory="chroma_db/chroma_db_pdf_file",       # Directory of db
                        embedding_function=gemini_embeddings   # Embedding model
                   )
    
    # a vector store retriever to retrieve the embedded documents
    retriever = vectorstore_disk.as_retriever(search_kwargs={"k": k})
    
    return retriever
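
The `chunk_size=500` and `chunk_overlap=50` arguments control how much text each chunk holds and how much neighbouring chunks share. The sketch below illustrates the overlap idea with a plain fixed-size sliding window. It is a simplification under that assumption, not the algorithm `RecursiveCharacterTextSplitter` actually uses (which also tries to split on separators), and the name `sliding_chunks` is hypothetical.

```python
def sliding_chunks(text, chunk_size=500, chunk_overlap=50):
    """Split text into fixed-size windows that overlap by chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # stop once the current window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks

chunks = sliding_chunks("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=3)
print(chunks)  # ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

The last three characters of each chunk reappear at the start of the next one, which is why the retrieved chunks printed later in this notebook begin with the tail of the previous chunk.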

In [5]:
pdf_filename = "History of LLM.pdf"
retriever = create_retriever_from_pdf_file(pdf_filename, 3)



In [6]:
simple_query = 'What is a large language model?'
complex_query = 'Explain to me the history of the large language model'
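
The retriever embeds a query and returns the `k` stored chunks whose embeddings are closest to it. A toy sketch of that ranking step, assuming cosine similarity and hand-written two-dimensional vectors for illustration; the distance metric Chroma uses by default may differ, and the names `cosine_similarity`, `top_k` and `toy_vectors` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=3):
    """Return the ids of the k vectors most similar to query_vec, best first."""
    scored = sorted(
        doc_vecs.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in scored[:k]]

toy_vectors = {
    "chunk_a": [1.0, 0.0],
    "chunk_b": [0.7, 0.7],
    "chunk_c": [0.0, 1.0],
    "chunk_d": [-1.0, 0.0],
}
print(top_k([1.0, 0.1], toy_vectors, k=3))  # ['chunk_a', 'chunk_b', 'chunk_c']
```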

In [14]:
for d in retriever.invoke(simple_query):
    print(d.page_content, '\n')

Large language models are artificial neural networks (algorithms) that have gone from a recent development to widespread use within a few years. They have been instrumental in the development of ChatGPT, the next evolutionary step in artificial intelligence. Generative AI was combined with large language models to produce a smarter version of artificial intelligence. Large language models (LLMs) are based on artificial neural networks, and recent improvements in deep learning have supported 

improvements in deep learning have supported their development. A large language model also uses semantic technology (semantics, the semantic web, and natural language processes). The history of large language models starts with the concept of semantics, developed by the French philologist, Michel Bréal, in 1883. Bréal studied the ways languages are organized, how they change as time passes, and how words connect within a language. Currently, semantics is used for languages developed for 

even mo

In [8]:
for d in retriever.invoke(complex_query):
    print(d.page_content, '\n')

improvements in deep learning have supported their development. A large language model also uses semantic technology (semantics, the semantic web, and natural language processes). The history of large language models starts with the concept of semantics, developed by the French philologist, Michel Bréal, in 1883. Bréal studied the ways languages are organized, how they change as time passes, and how words connect within a language. Currently, semantics is used for languages developed for 

Large language models are artificial neural networks (algorithms) that have gone from a recent development to widespread use within a few years. They have been instrumental in the development of ChatGPT, the next evolutionary step in artificial intelligence. Generative AI was combined with large language models to produce a smarter version of artificial intelligence. Large language models (LLMs) are based on artificial neural networks, and recent improvements in deep learning have supported 

The cre

## RAG Using Gemini + LangChain

In [9]:
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

def get_response_from_query(retriever, query):
    # retrieve documents that have high similarity with the given query
    docs = retriever.invoke(query)
    
    # create an instance of the gemini model
    llm = ChatGoogleGenerativeAI(model="gemini-pro", google_api_key=GOOGLE_API_KEY)

    # prompt text for the model 
    prompt = PromptTemplate(
        input_variables=["query", "docs"],
        template="""
        You are a helpful assistant that can answer questions about large language models.
        
        Answer the following question: {query}
        By searching the following information: {docs}
        
        Only use the factual information from the given information to answer the question.
        
        If you feel like you don't have enough information to answer the question, say "I don't know".
        
        Your answers should be verbose and detailed.
        """
    )
    
    # chain of steps that builds the RAG prompt and generates the answer
    rag_chain = (
        {"docs": retriever | format_docs, # documents used inside the prompt are fetched by the retriever and formatted with the function "format_docs"
         "query": RunnablePassthrough()} # the query passed to "invoke" is forwarded unchanged
        | prompt
        | llm
        | StrOutputParser()
    ) 
    
    # run the chain on the query
    resp = rag_chain.invoke(query)

    return resp, docs

In [10]:
resp, docs = get_response_from_query(retriever, simple_query)
print(resp)



A large language model (LLM) is a very large deep learning model that is pre-trained on massive amounts of data. Deep learning is a form of machine learning, which is also a neural network, but with additional layers. LLMs are based on artificial neural networks, and recent improvements in deep learning have supported their development. LLMs also use semantic technology (semantics, the semantic web, and natural language processes).


In [11]:
resp, docs = get_response_from_query(retriever, complex_query)
print(resp)

**The history of large language models:**

In 1883, French philologist Michel Bréal developed the concept of semantics, which is the study of the ways languages are organized, how they change over time, and how words connect within a language. Today, semantics is used in the development of languages for large language models.

Large language models are based on artificial neural networks, and recent improvements in deep learning have supported their development. 

The creation of the World Wide Web made the internet searchable and provided large language models with access to massive amounts of information.


# Retrieval Augmented Generation with Query Translation

## RAG with Multi-Query Using Gemini + LangChain

In [15]:
def create_chain_for_generating_more_queries():
    # prompt text for generating more queries from a query
    template = """
    You are an AI language model assistant. 
    Your task is to generate three shorter versions of the given user question, each from a unique perspective.
    These shorter versions of the given user question will be used to retrieve relevant documents from a vector database. 
    By generating multiple shorter versions of the given user question, your goal is to help the user overcome some of the limitations of the distance-based similarity search. 
    Provide these shorter versions separated by newlines.
    The given question is: {query}
    """
    prompt_perspectives = ChatPromptTemplate.from_template(template)

    # create an instance of the gemini model
    llm = ChatGoogleGenerativeAI(model="gemini-pro", google_api_key=GOOGLE_API_KEY)
    
    # chain of steps to generate more queries 
    generate_queries = (
        prompt_perspectives 
        | llm
        | StrOutputParser() 
        | (lambda x: [q for q in x.split("\n") if q.strip()])  # drop blank lines so no empty query reaches the retriever
    )
    
    return generate_queries

def _get_unique_union(documents):
    """Return the page contents of the unique retrieved docs, joined into one string."""
    # Flatten list of lists, and convert each Document to string
    flattened_docs = [dumps(doc) for sublist in documents for doc in sublist]
    # Get unique documents; dict.fromkeys keeps first-seen order, unlike set
    unique_docs = list(dict.fromkeys(flattened_docs))
    
    return '\n\n'.join([loads(doc).page_content for doc in unique_docs])

def create_retrieval_chain(generate_queries, retriever):
    retrieval_chain = generate_queries | retriever.map() | _get_unique_union
    
    return retrieval_chain

def get_response_from_multi_query(retrieval_chain, query):
    # retrieve documents that have high similarity with the generated queries
    docs = retrieval_chain.invoke({"query":query})
    
    # create an instance of the gemini model
    llm = ChatGoogleGenerativeAI(model="gemini-pro", google_api_key=GOOGLE_API_KEY)
    
    # prompt text for the model 
    template = """
    You are a helpful assistant to answer questions about large language models.
    
    Below is the question to answer.
    {query}
    
    Utilize the following information to generate the answer.
    {docs}
    
    Only use the factual information from the given information to answer the question.
    If you feel like you don't have enough information to answer the question, say "I don't know"
    """
    prompt = ChatPromptTemplate.from_template(template)

    # chain of steps that builds the RAG prompt and generates the answer
    rag_chain = prompt | llm | StrOutputParser()
    
    # reuse the documents retrieved above, so the returned docs are the ones the answer is based on
    # and the query-generation step is not run a second time
    resp = rag_chain.invoke({"query": query, "docs": docs})
        
    return resp, docs
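
The unique-union step flattens the per-query result lists and removes chunks that were retrieved more than once. A small sketch of that step on plain strings, which stand in for the serialized documents; this variant keeps first-seen order, and the names `unique_union` and `per_query_results` are hypothetical.

```python
def unique_union(result_lists):
    """Flatten lists of retrieved items and drop duplicates, keeping first-seen order."""
    seen = set()
    merged = []
    for results in result_lists:
        for item in results:
            if item not in seen:
                seen.add(item)
                merged.append(item)
    return merged

# one result list per generated query
per_query_results = [
    ["chunk 1", "chunk 2"],
    ["chunk 2", "chunk 3"],
    ["chunk 1", "chunk 4"],
]
print(unique_union(per_query_results))  # ['chunk 1', 'chunk 2', 'chunk 3', 'chunk 4']
```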

In [16]:
generate_queries = create_chain_for_generating_more_queries()
retrieval_chain = create_retrieval_chain(generate_queries, retriever)
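
Because the model's reply is split on newlines, list markers such as `- ` and any blank lines end up inside the generated queries. A sketch of a cleanup helper that could replace the splitting lambda, assuming the reply uses simple bullet or numbered-list markers; the name `clean_query_lines` is hypothetical.

```python
import re

def clean_query_lines(text):
    """Split an LLM reply into queries, dropping blank lines and leading list markers."""
    queries = []
    for line in text.split("\n"):
        # remove a leading "-", "*", "•", "1." or "1)" marker, then surrounding whitespace
        line = re.sub(r"^\s*(?:[-*•]|\d+[.)])\s*", "", line).strip()
        if line:
            queries.append(line)
    return queries

reply = "- What is an LLM?\n\n2. How are LLMs trained?\n* Where are LLMs used?\n"
print(clean_query_lines(reply))
# ['What is an LLM?', 'How are LLMs trained?', 'Where are LLMs used?']
```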

In [17]:
for v in generate_queries.invoke(simple_query):
    print(v)
print()

resp, docs = get_response_from_multi_query(retrieval_chain, simple_query)

print(resp, '\n')

print(docs)

- What are the key characteristics of a large language model?
- How do large language models compare to traditional natural language processing models?
- What are the potential applications of large language models?





Large language models are artificial neural networks (algorithms) that have gone from a recent development to widespread use within a few years. 

The creation of the World Wide Web made the internet searchable and provided large language  models with access to massive amounts of information. The World Wide Web offers a platform  to create, store, locate, and share information on a variety of topics. During the mid-1990s, the  WWW initiated new levels of use on the internet, promoting interest in online shopping and what  was called “surfing” the internet. GPUs and Large Language Models Large language models require complex training, which

Large language models are artificial neural networks (algorithms) that have gone from a recent development to widespread use within a few years. They have been instrumental in the development of ChatGPT, the next evolutionary step in artificial intelligence. Generative AI was combined with large language models to produce a smarter version of artifi

In [18]:
for v in generate_queries.invoke(complex_query):
    print(v)
print()

resp, docs = get_response_from_multi_query(retrieval_chain, complex_query)

print(resp, '\n')

print(docs)

- The development and evolution of large language models
- A historical overview of the advancements in large language models
- The timeline of key milestones in the history of large language models

The history of large language models begins with the concept of semantics, developed by the French philologist, Michel Bréal, in 1883. Bréal studied the ways languages are organized, how they change as time passes, and how words connect within a language. Currently, semantics is used for languages developed for machine learning. 

to work with neural networks. ML was used to answer phones and perform a variety of automated tasks. Small Language Models Early development of the first (small) language models was started in the 1980s by IBM, and  they were/are designed to predict the next word in a sentence. Part of their design includes a  “dictionary,” which determines how often certain words occur within the text the model was  trained on. After each word, the algorithm recalculates statist

## RAG-Fusion Using Gemini + LangChain

In [19]:
def create_chain_for_generating_more_queries():
    # prompt text for generating more queries from a query
    template = """
    You are an AI language model assistant. 
    Your task is to generate three shorter versions of the given user question, each from a unique perspective.
    These shorter versions of the given user question will be used to retrieve relevant documents from a vector database. 
    By generating multiple shorter versions of the given user question, your goal is to help the user overcome some of the limitations of the distance-based similarity search. 
    Provide these shorter versions separated by newlines.
    The given question is: {query}
    """
    prompt_perspectives = ChatPromptTemplate.from_template(template)

    # create an instance of the gemini model
    llm = ChatGoogleGenerativeAI(model="gemini-pro", google_api_key=GOOGLE_API_KEY)
    
    # chain of steps to generate more queries 
    generate_queries = (
        prompt_perspectives 
        | llm
        | StrOutputParser() 
        | (lambda x: [q for q in x.split("\n") if q.strip()])  # drop blank lines so no empty query reaches the retriever
    )
    
    return generate_queries

def _reciprocal_rank_fusion(results, k=5, top_n=5):
    # k is the smoothing constant of the RRF formula; top_n is how many documents to keep
    # Initialize a dictionary to hold fused scores for each unique document
    fused_scores = {}

    # Iterate through each list of ranked documents
    for docs in results:
        # Iterate through each document in the list, with its zero-based rank (position in the list)
        for rank, doc in enumerate(docs):
            # Serialize the document to a string so it can be used as a dictionary key
            doc_str = dumps(doc)
            # Add this list's contribution using the RRF formula: 1 / (rank + k)
            fused_scores[doc_str] = fused_scores.get(doc_str, 0) + 1 / (rank + k)

    # Sort the documents by fused score in descending order and keep the top_n best
    reranked_results = [
        (loads(doc), score)
        for doc, score in sorted(fused_scores.items(), key=lambda x: x[1], reverse=True)
    ][:top_n]

    # Return the page contents of the reranked documents, joined into a single string
    return '\n\n'.join([doc.page_content for doc, score in reranked_results])

def create_retrieval_chain(generate_queries, retriever):
    retrieval_chain = generate_queries | retriever.map() | _reciprocal_rank_fusion
    
    return retrieval_chain

def get_response_from_rag_fusion(retrieval_chain, query):
    # retrieve documents that have high similarity with the generated queries
    docs = retrieval_chain.invoke({"query":query})
    
    # create an instance of the gemini model
    llm = ChatGoogleGenerativeAI(model="gemini-pro", google_api_key=GOOGLE_API_KEY)
    
    # prompt text for the model 
    template = """
    You are a helpful assistant to answer questions about large language models.
    
    Below is the question to answer.
    {query}
    
    Utilize the following information to generate the answer.
    {docs}
    
    Only use the factual information from the given information to answer the question.
    If you feel like you don't have enough information to answer the question, say "I don't know"
    """
    prompt = ChatPromptTemplate.from_template(template)

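    # chain of steps that builds the RAG prompt and generates the answer
    rag_chain = prompt | llm | StrOutputParser()

    # reuse the documents retrieved above, so the returned docs are the ones the answer is based on
    # and the query-generation step is not run a second time
    resp = rag_chain.invoke({"query": query, "docs": docs})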
        
    return resp, docs
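
Reciprocal rank fusion rewards documents that rank well across several result lists. A worked sketch on document ids rather than LangChain documents, assuming the conventional form of the formula with one-based ranks and `k=60`; the function in this notebook uses zero-based ranks and `k=5`, so its scores differ in value. The names `rrf_scores` and `rankings` are hypothetical.

```python
def rrf_scores(ranked_lists, k=60):
    """Score each item as the sum of 1 / (k + rank) over the lists it appears in (rank starts at 1)."""
    scores = {}
    for ranking in ranked_lists:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # highest fused score first
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)

# one ranked list per generated query
rankings = [["A", "B", "C"], ["B", "A", "D"], ["B", "C", "A"]]
fused = rrf_scores(rankings)
print([doc_id for doc_id, _ in fused])  # ['B', 'A', 'C', 'D']
```

Document `B` wins because it is ranked first in two of the three lists, even though `A` also appears in all three.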

In [20]:
generate_queries = create_chain_for_generating_more_queries()
retrieval_chain = create_retrieval_chain(generate_queries, retriever)

In [21]:
for v in generate_queries.invoke(simple_query):
    print(v)
print()

resp, docs = get_response_from_rag_fusion(retrieval_chain, simple_query)

print(resp, '\n')

print(docs)

- Characteristics of large language models
- Applications of large language models
- Comparison of large language models to other language models

A large language model is an artificial neural network that uses semantic technology to understand and generate human language. 

Large language models are artificial neural networks (algorithms) that have gone from a recent development to widespread use within a few years. They have been instrumental in the development of ChatGPT, the next evolutionary step in artificial intelligence. Generative AI was combined with large language models to produce a smarter version of artificial intelligence. Large language models (LLMs) are based on artificial neural networks, and recent improvements in deep learning have supported

improvements in deep learning have supported their development. A large language model also uses semantic technology (semantics, the semantic web, and natural language processes). The history of large language models starts wi

In [22]:
for v in generate_queries.invoke(complex_query):
    print(v)
print()

resp, docs = get_response_from_rag_fusion(retrieval_chain, complex_query)

print(resp, '\n')

print(docs)

- History of large language models
- Evolution of large language models over time
- Timeline of significant events in the development of large language models

The history of large language models starts with the concept of semantics, developed by the French philologist, Michel Bréal, in 1883. From 1906 to 1912, Ferdinand de Saussure taught Indo-European linguistics, general linguistics, and Sanskrit at the University of Geneva. During this time he developed the foundation for a highly functional model of languages as systems. Early development of the first (small) language models was started in the 1980s by IBM. 

improvements in deep learning have supported their development. A large language model also uses semantic technology (semantics, the semantic web, and natural language processes). The history of large language models starts with the concept of semantics, developed by the French philologist, Michel Bréal, in 1883. Bréal studied the ways languages are organized, how they chang

## RAG with Query Decomposition Using Gemini + LangChain

In [29]:
def decompose_query(query):
    template = """
    You are a helpful assistant that generates multiple differently phrased queries related to the given query.
    The goal is to break down the query into several queries that can be answered in isolation.
    Generate multiple differently phrased queries related to: {query}
    Output (3 queries):
    """
    
    prompt_decomposition = ChatPromptTemplate.from_template(template)
    
    # LLM
    llm = ChatGoogleGenerativeAI(model="gemini-pro", google_api_key=GOOGLE_API_KEY)

    # Chain
    # drop blank lines so that no empty sub-question is sent to the model
    generate_queries_decomposition = (prompt_decomposition | llm | StrOutputParser() | (lambda x: [q for q in x.split("\n") if q.strip()]))
    
    # Run
    decomposed_query = generate_queries_decomposition.invoke({"query":query})
    
    return decomposed_query

def format_question_answer(query, answer):
    # build the pair on two lines, without the indentation a multi-line f-string would embed
    formatted_string = f"Question: {query}\nSolution: {answer}"
    
    return formatted_string

def get_response_from_rag_with_decomposition(retriever, query, decomposed_query):
    # create an instance of the gemini model
    llm = ChatGoogleGenerativeAI(model="gemini-pro", google_api_key=GOOGLE_API_KEY)
    
    # prompt text for the model 
    template = """
    You are a helpful assistant to answer questions about large language models.

    Below is the question to answer.

    \n --- \n {query} \n --- \n

    Utilize these available background question + answer pairs below to answer the question.

    \n --- \n {q_a_pairs} \n --- \n

    Utilize the following information to generate the answer.

    \n --- \n {context} \n --- \n

    Use the above information and any background question + answer pairs to answer the question.
    """
    prompt = ChatPromptTemplate.from_template(template)

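    # the chain is the same for every sub-question, so build it once
    rag_chain = (
        {"context": itemgetter("query") | retriever, 
         "query": itemgetter("query"),
         "q_a_pairs": itemgetter("q_a_pairs")} 
        | prompt
        | llm
        | StrOutputParser())

    # answer the sub-questions in order, passing earlier question + answer pairs as background
    q_a_pairs = ""
    answer = ""
    for q in decomposed_query:
        answer = rag_chain.invoke({"query":q,"q_a_pairs":q_a_pairs})
        q_a_pair = format_question_answer(q, answer)
        q_a_pairs = q_a_pairs + "\n---\n"+  q_a_pair
                
    # note: this is the answer to the last sub-question, which has seen all earlier pairs;
    # the original `query` argument is not used directly
    return answer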
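
The decomposition loop answers the sub-questions one at a time and feeds every earlier question + answer pair into the next call. A sketch of that accumulation with a stub in place of the LLM, so the flow can be followed without an API key; `answer_sequentially` and `fake_answer` are hypothetical names, and the stub only reports how many earlier pairs it received.

```python
def answer_sequentially(sub_questions, answer_fn):
    """Answer sub-questions in order, feeding earlier Q&A pairs into later calls."""
    q_a_pairs = ""
    answer = None
    for question in sub_questions:
        answer = answer_fn(question, q_a_pairs)
        q_a_pairs += f"\n---\nQuestion: {question}\nSolution: {answer}"
    return answer, q_a_pairs

def fake_answer(question, background):
    # Hypothetical stand-in for the LLM call: reports how many earlier pairs it saw.
    return f"answer to '{question}' using {background.count('Question:')} earlier pair(s)"

final, transcript = answer_sequentially(
    ["What is an LLM?", "When did LLMs appear?"], fake_answer
)
print(final)  # answer to 'When did LLMs appear?' using 1 earlier pair(s)
```

Only the last answer is returned, so the order of the sub-questions matters: the final one is the only call that sees all the accumulated background.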

In [32]:
decomposed_query = decompose_query(simple_query)

resp = get_response_from_rag_with_decomposition(retriever, simple_query, decomposed_query)

print(resp)

Large language models (LLMs) are a type of artificial neural network that has been trained on a massive dataset of text and code. This training allows LLMs to understand the structure and meaning of human language, and to generate new text that is both coherent and informative.

LLMs have a wide range of capabilities, including:

* **Natural language processing:** LLMs can be used to perform a variety of natural language processing tasks, such as text classification, named entity recognition, and machine translation.
* **Text generation:** LLMs can be used to generate new text, such as articles, stories, and code.
* **Dialogue generation:** LLMs can be used to generate dialogue, such as customer service conversations and chatbot responses.
* **Question answering:** LLMs can be used to answer questions about the world, based on the information they have been trained on.

LLMs are still under development, but they have already shown great promise for a variety of applications. As LLMs co

In [39]:
decomposed_query = decompose_query(complex_query)

resp = get_response_from_rag_with_decomposition(retriever, complex_query, decomposed_query)

print(resp)

Key milestones in the development of large language models (LLMs) include:

* **1883:** French philologist Michel Bréal develops the concept of semantics, which studies the meaning of words and how they are used in language.
* **1980s:** Small language models (SLMs) are developed to predict the next word in a sentence.
* **Recent years:**
    * LLMs are developed using artificial neural networks and deep learning.
    * LLMs are used to develop generative AI, which can generate new text, images, and other content.
    * The World Wide Web provides access to massive amounts of information, which LLMs use to train their models.
