# Upstage Full Stack LLM with Langchain
## Code to Understand!

In [1]:
! pip3 install -q openai langchain_community tiktoken langchain-upstage langchainhub faiss-cpu langchain python-dotenv tavily-python

In [2]:
import warnings
warnings.filterwarnings('ignore')

In [3]:
# Set .env and define these:
# UPSTAGE_API_KEY from https://console.upstage.ai/
# TAVILY_API_KEY https://app.tavily.com
# NEWS_API_KEY from https://newsapi.org/

%load_ext dotenv
%dotenv


## Interacting with the Solar-1-mini-chat Model

This Python code demonstrates how to use the OpenAI API to interact with the Solar-1-mini-chat model provided by Upstage AI.

### Steps

1. Import necessary libraries: `os`, `openai`, and `pprint`.
2. Set up the OpenAI client with the API key and base URL.
3. Create a chat completion request using `client.chat.completions.create()`.
   - Specify the model: "solar-1-mini-chat".
   - Provide a list of messages, including the system message and user message.
4. Handle the model's response:
   - Print the entire response using `pprint()`.
   - Print the content of the assistant's message using `response.choices[0].message.content`.

In [4]:
import os
from openai import OpenAI
from pprint import pprint

client = OpenAI(
    api_key=os.environ["UPSTAGE_API_KEY"], base_url="https://api.upstage.ai/v1/solar"
)
gc_result = client.chat.completions.create(
    model="solar-1-mini-chat",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What about Korea?"},
    ],
)
pprint(gc_result)
print("Message only:")
pprint(gc_result.choices[0].message.content)

ChatCompletion(id='f9695c08-83f3-4995-97a0-04a745b4edcb', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="Ah, Korea! It's a fascinating country with a rich history and culture. I visited Seoul a few years ago, and it was an unforgettable experience. The city is a perfect blend of modernity and tradition, with skyscrapers standing next to ancient palaces. The food is also amazing, from traditional Korean BBQ to street food like tteokbokki. And let's not forget about the K-pop scene - it's a phenomenon that has taken the world by storm!", role='assistant', function_call=None, tool_calls=None))], created=1714736521, model='solar-1-mini-chat-240502', object='chat.completion', system_fingerprint=None, usage=CompletionUsage(completion_tokens=112, prompt_tokens=26, total_tokens=138))
Message only:
("Ah, Korea! It's a fascinating country with a rich history and culture. I "
 'visited Seoul a few years ago, and it was an unforgettable experie

## Using Few-Shot Examples in Chat Completions

This Python code demonstrates how to use few-shot examples in the OpenAI Chat Completions API to provide context and guide the model's responses.

### Steps

1. Set up the OpenAI client with the API key and base URL.
2. Create a chat completion request using `client.chat.completions.create()`.
   - Specify the model: "solar-1-mini-chat".
   - Provide a list of messages, including:
     - System message: Defines the assistant's role.
     - Few-shot examples: Provide context and desired behavior.
     - User input: The actual user query.
3. Handle the model's response:
   - Print the entire response using `pprint()`.
   - Print the content of the assistant's message using `response.choices[0].message.content`.

In [5]:
# few shots: examples or history
gc_result = client.chat.completions.create(
    model="solar-1-mini-chat",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        # examples
        {"role": "user", "content": "What is the capital of France?"},
        {
            "role": "assistant",
            "content": "I know of it. It's Paris!!",
        },
        # user input
        {"role": "user", "content": "What about Korea?"},
    ],
)
pprint(gc_result)
print("Message only:")
pprint(gc_result.choices[0].message.content)

ChatCompletion(id='e478d2fb-d381-4bff-aa5f-011883c57e3d', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='Ah, yes! The capital of Korea is Seoul.', role='assistant', function_call=None, tool_calls=None))], created=1714736523, model='solar-1-mini-chat-240502', object='chat.completion', system_fingerprint=None, usage=CompletionUsage(completion_tokens=16, prompt_tokens=55, total_tokens=71))
Message only:
'Ah, yes! The capital of Korea is Seoul.'


## Building LLM Applications with LangChain

This Python code demonstrates how to use the LangChain library to build applications with Large Language Models (LLMs). It covers the basic steps of defining an LLM, creating a chat prompt, defining a chain, and invoking the chain.

### Steps

1. Define your favorite LLM:
   - Import the `ChatUpstage` class from `langchain_upstage`.
   - Create an instance of `ChatUpstage` and assign it to the variable `llm`.

2. Define a chat prompt:
   - Import the `ChatPromptTemplate` class from `langchain_core.prompts`.
   - Create a `ChatPromptTemplate` instance using the `from_messages()` method.
   - Provide a list of messages, including system messages, example conversations, and user input.

3. Define a chain:
   - Import the `StrOutputParser` class from `langchain_core.output_parsers`.
   - Create a chain by combining the `rag_with_history_prompt`, `llm`, and `StrOutputParser()` using the pipe (`|`) operator.

4. Invoke the chain:
   - Call the `invoke()` method on the `chain` object, passing an empty dictionary (`{}`) as the input.
   - Print the response obtained from the chain.

In [6]:
# langchain, 1. llm defule, 2. prompt define, 3. chain, 4. chain.invoke

# 1. define your favorate llm, solar
from langchain_upstage import ChatUpstage
llm= ChatUpstage()

# 2. define chat prompt
from langchain_core.prompts import ChatPromptTemplate
rag_with_history_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant."),
        ("human", "What is the capital of France?"),
        ("ai", "I know of it. It's Paris!!"),
        ("human", "What about Korea?"),
    ]
)

# 3. define chain 
from langchain_core.output_parsers import StrOutputParser
chain = rag_with_history_prompt | llm | StrOutputParser()

# 4. invoke the chain
gc_result = chain.invoke({})
print(gc_result)

I know of it too. It's Seoul!!


## Parameterized Prompt Templates in LangChain

### Overview

- Prompt templates allow for reusable and modular prompts
- They improve maintainability compared to using raw prompt strings
- {country} value can be set from outside

In [7]:
# parameterized prompt template
rag_with_history_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant."),
        ("human", "What is the capital of France?"),
        ("ai", "I know of it. It's Paris!!"),
        ("human", "What about {country}?"),
    ]
)

chain = rag_with_history_prompt | llm | StrOutputParser()

# 4. invoke chain with param
print(chain.invoke({"country": "Korea"}))
print("---")
print(chain.invoke({"country": "Japan"}))

Oh, I know that one too! The capital of Korea is Seoul!!
---
I know of it. It's Tokyo!!


## Leveraging Message History in LangChain Prompts

- LangChain provides powerful tools for managing conversation history
- `MessagesPlaceholder` allows for dynamic inclusion of message history
- `HumanMessage` and `AIMessage` classes represent individual messages
- Combining message history with user input enables context-aware responses

In [8]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

# More general chat
rag_with_history_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant."),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{input}"),
    ]
)

from langchain_core.messages import AIMessage, HumanMessage

history = [
    HumanMessage("What is the capital of France?"),
    AIMessage("It's Paris!!"),
]

chain = rag_with_history_prompt | llm | StrOutputParser()
gc_result = chain.invoke({"history": history, "input": "What about Korea?"})
print(gc_result)

For South Korea, the capital is Seoul! And for North Korea, it's Pyongyang.


# Leveraging LangChain for Efficient Text Splitting and Vectorization

- LangChain provides powerful tools for text splitting and vectorization

In [9]:
from langchain_upstage import (
    UpstageLayoutAnalysisLoader,
    UpstageGroundednessCheck,
    ChatUpstage,
    UpstageEmbeddings,
)
from langchain_community.vectorstores import FAISS
from langchain.text_splitter import RecursiveCharacterTextSplitter


layzer = UpstageLayoutAnalysisLoader("./solar_paper.pdf", output_type="html")
# For improved memory efficiency, consider using the lazy_load method to load documents page by page.
docs = layzer.load()  # or layzer.lazy_load()

In [10]:
for doc in docs:
    pprint(doc.page_content[:100])

("<table id='0' "
 "style='font-size:14px'><tr><td>Model</td><td>Size</td><td>Type</td><td>H6 "
 '(Avg.)</td><')


## Retrieval Augmented Generation (RAG) for Question Answering

- RAG combines retrieval and generation to enhance LLM performance on specific tasks
- Relevant context is retrieved from external data sources and added to the prompt
- The augmented prompt is then passed to the LLM for generating a response
- RAG is particularly useful for question answering on custom datasets

In [11]:
# More general chat
rag_with_history_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            """
You are an assistant for question-answering tasks. 
Use the following pieces of retrieved context to answer the question considering the history of the conversation. 
If you don't know the answer, just say that you don't know. 
---
CONTEXT:
{context}
         """,
        ),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{input}"),
    ]
)

from langchain_core.messages import AIMessage, HumanMessage

history = [
]

chain = rag_with_history_prompt | llm | StrOutputParser()
query1 = "Performance comparison amongst the merge candidate"
response1 = chain.invoke({"history": history, "context": docs, "input": query1})
print("RESPONSE1\n", response1)


RESPONSE1
 To compare the performance of the merge candidates, you can refer to Table 6 in the provided context. This table presents the scores of two models, 'Cand. 1' and 'Cand. 2', on six different tasks: H6 (Average), ARC, HellaSwag, MMLU, TruthfulQA, and GSM8K.

In the table, 'Cand. 1' has a higher score for GSM8K but lower scores for the other tasks. On the other hand, 'Cand. 2' has a lower score for GSM8K but higher scores for the other tasks.

By comparing these scores, you can evaluate the performance of each merge candidate and determine which one is better suited for your specific needs based on the task requirements.


In [12]:
history = [
    HumanMessage(query1),
    AIMessage(response1)
]
query2 = "How about Ablation studies?"
response2 = chain.invoke({"history": history, "context": docs, "input": query2})
print("RESPONSE2\n", response2)

RESPONSE2
 The provided context includes two sets of ablation studies: one for the instruction tuning stage and another for the alignment tuning stage.

1. Instruction Tuning Ablation:

Table 3 in the context presents the ablation studies for the instruction tuning stage. It compares the performance of different models trained with various combinations of datasets: Alpaca-GPT4, OpenOrca, and Synth. Math-Instruct. The ablated models are prefixed with 'SFT' for supervised fine-tuning.

The main insights from this table are:

- Using only the Alpaca-GPT4 dataset for training (SFT v1) resulted in a H6 score of 69.15.
- Adding the OpenOrca dataset to the training (SFT v2) did not significantly change the H6 score (69.21), but it affected the scores for individual tasks.
- Including the Synth. Math-Instruct dataset improved the GSM8K score and maintained comparable scores for other tasks (SFT v3).
- When adding the Synth. Math-Instruct dataset to SFT v1 (SFT v4), the overall H6 score increas

In [13]:
# Let's load something big
layzer = UpstageLayoutAnalysisLoader("./kim-tse-2008.pdf", output_type="html")
# For improved memory efficiency, consider using the lazy_load method to load documents page by page.
docs = layzer.load()  # or layzer.lazy_load()

###  RAG Limitations
- LLM does not have long enough context length
- Sending long, irrelevant info is inefficient

In [14]:
chain = rag_with_history_prompt | llm | StrOutputParser()
query1 = "What is bug classification?"

try:    
    response1 = chain.invoke({"history": history, "context": docs, "input": query1})
    print(response1)
except Exception as e:
    print(e)

Error code: 400 - {'error': {'message': "This model's maximum context length is 32768 tokens. However, your messages resulted in 35692 tokens. Please reduce the length of the messages.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}


## Efficient Text Splitting and Indexing with LangChain

### 1. Load Documents

The first step is to load the source documents that will be used to augment the language model's knowledge
This could be done by reading files from disk, pulling from a database, scraping web pages, etc.
The goal is to get the raw text content into a format that can be further processed

### 2. Chunking/Splitting

* Long documents need to be broken down into smaller chunks that are a manageable size for embedding and retrieval
Common approaches include:
  * Fixed-size chunking - split text into equal sized chunks based on character or token count 
  * Semantic chunking - split based on semantic boundaries like sentences, paragraphs, or sections
  * Hierarchical chunking - create chunks at multiple levels of granularity
The ideal chunk size depends on the embedding model, retrieval use case, and downstream task

3. Embedding & Indexing

* The text chunks are converted to vector embeddings using a model like Upstage embeddings
* The embeddings are indexed and stored in a vector database to enable efficient similarity search 
* Metadata about the source chunks can also be stored alongside the embeddings

4. Retrieval

* At query time, the user's question is itself embedded as a query vector
* The query embedding is used to find the most similar document chunks in the vector index 
* Top-k most relevant chunks are retrieved and can be used to augment the prompt sent to the language model to generate an answer

In [15]:
# RAG 1. load doc (done), 2. chunking, splits, 3. embeding - indexing, 4. retrieve 

# 2. Split
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
splits = text_splitter.split_documents(docs)
print("Splits:", len(splits))

# 3. Embed & indexing
vectorstore = FAISS.from_documents(documents=splits, embedding=UpstageEmbeddings())
retriever = vectorstore.as_retriever(k=10)

# 4. retrive
result_docs = retriever.invoke("What is DUS?")
print(result_docs[1])

Splits: 266
page_content="a new functionality. When using this algorithm,<br>care needs to be taken to understand the meaning of<br>changes identified as bugs and, wherever possible,<br>to ensure that only truly buggy changes are flagged<br>as being buggy.</p><p id='60' style='font-size:20px'>9 CONCLUSION AND OPEN ISSUES</p><br><p id='61' style='font-size:18px'>If a developer knows that a change that she just made<br>contains a bug, she can use this information to take steps to<br>identify and fix the potential bug in" metadata={'total_pages': 16, 'type': 'html', 'split': 'none'}


In [16]:
# Finally query using RAG
query = "What is bug classification? How it works?"
result_docs = retriever.invoke(query)

gc_result = chain.invoke({"history": history, "context": result_docs, "input": query})
print(gc_result)

Bug classification is the process of categorizing software bugs based on their characteristics, severity, or other relevant factors. This process helps developers prioritize and manage the bugs more effectively. The classification can be done manually or using automated tools, such as machine learning algorithms.

Here's how bug classification works:

1. **Data Collection**: The first step is to collect data about the bugs. This data can include bug reports, error messages, stack traces, and any other relevant information.

2. **Feature Extraction**: From the collected data, relevant features are extracted. These features can be textual (e.g., keywords in bug reports), numerical (e.g., lines of code modified), or categorical (e.g., component affected).

3. **Model Training**: Machine learning algorithms are then trained using this feature data. The goal is to train a classifier that can accurately predict the bug category based on the extracted features.

4. **Model Evaluation**: The t

In [17]:
history = [
    HumanMessage(query),
    AIMessage(gc_result)
]

query = "Why it is good?"
result_docs = retriever.invoke(query)

gc_result = chain.invoke({"history": history, "context": result_docs, "input": query})
print(gc_result)

Bug classification is beneficial for several reasons:

1. **Efficient Bug Management**: By categorizing bugs based on their characteristics, severity, or other factors, developers can prioritize which bugs to fix first. This helps ensure that critical issues are addressed promptly, improving overall software quality and user satisfaction.

2. **Improved Productivity**: Automated bug classification using machine learning algorithms can significantly reduce the time developers spend on manually categorizing bugs. This frees up their time to focus on fixing bugs and developing new features.

3. **Better Understanding of Software Issues**: Bug classification can provide valuable insights into the types of issues that occur most frequently in a software system. This can help developers identify patterns, focus on areas that need improvement, and prevent similar issues in the future.

4. **Enhanced Collaboration**: Classifying bugs can facilitate better communication and collaboration among 

## Explanation of the Code: Query Expander

The provided code demonstrates a query expansion technique used in Retrieval Augmented Generation (RAG) systems. The main goal is to generate multiple variations of a given user query to retrieve relevant documents from a vector database more effectively. By generating different perspectives on the user query, the system aims to overcome some limitations of distance-based similarity search.

The code defines a function called `query_expander` that takes a user query as input and returns a list of expanded queries. It uses three different query expansion templates:

1. Multi Query: Generates five different versions of the user query to retrieve relevant documents from different perspectives.
2. RAG-Fusion: Generates four related search queries based on the input query.
3. Decomposition: Breaks down the input query into three sub-questions that can be answered in isolation.

The expanded queries are generated using the LangChain library, specifically the `ChatUpstage` model, and the results are parsed using the `StrOutputParser`.

In [18]:
from langchain_upstage import ChatUpstage
from langchain_core.output_parsers import StrOutputParser
from langchain.prompts import ChatPromptTemplate


def query_expander(query):
    # Multi Query: Different Perspectives
    multi_query_template = """You are an AI language model assistant. Your task is to generate five 
    different versions of the given user question to retrieve relevant documents from a vector 
    database. By generating multiple perspectives on the user question, your goal is to help
    the user overcome some of the limitations of the distance-based similarity search. 
    Provide these alternative questions separated by newlines. Original question: {query}"""

    # RAG-Fusion: Related
    rag_fusion_template = """You are a helpful assistant that generates multiple search queries based on a single input query. \n
    Generate multiple search queries related to: {query} \n
    Output (4 queries):"""

    # Decomposition
    decomposition_template = """You are a helpful assistant that generates multiple sub-questions related to an input question. \n
    The goal is to break down the input into a set of sub-problems / sub-questions that can be answers in isolation. \n
    Generate multiple search queries related to: {query} \n
    Output (3 queries):"""

    query_expander_templates = [
        multi_query_template,
        rag_fusion_template,
        decomposition_template,
    ]

    expanded_queries = []
    for template in query_expander_templates:
        prompt_perspectives = ChatPromptTemplate.from_template(template)

        generate_queries = (
            prompt_perspectives
            | ChatUpstage(temperature=0)
            | StrOutputParser()
            | (lambda x: x.split("\n"))
        )
        expanded_queries += generate_queries.invoke({"query": query})

    return expanded_queries


expanded_queries = query_expander("What is the DUS approach developed by Upstage?")
pprint(expanded_queries)

['1. Can you explain the DUS methodology created by Upstage?',
 '2. What is the DUS approach developed by Upstage and how does it work?',
 '3. Can you provide an overview of the DUS technique developed by Upstage?',
 '4. How does the DUS approach developed by Upstage differ from other similar '
 'methods?',
 '5. What are the key features of the DUS methodology developed by Upstage?',
 '1. "DUS approach by Upstage: definition and explanation"',
 '2. "How does the DUS approach by Upstage work?"',
 '3. "Applications of the DUS approach developed by Upstage"',
 '4. "Comparing the DUS approach by Upstage with other similar methods"',
 '1. What is the DUS approach in the context of Upstage?',
 '2. How does the DUS approach differ from other methods in its field?',
 '3. What are the key components and steps involved in the DUS approach '
 'developed by Upstage?']


In [19]:
# Finally query using RAG
oroginal_query = "What is bug classification? Why it is good?"
expanded_queries = query_expander(query)
expanded_queries.append(oroginal_query)

expended_result_docs = []
for query in expanded_queries:
    print("Search for: ", query)
    result_docs = retriever.invoke(query)
    expended_result_docs.append(result_docs)

# remove duplicates 
unique_docs = (list(set(expanded_queries)))
print("expended_result_docs", len(expended_result_docs))
print("Unique docs:", len(unique_docs))

gc_result = chain.invoke({"history": history, "context": expanded_queries, "input": query})
print(gc_result)

Search for:  1. What are the benefits of it?
Search for:  2. How does it contribute positively?
Search for:  3. What advantages does it offer?
Search for:  4. In what ways is it advantageous?
Search for:  5. What are the positive aspects of it?
Search for:  1. What are the benefits of it?
Search for:  2. How does it contribute positively?
Search for:  3. What advantages does it offer?
Search for:  4. Why is it considered beneficial?
Search for:  1. What are the benefits of it?
Search for:  2. How does it contribute positively to society?
Search for:  3. What are the advantages of using it over alternatives?
Search for:  What is bug classification? Why it is good?
expended_result_docs 13
Unique docs: 9
Bug classification is the process of categorizing software bugs based on their characteristics, severity, or other relevant factors. This process helps developers prioritize and manage the bugs more effectively. The classification can be done manually or using automated tools, such as mac

## Explanation of the Code: Smart Retrieval Augmented Generation (RAG)

### High-Level Overview

The code demonstrates a smart Retrieval Augmented Generation (RAG) system that combines local retrieval with external search capabilities. The main goal is to provide relevant context for answering user questions by first searching a local vector database and then falling back to an external search service if the local context is insufficient.


The code defines two main functions:


  1. is_in: Determines whether the answer to a given question can be found within the provided context.
smart_rag: Retrieves relevant context for a given question, either from the local vector database or an external search service, and generates an answer using the retrieved context.

  1. The code uses the LangChain library for generating prompts and invoking language models, as well as the Tavily API for external search capabilities.


### Detailed Explanation 

1. The code starts by defining the is_in function, which takes a question and context as input and determines whether the answer to the question can be found within the context.

  * It defines a prompt template called is_in_conetxt that asks the language model to check if the answer is in the context and return "yes" or "no".
  * The prompt template is used to create a ChatPromptTemplate object.
  * A chain of operations is constructed using the | operator:
    * The ChatPromptTemplate is passed to the ChatUpstage model.
    * The model's output is parsed using the StrOutputParser.
  * The chain is invoked with the question and context, and the response is stored in the response variable.
  * The function returns True if the response starts with "yes" (case-insensitive), indicating that the answer is in the context.

1. The code then demonstrates the usage of the is_in function with two example questions and their corresponding contexts retrieved from a retriever.

1. Next, the code defines the smart_rag function, which takes a question as input and generates an answer using the retrieved context.

  * It first retrieves the context for the question using the retriever.invoke method.
  * If the is_in function determines that the answer is not in the retrieved context, it falls back to searching for additional context using the Tavily API.
  * The retrieved context (either from the local retriever or Tavily) is stored in the context variable.
  * A chain of operations is constructed using the | operator:
    * The rag_with_history_prompt (not shown in the code snippet) is used as the prompt template.
    * The prompt is passed to the llm language model.
    * The model's output is parsed using the StrOutputParser.
  * The chain is invoked with the conversation history, retrieved context, and the question, and the generated answer is returned.

1. Finally, the code demonstrates the usage of the smart_rag function with two example questions:

  * "What is DUS?": The answer is expected to be found in the local context.
  * "What's the population of San Francisco?": The answer is not expected to be found in the local context, so it falls back to searching with Tavily.

This code showcases how LangChain can be used to build a smart RAG system that combines local retrieval with external search capabilities. By first searching a local vector database and falling back to an external search service if needed, the system aims to provide relevant context for generating accurate answers to user questions.

In [20]:
# RAG or Search?
def is_in(question, context):
    is_in_conetxt = """You are a helpful assistant and have a good understanding on 
context and questions. Using your best judgement, please check if the answer of the question is in the given context.
If answer is on the context, please returen yes. Otherwise resturn yes. 
Only resturn yes or no. Please do not include any additional information.
Please do youe best. Here are question and the context:
---
QUESTION: {question}

CONTEXT: {context}

Output (yes or no):"""

    is_in_prompt = ChatPromptTemplate.from_template(is_in_conetxt)
    chain = is_in_prompt | ChatUpstage() | StrOutputParser()

    response = chain.invoke({"history": [], "context": context, "question": question})
    print(response)
    return response.lower().startswith("yes")

In [21]:
question = "Can you tell me about Sung Kim's life?"
context = retriever.invoke(question)
print(is_in(question, context))

no
False


In [22]:
question = "What is bug classification?"
context = retriever.invoke(question)
print(is_in(question, context))

yes
True


In [23]:
# Smart RAG, Self-Improving RAG
from tavily import TavilyClient

def smart_rag(question):
    context = retriever.invoke(question)
    if not is_in(question, context):
        print("Searching in tavily")
        tavily = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])
        context  = tavily.search(query=question)

    chain = rag_with_history_prompt | llm | StrOutputParser()
    return chain.invoke({"history": history, "context": context, "input": question})


In [33]:
question = "What is bug classification?"
smart_rag(question)


yes


'Bug classification is the process of categorizing software bugs based on their characteristics, severity, or other relevant factors. This process helps developers prioritize and manage the bugs more effectively. The classification can be done manually or using automated tools, such as machine learning algorithms.'

In [25]:
question = "What's the population of San Francisco?"
smart_rag(question)

no
Searching in tavily


'The population of San Francisco is approximately 873,965 as of 2020.'

## Explanation of the Code: Groundedness Check with LangChain and Upstage

### High-Level Overview

The provided code demonstrates how to perform a groundedness check using the LangChain library and the Upstage model. The groundedness check is a process of verifying whether the generated response is grounded in the given context. This is an important step in ensuring the quality and relevance of the generated output.

The code uses the `UpstageGroundednessCheck` class from the `langchain_upstage` module to perform the groundedness check. It takes the context (a string of unique documents) and the generated response as input, and returns a verdict indicating whether the response is grounded or not.

### Detailed Explanation

1. The code starts by importing the necessary module:
   - `UpstageGroundednessCheck` from `langchain_upstage`: This class is used to perform the groundedness check.

2. An instance of the `UpstageGroundednessCheck` class is created and assigned to the variable `groundedness_check`.

3. The input for the groundedness check is prepared by creating a dictionary called `request_input`:
   - The `"context"` key is assigned the value of `str(unique_docs)`, which represents the unique documents as a string.
   - The `"answer"` key is assigned the value of `response`, which represents the generated response.

4. The `invoke` method of the `groundedness_check` instance is called with the `request_input` as an argument. This method performs the groundedness check and returns the verdict.

5. The verdict is stored in the `response` variable and printed to the console using `print(response)`.

6. The code then checks if the `response` starts with the word "grounded" (case-insensitive):
   - If the response starts with "grounded", it means the groundedness check has passed, and the message "✅ Groundedness check passed" is printed.
   - If the response does not start with "grounded", it means the groundedness check has failed, and the message "❌ Groundedness check failed" is printed.


The provided code demonstrates a simple yet effective way to perform a groundedness check using LangChain and Upstage. By verifying whether the generated response is grounded in the given context, it helps ensure the quality and relevance of the output.

Groundedness checks are an important step in building reliable and trustworthy language models and conversational agents. They help prevent the generation of irrelevant, inconsistent, or factually incorrect responses.

By using the `UpstageGroundednessCheck` class from LangChain, developers can easily integrate groundedness checks into their language model pipelines and improve the overall performance of their systems.

In [26]:
# GC
from langchain_upstage import UpstageGroundednessCheck

groundedness_check = UpstageGroundednessCheck()

context = "DUS is a new approach developed by Upstage to improve the search quality."
answer = "DUS is developed by Upstage."

request_input = {
    "context": context,
    "answer": answer,
}
gc_result = groundedness_check.invoke(request_input)

print(gc_result)
if gc_result.lower().startswith("grounded"):
    print("✅ Groundedness check passed")
else:
    print("❌ Groundedness check failed")

grounded
✅ Groundedness check passed


In [27]:
context = "DUS is a new approach developed by Upstage to improve the search quality."
answer = "DUS is developed by Google."

request_input = {
    "context": context,
    "answer": answer,
}
gc_result = groundedness_check.invoke(request_input)

if gc_result.lower().startswith("grounded"):
    print("✅ Groundedness check passed")
else:
    print("❌ Groundedness check failed")

❌ Groundedness check failed


## Custom Tools in LangChain

### High-Level Overview

The provided code demonstrates how to create custom tools in LangChain, a framework for developing applications powered by language models. Tools are essential components in LangChain that allow language models to perform specific tasks or access external resources.

The code defines three custom tools:

1. `add`: A tool that adds two integers.
2. `multiply`: A tool that multiplies two integers.
3. `get_news`: A tool that retrieves news articles on a given topic using an external API.

These tools are then bound to a language model using the `bind_tools` method, enabling the model to utilize these tools when generating responses.

### Detailed Explanation

Let's break down the code and explain each part in detail:

1. Importing necessary modules:
   - `tool` from `langchain_core.tools`: This module provides the `@tool` decorator for defining custom tools.
   - `requests`: A library for making HTTP requests to external APIs.

2. Defining the `add` tool:
   - The `@tool` decorator is used to define the `add` function as a custom tool.
   - The function takes two integer parameters, `a` and `b`, and returns their sum.
   - The docstring provides a brief description of the tool's functionality.

3. Defining the `multiply` tool:
   - Similar to the `add` tool, the `multiply` function is defined as a custom tool using the `@tool` decorator.
   - It takes two integer parameters, `a` and `b`, and returns their product.
   - The docstring describes the tool's purpose.

4. Defining the `get_news` tool:
   - The `get_news` function is defined as a custom tool using the `@tool` decorator.
   - It takes a `topic` parameter of type `str` and returns news articles related to that topic.
   - The function constructs a URL for the news API using the provided topic and an API key stored in an environment variable.
   - It sends a GET request to the API using the `requests` library and returns the JSON response.

5. Creating a list of tools:
   - The `tools` list is created, containing the `add`, `multiply`, and `get_news` tools.
   - This list will be used to bind the tools to the language model.

6. Binding the tools to the language model:
   - The `bind_tools` method of the `llm` object is called, passing the `tools` list as an argument.
   - This step binds the custom tools to the language model, allowing it to utilize these tools when generating responses.
   - The resulting object is assigned to the variable `llm_with_tools`.

Conclusion

The code demonstrates how to create custom tools in LangChain, which can be used to extend the capabilities of language models. By defining tools for specific tasks, such as mathematical operations or retrieving news articles, developers can enhance the functionality of their LangChain applications.

The `@tool` decorator simplifies the process of defining custom tools, while the `bind_tools` method allows seamless integration of these tools with the language model.

By leveraging custom tools, LangChain enables developers to build powerful and versatile applications that can perform a wide range of tasks beyond simple text generation.


In [28]:
# Tools
from langchain_core.tools import tool
import requests

@tool
def add(a: int, b: int) -> int:
    """Adds a and b."""
    return a + b


@tool
def multiply(a: int, b: int) -> int:
    """Multiplies a and b."""
    return a * b

@tool
def get_news(topic: str) -> str:
    """Get news on a given topic."""
    # https://newsapi.org/v2/everything?q=tesla&from=2024-04-01&sortBy=publishedAt&apiKey=API_KEY
    # change this to request news from a real API
    news_url = f"https://newsapi.org/v2/everything?q={topic}&apiKey={os.environ['NEWS_API_KEY']}"
    respnse = requests.get(news_url)
    return respnse.json()

tools = [add, multiply, get_news]

llm_with_tools = llm.bind_tools(tools)

In [29]:
def call_tool(tool_call):
    tool_name = tool_call["name"].lower()
    if tool_name not in globals():
        print("Tool not found", tool_name)
        return None
    selected_tool = globals()[tool_name]
    return selected_tool.invoke(tool_call["args"])

In [30]:
query = "What is 3 * 12? Also, what is 11 + 49?"

tool_calls = llm_with_tools.invoke(query).tool_calls
print(tool_calls)

[{'name': 'multiply', 'args': {'a': 3, 'b': 12}, 'id': '2545c5d8-f1e6-4e51-9660-26424145eff7'}, {'name': 'add', 'args': {'a': 11, 'b': 49}, 'id': '6afc7a88-dacd-4414-a7bc-c82e615f0be6'}]


In [31]:
for tool_call in tool_calls:
    print(call_tool(tool_call))


36
60


In [32]:
query = "What's news on NewJeans?"

tool_calls = llm_with_tools.invoke(query).tool_calls
print(tool_calls)

for tool_call in tool_calls:
    print(str(call_tool(tool_call))[:200])

[{'name': 'get_news', 'args': {'topic': 'NewJeans'}, 'id': 'a148f319-2d98-4d05-8129-5119db4e402e'}]
{'status': 'ok', 'totalResults': 279, 'articles': [{'source': {'id': 'the-verge', 'name': 'The Verge'}, 'author': 'Amrita Khalid', 'title': 'You don’t know your K-pop persona, do you?', 'description':


## 🚀 Exciting Excercise: Building Your Own AI-Powered Chatbot! 🤖

### Introduction

Congratulations on completing the course on building chatbots using Language Models (LLMs), Layout Analysis (LA), custom tools, and Groundedness Checks (GC)! It's time to put your skills to the test by creating your own AI-powered chatbot. 🎉

### Objective

Your task is to develop a chatbot that can perform various tasks based on user queries, such as:

- 🎨 Drawing images based on user descriptions
- 📰 Searching for the latest news on various topics
- 📅 Checking and managing schedules
- 📄 Extracting structured information from PDFs and images using Layout Analysis
- 🌟 And more!

### Requirements

To create your chatbot, you'll need to leverage the following components:

1. 🧠 Language Model (LLM): Use a powerful LLM to understand user queries and generate responses.

2. 📊 Layout Analysis (LA): Utilize Layout Analysis techniques to extract structured information from PDFs and images.

3. 🛠️ Custom Tools: Develop custom tools for specific actions like image generation, news search, and schedule management.

4. ✅ Groundedness Check (GC): Implement a groundedness check to ensure relevant and accurate responses.

## Conclusion

This homework assignment is your opportunity to showcase your skills in building an AI-powered chatbot that can understand and process visual content using Layout Analysis. Have fun and be creative! 🚀

Happy coding, and may your chatbot impress everyone! 😄