# Natural Language Processing

# Agents 
- The core idea of agents is to use a language model to choose a sequence of actions to take. 
- In chains, a sequence of actions is hardcoded (in code). 
- In agents, a language model is used as a reasoning engine to determine which actions to take and in which order.
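
The contrast between the two can be sketched in plain Python. This is a toy illustration, not LangChain code: `to_upper`, `reverse`, and `fake_llm_plan` are hypothetical stand-ins, and the "model" is a hand-written rule in place of a real LLM call.

```python
from typing import Callable, Dict, List

# Two toy actions; both are hypothetical stand-ins for real tools.
def to_upper(text: str) -> str:
    return text.upper()

def reverse(text: str) -> str:
    return text[::-1]

ACTIONS: Dict[str, Callable[[str], str]] = {"upper": to_upper, "reverse": reverse}

def run_chain(text: str) -> str:
    """Chain: the sequence of actions is fixed in code."""
    for name in ["upper", "reverse"]:
        text = ACTIONS[name](text)
    return text

def fake_llm_plan(text: str) -> List[str]:
    """Stand-in for a language model that picks actions based on the input."""
    return ["reverse"] if text.islower() else ["upper", "reverse"]

def run_agent(text: str) -> str:
    """Agent: the sequence of actions is chosen at run time by the 'model'."""
    for name in fake_llm_plan(text):
        text = ACTIONS[name](text)
    return text

print(run_chain("abc"))  # same two steps for every input
print(run_agent("abc"))  # steps depend on what the 'model' decides
```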

In [1]:
# #langchain library
# !pip install langchain==0.1.2
# !pip install langchain-openai
# !pip install langchain-community
# #Tool API
# !pip install wikipedia
# #LLM
# !pip install accelerate==0.25.0
# !pip install transformers==4.36.2
# !pip install bitsandbytes==0.41.2
# #Text Embedding
# !pip install sentence-transformers==2.2.2
# !pip install InstructorEmbedding==1.0.1
# #vectorstore
# !pip install pymupdf==1.23.8
# !pip install faiss-gpu==1.7.2
# !pip install faiss-cpu==1.7.4

In [3]:
import langchain
langchain.__version__

'0.1.2'

In [2]:
import os
import torch
import langchain
langchain.debug = True
# Set GPU device
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

os.environ['http_proxy']  = 'http://192.41.170.23:3128'
os.environ['https_proxy'] = 'http://192.41.170.23:3128'

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
device

device(type='cpu')

## LLM

In [3]:
# # Initiate our LLM 
# import torch

# from transformers import (
#     AutoTokenizer, 
#     pipeline, 
#     AutoModelForSeq2SeqLM
# )
# from transformers import BitsAndBytesConfig
# from langchain import HuggingFacePipeline

# model_id = "lmsys/fastchat-t5-3b-v1.0"

# tokenizer = AutoTokenizer.from_pretrained(
#     model_id,
# )
# tokenizer.pad_token_id = tokenizer.eos_token_id

# bitsandbyte_config = BitsAndBytesConfig(
#         load_in_4bit = True,
#         bnb_4bit_quant_type = "nf4",
#         bnb_4bit_compute_dtype = torch.float16,
#         bnb_4bit_use_double_quant = True
#     )

# model = AutoModelForSeq2SeqLM.from_pretrained(
#     model_id,
#     quantization_config = bitsandbyte_config,  # 4-bit settings defined above; do not also pass load_in_8bit
#     device_map = 'auto',
# )

# pipe = pipeline(
#     task="text2text-generation", 
#     model=model, 
#     tokenizer=tokenizer, 
#     max_new_tokens=256, 
#     model_kwargs = { 
#         "temperature":0, 
#         "repetition_penalty": 1.5
#     }, 
# )

# llm = HuggingFacePipeline(pipeline = pipe)

## OpenAI LLM

In [4]:
from langchain_openai import ChatOpenAI  # from the langchain-openai package; avoids the deprecated langchain.chat_models import
import os

api_key = os.getenv("OPENAI_API_KEY")
organization = os.getenv("OPEN_AI_ORG")

# Initiate our LLM - default is 'gpt-3.5-turbo'
llm = ChatOpenAI(
    temperature=0,
    api_key=api_key,
    organization=organization
)



## 1. Agent
This is the chain responsible for deciding what step to take next. 
This is powered by a language model and a prompt. The inputs to this chain are:

- `Tools`: Descriptions of available tools
- `User input`: The high level objective
- `Intermediate steps`: Any (action, tool output) pairs previously executed in order to achieve the user input

The output is the next action(s) to take or the final response to send to the user (`AgentAction`s or an `AgentFinish`). An action specifies a tool and the input to that tool.

Different agents have different prompting styles for reasoning, different ways of encoding inputs, and different ways of parsing the output. For a full list of built-in agents, see the agent types page of the LangChain documentation. You can also build custom agents, which is what the remaining sections of this notebook do.
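
The decide/act cycle described above can be sketched without any library. Everything here is hypothetical: `Action` and `Finish` only mimic the roles of `AgentAction` and `AgentFinish`, `toy_agent` is a hand-written policy in place of an LLM plus prompt, and `run` is a simplified stand-in for what an executor does.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union

@dataclass
class Action:
    """Hypothetical analogue of AgentAction: which tool to call, and with what input."""
    tool: str
    tool_input: str

@dataclass
class Finish:
    """Hypothetical analogue of AgentFinish: the final response for the user."""
    output: str

Step = Tuple[Action, str]  # (action, tool output) pair

def toy_agent(user_input: str, steps: List[Step]) -> Union[Action, Finish]:
    """Hand-written policy: look something up once, then answer from the observation."""
    if not steps:
        return Action(tool="lookup", tool_input=user_input)
    _, observation = steps[-1]
    return Finish(output=f"Answer: {observation}")

def run(agent, tools: Dict[str, Callable[[str], str]], user_input: str, max_steps: int = 5) -> str:
    """Call the agent repeatedly, executing each chosen tool, until it finishes."""
    steps: List[Step] = []
    for _ in range(max_steps):
        decision = agent(user_input, steps)
        if isinstance(decision, Finish):
            return decision.output
        observation = tools[decision.tool](decision.tool_input)
        steps.append((decision, observation))
    raise RuntimeError("agent did not finish within max_steps")

toy_tools = {"lookup": lambda query: f"notes about {query}"}
result = run(toy_agent, toy_tools, "transformers")
print(result)
```

The `max_steps` cap is a design choice in this sketch: an agent that never emits a final answer would otherwise loop forever.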

## 2. Tools
Tools are functions that an agent can invoke. There are two important design considerations around tools:
1. Giving the agent access to the right tools
2. Describing the tools in a way that is most helpful to the agent
   
Without thinking through both, you won't be able to build a working agent. If you don't give the agent access to a correct set of tools, it will never be able to accomplish the objectives you give it. If you don't describe the tools well, the agent won't know how to use them properly.

LangChain provides a wide set of built-in tools, but also makes it easy to define your own (including custom descriptions). For a full list of built-in tools, see the [tools integrations section](https://python.langchain.com/docs/integrations/tools/).
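
To see why descriptions matter, it helps to look at what the model actually receives: only the tool names and descriptions, rendered as text inside the prompt. The sketch below assumes a hypothetical `ToolSpec` dataclass as a stand-in for a LangChain `Tool`; the two descriptions are the ones used for the tools defined later in this notebook, and the lambdas are placeholders.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class ToolSpec:
    """Hypothetical stand-in for a LangChain Tool: a name, a description, a callable."""
    name: str
    description: str
    func: Callable[[str], str]

def render_tools(tools: List[ToolSpec]) -> str:
    """Render one 'name: description' line per tool, as it would appear in a prompt."""
    return "\n".join(f"{tool.name}: {tool.description}" for tool in tools)

tool_specs = [
    ToolSpec(
        name="Search",
        description="useful for when you need to answer questions about current events",
        func=lambda query: "placeholder search result",
    ),
    ToolSpec(
        name="Knowledge Base",
        description=(
            "Useful for general questions about how to do things and for details "
            "on interesting topics. Input should be a fully formed question."
        ),
        func=lambda query: "placeholder knowledge-base answer",
    ),
]

rendered = render_tools(tool_specs)
print(rendered)
```

If two descriptions overlap or are vague, the model has nothing else to go on when choosing between the tools.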

## Built-in Tools

In [1]:
# Initiate a Wikipedia search wrapper; its `run` method backs the Search tool defined below
from langchain_community.utilities import WikipediaAPIWrapper
search = WikipediaAPIWrapper()

## Tool 

In [1]:
from langchain.document_loaders import PyMuPDFLoader

nlp_document = '../docs/pdf/SpeechandLanguageProcessing_3rd_07jan2023.pdf'
loader = PyMuPDFLoader(nlp_document)
documents = loader.load()

from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 700,
    chunk_overlap = 100
)
docs = text_splitter.split_documents(documents) 

import torch
from langchain_community.embeddings import HuggingFaceInstructEmbeddings

model_name = 'hkunlp/instructor-base'

embedding_model = HuggingFaceInstructEmbeddings(
        model_name = model_name,              
        model_kwargs = {
            'device': torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        },
    )

from langchain.vectorstores import FAISS

vectordb = FAISS.from_documents(docs, embedding_model)
retriever = vectordb.as_retriever()

from langchain.chains import RetrievalQA

nlp_retriever = RetrievalQA.from_chain_type(
    llm=llm, 
    chain_type="stuff", 
    retriever=retriever)

In [None]:
from langchain.agents import Tool

# Define a list of tools
tools = [
    Tool(
        name = "Search",
        func=search.run,
        description="useful for when you need to answer questions about current events"
    )
]

expanded_tools = [
    Tool(
        name = "Search",
        func=search.run,
        description="useful for when you need to answer questions about current events"
    ),
    Tool(
        name = 'Knowledge Base',
        func=nlp_retriever.run,
        description="Useful for general questions about how to do things and for details on interesting topics. Input should be a fully formed question."
    )
]

## Custom Prompt Template

In [None]:
# Set up a prompt template which can interpolate the history
template_with_history = """
I'm your friendly NLP chatbot named ChakyBot, here to assist Chaky and Gun with any questions they have about Natural Language Processing (NLP). 
If you're curious about how probability works in the context of NLP, feel free to ask any questions you may have. 
Whether it's about probabilistic models, language models, or any other related topic, 
I'm here to help break down complex concepts into easy-to-understand explanations.
Just let me know what you're wondering about, and I'll do my best to guide you through it!
Answer the following questions as best you can. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin! Remember to give detailed, informative answers

Previous conversation history:
{history}

New question: {input}
{agent_scratchpad}""".strip()

In [None]:
from langchain.prompts import BaseChatPromptTemplate
from langchain.schema import AgentAction, AgentFinish, HumanMessage
import re
from typing import List, Union

# Set up a prompt template
class CustomPromptTemplate(BaseChatPromptTemplate):
    # The template to use
    template: str
    # The list of tools available
    tools: List[Tool]
    
    def format_messages(self, **kwargs) -> List[HumanMessage]:
        # Get the intermediate steps: a list of (AgentAction, observation) tuples
        intermediate_steps = kwargs.pop("intermediate_steps")

        # Format them as the running Thought/Action/Observation log
        thoughts = ""
        for action, observation in intermediate_steps:
            thoughts += action.log
            thoughts += f"\nObservation: {observation}\nThought: "
            
        # Set the agent_scratchpad variable to that value
        kwargs["agent_scratchpad"] = thoughts
        
        # Create a tools variable from the list of tools provided
        kwargs["tools"] = "\n".join([f"{tool.name}: {tool.description}" for tool in self.tools])
        
        # Create a list of tool names for the tools provided
        kwargs["tool_names"] = ", ".join([tool.name for tool in self.tools])
        formatted = self.template.format(**kwargs)
        return [HumanMessage(content=formatted)]
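
The scratchpad logic in `format_messages` can be tried in isolation. The sketch below copies only that loop into a standalone function; `build_scratchpad` is a hypothetical helper name, and `SimpleNamespace` stands in for an `AgentAction` since only the `log` attribute is read. The sample log and observation are made up for illustration.

```python
from types import SimpleNamespace

def build_scratchpad(intermediate_steps) -> str:
    """Concatenate each action's log and observation, ending with an open 'Thought: '."""
    thoughts = ""
    for action, observation in intermediate_steps:
        thoughts += action.log
        thoughts += f"\nObservation: {observation}\nThought: "
    return thoughts

# One made-up (action, observation) pair; a real run would supply AgentAction objects.
fake_action = SimpleNamespace(
    log="Thought: I should look this up\nAction: Knowledge Base\nAction Input: What is attention?"
)
scratchpad = build_scratchpad([(fake_action, "Attention weighs context tokens.")])
print(scratchpad)
```

Ending on `Thought: ` hands the turn back to the model, which continues the text from that point on the next call.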

In [2]:
prompt_with_history = CustomPromptTemplate(
    template=template_with_history,
    tools=expanded_tools,
    # The history template includes "history" as an input variable so we can interpolate it into the prompt
    input_variables=["input", "intermediate_steps", "history"]
)

In [None]:
prompt_with_history.template

"I'm your friendly NLP chatbot named ChakyBot, here to assist Chaky and Gun with any questions they have about Natural Language Processing (NLP). \nIf you're curious about how probability works in the context of NLP, feel free to ask any questions you may have. \nWhether it's about probabilistic models, language models, or any other related topic, \nI'm here to help break down complex concepts into easy-to-understand explanations.\nJust let me know what you're wondering about, and I'll do my best to guide you through it!\nAnswer the following questions as best you can. You have access to the following tools:\n\n{tools}\n\nUse the following format:\n\nQuestion: the input question you must answer\nThought: you should always think about what to do\nAction: the action to take, should be one of [{tool_names}]\nAction Input: the input to the action\nObservation: the result of the action\n... (this Thought/Action/Action Input/Observation can repeat N times)\nThought: I now know the final answ

## Output Parser

In [None]:
from langchain.agents import ( 
    LLMSingleActionAgent, 
    AgentOutputParser
)

class CustomOutputParser(AgentOutputParser):
    def parse(self, llm_output: str) -> Union[AgentAction, AgentFinish]:
        
        # Check if agent should finish
        if "Final Answer:" in llm_output:
            return AgentFinish(
                # Return values is generally always a dictionary with a single `output` key
                # It is not recommended to try anything else at the moment :)
                return_values={"output": llm_output.split("Final Answer:")[-1].strip()},
                log=llm_output,
            )

        # Parse out the action and action input
        regex = r"Action: (.*?)[\n]*Action Input:[\s]*(.*)"
        match = re.search(regex, llm_output, re.DOTALL)
        # If the output cannot be parsed, raise an error.
        # You can add your own handling here instead, e.g. pass to a human or give a canned response.
        if not match:
            raise ValueError(f"Could not parse LLM output: `{llm_output}`")
        action = match.group(1).strip()
        action_input = match.group(2)
        
        # Return the action and action input
        # Strip surrounding whitespace (including a trailing newline) and quotes from the tool input
        return AgentAction(tool=action, tool_input=action_input.strip().strip('"'), log=llm_output)
    
output_parser = CustomOutputParser()
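
The parsing rules can be checked without LangChain. The sketch below reuses the same regular expression and the same "Final Answer:" check, but returns plain tuples instead of `AgentAction`/`AgentFinish`; `parse_step` is a hypothetical helper, and it trims all surrounding whitespace from the tool input, which is an assumption of this sketch. The sample text is taken from the first run shown later in this notebook.

```python
import re

ACTION_RE = r"Action: (.*?)[\n]*Action Input:[\s]*(.*)"

def parse_step(llm_output: str):
    """Return ('finish', answer) or ('action', tool, tool_input) from raw model text."""
    if "Final Answer:" in llm_output:
        return ("finish", llm_output.split("Final Answer:")[-1].strip())
    match = re.search(ACTION_RE, llm_output, re.DOTALL)
    if not match:
        raise ValueError(f"Could not parse LLM output: `{llm_output}`")
    tool = match.group(1).strip()
    # Trim whitespace first, then any surrounding double quotes
    tool_input = match.group(2).strip().strip('"')
    return ("action", tool, tool_input)

sample = (
    "Thought: The user is asking about my identity.\n"
    "Action: Knowledge Base\n"
    'Action Input: "What is the identity of ChakyBot?"'
)
print(parse_step(sample))
print(parse_step("Thought: I now know the final answer\nFinal Answer: 42"))
```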

## LLMSingleActionAgent

In [None]:
from langchain.agents import LLMSingleActionAgent
from langchain.chains import LLMChain

# Build the LLM chain that drives the agent, using the history-aware prompt
llm_chain = LLMChain(
    llm=llm, 
    prompt=prompt_with_history)

tool_names = [tool.name for tool in expanded_tools]

multi_tool_agent = LLMSingleActionAgent(
    llm_chain=llm_chain, 
    output_parser=output_parser,
    stop=["\nObservation:"], 
    allowed_tools=tool_names
)

## AgentExecutor

In [None]:
from langchain.agents import AgentExecutor
from langchain.memory import ConversationBufferWindowMemory

# Initiate the agent that will respond to our queries
# Set verbose=True to share the CoT reasoning the LLM goes through
multi_tool_memory = ConversationBufferWindowMemory(k=2)
multi_tool_executor = AgentExecutor.from_agent_and_tools(
    agent=multi_tool_agent, 
    tools=expanded_tools, 
    verbose=True, 
    memory=multi_tool_memory)
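
`ConversationBufferWindowMemory(k=2)` keeps only the last two exchanges and supplies them to the prompt as `{history}`. The sketch below imitates that windowing with a `deque`; `WindowMemory` is a hypothetical class, and the `Human:`/`AI:` line format is an assumption about how the history is rendered, not a guarantee about LangChain's output.

```python
from collections import deque

class WindowMemory:
    """Hypothetical sliding-window memory: remembers only the last k exchanges."""

    def __init__(self, k: int = 2):
        # A deque with maxlen drops the oldest exchange automatically
        self.turns = deque(maxlen=k)

    def save(self, user_message: str, ai_message: str) -> None:
        self.turns.append((user_message, ai_message))

    def history(self) -> str:
        """Render the remembered exchanges as text for the {history} slot."""
        return "\n".join(f"Human: {user}\nAI: {ai}" for user, ai in self.turns)

window_memory = WindowMemory(k=2)
window_memory.save("q1", "a1")
window_memory.save("q2", "a2")
window_memory.save("q3", "a3")  # pushes the first exchange out of the window
print(window_memory.history())
```

A small window keeps the prompt short, at the cost of the agent forgetting anything older than the last `k` exchanges.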

In [None]:
multi_tool_executor.run("Who are you by the way?")



> Entering new AgentExecutor chain...
<re.Match object; span=(81, 153), match='Action: Knowledge Base\nAction Input: "What is th>
Question: Who are you by the way?
Thought: The user is asking about my identity.
Action: Knowledge Base
Action Input: "What is the identity of ChakyBot?"

Observation: Based on the given context, there is no information about a chatbot named "ChakyBot." Therefore, I don't have any information about the identity of ChakyBot.
I don't have any information about the identity of ChakyBot.
Final Answer: I am an NLP chatbot named ChakyBot, here to assist Chaky and Gun with any questions they have about Natural Language Processing (NLP).

> Finished chain.


'I am an NLP chatbot named ChakyBot, here to assist Chaky and Gun with any questions they have about Natural Language Processing (NLP).'

In [None]:
multi_tool_executor.run("What is the Transformers?")



> Entering new AgentExecutor chain...
<re.Match object; span=(232, 295), match='Action: Knowledge Base\nAction Input: What is the>
Thought: The Transformers is a powerful deep learning model architecture that has revolutionized the field of Natural Language Processing (NLP). It was introduced in the paper "Attention is All You Need" by Vaswani et al. in 2017.

Action: Knowledge Base
Action Input: What is the Transformers?

Observation: The Transformers are a type of neural network architecture that are used for sequence-to-sequence tasks, such as machine translation and natural language processing. They are made up of stacks of transformer blocks, which consist of simple linear layers, feedforward networks, and self-attention layers. The key innovation of transformers is the use of self-attention, which allows the network to extract and use information from arbitrarily large contexts without the need for recurrent connections. This makes transfor

'The Transformers are a type of neural network architecture used for sequence-to-sequence tasks in NLP. They utilize self-attention to efficiently extract information from large contexts without the need for recurrent connections.'