# Experiment parameters

* Goal: answer questions with an agent that chooses among Wikipedia, Google Search, and a document retriever.

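**Data Source**: NETR (National Energy Transition Roadmap), RMK-12 (Mid-Term Review of the Twelfth Malaysia Plan), NIMP 2030 (New Industrial Master Plan 2030), and Wikipedia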

**Splitter**: RecursiveCharacterTextSplitter

**Embedding**: [Cohere Embedding](https://cohere.com/)

**Retrieval**: similarity_search

**LLM**: ChatGoogleGenerativeAI


In [87]:
# Load environment variables (API keys) from a local .env file
import os

from dotenv import load_dotenv

load_dotenv()
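
The later cells read several API keys from the environment. A minimal sketch of a fail-fast check, under these assumptions: `COHERE_API_KEY` and `GOOGLE_API_KEY` are the names used further down in this notebook, `GOOGLE_CSE_ID` is the variable `GoogleSearchAPIWrapper` reads for the search engine ID, and the helper `missing_keys` is hypothetical.

```python
import os

# Keys this notebook relies on; adjust the list if your setup differs.
REQUIRED_KEYS = ["COHERE_API_KEY", "GOOGLE_API_KEY", "GOOGLE_CSE_ID"]


def missing_keys(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


missing = missing_keys(REQUIRED_KEYS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Running this right after `load_dotenv()` surfaces a missing key before any tool or model is constructed.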

# Set up tools for Wikipedia and Google Search

In [88]:
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

# Return only the top Wikipedia hit, truncated to 1000 characters
api_wrapper = WikipediaAPIWrapper(top_k_results=1, doc_content_chars_max=1000)
wiki = WikipediaQueryRun(api_wrapper=api_wrapper)

In [89]:
wiki.name

'wikipedia'

In [90]:
from langchain_community.utilities import GoogleSearchAPIWrapper
from langchain_core.tools import Tool

search = GoogleSearchAPIWrapper()

Google = Tool(
    name="google_search",
    description="Search Google for recent results.",
    func=search.run,
)

In [91]:
Google.name

'google_search'

# Load data from documents

In [92]:
from langchain_community.document_loaders import PyPDFLoader

# Windows-style relative paths, as used in the original run
filepaths = [
    "../a-pdfDocuments\\ksp_rmke_12.pdf",
    "../a-pdfDocuments\\NETR_Roadmap_0.pdf",
    "../a-pdfDocuments\\NIMP_2030.pdf",
]

# PyPDFLoader.load() returns one Document per PDF page
documents = []
for path in filepaths:
    documents.extend(PyPDFLoader(path).load())
documents


[Document(page_content='', metadata={'source': '../a-pdfDocuments\\ksp_rmke_12.pdf', 'page': 0}),
 Document(page_content='1 1\nMid-Term Review of the Twelfth Malaysia Plan\nExecutive SummaryMid-Term Review of the Twelfth Malaysia Plan\nExecutive Summary', metadata={'source': '../a-pdfDocuments\\ksp_rmke_12.pdf', 'page': 1}),
 Document(page_content='3\n  The Mid-Term Review \nof the Twelfth Malaysia \nPlan (MTR of the Twelfth \nPlan) is a commitment by \nthe Government to realise \nthe reforms demanded \nby the rakyat, in line with \nthe aspiration of Malaysia \nMADANI.   \n', metadata={'source': '../a-pdfDocuments\\ksp_rmke_12.pdf', 'page': 2}),
 Document(page_content='3\nMid-Term Review of the Twelfth Malaysia Plan\nExecutive Summary\nForeword\nThe Mid-Term Review of the Twelfth Malaysia Plan (Twelfth Plan) is a firm assurance of the Government to realise the needed reforms \ndesired by the rakyat , with full efforts in translating the aspiration of Malaysia MADANI  based on Ekonomi M

In [93]:
len(documents)

292
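
The 292 pages come from three PDFs, and each `Document` carries its file path in `metadata['source']`. A small sketch of a per-file page count as a sanity check; `pages_per_source` is a hypothetical helper, and `SimpleNamespace` objects with made-up file names stand in for real `Document` objects so the example runs on its own.

```python
from collections import Counter
from types import SimpleNamespace


def pages_per_source(docs):
    """Count how many page-level documents came from each source file."""
    return Counter(doc.metadata["source"] for doc in docs)


# Stand-in documents; in the notebook you would pass `documents` instead.
sample_docs = [
    SimpleNamespace(page_content="", metadata={"source": "a.pdf", "page": 0}),
    SimpleNamespace(page_content="Foreword", metadata={"source": "a.pdf", "page": 1}),
    SimpleNamespace(page_content="Overview", metadata={"source": "b.pdf", "page": 0}),
]

for source, count in pages_per_source(sample_docs).items():
    print(source, count)
```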

# Documents Splitter

Recursively split by character

In [94]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
    length_function=len,
    is_separator_regex=False,
)

chucks = text_splitter.split_documents(documents)
chucks[:10]

[Document(page_content='1 1\nMid-Term Review of the Twelfth Malaysia Plan\nExecutive SummaryMid-Term Review of the Twelfth Malaysia Plan\nExecutive Summary', metadata={'source': '../a-pdfDocuments\\ksp_rmke_12.pdf', 'page': 1}),
 Document(page_content='3\n  The Mid-Term Review \nof the Twelfth Malaysia \nPlan (MTR of the Twelfth \nPlan) is a commitment by \nthe Government to realise \nthe reforms demanded \nby the rakyat, in line with \nthe aspiration of Malaysia \nMADANI.', metadata={'source': '../a-pdfDocuments\\ksp_rmke_12.pdf', 'page': 2}),
 Document(page_content='3\nMid-Term Review of the Twelfth Malaysia Plan\nExecutive Summary\nForeword\nThe Mid-Term Review of the Twelfth Malaysia Plan (Twelfth Plan) is a firm assurance of the Government to realise the needed reforms \ndesired by the rakyat , with full efforts in translating the aspiration of Malaysia MADANI  based on Ekonomi MADANI . In this context, the last \ntwo years of the implementation of the Twelfth Plan will be an impo

In [95]:
len(chucks)

1011
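
To make `chunk_size=1000` and `chunk_overlap=200` concrete, here is a simplified sliding-window sketch. It is not the splitter's implementation: `RecursiveCharacterTextSplitter` additionally prefers to cut at separators such as paragraph breaks, newlines, and spaces, so real chunks are usually shorter than `chunk_size`. The function name `sliding_window_chunks` is hypothetical.

```python
def sliding_window_chunks(text, chunk_size=1000, chunk_overlap=200):
    """Cut `text` into windows of up to `chunk_size` characters, each sharing
    `chunk_overlap` characters with the previous window."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


# 2500 characters of repeating digits as a stand-in for page text
sample_text = "".join(str(i % 10) for i in range(2500))
pieces = sliding_window_chunks(sample_text)
print([len(piece) for piece in pieces])
```

The overlap means the last 200 characters of one chunk reappear at the start of the next, which helps a retrieved chunk keep context that would otherwise be cut at the boundary.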

# Embedding + Vectorstore

In [96]:
from langchain_cohere import CohereEmbeddings
from langchain_community.vectorstores import Chroma

embeddings_model = CohereEmbeddings(cohere_api_key=os.getenv("COHERE_API_KEY"))

db = Chroma.from_documents(chucks,embeddings_model)

In [97]:
db

<langchain_community.vectorstores.chroma.Chroma at 0x1b7c24b9c50>

# Retrieval

In [98]:
query = "What are the technological and infrastructure challenges?"
retrieved_results = db.similarity_search(query)
print(retrieved_results[0].page_content)

63
National Energy Transition Roadmap (NETR)Technology and Infrastructure  
Overview
Technology is a key determinant of success in unlocking new economic opportunities across the nation’s 
energy transition journey. It is crucial to facilitate conditions to foster innovation and new technology 
applications to create technological advantages across the energy sector. In addition, the scaling up of 
major energy infrastructure investments will be required to safeguard energy security, improve energy 
access and enhance environmental sustainability. Support will also be needed to encourage innovation 
especially for technologies at early stages of the maturity curve, but with high potential benefits and 
scalability. 
Challenges
The energy transition encounters significant technological and infrastructure challenges. The slow 
gradual uptake of sustainable practices within domestic industries impedes the swift transition to cleaner
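
`similarity_search` embeds the query and returns the stored chunks closest to it. A toy sketch of that ranking step using cosine similarity over hand-made 3-dimensional vectors: the texts, vectors, and helper names are invented for illustration, and the distance metric Chroma actually applies depends on how the collection is configured.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, indexed, k=2):
    """Return the k (score, text) pairs most similar to `query_vec`."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy index: (chunk text, embedding). Real embeddings have far more dimensions.
toy_index = [
    ("energy infrastructure challenges", [0.9, 0.1, 0.0]),
    ("manufacturing exports", [0.1, 0.9, 0.1]),
    ("education reform", [0.0, 0.2, 0.9]),
]

for score, text in top_k([0.8, 0.2, 0.0], toy_index):
    print(round(score, 3), text)
```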


In [99]:
"""
Retrievers: a retriever is an interface that returns documents given an
unstructured query. It is more general than a vector store: a retriever
does not need to store documents, only to return (retrieve) them. Vector
stores can serve as the backbone of a retriever, but other types of
retrievers exist as well.
https://python.langchain.com/docs/modules/data_connection/retrievers/
"""

docs_retriever = db.as_retriever()
docs_retriever

VectorStoreRetriever(tags=['Chroma', 'CohereEmbeddings'], vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x000001B7C24B9C50>)
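
The note above says a retriever only has to return documents, not store vectors. A sketch of that idea in plain Python: the `Retriever` protocol and `KeywordRetriever` class are hypothetical and are not LangChain classes; they only illustrate that anything mapping a query to a list of texts fits the interface.

```python
from typing import List, Protocol


class Retriever(Protocol):
    """Anything that maps a query string to a list of matching texts."""

    def retrieve(self, query: str) -> List[str]:
        ...


class KeywordRetriever:
    """Retriever without embeddings: matches on shared lowercase words."""

    def __init__(self, texts: List[str]) -> None:
        self.texts = list(texts)

    def retrieve(self, query: str) -> List[str]:
        terms = set(query.lower().split())
        return [text for text in self.texts if terms & set(text.lower().split())]


def answer_context(retriever: Retriever, query: str) -> str:
    """Join whatever the retriever returns into one context string."""
    return "\n".join(retriever.retrieve(query))


keyword_retriever = KeywordRetriever(
    ["Energy transition roadmap", "Industrial master plan", "Twelfth Malaysia Plan"]
)
print(answer_context(keyword_retriever, "master plan"))
```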

In [100]:
from langchain.tools.retriever import create_retriever_tool

# Arguments: the retriever, the tool name, and the description the agent reads
docs_tool = create_retriever_tool(
    docs_retriever,
    "docs_tool",
    "Search for information about Malaysian Economy based on the official blueprint. For any questions about plan and target, you must use this tool!",
)

In [101]:
tools = [wiki, Google, docs_tool]
tools

[WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper(wiki_client=<module 'wikipedia' from 'c:\\Users\\User\\Desktop\\ML Projects\\Malaysia_Economy_QnA_Guide_LMM_Apps\\venv_econ\\Lib\\site-packages\\wikipedia\\__init__.py'>, top_k_results=1, lang='en', load_all_available_meta=False, doc_content_chars_max=1000)),
 Tool(name='google_search', description='Search Google for recent results.', func=<bound method GoogleSearchAPIWrapper.run of GoogleSearchAPIWrapper(search_engine=<googleapiclient.discovery.Resource object at 0x000001B7C0705F90>, google_api_key='<redacted>', google_cse_id='<redacted>', k=10, siterestrict=False)>),
 Tool(name='docs_tool', description='Search for information about Malaysian Economy based on the official blueprint. For any questions about plan and target, you must use this tool!', args_schema=<class 'langchain.tools.retriever.RetrieverInput'>, func=functools.partial(<function _get_relevant_documents at 0x000001B7BC356660>, retrieve
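
Each tool above is essentially a name, a description, and a function that takes a string. A minimal sketch of how a tool call can be dispatched by name: `SimpleTool`, `build_registry`, and `run_tool` are hypothetical stand-ins rather than LangChain APIs, and the stub functions return canned strings instead of calling any service.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SimpleTool:
    name: str
    description: str
    func: Callable[[str], str]


def build_registry(tools: List[SimpleTool]) -> Dict[str, SimpleTool]:
    """Index tools by name so a chosen tool name can be looked up directly."""
    return {tool.name: tool for tool in tools}


def run_tool(registry: Dict[str, SimpleTool], name: str, tool_input: str) -> str:
    """Run the named tool, or report the valid names if it is unknown."""
    if name not in registry:
        return f"Unknown tool: {name}. Available: {', '.join(registry)}"
    return registry[name].func(tool_input)


# Stub tools mirroring the three names used in this notebook
registry = build_registry([
    SimpleTool("wikipedia", "General knowledge lookup.", lambda q: f"wiki:{q}"),
    SimpleTool("google_search", "Recent web results.", lambda q: f"google:{q}"),
    SimpleTool("docs_tool", "Official blueprint search.", lambda q: f"docs:{q}"),
])

print(run_tool(registry, "docs_tool", "NIMP 2030 targets"))
```

Returning a message for an unknown name, rather than raising, lets the model see its mistake and pick a valid tool on the next step.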

# LLM

In [102]:
from langchain_google_genai import ChatGoogleGenerativeAI
import google.generativeai as genai
genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))
llm = ChatGoogleGenerativeAI(model="gemini-pro")
llm

ChatGoogleGenerativeAI(model='gemini-pro', client=genai.GenerativeModel(
    model_name='models/gemini-pro',
    generation_config={},
    safety_settings={},
    tools=None,
    system_instruction=None,
))

In [103]:
from langchain import hub
# Pull the OpenAI-functions agent prompt to inspect its messages (the ReAct prompt pulled in the next cell replaces it)
prompt = hub.pull("hwchase17/openai-functions-agent")
prompt.messages

[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], template='You are a helpful assistant')),
 MessagesPlaceholder(variable_name='chat_history', optional=True),
 HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], template='{input}')),
 MessagesPlaceholder(variable_name='agent_scratchpad')]

In [104]:
# Use the ReAct prompt instead: create_react_agent expects a prompt with {tools}, {tool_names}, and {agent_scratchpad}
prompt = hub.pull("hwchase17/react")
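
The two prompts pulled above expose different variables, and only the ReAct one fits a ReAct agent. A small sketch of checking a template string for the placeholders a ReAct agent needs, using only the standard library; the helper `missing_placeholders` and the sample templates are illustrative and are not the hub prompts themselves.

```python
from string import Formatter

# Placeholders a ReAct-style prompt is expected to contain
REACT_VARIABLES = {"tools", "tool_names", "agent_scratchpad"}


def missing_placeholders(template, required=REACT_VARIABLES):
    """Return the required placeholder names that `template` does not use."""
    found = {field for _, field, _, _ in Formatter().parse(template) if field}
    return sorted(set(required) - found)


react_like = "Tools:\n{tools}\nChoose one of [{tool_names}]\nQuestion: {input}\n{agent_scratchpad}"
chat_like = "You are a helpful assistant\n{input}"

print(missing_placeholders(react_like))
print(missing_placeholders(chat_like))
```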

# Agent

In [105]:
# Build a ReAct agent from the LLM, the tools, and the ReAct prompt
from langchain.agents import create_react_agent

agent = create_react_agent(llm, tools, prompt)

In [106]:
# AgentExecutor runs the agent loop; verbose=True prints each intermediate step
from langchain.agents import AgentExecutor

agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
agent_executor

AgentExecutor(verbose=True, agent=RunnableAgent(runnable=RunnableAssign(mapper={
  agent_scratchpad: RunnableLambda(lambda x: format_log_to_str(x['intermediate_steps']))
})
| PromptTemplate(input_variables=['agent_scratchpad', 'input'], partial_variables={'tools': 'wikipedia: A wrapper around Wikipedia. Useful for when you need to answer general questions about people, places, companies, facts, historical events, or other subjects. Input should be a search query.\ngoogle_search: Search Google for recent results.\ndocs_tool: Search for information about Malaysian Economy based on the official blueprint. For any questions about plan and target, you must use this tool!', 'tool_names': 'wikipedia, google_search, docs_tool'}, metadata={'lc_hub_owner': 'hwchase17', 'lc_hub_repo': 'react', 'lc_hub_commit_hash': 'd15fe3c426f1c4b3f37c9198853e4a86e20c425ca7f4752ec0c9b0e97ca7ea4d'}, template='Answer the following questions as best you can. You have access to the following tools:\n\n{tools}\n\nUse
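
A ReAct agent has the model write plain text that is then parsed into either a tool call or a final answer. A simplified sketch of that parsing step, assuming the usual `Action:` / `Action Input:` / `Final Answer:` line format; `parse_step` and its regular expression are illustrative and are not LangChain's own output parser.

```python
import re

# Matches an "Action: <tool>" line followed by an "Action Input: <text>" line
ACTION_PATTERN = re.compile(
    r"Action\s*:\s*(.*?)\s*\n\s*Action\s*Input\s*:\s*(.*)", re.DOTALL
)


def parse_step(llm_text):
    """Turn one model reply into ('final', answer) or ('action', tool, tool_input)."""
    if "Final Answer:" in llm_text:
        return ("final", llm_text.split("Final Answer:", 1)[1].strip())
    match = ACTION_PATTERN.search(llm_text)
    if match is None:
        raise ValueError("Reply contains neither an action nor a final answer")
    return ("action", match.group(1).strip(), match.group(2).strip())


reply = "Thought: I should check the plan.\nAction: docs_tool\nAction Input: NIMP 2030 targets"
print(parse_step(reply))
print(parse_step("Thought: I now know the final answer\nFinal Answer: done"))
```

The executor repeats this cycle: parse the reply, run the named tool, append the result as an observation, and ask the model again until a final answer appears.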

In [None]:
agent_executor.invoke({"input": "What is the name of Malaysia's prime minister?"})


In [None]:
agent_executor.invoke({"input": "Can you summarize, point by point, what the New Industrial Master Plan 2030 is?"})
