# Experiment parameters

**Data Source**: NETR document

**Splitter**: RecursiveCharacterTextSplitter

**Embedding**: [Cohere Embedding](https://cohere.com/)

**Retrieval**: similarity_search

**LLM**: ChatGoogleGenerativeAI


In [1]:
# Load environment variables (API keys) from a local .env file
import os

from dotenv import load_dotenv

load_dotenv()
# os.environ["COHERE_API_KEY"] = os.getenv("COHERE_API_KEY")
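
`load_dotenv()` does not complain when a key is absent from the `.env` file, so a missing key only surfaces later as an authentication error. A minimal sketch of an early check, assuming the two keys this notebook reads later (`COHERE_API_KEY` and `GOOGLE_API_KEY`); `missing_keys` is a hypothetical helper, not part of `python-dotenv`:

```python
import os

# Keys this notebook reads further down (assumption based on later cells).
REQUIRED_KEYS = ("COHERE_API_KEY", "GOOGLE_API_KEY")


def missing_keys(required=REQUIRED_KEYS, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


absent = missing_keys()
if absent:
    print("Missing environment variables:", ", ".join(absent))
```

Running this right after `load_dotenv()` reports every missing key at once instead of failing on the first API call.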



# Data Loader

In [2]:
from langchain_community.document_loaders import PyPDFLoader

loader_NETR = PyPDFLoader("../a-pdfDocuments\\NETR_Roadmap_0.pdf")
pages_NETR = loader_NETR.load()

In [3]:
len(pages_NETR)

70

# Documents Splitter

### Exp1: Recursively split by character

In [4]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
    length_function=len,
    is_separator_regex=False,
)

chucks_NETR = text_splitter.split_documents(pages_NETR)
chucks_NETR[:10]

[Document(page_content='1\nNational Energy Transition Roadmap (NETR)', metadata={'source': '../a-pdfDocuments\\NETR_Roadmap_0.pdf', 'page': 0}),
 Document(page_content='2\nNational Energy Transition Roadmap (NETR)Published by:\nMinistry of Economy\nMenara Prisma\nNo. 26, Persiaran Perdana, Precint 3\nFederal Government Administrative Centre\n62675 Putrajaya, Malaysia\n       603 8090 2090   |          ukk@ekonomi.gov.my   |          ekonomi.gov.my\n©Publisher’s Copyright\nAll rights reserved. No part of this publication may be reproduced, copied, stored in any retrieval system or transmitted in \nany form or by any means-electronic, mechanical, photocopying, recording or otherwise, without prior permission of the \nMinistry of Economy, Malaysia.', metadata={'source': '../a-pdfDocuments\\NETR_Roadmap_0.pdf', 'page': 1}),
 Document(page_content='3\nNational Energy Transition Roadmap (NETR)Foreword  ..........................................................................................

In [5]:
len(chucks_NETR)

417
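
To build intuition for `chunk_size=1000` and `chunk_overlap=200`, the sketch below splits a string with a plain sliding window. This is a deliberately simplified model, not the algorithm `RecursiveCharacterTextSplitter` uses: the real splitter first tries to break on separators such as paragraphs and line breaks and treats `chunk_size` as an upper bound, and `split_documents` handles each page separately, which is why the title page above yields a very short chunk. `sliding_window_chunks` is a hypothetical name.

```python
def sliding_window_chunks(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-size windows where neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


sample = "".join(str(i % 10) for i in range(2500))
chunks = sliding_window_chunks(sample)
print([len(c) for c in chunks])
```

With these settings each window advances by 800 characters, so the last 200 characters of one chunk reappear at the start of the next.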

# Embedding + Vectorstore

In [6]:
from langchain_cohere import CohereEmbeddings
from langchain_community.vectorstores import Chroma

embeddings_model = CohereEmbeddings(cohere_api_key=os.getenv("COHERE_API_KEY"))

db = Chroma.from_documents(chucks_NETR,embeddings_model)

In [7]:
db

<langchain_community.vectorstores.chroma.Chroma at 0x24117594490>

# Retrieval

In [8]:
query = "What are the technological and infrastructure challenges?"
retrieved_results = db.similarity_search(query)
print(retrieved_results[0].page_content)

63
National Energy Transition Roadmap (NETR)Technology and Infrastructure  
Overview
Technology is a key determinant of success in unlocking new economic opportunities across the nation’s 
energy transition journey. It is crucial to facilitate conditions to foster innovation and new technology 
applications to create technological advantages across the energy sector. In addition, the scaling up of 
major energy infrastructure investments will be required to safeguard energy security, improve energy 
access and enhance environmental sustainability. Support will also be needed to encourage innovation 
especially for technologies at early stages of the maturity curve, but with high potential benefits and 
scalability. 
Challenges
The energy transition encounters significant technological and infrastructure challenges. The slow 
gradual uptake of sustainable practices within domestic industries impedes the swift transition to cleaner
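
Conceptually, a similarity search embeds the query, compares that vector against every stored chunk vector, and returns the closest chunks. The sketch below illustrates the idea with cosine similarity on toy 2-D vectors. It is an illustration only: the metric Chroma actually applies depends on how the collection is configured, and `cosine_similarity` and `top_k` are hypothetical helpers rather than LangChain or Chroma functions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=4):
    """Return the indices of the k document vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), i) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


# Toy "embeddings": document 0 points almost the same way as the query.
docs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]]
print(top_k([1.0, 0.1], docs, k=2))
```

In the notebook itself, the number of results can be controlled by passing `k` to `db.similarity_search(query, k=...)`.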


In [9]:
"""
Retrievers: a retriever is an interface that returns documents given
an unstructured query. It is more general than a vector store: a
retriever does not need to store documents, only to return (or
retrieve) them. Vector stores can serve as the backbone of a
retriever, but other types of retrievers exist as well.
https://python.langchain.com/docs/modules/data_connection/retrievers/
"""

retriever=db.as_retriever()
retriever

VectorStoreRetriever(tags=['Chroma', 'CohereEmbeddings'], vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x0000024117594490>)
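
The point that a retriever is more general than a vector store can be made concrete with a toy retriever that uses no embeddings at all. The sketch below ranks texts by keyword overlap with the query. `KeywordRetriever` is a hypothetical class written for illustration; it mimics the `invoke(query)` calling convention of LangChain retrievers but returns plain strings rather than `Document` objects.

```python
class KeywordRetriever:
    """Toy retriever: ranks texts by how many query words they contain."""

    def __init__(self, texts, k=2):
        self.texts = list(texts)
        self.k = k

    def invoke(self, query):
        words = set(query.lower().split())
        scored = []
        for text in self.texts:
            overlap = len(words & set(text.lower().split()))
            if overlap:  # skip texts that share no word with the query
                scored.append((overlap, text))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[: self.k]]


toy = KeywordRetriever(
    ["grid infrastructure challenges", "solar energy targets", "infrastructure funding"]
)
print(toy.invoke("infrastructure challenges"))
```

Anything exposing this query-in, documents-out behaviour can stand in for the vector-store retriever in the chains built later.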

# LLM

In [10]:
# import google.generativeai as genai
# genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))

# llm= genai.GenerativeModel('gemini-pro')
# #response= model.generate_content([input,image[0],prompt])
# #response.text
# llm

In [11]:
from langchain_google_genai import ChatGoogleGenerativeAI
import google.generativeai as genai
genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))
llm = ChatGoogleGenerativeAI(model="gemini-pro")
llm

ChatGoogleGenerativeAI(model='gemini-pro', client=genai.GenerativeModel(
    model_name='models/gemini-pro',
    generation_config={},
    safety_settings={},
    tools=None,
    system_instruction=None,
))

In [12]:
## Design ChatPrompt Template
from langchain_core.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_template("""
Answer the following question based only on the provided context. 
Think step by step before providing a detailed answer. 
I will tip you $1000 if the user finds the answer helpful. 
<context>
{context}
</context>
Question: {input}""")

# Chains

In [13]:
## Chain introduction
## Create a "stuff" documents chain: all retrieved documents are
## inserted into the prompt's {context} slot
from langchain.chains.combine_documents import create_stuff_documents_chain

document_chain = create_stuff_documents_chain(llm, prompt)
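
"Stuffing" means placing the text of every retrieved document into the `{context}` slot of the prompt before the LLM is called. The sketch below reproduces that step with plain string formatting. It is an approximation for illustration: `stuff_documents` is a hypothetical function, the template is a shortened copy of the one defined above, and joining chunks with a blank line is an assumption about the separator.

```python
PROMPT_TEMPLATE = """Answer the following question based only on the provided context.
<context>
{context}
</context>
Question: {input}"""


def stuff_documents(doc_texts, question, template=PROMPT_TEMPLATE, separator="\n\n"):
    """Join all document texts and place them in the template's {context} slot."""
    context = separator.join(doc_texts)
    return template.format(context=context, input=question)


filled = stuff_documents(["Chunk A text.", "Chunk B text."], "What is covered?")
print(filled)
```

Because every chunk is included verbatim, the prompt grows with the number and size of retrieved chunks, which is one reason `chunk_size` and the number of retrieved documents matter.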

In [14]:
"""
Retrieval chain: this chain takes in a user inquiry, which is then
passed to the retriever to fetch relevant documents. Those documents
(and the original inputs) are then passed to an LLM to generate a response.
https://python.langchain.com/docs/modules/chains/
"""
from langchain.chains import create_retrieval_chain
retrieval_chain=create_retrieval_chain(retriever,document_chain)
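
The data flow of a retrieval chain can be sketched without any LangChain objects: fetch documents for the input, hand them to the answering step together with the question, and return all three pieces. In the sketch below, `make_retrieval_chain`, `fake_retrieve`, and `fake_answer` are hypothetical stand-ins for the retriever and the document chain; only the `input`, `context`, and `answer` keys are taken from the chain output shown in this notebook.

```python
def make_retrieval_chain(retrieve, combine_and_answer):
    """Compose a retriever function and an answering function into one callable."""

    def invoke(inputs):
        question = inputs["input"]
        docs = retrieve(question)  # step 1: fetch relevant chunks
        answer = combine_and_answer({"input": question, "context": docs})  # step 2
        return {"input": question, "context": docs, "answer": answer}

    return invoke


def fake_retrieve(query):
    # Stand-in for the vector-store retriever.
    return ["Chunk about grid upgrades.", "Chunk about storage."]


def fake_answer(payload):
    # Stand-in for the prompt + LLM step.
    return f"Answered '{payload['input']}' from {len(payload['context'])} chunks."


chain = make_retrieval_chain(fake_retrieve, fake_answer)
result = chain({"input": "What are the challenges?"})
print(result["answer"])
```

Keeping `context` in the returned dictionary makes it possible to inspect which chunks an answer was based on.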

In [19]:
response=retrieval_chain.invoke({"input":"What are the technological and infrastructure challenges?"})
response

{'input': 'What are the technological and infrastructure challenges?',
 'context': [Document(page_content='63\nNational Energy Transition Roadmap (NETR)Technology and Infrastructure  \nOverview\nTechnology is a key determinant of success in unlocking new economic opportunities across the nation’s \nenergy transition journey. It is crucial to facilitate conditions to foster innovation and new technology \napplications to create technological advantages across the energy sector. In addition, the scaling up of \nmajor energy infrastructure investments will be required to safeguard energy security, improve energy \naccess and enhance environmental sustainability. Support will also be needed to encourage innovation \nespecially for technologies at early stages of the maturity curve, but with high potential benefits and \nscalability. \nChallenges\nThe energy transition encounters significant technological and infrastructure challenges. The slow \ngradual uptake of sustainable practices with

In [17]:
response=retrieval_chain.invoke({"input":"Who is the name of the ministry incharged?"})
response['answer']

'The provided text does not specify the name of the ministry in charge.'
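
The splitter output earlier in this notebook includes a chunk containing "Ministry of Economy", so one plausible explanation for this answer is that the chunk was not among those retrieved for the question. A minimal sketch for checking that, using plain strings; `chunks_containing` and `retrieval_hit` are hypothetical helpers and the sample texts are shortened stand-ins for real chunks:

```python
def chunks_containing(chunks, term):
    """Return the indices of chunks whose text contains `term` (case-insensitive)."""
    needle = term.lower()
    return [i for i, text in enumerate(chunks) if needle in text.lower()]


def retrieval_hit(retrieved_chunks, term):
    """True if at least one retrieved chunk contains `term`."""
    return bool(chunks_containing(retrieved_chunks, term))


all_chunks = [
    "Published by:\nMinistry of Economy",
    "Technology and Infrastructure overview",
    "Energy System Pathway",
]
retrieved = all_chunks[1:]  # pretend the retriever missed the first chunk

print(chunks_containing(all_chunks, "ministry"))
print(retrieval_hit(retrieved, "ministry"))
```

In the notebook, the same check could be run on `[d.page_content for d in chucks_NETR]` and on `[d.page_content for d in retriever.invoke(question)]`, where `question` holds the query string.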

In [18]:
response=retrieval_chain.invoke({"input":"Can you explain more about the Energy System Pathway?"})
response['answer']

"The Energy System Pathway of the National Energy Transition Roadmap (NETR) is a strategic plan designed to guide Malaysia's transition to a greener, low-carbon energy system by 2050. It aims to balance the energy trilemma, which involves ensuring energy security, affordability, and environmental sustainability. The pathway includes the following key elements:\n\n1. Increased use of renewable energy (RE) in the power generation mix: NETR aims to significantly increase the proportion of RE in the power generation mix to reduce reliance on fossil fuels and promote clean energy sources.\n\n2. Phasing out coal from the power generation mix: NETR plans to gradually phase out coal-fired power plants to reduce carbon emissions and improve air quality.\n\n3. Broad-based energy efficiency initiatives: NETR emphasizes the importance of energy efficiency measures to reduce energy consumption across key sectors, including residential, commercial, industrial, and transportation. This includes optim