# Document Chains Demo

In [8]:
import os
import getpass
import textwrap

from langchain_google_genai import GoogleGenerativeAI
from langchain.prompts import PromptTemplate
from langchain.chains.summarize import load_summarize_chain

from langchain.docstore.document import Document
from langchain.text_splitter import CharacterTextSplitter

from dotenv import load_dotenv

In [9]:
load_dotenv()

True

In [10]:
model = GoogleGenerativeAI(model="gemini-1.5-pro-latest", temperature=0.5)

### Stuff Chain

In [11]:
# The "stuff" approach puts all of the document text into a single prompt, which LangChain's StuffDocumentsChain sends to the model.
# The advantage is that it needs only one LLM call and the model sees all of the information at once.
# The limitation is that the combined text must fit in the model's context window.
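As a rough illustration of the stuffing step described above, the sketch below joins page texts and substitutes them into a single prompt. It is not LangChain's implementation: `stuff_prompt`, the sample `pages`, and the blank-line separator are assumptions made for the example.

```python
from typing import List


def stuff_prompt(template: str, page_texts: List[str], separator: str = "\n\n") -> str:
    """Join every page into one string and substitute it for {text}."""
    combined = separator.join(text.strip() for text in page_texts)
    return template.format(text=combined)


# Hypothetical stand-ins for the page contents of a loaded PDF.
pages = ["Team Name : HSBC Dynamos", "Brief about the Idea : ..."]
template = "Summarize the text below.\n-------\n{text}\n-------\nSummary:"

# One prompt holds every page, so a single LLM call would be enough.
single_prompt = stuff_prompt(template, pages)
print(single_prompt)
```

Because the whole document travels in one prompt, its length grows with the number of pages, which is why this approach only suits inputs that fit in the context window.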

In [12]:
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("HSBC_Dynamos_GenAI.pdf")
docs = loader.load()

In [14]:
for cnt, doc in enumerate(docs, start=1):
    print("---- Document #", cnt)
    print(doc.page_content.strip())

---- Document # 1
Team Name : HSBC Dynamos 
Problem Statement : Revolutionizing Credit Scoring with AI -Powered 
Personalization
---- Document # 2
Brief about the Idea :
This project is replacing out the old way of calculating credit scores and building a brand new system. Like a futuristic detective 
solving a mystery, they use advanced computer programs i.e. Generate AI, similar to those that create realistic pictures, to 
analyse your financial situation in a whole new way .
Here's the game -changer:
•Forget just bank statements! They consider your social media activity and how you pay your bills, getting a more complete 
picture.
•They become your personal financial coach, offering customized advice to improve your score.
•Ever wondered "what if" you did something different? This system lets you see how your score would change based on your 
actions.
•Confused about your score? No problem! They have a friendly AI assistant you can chat with to ask questions, like a virtual 
sidekic

In [15]:
prompt_template="""
You are given a document about our GenAI project as the text below:
-------
{text}
-------
Question: Please respond with a summary of the project in 80 words and list the tools used.
Summary:
Tools Used:
"""

In [16]:
prompt = PromptTemplate(template=prompt_template, input_variables=["text"])

stuff_chain = load_summarize_chain(model,
                                   chain_type="stuff",
                                   prompt=prompt
                                   )
output_summary = stuff_chain.run(docs)

In [17]:
print(output_summary)

Summary:

HSBC Dynamos aim to revolutionize credit scoring with AI-powered personalization. By leveraging alternative data like social media and utility payments, the project provides a comprehensive credit analysis, real-time simulations, and personalized advice through an NLP chatbot. Explainable AI ensures transparency, while seamless credit card statement integration enhances accuracy.

Tools Used:

* **Generative AI:** Google's Vertex AI
* **NLP Chatbot:** Google Dialogflow
* **Data Storage & Management:** Google BigQuery
* **Serverless Functionalities:** Google Cloud Functions 



### Refine Chain

The Refine Documents Chain uses an iterative process to generate a response by analyzing each input document and updating its answer accordingly.

It passes all non-document inputs, the current document, and the latest intermediate answer to an LLM chain to obtain a new answer for each document.

This chain suits tasks that involve more documents than fit in the model's context window, because it passes only a single document to the LLM at a time. The trade-off is that it makes one LLM call per document, and the calls must run in sequence.
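The iterative update described above can be sketched in plain Python. This is a simplified illustration, not the RefineDocumentsChain source: `refine_summarize`, the shortened templates, and `fake_llm` (a stub standing in for a real model call) are all assumptions made for the example.

```python
from typing import Callable, List

# Simplified stand-ins for the chain's initial and refine prompts.
INITIAL_TEMPLATE = 'Write a concise summary of the following:\n\n"{text}"\n\nCONCISE SUMMARY:'
REFINE_TEMPLATE = (
    "We have provided an existing summary up to a certain point: {existing_answer}\n"
    "Refine it (only if needed) with the context below.\n"
    "------------\n{text}\n------------\n"
)


def refine_summarize(page_texts: List[str], llm: Callable[[str], str]) -> str:
    """Summarize the first page, then refine that answer once per remaining page."""
    if not page_texts:
        raise ValueError("at least one document is required")
    answer = llm(INITIAL_TEMPLATE.format(text=page_texts[0]))
    for text in page_texts[1:]:
        # Each call sees only one document plus the latest intermediate answer.
        answer = llm(REFINE_TEMPLATE.format(existing_answer=answer, text=text))
    return answer


calls: List[str] = []


def fake_llm(prompt: str) -> str:
    """Stub model: records the prompt and returns a numbered placeholder."""
    calls.append(prompt)
    return f"summary v{len(calls)}"


final = refine_summarize(["page one", "page two", "page three"], fake_llm)
print(final, len(calls))  # prints: summary v3 3
```

Three pages produce three model calls, and each later prompt carries the previous answer forward, so the calls cannot be issued in parallel.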

In [19]:
refine_chain = load_summarize_chain(model, chain_type="refine")
print(refine_chain.refine_llm_chain.prompt.template)

Your job is to produce a final summary.
We have provided an existing summary up to a certain point: {existing_answer}
We have the opportunity to refine the existing summary (only if needed) with some more context below.
------------
{text}
------------
Given the new context, refine the original summary.
If the context isn't useful, return the original summary.


In [20]:
output_summary = refine_chain.run(docs)
output_summary

Retrying langchain_google_genai.llms._completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised ResourceExhausted: 429 Resource has been exhausted (e.g. check quota)..
Retrying langchain_google_genai.llms._completion_with_retry.<locals>._completion_with_retry in 8.0 seconds as it raised ResourceExhausted: 429 Resource has been exhausted (e.g. check quota)..


KeyboardInterrupt: 

### Map Reduce Chain

To process large amounts of data efficiently, LangChain provides the MapReduceDocumentsChain. It first applies an LLM chain to each document individually (the Map step), producing one new document per input. It then passes all of the new documents to a separate combine-documents chain to get a single output (the Reduce step). If the mapped documents are still too large to combine in one call, they can be compressed before being passed to the combine-documents chain; this compression step is applied recursively until they fit.
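The Map, compress, and Reduce steps above can be sketched as follows. This is an illustration under stated assumptions rather than the MapReduceDocumentsChain source: `map_reduce_summarize`, the character budget `max_chars` (the real chain works with token counts), the pairwise grouping, and the `fake_llm` stub are all hypothetical. The compression is written as a loop, which repeats the same collapse until the summaries fit.

```python
from typing import Callable, List


def map_reduce_summarize(
    page_texts: List[str], llm: Callable[[str], str], max_chars: int = 200
) -> str:
    """Summarize each page, collapse the summaries until they fit, then combine."""
    # Map step: one independent LLM call per document.
    summaries = [llm(f"Summarize:\n{text}") for text in page_texts]

    # Compression step: while the summaries exceed the budget, merge them in pairs.
    while len(summaries) > 1 and sum(len(s) for s in summaries) > max_chars:
        groups = [summaries[i:i + 2] for i in range(0, len(summaries), 2)]
        summaries = [llm("Summarize:\n" + "\n".join(group)) for group in groups]

    # Reduce step: a single call combines whatever summaries remain.
    return llm("Combine into one summary:\n" + "\n".join(summaries))


calls: List[str] = []


def fake_llm(prompt: str) -> str:
    """Stub model: records the prompt and returns a short numbered placeholder."""
    calls.append(prompt)
    return f"s{len(calls)}"


result = map_reduce_summarize(["a", "b", "c", "d"], fake_llm, max_chars=5)
print(result, len(calls))  # prints: s7 7
```

With four pages and a deliberately tiny budget, the run makes four map calls, two compression calls, and one reduce call. Unlike the refine approach, the map calls do not depend on each other.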

In [21]:
map_reduce_chain = load_summarize_chain(model,
                                        chain_type="map_reduce",
                                        verbose=True)

In [22]:
print(map_reduce_chain.llm_chain.prompt.template)

Write a concise summary of the following:


"{text}"


CONCISE SUMMARY:


In [23]:
output_summary = map_reduce_chain.run(docs)
print(output_summary)



> Entering new MapReduceDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Write a concise summary of the following:


"Team Name : HSBC Dynamos 
Problem Statement : Revolutionizing Credit Scoring with AI -Powered 
Personalization"


CONCISE SUMMARY:
Prompt after formatting:
Write a concise summary of the following:


"Brief about the Idea :
This project is replacing out the old way of calculating credit scores and building a brand new system. Like a futuristic detective 
solving a mystery, they use advanced computer programs i.e. Generate AI, similar to those that create realistic pictures, to 
analyse your financial situation in a whole new way .
Here's the game -changer:
•Forget just bank statements! They consider your social media activity and how you pay your bills, getting a more complete 
picture.
•They become your personal financial coach, offering customized advice to improve your score.
•Ever won

Retrying langchain_google_genai.llms._completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised ResourceExhausted: 429 Resource has been exhausted (e.g. check quota)..
Retrying langchain_google_genai.llms._completion_with_retry.<locals>._completion_with_retry in 8.0 seconds as it raised ResourceExhausted: 429 Resource has been exhausted (e.g. check quota)..
Retrying langchain_google_genai.llms._completion_with_retry.<locals>._completion_with_retry in 10.0 seconds as it raised ResourceExhausted: 429 Resource has been exhausted (e.g. check quota)..


KeyboardInterrupt: 