#### Document summarization application with the open-source Llama 2 LLM and LangChain, using SageMaker JumpStart

* Author: Dipjyoti Das. Last edited: Jan 25, 2024
* This notebook shows how to use SageMaker JumpStart for a text summarization use case. It uses the open-source, chat-fine-tuned Llama-2-7b-chat model from the JumpStart model hub together with LangChain, and also shows results from the Llama-2-13b-chat model.

#### Prerequisites
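* AWS Innovation Sandbox should be set up and a Domain created in SageMaker.
* The endpoint names used below assume the Llama 2 chat models have already been deployed from JumpStart.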

In [42]:
# Import the Boto3 and JSON modules
import json
import boto3
print(boto3.__version__)

import warnings
warnings.filterwarnings('ignore')

1.34.27


In [49]:
llama2_7b_chat_endpoint_name = 'jumpstart-dft-meta-textgeneration-l-20240125-212704'
llama2_7b_chat_InferenceComponentName = 'meta-textgeneration-llama-2-7b-f-20240125-212705'

llama2_13b_chat_endpoint_name = 'jumpstart-dft-meta-textgeneration-l-20240126-165505'
llama2_13b_chat_InferenceComponentName = 'meta-textgeneration-llama-2-13b-f-20240126-165505'

region_name = "us-east-1"

In [3]:
!pip install -q transformers pypdfium2 accelerate langchain

In [44]:
# Import the modules used to load the PDF, split it into chunks, and call the SageMaker endpoint:
import langchain
from langchain import SagemakerEndpoint, PromptTemplate
from langchain.llms.sagemaker_endpoint import LLMContentHandler
from langchain.chains.summarize import load_summarize_chain
from langchain.text_splitter import RecursiveCharacterTextSplitter, CharacterTextSplitter
from langchain import LLMChain

from langchain.document_loaders import TextLoader
from langchain.docstore.document import Document
from langchain.document_loaders import PyPDFium2Loader
import transformers
import torch

In [45]:
pdf_filepath = '/home/sagemaker-user/6_extracted_FM-esg-report-2022.pdf'

In [46]:
def pdf_output(pdf_filepath):
    """Load a PDF and split it into chunks with a target size of 1000 characters."""
    loader = PyPDFium2Loader(pdf_filepath)
    data = loader.load()  # one Document per PDF page
    text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
    texts_FM1 = text_splitter.split_documents(data)
    return texts_FM1

In [47]:
pdf_output(pdf_filepath)

[Document(page_content='About Fannie Mae\r\nWho we are\r\nThe Federal National Mortgage Association, better known as \r\nFannie Mae, is a purpose-driven company by charter and by \r\nchoice. Our business supports mortgage lenders by providing \r\nmortgage financing to help people buy or rent a home. We help \r\nmake the popular 30-year fixed-rate mortgage possible, enabling \r\npredictable mortgage payments over the life of the loan and \r\ngiving homeowners stability and peace of mind. \r\nOur charter, an act of Congress, establishes our purposes: \r\nto provide liquidity and stability to the residential mortgage \r\nmarket and to promote access to mortgage credit. This mandate \r\nincludes facilitating mortgages on housing for low- and \r\nmoderate-income families involving a reasonable economic \r\nreturn that may be less than the return earned on other \r\nactivities. Congress declared that our operations should be \r\nfinanced by private capital to the maximum extent feasible. Wit
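
To make the `chunk_size` and `chunk_overlap` parameters concrete, here is a minimal pure-Python sketch of fixed-window chunking. It is a simplification and not LangChain's implementation: `CharacterTextSplitter` first splits on a separator and then merges the pieces, so its real chunk boundaries will differ. The helper name `chunk_text` is hypothetical.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=0):
    """Split text into fixed-size character windows (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    # Each new window starts chunk_size - chunk_overlap characters after the previous one
    step = chunk_size - chunk_overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


sample = "x" * 2500
no_overlap = chunk_text(sample)
with_overlap = chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1)

print([len(chunk) for chunk in no_overlap])  # [1000, 1000, 500]
print(with_overlap)                          # ['abcd', 'defg', 'ghij', 'j']
```

With an overlap of 1, the last character of each window is repeated at the start of the next one, which can help keep context across chunk boundaries.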

In [14]:
# To make LangChain work with the Llama endpoint, define a content handler that serializes the request and parses the response:

class ContentHandlerTextSummarization(LLMContentHandler):
    content_type = "application/json"
    accepts = "application/json"

    def transform_input(self, prompt: str, model_kwargs: dict = None) -> bytes:
        # Serialize the prompt and generation parameters as the JSON request body
        input_str = json.dumps({"inputs": prompt, **(model_kwargs or {})})
        return input_str.encode("utf-8")

    def transform_output(self, output) -> str:
        # `output` is the streaming response body, so read and decode it before parsing
        response_json = json.loads(output.read().decode("utf-8"))
        generated_text = response_json[0]['generated_text']
        # Keep only the text after the last "summary:" marker, if one is present
        return generated_text.split("summary:")[-1]

content_handler = ContentHandlerTextSummarization()
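
The round trip performed by the content handler can be tried without an endpoint. The sketch below mirrors the two transform steps as plain functions and feeds them a fake response body. The function names and the fake response are hypothetical, and the response shape (a list holding one dict with a `generated_text` key) is an assumption taken from the handler code above, not from the endpoint's documentation.

```python
import io
import json


def build_request(prompt, model_kwargs=None):
    """Mirror of transform_input: JSON-encode the prompt plus generation parameters."""
    payload = {"inputs": prompt, **(model_kwargs or {})}
    return json.dumps(payload).encode("utf-8")


def parse_response(body):
    """Mirror of transform_output: read a file-like body and extract the text."""
    response_json = json.loads(body.read().decode("utf-8"))
    generated_text = response_json[0]["generated_text"]
    return generated_text.split("summary:")[-1]


def make_body(text):
    """Build a fake file-like response body (assumed response shape)."""
    return io.BytesIO(json.dumps([{"generated_text": text}]).encode("utf-8"))


request = build_request("hello", {"max_new_tokens": 64})
matched = parse_response(make_body("prompt text summary: - point one"))
unmatched = parse_response(make_body("BULLET POINT SUMMARY: - point one"))

print(json.loads(request))  # {'inputs': 'hello', 'max_new_tokens': 64}
print(repr(matched))        # ' - point one'
print(repr(unmatched))      # 'BULLET POINT SUMMARY: - point one'
```

Note that `split("summary:")` is case-sensitive: a marker written in upper case is not matched, so the full generated text is returned unchanged in that case.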

In [57]:
summary_model_llm = SagemakerEndpoint(
    endpoint_name=llama2_13b_chat_endpoint_name,
    region_name=region_name,
    model_kwargs={
        "max_new_tokens": 2000,
        "top_p": 0.9,
        "temperature": 0.6,
        "top_k": 10,
        "do_sample": True,
        "max_length": 1000,
    },
    endpoint_kwargs={
        "CustomAttributes": "accept_eula=true",
        "InferenceComponentName": llama2_13b_chat_InferenceComponentName,
    },
    content_handler=content_handler,
)

In [51]:
template = """
              Write a summary of the following text delimited by triple backquotes.
              Return your response in bullet points which covers the key points of the text.
              ```{text}```
              BULLET POINT SUMMARY:
           """

prompt = PromptTemplate(template=template, input_variables=["text"])
print(prompt)

input_variables=['text'] template='\n              Write a summary of the following text delimited by triple backquotes.\n              Return your response in bullet points which covers the key points of the text.\n              ```{text}```\n              BULLET POINT SUMMARY:\n           '


In [52]:
llm_chain = LLMChain(prompt=prompt, llm=summary_model_llm)

In [22]:
# Result of Llama-2-7b-chat model
print(llm_chain.run(pdf_output(pdf_filepath)))

 • Fannie Mae is a purpose-driven company by charter and by choice,


In [53]:
# Result of Llama-2-13b-chat model
print(llm_chain.run(pdf_output(pdf_filepath)))



              • Fannie Mae is a purpose-driven company that provides liquidity


In [54]:
def summarize_pdf_full(pdf_filepath):
    loader = PyPDFium2Loader(pdf_filepath)
    data = loader.load()

    text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
    text_FM = text_splitter.split_documents(data)

    template = """
              Write a concise summary of the following text delimited by triple backquotes.
              Return your response in bullet points which covers the key points of the text.
              ```{text}```
              BULLET POINT SUMMARY:
           """

    prompt = PromptTemplate(template=template, input_variables=["text"])

    llm_chain = LLMChain(prompt=prompt, llm=summary_model_llm)

    return llm_chain.run(text_FM)

In [56]:
pdf_filepath

'/home/sagemaker-user/6_extracted_FM-esg-report-2022.pdf'

In [38]:
# result of Llama2-7b-chat model
summarize_pdf_full(pdf_filepath = pdf_filepath)

' • Fannie Mae is a purpose-driven company by charter and by choice.'

In [58]:
# result of Llama2-13b-chat model
summarize_pdf_full(pdf_filepath = pdf_filepath)

'\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n'
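
The runs above pass the whole list of chunks to the chain in a single call, so the full document ends up in one prompt. One possible alternative, not verified against these endpoints, is to summarize each chunk separately and then summarize the partial summaries. The sketch below shows that staged pattern with a stub in place of the model. `summarize_in_stages`, `fake_llm`, and the shortened `PROMPT` are hypothetical names introduced for illustration.

```python
from typing import Callable, List

# Simplified stand-in for the notebook's prompt template (assumption, not the original text)
PROMPT = "Summarize the following text in bullet points.\n{text}\nBULLET POINT SUMMARY:"


def summarize_in_stages(chunks: List[str], llm: Callable[[str], str]) -> str:
    """Summarize each non-empty chunk, then summarize the combined partial summaries."""
    partial = [llm(PROMPT.format(text=chunk)) for chunk in chunks if chunk.strip()]
    if not partial:
        return ""
    if len(partial) == 1:
        return partial[0]
    return llm(PROMPT.format(text="\n".join(partial)))


# Stub model so the sketch runs without an endpoint: it only reports the prompt length
calls = []


def fake_llm(prompt: str) -> str:
    calls.append(prompt)
    return f"- summary of {len(prompt)} characters"


final = summarize_in_stages(["first chunk", "second chunk", "   "], fake_llm)
print(len(calls))  # 3: one call per non-empty chunk, plus one combining call
print(final)
```

In the notebook, the chunk strings would come from the `page_content` attribute of each split `Document`, and `llm` would be a callable that sends a prompt to `summary_model_llm`. LangChain's `load_summarize_chain`, imported earlier but unused, offers a `map_reduce` chain type that follows the same idea.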

In [25]:
!pip install gradio



In [26]:
import gradio as gr
print(gr.__version__)

4.15.0


In [None]:
# A local Windows path such as
# C:\Users\Dipjyoti\PycharmProjects\GenerativeAI-projects\Summarization-use-case\6_extracted_FM-esg-report-2022.pdf
# is not going to work in Gradio, because the app runs on SageMaker - use the SageMaker path instead

In [39]:
pdf_filepath

'/home/sagemaker-user/6_extracted_FM-esg-report-2022.pdf'

In [40]:
input_pdf_path = gr.components.Textbox(label="Provide the PDF file path")
output_summary = gr.components.Textbox(label="Summary")

interface = gr.Interface(
    fn=summarize_pdf_full,
    inputs=input_pdf_path,
    outputs=output_summary,
    title="PDF Summarizer",
    description="Provide PDF file path to get the summary.",
).queue().launch(share=True, debug=False)

Running on local URL:  http://127.0.0.1:7866
Running on public URL: https://2bd0a720ff5f172289.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)


Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/gradio/queueing.py", line 495, in call_prediction
    output = await route_utils.call_process_api(
  File "/opt/conda/lib/python3.10/site-packages/gradio/route_utils.py", line 232, in call_process_api
    output = await app.get_blocks().process_api(
  File "/opt/conda/lib/python3.10/site-packages/gradio/blocks.py", line 1561, in process_api
    result = await self.call_function(
  File "/opt/conda/lib/python3.10/site-packages/gradio/blocks.py", line 1179, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/opt/conda/lib/python3.10/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/opt/conda/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "/opt/conda/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 807, in run
    re

In [31]:
# Provide the PDF file path = /home/sagemaker-user/6_extracted_FM-esg-report-2022.pdf

'/home/sagemaker-user'
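
Because the app takes a free-text path, a wrong or local-only path only shows up as a server-side traceback. The sketch below is one way to return a readable message in the output textbox instead. It does not diagnose the truncated traceback above, whose cause is not visible here. `safe_summarize` is a hypothetical helper, and the summarizing function is passed in so the sketch runs on its own.

```python
import os
from typing import Callable


def safe_summarize(pdf_filepath: str, summarize_fn: Callable[[str], str]) -> str:
    """Validate the path, then call summarize_fn; return a message instead of raising."""
    # Tolerate surrounding whitespace and quotes pasted along with the path
    path = (pdf_filepath or "").strip().strip("'\"")
    if not path.lower().endswith(".pdf"):
        return "Please provide a path to a .pdf file."
    if not os.path.isfile(path):
        return f"File not found on the machine running the app: {path}"
    try:
        return summarize_fn(path)
    except Exception as exc:  # show the failure in the output textbox
        return f"Summarization failed: {type(exc).__name__}: {exc}"


print(safe_summarize("/no/such/dir/report.pdf", lambda p: "unused"))
print(safe_summarize("notes.txt", lambda p: "unused"))
```

In the Gradio interface, the wrapper could be passed as `fn=lambda p: safe_summarize(p, summarize_pdf_full)`.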

In [41]:
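# Delete the inference endpoint to avoid incurring unnecessary costs.
# delete_endpoint belongs to the 'sagemaker' client. The 'runtime.sagemaker' client
# only invokes endpoints, and calling delete_endpoint on it raised:
# AttributeError: 'SageMakerRuntime' object has no attribute 'delete_endpoint'
client = boto3.client('sagemaker', region_name=region_name)
client.delete_endpoint(EndpointName=llama2_7b_chat_endpoint_name)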