## Utility Chains Overview
### Summarizing Documents

#### load_summarize_chain

In [1]:
import os
# Placeholder values: substitute your own keys and keep real keys out of shared notebooks
os.environ["OPENAI_API_KEY"] = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
os.environ["HUGGINGFACEHUB_API_TOKEN"] = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

In [2]:
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains.summarize import load_summarize_chain
from langchain.text_splitter import CharacterTextSplitter
from langchain.docstore.document import Document

In [3]:
llm = OpenAI(temperature=0.9)

In [4]:
# Reading the document
with open("sample.txt") as f:
    data = f.read()

In [5]:
# Split text
text_splitter = CharacterTextSplitter()
texts = text_splitter.split_text(data)
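To make the splitting step less of a black box, here is a minimal pure-Python sketch of separator-based chunking. It is not LangChain's implementation: `split_on_separator` is a hypothetical helper, the `"\n\n"` separator and `chunk_size=4000` are assumed defaults, and chunk overlap is left out for brevity.

```python
from typing import List


def split_on_separator(text: str, separator: str = "\n\n", chunk_size: int = 4000) -> List[str]:
    """Split on `separator`, then greedily pack pieces into chunks of at most `chunk_size` characters."""
    pieces = [p.strip() for p in text.split(separator) if p.strip()]
    chunks: List[str] = []
    current = ""
    for piece in pieces:
        candidate = piece if not current else current + separator + piece
        # An oversized single piece is kept whole rather than cut mid-paragraph.
        if len(candidate) <= chunk_size or not current:
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "aaaa\n\nbbbb\n\ncccc"
print(split_on_separator(sample, chunk_size=10))
```

With a small `chunk_size`, the first two paragraphs share a chunk and the third starts a new one. Each resulting string is what later gets wrapped in a `Document`.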

In [7]:
# Create multiple documents
docs = [Document(page_content=t) for t in texts]


In [8]:
docs

[Document(page_content="A large language model (LLM) is a computational model notable for its ability to achieve \ngeneral-purpose language generation and other natural language processing tasks such as\n classification. Based on language models, LLMs acquire these abilities by learning\n  statistical relationships from vast amounts of text during a computationally intensive\n   self-supervised and semi-supervised training process.[1] LLMs can be used for text \n   generation, a form of generative AI, by taking an input text and \nrepeatedly predicting the next token or word.[2]\n\nLLMs are artificial neural networks that utilize the transformer architecture, invented\n in 2017. The largest and most capable LLMs, as of June 2024, are built with a decoder-only \n transformer-based architecture, which enables efficient processing and generation of \n large-scale text data.\n\nHistorically, up to 2020, fine-tuning was the primary method used to adapt a model for\n specific tasks. However,

In [9]:
chain = load_summarize_chain(llm, chain_type="map_reduce", verbose=True)
chain.run(docs)



> Entering new MapReduceDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Write a concise summary of the following:


"A large language model (LLM) is a computational model notable for its ability to achieve 
general-purpose language generation and other natural language processing tasks such as
 classification. Based on language models, LLMs acquire these abilities by learning
  statistical relationships from vast amounts of text during a computationally intensive
   self-supervised and semi-supervised training process.[1] LLMs can be used for text 
   generation, a form of generative AI, by taking an input text and 
repeatedly predicting the next token or word.[2]

LLMs are artificial neural networks that utilize the transformer architecture, invented
 in 2017. The largest and most capable LLMs, as of June 2024, are built with a decoder-only 
 transformer-based architecture, which enables efficient processing and genera

'\n\nLLMs are advanced language models that use large amounts of text to perform tasks such as language generation and classification. They are based on artificial neural networks and are able to handle large-scale text data. However, they may also have biases from the data they are trained on. Nepal is a diverse and culturally rich country with a federal democratic system. After adopting a new constitution, the Nepalese people remain resilient in building a stable and prosperous nation. Maithili culture, rooted in the Mithila region, has a strong emphasis on traditions, art forms, and community bonds. Efforts are being made to preserve and promote this culture in modern times.'
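The final summary touches several unrelated topics, which is consistent with how a map-reduce summary works: each chunk is summarized on its own, and the partial summaries are then combined. The sketch below shows only that control flow. `map_reduce_summarize`, `first_sentence`, and `join_summaries` are hypothetical names, and the two toy functions stand in for the LLM calls the real chain would make.

```python
from typing import Callable, List


def map_reduce_summarize(
    chunks: List[str],
    map_fn: Callable[[str], str],
    combine_fn: Callable[[List[str]], str],
) -> str:
    # Map step: summarize every chunk independently.
    partial_summaries = [map_fn(chunk) for chunk in chunks]
    # Reduce step: merge the partial summaries into a single result.
    return combine_fn(partial_summaries)


def first_sentence(text: str) -> str:
    """Toy stand-in for an LLM summary: keep everything up to the first period."""
    head, _, _ = text.partition(".")
    return head.strip() + "."


def join_summaries(summaries: List[str]) -> str:
    """Toy stand-in for the combine prompt: concatenate the partial summaries."""
    return " ".join(summaries)


chunks = [
    "LLMs generate text. They are trained on large corpora.",
    "Nepal is a diverse country. It has a federal system.",
]
print(map_reduce_summarize(chunks, first_sentence, join_summaries))
```

Because the map step sees one chunk at a time, no single call needs the whole document in its context window. The trade-off is that details spanning two chunks can be lost.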

### HTTP Requests

#### LLMRequestsChain

In [10]:
from langchain.chains import LLMRequestsChain, LLMChain

In [11]:
template = """
Extract the answer to the question '{query}' or say "not found" if the information is not available.
{requests_result}
"""

PROMPT = PromptTemplate(
    input_variables=["query", "requests_result"],
    template=template,
)
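To see what the model eventually receives, the template can be filled in by hand. The sketch below uses plain `str.format` as a stand-in for the prompt object, and the page text passed as `requests_result` is made up for illustration.

```python
template = """
Extract the answer to the question '{query}' or say "not found" if the information is not available.
{requests_result}
"""

# Hypothetical page text; in the chain this slot holds the text scraped from the URL.
filled = template.format(
    query="What is the capital of India?",
    requests_result="New Delhi is the capital of India.",
)
print(filled)
```

The two placeholders match the names listed in `input_variables`: `query` is supplied by the caller, while `requests_result` is filled in from the fetched page.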

In [12]:
llm = OpenAI()

In [13]:
chain = LLMRequestsChain(llm_chain=LLMChain(llm=llm, prompt=PROMPT))

In [14]:
question = "What is the capital of India?"
inputs = {
    "query": question,
    "url": "https://www.google.com/search?q=" + question.replace(" ", "+"),
}

In [15]:
chain(inputs)

{'query': 'What is the capital of India?',
 'url': 'https://www.google.com/search?q=What+is+the+capital+of+India?',
 'output': '\nNew Delhi is the capital of India.'}
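Note that the URL above still ends in a literal `?`, because only spaces were replaced. A more robust way to build the query string is the standard library's `urllib.parse.urlencode`. The helper name `build_search_url` below is hypothetical.

```python
from urllib.parse import urlencode


def build_search_url(question: str, base: str = "https://www.google.com/search") -> str:
    """Build a search URL with the question percent-encoded as the `q` parameter."""
    return f"{base}?{urlencode({'q': question})}"


print(build_search_url("What is the capital of India?"))
```

This encodes spaces as `+` and reserved characters such as `?` and `&` as percent escapes, so a question containing them cannot alter the structure of the URL.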

In [16]:
import inspect
print(inspect.getsource(chain._call))

    def _call(
        self,
        inputs: Dict[str, Any],
        run_manager: Optional[CallbackManagerForChainRun] = None,
    ) -> Dict[str, Any]:
        from bs4 import BeautifulSoup

        _run_manager = run_manager or CallbackManagerForChainRun.get_noop_manager()
        # Other keys are assumed to be needed for LLM prediction
        other_keys = {k: v for k, v in inputs.items() if k != self.input_key}
        url = inputs[self.input_key]
        res = self.requests_wrapper.get(url)
        # extract the text from the html
        soup = BeautifulSoup(res, "html.parser")
        other_keys[self.requests_key] = soup.get_text()[: self.text_length]
        result = self.llm_chain.predict(  # type: ignore[attr-defined]
            callbacks=_run_manager.get_child(), **other_keys
        )
        return {self.output_key: result}
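The source above boils down to three steps: fetch the page, strip the HTML down to text, and truncate that text before it reaches the prompt. The sketch below reproduces only the strip-and-truncate part. It uses the standard library's `HTMLParser` instead of BeautifulSoup, `html_to_text` is a hypothetical helper, and the `text_length` default of 8000 is an assumption.

```python
from html.parser import HTMLParser
from typing import List


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document, ignoring the tags."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: List[str] = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def html_to_text(html: str, text_length: int = 8000) -> str:
    """Return the concatenated text content of `html`, cut to `text_length` characters."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)[:text_length]


page = "<html><body><h1>India</h1><p>Capital: New Delhi</p></body></html>"
print(html_to_text(page))
print(html_to_text(page, text_length=5))
```

The truncation matters in practice: a search results page can be far longer than the model's context window, so only the leading slice of the page text is ever seen by the LLM.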

