
In [None]:
from langchain.document_loaders import UnstructuredFileLoader
from langchain.chains.summarize import load_summarize_chain
from langchain.chains.question_answering import load_qa_chain

### Load Documents

In [None]:
sm_loader = UnstructuredFileLoader("../data/muir_lake_tahoe_in_winter.txt")
sm_doc = sm_loader.load()

lg_loader = UnstructuredFileLoader("../data/PaulGrahamEssays/worked.txt")
lg_doc = lg_loader.load()

In [None]:
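def doc_summary(docs):
    """Print the document count, a rough word count, and a short preview."""
    print(f'You have {len(docs)} document(s)')

    # Rough count: split each document on single spaces
    num_words = sum(len(doc.page_content.split(' ')) for doc in docs)

    print(f'You have roughly {num_words} words in your docs')
    print()
    # Preview: everything up to the first ". " in the first document
    print(f'Preview: \n{docs[0].page_content.split(". ")[0]}')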

In [None]:
doc_summary(sm_doc)

You have 1 document(s)
You have roughly 2298 words in your docs

Preview: 
The winter glory of the Sierra ! How little is known of it! Californians admire descriptions of the Swiss Alps, reading with breathless interest how ice and snow load their sublime heights, and booming avalanches sweep in glorious array through their crowded forests, while our own icy, snow-laden mountains, with their unrivaled forests, loom unnoticed along our eastern horizon


In [None]:
doc_summary(lg_doc)

You have 1 document(s)
You have roughly 12551 words in your docs

Preview: 
February 2021Before college the two main things I worked on, outside of school,

were writing and programming


### Load Your LLM

In [None]:
from langchain.llms import OpenAI

In [None]:
OPENAI_API_KEY = '...'

In [None]:
llm = OpenAI(openai_api_key=OPENAI_API_KEY)

### Summarize: Stuff

In [None]:
chain = load_summarize_chain(llm, chain_type="stuff", verbose=True)
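
The `stuff` chain type places every document into a single prompt and calls the model once. A minimal sketch of that idea in plain Python, not LangChain's implementation: `stuff_summarize` and `counting_llm` are hypothetical names, the prompt text is the opening line that the verbose output prints, and joining documents with a blank line is an assumption.

```python
from typing import Callable, List

# Opening of the prompt as printed by the verbose chain output.
PROMPT = 'Write a concise summary of the following:\n\n\n"{text}"'


def stuff_summarize(texts: List[str], llm: Callable[[str], str]) -> str:
    """Join every document into one prompt and make a single model call."""
    combined = "\n\n".join(texts)  # assumed separator between documents
    return llm(PROMPT.format(text=combined))


calls: List[str] = []


def counting_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model: records the prompt it receives.
    calls.append(prompt)
    return f"summary of {len(prompt)} characters"


result = stuff_summarize(["First page.", "Second page."], counting_llm)
print(len(calls), "model call(s)")
```

Because everything goes into one call, this approach only works while the combined text fits in the model's context window.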

In [None]:
chain.run(sm_doc)



> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Write a concise summary of the following:


"The winter glory of the Sierra ! How little is known of it! Californians admire descriptions of the Swiss Alps, reading with breathless interest how ice and snow load their sublime heights, and booming avalanches sweep in glorious array through their crowded forests, while our own icy, snow-laden mountains, with their unrivaled forests, loom unnoticed along our eastern horizon. True, only mountaineers may penetrate their snow-blocked fastnesses to behold them in all their white wild grandeur, but to every healthy man and woman, and even to children, many of the subalpine valleys and lake-basins, six or seven thousand feet above the sea, remain invitingly open and approachable all winter. With a friend and his two little sons I have just returned from a week of bracing weathering around Lake Tahoe, in which we


> Finished chain.

> Finished chain.


" This article is a description of a winter trip to Lake Tahoe, California. It highlights the mild winter weather and snow-covered mountains, as well as the abundance of glacier lakes in the Sierra region. The author also mentions the local lumber companies and their destructive effects on the lake's forests, as well as the activities of the few remaining winter dwellers. The article concludes with a description of a snowshoeing excursion down the Truckee canon to the railroad."

In [None]:
chain.run(lg_doc)
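
The essay is several times longer than the Muir text, so a single `stuff` prompt may not fit in the model's context window. A rough check, assuming about 0.75 words per token (a common rule of thumb for English, not a real tokenizer) and a hypothetical 4,097-token limit; `estimate_tokens` and `CONTEXT_LIMIT` are illustrative names, not part of LangChain.

```python
def estimate_tokens(num_words: int, words_per_token: float = 0.75) -> int:
    """Rough token estimate from a word count (rule of thumb, not a tokenizer)."""
    return round(num_words / words_per_token)


# Assumed context window; check the limit of the model you actually use.
CONTEXT_LIMIT = 4097

# Word counts reported by doc_summary() above.
word_counts = {"sm_doc": 2298, "lg_doc": 12551}

for name, words in word_counts.items():
    tokens = estimate_tokens(words)
    verdict = "may fit" if tokens < CONTEXT_LIMIT else "likely too long"
    print(f"{name}: ~{tokens} tokens, {verdict} in one prompt")
```

The model's reply also counts against the limit, so even a prompt that "may fit" needs some headroom.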

### Summarize: Map Reduce

In [None]:
chain = load_summarize_chain(llm, chain_type="map_reduce", verbose=True)

In [None]:
chain.run(sm_doc)

In [None]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

In [None]:
text_splitter = RecursiveCharacterTextSplitter(
    # Deliberately small chunk size (in characters), just to demonstrate splitting.
    chunk_size = 400,
    chunk_overlap = 0
)
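
To give a feel for what a recursive splitter does, here is a simplified sketch in plain Python. It is not LangChain's implementation: `recursive_split` and its separator list are hypothetical, and it ignores `chunk_overlap`. The idea is to split on the coarsest separator first and fall back to finer ones only for pieces that are still too long.

```python
from typing import List, Sequence


def recursive_split(text: str, chunk_size: int,
                    separators: Sequence[str] = ("\n\n", "\n", " ")) -> List[str]:
    """Split text into chunks of at most chunk_size characters."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # No separator left: fall back to a hard cut every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks: List[str] = []
    current = ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep packing pieces into the current chunk
            continue
        if current:
            chunks.append(current)
        if len(piece) > chunk_size:
            # The piece alone is too long: retry with the finer separators.
            chunks.extend(recursive_split(piece, chunk_size, finer))
            current = ""
        else:
            current = piece
    if current:
        chunks.append(current)
    return chunks


chunks = recursive_split("aaa bbb ccc\n\nddd eee", chunk_size=7)
print(chunks)
```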

In [None]:
lg_docs = text_splitter.split_documents(lg_doc)

In [None]:
doc_summary(lg_docs)

You have 201 document(s)
You have roughly 12751 words in your docs

Preview: 
February 2021Before college the two main things I worked on, outside of school,

were writing and programming


In [None]:
chain.run(lg_docs[:5])
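
Conceptually, `map_reduce` summarizes each chunk on its own (the map step) and then summarizes the combined partial summaries (the reduce step). A minimal sketch under stated assumptions: `map_reduce_summarize` and `fake_llm` are hypothetical, the prompt wording is illustrative, and only a single reduce step is shown.

```python
from typing import Callable, List

calls: List[str] = []


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model: numbers each call it receives.
    calls.append(prompt)
    return f"<summary {len(calls)}>"


def map_reduce_summarize(texts: List[str], llm: Callable[[str], str]) -> str:
    """Summarize each chunk independently, then summarize the summaries."""
    # Map: one model call per chunk; these calls do not depend on each other.
    partials = [llm(f"Write a concise summary of the following:\n\n{t}") for t in texts]
    # Reduce: one more call over the joined partial summaries.
    combined = "\n\n".join(partials)
    return llm(f"Write a concise summary of the following:\n\n{combined}")


final = map_reduce_summarize(["chunk one", "chunk two", "chunk three"], fake_llm)
print(final, "after", len(calls), "model calls")
```

For N chunks this makes N + 1 model calls, and the map calls could run in parallel since none of them needs another's result.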

### Summarize: Refine

In [None]:
chain = load_summarize_chain(llm, chain_type="refine", verbose=True)

In [None]:
chain.run(lg_docs[:5])
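
Conceptually, `refine` walks through the chunks in order: it summarizes the first one, then asks the model to update that running summary with each following chunk. A minimal sketch under stated assumptions: `refine_summarize` and `fake_llm` are hypothetical, and the prompt wording is illustrative rather than LangChain's own.

```python
from typing import Callable, List

calls: List[str] = []


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model: numbers each call it receives.
    calls.append(prompt)
    return f"<summary {len(calls)}>"


def refine_summarize(texts: List[str], llm: Callable[[str], str]) -> str:
    """Build a summary from the first chunk, then refine it chunk by chunk."""
    if not texts:
        raise ValueError("need at least one chunk")
    summary = llm(f"Write a concise summary of the following:\n\n{texts[0]}")
    for text in texts[1:]:
        # Each call sees the running summary plus exactly one new chunk.
        summary = llm(
            f"Existing summary:\n{summary}\n\n"
            f"Refine it using this additional context:\n{text}"
        )
    return summary


final = refine_summarize(["chunk one", "chunk two", "chunk three"], fake_llm)
print(final, "after", len(calls), "model calls")
```

Each call depends on the previous result, so the steps run sequentially: N chunks mean N model calls that cannot be parallelized.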

### Q&A: Map Re-Rank

In [None]:
chain = load_qa_chain(llm, chain_type="map_rerank", verbose=True, return_intermediate_steps=True)

In [None]:
query = "Who was the author's friend who he got permission from to use the IBM 1401?"

result = chain({"input_documents": lg_docs[:5], "question": query}, return_only_outputs=True)

In [None]:
result['output_text']

' Rich Draves'

In [None]:
result['intermediate_steps']

[{'answer': ' This document does not answer the question', 'score': '0'},
 {'answer': ' Rich Draves', 'score': '100'},
 {'answer': ' This document does not answer the question.', 'score': '0'},
 {'answer': ' This document does not answer the question.', 'score': '0'},
 {'answer': ' This document does not answer the question', 'score': '0'}]
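
The final answer matches the intermediate step with the highest score. A small sketch of that selection using the values printed above; note that the scores arrive as strings, so they are converted to integers before comparing. The variable names are illustrative.

```python
# Intermediate steps copied from the output above.
intermediate_steps = [
    {'answer': ' This document does not answer the question', 'score': '0'},
    {'answer': ' Rich Draves', 'score': '100'},
    {'answer': ' This document does not answer the question.', 'score': '0'},
    {'answer': ' This document does not answer the question.', 'score': '0'},
    {'answer': ' This document does not answer the question', 'score': '0'},
]

# Compare scores numerically; as strings, '100' would sort before '20'.
best = max(intermediate_steps, key=lambda step: int(step['score']))
print(best['answer'].strip())
```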