# Project: Summarization App Using LangChain and OpenAI

In [1]:
from langchain.chat_models import ChatOpenAI
from langchain.schema import AIMessage, HumanMessage, SystemMessage

from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain, summarize
from langchain.docstore.document import Document

from dotenv import load_dotenv, find_dotenv

load_dotenv(find_dotenv(), override=True)

True

### Basic Prompt

In [4]:
text = """
Mojo combines the usability of Python with the performance of C, unlocking unparalleled programmability \
of AI hardware and extensibility of AI models.
Mojo is a new programming language that bridges the gap between research and production \
by combining the best of Python syntax with systems programming and metaprogramming.
With Mojo, you can write portable code that’s faster than C and seamlessly inter-op with the Python ecosystem.
When we started Modular, we had no intention of building a new programming language. \
But as we were building our platform with the intent to unify the world’s ML/AI infrastructure, \
we realized that programming across the entire stack was too complicated. Plus, we were writing a \
lot of MLIR by hand and not having a good time.
And although accelerators are important, one of the most prevalent and sometimes overlooked "accelerators" \
is the host CPU. Nowadays, CPUs have lots of tensor-core-like accelerator blocks and other AI acceleration \
units, but they also serve as the “fallback” for operations that specialized accelerators don’t handle, \
such as data loading, pre- and post-processing, and integrations with foreign systems. \
"""

messages = [
    SystemMessage(content='You are an expert copywriter with expertise in summarizing documents'),
    HumanMessage(content=f'Please provide a short and concise summary of the following text:\n TEXT: {text}')
]

llm = ChatOpenAI(temperature=0, model_name='gpt-3.5-turbo')

llm.get_num_tokens(text)

231
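
The token count matters because the prompt and the generated summary share the model's context window. Below is a minimal sketch of that budget check. It assumes a 4,096-token window for `gpt-3.5-turbo` and reserves 256 tokens for the reply; both numbers and the helper name `fits_in_context` are illustrative assumptions, not part of LangChain.

```python
# Hypothetical helper, not a LangChain API: checks whether a prompt leaves
# enough room in the context window for the model's reply.
ASSUMED_CONTEXT_WINDOW = 4096  # assumption: window size of gpt-3.5-turbo


def fits_in_context(prompt_tokens: int,
                    reserved_for_output: int = 256,
                    context_window: int = ASSUMED_CONTEXT_WINDOW) -> bool:
    """Return True if the prompt plus the reserved output tokens fit."""
    if prompt_tokens < 0 or reserved_for_output < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + reserved_for_output <= context_window


# 231 is the count reported by llm.get_num_tokens(text) above.
print(fits_in_context(231))
```

The system and human message wrappers add a few tokens on top of the raw text, so the real prompt is slightly larger than 231 tokens.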

In [6]:
summary_output = llm(messages)
print(summary_output.content)

Mojo is a new programming language that combines the usability of Python with the performance of C. It aims to bridge the gap between research and production in the field of AI by offering portable code that is faster than C and seamlessly integrates with the Python ecosystem. Mojo was developed to simplify programming across the entire ML/AI infrastructure and to address the importance of host CPUs in AI acceleration.


### Summarizing Using Prompt Templates

In [9]:
template = """
Write a concise and short summary of the following text:
TEXT: `{text}`
Translate the summary to {language}.
"""
prompt = PromptTemplate(
    input_variables=['text', 'language'],
    template=template
)

llm.get_num_tokens(prompt.format(text=text, language='English'))

251

In [10]:
chain = LLMChain(llm=llm, prompt=prompt)
summary = chain.run({'text': text, 'language': 'English'})
print(summary)

Mojo is a new programming language that combines the usability of Python with the performance of C. It aims to bridge the gap between research and production in the field of AI by offering portable code that is faster than C and can seamlessly interact with the Python ecosystem. The creators of Mojo realized the need for a simpler programming language while building their ML/AI infrastructure platform. They also emphasize the importance of CPUs as accelerators in addition to specialized AI hardware.
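
To see what the chain sends to the model, it can help to fill the template by hand. The sketch below uses plain `str.format` as a stand-in for `PromptTemplate.format`; the helper `fill_template` is hypothetical and the sample text is shortened for illustration.

```python
# Plain-Python stand-in for the template fill; no LangChain or API key needed.
template = """
Write a concise and short summary of the following text:
TEXT: `{text}`
Translate the summary to {language}.
"""


def fill_template(template: str, **variables: str) -> str:
    """Substitute named placeholders, failing loudly if one is missing."""
    try:
        return template.format(**variables)
    except KeyError as missing:
        raise ValueError(f"missing template variable: {missing}") from None


filled = fill_template(
    template,
    text="Mojo is a new programming language.",
    language="English",
)
print(filled)
```

Changing `language` is the only edit needed to get the summary in another language, which is the point of keeping it as a template variable.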


### Summarizing Using StuffDocumentsChain

In [24]:
with open('../2 QA on Private Documents/files/sj.txt', encoding='utf-8') as f:
    text = f.read()

docs = [Document(page_content=text)]
llm = ChatOpenAI(temperature=0, model_name='gpt-3.5-turbo')

In [25]:
template = """
Write a concise and short summary of the following text.
TEXT: `{text}`
"""

prompt = PromptTemplate(
    input_variables=['text'],
    template=template
)

chain = summarize.load_summarize_chain(
    llm,
    chain_type='stuff',
    prompt=prompt,
    verbose=False
)

output_summary = chain.run(docs)

Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised APIConnectionError: Error communicating with OpenAI: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None)).


In [26]:
print(output_summary)

The speaker, Steve Jobs, shares three stories from his life during a commencement speech. The first story is about dropping out of college and how it led him to take a calligraphy class, which later influenced the design of the Macintosh computer. The second story is about getting fired from Apple and how it allowed him to start over and eventually create successful companies like NeXT and Pixar. The third story is about facing death when he was diagnosed with cancer and how it made him realize the importance of living life to the fullest. He encourages the graduates to follow their hearts, stay hungry, and stay foolish.
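
The `stuff` chain type puts every document into a single prompt and makes one LLM call, so it only works when the combined text fits in the context window. The sketch below imitates that step in plain Python. The `Doc` class and `stuff_prompt` function are hypothetical stand-ins, and the `"\n\n"` separator is an assumption about how the documents are joined.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document (page_content only)."""
    page_content: str


def stuff_prompt(docs, template, separator="\n\n"):
    """Join all documents into one string and fill the single {text} slot."""
    combined = separator.join(d.page_content for d in docs)
    return template.format(text=combined)


template = (
    "Write a concise and short summary of the following text.\n"
    "TEXT: `{text}`\n"
)
docs = [Doc("First part."), Doc("Second part.")]
prompt_text = stuff_prompt(docs, template)
print(prompt_text)
```

Because everything goes into one prompt, the token count of `prompt_text` grows with the total length of the documents, which is why longer inputs need a different chain type.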
