# Building LLM-Powered Applications with LangChain

In [4]:
#!pip install langchain deeplake openai langchain_community

In [1]:
from dotenv import load_dotenv
load_dotenv()

from langchain.chat_models import ChatOpenAI
from langchain.prompts.chat import (
  ChatPromptTemplate,
  SystemMessagePromptTemplate,
  HumanMessagePromptTemplate
)

In [2]:
chat = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)



## Prompt Templates

In [3]:
template = "You are an assistant that helps users find information about movies."
system_message_prompt = SystemMessagePromptTemplate.from_template(template)
human_template = "Find information about the movie {movie_title}."
human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)

chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_message_prompt])
response = chat.invoke(chat_prompt.format_prompt(movie_title="Inception").to_messages())


In [4]:
print(response.content)

"Inception" is a 2010 science fiction action film written and directed by Christopher Nolan. The film stars Leonardo DiCaprio as a professional thief who steals information by entering the subconscious minds of his targets through their dreams. The ensemble cast also includes Joseph Gordon-Levitt, Ellen Page, Tom Hardy, Ken Watanabe, and Marion Cotillard.

The movie explores the concept of dream sharing and features stunning visual effects and mind-bending storytelling. "Inception" received critical acclaim for its originality, direction, performances, and visual effects. It was also a commercial success, grossing over $800 million worldwide.

If you haven't seen it yet, "Inception" is definitely worth a watch for its unique premise and engaging storytelling.
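
The chat prompt above is essentially a list of role-tagged templates whose placeholders are filled in at call time. The plain-Python sketch below illustrates that idea without LangChain. `render_chat_prompt` and the `(role, template)` tuples are hypothetical names chosen for illustration and are not LangChain's API.

```python
# Illustrative only: a plain-Python stand-in for what a chat prompt template does.
# `render_chat_prompt` is a hypothetical helper, not part of LangChain.

def render_chat_prompt(templates, **variables):
    """Fill each (role, template) pair with the given variables."""
    messages = []
    for role, template in templates:
        # str.format ignores unused keyword arguments, so the system
        # template (which has no placeholders) passes through unchanged.
        messages.append({"role": role, "content": template.format(**variables)})
    return messages


templates = [
    ("system", "You are an assistant that helps users find information about movies."),
    ("human", "Find information about the movie {movie_title}."),
]

messages = render_chat_prompt(templates, movie_title="Inception")
for message in messages:
    print(f"{message['role']}: {message['content']}")
```

Keeping the system and human templates separate means the same system instruction can be reused while only `movie_title` changes between calls.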


## Summarization Chain Example



In [5]:
from langchain.document_loaders import PyPDFLoader
document_loader = PyPDFLoader(file_path='building-llms-production-reliability.pdf')


In [6]:
docs = []
for doc in document_loader.lazy_load():
    docs.append(doc.page_content)

In [7]:
print("".join(docs[:10]))

What Experts Think About Building LLMs for
Production
"This is the most comprehensive textbook to date on building LLM
applications, and helps learners understand everything from
fundamentals to the simple-to-advanced building blocks of
constructing LLM applications. The application topics include
prompting, RAG, agents, ﬁne-tuning, and deployment - all essential
topics in an AI Engineer's toolkit."
— Jerry Liu, Co-founder and CEO of LlamaIndex
“An indispensable guide for anyone venturing into the world of large
language models. This book masterfully demystiﬁes complex
concepts, making them accessible and actionable […] It’s a must-
have in the library of every aspiring and seasoned AI professional.”
— Shashank Kalanithi, Data Engineer at Meta
“Building LLMs in Production" is for you. It contains thorough
explanations and code for you to start using and deploying LLMs, as
well as optimizing their performance. Very highly recommended!”
— Luis Serrano, PhD, Founder of Serrano.Academy & a

In [9]:
from langchain_huggingface import HuggingFaceEndpoint, ChatHuggingFace
from langchain.chains.summarize import load_summarize_chain

llm = HuggingFaceEndpoint(
    repo_id="Qwen/Qwen2.5-1.5B-Instruct",
    task="text-generation",  # "summary_text" is an output field name, not a task name
    max_new_tokens=512,
    do_sample=False,
    repetition_penalty=1.03,
)

chat = ChatHuggingFace(llm=llm, verbose=True)
summarize_chain = load_summarize_chain(chat)


In [14]:
# document = document_loader.load()
# summary = summarize_chain(document)
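
`load_summarize_chain` defaults to the "stuff" strategy, which places every document into a single prompt. For a long PDF, a map-reduce strategy (summarize each chunk, then summarize the partial summaries) is a common alternative. The sketch below shows the two control flows in plain Python. `fake_llm` is a hypothetical stand-in that simply truncates text so the example runs offline; it is not a real model call, and the helper names are assumptions made for illustration.

```python
# Sketch of two summarization strategies; no LangChain or model required.
# `fake_llm` is a hypothetical placeholder for a real chat model call.

calls = []  # records every prompt sent to the fake model


def fake_llm(prompt: str) -> str:
    """Pretend to summarize by keeping the first 40 characters of the text."""
    calls.append(prompt)
    text = prompt.split("\n", 1)[1]  # drop the instruction line
    return text[:40]


def stuff_summarize(pages):
    """'Stuff': put every page into one prompt and call the model once."""
    prompt = "Summarize the following text:\n" + "\n".join(pages)
    return fake_llm(prompt)


def map_reduce_summarize(pages):
    """'Map-reduce': summarize each page, then summarize the partial summaries."""
    partial = [fake_llm("Summarize the following text:\n" + page) for page in pages]
    return fake_llm("Combine these summaries:\n" + "\n".join(partial))


pages = ["Page one talks about prompting.", "Page two talks about RAG."]

calls.clear()
stuff_result = stuff_summarize(pages)
stuff_calls = len(calls)

calls.clear()
map_reduce_result = map_reduce_summarize(pages)
map_reduce_calls = len(calls)

print(f"stuff: {stuff_calls} call(s), map-reduce: {map_reduce_calls} call(s)")
```

The trade-off is visible in the call counts: "stuff" needs one model call but the whole document must fit in the context window, while map-reduce makes one call per page plus a final combining call.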

## Building a News Articles Summarizer

```bash
pip install -q newspaper3k python-dotenv "lxml[html_clean]"
```

In [15]:
import json 
from dotenv import load_dotenv
load_dotenv()

True

In [20]:
import requests
from newspaper import Article

headers = { 'User-Agent': '''Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.82 Safari/537.36'''}
article_url = """https://vnexpress.net/de-xuat-lap-don-vi-hanh-chinh-moi-noi-do-tai-ha-noi-tp-hcm-4857547.html"""

session = requests.Session()

try:
    # This request only checks that the URL is reachable
    response = session.get(article_url, headers=headers, timeout=10)
    if response.status_code == 200:
        print("Request was successful")
        # newspaper fetches and parses the page itself
        article = Article(article_url)
        article.download()
        article.parse()
        print("title = ", article.title)
        print("text = ", article.text)
    else:
        print(f"Request failed with status code {response.status_code}")

except requests.exceptions.RequestException as e:
    print(f"An error occurred: {e}")
  

Request was successful
title =  Đề xuất lập đơn vị hành chính mới: 'Nội đô' tại Hà Nội, TP HCM
text =  PGS Tô Văn Hòa, Phó hiệu trưởng Trường Đại học Luật Hà Nội, đề xuất thành lập đơn vị hành chính mới mang tên "Nội đô" tại các thành phố trực thuộc Trung ương như Hà Nội, TP HCM và Hải Phòng.

Đề xuất này được đưa ra tại Hội thảo khoa học cấp quốc gia về đổi mới công tác xây dựng và thi hành pháp luật đáp ứng yêu cầu phát triển đất nước trong kỷ nguyên mới do Bộ Tư pháp và Học viện Chính trị quốc gia Hồ Chí Minh phối hợp tổ chức, sáng 6/3.

Theo PGS Hòa, việc nghiên cứu sửa đổi, bổ sung Hiến pháp là cấp thiết, tạo cơ sở hiến định vững chắc để tiến hành "cách mạng về tinh gọn bộ máy" và bảo đảm tính tối cao của Hiến pháp. Qua rà soát, ông thấy hướng nghiên cứu sửa đổi Hiến pháp năm 2013 cần tập trung vào các quy định về chính quyền địa phương nhằm hiến định việc không tổ chức chính quyền trung gian (cấp huyện).

Ông Hòa cho rằng, việc không tổ chức cấp huyện không chỉ đơn thuần là bãi b

In [22]:
article.title

"Đề xuất lập đơn vị hành chính mới: 'Nội đô' tại Hà Nội, TP HCM"

In [29]:
from langchain.schema import HumanMessage

article_text = article.text
article_title = article.title

# prepare template for prompt
template ="""You are a very good assistant that summarizes online articles.
Here's the article you want to summarize.
==================
Title: {article_title}

{article_text}

==================

Write a summary of the previous article.
"""

prompt = template.format(article_title=article_title, article_text=article_text)

message = [HumanMessage(content=prompt)]

In [30]:
from langchain.chat_models import ChatOpenAI

chat = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
summary = chat.invoke(message)

In [32]:
print(summary.content)

The article discusses a proposal to establish a new administrative unit called "Nội đô" in major cities like Hanoi, Ho Chi Minh City, and Hai Phong. The proposal was put forward by PGS Tô Văn Hòa at a national scientific conference on legal reform to meet the country's development needs in the new era. The suggestion aims to restructure the local government system in Vietnam by focusing on the elimination of intermediary government levels (district level) and restructuring administrative units into two levels: provincial level and grassroots level. The "Nội đô" unit would consist of inner city districts in major cities. Additionally, the article discusses the need to amend the Constitution to streamline the political system and ensure the highest level of constitutional integrity.
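
Long articles can exceed a model's context window, so it can help to cap the article text before filling the template. The sketch below wraps the template from the cell above in a helper. `build_summary_prompt`, `SUMMARY_TEMPLATE`, and `max_chars` are hypothetical names, and the character limit is an arbitrary illustrative value (character count is only a rough proxy for tokens).

```python
# Hypothetical helper: build the summarization prompt and cap the article length.
# The 8000-character default is an arbitrary illustrative value, not a model limit.

SUMMARY_TEMPLATE = """You are a very good assistant that summarizes online articles.
Here's the article you want to summarize.
==================
Title: {article_title}

{article_text}

==================

Write a summary of the previous article.
"""


def build_summary_prompt(article_title, article_text, max_chars=8000):
    """Return the filled-in prompt, truncating overly long article text."""
    if len(article_text) > max_chars:
        article_text = article_text[:max_chars] + " [truncated]"
    return SUMMARY_TEMPLATE.format(
        article_title=article_title, article_text=article_text
    )


long_prompt = build_summary_prompt("Example title", "word " * 5000, max_chars=100)
short_prompt = build_summary_prompt("Example title", "A short article.")
print(long_prompt)
```

The returned string can be passed to `HumanMessage(content=...)` exactly as in the cell above.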


## LlamaIndex Introduction

```bash
pip install -q llama-index llama-index-vector-stores-chroma openai cohere tiktoken chromadb llama-index-readers-wikipedia wikipedia
```

In [33]:
from dotenv import load_dotenv
load_dotenv()
import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

In [35]:
from llama_index.readers.wikipedia import WikipediaReader
loader = WikipediaReader()

documents = loader.load_data(pages=['Natural Language Processing', 'Artificial Intelligence'])
print(len(documents))

DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): en.wikipedia.org:80
Starting new HTTP connection (1): en.wikipedia.org:80
DEBUG:urllib3.connectionpool:http://en.wikipedia.org:80 "GET /w/api.php?list=search&srprop=&srlimit=1&limit=1&srsearch=Natural+Language+Processing&srinfo=suggestion&format=json&action=query HTTP/1.1" 301 0
http://en.wikipedia.org:80 "GET /w/api.php?list=search&srprop=&srlimit=1&limit=1&srsearch=Natural+Language+Processing&srinfo=suggestion&format=json&action=query HTTP/1.1" 301 0
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): en.wikipedia.org:443
Starting new HTTPS connection (1): en.wikipedia.org:443
DEBUG:urllib3.connectionpool:https://en.wikipedia.org:443 "GET /w/api.php?list=search&srprop=&srlimit=1&limit=1&srsearch=Natural+Language+Processing&srinfo=suggestion&format=json&action=query HTTP/1.1" 200 174
https://en.wikipedia.org:443 "GET /w/api.php?list=search&srprop=&srlimit=1&limit=1&srsearch=Natural+Language+Processing&srinfo=su
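
Each DEBUG record in the output above appears twice, once with a `DEBUG:name:` prefix and once bare. That is consistent with the logging setup in cell `In [33]`, which attaches two stdout handlers to the root logger: one through `logging.basicConfig` and one through `addHandler`. The sketch below reproduces the effect on a private logger writing to an in-memory stream; the logger name `duplicate_demo` is a hypothetical choice made so the demo does not touch the root logger.

```python
import io
import logging

# Use a private logger and an in-memory stream so the demo does not
# disturb the notebook's root logger configuration.
stream = io.StringIO()
logger = logging.getLogger("duplicate_demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False
logger.handlers.clear()

# Handler 1 mimics what logging.basicConfig installs: the default
# "LEVEL:name:message" format.
formatted_handler = logging.StreamHandler(stream)
formatted_handler.setFormatter(logging.Formatter(logging.BASIC_FORMAT))
logger.addHandler(formatted_handler)

# Handler 2 mimics the extra addHandler(StreamHandler(...)) call: it has
# no formatter, so only the bare message is written.
logger.addHandler(logging.StreamHandler(stream))

logger.debug("Starting new HTTP connection")
lines = stream.getvalue().splitlines()
print(lines)
```

Dropping either the `basicConfig` call or the extra `addHandler` call in the notebook should leave a single line per record.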

In [36]:
documents

[Document(id_='21652', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, metadata_template='{key}: {value}', metadata_separator='\n', text_resource=MediaResource(embeddings=None, data=None, text='Natural language processing (NLP) is a subfield of computer science and especially artificial intelligence. It is primarily concerned with providing computers with the ability to process data encoded in natural language and is thus closely related to information retrieval, knowledge representation and computational linguistics, a subfield of linguistics.\nMajor tasks in natural language processing are speech recognition, text classification, natural-language understanding, and natural-language generation.\n\n\n== History ==\n\nNatural language processing has its roots in the 1950s. Already in 1950, Alan Turing published an article titled "Computing Machinery and Intelligence" which proposed what is now called the Turing test as a cri

In [38]:
documents[0].get_content()

'Natural language processing (NLP) is a subfield of computer science and especially artificial intelligence. It is primarily concerned with providing computers with the ability to process data encoded in natural language and is thus closely related to information retrieval, knowledge representation and computational linguistics, a subfield of linguistics.\nMajor tasks in natural language processing are speech recognition, text classification, natural-language understanding, and natural-language generation.\n\n\n== History ==\n\nNatural language processing has its roots in the 1950s. Already in 1950, Alan Turing published an article titled "Computing Machinery and Intelligence" which proposed what is now called the Turing test as a criterion of intelligence, though at the time that was not articulated as a problem separate from artificial intelligence. The proposed test includes a task that involves the automated interpretation and generation of natural language.\n\n\n=== Symbolic NLP (

In [39]:
from llama_index.core.node_parser import (
    SentenceSplitter,
    SemanticSplitterNodeParser,
)
from llama_index.embeddings.openai import OpenAIEmbedding

import os

In [40]:
embed_model = OpenAIEmbedding()
splitter = SemanticSplitterNodeParser(
    buffer_size=1, breakpoint_percentile_threshold=95, embed_model=embed_model
)

# baseline fixed-size splitter, kept for comparison with the semantic splitter
base_splitter = SentenceSplitter(chunk_size=512)

nodes = splitter.get_nodes_from_documents(documents)

DEBUG:openai._base_client:Request options: {'method': 'post', 'url': '/embeddings', 'files': None, 'post_parser': <function Embeddings.create.<locals>.parser at 0x7083cd6980d0>, 'json_data': {'input': ['Natural language processing (NLP) is a subfield of computer science and especially artificial intelligence. It is primarily concerned with providing computers with the ability to process data encoded in natural language and is thus closely related to information retrieval, knowledge representation and computational linguistics, a subfield of linguistics. ', 'Natural language processing (NLP) is a subfield of computer science and especially artificial intelligence. It is primarily concerned with providing computers with the ability to process data encoded in natural language and is thus closely related to information retrieval, knowledge representation and computational linguistics, a subfield of linguistics. Major tasks in natural language processing are speech recognition, text classif

In [41]:
len(nodes)

42
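
The `breakpoint_percentile_threshold=95` argument controls where the semantic splitter cuts. Roughly, it measures how dissimilar adjacent sentence groups are and splits where that dissimilarity exceeds the given percentile of all observed distances. The sketch below illustrates the idea with toy 2-D vectors in place of real embeddings. The helper names, sentences, and vectors are hypothetical, and this is a simplification rather than LlamaIndex's implementation.

```python
import math

# Toy illustration of percentile-based semantic splitting.
# The vectors below are made-up stand-ins for sentence embeddings.


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def percentile(values, q):
    """Percentile with linear interpolation between ranks (q in 0..100)."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100
    low, high = math.floor(rank), math.ceil(rank)
    return ordered[low] + (ordered[high] - ordered[low]) * (rank - low)


def find_breakpoints(embeddings, threshold_percentile):
    """Return indices where a new chunk should start."""
    distances = [
        cosine_distance(embeddings[i], embeddings[i + 1])
        for i in range(len(embeddings) - 1)
    ]
    cutoff = percentile(distances, threshold_percentile)
    return [i + 1 for i, d in enumerate(distances) if d > cutoff]


sentences = ["NLP intro.", "NLP tasks.", "NLP history.", "AI overview.", "AI goals."]
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.95, 0.05], [0.0, 1.0], [0.1, 0.9]]

breakpoints = find_breakpoints(embeddings, 95)

# Cut the sentence list at each breakpoint to form chunks.
chunks = []
start = 0
for index in breakpoints + [len(sentences)]:
    chunks.append(sentences[start:index])
    start = index
print(chunks)
```

A lower percentile produces more breakpoints and therefore more, smaller chunks; a higher one keeps only the sharpest topic shifts.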

In [44]:
nodes[1].get_content()

'The authors claimed that within three or five years, machine translation would be a solved problem.  However, real progress was much slower, and after the ALPAC report in 1966, which found that ten years of research had failed to fulfill the expectations, funding for machine translation was dramatically reduced. Little further research in machine translation was conducted in America (though some research continued elsewhere, such as Japan and Europe) until the late 1980s when the first statistical machine translation systems were developed.\n1960s: Some notably successful natural language processing systems developed in the 1960s were SHRDLU, a natural language system working in restricted "blocks worlds" with restricted vocabularies, and ELIZA, a simulation of a Rogerian psychotherapist, written by Joseph Weizenbaum between 1964 and 1966. Using almost no information about human thought or emotion, ELIZA sometimes provided a startlingly human-like interaction. When the "patient" excee