In [1]:
from dotenv import load_dotenv
load_dotenv() # get API key stored in local file, not committed

True

In [52]:
from langchain_openai import OpenAI
llm = OpenAI(model_name="gpt-3.5-turbo-instruct")
llm.invoke("You are creative. What does a github repo called oh-yay do?")

'\n\nOh-yay is a repository that contains a collection of uplifting and positive resources for personal and professional development. This may include inspirational quotes, self-care tips, career advice, and motivational articles. The goal of this repo is to spread joy and encourage personal growth among its users. It may also serve as a platform for individuals to share their own experiences and strategies for finding happiness and success.'

In [61]:
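from langchain.document_loaders import WebBaseLoader  # fetches a page and parses it with BeautifulSoup (bs4)
from langchain.prompts import PromptTemplate
from langchain.chains import (
    StuffDocumentsChain,
    LLMChain,
)
from langchain_text_splitters import RecursiveCharacterTextSplitter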


prompt_template = """Write a concise summary of the following content.
Provide the output in bullet points:

{content}

Summary:
"""  # {content} is filled in via f-string formatting, the PromptTemplate default

prompt = PromptTemplate.from_template(prompt_template)

map_chain = LLMChain(prompt=prompt, llm=llm)

stuff_chain = StuffDocumentsChain(
    llm_chain=map_chain, document_variable_name="content"
)

url = "https://python.langchain.com/docs/get_started/introduction"
loader = WebBaseLoader(url)
data = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
docs = text_splitter.split_documents(data)

# This runs without exceeding the model's context window because the concatenated chunks fit within the token limit
summary = stuff_chain.run(docs)
print(summary)

- LangChain is a framework for developing applications powered by large language models (LLMs).
- It simplifies every stage of the LLM application lifecycle, including development, productionization, and deployment.
- The framework consists of open-source libraries such as langchain-core, langchain-community, and partner packages for third-party integrations.
- The broader ecosystem includes LangSmith for debugging and monitoring, LangGraph for building multi-actor applications, and LangServe for deploying chains as APIs.
- The LangChain Expression Language (LCEL) is used to compose chains and is designed for production use.
- The framework provides standard and extendable interfaces for components and integrates with other tools.
- Security best practices are also provided.
- Guides and tutorials are available for use cases such as question answering and chatbots.
- The framework is available in Python and JavaScript versions.


In [62]:
len(docs)

11
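
The stuff approach works above because all 11 chunks fit into a single prompt. A rough way to check this in advance is sketched below. It is only an estimate: the 4-characters-per-token ratio is a common rule of thumb rather than the model's real tokenizer, and the 4096-token window and 256-token output reserve are assumed values, so adjust them for the model in use. The helper names `estimate_tokens` and `fits_in_context` are hypothetical and not part of LangChain.

```python
import math
from typing import List


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Approximate the token count from the character count (heuristic, not a tokenizer)."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(
    chunks: List[str],
    context_window: int = 4096,      # assumed context size, check your model
    reserved_for_output: int = 256,  # assumed room left for the completion
) -> bool:
    """Return True if all chunks concatenated would likely fit in one prompt."""
    total = sum(estimate_tokens(chunk) for chunk in chunks)
    return total <= context_window - reserved_for_output
```

With the splitter settings used here (`chunk_size=500`), calling `fits_in_context([d.page_content for d in docs])` gives a quick signal on whether stuffing is worth trying before falling back to a map-reduce approach.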

In [67]:
url = "https://en.wikipedia.org/wiki/Transformer_(deep_learning_architecture)" # larger document
loader = WebBaseLoader(url)
data = loader.load()

# Split
from langchain_text_splitters import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
docs = text_splitter.split_documents(data)

map_template = """Write a concise summary of the following content:

{content}

Summary:
"""
map_prompt = PromptTemplate.from_template(map_template)
map_chain = LLMChain(prompt=map_prompt, llm=llm)

reduce_template = """The following is a set of summaries:

{doc_summaries}

Summarize the above summaries with all the key details
Summary:"""
reduce_prompt = PromptTemplate.from_template(reduce_template)
reduce_chain = LLMChain(prompt=reduce_prompt, llm=llm)
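from langchain.chains import MapReduceDocumentsChain, ReduceDocumentsChain

# Stuffs a batch of summaries into a single reduce prompt
stuff_chain = StuffDocumentsChain(
    llm_chain=reduce_chain, document_variable_name="doc_summaries"
)

# Reduce step: merges the per-chunk summaries into one final summary
reduce_documents_chain = ReduceDocumentsChain(
    combine_documents_chain=stuff_chain,
)

# Map step summarizes each chunk, then the reduce step combines the results
map_reduce_chain = MapReduceDocumentsChain(
    llm_chain=map_chain,
    reduce_documents_chain=reduce_documents_chain,
    document_variable_name="content",
)
summary = map_reduce_chain.run(docs)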
print(summary)

- Transformer is a powerful deep learning architecture for natural language processing, proposed in 2017.
- It utilizes self-attention mechanism and has been used in tasks such as machine translation and text summarization.
- Transformer has outperformed previous models in benchmarks and has been adapted for other domains.
- Different approaches and techniques have been explored to improve its performance.
- The article covers topics related to machine learning, such as structured prediction and artificial neural networks.
- It discusses the timeline, training, applications, implementations, and architecture of Transformer.
- Alternative activation functions, positional encodings, and efficient implementation techniques have been explored.
- Different types of attention mechanisms have been used, such as Multi-Query Attention and FlashAttention.
- The article also covers the use of Transformer in other domains, such as speech recognition and image recognition.
- It discusses the use of

In [68]:
len(docs)

176
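
With 176 chunks the document no longer fits in one prompt, which is why the map-reduce chain is used instead of stuffing. The control flow can be sketched in plain Python as follows. This is a simplified illustration, not LangChain's implementation: `map_reduce_summarize` and its `batch_size` parameter are hypothetical, and `summarize` stands in for any call that turns text into a shorter text (for example a wrapper around `llm.invoke`). LangChain's reduce step groups summaries by a token budget, whereas this sketch uses a fixed batch size for clarity.

```python
from typing import Callable, List


def map_reduce_summarize(
    chunks: List[str],
    summarize: Callable[[str], str],
    batch_size: int = 10,  # hypothetical: number of summaries merged per reduce call
) -> str:
    """Summarize each chunk, then repeatedly merge summaries until one remains."""
    # Map: summarize every chunk independently
    summaries = [summarize(chunk) for chunk in chunks]

    # Reduce: merge summaries batch by batch until a single summary is left
    while len(summaries) > 1:
        summaries = [
            summarize("\n".join(summaries[i : i + batch_size]))
            for i in range(0, len(summaries), batch_size)
        ]

    return summaries[0] if summaries else ""
```

Because every map call is independent, that stage can be parallelized, at the cost of each chunk being summarized without knowledge of its neighbors.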