<a href="https://colab.research.google.com/github/athapa785/LLM_4_Biz_Stanford/blob/main/aditya_thapa_llm4biz_homework_2_summarize_w_langchain_ipynb.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# [Aditya Thapa](https://www.linkedin.com/in/aditya-thapa-56442b11b/)
SLAC National Accelerator Laboratory | Stanford University

# Homework 2
## LLM for Biz with Python
Stanford Continuing Education
TECH 16

This project uses Large Language Models (LLMs), specifically OpenAI's GPT models, in the Google Colab environment. It relies on the LangChain framework to apply those models to three Natural Language Processing (NLP) tasks: summarization, translation, and creative text generation.

Key explorations include:

- **Document summarization:** Summarizing the research paper "Attention Is All You Need" with LangChain's "stuff" and "map_reduce" chain types, comparing the two outputs, and using a tailored prompt to control the summary format.
- **Language translation:** Creating a translation tool that converts English text into multiple languages, giving both the native script and a roman transliteration, using LangChain's chat prompt templates.
- **Creative text generation:** Developing a pet name generator with LangChain to create unique names based on pet characteristics.

In [563]:
# Initialize key and client

from openai import OpenAI
from google.colab import userdata

open_ai_key = userdata.get('open_ai_key')

client = OpenAI(api_key=open_ai_key)

In [564]:
# Grab "Attention is all you need"
!wget https://arxiv.org/pdf/1706.03762

--2025-02-08 06:17:31--  https://arxiv.org/pdf/1706.03762
Resolving arxiv.org (arxiv.org)... 151.101.67.42, 151.101.131.42, 151.101.3.42, ...
Connecting to arxiv.org (arxiv.org)|151.101.67.42|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2215244 (2.1M) [application/pdf]
Saving to: ‘1706.03762.3’


2025-02-08 06:17:32 (172 MB/s) - ‘1706.03762.3’ saved [2215244/2215244]



In [565]:
%%capture
!pip install -U langchain-community pypdf langchain-openai

In [566]:
from langchain_community.document_loaders import PyPDFLoader

In [567]:
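# wget adds a numeric suffix (.1, .2, ...) when the file already exists,
# so the un-suffixed name is the copy from the first download.
loader = PyPDFLoader("1706.03762")
pages = loader.load_and_split()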

In [568]:
from langchain.chains.summarize import load_summarize_chain
from langchain_community.document_loaders import WebBaseLoader
from langchain_openai import ChatOpenAI

In [569]:
from IPython.display import Markdown

In [570]:
llm = ChatOpenAI(temperature=0.1, model_name="gpt-4-turbo-preview", api_key=open_ai_key)

## Chain type: "stuff"

In [572]:
%%time
chain_stuff = load_summarize_chain(llm, chain_type="stuff")
resp_stuff = chain_stuff.invoke(pages)

CPU times: user 54.8 ms, sys: 3.32 ms, total: 58.1 ms
Wall time: 5.86 s


In [573]:
Markdown(resp_stuff["output_text"])

The paper "Attention Is All You Need" by Ashish Vaswani and colleagues at Google introduces the Transformer, a novel neural network architecture that relies entirely on an attention mechanism, eliminating the need for recurrent or convolutional layers in sequence transduction models. This architecture allows for significantly more parallelization, reducing training time and improving performance on machine translation tasks. The Transformer achieves state-of-the-art results on the WMT 2014 English-to-German and English-to-French translation tasks, outperforming existing models and ensembles with less training time. The model's design includes multi-head self-attention, positional encodings, and a novel scaling method for the dot-product attention mechanism. The paper demonstrates the Transformer's effectiveness not only in translation but also in English constituency parsing, suggesting its potential applicability to a wide range of sequence modeling problems. The research contributes to the understanding of attention mechanisms and their central role in neural network architectures for natural language processing.

## Chain type: "map_reduce"

In [574]:
%%time
chain_map_reduce = load_summarize_chain(llm, chain_type="map_reduce")
resp_map_reduce = chain_map_reduce.invoke(pages)

CPU times: user 680 ms, sys: 75.5 ms, total: 755 ms
Wall time: 1min 27s


Note that map_reduce takes much longer than stuff: about 1 min 27 s of wall time versus 5.86 s.
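The timing gap is easier to see with a toy model of the two strategies. The sketch below is not LangChain code: `CountingLLM`, `summarize_stuff`, and `summarize_map_reduce` are hypothetical names, and the 15 chunks are made up. It assumes that "stuff" sends all chunks in one prompt, while "map_reduce" makes one call per chunk plus one call to combine the partial summaries (the real chain may add extra collapse steps when the partial summaries are too long).

```python
from typing import Callable, List


def summarize_stuff(chunks: List[str], llm: Callable[[str], str]) -> str:
    """'stuff': concatenate every chunk into one prompt and call the model once."""
    return llm("\n\n".join(chunks))


def summarize_map_reduce(chunks: List[str], llm: Callable[[str], str]) -> str:
    """'map_reduce': summarize each chunk, then summarize the summaries."""
    partial = [llm(chunk) for chunk in chunks]  # map step: one call per chunk
    return llm("\n\n".join(partial))            # reduce step: one more call


class CountingLLM:
    """Hypothetical stand-in for a chat model; it only counts calls."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, prompt: str) -> str:
        self.calls += 1
        return prompt[:20]  # pretend the first 20 characters are the summary


chunks = [f"page {i} text" for i in range(1, 16)]  # 15 made-up chunks

stuff_llm = CountingLLM()
summarize_stuff(chunks, stuff_llm)

map_reduce_llm = CountingLLM()
summarize_map_reduce(chunks, map_reduce_llm)

print(stuff_llm.calls, map_reduce_llm.calls)
```

With N chunks, this toy version makes 1 call for "stuff" and N + 1 calls for "map_reduce", which is consistent with the longer wall time measured above.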

In [575]:
Markdown(resp_map_reduce["output_text"])

The paper "Attention Is All You Need" by Ashish Vaswani et al. introduces the Transformer, a novel neural network architecture that exclusively uses attention mechanisms, eliminating recurrent and convolutional layers. This model achieves state-of-the-art performance in machine translation, notably on the WMT 2014 English-to-German and English-to-French tasks, and shows potential for generalization to other tasks like English constituency parsing. The Transformer's efficiency and performance are attributed to its unique structure, which allows for greater parallelization and includes innovations such as multi-head attention and positional encoding. This work, presented at NIPS 2017, marks a significant advancement in sequence modeling and has implications for future research in deep learning and natural language processing (NLP). Additionally, the document discusses broader advancements in NLP and deep learning from 2013 to 2017, covering developments in RNNs, CNNs, machine translation, and neural network optimization, reflecting the field's progression towards more sophisticated models for understanding human language.

## Let's try prompting the model to format the summary text per our preferences.

In [576]:
from langchain.chains.combine_documents.stuff import StuffDocumentsChain
from langchain.chains.llm import LLMChain
from langchain.prompts import PromptTemplate

In [592]:
# Define prompt template
prompt_template = """Write a concise summary in a maximum of 3 bullets of the following text enclosed within three backticks:
```{text}```
CONCISE SUMMARY:"""
prompt = PromptTemplate.from_template(prompt_template)

In [593]:
# Define LLM chain
llm_chain = LLMChain(llm=llm, prompt=prompt)

In [594]:
%%time
# Define StuffDocumentsChain
stuff_chain = StuffDocumentsChain(llm_chain=llm_chain, document_variable_name="text")

resp = stuff_chain.invoke(pages)

CPU times: user 70.2 ms, sys: 3.05 ms, total: 73.2 ms
Wall time: 6.52 s


In [595]:
Markdown(resp["output_text"])

- Google introduces the Transformer, a novel neural network architecture based solely on attention mechanisms, eliminating the need for recurrence and convolutions. This architecture achieves state-of-the-art results on the WMT 2014 English-to-German and English-to-French translation tasks, demonstrating superior quality, parallelizability, and training efficiency.
- The Transformer model utilizes self-attention layers to compute representations of its input and output without relying on sequence-aligned RNNs or convolution, facilitating significantly more parallelization. It also incorporates multi-head attention to allow the model to focus on different positions of the input sequence simultaneously.
- Experiments show that the Transformer model not only outperforms existing models in translation tasks but also generalizes well to other tasks like English constituency parsing, achieving competitive results with much less training time and computational resources.
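To make the role of `document_variable_name="text"` concrete, here is a rough, LangChain-free sketch of what a "stuff" chain does before calling the model. The helper `build_stuff_prompt` and the two page strings are hypothetical, and the blank-line separator is an assumption about the chain's default behavior.

```python
# Made-up strings standing in for the page_content of each loaded page.
page_texts = ["First page text.", "Second page text."]

TICKS = "`" * 3  # three backticks, built this way to keep the example easy to paste

prompt_template = (
    "Write a concise summary in a maximum of 3 bullets of the following text "
    "enclosed within three backticks:\n"
    + TICKS + "{text}" + TICKS
    + "\nCONCISE SUMMARY:"
)


def build_stuff_prompt(texts, template, separator="\n\n"):
    """Join all texts and substitute them for the {text} placeholder."""
    return template.format(text=separator.join(texts))


final_prompt = build_stuff_prompt(page_texts, prompt_template)
print(final_prompt)
```

The whole document ends up in a single prompt, so this approach only works while the combined text fits in the model's context window.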

## What if we prompt the LLM to compare "stuff" and "map_reduce" outputs?

In [601]:
compare_prompt_template = """You are an analytical thinker.
                            I will provide you two texts and you will compare the two texts.
                            The first text is the output of the stuff chain and the second text is the output of the map_reduce chain.
                            I want you to note down any differences in style between the two texts.
                            Also, compare the amount of information in the two texts.
                            Here are the texts:
                            ```{text1}```
                            ```{text2}```
                            You should refer to the texts as "Stuff" and "Map Reduce".
                            Give a verdict of which chain is better.
                          """
compare_prompt = PromptTemplate.from_template(compare_prompt_template)

In [602]:
llm_chain_compare = LLMChain(llm=llm, prompt=compare_prompt)

In [603]:
text1 = resp_stuff["output_text"]
text2 = resp_map_reduce["output_text"]

In [604]:
%%time
response_compare = llm_chain_compare.invoke({"text1": text1, "text2": text2})

CPU times: user 131 ms, sys: 22.4 ms, total: 154 ms
Wall time: 20 s


In [605]:
Markdown(response_compare["text"])

Upon comparing the "Stuff" and "Map Reduce" texts, several differences in style and the amount of information presented become apparent.

### Style Differences

1. **Conciseness and Clarity**:
   - The "Stuff" text is more detailed and explanatory, providing a comprehensive overview of the Transformer model, its components, and its achievements. It uses technical language appropriate for an audience familiar with neural network architectures.
   - The "Map Reduce" text, while still informative, is more concise and slightly broader in its discussion. It mentions the Transformer's achievements and unique features but spends less time on each aspect. It also briefly touches on broader advancements in NLP and deep learning, providing context to the Transformer's development.

2. **Technical Detail**:
   - The "Stuff" text delves deeper into the specific components of the Transformer model, such as "multi-head self-attention," "positional encodings," and a "novel scaling method." This detail is valuable for readers seeking a deeper understanding of the model's inner workings.
   - The "Map Reduce" text mentions "multi-head attention" and "positional encoding" but does not elaborate on these components to the same extent. It also broadens the discussion to include the historical context of NLP and deep learning advancements.

3. **Author Attribution**:
   - The "Stuff" text attributes the paper to "Ashish Vaswani and colleagues at Google," providing a clear picture of the authors' affiliations.
   - The "Map Reduce" text uses "Ashish Vaswani et al.," a more traditional academic citation style, which is concise but less informative about the authors' affiliations.

### Amount of Information

- **Depth of Coverage**:
  - The "Stuff" text is more focused on the Transformer model itself, providing a detailed account of its features, performance, and potential applications. It is rich in specific details related to the model's architecture and achievements.
  - The "Map Reduce" text, while covering the Transformer, also allocates space to discuss the broader context of NLP and deep learning advancements. This broader scope introduces additional information but at the expense of some depth regarding the Transformer model.

- **Broader Context**:
  - The "Map Reduce" text provides a broader historical context, mentioning the progression of NLP and deep learning from 2013 to 2017. This context is valuable for readers interested in the evolution of the field but may dilute the focus on the Transformer model itself.
  - The "Stuff" text remains tightly focused on the Transformer, making it more suitable for readers primarily interested in the model and its direct implications.

### Verdict

The choice between the "Stuff" and "Map Reduce" chains depends on the reader's needs:

- If the goal is to gain a detailed understanding of the Transformer model, its architecture, and its specific contributions to NLP, the "Stuff" text is superior due to its depth and specificity.
- If the goal is to understand the Transformer within the broader context of NLP and deep learning advancements, the "Map Reduce" text is better as it provides a concise overview of the model while also discussing related developments in the field.

Neither chain is inherently better; their value depends on the reader's objectives.
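The LLM's judgment about the "amount of information" can be cross-checked with a crude length measure. This is only a sketch: `length_stats` is a hypothetical helper, the sample strings are made up, and word or sentence counts are a rough proxy for information content at best.

```python
import re


def length_stats(text: str) -> dict:
    """Return crude size measures: word count and sentence count."""
    words = text.split()
    # Split after ., ! or ? followed by whitespace; a rough heuristic only.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return {"words": len(words), "sentences": len(sentences)}


sample_a = "The Transformer uses attention. It trains faster."
sample_b = "The Transformer uses attention only."

print(length_stats(sample_a))
print(length_stats(sample_b))
```

In the notebook, the same function could be applied to `text1` and `text2` to compare the two summaries.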

# Extra credit

Let's explore some use cases of LLMs beyond summarization and comparison of texts.

We'll use the LangChain framework to prompt OpenAI's GPT models to do some interesting tasks for us.

## Translation tool

Inspiration: https://python.langchain.com/docs/tutorials/llm_chain/

In [608]:
from langchain_core.prompts import ChatPromptTemplate

In [609]:
llm_4o_conservative = ChatOpenAI(temperature=0, model_name="gpt-4o", api_key=open_ai_key)

In [610]:
system_template = """Translate the following from English to {language}.
                  Do not be verbose. Try to do an almost word for word translation while preserving the meaning.
                  If the script is not roman, it is absolutely essential for you to provide the roman transliteration in parentheses after the translation in the native script.
                  Output the name of the language in boldface.
                  """

prompt_template = ChatPromptTemplate.from_messages(
    [("system", system_template), ("user", "{text}")]
)

In [611]:
text = "What are you doing?"
language = "Nepali, Hindi, Mandarin, Korean, Japanese, Spanish, Italian, Greek, Norwegian, Swahili"

In [612]:
prompt = prompt_template.invoke({"language": language, "text": text})

In [613]:
%%time
response = llm_4o_conservative.invoke(prompt)

CPU times: user 34.9 ms, sys: 2.31 ms, total: 37.2 ms
Wall time: 2.93 s


In [614]:
Markdown(response.content)

**Nepali:** के गर्दैछौ? (ke gardai chhau?)

**Hindi:** तुम क्या कर रहे हो? (tum kya kar rahe ho?)

**Mandarin:** 你在做什么？ (nǐ zài zuò shénme?)

**Korean:** 뭐 하고 있어요? (mwo hago isseoyo?)

**Japanese:** 何をしていますか？ (nani o shiteimasu ka?)

**Spanish:** ¿Qué estás haciendo?

**Italian:** Cosa stai facendo?

**Greek:** Τι κάνεις; (Ti káneis?)

**Norwegian:** Hva gjør du?

**Swahili:** Unafanya nini?
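If the translations need to be reused in code rather than just displayed, the Markdown output can be parsed into a dictionary. The sketch below assumes every line follows the `**Language:** translation (transliteration)` shape seen above; `parse_translations` and `LINE_PATTERN` are hypothetical names, and a model is not guaranteed to keep this format on every run.

```python
import re

LINE_PATTERN = re.compile(
    r"^\*\*(?P<language>[^:*]+):\*\*\s*"   # **Language:**
    r"(?P<translation>.*?)"                 # translated text
    r"(?:\s*\((?P<roman>[^()]*)\))?\s*$"    # optional (transliteration)
)


def parse_translations(markdown_text: str) -> dict:
    """Map each language to a (translation, transliteration or None) pair."""
    parsed = {}
    for line in markdown_text.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match:
            parsed[match["language"]] = (match["translation"], match["roman"])
    return parsed


sample = """**Hindi:** तुम क्या कर रहे हो? (tum kya kar rahe ho?)

**Spanish:** ¿Qué estás haciendo?"""

result = parse_translations(sample)
print(result["Hindi"])
print(result["Spanish"])
```

In the notebook, `parse_translations(response.content)` would be the corresponding call.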

## Pet name generator?

Inspiration: https://www.youtube.com/watch?v=lG7Uxts9SXs

In [615]:
creative_llm = ChatOpenAI(temperature=1, model="gpt-4-turbo-preview", openai_api_key=open_ai_key)

In [617]:
sys_temp_animal = """You are someone who comes up with creative names.
          I will give you the animal type, personality, and color of my pet.
          I want you to suggest five cool names for my pet.
          Use one word names only.
          """

prompt_temp_animal = ChatPromptTemplate.from_messages(
    [("system", sys_temp_animal), ("user", "{animal_info}")]
)

In [618]:
animal_type = "cat"
pet_color = "brown"
personality = "happy"
animal_info = f"I have a {animal_type} with a {personality} personality, and it is {pet_color} in color."

In [619]:
prompt_animal = prompt_temp_animal.invoke({"animal_info": animal_info})

In [620]:
%%time
response_animal = creative_llm.invoke(prompt_animal)

CPU times: user 21.1 ms, sys: 0 ns, total: 21.1 ms
Wall time: 850 ms


In [621]:
Markdown(response_animal.content)

1. Mocha
2. Ember
3. Hazel
4. Sunny
5. Copper
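To use the suggestions programmatically, the numbered list can be turned into a Python list. This is a minimal sketch that assumes the model keeps the `1. Name` format with one-word names, as requested in the system prompt; `parse_numbered_names` is a hypothetical helper.

```python
import re


def parse_numbered_names(text: str) -> list:
    """Extract one-word names from lines such as '1. Mocha' or '2) Ember'."""
    names = []
    for line in text.splitlines():
        match = re.match(r"^\s*\d+[.)]\s*(\S+)\s*$", line)
        if match:
            names.append(match.group(1))
    return names


sample = "1. Mocha\n2. Ember\n3. Hazel\n4. Sunny\n5. Copper"
names = parse_numbered_names(sample)
print(names)
```

In the notebook, `parse_numbered_names(response_animal.content)` would be the corresponding call. Because the model runs at `temperature=1`, the names will usually differ between runs.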