
In [None]:
# Initialize key and client

from openai import OpenAI
from google.colab import userdata

open_ai_key = userdata.get('open_ai_key')

client = OpenAI(api_key=open_ai_key)

In [None]:
# Grab "Attention is all you need"
!wget https://arxiv.org/pdf/1706.03762

--2025-02-08 05:22:39--  https://arxiv.org/pdf/1706.03762
Resolving arxiv.org (arxiv.org)... 151.101.67.42, 151.101.195.42, 151.101.3.42, ...
Connecting to arxiv.org (arxiv.org)|151.101.67.42|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2215244 (2.1M) [application/pdf]
Saving to: ‘1706.03762.2’


2025-02-08 05:22:39 (163 MB/s) - ‘1706.03762.2’ saved [2215244/2215244]



In [None]:
!pip install -U langchain-community pypdf langchain-openai



In [None]:
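# PyPDFLoader lives in langchain-community (installed above); the old
# langchain.document_loaders import path is deprecated.
from langchain_community.document_loaders import PyPDFLoader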

In [None]:
loader = PyPDFLoader("1706.03762")
pages = loader.load_and_split()

In [None]:
from langchain.chains.summarize import load_summarize_chain
from langchain_openai import ChatOpenAI

In [None]:
from IPython.display import Markdown

In [None]:
llm = ChatOpenAI(temperature=0.1, model_name="gpt-4-turbo-preview", api_key=open_ai_key)

In [None]:
chain_stuff = load_summarize_chain(llm, chain_type="stuff")
resp_stuff = chain_stuff.invoke(pages)

In [None]:
Markdown(resp_stuff["output_text"])

The paper "Attention Is All You Need" by Ashish Vaswani and colleagues at Google introduces the Transformer, a novel neural network architecture that relies entirely on an attention mechanism, eliminating the need for recurrent or convolutional layers in sequence transduction models. This architecture allows for significantly increased parallelization, reducing training times and achieving state-of-the-art results on machine translation tasks. The Transformer model achieved a BLEU score of 28.4 on the WMT 2014 English-to-German translation task and 41.8 on the English-to-French task, surpassing previous models and demonstrating superior quality and efficiency. The paper also explores the Transformer's application to English constituency parsing, showing its potential beyond translation tasks. The research highlights the benefits of self-attention mechanisms, including computational efficiency and the ability to capture long-range dependencies within sequences.

In [None]:
chain_map_reduce = load_summarize_chain(llm, chain_type="map_reduce")

resp_map_reduce = chain_map_reduce.invoke(pages)

Note: `map_reduce` takes noticeably longer to run than `stuff`.
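
The gap is expected: `stuff` puts every page into a single prompt and makes one LLM call, while `map_reduce` summarizes each chunk separately and then combines the partial summaries. Below is a minimal sketch of the call counts. It uses a stand-in `fake_llm` function (hypothetical, not a LangChain or OpenAI API) and 15 made-up chunks, and it ignores any extra collapse passes the real chain may run when the partial summaries are too long.

```python
# Stand-in for a chat model: counts calls and returns a truncated "summary".
# `fake_llm` is a hypothetical helper, not part of LangChain or OpenAI.
calls = {"n": 0}

def fake_llm(prompt: str) -> str:
    calls["n"] += 1
    return prompt[:40]

def summarize_stuff(chunks: list[str]) -> str:
    # One call: all chunks are concatenated into a single prompt.
    return fake_llm("\n\n".join(chunks))

def summarize_map_reduce(chunks: list[str]) -> str:
    # Map: one call per chunk. Reduce: one more call over the partial summaries.
    partials = [fake_llm(chunk) for chunk in chunks]
    return fake_llm("\n\n".join(partials))

chunks = [f"page {i} text" for i in range(15)]

calls["n"] = 0
summarize_stuff(chunks)
stuff_calls = calls["n"]

calls["n"] = 0
summarize_map_reduce(chunks)
map_reduce_calls = calls["n"]

print(stuff_calls, map_reduce_calls)  # 1 16
```

With `n` chunks, this sketch makes `n + 1` calls for `map_reduce` against one for `stuff`, which roughly matches the slowdown seen here.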

In [None]:
Markdown(resp_map_reduce["output_text"])

The paper "Attention Is All You Need" by Ashish Vaswani et al. introduces the Transformer, a groundbreaking network architecture that exclusively uses attention mechanisms, eliminating the need for recurrent or convolutional neural networks. This model sets new benchmarks in machine translation, with impressive BLEU scores for English-to-German and English-to-French translations, and showcases superior training efficiency and parallelization capabilities. The Transformer, composed of an encoder and decoder featuring multi-head self-attention and feed-forward networks, excels in sequence transduction tasks, surpassing previous models in both speed and performance. It also shows promise in applications beyond machine translation, such as English constituency parsing. Additionally, the paper explores the Transformer's encoder self-attention mechanism's role in anaphora resolution, highlighting its technical and potential philosophical implications in understanding justice and language processing.

Let's try prompt templates.

In [None]:
from langchain.chains.combine_documents.stuff import StuffDocumentsChain
from langchain.chains.llm import LLMChain
from langchain_core.prompts import PromptTemplate

In [None]:
# Define prompt template
prompt_template = """Write a concise summary in a maximum of 3 bullets of the following text enclosed within three backticks:
```{text}```
Include a heading that summarizes the title of the text.
CONCISE SUMMARY:"""
prompt = PromptTemplate.from_template(prompt_template)

In [None]:
# Define LLM chain
llm_chain = LLMChain(llm=llm, prompt=prompt)

In [None]:
# Define StuffDocumentsChain
stuff_chain = StuffDocumentsChain(llm_chain=llm_chain, document_variable_name="text")

resp = stuff_chain.invoke(pages)

In [None]:
Markdown(resp["output_text"])

### "Attention Is All You Need" by Ashish Vaswani et al., Google Brain and Google Research

- The paper introduces the Transformer, a novel neural network architecture based solely on attention mechanisms, eliminating the need for recurrence and convolutions in sequence transduction models. This architecture achieves state-of-the-art performance on machine translation tasks, outperforming existing models in both quality and training efficiency.
- Experiments demonstrate the Transformer's effectiveness, achieving new state-of-the-art results on the WMT 2014 English-to-German and English-to-French translation tasks, with significant improvements over previous models. The architecture allows for more parallelization, reducing training time.
- The Transformer also shows promising results in English constituency parsing, indicating its potential applicability beyond machine translation. The paper highlights the model's ability to learn long-range dependencies and its interpretability, with attention visualizations providing insights into the model's decision-making process.
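
For reference, the `stuff` strategy used here amounts to joining the page texts and substituting the result into the `{text}` placeholder. Here is a rough plain-Python sketch, assuming a blank line between pages (this appears to be the default `document_separator` of `StuffDocumentsChain`, but treat it as an assumption). The template is a simplified stand-in and the page strings are made up.

```python
# Simplified stand-in for the summary prompt; the page strings are made up.
template = "Summarize the following text in at most 3 bullets:\n{text}\nCONCISE SUMMARY:"
page_texts = ["Attention Is All You Need", "The Transformer uses self-attention."]

# Join the pages (assumed separator: a blank line) and fill the placeholder.
stuffed = "\n\n".join(page_texts)
final_prompt = template.format(text=stuffed)
print(final_prompt)
```

Because everything lands in one prompt, this approach only works when the whole document fits in the model's context window.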

## Translation tool

Inspiration: https://python.langchain.com/docs/tutorials/llm_chain/

In [530]:
from langchain_core.prompts import ChatPromptTemplate

In [531]:
llm_4o_conservative = ChatOpenAI(temperature=0, model_name="gpt-4o", api_key=open_ai_key)

In [532]:
system_template = """Translate the following from English to {language}.
                  Do not be verbose. Try to do an almost word for word translation while preserving the meaning.
                  If the script is not Roman, it is absolutely essential for you to provide the Roman transliteration in parentheses after the translation in the native script.
                  Output the name of the language in boldface.
                  """

prompt_template = ChatPromptTemplate.from_messages(
    [("system", system_template), ("user", "{text}")]
)

In [533]:
text = "What are you doing?"
language = "Nepali, Hindi, Mandarin, Korean, Japanese, Spanish, Italian, Greek, Norwegian, Swahili"

In [534]:
prompt = prompt_template.invoke({"language": language, "text": text})

In [535]:
response = llm_4o_conservative.invoke(prompt)

In [536]:
Markdown(response.content)

**Nepali:** के गर्दैछौ? (ke gardai chhau?)

**Hindi:** तुम क्या कर रहे हो? (tum kya kar rahe ho?)

**Mandarin:** 你在做什么？ (nǐ zài zuò shénme?)

**Korean:** 뭐 하고 있어요? (mwo hago isseoyo?)

**Japanese:** 何をしていますか？ (nani o shiteimasu ka?)

**Spanish:** ¿Qué estás haciendo?

**Italian:** Cosa stai facendo?

**Greek:** Τι κάνεις; (Ti káneis?)

**Norwegian:** Hva gjør du?

**Swahili:** Unafanya nini?
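
The cells above pack all ten languages into the single `{language}` slot and rely on the model to split them. An alternative is to build a separate message list for each language, sketched here in plain Python on the assumption that one request per language is acceptable. `build_messages` is a hypothetical helper, the system prompt is shortened, and only three languages are used for illustration.

```python
# Hypothetical helper: builds (role, content) pairs shaped like those passed to
# ChatPromptTemplate.from_messages, using plain string formatting instead.
system_template = "Translate the following from English to {language}."

def build_messages(language: str, text: str) -> list[tuple[str, str]]:
    return [
        ("system", system_template.format(language=language)),
        ("user", text),
    ]

# Split the comma-separated string into individual language names.
languages = [name.strip() for name in "Nepali, Hindi, Mandarin".split(",")]
prompts = {lang: build_messages(lang, "What are you doing?") for lang in languages}

for lang, messages in prompts.items():
    print(lang, messages)
```

Each message list could then be sent as its own request, trading one call for one call per language in exchange for outputs that are easier to handle individually.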

## Pet name generator?

Inspiration: https://www.youtube.com/watch?v=lG7Uxts9SXs

In [537]:
creative_llm = ChatOpenAI(temperature=1, model_name="gpt-4-turbo-preview", api_key=open_ai_key)

In [538]:
sys_temp_animal = """You are someone who comes up with creative names.
          I will give you the animal type, personality, and color of my pet.
          Suggest five cool names for my pet.
          Use one word names only.
          """

prompt_temp_animal = ChatPromptTemplate.from_messages(
    [("system", sys_temp_animal), ("user", "{animal_info}")]
)

In [539]:
animal_type = "cat"
pet_color = "brown"
personality = "happy"
animal_info = f"I have a {animal_type} with a {personality} personality, and it is {pet_color} in color."

In [540]:
prompt_animal = prompt_temp_animal.invoke({"animal_info": animal_info})

In [541]:
response_animal = creative_llm.invoke(prompt_animal)

In [542]:
Markdown(response_animal.content)

1. Mocha
2. Chipper
3. Cider
4. Maple
5. Hazel
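
If the names are needed as data rather than rendered Markdown, the numbered list can be split into a Python list. This small sketch assumes the reply keeps the `N. Name` format shown above, which the model is not guaranteed to do. `reply` is a hard-coded copy of that output.

```python
import re

# Hard-coded copy of the model output shown above.
reply = "1. Mocha\n2. Chipper\n3. Cider\n4. Maple\n5. Hazel"

# Capture the text after each leading "N. " marker, one match per line.
names = re.findall(r"^\s*\d+\.\s*(.+?)\s*$", reply, flags=re.MULTILINE)
print(names)  # ['Mocha', 'Chipper', 'Cider', 'Maple', 'Hazel']
```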