!pip install openai
!pip install langchain
!pip install langchain_experimental
!pip install "langchain[docarray]"
!pip install tiktoken
!pip install chromadb
!pip install numexpr
!pip install wikipedia
!pip install python-dotenv pandas

In [1]:
import os, sys
sys.path.insert(0, './')
import openai
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())

openai.api_key = os.environ['OPENAI_API_KEY']

# OpenAI API

In [2]:
def get_completion(prompt, model='gpt-3.5-turbo'):
    messages = [{'role': 'user', 'content': prompt}]
    response = openai.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0
    )
    return response.choices[0].message.content

In [3]:
text = "Duke University is a private institution that was founded \
    in 1838. It has a total undergraduate enrollment of 6,640 (fall 2022), \
    and the campus size is 8,693 acres. It utilizes a semester-based academic \
    calendar. Duke University's ranking in the 2024 edition of Best Colleges \
    is National Universities, #7."
prompt = f"Please summarize the text below: {text}"
response = get_completion(prompt)
print(response)

Duke University is a private institution founded in 1838. It has around 6,640 undergraduate students and a campus size of 8,693 acres. The university follows a semester-based academic calendar. In the 2024 edition of Best Colleges, Duke University is ranked #7 among National Universities.


# LangChain

## 1. Prompt Template

In [None]:
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate

import os
# Never hard-code secrets in a notebook; read the key from the environment instead
API_KEY = os.environ['OPENAI_API_KEY']   # raises KeyError if the variable is not set

In [None]:
template_string = """
Translate the following text into {language}: \
```{text}```
"""
chat = ChatOpenAI(temperature=0.0)
prompt_template = ChatPromptTemplate.from_template(template_string)
prompt_template.messages[0].prompt.input_variables

['language', 'text']
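
As a rough illustration of where `input_variables` comes from, the placeholder names can be recovered from a format string with the standard library alone. This is a minimal sketch, not LangChain's implementation; the helper name `input_variables` and the simplified template (code fence omitted) are assumptions made for the example.

```python
from string import Formatter


def input_variables(template: str) -> list[str]:
    """Return the sorted, de-duplicated placeholder names in a format string."""
    names = {field for _, field, _, _ in Formatter().parse(template) if field}
    return sorted(names)


# Simplified version of the translation template used above
template = "Translate the following text into {language}: {text}"
variables = input_variables(template)
print(variables)

# Filling the placeholders yields the final prompt text
message = template.format(language="Spanish", text="Hello")
print(message)
```

The same template can then be reused for any language by changing only the values passed to `format`.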

In [None]:
language = "Spanish"
text = "Duke University is a private institution that was founded \
    in 1838. It has a total undergraduate enrollment of 6,640 (fall 2022), \
    and the campus size is 8,693 acres. It utilizes a semester-based academic \
    calendar. Duke University's ranking in the 2024 edition of Best Colleges \
    is National Universities, #7."

prompt_message = prompt_template.format_messages(
    language=language, text=text
)
response = chat(prompt_message)
print("Spanish: \n" + response.content)

language = "Mandarin"
prompt_message = prompt_template.format_messages(
    language=language, text=text
)
response = chat(prompt_message)
print("Mandarin: \n" + response.content)

Spanish: 
La Universidad de Duke es una institución privada que fue fundada en 1838. Tiene un total de 6,640 estudiantes de pregrado (otoño 2022) y el tamaño del campus es de 8,693 acres. Utiliza un calendario académico basado en semestres. La clasificación de la Universidad de Duke en la edición 2024 de los Mejores Colegios es Universidades Nacionales, #7.
Mandarin: 
杜克大学是一所私立学府，成立于1838年。它的本科生总人数为6,640人（2022年秋季），校园占地8,693英亩。学校采用学期制的学术日历。根据2024年《最佳大学》排名，杜克大学在全国大学中排名第7位。


## 2. Output Parser

In [None]:
from langchain.output_parsers import ResponseSchema, StructuredOutputParser

import os
# Never hard-code secrets in a notebook; read the key from the environment instead
API_KEY = os.environ['OPENAI_API_KEY']   # raises KeyError if the variable is not set

In [None]:
name_schema = ResponseSchema(
    name="name",
    description="the name of the institution"
)
year_schema = ResponseSchema(
    name="year",
    description="the year when the institution was established"
)
num_students_schema = ResponseSchema(
    name="num_students",
    description="number of undergraduate students enrolled"
)
area_schema = ResponseSchema(
    name="area",
    description="the total area of the campus in the unit of acres"
)

response_schemas = [name_schema, year_schema, num_students_schema, area_schema]
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
format_instructions = output_parser.get_format_instructions()
print(format_instructions)

The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
	"name": string  // the name of the institution
	"year": string  // the year when the institution was established
	"num_students": string  // number of undergraduate students enrolled
	"area": string  // the total area of the campus in the unit of acres
}
```


In [None]:
template_string = """
For the following text, extract the following information:

name: the name of the institution
year: the year when the institution was established
num_students: number of undergraduate students enrolled
area: the total area of the campus in the unit of acres

text: {text}

format instructions: {format_instructions}
"""
prompt_template = ChatPromptTemplate.from_template(template_string)

text = "Duke University is a private institution that was founded \
    in 1838. It has a total undergraduate enrollment of 6,640 (fall 2022), \
    and the campus size is 8,693 acres. It utilizes a semester-based academic \
    calendar. Duke University's ranking in the 2024 edition of Best Colleges \
    is National Universities, #7."

prompt_message = prompt_template.format_messages(
    text=text,
    format_instructions=format_instructions
)
response = chat(prompt_message)
output_dict = output_parser.parse(response.content)
print(type(output_dict))
print(output_dict)

<class 'dict'>
{'name': 'Duke University', 'year': '1838', 'num_students': '6,640', 'area': '8,693 acres'}
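
The parsed values above are all strings. The sketch below shows, under stated assumptions, how a fenced JSON reply might be decoded by hand and how the numeric fields could be converted afterwards. The helpers `parse_json_block` and `to_int` and the sample `reply` are hypothetical and are not part of LangChain.

```python
import json
import re

# Built programmatically so this example can itself sit inside a code fence
FENCE = "`" * 3


def parse_json_block(reply: str) -> dict:
    """Extract and decode the first json-fenced block in a model reply."""
    pattern = re.escape(FENCE) + r"json\s*(.*?)" + re.escape(FENCE)
    match = re.search(pattern, reply, flags=re.DOTALL)
    if match is None:
        raise ValueError("no json code block found in reply")
    return json.loads(match.group(1))


def to_int(value: str) -> int:
    """Convert strings such as '6,640' or '8,693 acres' to an integer."""
    return int(re.sub(r"[^\d]", "", value))


# Hypothetical reply, shaped like the parsed output shown above
reply = (
    f"{FENCE}json\n"
    '{"name": "Duke University", "year": "1838", '
    '"num_students": "6,640", "area": "8,693 acres"}\n'
    f"{FENCE}"
)
record = parse_json_block(reply)
record["num_students"] = to_int(record["num_students"])
record["area"] = to_int(record["area"])
print(record)
```

Converting after parsing keeps the prompt simple; the schema descriptions could alternatively ask the model for plain integers.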


## 3. Conversation Chain and Memory

In [None]:
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory
from langchain.memory import ConversationBufferWindowMemory
from langchain.memory import ConversationTokenBufferMemory   # Like window buffer memory, but capped by token count instead of turns
from langchain.memory import ConversationSummaryBufferMemory

import os
# Never hard-code secrets in a notebook; read the key from the environment instead
API_KEY = os.environ['OPENAI_API_KEY']   # raises KeyError if the variable is not set

In [None]:
llm = ChatOpenAI(temperature=0.0)
memory = ConversationBufferMemory()
conversation = ConversationChain(
    llm=llm, memory=memory, verbose=False
)

In [None]:
conversation.predict(input="Hello, this is Gary.")

'Hello Gary! How can I assist you today?'

In [None]:
conversation.predict(input="How can I print out Hello World in Python?")

'To print out "Hello World" in Python, you can use the print() function. Simply type print("Hello World") and run the code. This will display "Hello World" in the console.'

In [None]:
conversation.predict(input="What is my name?")

'Your name is Gary.'

In [None]:
print(memory.buffer)

Human: Hello, this is Gary.
AI: Hello Gary! How can I assist you today?
Human: How can I print out Hello World in Python?
AI: To print out "Hello World" in Python, you can use the print() function. Simply type print("Hello World") and run the code. This will display "Hello World" in the console.
Human: What is my name?
AI: Your name is Gary.


In [None]:
memory.save_context({"input": "Hi"}, {"output": "What's up?"})

In [None]:
print(memory.buffer)

Human: Hello, this is Gary.
AI: Hello Gary! How can I assist you today?
Human: How can I print out Hello World in Python?
AI: To print out "Hello World" in Python, you can use the print() function. Simply type print("Hello World") and run the code. This will display "Hello World" in the console.
Human: What is my name?
AI: Your name is Gary.
Human: Hi
AI: What's up?
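
Conceptually, buffer memory is just a growing transcript that is prepended to every new prompt. The toy class below mimics the `save_context` and `buffer` behaviour seen above; `BufferMemory` is a hypothetical stand-in written for this sketch, not the LangChain class.

```python
class BufferMemory:
    """Stores every turn and renders it as a Human/AI transcript."""

    def __init__(self):
        self.turns = []  # list of (human_text, ai_text) pairs

    def save_context(self, inputs: dict, outputs: dict) -> None:
        """Append one exchange to the transcript."""
        self.turns.append((inputs["input"], outputs["output"]))

    @property
    def buffer(self) -> str:
        """Render all stored turns, oldest first."""
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


toy_memory = BufferMemory()
toy_memory.save_context({"input": "Hello, this is Gary."}, {"output": "Hello Gary!"})
toy_memory.save_context({"input": "Hi"}, {"output": "What's up?"})
print(toy_memory.buffer)
```

Because nothing is ever dropped, the prompt grows with every turn, which is what the window and summary variants below try to limit.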


### Window Buffer Memory

In [None]:
# k=1: only the most recent exchange (one human/AI turn) is kept in memory
memory = ConversationBufferWindowMemory(k=1)
conversation = ConversationChain(
    llm=llm, memory=memory, verbose=False
)
conversation.predict(input="Hello, this is Gary.")

'Hello Gary! How can I assist you today?'

In [None]:
conversation.predict(input="What is 1 + 1?")

'1 + 1 equals 2.'

In [None]:
conversation.predict(input="What is my name?")

"I'm sorry, but I don't have access to personal information about individuals unless it has been shared with me in the course of our conversation."

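The forgetting behaviour above can be reproduced with a fixed-size queue. This is a minimal sketch assuming the window simply discards the oldest exchange once more than `k` are stored; `WindowMemory` is a hypothetical class for illustration only.

```python
from collections import deque


class WindowMemory:
    """Keeps only the k most recent (human, ai) exchanges."""

    def __init__(self, k: int):
        # A deque with maxlen drops the oldest item automatically when full
        self.turns = deque(maxlen=k)

    def save_context(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    @property
    def buffer(self) -> str:
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


window = WindowMemory(k=1)
window.save_context("Hello, this is Gary.", "Hello Gary! How can I assist you today?")
window.save_context("What is 1 + 1?", "1 + 1 equals 2.")
print(window.buffer)  # the introduction with the name is no longer present
```

With `k=1` the name is gone by the time the third question is asked, which matches the model's reply above.
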
### Summary Buffer Memory

In [None]:
schedule = "Hey! As a virtual college student, my schedule isn't as dynamic as yours might be, \
            but let's pretend it's a typical day. I might have a few classes, study sessions, \
            and maybe some extracurricular activities. Right now, I'm free—what's on your mind? \
            Anything exciting happening on your end?"

memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=70)
memory.save_context({"input": "Hi"}, {"output": "What's up?"})
memory.save_context({"input": "What is your schedule today?"}, {"output": f"{schedule}"})

In [None]:
memory.load_memory_variables({})

{'history': "System: The human greets the AI and asks about its schedule. The AI explains that as a virtual college student, its schedule is not as dynamic as the human's, but it might have classes, study sessions, and extracurricular activities. The AI is currently free and asks the human if anything exciting is happening on their end."}

In [None]:
conversation = ConversationChain(
    llm=llm, memory=memory, verbose=False
)
conversation.predict(input="When is your next class?")

"As a virtual college student, my schedule is not fixed like a traditional student's. However, I do have classes throughout the week. My next class is tomorrow at 10:00 AM. It's a computer science lecture on artificial intelligence algorithms. I'm quite excited about it! Is there anything exciting happening on your end?"

In [None]:
conversation.predict(input="I'm excited about the upcoming Christmas holiday.")

"That's great to hear! Christmas is a wonderful time of the year. Do you have any special plans or traditions for the holiday?"

In [None]:
memory.load_memory_variables({})   # Note that it stores a "summary" of the human-AI conversation

{'history': "System: The human greets the AI and asks about its schedule. The AI explains that as a virtual college student, its schedule is not as dynamic as the human's, but it might have classes, study sessions, and extracurricular activities. The AI is currently free and asks the human if anything exciting is happening on their end. The AI mentions that its next class is a computer science lecture on artificial intelligence algorithms tomorrow at 10:00 AM and expresses excitement about it. The AI then asks the human if there is anything exciting happening on their end.\nHuman: I'm excited about the upcoming Christmas holiday.\nAI: That's great to hear! Christmas is a wonderful time of the year. Do you have any special plans or traditions for the holiday?"}
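
The history above mixes a running summary with the most recent messages kept verbatim. The sketch below shows one way such a policy could work, assuming the oldest messages are evicted once a token budget is exceeded and folded into the summary. The whitespace tokenizer, `stub_summarize`, and `ToySummaryBuffer` are all hypothetical stand-ins; a real implementation would call the LLM to write the summary.

```python
def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def stub_summarize(previous_summary: str, dropped: list[str]) -> str:
    """Placeholder for the LLM call that would fold old messages into the summary."""
    return (previous_summary + " " + " ".join(dropped)).strip()


class ToySummaryBuffer:
    def __init__(self, max_token_limit: int):
        self.max_token_limit = max_token_limit
        self.summary = ""
        self.recent = []  # verbatim messages, oldest first

    def save_context(self, human: str, ai: str) -> None:
        self.recent += [f"Human: {human}", f"AI: {ai}"]
        dropped = []
        # Evict the oldest messages until the verbatim buffer fits the budget
        while self.recent and sum(map(count_tokens, self.recent)) > self.max_token_limit:
            dropped.append(self.recent.pop(0))
        if dropped:
            self.summary = stub_summarize(self.summary, dropped)


mem = ToySummaryBuffer(max_token_limit=8)
mem.save_context("Hi", "What's up?")
mem.save_context("What is your schedule today?", "I am free.")
print("summary:", mem.summary)
print("recent:", mem.recent)
```

A larger `max_token_limit` keeps more of the conversation verbatim at the cost of longer prompts.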

## 4. Chains

In [None]:
import os
import pandas as pd
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.chains import LLMChain

# Never hard-code secrets in a notebook; read the key from the environment instead
API_KEY = os.environ['OPENAI_API_KEY']   # raises KeyError if the variable is not set
df = pd.read_csv('./IMDB_data_sub.csv')
df.head(5)

| Unnamed: 0 | review | sentiment |
| --- | --- | --- |
| 0 | One of the other reviewers has mentioned that ... | positive |
| 1 | A wonderful little production. <br /><br />The... | positive |
| 2 | I thought this was a wonderful way to spend ti... | positive |
| 3 | Basically there's a family where a little boy ... | negative |
| 4 | Petter Mattei's "Love in the Time of Money" is... | positive |


### LLM Chain

In [None]:
template_string = "What is the sentiment of the {review}?"
llm = ChatOpenAI(temperature=0.9)   # temperature = 0.9 means more randomness
prompt = ChatPromptTemplate.from_template(template_string)
chain = LLMChain(llm=llm, prompt=prompt)

review = df['review'][0]
chain.run(review)

'The sentiment of this passage is positive. The reviewer expresses their enthusiasm for the show "Oz" and describes how they were hooked on it after watching just one episode. They praise the show for its brutal and unflinching portrayal of violence and its willingness to tackle taboo subjects such as drugs, sex, and violence. The reviewer also appreciates the show\'s lack of prettiness and charm, stating that it goes where other shows wouldn\'t dare. Despite the unsettling nature of the content, the reviewer admits to developing a taste for the show and becoming comfortable with its graphic violence and portrayal of injustice.'
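
To make the "more randomness" remark concrete, here is a small sketch of how a temperature is commonly applied when sampling: scores are divided by the temperature before the softmax, so a higher temperature flattens the distribution. The logits are made up, and the exact behaviour inside the OpenAI API is not shown in this notebook.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax over logits / temperature (temperature must be > 0)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
cool = softmax_with_temperature(logits, 0.1)
warm = softmax_with_temperature(logits, 0.9)
print([round(p, 3) for p in cool])
print([round(p, 3) for p in warm])
```

At a low temperature almost all probability sits on the top candidate; at 0.9 the alternatives keep a meaningful share, so repeated runs can differ.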

### Simple Sequential Chain

In [None]:
from langchain.chains import SimpleSequentialChain

llm = ChatOpenAI(temperature=0.9)
prompt_1 = ChatPromptTemplate.from_template(
    """What is the product contained in this review? \
    {review}"""
)
chain_1 = LLMChain(llm=llm, prompt=prompt_1)
prompt_2 = ChatPromptTemplate.from_template(
    """Write a 20-word summary for the following product: \
    {product}"""
)
chain_2 = LLMChain(llm=llm, prompt=prompt_2)
chain_seq = SimpleSequentialChain(
    chains=[chain_1, chain_2], verbose=True
)

review = df['review'][0]
chain_seq.run(review)



> Entering new SimpleSequentialChain chain...
The product contained in this review is the TV show "Oz".
"Oz" is a TV show that delves into the dark and gritty world of a maximum-security prison.

> Finished chain.


'"Oz" is a TV show that delves into the dark and gritty world of a maximum-security prison.'
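
The idea behind a simple sequential chain is plain function composition: each step takes one string and returns one string, and the output of a step becomes the input of the next. The sketch below uses stub functions in place of the two LLM calls; `run_sequential`, `stub_find_product`, and `stub_summarize` are hypothetical names.

```python
def run_sequential(steps, text: str) -> str:
    """Feed the output of each step into the next one."""
    for step in steps:
        text = step(text)
    return text


def stub_find_product(review: str) -> str:
    # Stand-in for chain_1: a real chain would ask the LLM
    return "Oz" if "Oz" in review else "unknown"


def stub_summarize(product: str) -> str:
    # Stand-in for chain_2
    return f"A short summary of {product}."


result = run_sequential(
    [stub_find_product, stub_summarize],
    "After watching just 1 Oz episode you'll be hooked.",
)
print(result)
```

Because every step has exactly one input and one output, no variable names need to be wired together.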

### Regular Sequential Chain

A regular sequential chain supports multiple inputs and multiple outputs: each step writes its result to a named output key that later steps can read.

In [None]:
from langchain.chains import SequentialChain

prompt_1 = ChatPromptTemplate.from_template(
    """Can you translate the following review into {language}? \
    {review}"""
)
chain_1 = LLMChain(llm=llm, prompt=prompt_1, output_key='translated_review')
prompt_2 = ChatPromptTemplate.from_template(
    """Can you write a one-sentence summary for the following review in {language}? \
    {translated_review}"""
)
chain_2 = LLMChain(llm=llm, prompt=prompt_2, output_key='summary')
prompt_3 = ChatPromptTemplate.from_template(
    """Write a follow up response to the following summary in the specified language: \
    \n\nSummary: {summary}\n\nLanguage: {language}"""
)
chain_3 = LLMChain(llm=llm, prompt=prompt_3, output_key='followup_message')
chain_seq = SequentialChain(
    chains=[chain_1, chain_2, chain_3],
    input_variables=['review', 'language'],
    output_variables=['translated_review', 'summary', 'followup_message'],
    verbose=True
)

review, language = df['review'][0], 'Mandarin'
chain_seq({
    'review': review, 'language': language
})



> Entering new SequentialChain chain...

> Finished chain.


{'review': "One of the other reviewers has mentioned that after watching just 1 Oz episode you'll be hooked. They are right, as this is exactly what happened with me.<br /><br />The first thing that struck me about Oz was its brutality and unflinching scenes of violence, which set in right from the word GO. Trust me, this is not a show for the faint hearted or timid. This show pulls no punches with regards to drugs, sex or violence. Its is hardcore, in the classic use of the word.<br /><br />It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary. It focuses mainly on Emerald City, an experimental section of the prison where all the cells have glass fronts and face inwards, so privacy is not high on the agenda. Em City is home to many..Aryans, Muslims, gangstas, Latinos, Christians, Italians, Irish and more....so scuffles, death stares, dodgy dealings and shady agreements are never far away.<br /><br />I would say the main appeal of the show is du
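
The multi-input, multi-output behaviour can be pictured as a shared dictionary that every step reads from and writes to under its own output key. The sketch below uses lambdas in place of LLM calls; `run_named_chain` and the stub steps are assumptions for illustration.

```python
def run_named_chain(steps, inputs: dict) -> dict:
    """Each step is (function, output_key); the function reads the shared state."""
    state = dict(inputs)
    for func, output_key in steps:
        state[output_key] = func(state)
    return state


steps = [
    # Stand-in for the translation step
    (lambda s: f"[{s['language']}] {s['review']}", "translated_review"),
    # Stand-in for the one-sentence summary step
    (lambda s: s["translated_review"][:20], "summary"),
    # Stand-in for the follow-up step, which reads two earlier keys
    (lambda s: f"Reply in {s['language']} to: {s['summary']}", "followup_message"),
]
result = run_named_chain(
    steps, {"review": "A wonderful little production.", "language": "Mandarin"}
)
print(list(result))
```

The returned dictionary contains the original inputs plus every intermediate output, which mirrors the shape of the result printed above.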

### Router Chain

A router chain decides which subchain (path) to run based on the input; inputs that match no destination fall through to a default chain.

In [None]:
from langchain.chat_models import ChatOpenAI
from langchain.chains.router import MultiPromptChain
from langchain.chains.router.llm_router import LLMRouterChain, RouterOutputParser
from langchain.prompts import PromptTemplate
from langchain.chains.router.multi_prompt_prompt import MULTI_PROMPT_ROUTER_TEMPLATE

positive_template = """Please write a brief thank-you note to the following review:\n
Review:\n\n{input}"""
negative_template = """Please make an apology in response to the following review:\n
Review:\n\n{input}"""
prompt_infos = [
    {
        'name': 'positive_review',
        'description': 'Good for writing feedback in response to a positive review',
        'template': positive_template
    },
    {
        'name': 'negative_review',
        'description': 'Good for writing feedback in response to a negative review',
        'template': negative_template
    }
]

In [None]:
# Define specific destination chains
llm = ChatOpenAI(temperature=0.9)
destination_chains = {}
for p_info in prompt_infos:
    name = p_info['name']
    prompt_template = p_info['template']
    prompt = ChatPromptTemplate.from_template(template=prompt_template)
    chain = LLMChain(llm=llm, prompt=prompt)
    destination_chains[name] = chain

# Description for each destination chain
destinations = [f"{p['name']}: {p['description']}" for p in prompt_infos]
destinations_str = "\n".join(destinations)
print(destinations_str)

positive_review: Good for writing feedback in response to a positive review
negative_review: Good for writing feedback in response to a negative review


In [None]:
default_prompt = ChatPromptTemplate.from_template(
    """Please write a one-sentence summary for the following review:\n
    Review:\n\n{input}"""
)
default_chain = LLMChain(llm=llm, prompt=default_prompt)

In [None]:
router_template = MULTI_PROMPT_ROUTER_TEMPLATE.format(destinations=destinations_str)
router_prompt = PromptTemplate(
    template=router_template,
    input_variables=["input"],
    output_parser=RouterOutputParser()
)
router_chain = LLMRouterChain.from_llm(llm, router_prompt)
chain = MultiPromptChain(
    router_chain=router_chain,
    destination_chains=destination_chains,
    default_chain=default_chain,
    verbose=True
)

In [None]:
review_1 = df['review'][0]   # Positive review
chain.run(review_1)



> Entering new MultiPromptChain chain...




positive_review: {'input': "One of the other reviewers has mentioned that after watching just 1 Oz episode you'll be hooked. They are right, as this is exactly what happened with me.\n\nThe first thing that struck me about Oz was its brutality and unflinching scenes of violence, which set in right from the word GO. Trust me, this is not a show for the faint hearted or timid. This show pulls no punches with regards to drugs, sex or violence. Its is hardcore, in the classic use of the word.\n\nIt is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary. It focuses mainly on Emerald City, an experimental section of the prison where all the cells have glass fronts and face inwards, so privacy is not high on the agenda. Em City is home to many..Aryans, Muslims, gangstas, Latinos, Christians, Italians, Irish and more....so scuffles, death stares, dodgy dealings and shady agreements are never far away.\n\nI would say the main appeal of the show is due to the

"Dear Reviewer,\n\nI wanted to take a moment to express my sincere gratitude for your thoughtful and insightful review of the show Oz. Your review captured the essence of the show perfectly and really resonated with me.\n\nI completely agree with your observation about the show's brutality and unapologetic portrayal of violence. It is refreshing to see a series that does not shy away from the darker aspects of life. The way the show explores the dynamics within the prison, highlighting the various gangs and clashes between different groups, is truly gripping.\n\nYour words about how Oz goes where other shows wouldn't dare really struck a chord with me. The raw and unfiltered depiction of life in prison is both unsettling and captivating at the same time. While it may not be for everyone, I appreciate how the show challenges its viewers to confront their own discomfort and explore their darker side.\n\nThank you once again for sharing your perspective on Oz. Your review has further soli

In [None]:
review_2 = df['review'][3]   # Negative review
chain.run(review_2)



> Entering new MultiPromptChain chain...




negative_review: {'input': "Basically there's a family where a little boy (Jake) thinks there's a zombie in his closet & his parents are fighting all the time.\n\nThis movie is slower than a soap opera... and suddenly, Jake decides to become Rambo and kill the zombie.\n\nOK, first of all when you're going to make a film you must Decide if its a thriller or a drama! As a drama the movie is watchable. Parents are divorcing & arguing like in real life. And then we have Jake with his closet which totally ruins all the film! I expected to see a BOOGEYMAN similar movie, and instead i watched a drama with some meaningless thriller spots.\n\n3 out of 10 just for the well playing parents & descent dialogs. As for the shots with Jake: just ignore them."}
> Finished chain.


"Dear [Reviewer],\n\nThank you for taking the time to share your thoughts on our film. We sincerely apologize for not meeting your expectations and for the disappointment you experienced while watching it.\n\nWe understand your frustration regarding the mixed genre elements in the movie. You rightly point out that the combination of drama and thriller did not work effectively, and the sudden transition from a family drama to a thrilling plot involving Jake's closet was jarring. We apologize for not properly establishing the tone and genre of the film, as it seems to have resulted in a disconnect between your expectations and the actual viewing experience.\n\nYour feedback regarding the well-playing parents and descent dialogs is greatly appreciated. We are glad that those aspects resonated with you and provided some positive elements to the overall movie. However, we acknowledge that the shots involving Jake did not add value to the film and could have been handled better.\n\nWe take y

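Stripped of the LLM calls, routing is a lookup: a router returns a destination name, the matching handler runs, and anything unrecognized goes to a default. The sketch below replaces the LLM router with a keyword check, so the keywords, `stub_router`, and the handlers are all hypothetical; the `"DEFAULT"` sentinel is also an assumption of this sketch.

```python
def stub_router(review: str) -> str:
    """Stand-in for the LLM router: picks a destination name by keyword."""
    text = review.lower()
    if any(word in text for word in ("wonderful", "hooked", "excellent")):
        return "positive_review"
    if any(word in text for word in ("ruins", "boring", "slower")):
        return "negative_review"
    return "DEFAULT"


def thank(review: str) -> str:
    return "Thank you for the kind review!"


def apologize(review: str) -> str:
    return "We apologize for the disappointing experience."


def summarize(review: str) -> str:
    return f"Summary: {review[:30]}"


destinations = {"positive_review": thank, "negative_review": apologize}


def route(review: str):
    """Return (destination name, response) for a review."""
    name = stub_router(review)
    handler = destinations.get(name, summarize)  # fall back to the default handler
    return name, handler(review)


print(route("You'll be hooked."))
print(route("This movie is slower than a soap opera."))
print(route("It is a film."))
```
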
## 5. Question Answering

In [2]:
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA
from langchain.document_loaders import CSVLoader
from langchain.vectorstores import DocArrayInMemorySearch
from langchain.indexes import VectorstoreIndexCreator
from langchain.embeddings import OpenAIEmbeddings
from IPython.display import display, Markdown

import os
# Never hard-code secrets in a notebook; read the key from the environment instead
API_KEY = os.environ['OPENAI_API_KEY']   # raises KeyError if the variable is not set
# os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'   # Might cause some issues

In [3]:
file_path = './IMDB_data_sub.csv'
loader = CSVLoader(file_path)
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch,
).from_loaders([loader])

In [4]:
query = """Please list all the reviews involving \'Shakespeare\' in a table
        in Markdown and summarize them in one sentence."""
response = index.query(query)
display(Markdown(response))



| Review | Sentiment |
| --- | --- |
| The cast played Shakespeare. | Negative |
| I appreciate that this is trying to bring Shakespeare to the masses, but why ruin something so good. | Negative |
| Is it because 'The Scottish Play' is my favorite Shakespeare? I do not know. | Negative |
| What I do know is that a certain Rev Bowdler (hence bowdlerization) tried to do something similar in the Victorian era. | Negative |
| In other words, you cannot improve perfection. | Negative |
| If you wish to see Shakespeare's masterpiece in its entirety, I suggest you find this BBC version. | Positive |
| Every film version of "Hamlet" has tinkered with its structure. | Positive |
| Jacoby is able to pull all of these aspects of Hamlet's character together with the aid of Shakespeare's full script. | Positive |
| This movie illustrates like no other the state of the Australian film industry and everything that's holding it back. | Negative |
| An "adaptation" of sorts, it brought nothing new to Macbeth. | Negative |
| If there's one body of work that has been done (and done and done and done), it's Shakespeare's.

In [5]:
# Create an in-memory vector store holding one embedding per loaded document (CSV row)
docs = loader.load()
embeddings = OpenAIEmbeddings()
db = DocArrayInMemorySearch.from_documents(docs, embeddings)

# Similarity search
query = "Please list all the reviews involving \'Shakespeare\'."
results = db.similarity_search(query)
print(len(results))

4
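
Similarity search ranks stored vectors by how close they are to the query vector and returns the top `k`. The sketch below uses cosine similarity on made-up three-dimensional vectors; the metric, the vectors, and the helper names are assumptions, and real embeddings have far more dimensions.

```python
import math


def cosine(a, b) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs: dict, k: int = 4) -> list:
    """Return the names of the k documents most similar to the query."""
    ranked = sorted(
        doc_vecs, key=lambda name: cosine(query_vec, doc_vecs[name]), reverse=True
    )
    return ranked[:k]


# Made-up embeddings: the first axis loosely stands for "about Shakespeare"
toy_docs = {
    "hamlet review": [0.9, 0.1, 0.0],
    "macbeth review": [0.8, 0.2, 0.1],
    "zombie movie review": [0.0, 0.2, 0.9],
}
query_vec = [1.0, 0.0, 0.0]
hits = top_k(query_vec, toy_docs, k=2)
print(hits)
```

Note that the search always returns `k` items when enough documents exist, whether or not they are truly relevant, which may explain why a review that never mentions Shakespeare appears among the four results.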


In [9]:
# Question answering by hand: concatenate the documents returned by the
# similarity search and pass them to the LLM along with the question
retriever = db.as_retriever()   # used by the RetrievalQA chain below
llm = ChatOpenAI(temperature=0.9)
qdocs = "".join(doc.page_content for doc in results)
response = llm.call_as_llm(f"{qdocs} Question: Please list all the reviews involving \'Shakespeare\' in \
a table in Markdown and summarize them in one sentence.")
display(Markdown(response))

| Review | Sentiment |
|---|---|
| The cast played Shakespeare. Shakespeare lost. I appreciate that this is trying to bring Shakespeare to the masses, but why ruin something so good. Is it because 'The Scottish Play' is my favorite Shakespeare? I do not know. What I do know is that a certain Rev Bowdler (hence bowdlerization) tried to do something similar in the Victorian era. In other words, you cannot improve perfection. I have no more to write but as I have to write at least ten lines of text (and English composition was never my forte I will just have to keep going and say that this movie, as the saying goes, just does not cut it. | Negative |
| If you wish to see Shakespeare's masterpiece in its entirety, I suggest you find this BBC version. Indeed it is overlong at four and a half hours but Jacoby's performance as Hamlet and Patrick Stewart's as Claudius are well worth the effort. It never ceases to amaze me how clear "Hamlet" is when you see it in its length and order as set down by the Bard. Every film version of "Hamlet" has tinkered with its structure. Olivier concentrated on Hamlet's indecision, Gibson on his passions. Jacoby is able to pull all of these aspects of Hamlet's character together with the aid of Shakespeare's full script. Why does Hamlet not kill Claudius immediately? Hamlet says "I am very proud, revengeful, ambitious..." Hamlet is extremely upset, not only for his father's death (and suspected murder), or his mother's marriage to his uncle, but also, and mostly, because Claudius has usurped the throne belonging to Hamlet. He is furious at his mother for marrying Claudius (marriages between royal kin is not unknown; done for political reasons) but that her marriage solidified Claudius' claim to the throne before he could return from Wittenburg to claim it for himself. He is, therefore, impotent to do anything about it. And this is true even after he hears his father's ghost cry vengeance. He cannot simply kill the King or he will lose the throne in doing so. He must "out" the King's secret and here is the tragedy! At the moment Hamlet is successful in displaying Claudius' guilt in public, he has opportunity to kill him and does not. WHY? He wants it ALL! He wants revenge, the throne AND the damnation of Claudius' soul in hell. Hamlet OVERREACHES himself in classic tragic form. His own HUBRIS is his undoing. He kills Polonius thinking it is Claudius and the rest of the play spirals down to the final deaths of Rosencrantz, Guildenstern, Ophelia, Laertes, Gertrude, Claudius and Hamlet himself. | Positive |
| This movie illustrates like no other the state of the Australian film industry and everything that's holding it back. Awesome talent, outstanding performances (particularly by Victoria Hill), but a let down in practically every other way. An "adaptation" of sorts, it brought nothing new to Macbeth (no, setting it in present-day Australia is not enough), and essentially, completely failed to justify its existence, apart from (let's face it, completely unnecessarily) paying homage to the original work. If there's one body of work that has been done (and done and done and done), it's Shakespeare's. So any adaptation, if it's not to be a self-indulgent and pointless exercise, needs to at least bring some new interpretation to the work. And that's what this Macbeth fails to do. As it was done, this film has no contemporary relevance whatsoever. It's the same piece that we have seen countless (too many!) times before. Except with guns and in different outfits. Apart from the fundamental blunder (no other way to put it) of keeping the original Shakespearian dialogue, one of the more cringeful moments of the movie is the prolonged and incredibly boring slow motion shoot out towards the end, during which I completely tuned out, even though I was looking at the screen. I never thought I had a short attention span, but there you go. I suppose the movie succeeds on its own, very limited terms. But as Australia continues to produce world-class acting talent, its movie-makers need to stop being proud of succeeding on limited terms, and actually set high enough standards to show that they respect for the kind of acting talent they work with. A shame. An absolute shame. | Negative |
| The performance of every actor and actress (in the film) are excellently NATURAL which is what movie acting should be; and the directing skill is so brilliantly handled on every details that I am never tired of seeing it over and over again. However, I am rather surprised to see that this film is not included in some of the actors' and director, Attenborough's credits that puzzles me: aren't they proud of making a claim that they have made such excellent, long lasting film for the audience? I am hoping I would get some answers to my puzzles from some one (possibly one of the "knowledgeable" personnel (insider) of the film. | Positive |

Summary: One review criticizes the attempt to bring Shakespeare to the masses, while another praises the full-length adaptation of "Hamlet" and the performances of the actors. Another review criticizes an Australian adaptation of Macbeth for not bringing anything new to the story, while the last review praises the natural performances and directing skill in a film.

In [10]:
# stuff: put all retrieved documents into a single prompt
# map_reduce: get a response for each chunk of data, then use another LLM call
#     to combine the individual responses; useful for long documents
# refine: iterate over the documents, refining the answer from the previous documents
# map_rerank: score the response for each document and return the highest-scoring one
qa_stuff = RetrievalQA.from_chain_type(
    llm=llm, chain_type='stuff', retriever=retriever, verbose=True
)
query = """Please list all the reviews involving \'Shakespeare\' in a table
        in Markdown and summarize them in one sentence."""
response = qa_stuff.run(query)
display(Markdown(response))



> Entering new RetrievalQA chain...

> Finished chain.


| Review | Sentiment |
|---|---|
| The cast played Shakespeare. Shakespeare lost. I appreciate that this is trying to bring Shakespeare to the masses, but why ruin something so good. Is it because 'The Scottish Play' is my favorite Shakespeare? I do not know. What I do know is that a certain Rev Bowdler (hence bowdlerization) tried to do something similar in the Victorian era. In other words, you cannot improve perfection. I have no more to write but as I have to write at least ten lines of text (and English composition was never my forte I will just have to keep going and say that this movie, as the saying goes, just does not cut it. | negative |
| If you wish to see Shakespeare's masterpiece in its entirety, I suggest you find this BBC version. Indeed it is overlong at four and a half hours but Jacoby's performance as Hamlet and Patrick Stewart's as Claudius are well worth the effort. It never ceases to amaze me how clear "Hamlet" is when you see it in its length and order as set down by the Bard. Every film version of "Hamlet" has tinkered with its structure. Olivier concentrated on Hamlet's indecision, Gibson on his passions. Jacoby is able to pull all of these aspects of Hamlet's character together with the aid of Shakespeare's full script. Why does Hamlet not kill Claudius immediately? Hamlet says "I am very proud, revengeful, ambitious..." Hamlet is extremely upset, not only for his father's death (and suspected murder), or his mother's marriage to his uncle, but also, and mostly, because Claudius has usurped the throne belonging to Hamlet. He is furious at his mother for marrying Claudius (marriages between royal kin is not unknown; done for political reasons) but that her marriage solidified Claudius' claim to the throne before he could return from Wittenburg to claim it for himself. He is, therefore, impotent to do anything about it. And this is true even after he hears his father's ghost cry vengeance. He cannot simply kill the King or he will lose the throne in doing so. He must "out" the King's secret and here is the tragedy! At the moment Hamlet is successful in displaying Claudius' guilt in public, he has opportunity to kill him and does not. WHY? He wants it ALL! He wants revenge, the throne AND the damnation of Claudius' soul in hell. Hamlet OVERREACHES himself in classic tragic form. His own HUBRIS is his undoing. He kills Polonius thinking it is Claudius and the rest of the play spirals down to the final deaths of Rosencrantz, Guildenstern, Ophelia, Laertes, Gertrude, Claudius and Hamlet himself. | positive |
| The performance of every actor and actress (in the film) are excellently NATURAL which is what movie acting should be; and the directing skill is so brilliantly handled on every details that I am never tired of seeing it over and over again. However, I am rather surprised to see that this film is not included in some of the actors' and director, Attenborough's credits that puzzles me: aren't they proud of making a claim that they have made such excellent, long lasting film for the audience? I am hoping I would get some answers to my puzzles from some one (possibly one of the "knowledgeable" personnel (insider) of the film. | positive |
| This movie illustrates like no other the state of the Australian film industry and everything that's holding it back. Awesome talent, outstanding performances (particularly by Victoria Hill), but a let down in practically every other way. An "adaptation" of sorts, it brought nothing new to Macbeth (no, setting it in present-day Australia is not enough), and essentially, completely failed to justify its existence, apart from (let's face it, completely unnecessarily) paying homage to the original work. If there's one body of work that has been done (and done and done and done), it's Shakespeare's. So any adaptation, if it's not to be a self-indulgent and pointless exercise, needs to at least bring some new interpretation to the work. And that's what this Macbeth fails to do. As it was done, this film has no contemporary relevance whatsoever. It's the same piece that we have seen countless (too many!) times before. Except with guns and in different outfits. Apart from the fundamental blunder (no other way to put it) of keeping the original Shakespearian dialogue, one of the more cringeful moments of the movie is the prolonged and incredibly boring slow motion shoot out towards the end, during which I completely tuned out, even though I was looking at the screen. I never thought I had a short attention span, but there you go. I suppose the movie succeeds on its own, very limited terms. But as Australia continues to produce world-class acting talent, its movie-makers need to stop being proud of succeeding on limited terms, and actually set high enough standards to show that they respect for the kind of acting talent they work with. A shame. An absolute shame. | negative |

Summarized in one sentence: The first review criticizes a Shakespeare play, the second review praises a BBC version of Hamlet, the third review applauds the natural performances in a film, and the fourth review criticizes an adaptation of Macbeth for not bringing anything new to the table.

## 6. Evaluation

In [27]:
import langchain
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA
from langchain.document_loaders import CSVLoader
from langchain.vectorstores import DocArrayInMemorySearch
from langchain.indexes import VectorstoreIndexCreator
from langchain.evaluation.qa import QAGenerateChain, QAEvalChain
from IPython.display import display, Markdown

import os
from dotenv import load_dotenv, find_dotenv

# Read OPENAI_API_KEY from a local .env file instead of hardcoding the secret
_ = load_dotenv(find_dotenv())

In [12]:
file_path = './IMDB_data_sub.csv'
loader = CSVLoader(file_path)
data = loader.load()
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch,
).from_loaders([loader])

In [13]:
llm = ChatOpenAI(temperature=0.0)
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type='stuff',
    retriever=index.vectorstore.as_retriever(),
    verbose=True,
    chain_type_kwargs={
        "document_separator": "<<<<>>>>>"
    }
)
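
With `chain_type='stuff'`, the retrieved documents are concatenated into one context string that fills the prompt, and `document_separator` is the string placed between them. The sketch below imitates that joining step in plain Python without calling LangChain. The `stuff_documents` helper and the two sample texts are hypothetical; only the separator value is taken from the cell above.

```python
def stuff_documents(docs, separator="<<<<>>>>>"):
    """Join retrieved document texts into a single context string (illustrative only)."""
    return separator.join(docs)

# Hypothetical retrieved snippets standing in for real retriever results
docs = ["Review A: gritty and tense.", "Review B: slow but rewarding."]
context = stuff_documents(docs)

# The joined context is then substituted into the question-answering prompt
prompt = f"Use the context to answer the question.\n\n{context}\n\nQuestion: Which review mentions tension?"
print(context)
```

A distinctive separator makes document boundaries easy to spot in the debug output, at the cost of a few extra tokens per document.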

In [18]:
# Generate some question-answer pairs for evaluation
gen_chain = QAGenerateChain.from_llm(ChatOpenAI())
qa_pairs = gen_chain.apply_and_parse(
    [{"doc": t} for t in data[:5]]
)
qa_pairs[0]



{'qa_pairs': {'query': 'What is the main appeal of the show "Oz" according to the reviewer?',
  'answer': 'The main appeal of the show "Oz" according to the reviewer is its willingness to go where other shows wouldn\'t dare, with its brutal and unflinching scenes of violence, graphic portrayal of drugs and sex, and exploration of themes of injustice and shady dealings within a maximum security prison.'}}

In [22]:
# Set langchain.debug = True for debugging
langchain.debug = True
qa.run(qa_pairs[0]['qa_pairs']['query'])

[32;1m[1;3m[chain/start][0m [1m[1:chain:RetrievalQA] Entering Chain run with input:
[0m{
  "query": "What is the main appeal of the show \"Oz\" according to the reviewer?"
}
[32;1m[1;3m[chain/start][0m [1m[1:chain:RetrievalQA > 3:chain:StuffDocumentsChain] Entering Chain run with input:
[0m[inputs]
[32;1m[1;3m[chain/start][0m [1m[1:chain:RetrievalQA > 3:chain:StuffDocumentsChain > 4:chain:LLMChain] Entering Chain run with input:
[0m{
  "question": "What is the main appeal of the show \"Oz\" according to the reviewer?",
  "context": "the fact that it goes where other shows wouldn't dare. Forget pretty pictures painted for mainstream audiences, forget charm, forget romance...OZ doesn't mess around. The first episode I ever saw struck me as so nasty it was surreal, I couldn't say I was ready for it, but as I watched more, I developed a taste for Oz, and got accustomed to the high levels of graphic violence. Not just violence, but injustice (crooked guards who'll be sold out

'According to the reviewer, the main appeal of the show "Oz" is its brutality, unflinching scenes of violence, and its hardcore nature. The reviewer also mentions the diverse range of characters and the constant tension and conflict within the prison setting.'

In [25]:
langchain.debug = False
qa_pairs = [pair['qa_pairs'] for pair in qa_pairs]
predictions = qa.apply(qa_pairs)



> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


In [30]:
# Use an LLM to grade the chain's predictions against the generated question-answer pairs
llm = ChatOpenAI(temperature=0.0)
eval_chain = QAEvalChain.from_llm(llm)
graded_outputs = eval_chain.evaluate(qa_pairs, predictions)
for i, pair in enumerate(qa_pairs):
    print(f"Example {i}:")
    print("Question: " + predictions[i]['query'])
    print("Real Answer: " + predictions[i]['answer'])
    print("Predicted Answer: " + predictions[i]['result'])
    print("Predicted Grade: " + graded_outputs[i]['results'])
    print()

Example 0:
Question: What is the main appeal of the show "Oz" according to the reviewer?
Real Answer: The main appeal of the show "Oz" according to the reviewer is its willingness to go where other shows wouldn't dare, with its brutal and unflinching scenes of violence, graphic portrayal of drugs and sex, and exploration of themes of injustice and shady dealings within a maximum security prison.
Predicted Answer: According to the reviewer, the main appeal of the show "Oz" is its brutality, unflinching scenes of violence, and its hardcore nature. The reviewer also mentions the diverse range of characters and the constant tension and conflict within the prison setting.
Predicted Grade: CORRECT

Example 1:
Question: What is the filming technique used in the production that gives a sense of realism?
Real Answer: The filming technique used in the production is very unassuming- very old-time-BBC fashion, which gives a comforting and sometimes discomforting sense of realism to the entire piec
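
Once every example has a grade, the per-example labels can be collapsed into a single accuracy figure. The sketch below assumes each graded output is a dict whose `'results'` value is exactly `CORRECT` or `INCORRECT`, as in the printout above. The sample list and the `summarize_grades` helper are hypothetical, and other LangChain versions may format the grade text differently.

```python
from collections import Counter

# Hypothetical graded outputs, shaped like the list returned by eval_chain.evaluate
sample_grades = [
    {"results": "CORRECT"},
    {"results": "INCORRECT"},
    {"results": "CORRECT"},
]

def summarize_grades(graded):
    """Count each grade label and compute the share graded CORRECT."""
    counts = Counter(g["results"].strip().upper() for g in graded)
    accuracy = counts["CORRECT"] / len(graded) if graded else 0.0
    return counts, accuracy

counts, accuracy = summarize_grades(sample_grades)
print(dict(counts), f"accuracy={accuracy:.2f}")
```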

## 7. Agent

In [10]:
import langchain
from langchain_experimental.agents.agent_toolkits import create_python_agent
from langchain.agents import tool, load_tools, initialize_agent
from langchain.agents import AgentType
from langchain_experimental.tools.python.tool import PythonREPLTool
from langchain.python import PythonREPL
from langchain.chat_models import ChatOpenAI

import os
from datetime import date
from dotenv import load_dotenv, find_dotenv

# Read OPENAI_API_KEY from a local .env file instead of hardcoding the secret
_ = load_dotenv(find_dotenv())

In [2]:
# To use the LLM as a reasoning engine, set temperature = 0 to reduce randomness
# langchain.debug = True
llm = ChatOpenAI(temperature=0.0)
tools = load_tools(['llm-math','wikipedia'], llm=llm)
agent = initialize_agent(
    tools,
    llm, 
    agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION, 
    handle_parsing_errors=True, 
    verbose=True
)
agent("What is the 25% of 300?")



> Entering new AgentExecutor chain...
I can use the calculator tool to find the answer to this question.

Action:
```json
{
  "action": "Calculator",
  "action_input": "25% of 300"
}
```
Observation: Answer: 75.0
Thought:The answer is 75.0.
Final Answer: 75.0

> Finished chain.


{'input': 'What is the 25% of 300?', 'output': '75.0'}
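
The trace above shows the loop the agent follows: the LLM emits a JSON action blob, the executor runs the named tool, and the result is fed back as an observation. The sketch below imitates one such step in plain Python. It is not LangChain's actual parser: `parse_action`, `percent_of`, and `tool_registry` are hypothetical names, and the simple regex assumes a flat JSON object with no nested braces.

```python
import json
import re

FENCE = "`" * 3  # built indirectly so the example text can carry a fenced blob

def parse_action(llm_text):
    """Extract the first fenced JSON blob and return (tool_name, tool_input)."""
    pattern = FENCE + r"(?:json)?\s*(\{.*?\})\s*" + FENCE
    match = re.search(pattern, llm_text, re.DOTALL)
    if match is None:
        raise ValueError("no action blob found")
    blob = json.loads(match.group(1))
    return blob["action"], blob["action_input"]

def percent_of(expr):
    """Hypothetical stand-in for the calculator tool; handles only 'X% of Y'."""
    pct, total = re.fullmatch(r"(\d+(?:\.\d+)?)% of (\d+(?:\.\d+)?)", expr).groups()
    return float(pct) / 100 * float(total)

tool_registry = {"Calculator": percent_of}

# LLM output shaped like the first step of the trace above
llm_text = (
    "I can use the calculator tool.\n\nAction:\n"
    + FENCE + "json\n"
    + '{\n  "action": "Calculator",\n  "action_input": "25% of 300"\n}\n'
    + FENCE
)

name, tool_input = parse_action(llm_text)
observation = tool_registry[name](tool_input)
print(f"Observation: {observation}")
```

When the blob cannot be parsed, this sketch raises an error; in the notebook, `handle_parsing_errors=True` asks the executor to report the failure back to the LLM instead so it can retry.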

In [3]:
# The agent will use Wikipedia this time
question = """Andrew Y. Ng is a British-American computer scientist and technology entrepreneur
focusing on maching learning and artificial intelligence. What is his biggest contribution to the field
of maching learning?"""
agent(question)



> Entering new AgentExecutor chain...
Thought: I can use Wikipedia to find information about Andrew Y. Ng's contributions to the field of machine learning.

Action:
```json
{
  "action": "Wikipedia",
  "action_input": "Andrew Y. Ng"
}
```
Observation: Page: Andrew Ng
Summary: Andrew Yan-Tak Ng (Chinese: 吳恩達; born 1976) is a British-American computer scientist and technology entrepreneur focusing on machine learning and artificial intelligence (AI). Ng was a cofounder and head of Google Brain and was the former Chief Scientist at Baidu, building the company's Artificial Intelligence Group into a team of several thousand people.Ng is an adjunct professor at Stanford University (formerly associate professor and Director of its Stanford AI Lab or SAIL). Ng has also worked in the field of online education, cofounding Coursera and DeepLearning.AI. He has spearheaded many efforts to "democratize deep learning" teaching over 2.5 million students through 

{'input': 'Andrew Y. Ng is a British-American computer scientist and technology entrepreneur\nfocusing on maching learning and artificial intelligence. What is his biggest contribution to the field\nof maching learning?',
 'output': 'Andrew Y. Ng\'s biggest contribution to the field of machine learning is his work on "democratizing deep learning" through his online courses.'}

In [8]:
# Use LLM to write Python code
agent = create_python_agent(
    llm, 
    tool=PythonREPLTool(), 
    handle_parsing_errors=True, 
    verbose=True
)
nums, target = [2,7,11,15], 9
agent.run(f"""Given an array of integers: {nums} and an integer target: {target}, return indices of 
the two numbers such that they add up to target.""")



> Entering new AgentExecutor chain...
We can use a nested loop to iterate through the array and check if any two numbers add up to the target.
Action: Python_REPL
Action Input: arr = [2, 7, 11, 15], target = 9
Observation: SyntaxError('cannot assign to literal', ('<string>', 1, 8, 'arr = [2, 7, 11, 15], target = 9\n'))
Thought:There is a syntax error in the input. I need to fix it by removing the assignment operator.
Action: Python_REPL
Action Input: arr = [2, 7, 11, 15]; target = 9
Observation: 
Thought:The syntax error is fixed. Now I can proceed with the nested loop to find the indices of the two numbers that add up to the target.
Action: Python_REPL
Action Input: 
```
for i in range(len(arr)):
    for j in range(i+1, len(arr)):
        if arr[i] + arr[j] == target:
            print(i, j)
```
Observation: 0 1

Thought:The indices of the two numbers that add u

'[0, 1]'
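
The agent settled on a nested loop, which compares every pair and runs in quadratic time. For comparison, here is a single-pass sketch that uses a dictionary instead. It is a common alternative, not something the agent produced, and the `two_sum` name is chosen here for illustration.

```python
def two_sum(nums, target):
    """Return indices of two numbers that sum to target, or None if no pair exists."""
    seen = {}  # maps each value already visited to its index
    for i, value in enumerate(nums):
        complement = target - value
        if complement in seen:
            return [seen[complement], i]
        seen[value] = i
    return None

print(two_sum([2, 7, 11, 15], 9))  # → [0, 1]
```

Each element is visited once, so the running time grows linearly with the array length at the cost of the extra dictionary.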

In [12]:
# Customized tools
@tool
def time(text: str) -> str:
    """Returns today's date, use this for any \
    questions related to knowing today's date. \
    The input should always be an empty string, \
    and this function will always return today's \
    date - any date mathematics should occur \
    outside this function."""
    return str(date.today())

agent = initialize_agent(
    tools + [time],   # Add the time function to the existing tools
    llm, 
    agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION, 
    handle_parsing_errors=True, 
    verbose=True
)
agent.run("Whats the date today?")



> Entering new AgentExecutor chain...
Question: What's the date today?
Thought: I can use the `time` tool to get the current date.
Action:
```
{
  "action": "time",
  "action_input": ""
}
```

Observation: 2023-12-23
Thought:I now know the final answer.
Final Answer: The date today is 2023-12-23.

> Finished chain.


'The date today is 2023-12-23.'
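
The tool's docstring asks that any date arithmetic happen outside the tool itself. A minimal sketch of such a step, using only the standard library, might look like the following. The `days_until` helper is hypothetical, and the reference date is fixed to the one printed in the run above so the result is reproducible.

```python
from datetime import date

def days_until(target, today=None):
    """Return the number of days from `today` to `target` (negative if already past)."""
    today = today or date.today()
    return (target - today).days

# Fixed reference date matching the agent's observation above
reference = date(2023, 12, 23)
print(days_until(date(2024, 1, 1), reference))  # → 9
```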