In [None]:
%pip install langchain
%pip install openai
%pip install tiktoken
%pip install unstructured
%pip install chromadb
%pip install pdfminer.six


In [None]:
import pandas as pd
import os
import json
from dotenv import load_dotenv
load_dotenv()
# You would need to set OPENAI_KEY in your own .env file (or environment) first
os.environ["OPENAI_API_KEY"] = os.getenv('OPENAI_KEY')
os.chdir('/Users/tcoan/git_repos/ncrm-spring-school/')
import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score, accuracy_score

## Large Language Models with LangChain

### OpenAI's API (GPT-3)

Sending requests to the OpenAI API is quite easy using LangChain. Start by importing the relevant packages:

In [None]:
# LangChain imports
from langchain.llms import OpenAI
from langchain import PromptTemplate

And then instantiate the `OpenAI()` class as follows:

In [None]:
# We set the "temperature" to zero to remove randomness in the response
llm = OpenAI(temperature=0)

We now need to send a (correctly formatted) prompt to the API and collect the response. Note that the `langchain` library offers a number of different "templates" to help structure your prompts. Let's see how this works:

In [None]:
sentiment_template = """
Here is an example of a movie review:

{review}

Is this a positive, negative, or neutral review? If you don't know, say 'unclear'. \
Return the result as {response_format} with the key 'sentiment'.
"""

prompt = PromptTemplate(
    input_variables=["review", "response_format"],
    template=sentiment_template,
)

Let's see what a "formatted" prompt would look like with input data:

In [None]:
movie_review = 'I know that most people love the movie Titanic. I thought it was pretty stupid. Sappy!'
response_format = 'JSON'
print(prompt.format(review=movie_review, response_format=response_format))

Now we just pass the formatted prompt to our `llm` object and collect the results:

In [None]:
response_json = llm(prompt.format(review=movie_review, response_format=response_format))

In [None]:
print(response_json)
print(type(response_json))

In [None]:
# Parse the JSON string into a Python dict (print() returns None, so parse first)
response = json.loads(response_json)
print(response)
print(type(response))
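
Note that `json.loads` will raise an error if the model wraps its answer in extra text. Here is a minimal sketch of a more forgiving parser. The helper name `parse_json_response` and the example strings are made up for illustration; they are not part of `langchain` or the OpenAI API.

```python
import json

def parse_json_response(raw):
    """Parse the text between the first '{' and the last '}' as JSON.

    Returns a dict on success and None if nothing parseable is found.
    """
    start = raw.find("{")
    end = raw.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        return json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return None

# Hypothetical model outputs, for illustration only
clean = '\n{"sentiment": "negative"}'
chatty = 'Sure! Here is the result: {"sentiment": "negative"} Hope that helps.'
broken = "The review is negative."

print(parse_json_response(clean))
print(parse_json_response(chatty))
print(parse_json_response(broken))
```

Returning `None` rather than raising makes it easier to flag problem responses when looping over many texts.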

One of the coolest features of using a GPT-style model for doing classification is that we can ask the model to justify its decision (in natural language!). Let's change our prompt to ask for a justification:

In [None]:
sentiment_template = """
Here is an example of a movie review:

{review}

Is this a positive, negative, or neutral review? If you don't know, say 'unclear'. \
Return the result as JSON with the key 'sentiment' and return a short \
description of why you gave this answer using the key "reason". 
"""

prompt = PromptTemplate(
    input_variables=["review"],
    template=sentiment_template,
)

In [None]:
response_json = llm(prompt.format(review=movie_review))
print(response_json)

Interacting with OpenAI's API will often give you the most accurate models (especially `davinci`), but it is expensive to use. Let's look at the "Chat" version (GPT3.5 turbo), which is 10 times cheaper!

### ChatGPT implementation (GPT3.5 turbo)

To interact with the "Chat" versions of OpenAI models using `langchain`, we need to load some additional classes/functions:

In [None]:
from langchain.chat_models import ChatOpenAI
from langchain.schema import (
    HumanMessage,
    SystemMessage
)

And then instantiate the `ChatOpenAI()` class as follows:

In [None]:
# chat mode instance
chat = ChatOpenAI(temperature=0)

How you pass prompts to `ChatOpenAI` differs from the standard API as well:

In [None]:
movie_review = 'I know that most people love the movie Titanic. I thought it was pretty stupid. Sappy!'
messages = [
    SystemMessage(content="You are a helpful assistant that can classify the sentiment of movie review texts. The labels you can use are positive, negative and neutral."),
    HumanMessage(content=f"Provide label for the following review: {movie_review}\n\nMake sure that your label is one of the following: 'positive', 'negative' or 'neutral'. Also provide a short justification for your label. Return the response as JSON."),
]

response = chat(messages)

In [None]:
print(response.content)
#print(json.loads(response.content))

### Zero-shot classification of movie reviews data

It's easy to do "zero-shot" classification with GPT-style models, but the obvious question is: how well do they perform? To get a sense of performance, let's load our movie reviews data:

In [None]:
reviews = pd.read_csv('data/movie_reviews.csv').to_dict('records')

# Subset the last 500 reviews which we used as a "test set"
reviews_unlabeled = reviews[1500:]

Now we simply need to loop over our text, send it to OpenAI, and collect the predictions. To make this easier, let's define a function to send each request:

In [None]:
def send_chat_request(movie_review):
    messages = [
        SystemMessage(content="You are a helpful assistant that can classify the sentiment of movie review texts. The labels you can use are positive, negative and neutral."),
        HumanMessage(content=f"Provide label for the following review: {movie_review}\n\nMake sure that your label is one of the following: 'positive', 'negative' or 'neutral'. Also provide a short justification for your label. You MUST return the response as JSON."),
    ]
    return chat(messages)

And run the loop:

In [None]:
# I've already run this loop and saved the JSON responses here:
preds = pd.read_json('language-models/zero_shot_predictions.json').to_dict('records')

# I don't want to run this again, so I'm commenting this out!
"""
results_chat = []
for i,row in enumerate(reviews_unlabeled):
    response = send_chat_request(row['text'])
    results_chat.append(response.content)
    print(f'Finished iteration {i}')
"""

print(preds[0])
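
The saved predictions carry a binary `positive` field, while the raw chat responses are JSON strings. One way to get from one to the other might look like the sketch below. This is an assumption about the conversion step, not the code used to build `zero_shot_predictions.json`: the key name `label`, the helper `label_to_positive`, and the choice to code 'neutral' as 0 are all hypothetical.

```python
import json

def label_to_positive(raw_response, label_key="label"):
    """Map a raw JSON response to 1 (positive), 0 (any other label), or None (unparseable)."""
    try:
        parsed = json.loads(raw_response)
    except json.JSONDecodeError:
        return None
    if not isinstance(parsed, dict):
        return None
    # Normalize case and whitespace before comparing labels
    label = str(parsed.get(label_key, "")).strip().lower()
    return 1 if label == "positive" else 0

# Hypothetical raw responses, for illustration only
example_responses = [
    '{"label": "positive", "justification": "The reviewer enjoyed the film."}',
    '{"label": "Neutral", "justification": "Mixed feelings."}',
    "I could not classify this review.",
]

print([label_to_positive(r) for r in example_responses])
```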

Finally, let's calculate the performance:

In [None]:
# Pull out the "positive" predictions and save as a numpy array
y_pred = np.array([row['positive'] for row in preds])

# Get the "truth" and save as a numpy array:
y = np.array([row['positive'] for row in reviews_unlabeled])

print(accuracy_score(y, y_pred))
print(precision_score(y, y_pred))
print(recall_score(y, y_pred))
print(f1_score(y, y_pred))
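
To see what these four numbers mean, here is a minimal sketch that computes them by hand for binary labels. The toy `y_true` and `y_pred` lists are made up and are not the movie review results.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Toy labels, for illustration only
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

With these toy labels there are 2 true positives, 1 false positive, 1 false negative and 2 true negatives, so all four metrics come out at 2/3.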


## Few-shot learning with LLMs and LangChain

While we provided no "training" data to the model in the zero-shot example, it is often helpful to nudge the model in the right direction. You can do this via "few-shot" learning: i.e., you add a "few" labeled examples from the data to the prompt prior to classification. `langchain` makes doing few-shot learning with OpenAI (and many other models!) relatively painless. Start by importing the few-shot learning prompt template:

In [None]:
from langchain import FewShotPromptTemplate

Next, let's set up a few examples of movie reviews with different labels:

In [None]:
examples = [
    {'text': 'The movie is about two teen couples who drink and drive, resulting in an accident where one of the guys dies. His girlfriend continues to see him in her life and has nightmares. The film attempts to present a cool idea but fails to execute it properly, resulting in a confusing and jumbled plot. The actors are good, but the film lacks entertainment value and feels redundant. It is not a horror or teen slasher flick, but it is packaged to look that way. The reviewer suggests skipping it.',
     'label': 'negative'},
    {'text':'"From Hell" is a successful film adaptation of a graphic novel by Alan Moore and Eddie Campbell about the Jack the Ripper murders in 1888 London\'s East End. The film is directed by the Hughes brothers and stars Johnny Depp as Inspector Frederick Abberline, who investigates the gruesome murders with the help of an unfortunate named Mary Kelly (Heather Graham). The film has a unique and interesting theory about the identity of the killer and the reasons he chooses to slay. The film\'s appearance is dark and bleak, capturing the dreariness of Victorian-era London, and the acting is solid, with Depp and Graham turning in strong performances. The film is rated R for strong violence/gore, sexuality, language, and drug content.',
     'label':'positive'},
    {'text':'The "The Bourne Identity" is an okay movie. It has all of the typical action that you would expect and the acting is decent. It is entertaining enough to watch.',
     'label':'neutral'}
]

Now we are ready to set up our prompt. This typically includes defining a prefix for our prompt, an example (or data) template, and a prompt suffix. Let's do this for our movie review examples:

In [None]:
prefix = """You are a helpful assistant that can classify the sentiment of movie review texts. 
The labels you can use are positive, negative and neutral. Here are some examples"""

example_template = """
Text: {text}
Label: {label}
"""

example_prompt = PromptTemplate(
    input_variables=["text", "label"],
    template=example_template
)

suffix = """
Text: {text}
Label: """

# Now create the few shot prompt template
few_shot_prompt_template = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["text"],
    example_separator="\n\n"
)



We can then view the formatted prompt in the usual way by calling the `.format()` method:

In [None]:
movie_review = "The Big Lebowski is the funniest movie that I've ever seen. I know this sort of comedy isn't for everyone, but wow."
print(few_shot_prompt_template.format(text=movie_review))
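
Under the hood, the few-shot template is roughly doing string assembly: format each example, then join the prefix, the examples, and the suffix with the separator. Here is a plain-Python sketch of that idea. The function `build_few_shot_prompt` and the toy examples are made up for illustration; this is not the `langchain` implementation.

```python
def build_few_shot_prompt(prefix, examples, example_template, suffix,
                          separator="\n\n", **inputs):
    """Join a prefix, formatted examples, and a formatted suffix into one prompt."""
    parts = [prefix]
    parts += [example_template.format(**example) for example in examples]
    parts.append(suffix.format(**inputs))
    return separator.join(parts)

# Toy examples, for illustration only
toy_examples = [
    {"text": "Loved it.", "label": "positive"},
    {"text": "Terrible.", "label": "negative"},
]

toy_prompt = build_few_shot_prompt(
    prefix="Classify the sentiment of each review.",
    examples=toy_examples,
    example_template="Text: {text}\nLabel: {label}",
    suffix="Text: {text}\nLabel: ",
    text="It was fine.",
)
print(toy_prompt)
```

The prompt ends with an empty `Label: ` slot, which is what invites the model to complete the pattern with a label.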

And we send the prompt to OpenAI in the usual way:

In [None]:
# We set the "temperature" to zero to remove randomness in the response
llm = OpenAI(temperature=0)

response = llm(few_shot_prompt_template.format(text=movie_review))

In [None]:
print(response)

## Text summarization with LangChain

Note that when entering "examples" for the few shot learning prompt above, I didn't enter full movie reviews but instead only summaries. How did I get these summaries? I used LangChain! To do text summarization, we start by loading the relevant libraries:

In [None]:
from langchain.text_splitter import CharacterTextSplitter
from langchain.chains.summarize import load_summarize_chain
from langchain.docstore.document import Document

And initialize the model that we want to use.

In [None]:
chat = ChatOpenAI(temperature=0)

Note that the number of tokens that you can pass to OpenAI is limited, so when summarizing long documents, it's necessary to split them into smaller chunks:

In [None]:
text_splitter = CharacterTextSplitter()
texts = text_splitter.split_text(reviews[1]['text'])
docs = [Document(page_content=t) for t in texts]

In [None]:
print(docs)

In [None]:
chain = load_summarize_chain(chat, chain_type="refine")
chain.run(docs)
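
The "refine" strategy summarizes the first chunk and then updates that summary one chunk at a time. Here is a minimal sketch of the control flow. The functions `toy_summarize` and `toy_refine` are stand-ins for LLM calls, and the word-based splitter is a simplification (it is not how `CharacterTextSplitter` splits text).

```python
def split_into_chunks(text, chunk_size):
    """Split text into chunks of at most `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]

def refine_summarize(chunks, summarize, refine):
    """Summarize the first chunk, then refine the summary with each later chunk."""
    if not chunks:
        return ""
    summary = summarize(chunks[0])
    for chunk in chunks[1:]:
        summary = refine(summary, chunk)
    return summary

# Stand-ins for LLM calls: keep only the first word of each chunk
def toy_summarize(chunk):
    return chunk.split()[0]

def toy_refine(summary, chunk):
    return summary + " " + chunk.split()[0]

toy_chunks = split_into_chunks("one two three four five six seven", chunk_size=3)
toy_summary = refine_summarize(toy_chunks, toy_summarize, toy_refine)
print(toy_chunks)
print(toy_summary)
```

Because each step depends on the previous summary, this approach makes one model call per chunk, in sequence.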

## Question answering

One of the most powerful features of LangChain + GPT-style models is that they provide the ability to do question answering over a set of documents using natural language. As an example, let's load the `trump_tweets_2017.csv` and use `langchain` to make queries on the tweet content. First, load the CSV "agent" needed to interact with CSVs:

In [None]:
from langchain.agents import create_csv_agent

Instantiate the model and load the data into the agent:

In [None]:
agent = create_csv_agent(OpenAI(temperature=0), 'data/trump_tweets_2017.csv', verbose=True)

Now we run queries against the Trump data like so:

In [None]:
agent.run("Can you provide 5 tweets that attack Democrats?")

In [None]:
agent.run("Show me the tweets that relate to the environment. Don't include tweets on the business environment.")

#### Question answering with a PDF file

We are not limited to making queries against a CSV -- `langchain` provides functionality to query almost anything! For example, let's see how to load and run queries on a PDF document. Start by loading the necessary functions:

In [None]:
from langchain.document_loaders import UnstructuredPDFLoader
from langchain.indexes import VectorstoreIndexCreator
loader = UnstructuredPDFLoader("/Users/tcoan/Downloads/s41598-021-01714-4.pdf")

Next, we use `VectorstoreIndexCreator()` to load and prepare our data. This is where all of the magic happens. `VectorstoreIndexCreator()` carries out the following tasks:

1. Loading the PDF and splitting it into smaller chunks.
2. Creating embeddings for each document (i.e., chunk)
3. Storing the documents and embeddings in a vectorstore (i.e., a database)

In [None]:
index = VectorstoreIndexCreator().from_loaders([loader])
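
To build some intuition for the embed-store-retrieve idea, here is a toy sketch in plain Python. Everything in it is an assumption for illustration: the bag-of-words `embed` function stands in for a learned embedding model, the list returned by `build_index` stands in for a vectorstore, and the chunks are made up rather than taken from the PDF.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: word counts (real pipelines use a learned embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def build_index(chunks):
    """Store each chunk alongside its embedding."""
    return [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(index, question, k=1):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

# Made-up chunks, for illustration only
toy_chunks = [
    "the model is trained on labeled tweets",
    "results are reported using the f1 score",
    "data were collected in 2017",
]
toy_index = build_index(toy_chunks)
top = retrieve(toy_index, "which score is reported")
print(top)
```

In a question answering setup, the retrieved chunks would then be passed to the language model together with the question.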

Now, we can use the `index` object (and the underlying vectorstore) to make queries like so:

In [None]:
index.query('does the paper use RoBERTa for classification?')

In [None]:
index.query('what is the F1 score for the best performing model in the paper?')

In [None]:
text = 'Al Gore is an alarmist and we should not take anything he says seriously.'
prompt = f'Is the following sentence an example of climate skepticism: {text}. If so, why?'
index.query(prompt)