# Creating Conversational AI with Large Language Models for Business

# Algoritma

Algoritma Data Science School is an educational institution based in Indonesia that specializes in providing training and courses in the field of data science and analytics. It aims to equip individuals with the knowledge and skills needed to excel in the rapidly growing field of data science.

This Jupyter Notebook has been specifically designed for Algoritma Data Science School, providing a platform for students to explore, analyze, and visualize data using Python and its associated data science libraries. Jupyter Notebook serves as an interactive workspace where students can demonstrate their understanding of data science concepts, showcase their skills, and present their findings in a structured and organized manner. Please note that this Jupyter Notebook is intended for personal or educational use only. Kindly refrain from reproducing or distributing this notebook without prior permission. 

# Outline

- Large Language Model (LLM)
    - Overview of Large Language Model & Transformer
    - Introduction to popular LLMs like GPT-3, GPT-2, and BERT
    - Understanding capabilities and limitations of LLMs

- Large Language Model Implementation
    - Introduction to LangChain
    - Setting API key and .env
    - LangChain QuickStart

- Build Question-Answering System
    - Introduction to Question-Answering System
    - Connecting datasource (database and text data) with LLM
    - Basics of building Question-Answering System using LLM with database and text data
    - Demonstration of using OpenAI and LangChain to build a Question-Answering System

- Hugging Face
    - Introduction to Text Generation and Hugging Face
    - Setting API key and .env
    - Applying HuggingFace's inference API to use LLM without OpenAI credits
    - Integrating HuggingFace's Inference API into the previously built Question-Answering System
    - Demonstration of using HuggingFace's Inference API to build a Question-Answering System

# Large Language Model (LLM)

# Large Language Model Implementation

## Intro to LangChain

LangChain is a framework for developing applications powered by language models. The LangChain framework is designed around these principles:

1. Data-aware: connect a language model to other sources of data

2. Agentic: allow a language model to interact with its environment

## Setting API key and .env

### dotenv

The dotenv library is a popular Python library that simplifies the process of loading environment variables from a .env file into your Python application. It allows you to store configuration variables separately from your code, making it easier to manage sensitive information such as API keys, database credentials, or other environment-specific settings.

#### `.env` file

The .env file is a text file commonly used in software development projects to store environment-specific configuration variables. It follows a simple key-value format, where each line represents a single configuration variable.

Here are a few important points about the .env file:

1. Purpose: The primary purpose of the .env file is to separate sensitive or environment-specific information from your codebase. It allows you to store configuration variables such as API keys, database credentials, or other settings that may change based on the environment (e.g., development, staging, production).

2. File Format: The .env file is typically a plain text file without any special formatting. Each line in the file consists of a key-value pair, where the key and value are separated by an equal sign (=). For example:

```
API_KEY=abc123
DATABASE_URL=mysql://user:password@localhost/db
```

3. Environment Variables: Each line in the .env file represents an environment variable. The key is the name of the environment variable, and the value is the corresponding value for that variable. These variables can be accessed within your code to retrieve the associated values.

4. Loading Variables: To make use of the variables defined in the .env file, you need to load them into your application. This is typically done using a library like dotenv in Python. The library reads the .env file and sets the defined variables as environment variables that can be accessed within your code.

5. Security: It's essential to ensure the security of your .env file. The file may contain sensitive information, such as passwords or access tokens. Make sure to exclude the .env file from version control systems like Git and only share it with authorized individuals who require access to the environment-specific configuration.

Overall, the .env file provides a convenient and flexible way to manage configuration variables in your project, allowing you to keep sensitive information separate from your code and easily configure different environments.
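
To make the key-value format concrete, here is a minimal sketch of how such a file could be read. The function `parse_env_text` is a hypothetical helper written for illustration only; it ignores quoting and other details that the dotenv library handles, so it is not a replacement for it.

```python
def parse_env_text(text):
    """Parse KEY=VALUE lines into a dict (simplified: no quoting or export handling)."""
    variables = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values may themselves contain '='.
        key, sep, value = line.partition("=")
        if sep:
            variables[key.strip()] = value.strip()
    return variables


# Sample content in .env format (placeholder values, not real credentials).
sample = """
# comment lines are ignored
API_KEY=abc123
DATABASE_URL=mysql://user:password@localhost/db
"""

config = parse_env_text(sample)
print(config["API_KEY"])
```

In practice, `load_dotenv()` does this work and also places the values into the process environment.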


### Verify

In [1]:
from dotenv import load_dotenv

load_dotenv()

True
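
`load_dotenv()` returning `True` only tells us that a `.env` file was found and read. A small sketch for checking that a specific key is available, without printing the secret itself, could look like this. It assumes the key is stored under the name `OPENAI_API_KEY`, which is the variable LangChain's OpenAI wrapper reads by default; `key_status` is a hypothetical helper.

```python
import os


def key_status(name):
    """Report whether an environment variable is set, without revealing its value."""
    value = os.environ.get(name)
    if not value:
        return f"{name} is missing"
    # Show only the length so the secret never ends up in notebook output.
    return f"{name} is set ({len(value)} characters)"


print(key_status("OPENAI_API_KEY"))
```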

## LangChain QuickStart

**Topic**

- Prompt
- Chain
- Agent

In [8]:
from langchain import OpenAI

llm = OpenAI(temperature=0.9) # parameter temperature

### Prompt

#### Basic Prompt

The basic building block of LangChain is the LLM, which takes in text and generates more text (the answer).

For example, suppose we're building an application that generates a brand name from a company description.

In [8]:
prompt = "What is a good name for a brand that makes local burger?"

print(llm.predict(prompt))



Uptown Burgers.


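Notice that every re-run generates a new answer, because the model was created with a high `temperature` (0.9), which makes its sampling more random.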
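
The effect of temperature can be illustrated with a toy calculation that is independent of LangChain and OpenAI. The scores below are made-up values for three candidate tokens, and `softmax_with_temperature` is a hypothetical helper; real models work with far larger vocabularies, and a temperature of exactly 0 is usually handled as "always pick the top token" rather than by dividing by zero.

```python
import math


def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens.
scores = [2.0, 1.0, 0.5]

for t in (0.1, 0.9):
    probs = softmax_with_temperature(scores, t)
    print(t, [round(p, 3) for p in probs])
```

At the lower temperature almost all probability goes to the top-scoring token, so repeated runs give the same answer; at 0.9 the other tokens keep a real chance of being sampled.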

We can also do this in other languages. Let's try Bahasa Indonesia; the prompt below means "A good name for a brand that makes pisang goreng mentai?"

In [9]:
print(llm.predict("Nama yang bagus untuk brand yang membuat pisang goreng mentai?"))



PangoMentai.


#### Prompt Templates

Most LLM applications do not pass input directly into an LLM. Usually they add the user input to a larger piece of text, called a prompt template.

A prompt template refers to a reproducible way to generate a prompt. It contains a text string (“the template”), that can take in a set of parameters from the end user and generate a prompt.

The prompt template may contain:

- instructions to the language model,

- a set of few shot examples to help the language model generate a better response,

- a question to the language model.

In the previous example, the text we passed to the model contained an instruction to generate a brand name from a description. For our application, it would be better if the user only had to provide the description of a company or product, without having to write the instruction for the model.

In [10]:
from langchain.prompts import PromptTemplate

template_prompt = PromptTemplate.from_template("What is a good name for a brand that makes {product}?")

prompt = template_prompt.format(product="local burger")

print(prompt)

What is a good name for a brand that makes local burger?


Notice that the instruction changes automatically based on the user input. This formatted prompt is then passed to `llm` to generate the response.

In [11]:
print(llm.predict(prompt))



BurgTowne.


Because this is a template, it can take more than one input variable. For example:

In [12]:
template = "Write a {adjective} poem about {subject}"

poem_template = PromptTemplate(
    input_variables=["adjective", "subject"],
    template=template,
)

In [13]:
poem_template.format(adjective='sad', subject='ducks')

'Write a sad poem about ducks'

In [14]:
print(llm.predict(poem_template.format(adjective='sad', subject='ducks')))



The water's so still, not a soul around
The ducks paddle in circles, the sun setting down
The lake is so empty it fills them with dread
No one to talk to, no one to share bread

The sky is now vacant and void of all sound
No one to share warmth, no one around
The ducks look in vain for their partners in flight
But the sky has remained empty, no birds in sight

The lonesome ducks feel their hearts break in two
No one is near, what else can they do?
No quacks or honks, no chatter or calls
Just quiet and sadness and these empty walls

The ducks will swim on, though their hearts may be sore
Until one day, the sky will roar
And their friends will return, with the sound of a bell
Until then, the ducks will stay in their lonely and sad little hell.


As listed earlier, a prompt template can combine instructions, few-shot examples, and a question. The next template uses all three:

In [15]:
template = """
I want you to act as a naming consultant for new companies.

Here are some examples of good company names:

- search engine, Google
- social media, Facebook
- video sharing, YouTube

The name should be short, catchy and easy to remember.

What is a good name for a brand that makes {product}?
"""

brand_template = PromptTemplate(
    input_variables=["product"],
    template=template,
)

batik_prompt = brand_template.format(product='batik')

print(llm.predict(batik_prompt))


BatikBliss
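
When the few-shot examples live in a list rather than in a hand-written string, the template text can be assembled programmatically. The sketch below uses plain Python string formatting and the same three examples as the cell above; `build_few_shot_template` is a hypothetical helper, not part of LangChain.

```python
# The same few-shot examples as above: (business description, brand name).
examples = [
    ("search engine", "Google"),
    ("social media", "Facebook"),
    ("video sharing", "YouTube"),
]


def build_few_shot_template(examples):
    """Return a template string that still contains the {product} placeholder."""
    example_lines = "\n".join(f"- {desc}, {name}" for desc, name in examples)
    return (
        "I want you to act as a naming consultant for new companies.\n\n"
        "Here are some examples of good company names:\n\n"
        + example_lines
        + "\n\nThe name should be short, catchy and easy to remember.\n\n"
        "What is a good name for a brand that makes {product}?"
    )


few_shot_template = build_few_shot_template(examples)
print(few_shot_template.format(product="batik"))
```

The resulting string could then be passed as the `template` argument of `PromptTemplate`, as in the cell above.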


### Chain

Now that we have a model and a prompt template, we want to combine the two by chaining them together. Chains give us a way to link (or chain) together multiple primitives, like models, prompts, and other chains.

The simplest and most common type of chain is an LLMChain, which passes an input first to a PromptTemplate and then to an LLM. We can construct an LLM chain from our existing model and prompt template.

For example, to generate a response from our template, the workflow would be:

1. Create the prompt based on input with `template_prompt`

In [16]:
prompt = template_prompt.format(product="rendang mozarella")

print(prompt)

What is a good name for a brand that makes rendang mozarella?


2. Generate response from prompt with `llm`

In [17]:
print(llm.predict(prompt))



Mozarella Rendang Co.


We can simplify this workflow by linking the two steps with a chain:

In [18]:
from langchain.chains import LLMChain

chain = LLMChain(llm=llm, prompt=template_prompt)

In [19]:
print(chain.run('rendang mozarella'))



Mozarella Rendang Delights


Notice that with a chain, each new input needs only one line of code to generate a response. Understanding how this simple chain works will set you up well for working with more complex chains.
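
Conceptually, a chain of this kind is just "format the prompt, then call the model". The toy sketch below shows that idea in plain Python. Both `fake_llm` and `SimpleChain` are hypothetical stand-ins written for illustration; they are not how `LLMChain` is actually implemented, and no API key is needed to run them.

```python
def fake_llm(prompt):
    """Stand-in for a real model call; it only echoes the prompt."""
    return f"[model response to: {prompt}]"


class SimpleChain:
    """Toy chain: fill the template, then hand the prompt to the model."""

    def __init__(self, llm, template):
        self.llm = llm
        self.template = template

    def run(self, **variables):
        prompt = self.template.format(**variables)  # step 1: build the prompt
        return self.llm(prompt)  # step 2: call the model


toy_chain = SimpleChain(fake_llm, "What is a good name for a brand that makes {product}?")
print(toy_chain.run(product="rendang mozarella"))
```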

### Agents

We had a plan for our initial chain to follow specific steps. However, for more complicated workflows, it's important to be able to pick actions based on the situation.

Agents help us do exactly that. They use a language model to figure out which actions to take and in what order. These agents have tools at their disposal, and they keep selecting a tool, running it, and examining the results until they find the ultimate solution.

To load an agent, you need to choose a(n):

- **LLM/Chat model:** The language model powering the agent.

- **Tool(s):** A function that performs a specific duty. This can be things like: Google Search, Database lookup, Python REPL, other chains. For a list of predefined tools and their specifications, see the Tools documentation.

- **Agent name:** A string that references a supported agent class. An agent class is largely parameterized by the prompt the language model uses to determine which action to take. Because this notebook focuses on the simplest, highest-level API, it only covers the standard supported agents. For custom agents and the full list of supported agents and their specifications, see the LangChain documentation.
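
The "select a tool, run it, examine the result, repeat" loop can be sketched without any LLM at all. In the toy example below, `scripted_policy` is a hypothetical stand-in for the language model's decision step and the two tools are trivial functions; a real agent would ask the model which tool to use next.

```python
def text_length(text):
    """Toy tool: count the characters in a string."""
    return len(text)


def square(number):
    """Toy tool: square a number."""
    return number ** 2


TOOLS = {"length": text_length, "square": square}


def scripted_policy(observations):
    """Stand-in for the LLM: pick the next tool from what has been observed so far."""
    if "length" not in observations:
        return ("length", "LangChain")
    if "square" not in observations:
        return ("square", observations["length"])
    return None  # nothing left to do, so the agent can stop


def run_agent(policy, tools, max_steps=5):
    observations = {}
    for _ in range(max_steps):
        action = policy(observations)
        if action is None:
            break
        tool_name, tool_input = action
        # Run the chosen tool and record what it returned.
        observations[tool_name] = tools[tool_name](tool_input)
    return observations


print(run_agent(scripted_policy, TOOLS))
```

The `max_steps` limit is a design choice in this sketch: it keeps the loop from running forever if the policy never decides to stop.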

For this example, we'll use the `wikipedia` tool to look up information on Wikipedia and the `llm-math` tool to do the calculation.

In [20]:
from langchain.agents import AgentType, initialize_agent, load_tools

# The language model we're going to use to control the agent.
llm_agent = OpenAI(temperature=0)

# The tools we'll give the Agent access to. Note that the 'llm-math' tool uses an LLM, so we need to pass that in.
tools = load_tools(["wikipedia", "llm-math"], llm=llm_agent)

# Finally, let's initialize an agent with the tools, the language model, and the type of agent we want to use.
agent = initialize_agent(tools, llm_agent, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)

Let's test it out

In [21]:
agent.run("What year did Lionel Messi Joined Barcelona? What is his current age raised to the 0.43 power?")



> Entering new chain...
 I need to find out when Messi joined Barcelona and then calculate his current age raised to the 0.43 power.
Action: Wikipedia
Action Input: Lionel Messi
Observation: Page: Lionel Messi
Summary: Lionel Andrés Messi (Spanish pronunciation: [ljoˈnel anˈdɾes ˈmesi] (listen); born 24 June 1987), also known as Leo Messi, is an Argentine professional footballer who plays as a forward and captains the Argentina national team. Widely regarded as one of the greatest players of all time, Messi has won a record seven Ballon d'Or awards and a record six European Golden Shoes, and in 2020 he was named to the Ballon d'Or Dream Team. Until leaving the club in 2021, he had spent his entire professional career with Barcelona, where he won a club-record 34 trophies, including ten La Liga titles, seven Copa del Rey titles and the UEFA Champions League four times. With his country, he won the 2021 Copa América and the 2022 FIFA World Cup. A 

'Lionel Messi joined Barcelona in 2004 and his current age raised to the 0.43 power is 3.9218486893172186.'

# Build Question Answering System

This section covers:
- Introduction to Question-Answering System
- Connecting datasource (database and text data) with LLM
- Basics of building Question-Answering System using LLM with database and text data
- Demonstration of using OpenAI and LangChain to build a Question-Answering System

## Intro to Question-Answer System

As we know, LangChain is an open-source library that offers developers a comprehensive set of resources for building applications that run on Large Language Models (LLMs). In the previous examples, we passed a question to the model and the LLM generated a response based on its internal knowledge.

But what if the question is specific to our own business domain? For example, what if we ask the LLM a question about our company, such as "Which product generates the most revenue?"
An LLM's knowledge is limited to its training data, so it cannot answer questions about documents it has never seen.

One way to address this limitation is to give the LLM additional information so it can answer questions about material it was not trained on. The basic idea is to first retrieve the relevant documents from a corpus, then pass those documents as context, along with the original question, to the LLM. The LLM then generates a response that is informed by the retrieved documents.

These documents can be any source that stores information: a database, a PDF, a text file, or even a website.

In this module, we will explore how to connect a database and text data to an LLM to build a Question-Answering System.
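
The retrieve-then-ask idea can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions: relevance is scored by word overlap rather than by embeddings, the two documents are made up, and `retrieve` and `build_prompt` are hypothetical helpers.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, k=1):
    """Rank documents by how many words they share with the question."""
    q_words = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q_words & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_docs):
    """Put the retrieved documents in front of the question."""
    context = "\n".join(context_docs)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical company documents, for illustration only.
documents = [
    "Product A generated the most revenue last quarter.",
    "The office is closed on public holidays.",
]

question = "What is the product that generates the most revenue?"
prompt = build_prompt(question, retrieve(question, documents))
print(prompt)
```

The resulting prompt is what would be sent to the LLM, so the model answers from the supplied context instead of from its training data alone.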

## Database

### Connecting to Database

LangChain provides `SQLDatabase`, a class that connects a database to an LLM. It also provides `SQLDatabaseChain`, which chains together the database and the LLM so that SQL queries are generated and executed from a natural-language prompt.

In [22]:
from langchain import SQLDatabase, SQLDatabaseChain

Next, we load the data. We will use the Chinook database from our academy class as an example. The connection URI must state explicitly which kind of database is being loaded, for example `sqlite:///`.

Then load the database with `SQLDatabase` from `langchain`.

In [23]:
dburi = "sqlite:///data_input/chinook.db"
db = SQLDatabase.from_uri(dburi)

After that, we chain `db` to the model. The resulting chain looks for the answer in the Chinook database based on the input prompt.

Let's ask how many rows there are in the `tracks` table of this database.

In [9]:
llm = OpenAI(temperature=0) # parameter temperature

In [24]:
db_chain = SQLDatabaseChain(llm=llm, database=db, verbose=True)

db_chain.run("How many rows is in the tracks table of this db?")





> Entering new chain...
How many rows is in the tracks table of this db?
SQLQuery: SELECT COUNT(*) AS 'Number of Tracks' FROM tracks;
SQLResult: [(3503,)]
Answer: 3503
> Finished chain.


'3503'

Notice the components in the output: `SQLQuery` shows the SQL the model wrote to find the answer; `SQLResult` is the result of running that query against our database; and `Answer` is the `SQLResult` converted to natural language.
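
These three stages can be reproduced by hand with Python's built-in `sqlite3` module. The sketch below uses a small in-memory table with made-up rows instead of the Chinook file, and the query and answer strings are written manually where the chain would ask the LLM to produce them.

```python
import sqlite3

# In-memory stand-in for the real database, with hypothetical rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (TrackId INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany(
    "INSERT INTO tracks (Name) VALUES (?)",
    [("Track one",), ("Track two",), ("Track three",)],
)

# SQLQuery: in the chain, the model writes this from the question.
sql_query = "SELECT COUNT(*) FROM tracks;"

# SQLResult: what the database returns for that query.
sql_result = conn.execute(sql_query).fetchall()

# Answer: in the chain, the model rephrases the result in natural language.
answer = f"There are {sql_result[0][0]} rows in the tracks table."

print(sql_result)
print(answer)
```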

### Basics of Building Question-Answer System using LLM

The `SQLDatabaseChain` allows you to answer questions over a SQL database.

As another example, we use a question from our Dive Deeper exercise.

> all sales in rock genre in 2012

In [25]:
db_chain.run("all sales in rock genre in 2012 based on invoice")



> Entering new chain...
all sales in rock genre in 2012 based on invoice
SQLQuery: SELECT "Total"
FROM invoices
INNER JOIN invoice_items
ON invoices."InvoiceId" = invoice_items."InvoiceId"
INNER JOIN tracks
ON invoice_items."TrackId" = tracks."TrackId"
INNER JOIN genres
ON tracks."GenreId" = genres."GenreId"
WHERE genres."Name" = "Rock"
AND invoices."InvoiceDate" >= date('2012-01-01')
AND invoices."InvoiceDate" <= date('2012-12-31')
LIMIT 5;
SQLResult: [(8.91,), (8.91,), (8.91,), (8.91,), (13.86,)]
Answer: The total sales of rock genre in 2012 based on invoice is 8.91, 8.91, 8.91, 8.91, 13.86.
> Finished chain.


'The total sales of rock genre in 2012 based on invoice is 8.91, 8.91, 8.91, 8.91, 13.86.'

> We want the returned DataFrame to contain only the Pop genre and only when the UnitPrice of the track is 0.99

In [26]:
db_chain.run("We want the returned DataFrame to contain only the Pop genre and only when the UnitPrice of the track is 0.99")



> Entering new chain...
We want the returned DataFrame to contain only the Pop genre and only when the UnitPrice of the track is 0.99
SQLQuery: SELECT TrackId, Name, GenreId, UnitPrice FROM tracks WHERE GenreId=3 AND UnitPrice=0.99 LIMIT 5;
SQLResult: [(77, 'Enter Sandman', 3, 0.99), (78, 'Master Of Puppets', 3, 0.99), (79, 'Harvester Of Sorrow', 3, 0.99), (80, 'The Unforgiven', 3, 0.99), (81, 'Sad But True', 3, 0.99)]
Answer: The returned DataFrame contains only the Pop genre and only when the UnitPrice of the track is 0.99.
> Finished chain.


'The returned DataFrame contains only the Pop genre and only when the UnitPrice of the track is 0.99.'

In [28]:
db_chain.run("Tampilkan lagu dengan Genre Pop")



> Entering new chain...
Tampilkan lagu dengan Genre Pop
SQLQuery: SELECT "Name" FROM tracks WHERE "GenreId"=3 LIMIT 5;
SQLResult: [('Enter Sandman',), ('Master Of Puppets',), ('Harvester Of Sorrow',), ('The Unforgiven',), ('Sad But True',)]
Answer: Enter Sandman, Master Of Puppets, Harvester Of Sorrow, The Unforgiven, Sad But True
> Finished chain.


'Enter Sandman, Master Of Puppets, Harvester Of Sorrow, The Unforgiven, Sad But True'

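> Notes: Most of the generated queries end with `LIMIT 5`. This appears to come from the chain's default prompt, which asks for at most `top_k` results (5 by default), rather than from the model itself. We still get the generated query, so we can run it ourselves without the limit to get all rows. The generated SQL should also be checked: the last two queries filter on `GenreId=3` without looking up the Pop genre in the `genres` table.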
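
One way to get every row is to rerun the generated query without its `LIMIT` clause. The sketch below assumes simple queries where `LIMIT n` is the last clause; `strip_limit` is a hypothetical helper, and the in-memory table holds made-up rows rather than the Chinook data.

```python
import re
import sqlite3


def strip_limit(sql):
    """Remove a trailing 'LIMIT n' clause (simple cases only)."""
    return re.sub(r"\s+LIMIT\s+\d+\s*;?\s*$", ";", sql, flags=re.IGNORECASE)


# In-memory stand-in with hypothetical rows.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE tracks (Name TEXT, GenreId INTEGER)")
demo_conn.executemany(
    "INSERT INTO tracks VALUES (?, ?)",
    [(f"Song {i}", 3) for i in range(8)],
)

generated = 'SELECT "Name" FROM tracks WHERE "GenreId"=3 LIMIT 5;'
full_query = strip_limit(generated)

print(len(demo_conn.execute(generated).fetchall()))   # row count with the limit
print(len(demo_conn.execute(full_query).fetchall()))  # row count without it
```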

## Text Dataset (CSV)

### Connecting to CSV

Structured data is not only stored in databases; other examples are .xlsx and .csv files, which store data as tables (columns and rows). Likewise, LangChain provides not only a chain that answers questions from a database using SQL, but also an agent that answers natural-language questions from tabular text sources such as CSV. This section demonstrates how to use that agent.

First, let's define the path to our dataset `rice.csv`, which contains transactions in the rice category.

In [29]:
filepath = "data_input/rice.csv"

Then we create the agent, this time a CSV agent. We use the same LLM as in the SQL part, so we don't need to redefine `llm`.

In [30]:
from langchain.agents import create_csv_agent
agent = create_csv_agent(llm, filepath, verbose=True)

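Then we simply ask a question about our data. The prompt below is in Bahasa Indonesia and means "give the details of the number of transactions that occurred in each format".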

In [31]:
agent.run("berikan detail banyaknya transaksi yang terjadi di setiap format")



> Entering new chain...
Thought: saya harus mencari total pembelian dalam setiap format
Action: python_repl_ast
Action Input: df.groupby('format')['quantity'].sum()
Observation: format
hypermarket    1464
minimarket     9578
supermarket    4953
Name: quantity, dtype: int64
Thought: Saya sekarang tahu jawabannya
Final Answer: Hypermarket: 1464; Minimarket: 9578; Supermarket: 4953

> Finished chain.


'Hypermarket: 1464; Minimarket: 9578; Supermarket: 4953'

Notice that there are more components in the output:
- `Thought`: the agent's reasoning about how to solve the problem in the prompt, and what it concludes from the result of its action. Here the agent reasons in Bahasa Indonesia: "saya harus mencari total pembelian dalam setiap format" means "I need to find the total purchases in each format", and "Saya sekarang tahu jawabannya" means "I now know the answer".
- `Action`: the tool the agent used to solve the problem. Here it is `python_repl_ast`, a Python shell, so the agent uses Python to solve the problem. `Action Input` shows the `pandas` command the agent ran on the CSV data.
- `Final Answer`: the natural-language answer built from the result of `Action Input`.
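
The aggregation the agent chose can be reproduced with the standard library, which also makes it easy to compare two readings of the question. In the sketch below the rows are made up, and only the column names `format` and `quantity` are taken from the agent's output. Summing `quantity` is what the agent did; counting rows is another possible meaning of "number of transactions", and which one is right depends on how the dataset defines a transaction.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical rows standing in for rice.csv; only the two columns used here.
raw = """format,quantity
minimarket,2
minimarket,3
supermarket,4
hypermarket,1
"""

quantity_per_format = defaultdict(int)  # like df.groupby('format')['quantity'].sum()
rows_per_format = Counter()             # like df.groupby('format').size()

for row in csv.DictReader(io.StringIO(raw)):
    quantity_per_format[row["format"]] += int(row["quantity"])
    rows_per_format[row["format"]] += 1

print(dict(quantity_per_format))
print(dict(rows_per_format))
```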

## Demonstration of using OpenAI and LangChain to build a Question-Answering System for Unstructured Data

Company documents are not only stored as structured data; many are unstructured, such as meeting summaries, task reports, and product descriptions. To gather information or answer a question about those documents, a person would have to find the related documents and search for the answer manually.

What if we want the LLM to find the answer instead? What if we have a document or regulation that is specific to our company? We can supply our own information to the LLM. In this part, we use OpenAI Embeddings to build a question-answering system based on our company documents.

In [10]:
from langchain.document_loaders import DirectoryLoader, TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA

embeddings = OpenAIEmbeddings()

Let's use `summary.txt`, which contains a summary of coal news for Australia, Indonesia, and China.

In [11]:
loader = TextLoader('data_input/summary.txt')

documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=2500, chunk_overlap=0)
texts = text_splitter.split_documents(documents)
#texts
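
The roles of `chunk_size` and `chunk_overlap` can be illustrated with a simplified splitter. This sketch cuts at fixed character positions; as far as we know, `CharacterTextSplitter` instead splits on a separator (blank lines by default) and merges the pieces up to `chunk_size`, so treat `split_text` as a hypothetical helper that shows the idea only.

```python
def split_text(text, chunk_size, chunk_overlap=0):
    """Cut text into pieces of at most chunk_size characters.

    Each piece starts chunk_overlap characters before the previous one ended.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


sample_text = "abcdefghijklmnopqrstuvwxyz"
print(split_text(sample_text, chunk_size=10))
print(split_text(sample_text, chunk_size=10, chunk_overlap=2))
```

With an overlap, the end of one chunk is repeated at the start of the next, which helps when a sentence would otherwise be cut in half at a chunk boundary.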

In [12]:
docsearch = Chroma.from_documents(texts, embeddings)

qa_chain = RetrievalQA.from_chain_type(
    llm=OpenAI(), 
    chain_type="stuff",
    retriever=docsearch.as_retriever()
)
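
Behind the retriever, each chunk and each question is turned into a vector, and the chunks whose vectors are closest to the question are returned. The toy sketch below uses cosine similarity on made-up 3-dimensional vectors; real embeddings have many more dimensions, and the metric and index that Chroma uses internally may differ from this illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical "embeddings" for three chunks and one question.
chunk_vectors = {
    "chunk about Australia": [0.9, 0.1, 0.0],
    "chunk about Indonesia": [0.1, 0.9, 0.1],
    "chunk about China": [0.0, 0.2, 0.9],
}
question_vector = [0.2, 0.8, 0.1]

best = max(
    chunk_vectors,
    key=lambda name: cosine_similarity(question_vector, chunk_vectors[name]),
)
print(best)
```

With `chain_type="stuff"`, the retrieved chunks are then placed together into a single prompt along with the question.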

In [13]:
qa_chain.run("What are the effects of legislations surrounding emissions on the Australia coal market?")

" Australia's coal and gas exports may reduce by half within the next five years due to the efforts of Asian countries to decrease greenhouse gas emissions. The earnings of minerals and energy exports are predicted to reach $464bn in 2022-23 from $128bn in thermal coal exports and $91bn in liquidified natural gas (LNG) exports. Coal producers are in talks with the government of New South Wales, following the government's announcement that coal miners should reserve up to 10% of production for domestic supply to control rising energy costs in Australia. Exports of coal are essential to the Australian economy, with 80% of the country's coal exported, yet the move comes as coal prices rise nearly 50% YoY."

In [14]:
qa_chain.run("Is there an export ban on Coal in Indonesia? Why?")

" Yes, Indonesia has imposed a ban on coal exports. This is to ensure adequate supply for the country's state-owned electricity companies."

In [15]:
qa_chain.run("Who are the main exporters of Coal to China? What is the role of Indonesia in this?")

' The main exporters of coal to China are Indonesia, Russia, and Mongolia. Indonesia is the largest exporter of coal to China, accounting for 58.3% of total imports, followed by Russia at 23.3% and Mongolia at 10%.'

# Hugging Face

## Using Other LLM Models

### Hugging Face

Hugging Face is a company that specializes in natural language processing (NLP) and develops a variety of tools, libraries, and models to facilitate NLP tasks. The company is known for creating and maintaining an open-source library called "transformers," which has gained significant popularity in the NLP community.

The Transformers library, developed by Hugging Face, provides a wide range of pre-trained models and utilities for tasks such as text classification, named entity recognition, sentiment analysis, machine translation, question answering, and more. These pre-trained models are based on state-of-the-art architectures, such as the GPT (Generative Pre-trained Transformer) models developed by OpenAI.

In addition to the Transformers library, Hugging Face offers a platform called the "Hugging Face Hub." This platform serves as a central repository where users can access, share, and collaborate on models and datasets. It provides a seamless integration with the Transformers library, allowing users to easily download and use pre-trained models.

Hugging Face has made significant contributions to the NLP community by democratizing access to pre-trained models and fostering collaboration among researchers and developers. Their tools and resources have greatly simplified the process of building NLP applications and have contributed to the rapid advancement of the field.

LangChain also lets us connect to the Hugging Face API, so we can use other Transformer models.

In [17]:
from langchain import HuggingFaceHub, LLMChain
from langchain.prompts import PromptTemplate

This is similar to the OpenAI part. The only difference is that we need to specify which model to use (`repo_id`), then set model parameters such as temperature.

In [18]:
hub_llm = HuggingFaceHub(
    repo_id='gpt2',
    model_kwargs={'temperature': 0}
)


As the name `LLMChain` suggests, we need to create a chain, so we first provide a prompt template for the model. Then we create the chain.

In [19]:
prompt = PromptTemplate(
    input_variables=["question"],
    template="""Question: {question}"""
)

hub_chain = LLMChain(prompt=prompt, llm=hub_llm, verbose=True)

Let's ask an interesting question.

In [20]:
hub_chain.run("who won FIFA World Cup in the year 1994?")



> Entering new chain...
Prompt after formatting:
Question: who won FIFA World Cup in the year 1994?

> Finished chain.


'\n\nA: The winner of the'

## Integrating HuggingFace's Inference API into the previously built Question-Answering System

In [21]:
from langchain import SQLDatabase, SQLDatabaseChain

dburi = "sqlite:///data_input/chinook.db"
db = SQLDatabase.from_uri(dburi)

We just need to replace the LLM with a Hugging Face model. Here we use a T5 model fine-tuned to translate English to SQL, wrapped in an `LLMChain` with a translation prompt.

In [22]:
hf_llm_t5 = HuggingFaceHub(
    repo_id='mrm8488/t5-base-finetuned-wikiSQL',
    model_kwargs={'temperature': 0}
)

In [44]:
prompt_db = PromptTemplate(
    input_variables=['question'],
    template="Translate English to SQL: {question}"
)

In [64]:
hub_chain = LLMChain(prompt=prompt_db, llm=hf_llm_t5, verbose=True)

hub_chain.run("How many rows is in the tracks's table?")



> Entering new chain...
Prompt after formatting:
Translate English to SQL: How many rows is in the tracks's table?

> Finished chain.


'SELECT COUNT Rows FROM table WHERE Tracks = Tracks'

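The generated query is not correct: it should read `FROM tracks` instead of `FROM table`. Note that this `LLMChain` only translates the question into SQL; unlike `SQLDatabaseChain`, it is not given the database schema and does not run the query. This illustrates the difference between this T5 model and the OpenAI model used earlier.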
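
Because a generated query may not be valid, it can help to check it against the database before trusting it. The sketch below is one possible check, not part of LangChain: `can_run` is a hypothetical helper that asks SQLite to compile the statement with `EXPLAIN`, and the in-memory table stands in for the real `tracks` table.

```python
import sqlite3


def can_run(conn, sql):
    """Return True if SQLite can compile the statement against this database."""
    try:
        # EXPLAIN makes SQLite compile the statement and describe its plan
        # instead of returning the query's own rows.
        conn.execute("EXPLAIN " + sql)
        return True
    except sqlite3.Error:
        return False


# In-memory stand-in for the real database.
check_conn = sqlite3.connect(":memory:")
check_conn.execute("CREATE TABLE tracks (TrackId INTEGER PRIMARY KEY, Name TEXT)")

print(can_run(check_conn, "SELECT COUNT(*) FROM tracks"))
print(can_run(check_conn, "SELECT COUNT Rows FROM table WHERE Tracks = Tracks"))
```

A check like this only catches queries that cannot be compiled; a query that compiles can still answer the wrong question.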

However, Hugging Face provides many models that can be used for specific tasks.

While GPT-2 is a powerful language model with impressive capabilities, it does have certain weaknesses:

- Lack of Factual Accuracy: GPT-2 generates text based on patterns and examples it has learned from training data, but it does not have a built-in mechanism to verify the accuracy of the information it generates. As a result, it may sometimes produce plausible-sounding but factually incorrect or misleading responses.

- Limited Context Understanding: GPT-2's understanding of context is limited to a fixed window of text. It does not possess true contextual understanding or long-term memory. This can lead to issues where it fails to maintain coherence or consistency when generating longer passages of text.

- Sensitivity to Input Phrasing: GPT-2 can be sensitive to slight changes in input phrasing, which can sometimes result in different or inconsistent responses. Small modifications in the wording of a prompt can lead to significantly different outputs, making it challenging to control the model's behavior consistently.

- Lack of Explainability: GPT-2 operates as a black box model, meaning it does not provide explanations for the reasoning behind its generated responses. This lack of transparency can make it difficult to understand how and why the model arrives at certain conclusions or generates specific outputs, making it less suitable for applications where explainability is critical.

- Propensity for Biased Outputs: GPT-2 is trained on vast amounts of text data from the internet, which can contain biases and reflect societal prejudices. As a result, the model may inadvertently generate biased or prejudiced responses, perpetuating or amplifying existing biases present in the training data.

- Vulnerability to Adversarial Inputs: GPT-2 can be susceptible to generating misleading or nonsensical outputs when presented with adversarial inputs or deliberate attempts to manipulate its behavior. Adversarial examples can exploit weaknesses in the model's training data or architecture, leading to unreliable or undesirable responses.

It's important to note that some of these weaknesses have been addressed or mitigated in subsequent models like GPT-3 and GPT-4, which have shown improvements in contextual understanding, bias handling, and fine-grained control.