**Coursebook: Creating Conversational AI with Large Language Models for Business**

- Part 2 of Large Language Models Specialization
- Course Length: 9 hours
- Last Updated: July 2023

---

Developed by Algoritma's Research and Development division

## Background

This coursebook is part of the **Large Language Models Specialization** developed by [Algoritma](https://algorit.ma/). It is intended for a restricted audience only, i.e. the individuals and organizations that received it directly from the training organization. It may not be reproduced, distributed, translated, or adapted in any form outside these individuals and organizations without permission.

Algoritma is a data science education center based in Jakarta. We organize workshops and training programs to help working professionals and students gain mastery in various data science sub-fields: data visualization, machine learning, data modeling, statistical inference, etc.

# Creating Conversational AI with Large Language Models for Business

## Training Objective

In this module, we will embark on a journey to explore the fascinating world of Conversational AI and its applications in various business domains. We will delve into the principles, techniques, and best practices for harnessing the potential of LLMs to build robust and effective conversational systems. Whether you are a business professional, data scientist, or developer, this book will equip you with the knowledge and skills needed to leverage LLMs for creating advanced conversational AI solutions tailored to the specific needs of your organization.

- **Large Language Models: Architecture, Transformer, and Key Concepts**
   - Overview of Large Language Models and their Architecture
   - Understanding the Transformer
   - Explanation of pre-training and fine-tuning of language models
   - Introduction to popular Large Language Models like GPT-3, GPT-2, and BERT
   - Understanding the capabilities and limitations of Large Language Models
   - Explanation of the LangChain concept
   - Setting the API key and .env


- **Building Question-Answering Systems with Large Language Models**
   - Introduction to Question-Answering System
   - Steps involved in connecting databases with LLM
   - Basics of building a Question-Answering System using LLM with a database
   - Demonstration of using OpenAI and LangChain to build a Question-Answering System
   - Using LangChain and OpenAI to build a Question-Answering System with text data
   - Steps involved in connecting CSV data with LLM
   - Demonstration of using LangChain and OpenAI to build a Question-Answering System with text data


- **Text Generation with HuggingFace**
   - Introduction to the Text Generation model in HuggingFace
   - Setting the .env token key
   - Applying HuggingFace's Inference API to use LLM without OpenAI credits
   - Integrating HuggingFace's Inference API into the previously built Question-Answering System
   - Demonstration of using HuggingFace's Inference API to build a Question-Answering System

# Large Language Models


**What are Large Language Models?**

Large Language Models (LLMs) are an advanced type of language model that represents a breakthrough in the field of natural language processing (NLP). These models are designed to understand and generate human-like text by leveraging the power of deep learning algorithms and massive amounts of data.

If you've ever chatted with a virtual assistant or interacted with an AI customer service agent, you might have interacted with a large language model without even realizing it. These models have a wide range of applications, from chatbots to language translation to content creation.

Some of the most impressive large language models are developed by OpenAI. Their GPT-3 model, for example, has over [175 billion parameters](https://www.techtarget.com/searchenterpriseai/definition/GPT-3#:~:text=GPT%2D3%20has%20more%20than,(BERT)%20and%20Turing%20NLG.) and is able to perform tasks like [summarization](https://wandb.ai/mostafaibrahim17/ml-articles/reports/Compressing-the-Story-The-Magic-of-Text-Summarization--VmlldzozNTYxMjc2), [question-answering](https://wandb.ai/mostafaibrahim17/ml-articles/reports/The-Answer-Key-Unlocking-the-Potential-of-Question-Answering-With-NLP--VmlldzozNTcxMDE3), and even creative writing.



**How is a Large Language Model Built?**

The architecture of LLMs is based on the Transformer model, which has revolutionized NLP tasks. The Transformer model utilizes a self-attention mechanism that allows the model to focus on different parts of the input sequence, capturing dependencies and relationships between words more effectively. This architecture enables LLMs to generate coherent and contextually relevant responses, making them valuable tools for a wide range of applications.

A large-scale transformer model known as a “large language model” is typically too massive to run on a single computer and is, therefore, provided as a service over an API or web interface. These models are trained on vast amounts of text data from sources such as books, articles, websites, and numerous other forms of written content. By analyzing the statistical relationships between words, phrases, and sentences through this training process, the models can generate coherent and contextually relevant responses to prompts or queries.

OpenAI's GPT-3 model, for instance, was trained on massive amounts of internet text data, giving it the ability to work with various languages and to draw on knowledge of diverse topics. As a result, it can produce text in multiple styles. Its capabilities, including translation, text summarization, and question-answering, may seem impressive, but they are not surprising: they all come from the same mechanism, in which the model predicts a likely continuation of the prompt based on patterns learned during training.

### Understanding the Transformer

The Transformer is a type of deep learning architecture that has revolutionized the field of natural language processing. It was introduced in the paper ["Attention Is All You Need" by Vaswani et al. (2017)](https://arxiv.org/abs/1706.03762). The Transformer model employs self-attention mechanisms to capture dependencies between words in a sentence, enabling it to learn contextual relationships and generate coherent and contextually relevant text.

The Transformer architecture excels at handling text data, which is inherently sequential. It takes a text sequence as input and produces another text sequence as output, for example translating an input French sentence into English.

<img src="https://jalammar.github.io/images/t/the_transformer_3.png">

The Transformer architecture consists of two main components: the encoder and the decoder. The encoder processes the input sequence and generates a representation, which is then passed to the decoder. The decoder generates the output sequence based on the encoder's representation and previous outputs.

<img src="https://jalammar.github.io/images/t/The_transformer_encoders_decoders.png">

Here are the key components and concepts of the Transformer architecture:

1. **Positional Encoding**: Transformers incorporate positional encoding to provide the model with information about the order of words in the input sequence. Positional encoding is usually added to the input embeddings and allows the model to differentiate between the positions of words.

![positional_encoding](assets/positional_encoding.png)

2. **Self-Attention Mechanism**: Self-attention allows each word in the input sequence to attend to all other words. It computes the attention weight between each pair of words and uses them to generate a weighted sum of the word embeddings. This mechanism enables the model to capture long-range dependencies and learn contextual relationships effectively.

![self_attention](assets/self_attention.png)
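
The two components above can be sketched in a few lines of plain Python. This is a simplified illustration, not a full Transformer layer: the positional encoding follows the sinusoidal formula from "Attention Is All You Need", while the self-attention step reuses the input vectors as queries, keys, and values instead of applying the learned projection matrices a real model would have. The toy word vectors are made-up numbers.

```python
import math


def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding: one d_model-sized vector per position."""
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            # Each (sin, cos) pair of dimensions shares one frequency.
            angle = pos / (10000 ** ((2 * (i // 2)) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings):
    """Scaled dot-product self-attention with Q = K = V = embeddings (simplification)."""
    d = len(embeddings[0])
    outputs, weights = [], []
    for query in embeddings:
        # Similarity of this word to every word in the sequence
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
            for key in embeddings
        ]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of all word vectors
        outputs.append(
            [sum(w_j * vec[c] for w_j, vec in zip(w, embeddings)) for c in range(d)]
        )
    return outputs, weights


# Toy 4-dimensional "embeddings" for a three-word sentence (made-up numbers)
tokens = ["the", "cat", "sat"]
word_vectors = [[1.0, 0.0, 1.0, 0.0], [0.0, 2.0, 0.0, 2.0], [1.0, 1.0, 1.0, 1.0]]

pe = positional_encoding(len(tokens), 4)
inputs = [[w + p for w, p in zip(vec, pos)] for vec, pos in zip(word_vectors, pe)]
contextual, attention_weights = self_attention(inputs)

for token, row in zip(tokens, attention_weights):
    print(token, [round(w, 2) for w in row])
```

Each printed row shows how strongly one word attends to every word in the sentence; the weights in a row always sum to 1.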

### Pre-training and Fine-tuning of Language Models

Pre-training and fine-tuning are two key steps in the training process of language models, including Large Language Models (LLMs). 

**Pre-training**: In the pre-training phase, a language model is trained on a large corpus of unlabeled text data. During this phase, the model learns to predict words from their context, which develops its grasp of language patterns, grammar, and contextual relationships. The exact objective depends on the model: BERT uses masked language modeling, where certain words are randomly masked and the model learns to predict them from the remaining context, while GPT-style models learn to predict the next word from the words that precede it.
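
To make the pre-training objective concrete, the sketch below builds training examples from a single sentence without any labels. It shows masked language modeling (the BERT-style objective) next to next-word prediction (the objective used by GPT-style models). The function names, the masking probability, and the `[MASK]` placeholder string are illustrative choices, not taken from any specific library.

```python
import random


def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=0):
    """Masked language modeling: hide some tokens and keep them as targets."""
    rng = random.Random(seed)
    masked, labels = [], []
    for token in tokens:
        if rng.random() < mask_prob:
            masked.append(mask_token)
            labels.append(token)  # the model must recover this word
        else:
            masked.append(token)
            labels.append(None)  # position is not scored
    return masked, labels


def next_token_pairs(tokens):
    """Next-word prediction: predict each token from the tokens before it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


sentence = "the model learns language patterns from unlabeled text".split()

masked, labels = mask_tokens(sentence, mask_prob=0.3)
print(masked)
print(labels)

for context, target in next_token_pairs(sentence)[:2]:
    print(context, "->", target)
```

In both cases the "labels" come from the text itself, which is why pre-training can use unlabeled data.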

**Fine-tuning**: After pre-training, the language model is fine-tuned on specific labeled datasets for specific downstream tasks. Fine-tuning involves training the pre-trained model on labeled data related to a particular task, such as question answering, sentiment analysis, or text classification. This process allows the model to adapt to the specific task by learning task-specific patterns and improving its performance. Fine-tuning is performed on a smaller dataset, which is typically task-specific and labeled by human experts.

In the context of Large Language Models (LLMs), the terms "pre-trained" and "fine-tuned" refer to two stages in the model development process. This two-step process offers several advantages:

- Pre-training on large-scale data helps LLMs learn from a diverse range of linguistic patterns and structures, enhancing their language understanding capabilities.
- Fine-tuning allows LLMs to adapt to specific tasks or domains, improving their performance and making them more efficient in generating desired outputs.
- Fine-tuning requires comparatively less labeled data than training from scratch, making it a practical approach when labeled data is limited.

LLMs, such as GPT-3, GPT-2, and BERT, are examples of large-scale language models that have undergone extensive pre-training and fine-tuning processes. They have been trained on vast amounts of text data and have a large number of parameters. This pre-training and fine-tuning approach allows LLMs to capture complex language patterns, generate coherent text, and perform well on a wide range of natural language processing tasks.

### Popular Large Language Models 

Popular Large Language Models (LLMs) are advanced models that have gained significant attention in the field of natural language processing. They have been trained on massive amounts of text data and have a large number of parameters, allowing them to capture complex language patterns and generate coherent text.

Here are some examples of popular LLMs:

1. **[GPT-3](https://openai.com/blog/gpt-3-apps) (Generative Pre-trained Transformer 3)**: GPT-3 is a state-of-the-art language model developed by OpenAI. It is renowned for its impressive size, consisting of 175 billion parameters. GPT-3 has been trained on a vast amount of internet text data, enabling it to understand and generate human-like text. It can perform a wide range of natural language processing tasks, including language translation, text completion, sentiment analysis, and more. GPT-3 has shown remarkable capabilities in generating coherent and contextually relevant responses, making it a powerful tool for various applications.

2. **[GPT-2](https://huggingface.co/gpt2) (Generative Pre-trained Transformer 2)**: GPT-2 is the predecessor to GPT-3, also developed by OpenAI. Although smaller in size with 1.5 billion parameters, GPT-2 still delivers impressive language generation capabilities. It has been trained on diverse internet text sources, allowing it to produce high-quality text in a variety of styles and topics. GPT-2 is widely used for tasks such as text completion, text generation, and language understanding.

3. **[BERT](https://machinelearningmastery.com/a-brief-introduction-to-bert/) (Bidirectional Encoder Representations from Transformers)**: BERT is a groundbreaking language model developed by Google. It introduced the concept of bidirectional training, which significantly improved the understanding of context in natural language processing. BERT has been trained on large-scale text data and employs a transformer architecture. It excels in various language understanding tasks, including question-answering, sentiment analysis, named entity recognition, and more. BERT has set new benchmarks in several natural language processing tasks and has been widely adopted in both research and industry.


### Capabilities and limitations of Large Language Models

Large language models like GPT-3, GPT-2, and BERT exhibit impressive capabilities in tasks such as:

- text generation
- language translation
- text understanding
- sentiment analysis
- text summarization
- question answering

They can understand complex language structures, generate coherent text, and perform well on a range of natural language processing tasks. 

However, it's essential to acknowledge the limitations of LLMs:

- Bias and Ethical Concerns: LLMs can inherit biases present in the training data, leading to biased or controversial outputs. Ensuring fairness, diversity, and ethical use of LLMs is a critical challenge.
- Contextual Understanding: While LLMs can generate coherent text, they may sometimes struggle with understanding the broader context or resolving ambiguous statements.
- Lack of Real-World Knowledge: LLMs are trained on vast amounts of text data, but they lack true real-world experience and common-sense reasoning abilities. They may provide accurate information but lack true understanding.
- Computational Requirements: LLMs are computationally intensive, requiring significant computational resources for training and inference. This can limit their accessibility and scalability for some applications.
- Data Dependency: LLMs heavily rely on the quality and diversity of the training data. Inadequate or biased data can impact their performance and generalization capabilities.

## Introduction to LangChain

[LangChain](https://python.langchain.com/docs/get_started/introduction.html) is a framework for developing applications powered by language models. It integrates language models, such as OpenAI's GPT-3 or GPT-2, with other tools and APIs to form a flexible language processing pipeline that addresses specific business needs.

The LangChain concept aims to leverage the strengths of each language model and API to create a comprehensive language processing system. It allows developers to combine different models for tasks like question answering, text generation, translation, summarization, sentiment analysis, and more.

The core idea of the library is that we can **“chain”** together different components to create more advanced use cases around LLMs. Chains may consist of multiple components from several modules:

1. **Prompt templates**: Templates for different types of prompts, such as “chatbot” style templates, ELI5 question-answering, etc.

2. **LLMs**: Large language models like GPT-3, BLOOM, etc.

3. **Agents**: Agents use LLMs to decide what actions should be taken. Tools like web search or calculators can be used, and all are packaged into a logical loop of operations.

4. **Memory**: Short-term memory, long-term memory.





### Environment Set-up

Using LangChain will usually require integrations with one or more model providers, data stores, APIs, etc. For this example, we'll use OpenAI's model APIs.

#### Setting API key and `.env`

Accessing the API requires an API key, which you can get by creating an account with the model provider. To set up an API key with a `.env` file in your Python project, follow these general steps:

1. **Obtain an API key**: If you're working with an external API or service that requires an API key, you need to obtain one from the provider. This usually involves signing up for an account and generating an API key specific to your project.

2. **Create a .env file**: In your project directory, create a new file and name it ".env". This file will store your API key and other sensitive information securely.

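3. **Store API key in .env**: Open the .env file in a text editor and add a line to store your API key. The format should be `API_KEY=your_api_key`, where "API_KEY" is the name of the variable and "your_api_key" is the actual value of your API key. Make sure not to include any quotes or spaces around the value. For the OpenAI examples in this coursebook, LangChain reads the key from the `OPENAI_API_KEY` variable, so the line would be `OPENAI_API_KEY=your_api_key`.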

4. **Load environment variables**: In your Python code, you need to load the environment variables from the .env file before accessing them. Import the dotenv module and add the following code at the beginning of your script:

```python
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()
```

> The `dotenv` module (from the `python-dotenv` package) simplifies the process of loading environment variables from a .env file into your Python application. It allows you to store configuration variables separately from your code, making it easier to manage sensitive information such as API keys, database credentials, or other environment-specific settings.


In [None]:
from dotenv import load_dotenv

load_dotenv()
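
Once the variables are loaded, it can help to confirm that a key is actually present before calling any API. The sketch below uses only the standard library. The helper names `require_env` and `mask_secret` and the variable `DEMO_API_KEY` are hypothetical and exist only for this illustration; the demo value is set in code so the example runs without a real `.env` file.

```python
import os


def require_env(name):
    """Return the value of an environment variable or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value


def mask_secret(value, visible=4):
    """Show only the first few characters so the key never appears in full."""
    return value[:visible] + "*" * max(len(value) - visible, 0)


# Stand-in for a value that load_dotenv() would normally load from .env
os.environ["DEMO_API_KEY"] = "demo-1234567890"

print(mask_secret(require_env("DEMO_API_KEY")))
```

Printing a masked version rather than the key itself avoids leaking the secret into notebook output.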

## LangChain Quickstart

In `LangChain`, a QuickStart involves working with three key components: Prompt, Chain, and Agent. 

The **Prompt, Chain, and Agent** components each play a distinct role. The Prompt sets the context and carries the instruction, the Chain links the prompt, the model, and other components into a single pipeline, and the Agent uses the language model to decide which actions to take.

Using these components, we can build dynamic and **interactive applications** that involve back-and-forth interactions with the language model, allowing us to create **conversational agents**, **chatbots**, **question-answering systems**, and more.

To connect `LangChain` to an OpenAI language model, we follow three steps:

1. **Importing the Required Module**: The code imports the `OpenAI` class from the LangChain library with the statement `from langchain import OpenAI`.

2. **Creating an OpenAI Instance**: The code creates an instance of the `OpenAI` class and assigns it to the variable `llm`. This instance represents the connection to the OpenAI language model.

3. **Setting the Temperature Parameter**: The `temperature` parameter is passed to the `OpenAI` instance during its initialization. Temperature is a parameter that controls the randomness of the language model's output. 
> A higher temperature value (e.g., 0.9) makes the generated text more **diverse and creative**, while a lower value (e.g., 0.2) makes it more **focused and deterministic**.


In [None]:
from langchain import OpenAI

llm = OpenAI(temperature=0.9)

By creating an instance of `OpenAI` and setting the desired temperature, we can now use the `llm` object to interact with the OpenAI language model. We can pass prompts or messages to the `llm` object, receive the generated responses, and customize the behavior of the language model using additional parameters and methods provided by the LangChain library.
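
The effect of the `temperature` parameter can be illustrated with a small calculation. In the usual formulation, the model's raw scores for candidate next words are divided by the temperature before being turned into probabilities. The candidate words and scores below are made up, and the function is a sketch of the general idea rather than OpenAI's actual implementation.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, sharpened or flattened by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


candidates = ["Burger", "Patty", "Grill"]  # hypothetical next words
logits = [2.0, 1.0, 0.5]                   # made-up model scores

for temperature in (0.2, 0.9):
    probs = softmax_with_temperature(logits, temperature)
    summary = ", ".join(f"{w}: {p:.3f}" for w, p in zip(candidates, probs))
    print(f"temperature={temperature} -> {summary}")
```

At the lower temperature almost all probability goes to the top candidate, so the output is nearly deterministic. At the higher temperature the alternatives keep a meaningful share, so repeated runs vary more. A temperature of exactly 0 cannot be divided by in this sketch; in practice it is commonly treated as always picking the top candidate.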

### Prompt

#### Basic Prompt

**Prompt** refers to the initial input or instruction given to the language model to generate a response. It sets the context and provides guidance for the language model to produce relevant and coherent text. 

In this example, the prompt asks for a suggestion of a good name for a brand specializing in local burgers. 

In [None]:
prompt = "What is a good name for a brand that makes local burger?"
print(llm.predict(prompt))

The language model then uses its knowledge and training to generate a response that fits the given prompt. Notice that each re-run generates a new answer, because a temperature of 0.9 makes the output more random.

> The `llm.predict()` function is called with the prompt as the input. This function sends the prompt to the language model and **generates** a response based on the given input. The generated text represents the **language model's prediction** or **completion** of the prompt.


We can also prompt in other languages. Let's try Bahasa Indonesia:

In [None]:
# Indonesian prompt: "A good name for a brand that makes mentai fried bananas?"
print(llm.predict("Nama yang bagus untuk brand yang membuat pisang goreng mentai?"))

#### Prompt Templates

Most LLM applications do not pass input directly into an LLM. Usually they will add the user input to a larger piece of text, called a prompt template.

A prompt template refers to a reproducible way to generate a prompt. It contains a text string (“the template”), that can take in a set of parameters from the end user and generate a prompt.

The prompt template may contain:

- instructions to the language model,
- a set of few shot examples to help the language model generate a better response,
- a question to the language model.

In the previous example, the text we passed to the model contained an instruction to generate a brand name based on a description. For our application, it would be great if the user only had to provide the description of a company or product, without having to worry about giving the model instructions.

In [None]:
from langchain.prompts import PromptTemplate

template_prompt = PromptTemplate.from_template("What is a good name for a brand that makes {product}?")

prompt = template_prompt.format(product="local burger")

print(prompt)

Notice that the instruction changes automatically based on the user input. This instruction is then passed to `llm` to generate the response.

In [None]:
print(llm.predict(prompt))
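
To see what a prompt template does conceptually, here is a minimal stand-in written with the standard library only. `SimplePromptTemplate` is a hypothetical class for illustration and is not LangChain's `PromptTemplate`; it only shows the idea of a string with named placeholders that is filled in from user input.

```python
from string import Formatter


class SimplePromptTemplate:
    """Hypothetical stand-in for a prompt template (not LangChain's class)."""

    def __init__(self, template):
        self.template = template
        # Collect the {placeholder} names in order of first appearance
        self.input_variables = []
        for _, field, _, _ in Formatter().parse(template):
            if field and field not in self.input_variables:
                self.input_variables.append(field)

    def format(self, **kwargs):
        # Fail early with a readable message if an input is missing
        missing = [v for v in self.input_variables if v not in kwargs]
        if missing:
            raise KeyError(f"missing input variables: {missing}")
        return self.template.format(**kwargs)


simple = SimplePromptTemplate("What is a good name for a brand that makes {product}?")
print(simple.input_variables)
print(simple.format(product="local burger"))
```

The user supplies only the product description; the surrounding instruction stays fixed inside the template.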

Because this is a template, it can also take more than one input variable, for example:

In [None]:
template = "Write a {adjective} poem about {subject}"

poem_template = PromptTemplate(
    input_variables=["adjective", "subject"],
    template=template,
)

In [None]:
poem_template.format(adjective='sad', subject='ducks')

In [None]:
print(llm.predict(poem_template.format(adjective='sad', subject='ducks')))

A template can also combine all three elements listed earlier: instructions, a set of few-shot examples, and a question. The following template does exactly that:

In [None]:
template = """
I want you to act as a naming consultant for new companies.

Here are some examples of good company names:

- search engine, Google
- social media, Facebook
- video sharing, YouTube

The name should be short, catchy and easy to remember.

What is a good name for a brand that makes {product}?
"""

brand_template = PromptTemplate(
    input_variables=["product"],
    template=template,
)

batik_prompt = brand_template.format(product='batik')

print(llm.predict(batik_prompt))
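
When the few-shot examples live in a list rather than in a hand-written string, the same prompt can be assembled programmatically. The helper `build_few_shot_prompt` below is a hypothetical function written for this illustration using plain string formatting; it is not part of LangChain.

```python
def build_few_shot_prompt(instruction, examples, question):
    """Assemble an instruction, example pairs, and a question into one prompt."""
    example_lines = "\n".join(
        f"- {description}, {name}" for description, name in examples
    )
    return (
        f"{instruction}\n\n"
        "Here are some examples of good company names:\n\n"
        f"{example_lines}\n\n"
        f"{question}"
    )


naming_examples = [
    ("search engine", "Google"),
    ("social media", "Facebook"),
    ("video sharing", "YouTube"),
]

few_shot_prompt = build_few_shot_prompt(
    instruction="I want you to act as a naming consultant for new companies.",
    examples=naming_examples,
    question="What is a good name for a brand that makes batik?",
)
print(few_shot_prompt)
```

Keeping the examples in a list makes it easy to add, remove, or swap them without editing the template text.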

### Chain

Now that we have a model and a prompt template, we want to combine the two by "chaining" them. Chains give us a way to link (or chain) together multiple primitives, like models, prompts, and other chains.

The simplest and most common type of chain is an LLMChain, which passes an input first to a PromptTemplate and then to an LLM. We can construct an LLM chain from our existing model and prompt template.

For example, to generate a response from our template without a chain, the workflow would be:

1. Create the prompt based on input with `template_prompt`

In [None]:
prompt = template_prompt.format(product="rendang mozarella")

print(prompt)

2. Generate response from prompt with `llm`

In [None]:
print(llm.predict(prompt))

We can simplify this workflow by linking the two steps with a chain:

In [None]:
from langchain.chains import LLMChain

chain = LLMChain(llm=llm, prompt=template_prompt)

In [None]:
print(chain.run('rendang mozarella'))

Notice that with a chain, each new input needs only one line of code to generate a response. Understanding how this simple chain works will set you up well for working with more complex chains.
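
Conceptually, a simple chain is just "format the prompt, then call the model". The sketch below shows that composition with a fake model so it runs offline. `FakeLLM` and `SimpleChain` are hypothetical classes for illustration only and do not reflect how `LLMChain` is implemented internally.

```python
class FakeLLM:
    """Stand-in for a real language model so the example runs without an API key."""

    def predict(self, prompt):
        return f"[model reply to: {prompt}]"


class SimpleChain:
    """Hypothetical chain: fill the template, then pass the prompt to the model."""

    def __init__(self, llm, template):
        self.llm = llm
        self.template = template

    def run(self, **inputs):
        prompt = self.template.format(**inputs)  # step 1: build the prompt
        return self.llm.predict(prompt)          # step 2: generate the response


chain_sketch = SimpleChain(
    FakeLLM(), "What is a good name for a brand that makes {product}?"
)
print(chain_sketch.run(product="rendang mozarella"))
```

Swapping `FakeLLM` for a real model object with a `predict` method would leave the chain logic unchanged, which is the point of chaining components behind a common interface.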

### Agents

We had a plan for our initial chain to follow specific steps. However, for more complicated workflows, it's important to be able to pick actions based on the situation.

Agents help us do exactly that. They use a language model to figure out which actions to take and in what order. These agents have tools at their disposal, and they keep selecting a tool, running it, and examining the results until they find the ultimate solution.

To load an agent, you need to choose a(n):

- **LLM/Chat model:** The language model powering the agent.

- **Tool(s):** A function that performs a specific duty. This can be things like: Google Search, Database lookup, Python REPL, other chains. For a list of predefined tools and their specifications, see the Tools documentation.

- **Agent name:** A string that references a supported agent class. An agent class is largely parameterized by the prompt the language model uses to determine which action to take. Because this notebook focuses on the simplest, highest-level API, it only covers the standard supported agents. Custom agents and the full list of supported agents are described in the LangChain documentation.

For this example, we'll use the `wikipedia` tool to look up information on Wikipedia and the `llm-math` tool to do the calculation.

In [None]:
from langchain.agents import AgentType, initialize_agent, load_tools

# The language model we're going to use to control the agent.
llm_agent = OpenAI(temperature=0)

# The tools we'll give the Agent access to. Note that the 'llm-math' tool uses an LLM, so we need to pass that in.
tools = load_tools(["wikipedia", "llm-math"], llm=llm_agent)

# Finally, let's initialize an agent with the tools, the language model, and the type of agent we want to use.
agent = initialize_agent(tools, llm_agent, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)

Let's test it out:

In [None]:
agent.run("What year did Lionel Messi join Barcelona? What is his current age raised to the 0.43 power?")
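
The loop an agent follows (choose a tool, run it, look at the result, repeat until done) can be sketched without any model at all. In the example below a scripted function stands in for the language model's decisions, and both tools are tiny local functions. Every name here (`lookup`, `calculator`, `scripted_policy`, `run_agent`) and the single stored "fact" are hypothetical; a real agent lets the LLM choose each action from the prompt.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "**": operator.pow}


def lookup(query):
    """Toy search tool backed by a one-entry dictionary."""
    facts = {"sides of a hexagon": "6"}
    return facts.get(query, "no result")


def calculator(expression):
    """Toy math tool for expressions of the form '<number> <op> <number>'."""
    left, op, right = expression.split()
    return str(OPS[op](float(left), float(right)))


TOOLS = {"lookup": lookup, "calculator": calculator}


def scripted_policy(question, observations):
    """Stands in for the LLM: decides the next action from what was observed so far."""
    if len(observations) == 0:
        return ("lookup", "sides of a hexagon")
    if len(observations) == 1:
        return ("calculator", f"{observations[0]} ** 2")
    return ("finish", observations[-1])


def run_agent(question, policy, tools, max_steps=5):
    """Select a tool, run it, record the observation, and stop on 'finish'."""
    observations = []
    for _ in range(max_steps):
        action, action_input = policy(question, observations)
        if action == "finish":
            return action_input
        observation = tools[action](action_input)
        print(f"Action: {action} | Action Input: {action_input} | Observation: {observation}")
        observations.append(observation)
    return "stopped: step limit reached"


result = run_agent(
    "How many sides does a hexagon have, squared?", scripted_policy, TOOLS
)
print("Final Answer:", result)
```

The step limit is a safeguard: because the model decides when to stop, an agent loop needs an upper bound on how many actions it may take.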

# Build Question Answering System

This chapter covers:
- Introduction to Question-Answering System
- Connecting datasource (database and text data) with LLM
- Basics of building Question-Answering System using LLM with database and text data
- Demonstration of using OpenAI and LangChain to build a Question-Answering System

## Introduction to Question-Answer System

As we know, LangChain is an open-source library that offers developers a comprehensive set of resources for building applications that run on Large Language Models (LLMs). In the previous examples, we passed a question to the model, and the LLM generated a response based on its internal knowledge.

But what if we need to ask a question that is specific to our own business domain? For example, what if we ask the LLM a question about our company, such as "Which product generates the most revenue?" An LLM's knowledge is limited to its training data, so it cannot answer questions that depend on documents it has never seen.

One way to address this limitation is to supply the relevant information to the LLM alongside the question. The basic idea is to first retrieve any relevant documents from a corpus, then pass those documents (the context) along with the original question to the LLM. The LLM then generates a response that is informed by the retrieved documents.

These documents can be any source that stores information: a database, a PDF, a text file, or even a website.

In this module, we will explore how to connect a database and text data to an LLM to build a Question-Answering System.

## Database

### Connecting Database

LangChain provides a class that connects a database to an LLM, called `SQLDatabase`. It also provides `SQLDatabaseChain`, a chain that links the database and the LLM so that SQL queries are generated and executed from a natural language prompt.

In [None]:
from langchain import SQLDatabase, SQLDatabaseChain

Next we need to load the data. We will use the chinook database from our academy class as an example. The URI must state explicitly which kind of database is being loaded, for example with the `sqlite:///` prefix.

Then you can just load the database using `SQLDatabase` from `langchain`.

In [None]:
dburi = "sqlite:///data_input/chinook.db"
db = SQLDatabase.from_uri(dburi)

After that, we chain `db` to the model. The resulting chain looks for the answer in the chinook database based on the prompt input.

Let's try a prompt that asks how many rows are in the tracks table of this database.

In [None]:
llm = OpenAI(temperature=0)  # temperature=0 makes the output as deterministic as possible

In [None]:
db_chain = SQLDatabaseChain(llm=llm, database=db, verbose=True)

db_chain.run("How many rows is in the tracks table of this db?")

Notice that the output has several components: `SQLQuery` shows the SQL query the model wrote to find the answer; `SQLResult` is the result of running that query against our database; and `Answer` is the `SQLResult` converted to natural language.
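
The three stages in that output (question to SQL, SQL to result, result to answer) can be reproduced with the standard `sqlite3` module. In this sketch the table contents are made up, and `fake_text_to_sql` is a hypothetical function that returns a hand-written query in place of the step where the LLM writes SQL.

```python
import sqlite3

# In-memory database with a tiny stand-in for the tracks table (made-up rows)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tracks (TrackId INTEGER PRIMARY KEY, Name TEXT, UnitPrice REAL)"
)
conn.executemany(
    "INSERT INTO tracks (Name, UnitPrice) VALUES (?, ?)",
    [("Song A", 0.99), ("Song B", 0.99), ("Song C", 1.99)],
)


def fake_text_to_sql(question):
    """Stand-in for the LLM step that turns a question into SQL."""
    return "SELECT COUNT(*) FROM tracks"


def answer_question(question):
    sql_query = fake_text_to_sql(question)           # 1. question -> SQL
    sql_result = conn.execute(sql_query).fetchall()  # 2. SQL -> rows
    # 3. rows -> natural language (the real chain asks the LLM to phrase this)
    answer = f"There are {sql_result[0][0]} rows in the tracks table."
    return sql_query, sql_result, answer


sql_query, sql_result, answer = answer_question(
    "How many rows are in the tracks table?"
)
print("SQLQuery:", sql_query)
print("SQLResult:", sql_result)
print("Answer:", answer)
```

Because the generated SQL runs directly against the database, it is worth connecting with a read-only account when trying this on real data.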

### Basics of Building Question-Answer System using LLM

The `SQLDatabaseChain` allows you to answer questions over a SQL database.

As another example, we use a question from our Dive Deeper exercise:

> all sales in rock genre in 2012

In [None]:
db_chain.run("all sales in rock genre in 2012 based on invoice")

In [None]:
db_chain.run("We want the returned DataFrame to contain only the Pop genre and only when the UnitPrice of the track is 0.99")

In [None]:
# Indonesian prompt: "Show songs with the Pop genre"
db_chain.run("Tampilkan lagu dengan Genre Pop")

## Text Dataset (`.csv`)

### Connecting to CSV

Structured data is not only stored in database files. Other formats, such as `.xlsx` and `.csv`, also store data as tables (columns and rows). Likewise, LangChain not only provides a chain that answers questions over a database using SQL; it also provides an agent that answers questions over tabular text data sources such as CSV, which we demonstrate here.

First, let's define the path to our dataset `rice.csv`, which contains transactions in the rice category.

In [None]:
filepath = "data_input/rice.csv"

Then, we create the agent, this time a CSV agent. We use the same `llm` as in the SQL part, so we don't need to redefine it.

In [None]:
from langchain.agents import create_csv_agent
agent = create_csv_agent(llm, filepath, verbose=True)

Then we simply ask a question about our data.

In [None]:
# Indonesian prompt: "Give details of the number of transactions that occurred in each format"
agent.run("berikan detail banyaknya transaksi yang terjadi di setiap format")

Notice that there are more components in the output:
- `Thought` is the agent's reasoning about how to solve the problem in the prompt, and about what to do next based on the result of its last action.
- `Action` is what the agent does to solve the problem. Here it uses `python_repl_ast`, which is a Python shell, so the agent solves the problem with Python. `Action Input` shows the `pandas` command the agent ran to get the result from the CSV data.
- `Final Answer` is the natural-language answer derived from the result of `Action Input`.
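
The computation behind such an `Action Input` is an ordinary group-and-count over the table. The sketch below does the same with the standard library on a few made-up rows. The column names `receipt_id`, `format`, and `quantity` and all values are hypothetical and are not taken from `rice.csv`; the real agent would typically issue an equivalent `pandas` command.

```python
import csv
import io
from collections import Counter

# A few made-up rows standing in for a transactions file
sample_csv = io.StringIO(
    "receipt_id,format,quantity\n"
    "1,minimarket,2\n"
    "2,supermarket,1\n"
    "3,minimarket,5\n"
    "4,hypermarket,3\n"
)

rows = list(csv.DictReader(sample_csv))

# Count how many transactions fall into each store format
transactions_per_format = Counter(row["format"] for row in rows)

for store_format, count in transactions_per_format.most_common():
    print(f"{store_format}: {count}")
```

Seeing the plain computation makes it easier to check whether the command the agent chose actually answers the question that was asked.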

## Unstructured Data

Company documents are not only stored as structured data; many are stored in unstructured form, such as meeting summaries, task reports, and product descriptions. To gather information or answer a question about those documents, a person would normally have to look for the related document and search for the answer manually.

So what if we want the LLM to find the answer instead? What if we have documents or regulations that are specific to our company? We can supply our own information to the LLM. In this part we use OpenAI Embeddings to build a question-answering system based on our company documents.

In [None]:
from langchain.document_loaders import DirectoryLoader, TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA

embeddings = OpenAIEmbeddings()

Let's use `summary.txt`, which contains a summary of coal news for Australia, Indonesia, and China.

In [None]:
loader = TextLoader('data_input/summary.txt')

documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=2500, chunk_overlap=0)
texts = text_splitter.split_documents(documents)
#texts
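
To build intuition for `chunk_size` and `chunk_overlap=0`, here is a simplified sketch of separator-based splitting: the text is cut at a separator, and the pieces are greedily packed into chunks that stay under the size limit. This is not LangChain's implementation. The function `split_text` is a hypothetical helper, and the blank-line separator is an assumption made for the example.

```python
def split_text(text, chunk_size, separator="\n\n"):
    """Greedily pack separator-delimited pieces into chunks of at most
    `chunk_size` characters (simplified: no overlap between chunks)."""
    chunks, current = [], ""
    for piece in text.split(separator):
        piece = piece.strip()
        if not piece:
            continue
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            # Either it still fits, or a single piece is already too long
            # and has to become its own oversized chunk.
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks

sample = "aaaa\n\nbbbb\n\ncccc\n\ndddd"
chunks = split_text(sample, chunk_size=10)
print(chunks)
```

A larger `chunk_size` keeps more context together in each chunk, while a smaller one gives the retriever finer-grained pieces to choose from.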

In [None]:
docsearch = Chroma.from_documents(texts, embeddings)

qa_chain = RetrievalQA.from_chain_type(
    llm=OpenAI(), 
    chain_type="stuff",
    retriever=docsearch.as_retriever()
)
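
The `"stuff"` chain type places all retrieved chunks into a single prompt together with the question. A minimal sketch of that idea follows. The prompt wording and the helper `build_stuff_prompt` are hypothetical and do not reproduce the template LangChain uses internally.

```python
def build_stuff_prompt(question, retrieved_chunks):
    """Concatenate all retrieved chunks into one context block and
    append the question (hypothetical prompt wording)."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

prompt_text = build_stuff_prompt(
    "Is there an export ban?",
    ["Chunk about export policy.", "Chunk about domestic supply."],
)
print(prompt_text)
```

Because every retrieved chunk goes into one prompt, this approach is simple but can exceed the model's context limit when the chunks are large or numerous.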

In [None]:
qa_chain.run("What are the effects of legislations surrounding emissions on the Australia coal market?")

In [None]:
qa_chain.run("Is there an export ban on Coal in Indonesia? Why?")

In [None]:
qa_chain.run("Who are the main exporters of Coal to China? What is the role of Indonesia in this?")

# HuggingFace

Hugging Face is a company that specializes in natural language processing (NLP) and develops a variety of tools, libraries, and models to facilitate NLP tasks. The company is known for creating and maintaining an open-source library called "transformers," which has gained significant popularity in the NLP community.

The Transformers library, developed by Hugging Face, provides a wide range of pre-trained models and utilities for tasks such as text classification, named entity recognition, sentiment analysis, machine translation, question answering, and more. These pre-trained models are based on state-of-the-art architectures, such as the GPT (Generative Pre-trained Transformer) models developed by OpenAI.

In addition to the Transformers library, Hugging Face offers a platform called the "Hugging Face Hub." This platform serves as a central repository where users can access, share, and collaborate on models and datasets. It provides a seamless integration with the Transformers library, allowing users to easily download and use pre-trained models.

Hugging Face has made significant contributions to the NLP community by democratizing access to pre-trained models and fostering collaboration among researchers and developers. Their tools and resources have greatly simplified the process of building NLP applications and have contributed to the rapid advancement of the field.

LangChain also lets us connect to the Hugging Face API, so we can use other Transformer models.

## Setting Up API Key

#### `.env` file

As with `OPENAI_API_KEY`, we need a Hugging Face API token, which you can create [here](https://huggingface.co/settings/tokens).

Then copy the token into `.env` and store it as the `HUGGINGFACEHUB_API_TOKEN` variable.

Your `.env` file will look like this:
```{python}
OPENAI_API_KEY={token_for_OpenAI}
HUGGINGFACEHUB_API_TOKEN={token_for_HuggingFace}
```

Then load the environment variables with `load_dotenv()` so Python can read and use the API tokens.

In [None]:
from dotenv import load_dotenv

# Read the key-value pairs from .env into the process environment
load_dotenv()
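
Before calling any model, it can help to confirm that both tokens were actually loaded. The sketch below checks for missing or empty variables. The helper `missing_keys` is hypothetical, and the demonstration uses a stand-in dictionary rather than the real environment so it runs without any token.

```python
import os

REQUIRED_KEYS = ["OPENAI_API_KEY", "HUGGINGFACEHUB_API_TOKEN"]

def missing_keys(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]

# Demonstration with a hypothetical environment mapping instead of the
# real os.environ, so no actual token is needed to run this.
demo_env = {"OPENAI_API_KEY": "sk-placeholder"}
print(missing_keys(REQUIRED_KEYS, demo_env))
```

In a notebook, calling `missing_keys(REQUIRED_KEYS)` with no second argument checks the real environment after `load_dotenv()` has run.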

## Implementation

In [None]:
from langchain import HuggingFaceHub, LLMChain
from langchain.prompts import PromptTemplate

This is similar to the OpenAI part; the only difference is that we need to specify which model to use and then set model parameters such as temperature.

In [None]:
hub_llm = HuggingFaceHub(
    repo_id='gpt2',
    model_kwargs={'temperature': 0, 'max_length': 50}
)

As the name suggests, we need to create a chain, so we first provide a prompt template for the model and then create the chain.

In [None]:
prompt = PromptTemplate(
    input_variables=["question"],
    template="""Question: {question}"""
)

hub_chain = LLMChain(prompt=prompt, llm=hub_llm, verbose=True)
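
The text that reaches the model is the template with `{question}` filled in. The sketch below shows that substitution with plain Python string formatting, assuming the default placeholder style. The helper `render` is hypothetical and is not part of LangChain.

```python
template = "Question: {question}"

def render(template, **variables):
    """Fill the named placeholders of a template string."""
    return template.format(**variables)

rendered = render(template, question="who won FIFA World Cup in the year 1994?")
print(rendered)
```

Richer templates (for example, adding an instruction line or an `Answer:` suffix) change only the string, not the chain code.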

Let's ask an interesting question.

In [None]:
hub_chain.run("who won FIFA World Cup in the year 1994?")

## Integrating HuggingFace's Inference API

What if we want to use one of the free models from Hugging Face to build a question-answering system on top of our database? We can load a pretrained model that translates natural language into a SQL query; one such model is `t5-base-finetuned-wikiSQL`. Let's try it.

In [None]:
# Import the llm model from huggingface
hf_llm_t5 = HuggingFaceHub(
    repo_id='mrm8488/t5-base-finetuned-wikiSQL',
    model_kwargs={'temperature': 0}
)

In [None]:
# Create prompt template
prompt_db = PromptTemplate(
    input_variables=['question'],
    template="Translate English to SQL: {question}"
)

In [None]:
# Chaining
hub_chain = LLMChain(prompt=prompt_db, llm=hf_llm_t5, verbose=True)

# Run the template
hub_chain.run("How many rows is in the tracks's table?")

The model does not give the correct answer: the query should use `FROM tracks` instead of `table`. This illustrates a difference between the T5 model and the OpenAI model.

However, Hugging Face provides many models that can be used for specific tasks.
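
A wrong table name in generated SQL can be caught before the query is used, by asking the database to compile it. The sketch below does this with the standard-library `sqlite3` module. The in-memory database, its column names, the example queries, and the helper `sql_error` are all hypothetical and are not the model's actual output.

```python
import sqlite3

def sql_error(conn, query):
    """Return None if SQLite can compile `query`, else the error message."""
    try:
        # EXPLAIN compiles the statement without running it.
        conn.execute(f"EXPLAIN {query}")
        return None
    except sqlite3.Error as exc:
        return str(exc)

# Hypothetical in-memory database with an empty `tracks` table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (TrackId INTEGER PRIMARY KEY, Name TEXT)")

good = "SELECT COUNT(*) FROM tracks"
bad = "SELECT COUNT(*) FROM songs"  # hypothetical query naming a missing table

print(sql_error(conn, good))
print(sql_error(conn, bad))
```

A check like this only confirms that the query is valid for the schema; it does not confirm that the query answers the original question.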

## Summary



## Reference

- [The Illustrated Transformer](https://jalammar.github.io/illustrated-transformer/)