In [1]:
from datasets import load_dataset
from transformers import AutoModelForSeq2SeqLM
from transformers import AutoTokenizer
from transformers import GenerationConfig
import transformers
import torch
from dotenv import load_dotenv
import os
import openai
import banking  # noqa: E402
from private_prompting import Prompter
load_dotenv(".env")

open_ai_key = os.environ.get("openai-key")


Download data and initialize a DuckDB instance. 

In [2]:
bank_data = banking.BankingData("https://tinyurl.com/jb-bank", "bank")
bank_data.extract_to_csv()

# Loading in SQL extension
%reload_ext sql
# Initiating a DuckDB database named 'bank.duck.db' to run our SQL queries on
%sql duckdb:///bank.duck.db

Create table.

In [3]:
%sql CREATE OR REPLACE TABLE bank AS FROM read_csv_auto('bank_cleaned.csv', header=True, sep=',')
# Extract column names
columns = %sql PRAGMA table_info('bank');
column_names = [row[1] for row in columns]

<h1 align='center'>Prompts & Agents</h1>

<h2 align='center'>How to incorporate prompting into your Python scripts <br>and expand their functionality through agents</h2>

<h2 align='center'>Laura Funderburk</h2>

<h2 align='center'>PyData Vancouver</h2>

<h1 align='center'>About me</h1>

* Developer Advocate @ Ploomber (talking and sharing knowledge about tools to improve the data science workflow)

* Previously a data scientist (for-profit and not-for-profit sectors)

* Deeply curious about generative AI and Large Language Models, with a focus on engineering and automation

* I use LLMs, prompting and agents to automate work tasks.

* BJJ enthusiast. Working towards a blue belt by the end of the year. Competing at the CBJJF Internationals in September.

<h1 align='center'>Talk at a glance</h1>

<h2 align='center'>Part I: Prompting (40 minutes)</h2>


1. LLMs use cases and tasks
2. The Generative AI project lifecycle 
3. Choosing the right LLM for the desired task
4. Key elements of prompting & prompting techniques
5. Prompting private LLMs (OpenAI API): `ChatCompletion`
6. Prompting open source LLMs through HuggingFace

<h1 align='center'>Talk at a glance</h1>

<h2 align='center'>Part II: Agents and open source frameworks (20 minutes)</h2>

1. NLP pipelines and chaining
2. Introduction to Haystack
3. Introduction to LangChain
4. Techniques to combine prompting and agents for deployment of applications
5. Pros and cons of each

<h1 align='center'>Part I: Prompting</h1>


<h1 align='center'>LLMs use cases and tasks</h1>



* Text summarization

* Conversation 

* Translation

* Text generation

* Text, token and sentiment classification

* Table Q&A and Q&A from unstructured data

* Sentence similarity

* Masking

<h1 align='center'>LLMs use cases and tasks</h1>

<h3 align='center'>Your goal is to understand the business case you are solving -<br> then select the appropriate methods to solve it</h3>

<h4 align='center'>Who will benefit from your product?</h4>

<h4 align='center'>What are the business constraints (time, data, resources)?</h4>

<h4 align='center'>What is the end result?</h4>

<h4 align='center'>How will it be served?</h4>



<h1 align='center'>The generative AI project lifecycle</h1>


<p></p>

<center>
  <img src="diagrams/genai_project_lifecycle.jpg" width="1400px"/>

</center>

Source: Coursera, Generative AI with LLMs

<h1 align='center'>Focus of this talk</h1>

<p></p>
<center>
  <img src="diagrams/genai_project_lifecycle_focus.jpg" width="1400px"/>

</center>

<h1 align='center'>Choosing the right LLM (architecture) for the desired task</h1>

<p></p>
<center>
  <img src="diagrams/opt.jpeg" width="200px"/>

</center>


* Decoder-only transformers: Good for **generative tasks** (auto-regressive)
* Encoder-only transformers: Good for tasks that require **understanding of the input** (auto-encoding)
* Encoder-decoder transformers or sequence-to-sequence models: Good for **generative tasks that require input** 



<h1 align='center'>Choosing the right LLM (architecture) for the desired task</h1>

| Transformer type | Architecture | Model-like | Focus | Example |
|-|-|-|-|-|
| Auto-regressive | Decoder-only |GPT-like | Generative tasks | Chat bot | 
| Auto-encoding | Encoder-only |BERT-like | Understanding of the input | Question-answering|
| Sequence-to-Sequence |Encoder-decoder |BART/T5-like | Generative tasks that require an input | Language translation|

Read more

https://github.com/christianversloot/machine-learning-articles/blob/main/differences-between-autoregressive-autoencoding-and-sequence-to-sequence-models-in-machine-learning.md
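As a rough aid for the table above, the three families correspond to different `Auto` classes in the `transformers` library. The sketch below is only a lookup table: the task keywords and the helper name `suggest_auto_class` are illustrative assumptions, and the class names are kept as strings so the snippet runs without installing or downloading anything.

```python
# Map a coarse task keyword to (architecture, transformers Auto class name).
# The keywords are made up for this sketch; the class names exist in transformers.
ARCHITECTURES = {
    "generation": ("decoder-only", "AutoModelForCausalLM"),
    "understanding": ("encoder-only", "AutoModelForMaskedLM"),
    "translation": ("encoder-decoder", "AutoModelForSeq2SeqLM"),
}

def suggest_auto_class(task):
    """Return (architecture, Auto class name) for a coarse task keyword."""
    try:
        return ARCHITECTURES[task]
    except KeyError:
        raise ValueError(
            f"Unknown task {task!r}; expected one of {sorted(ARCHITECTURES)}"
        )

print(suggest_auto_class("translation"))
```

The model card of the specific checkpoint remains the authority on which class to load.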

<h1 align='center'>Do I need to train a new model to solve my problem?</h1>

<h3 align='center'>No. Training an LLM is costly (GPU usage, time, compute, data). This is why sharing LLMs and their fine-tuned components has become highly popular.</h3>


<p></p>
<center>
  <img src="diagrams/hftasks.png" width="1300px"/>

</center>

<h3 align='center'>You can start by prompting an LLM, then move to fine-tuning* if you aren't getting the results you want. <br>Fine-tuning requires a curated dataset.</h3>

*(for example, instruction tuning, or PEFT with LoRA)

Source: https://huggingface.co/tasks

<h1 align='center'>Prompting</h1>


<h1 align='center'>Key elements of prompting</h1>

<h3 align='center'>Basic</h3>

* An LLM to interact with
* Temperature
* Max tokens
* A natural language request

<h3 align='center'>Advanced</h3>

* Data (text files, web files)
* A database storage system (a vector DB, a SQL database such as PostgreSQL, etc.)
* User interfaces
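To make the temperature setting from the basic list more concrete, here is a minimal sketch of how temperature reshapes a probability distribution. It assumes the model picks the next token by sampling from a softmax over scores, which is the usual setup; the scores are made up and the function name is illustrative.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities, rescaled by temperature."""
    # APIs often treat temperature 0 as greedy decoding;
    # this sketch requires a strictly positive value.
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top-scoring token, which is why a low value suits tasks like SQL generation where you want repeatable output.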


<h1 align='center'>Prompting techniques</h1>

* Zero-shot inference

* One-shot inference

* Few-shot inference


<h1 align='center'>Prompting techniques: Zero-shot inference</h1>

**Formula: instruction, no examples.**

Suppose we want to translate natural language queries into SQL. 

We have a `natural_question`, a database table name `db_name`, and a `schema`:

```python
prompt = f"Answer the question {natural_question} for table {db_name} with schema {schema}"
```

Suppose we want to classify the sentiment of a sentence `sentence`:

```python
prompt = f"How does the author feel about this based on the statement {sentence}"
```

<h1 align='center'>Prompting techniques: One-shot inference</h1>

**Formula: instruction, one example.**

```python
# Adjacent string literals are joined, so no stray indentation ends up in the prompt
prompt = (
    f"Answer the question {natural_question} for table {db_name} with schema {schema}. "
    "For example, if you receive the question: 'How many records are there?' "
    "an appropriate answer is 'SELECT COUNT(*) FROM bank'"
)
```

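```python
# The worked example comes first; the statement to classify goes last
prompt = (
    "How does the author feel, based on the statement? "
    "Statement: 'I find the user experience is confusing and convoluted' "
    "Answer: Negative "
    f"Statement: '{sentence}' "
    "Answer:"
)
```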

<h1 align='center'>Prompting techniques: Few-shot inference</h1>

**Formula: instruction, more than one example.**

```python
# Adjacent string literals are joined, so no stray indentation ends up in the prompt
prompt = (
    f"Answer the question {natural_question} for table {db_name} with schema {schema}. "
    "Example: "
    "'How many records are there?' "
    "'SELECT COUNT(*) FROM bank' "
    "Example: "
    "'Find all employees that are unemployed' "
    "\"SELECT * FROM bank WHERE job = 'unemployed'\""
)
```

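```python
# The worked examples come first; the statement to classify goes last
prompt = (
    "How does the author feel, based on the statement? "
    "Statement: 'I find the user experience is confusing and convoluted' "
    "Answer: Negative "
    "Statement: 'The decoration of the room made me feel welcome!' "
    "Answer: Positive "
    f"Statement: '{sentence}' "
    "Answer:"
)
```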
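The three techniques differ only in how many examples accompany the instruction, so one helper can cover all of them. This is a minimal sketch: the function name `build_prompt` and the `Input:`/`Output:` labels are assumptions made for illustration, not a format any particular model requires.

```python
def build_prompt(instruction, query, examples=()):
    """Assemble a zero-, one-, or few-shot prompt.

    examples is a sequence of (input, expected_output) pairs:
    empty for zero-shot, one pair for one-shot, several for few-shot.
    """
    parts = [instruction]
    for example_input, example_output in examples:
        parts.append(f"Input: {example_input}\nOutput: {example_output}")
    # The real query goes last, with the output left open for the model
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)

instruction = "Translate the question into a SQL query for table bank."
examples = [
    ("How many records are there?", "SELECT COUNT(*) FROM bank"),
    ("Find all records where the job is unemployed",
     "SELECT * FROM bank WHERE job = 'unemployed'"),
]
question = "How many unique jobs are there?"

zero_shot = build_prompt(instruction, question)
one_shot = build_prompt(instruction, question, examples[:1])
few_shot = build_prompt(instruction, question, examples)
print(few_shot)
```

Keeping the examples in a list makes it easy to compare how the same question behaves with zero, one, or several examples.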

<h1 align='center'>Prompting private LLMs (OpenAI API)</h1>


We're going to focus on the `ChatCompletion` endpoint.

Key elements:

* OpenAI API Key
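* Model chosen (for example GPT-4 or GPT-3.5 Turbo; the Text-Davinci models are served by the older `Completion` endpoint rather than `ChatCompletion`)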
* Temperature
* Your prompt


<h1 align='center'>Prompting private LLMs (OpenAI API)</h1>

```python
import openai

class Prompter:
    """Thin wrapper around the OpenAI chat API."""

    def __init__(self, api_key, gpt_model, temperature=0.2):
        if not api_key:
            raise ValueError("Please provide the OpenAI API key")

        self.api_key = api_key
        self.gpt_model = gpt_model
        # Lower temperatures give more deterministic output
        self.temperature = temperature
```

<h1 align='center'>Prompting private LLMs (OpenAI API)</h1>

```python
import openai

class Prompter:
    ...
    def prompt_model_return(self, messages: list):
        # Uses the interface of the `openai` package before version 1.0
        openai.api_key = self.api_key
        response = openai.ChatCompletion.create(model=self.gpt_model,
                                                messages=messages,
                                                temperature=self.temperature)
        # Return only the text of the first completion
        return response["choices"][0]["message"]["content"]
```
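The `openai.ChatCompletion.create` call above belongs to the `openai` package before version 1.0. In version 1.0 and later the same request goes through a client object. The sketch below takes the client as an argument so it can be exercised without network access; the function name `prompt_model` is an illustrative assumption.

```python
def prompt_model(client, model, messages, temperature=0.2):
    """Send chat messages and return the text of the first choice.

    client is expected to be an openai.OpenAI instance (openai>=1.0).
    """
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature,
    )
    return response.choices[0].message.content

# Usage, assuming openai>=1.0 is installed and open_ai_key holds a valid key:
#
#   from openai import OpenAI
#   client = OpenAI(api_key=open_ai_key)
#   prompt_model(client, "gpt-3.5-turbo",
#                [{"role": "user", "content": "Say hello"}])
```

Passing the client in, rather than setting a module-level key, also makes the function straightforward to test with a stand-in object.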

<h1 align='center'>Roles in prompting the ChatCompletion endpoint</h1>

`system content`: What context should the LLM have in mind? For example: an expert in marketing, a helpful assistant, or an enthusiastic generator of marketing copy.

`user content`: What are typical requests that someone in that role would receive?


```python
import openai

class Prompter:
    ...
    def natural_language_to_sql(self, db_name: str, schema: str, natural_question: str):
        # The system message sets the role and the output constraints
        system_content = (
            "You are a data analyst, "
            "and you specialize in solving business questions with SQL. "
            "You are given a natural language question, "
            "and your role is to translate the question "
            "into a query that can be executed against a database. "
            "Ensure your queries are written in a single line, with no special characters"
        )

        # The user message carries the actual request
        user_content = (
            f"Please generate a SQL query for data in a database named {db_name} "
            f"along with a schema {schema} for the question {natural_question}"
        )

        full_prompts = [
            {"role": "system", "content": system_content},
            {"role": "user", "content": user_content},
        ]

        return self.prompt_model_return(full_prompts)
```
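Roles and few-shot prompting combine naturally: besides `system` and `user`, the chat format accepts an `assistant` role, so worked examples can be passed as earlier turns of the conversation. This is a minimal sketch; the helper name `build_messages` and the example contents are assumptions for illustration.

```python
def build_messages(system_content, user_content, examples=()):
    """Build a chat `messages` list, optionally with worked examples.

    Each example is a (question, answer) pair that becomes a
    user message followed by an assistant message.
    """
    messages = [{"role": "system", "content": system_content}]
    for question, answer in examples:
        messages.append({"role": "user", "content": question})
        messages.append({"role": "assistant", "content": answer})
    # The real request goes last so the model answers it
    messages.append({"role": "user", "content": user_content})
    return messages

messages = build_messages(
    system_content="You translate questions into single-line SQL for table bank.",
    user_content="How many unique jobs are there?",
    examples=[("How many records are there?", "SELECT COUNT(*) FROM bank")],
)
for message in messages:
    print(message["role"], "->", message["content"])
```

The resulting list can be passed as the `messages` argument of a chat completion call.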

Let's suppose I have a table called `bank` that looks as follows.

In [42]:
%sqlcmd explore --table bank

Let's take the different prompting techniques for a spin.

We will ask a GPT model to translate a natural language question into SQL. 

In [43]:
pm  = Prompter(open_ai_key, "gpt-3.5-turbo")

Zero-shot results.

In [44]:
pm.natural_language_zero_shot("bank", column_names, "How many unique jobs are there?")

'To determine the number of unique jobs in the table "bank", we need to look at the distinct values in the "job" column. Unfortunately, without the actual data in the table, it is not possible to provide an exact answer. However, you can run a query on the table to find the distinct values in the "job" column and count them. Here is an example query in SQL:\n\nSELECT COUNT(DISTINCT job) AS unique_jobs\nFROM bank;\n\nThis query will return the number of unique jobs in the "bank" table.'

In [45]:
pm.natural_language_zero_shot("bank", column_names, "What is the total balance for employees by education?")

"I'm sorry, but I cannot provide the answer to that question as it requires access to the data in the table."

Single-shot results.

In [46]:
pm.natural_language_single_shot("bank", column_names, "How many unique jobs are there?")

'SELECT COUNT(DISTINCT job) FROM bank'

In [47]:
pm.natural_language_single_shot("bank", column_names, "What is the total balance for employees by education?")

'SELECT education, SUM(balance) AS total_balance\nFROM bank\nGROUP BY education'

Few-shot results.

In [48]:
pm.natural_language_few_shot("bank", column_names, "How many unique jobs are there?")

"To find the number of unique jobs in the table 'bank', you can use the following query:\n\nSELECT COUNT(DISTINCT job) FROM bank\n\nThis query will return the count of distinct job values in the 'bank' table."

In [49]:
pm.natural_language_few_shot("bank", column_names, "What is the total balance for employees by education?")

'To find the total balance for employees by education, you can use the following SQL query:\n\nSELECT education, SUM(balance) AS total_balance\nFROM bank\nGROUP BY education;'

Roles-based results.

In [50]:
pm.natural_language_with_roles("bank", column_names, "How many unique jobs are there?")

'SELECT COUNT(DISTINCT job) FROM bank'

In [52]:
pm.natural_language_with_roles("bank", column_names, "What is the total balance for employees by education?")

'SELECT education, SUM(balance) AS total_balance FROM bank GROUP BY education'

In [53]:
%%sql
SELECT COUNT(DISTINCT job) FROM bank

count(DISTINCT job)
12
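Several of the responses above wrap the query in explanatory text, and one contains no query at all, so the output usually needs a cleanup step before it can be executed. The sketch below pulls out the first statement with a regular expression. It assumes the query starts with an upper-case `SELECT` and ends at a semicolon, a blank line, or the end of the response; a semicolon inside a string literal would cut the query short. The name `extract_sql` is illustrative.

```python
import re

# Assumption: the query starts with an upper-case SELECT and ends at a
# semicolon, a blank line, or the end of the response.
_SQL_PATTERN = re.compile(r"SELECT\b.*?(?=;|\n\s*\n|\Z)", re.DOTALL)

def extract_sql(response):
    """Return the first SELECT statement on one line, or None if absent."""
    match = _SQL_PATTERN.search(response)
    if match is None:
        return None
    # Collapse newlines and repeated spaces into single spaces
    return " ".join(match.group(0).split())

chatty = ("To find the number of unique jobs in the table 'bank', "
          "you can use the following query:\n\n"
          "SELECT COUNT(DISTINCT job) FROM bank\n\n"
          "This query will return the count of distinct job values.")
print(extract_sql(chatty))
print(extract_sql("I'm sorry, but I cannot provide the answer to that question."))
```

Returning `None` for a refusal lets the calling code decide whether to retry with a different prompt.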


<h1 align='center'>Prompting open source LLMs through HuggingFace</h1>


Make sure you install the right modules via `pip`, and read the model card of the LLM for usage instructions.

<p></p>
<center>
  <img src="diagrams/hftasks.png" width="1300px"/>

</center>


Example: https://huggingface.co/microsoft/tapex-base

<h1 align='center'>The reality of prompting open source models</h1>

<p></p>
<center>
  <img src="diagrams/this-is-fine.jpeg" width="1200px"/>

</center>


<h1 align='center'>The reality of prompting open source models</h1>

* HuggingFace hosted models resemble GitHub repos (but not in a good way).
* Dependency hell
* Prompting results vary across different models.
* Model documentation ranges from non-existent to minimal.
* Higher likelihood that you'll need to find the base model and fine-tune it on your data for better results.


In [None]:
# AutoModelWithLMHead is deprecated; T5 is a sequence-to-sequence model
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mrm8488/t5-base-finetuned-wikiSQL")
model = AutoModelForSeq2SeqLM.from_pretrained("mrm8488/t5-base-finetuned-wikiSQL")

In [54]:
def get_sql(query, db_name, schema):
    # Only the question reaches the model;
    # db_name and schema are accepted but not used in the prompt
    input_text = "translate English to SQL: %s </s>" % query
    features = tokenizer([input_text], return_tensors='pt')

    output = model.generate(input_ids=features['input_ids'],
                            attention_mask=features['attention_mask'])

    # Decoding this way keeps special tokens such as <pad> and </s>
    return tokenizer.decode(output[0])

# Translate
natural_question = "How many entries are there?"
db_name = "bank"
schema = column_names

get_sql(natural_question, db_name, schema)

'<pad> SELECT COUNT Entry FROM table</s>'
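The output above still carries special tokens and does not refer to the `bank` table, so it helps to check generated SQL before running it. The sketch below strips the special tokens and asks a database engine to compile the query against an empty table with the expected columns. It swaps in Python's built-in `sqlite3` for DuckDB so that it runs with the standard library alone, which means dialect differences between the two engines are not covered. The function names and the column subset are assumptions for illustration.

```python
import re
import sqlite3

def clean_decoded(text):
    """Strip the <pad>, <s> and </s> tokens left in decoded model output."""
    return re.sub(r"</?s>|<pad>", "", text).strip()

def is_valid_sql(query, table, columns):
    """Check that a query compiles against an empty table with these columns."""
    conn = sqlite3.connect(":memory:")
    try:
        column_list = ", ".join(f'"{name}"' for name in columns)
        conn.execute(f'CREATE TABLE "{table}" ({column_list})')
        # EXPLAIN compiles the statement without executing it
        conn.execute(f"EXPLAIN {query}")
        return True
    except (sqlite3.Error, sqlite3.Warning):
        return False
    finally:
        conn.close()

columns = ["job", "education", "balance"]  # illustrative subset of the schema
candidate = clean_decoded("<pad> SELECT COUNT Entry FROM table</s>")
print(candidate, is_valid_sql(candidate, "bank", columns))
print(is_valid_sql("SELECT COUNT(DISTINCT job) FROM bank", "bank", columns))
```

A check like this only catches syntax errors and unknown tables or columns; it says nothing about whether the query answers the question.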

<h1 align='center'>How to guide your choices</h1>

1. Remember your use case, the business constraints and who will use your application
2. Remember the three base models and their keywords
3. Be ready for the possibility of fine-tuning


| Transformer type | Architecture | Model-like | Focus | Example |
|-|-|-|-|-|
| Auto-regressive | Decoder-only |GPT-like | Generative tasks | Chat bot | 
| Auto-encoding | Encoder-only |BERT-like | Understanding of the input | Question-answering|
| Sequence-to-Sequence |Encoder-decoder |BART/T5-like | Generative tasks that require an input | Language translation|

<h1 align='center'>Part II: Agents and open source frameworks</h1>

The frameworks introduced here can both be installed via `pip` and imported as modules into your Python script. 

Both of them offer agents, although they approach the implementation differently. 

<h1 align='center'>Introducing</h1>
<center>
  <img src="diagrams/langchain.png" width="500px"/>

</center>

LangChain is a framework for developing applications powered by language models. It enables applications that are:

* **Data-aware**: connect a language model to other sources of data
* **Agentic**: allow a language model to interact with its environment


<h1 align='center'>How does LangChain approach Agents?</h1>
<center>
  <img src="diagrams/langchain.png" width="500px"/>

</center>

With LangChain we think in terms of **components** and **off-the-shelf chains**.

<h1 align='center'>How to incorporate it into your scripts</h1>
<center>
  <img src="diagrams/langchain.png" width="500px"/>

</center>

You can write custom functions in Python and register them with LangChain's `@tool` decorator. After initializing LangChain along with the GPT model you want, you can ask it to perform tasks with natural language commands.
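To show the idea behind a tool decorator without depending on any framework, here is a plain-Python sketch. It is not LangChain's implementation or API: the registry `TOOLS`, the decorator `tool`, and the two example functions are all hypothetical, and the tool is chosen by hand where an agent would let the LLM choose.

```python
TOOLS = {}

def tool(func):
    """Register a function so an agent can look it up by name."""
    TOOLS[func.__name__] = func
    return func

@tool
def word_count(text: str) -> int:
    """Count the words in a piece of text."""
    return len(text.split())

@tool
def shout(text: str) -> str:
    """Upper-case a piece of text."""
    return text.upper()

def describe_tools():
    """List tool names and docstrings, the kind of text a model could be shown."""
    return "\n".join(f"{name}: {fn.__doc__}" for name, fn in TOOLS.items())

def run_tool(name, argument):
    """Dispatch a tool call by name (chosen by hand here, not by an LLM)."""
    if name not in TOOLS:
        raise KeyError(f"No tool named {name!r}")
    return TOOLS[name](argument)

print(describe_tools())
print(run_tool("word_count", "summarize this web page"))
```

The docstrings matter in this pattern because they are what describes each tool to the model.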

<h1 align='center'>Example: automate the creation of social media posts from web pages</h1>
<center>
  <img src="diagrams/langchain.png" width="500px"/>

</center>

Tools the agent was given:

1. A web scraper
2. A GPT based prompter with `system` and `user` content
3. Instructions to summarize the webpage and then write a social media post about the summary

<h1 align='center'>Introducing</h1>
<center>
  <img src="diagrams/haystack-ogimage.png" width="500px"/>

</center>

Haystack is an open-source framework for building search systems that work intelligently over large document collections.

<center>
  <img src="diagrams/haystack-ogimage.png" width="500px"/>

</center>

- Call open source models as well as private ones
- Build production-ready NLP pipelines with their custom-built tools
- Machinery for unstructured data processing (text)
- Leverage their prompt templates
- Incorporate Agents
- Deploy via REST API


<h1 align='center'>How does Haystack approach Agents?</h1>
<center>
  <img src="diagrams/haystack-ogimage.png" width="500px"/>

</center>

Haystack relies heavily on **prompt nodes** and the **pipelines** that connect them, expands their functionality through **agents**, and deploys via a REST API.

The framework was first developed with document stores, nodes to process information, and pipelines to connect the nodes; agent functionality was built on top of that afterwards.

**If you can get comfortable thinking in these terms, you will be able to leverage their application from prototyping to production for your documents.**
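Thinking in nodes and pipelines can be practiced without the framework. The sketch below is plain Python and does not use Haystack's classes or API: the `Pipeline` class, the two stand-in nodes, and the hard-coded document are assumptions made for illustration.

```python
class Pipeline:
    """Run a sequence of named nodes, feeding each output to the next node."""

    def __init__(self):
        self.nodes = []

    def add_node(self, name, component):
        self.nodes.append((name, component))
        return self  # allow chaining add_node calls

    def run(self, data):
        for _name, component in self.nodes:
            data = component(data)
        return data

def retrieve(query):
    """Stand-in retriever: pair the query with a hard-coded document."""
    return {"query": query, "document": "DuckDB is an in-process SQL database."}

def make_prompt(payload):
    """Stand-in prompt node: turn the retrieved document into a prompt."""
    return f"Using the context '{payload['document']}', answer: {payload['query']}"

pipeline = Pipeline().add_node("retriever", retrieve).add_node("prompt", make_prompt)
print(pipeline.run("What is DuckDB?"))
```

Each node only needs to agree with its neighbours on the shape of the data it receives and returns, which is what makes nodes easy to swap.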

<h1 align='center'>Key concepts</h1>

<p></p>
<center>
  <img src="diagrams/haystack-ogimage.png" width="400px"/>

</center>
<p></p>
<center>
  <img src="diagrams/haystack.png" width="1200px"/>

</center>

<h1 align='center'>Sample ready-to-use Natural Language pipelines</h1>

* `ExtractiveQAPipeline`
* `DocumentSearchPipeline`
* `GenerativeQAPipeline`
* `WebQAPipeline`
* `SearchSummarizationPipeline`
* `TextIndexingPipeline`
* `TranslationWrapperPipeline`
* `FAQPipeline`
* `QuestionGenerationPipeline`
* `QuestionAnswerGenerationPipeline`
* `MostSimilarDocumentsPipeline`

You can create your own custom pipelines too. Get familiar with the ideas of indexing, querying and document stores, and with the prompt node classes.
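Indexing and querying can be illustrated with a toy document store. The sketch below is plain Python with a keyword-based inverted index; it is not Haystack's document store, and the class name `ToyDocumentStore` and its methods are hypothetical. Production systems typically use more capable retrieval, such as embeddings.

```python
from collections import defaultdict

class ToyDocumentStore:
    """Toy document store with an inverted index over lower-cased words."""

    def __init__(self):
        self.documents = []
        self.index = defaultdict(set)

    def add(self, texts):
        """Indexing: store each text and record which words it contains."""
        for text in texts:
            doc_id = len(self.documents)
            self.documents.append(text)
            for word in text.lower().split():
                self.index[word.strip(".,?!")].add(doc_id)

    def search(self, question):
        """Querying: rank documents by how many question words they share."""
        scores = defaultdict(int)
        for word in question.lower().split():
            for doc_id in self.index.get(word.strip(".,?!"), ()):
                scores[doc_id] += 1
        ranked = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
        return [self.documents[doc_id] for doc_id in ranked]

store = ToyDocumentStore()
store.add([
    "Haystack builds search pipelines over documents.",
    "LangChain connects language models to tools.",
])
print(store.search("Which framework builds search pipelines?"))
```

Indexing happens once when documents are written, while querying happens on every request, which is why the two are separate steps.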

<h1 align='center'>LangChain Pros & Cons</h1>

**Pros**
1. Easy to get started with
2. Maps easily to the OpenAI API chat completion endpoint
3. Can easily connect to a variety of applications based on your function definition

**Cons**
1. Security issues (?) 
2. Evaluation of results 
3. Deployment 
4. Integration with open LLMs is unclear

<h1 align='center'>Haystack Pros & Cons</h1>

**Pros**
1. Established framework with a focus on production-ready NLP applications
2. Constantly adapting to new changes and building on top of their framework
3. Can be deployed to AWS
4. Offers solutions for your custom documents 
5. Offers prompt templates

**Cons**
1. Steeper learning curve
2. REST API is currently the only deployment option
3. Limitations on the types of files it can handle (PDF and Markdown are currently not supported)
4. Narrower focus when it comes to the types of agents it supports (although you can create custom agents through custom nodes)