In [1]:
from datasets import load_dataset
from transformers import AutoModelForSeq2SeqLM
from transformers import AutoTokenizer
from transformers import GenerationConfig
import transformers
import torch
from dotenv import load_dotenv
import os
import openai
import banking  # noqa: E402
from private_prompting import Prompter
load_dotenv(".env")

open_ai_key = os.environ.get("openai-key")


Download data and initialize a DuckDB instance. 

In [2]:
_ = banking.BankingData("https://tinyurl.com/jb-bank", "bank")
_.extract_to_csv()

# Loading in SQL extension
%reload_ext sql
# Initiating a DuckDB database named 'bank.duck.db' to run our SQL queries on
%sql duckdb:///bank.duck.db

Create table.

In [3]:
%sql CREATE OR REPLACE TABLE bank AS FROM read_csv_auto('bank_cleaned.csv', header=True, sep=',')
# Extract column names
columns = %sql PRAGMA table_info('bank');
column_names = [row[1] for row in columns]

<h1 align='center'>Prompts & Agents</h1>

<h2 align='center'>How to incorporate prompting into your Python scripts <br>and expand their functionality through agents</h2>

<h2 align='center'>Laura Funderburk</h2>

<h2 align='center'>PyData Vancouver</h2>

<h1 align='center'>About me</h1>

* Developer Advocate @ Ploomber (talking and sharing knowledge about tools to improve the data science workflow)

* Previously a data scientist (for-profit, not-for-profit sectors)

* Deeply curious about generative AI, Large Language Models, with a focus on engineering and automation

* Currently use LLMs, prompting and agents to automate work tasks 

* BJJ enthusiast. Working towards a blue belt by the end of the year. Competing at the CBJJF Internationals in September

<h1 align='center'>Talk at a glance</h1>

<h2 align='center'>Part I: Prompting (25 minutes)</h2>


1. LLM use cases and tasks
2. The Generative AI project lifecycle 
3. Choosing the right LLM for the desired task
4. Key elements of prompting & prompting techniques
5. Prompting private LLMs (OpenAI API): `ChatCompletion`
6. Prompting open source LLMs through HuggingFace


Q&A about prompting (5 minutes)

<h1 align='center'>Talk at a glance</h1>

<h2 align='center'>Part II: Agents and open source frameworks (25 minutes)</h2>

1. NLP pipelines and chaining
2. Introduction to Haystack
3. Introduction to LangChain
4. Which one to pick
5. Techniques to combine prompting and agents for deployment of applications

Q&A about agents (5 minutes)

<h1 align='center'>Part I: Prompting</h1>


<h1 align='center'>LLM use cases and tasks</h1>


<h1 align='center'>LLM use cases and tasks</h1>


* Text summarization

* Conversation 

* Translation

* Text generation

* Text, token and sentiment classification

* Table Q&A and Q&A from unstructured data

* Sentence similarity

* Masking

<h1 align='center'>LLM use cases and tasks</h1>

<h3 align='center'>Your goal is to understand the business case you are solving -<br> then select the appropriate methods to solve it</h3>

<h4 align='center'>Who will benefit from your product?</h4>

<h4 align='center'>What are the business constraints (time, data, resources)?</h4>

<h4 align='center'>What is the end result?</h4>

<h4 align='center'>How will it be served?</h4>



<h1 align='center'>The generative AI project lifecycle</h1>


<h1 align='center'>The generative AI project lifecycle</h1>

<p></p>

<center>
  <img src="diagrams/genai_project_lifecycle.jpg" width="1400px"/>

</center>

<h1 align='center'>Focus of this talk</h1>

<p></p>
<center>
  <img src="diagrams/genai_project_lifecycle_focus.jpg" width="1400px"/>

</center>

<h1 align='center'>Choosing the right LLM (architecture) for the desired task</h1>

<p></p>
<center>
  <img src="diagrams/opt.jpeg" width="200px"/>

</center>


* Decoder-only transformers: Good for **generative tasks** (auto-regressive)
* Encoder-only transformers: Good for tasks that require **understanding of the input** (auto-encoding)
* Encoder-decoder transformers or sequence-to-sequence models: Good for **generative tasks that require an input** 



<h1 align='center'>Choosing the right LLM (architecture) for the desired task</h1>

| Transformer type | Architecture | Model-like | Focus | Example |
|-|-|-|-|-|
| Auto-regressive | Decoder-only |GPT-like | Generative tasks | Chat bot | 
| Auto-encoding | Encoder-only |BERT-like | Understanding of the input | Question-answering|
| Sequence-to-Sequence |Encoder-decoder |BART/T5-like | Generative tasks that require an input | Language translation|
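
As a rough guide, each row of the table corresponds to one of the `Auto*` model classes in the `transformers` library. The lookup below is a minimal sketch: `AUTO_CLASS_BY_TYPE` and `auto_class_for` are hypothetical names introduced for illustration, and the class names are stored as plain strings so the snippet runs without `transformers` installed. The classes listed are common starting points, and task-specific variants may suit a given problem better.

```python
# Hypothetical lookup from transformer type to the transformers Auto class
# commonly used to load that kind of model. Names are kept as strings on purpose.
AUTO_CLASS_BY_TYPE = {
    "auto-regressive": "AutoModelForCausalLM",        # decoder-only, GPT-like
    "auto-encoding": "AutoModelForMaskedLM",          # encoder-only, BERT-like
    "sequence-to-sequence": "AutoModelForSeq2SeqLM",  # encoder-decoder, BART/T5-like
}


def auto_class_for(transformer_type: str) -> str:
    """Return the Auto class name for a transformer type from the table."""
    key = transformer_type.strip().lower()
    if key not in AUTO_CLASS_BY_TYPE:
        raise ValueError(f"Unknown transformer type: {transformer_type!r}")
    return AUTO_CLASS_BY_TYPE[key]


print(auto_class_for("Sequence-to-Sequence"))
```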

Read more

https://github.com/christianversloot/machine-learning-articles/blob/main/differences-between-autoregressive-autoencoding-and-sequence-to-sequence-models-in-machine-learning.md

<h1 align='center'>Do I need to train a new model to solve my problem?</h1>

<h3 align='center'>No. Training an LLM is costly (GPU usage, time, compute, data). This is why sharing LLMs and their fine-tuned components has become so popular.</h3>


<p></p>
<center>
  <img src="diagrams/hftasks.png" width="1300px"/>

</center>

<h3 align='center'>You can start by prompting an LLM, then fine-tune* if you aren't getting the results you want. <br>You'll need to curate a dataset for fine-tuning. </h3>

*(instruction tuning, or PEFT with LoRA, for example)

Source: https://huggingface.co/tasks

<h1 align='center'>Prompting</h1>


<h1 align='center'>Key elements of prompting</h1>

<h3 align='center'>Basic</h3>

* An LLM to interact with
* Temperature
* Max tokens
* A natural language request
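
To make the temperature setting concrete: it rescales the model's next-token scores before they are turned into probabilities, so low values concentrate probability on the top candidate and high values flatten the distribution. The sketch below uses made-up scores and a hypothetical helper, `softmax_with_temperature`. It illustrates the idea only and is not how any particular API implements sampling (for instance, it simply rejects a temperature of zero).

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [score / temperature for score in logits]
    # Subtract the largest value before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


# Made-up scores for three candidate next tokens.
logits = [2.0, 1.0, 0.1]
low = softmax_with_temperature(logits, 0.2)   # sharper distribution
high = softmax_with_temperature(logits, 1.5)  # flatter distribution
print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```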

<h3 align='center'>Advanced</h3>

* Data (text files, web files)
* A database storage system (vector DB, SQL, PostgreSQL, etc)
* User interfaces


<h1 align='center'>Prompting techniques</h1>

* Zero-shot inference

* One-shot inference

* Few-shot inference


<h1 align='center'>Prompting techniques: Zero-shot inference</h1>

**Formula: instruction, no examples.**

Suppose we want to translate natural language queries into SQL. 

We have a `natural_question`, a database table name `db_name` and a schema `schema`

```python
prompt = f"Answer the question {natural_question} for table {db_name} with schema {schema}"
```

Suppose we want to classify the sentiment in a sentence `sentence`

```python
prompt = f"How does the author feel about this based on the statement {sentence}"

```
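
Here is a minimal sketch of the zero-shot SQL prompt with concrete values filled in. The question and table name come from later in this talk. The column list is a made-up subset used only for illustration, and joining it into plain text is one possible way to present a schema, not the only one.

```python
# Illustrative values; the column list is an assumed subset, not the real schema.
natural_question = "How many unique jobs are there?"
db_name = "bank"
schema = ["age", "job", "education", "balance"]

# Join the column names so the prompt reads as text rather than a Python list.
schema_text = ", ".join(schema)

# Zero-shot: an instruction with the inputs, and no worked examples.
prompt = (
    f"Answer the question '{natural_question}' "
    f"for table {db_name} with schema {schema_text}"
)
print(prompt)
```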

<h1 align='center'>Prompting techniques: One-shot inference</h1>

**Formula: instruction, one example.**

```python
prompt = (
    f"Answer the question {natural_question} for table {db_name} with schema {schema}. "
    "For example, if you receive the question: 'How many records are there?' "
    "an appropriate answer is 'SELECT COUNT(*) FROM bank'"
)
```

```python
prompt = (
    "How does the author feel about this based on the statement? "
    "'I find the user experience is confusing and convoluted' Answer: Negative "
    f"'{sentence}' Answer:"
)

```

<h1 align='center'>Prompting techniques: Few-shot inference</h1>

**Formula: instruction, more than one example.**

```python
prompt = (
    f"Answer the question {natural_question} for table {db_name} with schema {schema}. "
    "Example: 'How many records are there?' "
    "'SELECT COUNT(*) FROM bank' "
    "Example: 'Find all employees that are unemployed' "
    "SELECT * FROM bank WHERE job = 'unemployed'"
)
```

```python
prompt = (
    "How does the author feel about this based on the statement? "
    "'I find the user experience is confusing and convoluted' Answer: Negative "
    "'The decoration of the room made me feel welcome!' Answer: Positive "
    f"'{sentence}' Answer:"
)

```
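
The three techniques differ only in how many worked examples accompany the instruction, so one helper can cover all of them. The sketch below assumes a hypothetical `build_prompt` function and a made-up query sentence. The `Statement:` / `Answer:` labels are one possible layout, chosen so the prompt ends with an open answer for the model to complete.

```python
def build_prompt(instruction, query, examples=()):
    """Assemble a prompt: instruction, worked examples, then the new query.

    examples is a sequence of (statement, answer) pairs:
    empty -> zero-shot, one pair -> one-shot, several pairs -> few-shot.
    """
    lines = [instruction, ""]
    for statement, answer in examples:
        lines.append(f"Statement: {statement}")
        lines.append(f"Answer: {answer}")
        lines.append("")
    # The query goes last, with the answer left open for the model.
    lines.append(f"Statement: {query}")
    lines.append("Answer:")
    return "\n".join(lines)


examples = [
    ("I find the user experience is confusing and convoluted", "Negative"),
    ("The decoration of the room made me feel welcome!", "Positive"),
]
instruction = "How does the author feel, based on the statement?"
query = "The food was cold."  # hypothetical input sentence

zero_shot = build_prompt(instruction, query)
one_shot = build_prompt(instruction, query, examples[:1])
few_shot = build_prompt(instruction, query, examples)
print(few_shot)
```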

<h1 align='center'>Prompting private LLMs (OpenAI API)</h1>


We're going to focus on the `ChatCompletion` endpoint. 

Key elements:

* OpenAI API Key
* Model chosen (GPT-4, GPT-3.5 Turbo; Text-Davinci models use the separate `Completion` endpoint)
* Temperature
* Your prompt


<h1 align='center'>Prompting private LLMs (OpenAI API)</h1>

```python
import openai

class Prompter:
    def __init__(self, api_key, gpt_model, temperature=0.2):
        if not api_key:
            raise ValueError("Please provide the OpenAI API key")

        self.api_key = api_key
        self.gpt_model = gpt_model
        self.temperature = temperature
    
    
```

<h1 align='center'>Prompting private LLMs (OpenAI API)</h1>

```python
import openai

class Prompter:
    ...
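    def prompt_model_return(self, messages: list):
        # Uses the legacy module-level interface, available in openai versions before 1.0
        openai.api_key = self.api_key
        response = openai.ChatCompletion.create(
            model=self.gpt_model,
            messages=messages,
            temperature=self.temperature,
        )
        # Return only the text of the first choice
        return response["choices"][0]["message"]["content"]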
    

```
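
The indexing in the return statement follows the nesting of the response: a list of choices, each holding a message with a role and content. The sketch below uses a hand-written stand-in dictionary (an assumption for illustration, not output captured from the API) and a hypothetical `extract_content` helper, so it runs without an API key.

```python
# Hand-written stand-in with the same nesting the method above indexes into.
# This is an illustrative assumption, not a real API response.
fake_response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "SELECT COUNT(*) FROM bank",
            }
        }
    ]
}


def extract_content(response):
    """Return the text of the first choice, or raise if none came back."""
    choices = response.get("choices", [])
    if not choices:
        raise ValueError("The response contains no choices")
    return choices[0]["message"]["content"]


print(extract_content(fake_response))
```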

<h1 align='center'>Roles in prompting the ChatCompletion endpoint</h1>

`system content`: What context should the LLM have in mind? Expert in marketing, helpful assistant, enthusiastic marketing generator.

`user content`: What are typical requests that someone with that role would receive?


```python
import openai

class Prompter:
    ...
    def natural_language_to_sql(self, db_name: str, schema: str, natural_question: str):
        # System message: the persona and ground rules the model should follow
        system_content = (
            "You are a data analyst, "
            "and you specialize in solving business questions with SQL. "
            "You are given a natural language question, "
            "and your role is to translate the question "
            "into a query that can be executed against a database. "
            "Ensure your queries are written in a single line, with no special characters"
        )

        # User message: the concrete request, filled in with the caller's values
        user_content = (
            f"Please generate a SQL query for data in a database named {db_name} "
            f"along with a schema {schema} for the question {natural_question}"
        )

        full_prompts = [
            {"role": "system", "content": system_content},
            {"role": "user", "content": user_content},
        ]

        return self.prompt_model_return(full_prompts)
    
```
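
Roles can also carry worked examples: besides `system` and `user`, the endpoint accepts an `assistant` role, so earlier question-and-answer pairs can be replayed as prior turns. The sketch below assumes a hypothetical `build_messages` helper and only builds the message list, without calling the API.

```python
def build_messages(system_content, user_content, examples=()):
    """Build a ChatCompletion-style message list.

    examples is a sequence of (question, answer) pairs that are replayed
    as earlier user/assistant turns before the real request.
    """
    messages = [{"role": "system", "content": system_content}]
    for question, answer in examples:
        messages.append({"role": "user", "content": question})
        messages.append({"role": "assistant", "content": answer})
    # The real request always comes last.
    messages.append({"role": "user", "content": user_content})
    return messages


messages = build_messages(
    "You are a data analyst who answers business questions with SQL.",
    "How many unique jobs are there in table bank?",
    examples=[("How many records are there?", "SELECT COUNT(*) FROM bank")],
)
for message in messages:
    print(message["role"], "->", message["content"])
```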

Let's suppose I have a table called `bank` that looks as follows.

In [None]:
%sqlcmd explore --table bank

Let's take the different prompting techniques for a ride.

We will ask a GPT model to translate a natural language question into SQL. 

In [None]:
pm  = Prompter(open_ai_key, "gpt-3.5-turbo")

Zero-shot results.

In [None]:
pm.natural_language_zero_shot("bank", column_names, "How many unique jobs are there?")

In [None]:
pm.natural_language_zero_shot("bank", column_names, "What is the total balance for employees by education?")

One-shot results.

In [None]:
pm.natural_language_single_shot("bank", column_names, "How many unique jobs are there?")

In [None]:
pm.natural_language_single_shot("bank", column_names, "What is the total balance for employees by education?")

Few-shot results.

In [None]:
pm.natural_language_few_shot("bank", column_names, "How many unique jobs are there?")

In [None]:
pm.natural_language_few_shot("bank", column_names, "What is the total balance for employees by education?")

Roles-based results.

In [None]:
pm.natural_language_with_roles("bank", column_names, "How many unique jobs are there?")

In [None]:
pm.natural_language_with_roles("bank", column_names, "What is the total balance for employees by education?")

In [None]:
%%sql
SELECT COUNT(DISTINCT job) FROM bank
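
One way to check a generated query against a hand-written one like the cell above is to run both and compare the results. The sketch below is a minimal illustration under stated assumptions: it uses `sqlite3` from the standard library as a stand-in for DuckDB, a toy `bank` table with invented rows, and a hypothetical `run_generated_sql` helper whose SELECT-only check is deliberately crude.

```python
import sqlite3


def run_generated_sql(connection, generated_sql):
    """Run model-generated SQL only if it looks like a single SELECT."""
    statement = generated_sql.strip().rstrip(";")
    if not statement.lower().startswith("select"):
        raise ValueError("Only SELECT statements are allowed")
    if ";" in statement:
        raise ValueError("Multiple statements are not allowed")
    return connection.execute(statement).fetchall()


# Toy stand-in for the bank table; the rows are invented for illustration.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE bank (job TEXT, education TEXT, balance INTEGER)")
connection.executemany(
    "INSERT INTO bank VALUES (?, ?, ?)",
    [
        ("unemployed", "primary", 1787),
        ("services", "secondary", 4789),
        ("services", "tertiary", 1350),
    ],
)

generated = "SELECT COUNT(DISTINCT job) FROM bank;"  # pretend model output
reference = "SELECT COUNT(DISTINCT job) FROM bank"   # hand-written query

# Compare result sets rather than query text: different SQL can be equally correct.
matches = run_generated_sql(connection, generated) == connection.execute(reference).fetchall()
print(matches)
```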

<h1 align='center'>Prompting open source LLMs through HuggingFace</h1>


You need to install the right packages via `pip` and follow the instructions on the LLM's model card.

<p></p>
<center>
  <img src="diagrams/hftasks.png" width="1300px"/>

</center>


Example: https://huggingface.co/microsoft/tapex-base

<h1 align='center'>The reality of prompting open source models</h1>

<p></p>
<center>
  <img src="diagrams/this-is-fine.jpeg" width="1200px"/>

</center>


<h1 align='center'>The reality of prompting open source models</h1>

* HuggingFace-hosted models resemble GitHub repos (but not in a good way)
* Dependency hell
* Prompting results vary across models, and prompts need heavy adjustment for each one
* You have to adapt to whatever documentation the model card has
* Higher likelihood that you'll need to find the base model and fine-tune it with your data for better results



| Transformer type | Architecture | Model-like | Focus | Example |
|-|-|-|-|-|
| Auto-regressive | Decoder-only |GPT-like | Generative tasks | Chat bot | 
| Auto-encoding | Encoder-only |BERT-like | Understanding of the input | Question-answering|
| Sequence-to-Sequence |Encoder-decoder |BART/T5-like | Generative tasks that require an input | Language translation|

In [19]:
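from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mrm8488/t5-base-finetuned-wikiSQL")
# T5 is an encoder-decoder model, so load it with the sequence-to-sequence class
# (AutoModelWithLMHead is deprecated)
model = AutoModelForSeq2SeqLM.from_pretrained("mrm8488/t5-base-finetuned-wikiSQL")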

The `xla_device` argument has been deprecated in v4.4.0 of Transformers. It is ignored and you can safely remove it from your `config.json` file.


In [20]:
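def get_sql(query):
    # Prefix the query with the task instruction and append the end-of-sequence marker
    input_text = "translate English to SQL: %s </s>" % query
    features = tokenizer([input_text], return_tensors='pt')

    # Generate output token ids from the encoded input
    output = model.generate(input_ids=features['input_ids'],
                            attention_mask=features['attention_mask'])

    # Decode the first sequence; special tokens are kept in the returned string
    return tokenizer.decode(output[0])

# Translate
natural_question = "What is the balance for employees by their education?"
# get_sql adds its own task prefix too, so the model sees the instruction twice
prompt = f"Translate English to SQL: {natural_question}"
get_sql(prompt)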

'<pad> SELECT Balance FROM table WHERE Education = employee</s>'
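
The `<pad>` and `</s>` markers in the output are special tokens that the tokenizer keeps when decoding. With a tokenizer at hand, passing `skip_special_tokens=True` to `tokenizer.decode` is the usual way to drop them. The sketch below is a plain-string equivalent that runs without loading a model. `strip_special_tokens` is a hypothetical helper, and its default token list is an assumption based on the output shown above.

```python
import re


def strip_special_tokens(decoded, special_tokens=("<pad>", "</s>")):
    """Remove T5-style special tokens from a decoded string and tidy spaces."""
    for token in special_tokens:
        decoded = decoded.replace(token, " ")
    # Collapse any runs of whitespace left behind by the removals.
    return re.sub(r"\s+", " ", decoded).strip()


raw = "<pad> SELECT Balance FROM table WHERE Education = employee</s>"
print(strip_special_tokens(raw))
```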

<h1 align='center'>Intermission: Q & A about prompting</h1>


<h1 align='center'>Part II: Agents and open source frameworks</h1>


<h1 align='center'>NLP pipelines and chaining</h1>


<h1 align='center'>Core ideas</h1>



<h1 align='center'>Introducing Haystack</h1>

- Call open source models
- Build NLP pipelines with their custom built tools
- Leverage their prompt templates
- Incorporate Agents
- Deploy via REST API



<h1 align='center'>Q & A about agents</h1>
