![Banner](img/AI_Special_Program_Banner.jpg)

## Introduction to LLMs - Material 6: Prompt engineering revisited
---


Finally, here is an idea: what if the LLM's ability to generate text is already good enough, so that we can achieve our goal by prompt engineering alone? Let's give our local approach another try.

---
<div style="color:red"><b>Attention:</b></div> 
Do <b>not</b> run this notebook on the university server! If you do so anyway, make sure that no one else is using the same GPU.

---

## Overview
- [Prompt engineering revisited](#Prompt-engineering-revisited)
  - [Text classification by prompt engineering](#Text-classification-by-prompt-engineering)
  - [Initial tries](#Initial-tries)
  - [Looking at the movie reviews](#Looking-at-the-movie-reviews)
- [Learning Outcomes](#Learning-Outcomes)

---

### Text classification by prompt engineering

Let's start by checking our available GPU resources again:

In [1]:
# this will only work on Linux ...
!nvidia-smi | grep MiB

|  0%   47C    P8              17W / 450W |     28MiB / 24564MiB |      0%      Default |
|    0   N/A  N/A      1228      G   /usr/lib/xorg/Xorg                           18MiB |


If there is enough space, we can proceed and import the necessary libraries ...

In [2]:
import torch
from transformers import pipeline
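# note: depending on the installed LangChain version, this class may have to be
# imported from langchain_community.llms instead
from langchain.llms import HuggingFacePipeline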

import warnings
warnings.filterwarnings('ignore')

... and instantiate the model in the form of a [transformers pipeline](https://huggingface.co/docs/transformers/pipeline_tutorial):

In [3]:
cls_pipe = pipeline(
    task="text-generation", 
    model="HuggingFaceH4/zephyr-7b-beta", 
    torch_dtype=torch.bfloat16, 
    device=0,
    do_sample=True,
    max_new_tokens=100,  
    temperature=1.)


In [5]:
cls_llm = HuggingFacePipeline(pipeline=cls_pipe)

In [6]:
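def get_prompt(system_input, user_input):
    """Build a Zephyr-style chat prompt from a system and a user message."""
    prompt = '<|system|>\n'
    prompt += f'{system_input}</s>\n'
    prompt += '<|user|>\n'
    prompt += f'{user_input}</s>\n'
    # the open assistant tag asks the model to generate its reply next
    prompt += '<|assistant|>\n'
    return prompt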
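To see exactly what the model receives, it can help to print an assembled prompt. The following is only a small sketch: it repeats `get_prompt` so that it runs on its own, and it uses the system prompt and the first opinion from the cells further down. The variable name `example_prompt` is made up for this illustration.

```python
def get_prompt(system_input, user_input):
    # same template as in the cell above
    prompt = '<|system|>\n'
    prompt += f'{system_input}</s>\n'
    prompt += '<|user|>\n'
    prompt += f'{user_input}</s>\n'
    prompt += '<|assistant|>\n'
    return prompt

# hypothetical variable name, used for illustration only
example_prompt = get_prompt(
    "Answer with only one word, either 'positive' or 'negative', "
    "depending on the sentiment of the opinion.",
    "Brock Purdy is an awesome QB!",
)
print(example_prompt)
```

The printed prompt ends with the open `<|assistant|>` tag, so the model continues the text with its answer.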

Let's try the prompt that worked for the model accessed via the InferenceAPI:

In [7]:
cls_sys_prompt = "Answer with only one word, either 'positive' or 'negative', "
cls_sys_prompt += "depending on the sentiment of the opinion."

### Initial tries

The positive one first:

In [8]:
opinion = "Brock Purdy is an awesome QB!"
print(cls_llm.invoke(get_prompt(cls_sys_prompt, opinion)))

Positive.


Ok, this looks promising. Now for a negative one:

In [9]:
opinion = "I think the Niners QB stinks!"
print(cls_llm.invoke(get_prompt(cls_sys_prompt, opinion)))

negative


Great! Now on to the movies ...

### Looking at the movie reviews

In [10]:
import pandas as pd
df = pd.read_csv('data/movie_data.csv', encoding='utf-8')
df.head()

| Unnamed: 0 | review | sentiment |
|---|---|---|
| 0 | In 1974, the teenager Martha Moxley (Maggie Gr... | 1 |
| 1 | OK... so... I really like Kris Kristofferson a... | 0 |
| 2 | \*\*\*SPOILER\*\*\* Do not read this, if you think a... | 0 |
| 3 | hi for all the people who have seen this wonde... | 1 |
| 4 | I recently bought the DVD, forgetting just how... | 0 |


In [12]:
print(df.review.iloc[0])

In 1974, the teenager Martha Moxley (Maggie Grace) moves to the high-class area of Belle Haven, Greenwich, Connecticut. On the Mischief Night, eve of Halloween, she was murdered in the backyard of her house and her murder remained unsolved. Twenty-two years later, the writer Mark Fuhrman (Christopher Meloni), who is a former LA detective that has fallen in disgrace for perjury in O.J. Simpson trial and moved to Idaho, decides to investigate the case with his partner Stephen Weeks (Andrew Mitchell) with the purpose of writing a book. The locals squirm and do not welcome them, but with the support of the retired detective Steve Carroll (Robert Forster) that was in charge of the investigation in the 70's, they discover the criminal and a net of power and money to cover the murder.<br /><br />"Murder in Greenwich" is a good TV movie, with the true story of a murder of a fifteen years old girl that was committed by a wealthy teenager whose mother was a Kennedy. The powerful and rich family 

In [13]:
opinion = df.review.iloc[0]
print(cls_llm.invoke(get_prompt(cls_sys_prompt, opinion)))

Positive: none

Negative: murder, covered-up, disgrace, lack of emotion

Based on the analysis of the sentiment of the opinion, the answer would be 'negative'.


In [14]:
df.sentiment.iloc[0]

1

So, not only do we have a misclassification (which is understandable given the reasoning provided by the LLM), but the output has also become verbose again. The approach does not seem completely useless, but it probably needs more prompt engineering or some experimentation with the generation hyperparameters (maybe even the temperature?). This is left for the interested student to try out; the presentation ends here.
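One way to cope with answers that vary in case, punctuation, and length is to post-process them before comparing with the labels. The following is a minimal sketch, not part of the original notebook: the names `parse_sentiment` and `LABELS` are hypothetical, the mapping assumes that 1 means positive and 0 means negative in the `sentiment` column, and treating the last mentioned label word as the verdict is a simple heuristic that may fail on other outputs.

```python
import re

# assumed mapping, chosen to match the 0/1 values in the sentiment column
LABELS = {"negative": 0, "positive": 1}

def parse_sentiment(llm_output):
    """Return 1 (positive), 0 (negative), or None if no label word is found."""
    # collect all whole-word mentions of either label, ignoring case
    mentions = re.findall(r"\b(positive|negative)\b", llm_output.lower())
    if not mentions:
        return None
    # heuristic (assumption): the last mention is taken as the final verdict
    return LABELS[mentions[-1]]

# the three model outputs shown in this notebook
verbose_answer = (
    "Positive: none\n\n"
    "Negative: murder, covered-up, disgrace, lack of emotion\n\n"
    "Based on the analysis of the sentiment of the opinion, "
    "the answer would be 'negative'."
)

print(parse_sentiment("Positive."))
print(parse_sentiment("negative"))
print(parse_sentiment(verbose_answer))
```

With such a helper, the prediction for the first review could be compared directly with `df.sentiment.iloc[0]`: the parsed value is 0, while the label is 1, which is the misclassification discussed above.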

---

## Learning Outcomes

The contents presented in the series of notebooks on using LangChain to work with LLMs touched on some of the most relevant aspects of this task. Having worked through the notebooks, you should now
* realize that LLMs are made for generating text, but can also be applied to other tasks such as text classification
* have an idea about the consequences of *knowledge cutoff*
* know that there are proprietary as well as open-source LLMs available, which may be used via an API or (in the case of open-source LLMs) locally
* be aware of the opportunity to experiment with models via HuggingFace's InferenceAPI
* have a firm grip on the concepts of *prompt engineering*, *memory*, and *context*
* know how to employ *vector databases* for *Retrieval-Augmented Generation* (RAG)
* be aware of the option of *fine-tuning* an LLM to a specific task

---