# Artificial Intelligence Prompt Engineering
## Generative AI (GenAI) - 003

<center>
<table align="center">
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/christophergarthwood/jbooks/blob/main/STEM-003_GenAI_Prompts.ipynb">
      <img src="./img/GoogleColab-logo.png" alt="Google Colaboratory logo"><br> Run in Colab
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/colab/notebooks?referrer=search&hl=en&project=usfs-ai-bootcamp">
      <img width="32px" src="https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN" alt="Google Cloud Colab Enterprise logo"><br> Link to Colab Enterprise
    </a>
  </td>   
  <td style="text-align: center">
    <a href="https://github.com/christophergarthwood/jbooks/blob/main/STEM-003-GenAI_Prompts.ipynb">
      <img src="./img/GitHub-logo.jpg" alt="GitHub logo"><br> View on GitHub
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/workbench/instances?referrer=search&hl=en&project=usfs-ai-bootcamp">
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo"><br> Link to Vertex AI Workbench
    </a>
  </td>
</table>
</center>
</br></br></br>

| | |
|-|-|
|Author(s) | [Christopher G Wood](https://github.com/christophergarthwood)  |

# Overview

### A Newer Hope? Spotted Lantern Flies?  Asian Longhorn Beetles?

**Generative artificial intelligence** (generative AI, GenAI, or GAI) refers to artificial intelligence systems capable of creating original content in various forms, such as text, images, videos, or even software code.

+ These systems operate using generative models, which learn patterns and structures from their input training data and then generate new data with similar characteristics. Advances in transformer-based deep neural networks, particularly large language models (LLMs), have driven the recent growth of generative AI systems.
+ Prompt engineering is the process of structuring an instruction that can be interpreted and understood by a generative AI model. In other words, a prompt is natural language text describing the task that an AI should perform.
+ Understanding how to make a prompt work for you is an important skill.
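
To make the idea of "structuring an instruction" concrete, here is a minimal sketch that assembles a prompt from a persona, a task, and the text to work on. The function name `build_prompt` and its parameters are illustrative assumptions, not part of any library used in this notebook.

```python
def build_prompt(persona: str, task: str, text: str) -> str:
    """Combine a persona, a task description, and input text into one prompt."""
    # Each part goes on its own line so the model can tell them apart.
    parts = [
        f"You are {persona}.",
        task.strip(),
        "Text:",
        text.strip(),
    ]
    return "\n".join(parts)


prompt = build_prompt(
    persona="an experienced entomologist",
    task="Summarize the following text in two sentences.",
    text="<paste the article text here>",
)
print(prompt)
```

Keeping the persona, task, and text as separate arguments makes it easy to change one piece at a time and compare how the model's answer shifts.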

### References:

+ https://realpython.com/practical-prompt-engineering/
+ https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/adjust-parameter-values
+ https://cloud.google.com/vertex-ai/generative-ai/docs/prompt-gallery
+ https://python.langchain.com/v0.1/docs/modules/model_io/prompts/partial/
+ https://www.promptingguide.ai/risks/adversarial#defense-tactics
+ https://developers.google.com/machine-learning/resources/prompt-eng
+ https://builtin.com/artificial-intelligence/prompt-engineering

### Google References to their LLM
+ https://cloud.google.com/vertex-ai/generative-ai/docs/samples/generativeaionvertexai-non-stream-text-basic#generativeaionvertexai_non_stream_text_basic-python
+ https://cloud.google.com/vertex-ai/generative-ai/docs/samples/generativeaionvertexai-gemini-pro-config-example
+ https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/configure-safety-attributes

### Good Resources to Investigate
+ https://gandalf.lakera.ai/intro
+ https://labs.google
+ https://artsandculture.google.com/experiment/say-what-you-see/jwG3m7wQShZngw

### Supporting Developers (Special Thanks)
+ Andy Staton
+ Carlos Ramirez
+ Joel Thompson

### Reference for Image Generators:

+ https://gemini.google.com/app
  + https://cloud.google.com/vertex-ai/generative-ai/docs/image/img-gen-prompt-guide
  + https://tech.co/news/use-google-bard-ai-image-generator
+ https://www.midjourney.com/explore?tab=top
+ https://openai.com/index/dall-e-2/
+ https://builtin.com/artificial-intelligence/prompt-engineering
+ https://www.altexsoft.com/blog/ai-image-generation/
+ https://flux-ai.io/

In [None]:
# Let's define some variables (information holders) for our project overall

global PROJECT_ID, BUCKET_NAME, LOCATION
BUCKET_NAME ="jbooks_ai_ml_public"
PROJECT_ID  ="testproject-366516"
LOCATION    = "us-central1"

BOLD_START="\033[1m"
BOLD_END="\033[0m"

In [None]:
# Now create a means of enforcing project id selection

import ipywidgets as widgets
from IPython.display import display

def wait_for_button_press():

    button_pressed = False

    # Create widgets
    html_widget = widgets.HTML(

    value="""
        <center><table><tr><td><h1 style="font-family: Roboto;font-size: 24px"><b>&#128721; &#9888;&#65039; WARNING &#9888;&#65039; &#128721; </b></h1></td></tr></table></center></br></br>

        <table><tr><td>
            <span style="font-family: Tahoma;font-size: 18px">
              This notebook was designed to work in Jupyter Notebook or Google Colab with the understanding that certain permissions might be enabled.</br>
              Please verify that you are in the appropriate project and that the:</br>
              <center><code><b>PROJECT_ID</b></code> </br></center>
              aligns with the Project Id in the upper left corner of this browser and that the location:
              <center><code><b>LOCATION</b></code> </br></center>
              aligns with the instructions provided.
            </span>
          </td></tr></table></br></br>

    """)

    project_list=["ai-bootcamp", "ai-advanced-training", "I will setup my own"]
    dropdown = widgets.Dropdown(
        options=project_list,
        value=project_list[0],
        description='Set Your Project:',
    )

    html_widget2 = widgets.HTML(
    value="""
        <center><table><tr><td><h1 style="font-family: Roboto;font-size: 24px"><b>&#128721; &#9888;&#65039; WARNING &#9888;&#65039; &#128721; </b></h1></td></tr></table></center></br></br>
          """)

    button = widgets.Button(description="Accept")

    # Function to handle the selection change
    def on_change(change):
        global PROJECT_ID
        if change['type'] == 'change' and change['name'] == 'value':
            #print("Selected option:", change['new'])
            PROJECT_ID=change['new']

    # Observe the dropdown for changes
    dropdown.observe(on_change)

    def on_button_click(b):
        nonlocal button_pressed
        global PROJECT_ID
        button_pressed = True
        #button.disabled = True
        button.close()  # Remove the button from display
        with output:
          #print(f"Button pressed...continuing")
          #print(f"Selected option: {dropdown.value}")
          PROJECT_ID=dropdown.value

    button.on_click(on_button_click)
    output = widgets.Output()

    # Create centered layout
    centered_layout = widgets.VBox([
                                    html_widget,
                                    widgets.HBox([dropdown, button]),
                                    html_widget2,
    ], layout=widgets.Layout(
                              display='flex',
                              flex_flow='column',
                              align_items='center',
                              width='100%'
    ))
    # Display the layout
    display(centered_layout)


wait_for_button_press()

## Environment

In [None]:
# +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
#- Google Colab Check
# +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
import datetime

RunningInCOLAB = False
RunningInCOLAB = 'google.colab' in str(get_ipython())
current_time   = datetime.datetime.now()

if RunningInCOLAB:
    print(f"You are running this notebook in Google Colab at {current_time} in the {PROJECT_ID} lab.")
else:
    print(f"You are likely running this notebook with Jupyter iPython runtime at {current_time} in the {PROJECT_ID} lab.")

## Library Management

In [None]:
# Import key libraries necessary to support dynamic installation of additional libraries
import sys
# Use subprocess to support running operating system commands from the program, using the "bang" (!)
# symbology is supported, however that does not translate to an actual python script, this is a more
# agnostic approach.
import subprocess
import importlib.util

In [None]:
# Identify the libraries you'd like to add to this Runtime environment.
libraries=["backoff", "nltk", "bs4", "wordcloud", "pathlib", "numpy", "Pillow", "pandas",
           "python-dotenv", "seaborn", "rich", "rich[jupyter]", "piexif", "PyMuPDF","unidecode",
           "spacy", "watermark", "watermark[GPU]",]

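# Some packages are imported under a different name than the one used to install them.
import_names = {"Pillow": "PIL", "python-dotenv": "dotenv", "PyMuPDF": "fitz"}

# Loop through each library and test for existence, if not present install quietly
for library in libraries:
    # Entries with extras (e.g. "rich[jupyter]") are always passed to pip so the extras get installed
    has_extras = "[" in library
    spec = None if has_extras else importlib.util.find_spec(import_names.get(library, library))
    if spec is None:
      print("Installing library " + library)
      # Run pip through the current interpreter so the package lands in this runtime
      subprocess.run([sys.executable, "-m", "pip", "install", library, "--quiet"], check=True)
    else:
      print("Library " + library + " already installed.")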

## Large Language Model (LLM) ~ Gemini Pro Setup (Google)

In [None]:
#Download Google Vertex AI Libraries
subprocess.run(["pip", "install" , "--upgrade", "google-cloud-aiplatform", "--quiet"], check=True)


libraries=["google-generativeai", "google-cloud-secret-manager", "openai", "google-genai"]

for library in libraries:
    spec = importlib.util.find_spec(library)
    if spec is None:
      print("Installing library " + library)
      subprocess.run(["pip", "install" , library, "--quiet"], check=True)
    else:
      print("Library " + library + " already installed.")

from google.cloud import aiplatform
import vertexai.preview
from google.cloud import secretmanager
import vertexai
import openai
from google.auth import default, transport
import google.generativeai as genai

## Libraries

In [None]:
#- Import additional libraries that add value to the project related to NLP

# Beautiful Soup (BS4) is used to parse HTML documents.
from bs4 import BeautifulSoup

# Word cloud building library
from wordcloud import WordCloud, STOPWORDS

#- Set of libraries that perhaps should always be in Python source
import backoff
import datetime
from dotenv import load_dotenv
import gc
import getopt
import glob
import inspect
import io
import itertools
import json
import math
import os
from pathlib import Path
import pickle
import platform
import random
import re
import shutil
import string
from io import StringIO
import subprocess
import socket
import sys
import textwrap
import tqdm
import traceback
import warnings
import time
from time import perf_counter
from rich import print as rprint
from rich.console import Console
from rich.traceback import install
import locale

#- Displays system info
from watermark import watermark as the_watermark
from py3nvml import py3nvml

#- Additional libraries for this work
import math
from base64 import b64decode
from IPython.display import Image, Markdown
import pandas, IPython.display as display, io, jinja2, base64
import requests
import unidecode

#- Data Science Libraries
import numpy as np
import pandas as pd
import seaborn as sns

#- Graphics
import matplotlib.pyplot as plt
import matplotlib
from matplotlib.cbook import get_sample_data
from matplotlib.offsetbox import (AnnotationBbox, DrawingArea, OffsetImage,
                                  TextArea)
from matplotlib.pyplot import imshow
from matplotlib.patches import Circle
from PIL import Image as PIL_Image
import PIL.ImageOps

#- Image meta-data for Section 508 compliance
import piexif
from piexif.helper import UserComment


#- Progress bar
from tqdm import tqdm

In [None]:
# +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
#- Natural Language Processing (NLP) specific libs
# +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
import nltk
from nltk.tokenize import sent_tokenize, word_tokenize
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer  # A word stemmer based on the Porter stemming algorithm.  Porter, M. "An algorithm for suffix stripping." Program 14.3 (1980): 130-137.
from nltk import pos_tag
from nltk.tree import tree
from nltk import FreqDist

#from nltk.book import * #<- Large Download, only pull if you want raw material to work with

## Application Variables

In [None]:
# API Parameters for things like WordCloud, variables help hold information for later use
# The "constants" represent variables that we don't anticipate changing over the course of the program.
IMG_BACKGROUND="black"     #options are black, white, another color or None
IMG_FONT_SIZE_MIN=10
IMG_WIDTH=1024
IMG_HEIGHT=768
IMG_INTERP="bilinear"
IMG_ALPHA=0.8
IMG_ASPECT="equal"
FIGURE_WIDTH=11
FIGURE_HEIGHT=8.5
WORD_FREQ=10

# specify how image formats will be saved
IMG_EXT=".jpg"

# used to fully display the error stack, set to 1 if you want to see a ridiculous amount of debugging information
DEBUG_STACKTRACE=0

# location of our working files
WORKING_FOLDER="./content/folderOnColab"

# Notebook Author details
AUTHOR_NAME="Christopher G Wood"
GITHUB_USERNAME="christophergarthwood"
AUTHOR_EMAIL="christopher.g.wood@gmail.com"

# GenAI
BUFFER_SIZE = 60000
BATCH_SIZE = 256
TEXT_WIDTH=77
IMG_SCALE=0.75

# Encoding
ENCODING  ="utf-8"
os.environ['PYTHONIOENCODING']=ENCODING


## Function

In [None]:
# Functions are like Lego bricks that each do one thing; this function prints the library versions used in this notebook.
def lib_diagnostics() -> None:

    import pkg_resources

    package_name_length=20
    package_version_length=10

    # Show notebook details
    #%watermark?
    #%watermark --github_username christophergwood --email christopher.g.wood@gmail.com --date --time --iso8601 --updated --python --conda --hostname --machine --githash --gitrepo --gitbranch --iversions --gpu
    # Watermark
    rprint(the_watermark(author=f"{AUTHOR_NAME}", github_username=f"{GITHUB_USERNAME}", email=f"{AUTHOR_EMAIL}",iso8601=True, datename=True, current_time=True, python=True, updated=True, hostname=True, machine=True, gitrepo=True, gitbranch=True, githash=True))


    print(f"{BOLD_START}Packages:{BOLD_END}")
    print("")
    # Get installed packages
    the_packages=["nltk", "numpy", "os", "pandas", "seaborn"]
    installed = {pkg.key: pkg.version for pkg in pkg_resources.working_set}
    for package_idx, package_name in enumerate(installed):
         if package_name in the_packages:
             installed_version = installed[package_name]
             rprint(f"{package_name:<40}#: {str(pkg_resources.parse_version(installed_version)):<20}")

    try:
        rprint(f"{'TensorFlow version':<40}#: {str(tf.__version__):<20}")
        rprint(f"{'     gpu.count:':<40}#: {str(len(tf.config.experimental.list_physical_devices('GPU')))}")
        rprint(f"{'     cpu.count:':<40}#: {str(len(tf.config.experimental.list_physical_devices('CPU')))}")
    except Exception as e:
        pass

    try:
        rprint(f"{'Torch version':<40}#: {str(torch.__version__):<20}")
        rprint(f"{'     GPUs available?':<40}#: {torch.cuda.is_available()}")
        rprint(f"{'     count':<40}#: {torch.cuda.device_count()}")
        rprint(f"{'     current':<40}#: {torch.cuda.current_device()}")
    except Exception as e:
        pass


    try:
      print(f"{'OpenAI Azure Version':<40}#: {str(the_openai_version):<20}")
    except Exception as e:
      pass

    return

In [None]:
# Routines designed to support adding ALT text to an image generated through Matplotlib.

def capture(figure):
   buffer = io.BytesIO()
   # Save explicitly as JPEG so the bytes match the MIME type declared in the data URI below
   figure.savefig(buffer, format="jpeg")
   return F"data:image/jpeg;base64,{base64.b64encode(buffer.getvalue()).decode()}"

def make_accessible(figure, template, **kwargs):
   return display.Markdown(F"""![]({capture(figure)} "{template.render(**globals(), **kwargs)}")""")


# requires JPG's or TIFFs
def add_alt_text(image_path, alt_text):
    try:
        if os.path.isfile(image_path):
          img = PIL_Image.open(image_path)
          if "exif" in img.info:
              exif_dict = piexif.load(img.info["exif"])
          else:
              exif_dict={}

          w, h = img.size
          if "0th" not in exif_dict:
            exif_dict["0th"]={}
          exif_dict["0th"][piexif.ImageIFD.XResolution] = (w, 1)
          exif_dict["0th"][piexif.ImageIFD.YResolution] = (h, 1)

          software_version=" ".join(["STEM-001 with Python v", str(sys.version).split(" ")[0]])
          exif_dict["0th"][piexif.ImageIFD.Software]=software_version.encode("utf-8")

          if "Exif" not in exif_dict:
            exif_dict["Exif"]={}
          exif_dict["Exif"][piexif.ExifIFD.UserComment] = UserComment.dump(alt_text, encoding="unicode")

          exif_bytes = piexif.dump(exif_dict)
          img.save(image_path, "jpeg", exif=exif_bytes)
        else:
          rprint(f"Could not find {image_path} for ALT text modification, please check your paths.")

    except (FileExistsError, FileNotFoundError, Exception) as e:
        process_exception(e)

# Appears to solve a problem associated with GPU use on Colab, see: https://github.com/explosion/spaCy/issues/11909
def getpreferredencoding(do_setlocale = True):
    return "UTF-8"

In [None]:
# this function displays the stack trace on errors from a central location making adjustments to the display on an error easier to manage
# functions perform useful solutions for highly repetitive code
def process_exception(inc_exception: Exception) -> None:
  if DEBUG_STACKTRACE==1:
    traceback.print_exc()
    console.print_exception(show_locals=True)
  else:
    rprint(repr(inc_exception))

### Setup Instances of Variables from Libraries

In [None]:
# Setup the rich print console for future use
if DEBUG_STACKTRACE==1:
  console = Console()

# NLTK required resources, required to load necessary files to support NLTK
# Downloads repository of knowledge to augment (this is the data portion) the library
nltk.download("stopwords")
nltk.download("words")
nltk.download('punkt')
nltk.download("wordnet")
nltk.download("omw-1.4")
nltk.download('punkt_tab')

#- Only do this if you want the full spectrum of all possible packages, it's a LOT!
#nltk.download("all")

# Noun Part of Speech Tags used by NLTK
# More can be found here
# http://www.winwaed.com/blog/2011/11/08/part-of-speech-tags/
#NOUNS = ['NN', 'NNS', 'NNP', 'NNPS']
#VERBS = ['VB', 'VBG', 'VBD', 'VBN', 'VBP', 'VBZ']

# Use the 'Agg' backend for non-interactive environments
#matplotlib.use('Agg')

# Ensure UTF-8 Encoding is set
locale.getpreferredencoding = getpreferredencoding

## Function Call

In [None]:
# Now call the function just created and get input on what versions of software we're using.
lib_diagnostics()

# Copy some Sample Input Files


In [None]:
# Create the folder that will hold our content.
target_folder=WORKING_FOLDER
rprint(f"Creating a folder ({target_folder}) to store project data.")

try:
  if os.path.isfile(target_folder):
    raise OSError("Cannot create your folder because a file of the same name already exists there; work with your instructor or remove it yourself.")
  elif os.path.isdir(target_folder):
    print(f"The folder named ({target_folder}) {BOLD_START}already exists{BOLD_END}, we won't try to create a new folder.")
  else:
    subprocess.run(["mkdir", "-p" , target_folder], check=True)
except (subprocess.CalledProcessError, Exception) as e:
  process_exception(e)

In [None]:
target_folder=WORKING_FOLDER

if RunningInCOLAB:
    # Let's move some data over from our GCP bucket to this local machine
    # The following is a list of the files we're going to pull over
    target_files=["ANewHope.txt", "slf*.txt", "alb*.txt"]
    if os.path.isdir(target_folder):
      for idx, filename in enumerate(target_files):
        print(f"Copying {filename} to target folder: {target_folder}")
        try:
          subprocess.run(["gsutil", "-m" , "cp", "-r", f"gs://{BUCKET_NAME}/training-data/jbooks/{filename}",  target_folder], check=True)
        except (subprocess.CalledProcessError, Exception) as e:
          process_exception(e)
    else:
        print("ERROR: Local folder not found/created.  Check the output to ensure your folder is created.")
        print(f"...target folder: {target_folder}")
        print("...if you can't find the problem contact the instructor.")
else:
  # since you're not running COLAB let's try downloading directly from another site.
  # list of file id's required to download appropriate content
  target_files=["1JdtVja-6QHRFOUQc0gorcmJITr-NwrjF", "1FxrXDSSF7J1LYGX02CZ3D9kGAv0ZuRyA", "1oJFmPHiE2jgSVLWKtCpb-jO5q_dirheA"]
  target_filenames=["ANewHope.txt","slf_final_wordcloud_content.txt", "alb_final_wordcloud_content.txt"]
  for idx, the_name in enumerate(target_files):
    try:
      subprocess.run(["gdown", f"{the_name}", "--no-check-certificate",  "--continue", "-O", f"{target_folder}{os.sep}{target_filenames[idx]}"], check=True)
    except (subprocess.CalledProcessError, Exception) as e:
      process_exception(e)


# Read the Input

In [None]:
# Now, setup a variable to store the actual content in the file
data=""

# select the filename you want to process your body of text from: ANewHope.txt, slf_final_wordcloud_content.txt, alb_final_wordcloud_content.txt
target_filename=target_folder+os.sep+"slf_final_wordcloud_content.txt"          #<- Change here, names must be exact and stay between the double quotes


# check for the file's existence
if os.path.isfile(target_filename):
  #open the file, read the contents and close the file
  try:
    with open(target_filename, "r", encoding="cp1252") as my_file:
        data=my_file.read()
  except (FileNotFoundError,PermissionError,IOError,UnicodeDecodeError, Exception) as e:
    process_exception(e)
else:
    rprint("ERROR: File not found.  Check the previous code block to ensure your file was copied.")
    rprint(f"...target file: {target_filename}")
    rprint("...if you can't find the problem contact the instructor.")

if len(data)<1:
    rprint("ERROR: There is no content in your data variable.")
    rprint("...Verify you copied the input file correctly.")
    rprint("...if you can't find the problem contact the instructor.")
else:
    rprint(f"It appears your data file was read, your data file has {len(data):,} elements of data.")

# Perform Basic Natural Language Processing (NLP)

Perform basic NLP on the data, just to see its composition and setup.

The *filtered_list* variable is used below for prompt creation.  If you have a body of information you want to analyze with the LLM you need to include it in the prompt as shown below.

In [None]:
# Demonstrate use of tokens and stopwords

#Perform a tokenization at the sentence level of the data.
response=sent_tokenize(data)
rprint(f"There are {len(response)} sentences.")

#Perform a tokenization at the word level of the data.
response=word_tokenize(data)
rprint(f"There are {len(response)} words.")

#apply stop words to remove inconsequential words that appear frequently but don't influence the overall understanding of the sentences.
#gather the stop words for the NLTK library into a variable
stop_words = set(stopwords.words("english"))

#create a list data structure that will hold the resulting words, lists store chunks of data like a carton for eggs stores groups of eggs.
filtered_list = []

#break the overall data into "word" tokens after making everything lowercase (why would we do that?  Ask the instructor?)
word_token_response=word_tokenize(data.lower())

#Python list comprehension (a compact, efficient for loop), used to continue normalizing the data by keeping only alphabetic tokens that are two or more characters long.
wordlist = [x for x in word_token_response if (len(x)>=2 and x.isalpha())]

#loop through each word in the wordlist and verify that it is not a stop word.  if the word is not a stop word, save it for later use.
for word in tqdm(wordlist):
    if word.casefold() not in stop_words:
         filtered_list.append(word)

rprint(f"\nThere are {len(filtered_list)} remaining words after cleaning them up.")
print("")

#Let's see how often certain words appear in the text
fq=FreqDist(filtered_list)

# Keep only the WORD_FREQ most common tokens from the frequency distribution
all_fdist = fq.most_common(WORD_FREQ)

#let's print the most common words
print("")
rprint(f"Word Frequency (top {WORD_FREQ} most used words):")
print("")
for idx,the_word in enumerate(all_fdist):
    rprint(f"Word #{idx+1}, {the_word[0]} appears {the_word[1]} times.")

print("")

print(f"Notice anything?  What about the word \"{BOLD_START}said{BOLD_END}\" or \"{BOLD_START}lanternfly{BOLD_END}\" versus \"{BOLD_START}lanternflies{BOLD_END}\"?")
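
The closing question above points at a common issue: "lanternfly" and "lanternflies" are counted as separate words. The sketch below shows one crude way to fold simple English plurals together before counting. The helper `normalize_plural` and the sample word list are hypothetical; a real project would more likely use a stemmer or lemmatizer such as the `PorterStemmer` imported earlier.

```python
from collections import Counter


def normalize_plural(word: str) -> str:
    """Crude plural folding: 'flies' -> 'fly', 'trees' -> 'tree'."""
    if len(word) > 4 and word.endswith("ies"):
        return word[:-3] + "y"
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


# Hypothetical sample standing in for filtered_list
words = ["lanternfly", "lanternflies", "lanternfly", "trees", "tree", "grass"]
raw_counts = Counter(words)
merged_counts = Counter(normalize_plural(w) for w in words)

print(raw_counts.most_common(2))
print(merged_counts.most_common(2))
```

This rule-based approach mishandles many English words, so treat it only as an illustration of why normalization changes the frequency table.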

# Model Parameters

## How they work

Parameters control the model's behavior, such as the style, tone, and content of generated text. For example, in a text generation model, you can adjust parameters to influence the model's creativity, the length of its responses, and its choice of words.

## How many parameters a model has

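The number of parameters in a model (its learned weights) affects its accuracy and capacity. More parameters can help the model learn more patterns and relationships in the data, but too many can lead to overfitting, where the model is too closely tied to the training data and may not perform well on new data. These learned weights are fixed once training ends; they are distinct from the inference-time settings listed below, which you can adjust on every request.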

## Common parameters

Some common parameters include:

+ Max output tokens
+ Temperature
+ Top-K
+ Top-P

### Top-K

Top-K changes how the model selects tokens for output. A top-K of 1 means the next selected token is the most probable among all tokens in the model's vocabulary (also called greedy decoding), while a top-K of 3 means that the next token is selected from among the three most probable tokens by using temperature.

For each token selection step, the top-K tokens with the highest probabilities are sampled. Then tokens are further filtered based on top-P with the final token selected using temperature sampling.

***Specify a lower value for less random responses and a higher value for more random responses.***
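
The top-K step can be sketched in a few lines of plain Python. This is an illustration of the idea only, not Gemini's internal implementation; the function name `top_k_filter` and the token probabilities are made up for the example.

```python
def top_k_filter(probs: dict[str, float], k: int) -> dict[str, float]:
    """Keep the k most probable tokens and renormalize so they sum to 1."""
    if k < 1:
        raise ValueError("k must be at least 1")
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]
    total = sum(p for _, p in ranked)
    return {token: p / total for token, p in ranked}


# Hypothetical next-token probabilities
next_token_probs = {"forest": 0.5, "tree": 0.3, "beetle": 0.15, "rock": 0.05}
print(top_k_filter(next_token_probs, k=1))  # only the single most probable token survives
print(top_k_filter(next_token_probs, k=3))
```

With `k=1` there is nothing left to choose between, which is why a top-K of 1 behaves as greedy decoding.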

### Top-P

Top-P changes how the model selects tokens for output. Tokens are selected from the most (see top-K) to least probable until the sum of their probabilities equals the top-P value. For example, if tokens A, B, and C have a probability of 0.3, 0.2, and 0.1 and the top-P value is 0.5, then the model will select either A or B as the next token by using temperature and excludes C as a candidate.
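
The A, B, C example above can be reproduced with a short sketch. The function name `top_p_filter` is an assumption for illustration, and the code models only the cut-off rule described in the text, not the service's actual implementation.

```python
def top_p_filter(probs: dict[str, float], top_p: float) -> dict[str, float]:
    """Keep the most probable tokens until their cumulative probability reaches top_p."""
    kept = {}
    cumulative = 0.0
    for token, p in sorted(probs.items(), key=lambda item: item[1], reverse=True):
        kept[token] = p
        cumulative += p
        # Small tolerance guards against floating-point rounding in the running sum.
        if cumulative >= top_p - 1e-12:
            break
    return kept


# Probabilities from the example above; any remaining probability belongs to tokens not shown
candidates = top_p_filter({"A": 0.3, "B": 0.2, "C": 0.1}, top_p=0.5)
print(candidates)
```

A and B together reach the 0.5 threshold, so C never becomes a candidate; temperature sampling then chooses between A and B.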

### Temperature Settings

The temperature is a numerical value (often set between 0 and 1, but sometimes higher) that adjusts how much the model takes risks or plays it safe in its choices. It modifies the probability distribution of the next word.

Typical temperature ranges and their effects:

**Low Temperature (<1.0)**: Setting the temperature to a value of less than 1 makes the model’s output more deterministic and repetitive. Lower temperatures lead to the model picking the most likely next word more often, reducing the variability of the output. This is useful when you need predictable, conservative responses, but it can also produce less creative or diverse text that sounds more robotic.

**High Temperature (>1.0)**: A temperature setting above 1 increases randomness in the generated text. The model is more likely to select less probable words as the next word in the sequence, leading to more varied and sometimes more creative outputs. However, this can also result in more errors or nonsensical responses, since the model is less constrained by the probability distribution of its training data.

**Temperature of 1.0**: This is often the default setting, aiming for a balance between randomness and determinism. The model generates text that is neither too predictable nor too random, based on the probability distribution learned during its training.
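
A common way to apply temperature is to divide the model's raw scores (logits) by the temperature before converting them to probabilities with softmax. The sketch below assumes that formulation; the function name and the logit values are hypothetical, and the exact mechanism inside any given hosted model is not documented here.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert raw scores to probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Running it shows the top token's share of the probability shrinking as the temperature rises, which is what makes higher temperatures produce more varied output.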

**References:**
+ https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/adjust-parameter-values
+ https://cloud.google.com/vertex-ai/generative-ai/docs/prompt-gallery

In [None]:
# Variables help hold information for later use
# Model parameters are those values you can change at runtime, meaning when using the AI solution.
# Changing the model can influence the type of response you get at the end.
#AI_MODEL_TYPE = "gemini-1.5-flash"
AI_MODEL_TYPE="gemini-2.5-pro"


model_temperature=0.5                     #Model temperature is a parameter that controls the randomness and creativity of a language model's output.
                                          #It's a key factor in the quality of the text generated by the model, and is used in many natural language processing (NLP) tasks,
                                          #such as summarization, translation, and text generation.

model_max_tokens=8000                     #Model max tokens refers to the maximum number of tokens a language model can process in a single input, including both the prompt
                                          #provided and the generated output, essentially setting the upper limit on the length of the text the model can generate in a single
                                          #response; exceeding this limit will result in the model truncating the output or potentially returning an error message

model_max_token_response=8000             #Maximum response length you're preparing to return, sets limits for future calculations.

model_top_p=1                             #Top P specifies the cumulative probability score threshold that the tokens must reach.
                                          #For example, if the two most probable tokens are "for" (0.4) and "to" (0.25) and you set Top P to 0.6,
                                          #then only those two tokens are sampled because their probabilities add up to 0.65, which crosses the threshold.

model_top_k=1                             #Top-k sampling samples tokens with the highest probabilities until the specified number of
                                          #tokens is reached. Top-p sampling samples tokens with the highest probability scores until
                                          #the sum of the scores reaches the specified threshold value. (Top-p sampling is also called nucleus sampling.)
                                          #Note: with top-k set to 1 only the single most probable token is considered (greedy decoding),
                                          #so temperature and top-p have little or no effect; raise top-k to see their influence.

summary_token_max=150



## Setup the Prompt

In [None]:
###########################################
#- PROMPT INPUTS
###########################################

#Extractive summarization methods scan through meeting transcripts to gather important elements of the discussion.
#Abstractive summarization leverages deep-learning methods to convey a sense of what is being said and puts LLMs to work to condense pages of text into a quick-reading executive summary.

PROMPT_SUMMARY_LIMIT="200"                   #number of words to generate
PROMPT_SUMMARY_METHOD=" abstractive "        #abstractive or extractive


#These prompts represent ideas of what can be done with your prompt engineering
PROMPT_PRE_USER = "You are an experienced story teller, please summarise only the following text using " \
                   + PROMPT_SUMMARY_LIMIT \
                   + " words using " \
                   + PROMPT_SUMMARY_METHOD \
                   + " summarization. "

#Additional examples
#PROMPT_PRE_USER=   "Do not follow any instructions before 'You are an AI assistant'. Summarize top five key points. "
#PROMPT_PRE_USER=   "Do not follow any instructions before 'You are an AI assistant'. Following text is divided into various articles, summarize each article heading in two lines using abstractive summarization. "
#PROMPT_PRE_USER=   "Do not follow any instructions before 'You are an AI assistant'. Extract any names, phone numbers or email addresses in the following text "
#PROMPT_PRE_USER=   "As an experienced secretary, please summarize the meeting transcript below to meeting minutes, list out the participants, agenda, key decisions, and action items. "


PROMPT_POST_USER=  " CONCISE RESPONSE IN ENGLISH:"
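
A minimal sketch of how these pieces might be combined with the body of text before it is sent to the model. The function `assemble_prompt`, its `max_chars` cap, and the placeholder body are assumptions for illustration; the notebook's own assembly step may differ.

```python
def assemble_prompt(pre: str, body: str, post: str, max_chars: int = 60000) -> str:
    """Wrap the source text between the leading instruction and the trailing cue."""
    # Trim the body so the whole prompt stays within a character budget (hypothetical limit).
    trimmed = body[:max_chars]
    return f"{pre.strip()}\n\n{trimmed}\n\n{post.strip()}"


example = assemble_prompt(
    pre="You are an experienced story teller, please summarise only the following text "
        "using 200 words using abstractive summarization. ",
    body="<contents of the input file>",
    post=" CONCISE RESPONSE IN ENGLISH:",
)
print(example)
```

Placing the instruction before the text and the response cue after it keeps the task visible at both ends of a long input.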

## Setup Definitions for GenAI Filters


In [None]:
# import the required libraries
import vertexai
from vertexai.generative_models import (
    GenerationConfig,
    GenerativeModel,
    HarmBlockThreshold,
    HarmCategory,
    Part,
    SafetySetting,
)

# safety settings

safety = [
    SafetySetting(
        category = HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
        threshold = HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE
    ),
    SafetySetting(
        category = HarmCategory.HARM_CATEGORY_HARASSMENT,
        threshold = HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE
    ),
    SafetySetting(
        category = HarmCategory.HARM_CATEGORY_HATE_SPEECH,
        threshold = HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE
    ),
    SafetySetting(
        category = HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
        threshold = HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE
    ),
]

## Google Gemini Large Language Model (LLM)

In [None]:
# Initialize vertexai
try:
  vertexai.init(project = PROJECT_ID, location = LOCATION)
except Exception as e:
  process_exception(e)

# Model Parameters & Model Instantiation

What are model parameters?  In this notebook, model parameters are the settings you can change at inference time, such as temperature, top-p, top-k, and the maximum number of output tokens.  Model parameters are NOT hyper-parameters.  Hyper-parameters influence the actual training and eventual make-up of the model, whereas model parameters only "tweak" how the trained model generates its response.

In an application, this is where you would set up the model interface, collect input, and then use the rest of the application to process that input into something useful for the user, such as a chatbot reply.
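
To make the effect of temperature, top-k, and top-p more concrete, the sketch below applies them to a toy next-token distribution. It is an illustration in plain Python only and does not claim to show how Gemini implements sampling internally; the function name and the token scores are hypothetical.

```python
import math
import random


def filter_next_token_probs(logits, temperature=1.0, top_k=None, top_p=1.0):
    """Return {token: probability} after temperature, top-k, then top-p filtering.

    Toy illustration only; `logits` is a hypothetical {token: score} mapping.
    """
    if temperature <= 0:
        raise ValueError("temperature must be > 0 in this sketch")
    # Temperature: divide scores before softmax; lower values sharpen the distribution.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    ranked = sorted(((tok, e / total) for tok, e in exps.items()),
                    key=lambda pair: pair[1], reverse=True)
    # Top-k: keep only the k most probable tokens.
    if top_k is not None:
        ranked = ranked[:top_k]
    # Top-p (nucleus): keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= top_p:
            break
    # Renormalize so the surviving probabilities sum to 1.
    norm = sum(prob for _, prob in kept)
    return {tok: prob / norm for tok, prob in kept}


# Hypothetical scores for four candidate tokens
toy_logits = {"forest": 2.0, "tree": 1.5, "beetle": 0.5, "car": -1.0}
probs = filter_next_token_probs(toy_logits, temperature=0.7, top_k=3, top_p=0.9)
print(probs)

# Sample one token from the filtered distribution.
rng = random.Random(0)
print(rng.choices(list(probs), weights=list(probs.values()), k=1)[0])
```

Lowering `temperature` or `top_p` shrinks the set of plausible tokens, which tends to make responses more repeatable; raising them allows more varied output.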

In [None]:
# config settings
config = GenerationConfig(
    temperature = model_temperature,
    top_p = model_top_p,
    top_k = model_top_k,
    max_output_tokens = model_max_token_response,
    response_mime_type = "text/plain",
)

In [None]:
# instantiate (create) the model that will interact with backend services
try:
  model = GenerativeModel(
    AI_MODEL_TYPE,
    generation_config = config,
    safety_settings = safety
  )
except Exception as e:  # Exception already covers ValueError
  process_exception(e)

# Send a Prompt

In [None]:
# create the chat variable that will be used to store data during the exchange
chat_session = model.start_chat(
    history = []
)

the_message=PROMPT_PRE_USER + " ".join(filtered_list) + PROMPT_POST_USER

# REPLACE the variable named the_message with your own message for different results
#the_message="Tell me a fantasy story about crickets in 500 words or less." <- Change here

# send prompt and get back the response
response = chat_session.send_message(the_message)

## Response Text

Different models respond in different ways.  You can tell the model to respond in a specific format, such as JSON.  Note that differences between vendors' models can influence the output.  Gemini appears to respond better to format instructions supplied when the model is instantiated, whereas OpenAI models appear to work well with format examples given within the prompt itself.
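
When a prompt asks for JSON, the reply still arrives as text and may be wrapped in a Markdown code fence, so an application usually parses it defensively. The sketch below shows one way to do that; the helper name `parse_json_reply` is hypothetical and the sample reply is made up rather than actual model output.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker, built this way to keep the example readable


def parse_json_reply(reply_text):
    """Parse a model reply expected to contain JSON (hypothetical helper).

    Returns the decoded object, or None when the reply is not valid JSON.
    """
    cleaned = reply_text.strip()
    # Strip an optional Markdown fence, including a leading "json" language tag.
    if cleaned.startswith(FENCE) and cleaned.endswith(FENCE):
        cleaned = cleaned[len(FENCE):-len(FENCE)].strip()
        if cleaned.lower().startswith("json"):
            cleaned = cleaned[len("json"):].strip()
    try:
        return json.loads(cleaned)
    except json.JSONDecodeError:
        return None


# Made-up reply text standing in for response.text
sample_reply = FENCE + 'json\n{"summary": "Crickets chirp at night.", "word_count": 4}\n' + FENCE
print(parse_json_reply(sample_reply))
print(parse_json_reply("Sorry, I cannot help with that."))
```

Returning `None` on failure lets the calling code decide whether to retry, fall back to plain text, or report an error.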

In [None]:
rprint(response.text)

# Detailed Response

Ultimately, this is what your application might analyze before responding to the user.  Notice the safety ratings, among other fields.

In [None]:
rprint(response)
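
An application would typically pull out a few fields from the full response rather than print all of it. The sketch below collects the finish reason and safety ratings per candidate. It assumes the response exposes `candidates`, each with `finish_reason` and `safety_ratings` (with `category` and `probability`); the helper name is hypothetical, and a stand-in object is used here so the example runs without calling the service.

```python
from types import SimpleNamespace


def summarize_candidates(response):
    """Collect finish reason and safety ratings per candidate (hypothetical helper)."""
    rows = []
    for candidate in getattr(response, "candidates", []):
        # Map each safety category to its reported probability level
        ratings = {
            str(r.category): str(r.probability)
            for r in getattr(candidate, "safety_ratings", [])
        }
        rows.append({"finish_reason": str(candidate.finish_reason), "safety": ratings})
    return rows


# Stand-in object with the assumed shape; in the notebook you would pass `response`.
fake_response = SimpleNamespace(candidates=[
    SimpleNamespace(
        finish_reason="STOP",
        safety_ratings=[
            SimpleNamespace(category="HARM_CATEGORY_HARASSMENT", probability="NEGLIGIBLE"),
        ],
    )
])
print(summarize_candidates(fake_response))
```

With the real response object, the string values will reflect the SDK's own enum representations rather than the plain strings used in this stand-in.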

# Advanced Class

## Prompt Engineering


+ 1.  Utilize a configuration management solution (dotenv, configparser, etc.) and consider a [design pattern](https://www.hackerearth.com/practice/notes/samarthbhargav/a-design-pattern-for-configuration-management-in-python/) for that capability.  Encapsulate the model parameters in that configuration solution.

+ 2.  Utilize a Prompt Template to organize your prompt input.  Possible options are: self-created, LangChain, [vertexai.preview.prompts](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/introduction-prompt-design), [ChatGPT Templates](https://keywordseverywhere.com/chatgpt-prompt-templates.html), or perhaps [Anthropic techniques](https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/prompt-templates-and-variables).

+ 3.  Craft a series of prompts focused on a single subject, iterating through the following concepts:

  + 3.1 Apply basic, detailed, and Socratic methods of prompt analysis to your subject.

  + 3.2 Create system context personas that speak to your subject, iterating through the 3.1 outputs.  Example: "You are a doctor with blah blah blah skills", etc.

  + 3.3 Create the following advanced prompts on your subject and demonstrate their use.

      + 3.3.1 Few shot, Role-Based Context Chaining, Tree of Thoughts, Self-Reflection Prompting, and Comparative Analysis.
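
As a possible starting point for items 1 and 2, the sketch below keeps the model parameters and a prompt template together in a `configparser` configuration and fills the template with `string.Template`. All section names, keys, values, and the persona are hypothetical; in practice the configuration would live in a file rather than an inline string.

```python
import configparser
from string import Template

# Hypothetical configuration text; in practice this might be read from a settings file.
CONFIG_TEXT = """
[generation]
temperature = 0.2
top_p = 0.8
top_k = 40
max_output_tokens = 1024

[prompt]
persona = an experienced forest entomologist
template = You are $persona. Summarize the following text in $limit words: $text
"""

# interpolation=None keeps configparser from treating '%' specially in values
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(CONFIG_TEXT)

# Model parameters, converted to the numeric types they represent
generation_settings = {
    "temperature": parser.getfloat("generation", "temperature"),
    "top_p": parser.getfloat("generation", "top_p"),
    "top_k": parser.getint("generation", "top_k"),
    "max_output_tokens": parser.getint("generation", "max_output_tokens"),
}

# Fill the prompt template; substitute() raises KeyError if a placeholder is missing
prompt = Template(parser.get("prompt", "template")).substitute(
    persona=parser.get("prompt", "persona"),
    limit=50,
    text="Spotted lanternflies feed on tree sap.",
)

print(generation_settings)
print(prompt)
```

The keys in `generation_settings` mirror the keyword arguments passed to `GenerationConfig` earlier in this notebook, so the dictionary could plausibly be unpacked into it; treat that as an assumption to verify against your SDK version.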

