<a href="https://colab.research.google.com/github/jeffheaton/app_generative_ai/blob/main/t81_559_class_03_2_text_gen.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# T81-559: Applications of Generative Artificial Intelligence
**Module 3: Large Language Models**
* Instructor: [Jeff Heaton](https://sites.wustl.edu/jeffheaton/), McKelvey School of Engineering, [Washington University in St. Louis](https://engineering.wustl.edu/Programs/Pages/default.aspx)
* For more information visit the [class website](https://sites.wustl.edu/jeffheaton/t81-558/).

# Module 3 Material

* Part 3.1: Foundation Models [[Video]](https://www.youtube.com/watch?v=Gb0tk5qq1fA) [[Notebook]](t81_559_class_03_1_llm.ipynb)
* **Part 3.2: Text Generation** [[Video]](https://www.youtube.com/watch?v=lB97Lqt7q58) [[Notebook]](t81_559_class_03_2_text_gen.ipynb)
* Part 3.3: Text Summarization [[Video]](https://www.youtube.com/watch?v=3MoIUXE2eEU) [[Notebook]](t81_559_class_03_3_text_summary.ipynb)
* Part 3.4: Text Classification [[Video]](https://www.youtube.com/watch?v=2VpOwFIGmA8) [[Notebook]](t81_559_class_03_4_classification.ipynb)
* Part 3.5: LLM Writes a Book [[Video]](https://www.youtube.com/watch?v=iU40Rttlb_Q) [[Notebook]](t81_559_class_03_5_book.ipynb)


# Google CoLab Instructions

The following code ensures that Google CoLab is running and maps Google Drive if needed.

In [None]:
import os

try:
    from google.colab import drive, userdata
    COLAB = True
    print("Note: using Google CoLab")
except ImportError:
    print("Note: not using Google CoLab")
    COLAB = False

# OpenAI Secrets
if COLAB:
    os.environ["OPENAI_API_KEY"] = userdata.get('OPENAI_API_KEY')

# Install needed libraries in CoLab
if COLAB:
    !pip install langchain langchain_openai

Note: using Google CoLab


# 3.2: Text Generation

Text generation is one of the most common tasks for LLMs. We've already seen how to use the LLM to generate code; generating regular text for human consumption is similar. To generate text, we will not use a conversational chat style; instead, we will send prompts to LangChain and receive the generated text.

We use the following code to query the LLM for text generation.




In [None]:
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.prompts.chat import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)
from langchain_openai import ChatOpenAI
from IPython.display import display_markdown

MODEL = 'gpt-4o-mini'
TEMPERATURE = 0.2

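def get_response(llm, prompt):
  """Send one prompt to the LLM and render the reply as markdown.

  The function displays the reply and returns None, so wrapping a call
  in print() also prints a trailing "None".
  """
  messages = [
      SystemMessage(
          content="You are a helpful assistant that answers questions accurately."
      ),
      HumanMessage(content=prompt),
  ]

  print("Model response:")
  output = llm.invoke(messages)
  display_markdown(output.content, raw=True)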

# Initialize the OpenAI LLM; the API key is read from OPENAI_API_KEY
llm = ChatOpenAI(
  model=MODEL,
  temperature=TEMPERATURE,
  n=1,
  max_tokens=256)  # caps reply length; longer replies are cut off mid-sentence

## Text Generation Patterns

For simple text generation, you will see several different prompting patterns. These patterns differ in how many examples of the desired output you include in the prompt. The patterns we will examine in this module are listed here.

* Zero-Shot
* One-Shot
* Few-Shot
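
The three patterns differ only in how many examples accompany the instruction. The sketch below makes that concrete with a hypothetical helper, `build_prompt`, which is not part of LangChain or of this notebook's original code. The delimiter lines and the shortened sample text are illustrative assumptions modeled on the prompts used later in this part.

```python
def build_prompt(instruction, request, examples=()):
    """Assemble a prompt; the number of examples sets the pattern.

    0 examples -> zero-shot, 1 -> one-shot, 2 or more -> few-shot.
    """
    parts = [instruction.strip()]
    if examples:
        parts.append("-----------------")
        parts.append("Example output(s), written by me:")
        parts.extend(example.strip() for example in examples)
    parts.append("-----------")
    parts.append("The details of the request follow.")
    parts.append(request.strip())
    return "\n\n".join(parts)


# Shortened, made-up text standing in for the full letters and request.
instruction = "Generate a positive letter of recommendation. Format in markdown."
request = "I am John Smith, and I took your course in spring 2020."
samples = ["John earned an A+ in my course.", "John did well in my course."]

zero_shot = build_prompt(instruction, request)
one_shot = build_prompt(instruction, request, samples[:1])
few_shot = build_prompt(instruction, request, samples)
print(few_shot)
```

Any of the three strings could then be passed as the prompt to the LLM; only the amount of guidance changes.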


## Zero-Shot Text Generation

A zero-shot prompt asks the model to generate text from instructions alone: the prompt contains no example of the desired output, and the model has had no fine-tuning for the task. To use this approach effectively, craft a detailed and clear prompt that communicates exactly what you want the model to generate. Include the type of content, the style, and any specific information or constraints that matter for the task. For instance, if you're asking for a business email, you might specify the tone (formal or informal), the main points to cover (meeting time, purpose, attendees), and any call to action. Being explicit about the desired output matters because the model relies solely on the information in the prompt to produce relevant and coherent text. This method is versatile and applies to many text generation tasks without customized training.

The following text is an example of a zero-shot prompt. I make many requests and provide information about the student, but I do not give the LLM a sample to work from.

In [None]:
print(get_response(llm, """
Generate a positive letter of recommendation for John Smith, a student of mine
for INFO 558 at Washington University, my name is Jeff Heaton. He is applying
for a Master of Science in Computer Science. Just give me the
body text of the letter, no header or footer. Format in markdown.
Below is his request.

I hope this message finds you well and that you are enjoying the holiday season!
I am John Smith (ID: 1234), a proud alumnus of WashU, having graduated in
January 2021 with a Master’s degree in Quantitative Finance.

During the spring semester of 2020, I had the pleasure of attending your course,
INFO 558: Applications of Deep Neural Networks, which was an elective for my
master's program. I thoroughly enjoyed the content and was deeply engaged
throughout, culminating in an A+ grade.

Since graduating with a 3.99 GPA—top of my major—I have been working as a Senior
Financial Risk Analyst at RGA. My role primarily involves developing automation
tools and programming for strategic analysis and other analytical tasks. To
further enhance my programming skills and knowledge, I am planning to pursue a
part-time Master's in Computer Science while continuing to work at RGA.

I am a great admirer of your work (I’m a regular viewer of your YouTube channel
and have recommended it to my colleagues), and your insights would be invaluable
in my application. I am applying to the following programs:

Georgia Tech, Master of Science in Computer Science
University of Pennsylvania, Master of Computer & Information Technology
Could I possibly ask for your support with a recommendation letter for these
applications? I have attached my resume for your reference and am happy to
provide any additional information you might need.

Thank you very much for considering my request. I look forward to your
positive response.

Warm regards,

John
"""))

Model response:


I am pleased to write this letter of recommendation for John Smith, who was a student in my INFO 558: Applications of Deep Neural Networks course at Washington University. John stood out as an exceptional student during the spring semester of 2020, demonstrating not only a deep understanding of the course material but also a remarkable enthusiasm for the subject matter.

John's performance in my class was exemplary, culminating in an A+ grade, which reflects his dedication and intellectual curiosity. He consistently engaged with complex concepts and contributed thoughtfully to class discussions. His ability to grasp intricate topics in deep learning and neural networks was impressive, and he often went above and beyond the coursework to explore additional resources and applications.

Since graduating with a 3.99 GPA in Quantitative Finance, John has excelled in his professional role as a Senior Financial Risk Analyst at RGA. His work in developing automation tools and programming for strategic analysis showcases his strong analytical skills and technical proficiency. It is evident that he has a passion for leveraging technology to solve real-world problems, which aligns perfectly with his decision to pursue a Master of Science in Computer Science.

John's commitment to continuous learning and self-improvement is commendable. He has demonstrated a proactive approach to enhancing his programming skills while balancing his professional responsibilities. I have no

None


## One-Shot Text Generation

A one-shot prompt for text generation adds exactly one example of the desired output to the instructions. The example shows the model the style, tone, length, and structure you expect, which are often hard to describe in words alone. To use this technique effectively, start with the same clear instructions you would write for a zero-shot prompt, then include a single representative sample and separate it clearly from the rest of the prompt so the model does not confuse the sample with the request. In the prompt below, the sample is a recommendation letter I wrote previously, set off by lines of dashes. After running the prompt, evaluate the generated text and adjust the instructions or the sample as needed to refine the results.

In [None]:
print(get_response(llm, """
Generate a positive letter of recommendation for John Smith, a student of mine
for INFO 558 at Washington University, my name is Jeff Heaton. He is applying
for a Master of Science in Computer Science. Just give me the
body text of the letter, no header or footer. Format in markdown.

-----------------
This is an example letter of recommendation, written by me.

To Whom It May Concern:
John earned an A+ in my course Applications of Deep Neural Networks for the
Fall 2019 semester at Washington University in St. Louis. During the semester
I got a chance to know John through several discussions, both about my course
and his research interests. While John did not come from a computer science
background he has demonstrated himself as a capable Python programmer and was
able to express his ideas in code.  My primary career is as a VP of data science
at RGA, a Fortune 500 insurance company.  In this role I know the value of
individuals, such as John, who have a background in finance, understand
advanced machine learning topics, and can code sufficiently well to function
as a data scientist.

-----------
The details of this student's request follow.

I hope this message finds you well and that you are enjoying the holiday season!
I am John Smith (ID: 1234), a proud alumnus of WashU, having graduated in
January 2021 with a Master’s degree in Quantitative Finance.

During the spring semester of 2020, I had the pleasure of attending your course,
INFO 558: Applications of Deep Neural Networks, which was an elective for my
master's program. I thoroughly enjoyed the content and was deeply engaged
throughout, culminating in an A+ grade.

Since graduating with a 3.99 GPA—top of my major—I have been working as a Senior
Financial Risk Analyst at RGA. My role primarily involves developing automation
tools and programming for strategic analysis and other analytical tasks. To
further enhance my programming skills and knowledge, I am planning to pursue a
part-time Master's in Computer Science while continuing to work at RGA.

I am a great admirer of your work (I’m a regular viewer of your YouTube channel
and have recommended it to my colleagues), and your insights would be invaluable
in my application. I am applying to the following programs:

Georgia Tech, Master of Science in Computer Science
University of Pennsylvania, Master of Computer & Information Technology
Could I possibly ask for your support with a recommendation letter for these
applications? I have attached my resume for your reference and am happy to
provide any additional information you might need.

Thank you very much for considering my request. I look forward to your
positive response.

Warm regards,

John
"""))

Model response:


I am pleased to write this letter of recommendation for John Smith, who was a student in my INFO 558: Applications of Deep Neural Networks course during the Spring semester of 2020 at Washington University. John excelled in my class, earning an A+ and demonstrating a remarkable aptitude for the subject matter.

From the very beginning of the course, John exhibited a strong enthusiasm for deep learning and artificial intelligence. His ability to grasp complex concepts and apply them effectively was evident in his coursework and projects. Despite not having a formal computer science background, John quickly became proficient in Python programming, showcasing his determination and ability to learn rapidly. His projects were not only technically sound but also innovative, reflecting his analytical mindset and creativity.

In addition to his academic performance, John actively participated in class discussions, often bringing unique perspectives that enriched the learning experience for his peers. His inquisitive nature and willingness to engage with challenging topics set him apart as a standout student. It was clear to me that John possesses a genuine passion for technology and a desire to deepen his knowledge in the field of computer science.

Since graduating with a Master’s degree in Quantitative Finance and achieving an impressive 3.99 GPA, John has been working as a Senior Financial Risk Analyst at RGA. In this role,

None


## Few-Shot Text Generation

A few-shot prompt presents the model with a small set of examples to guide its output. The examples work as in-context demonstrations: the model's weights do not change, but it can infer the desired pattern or style from the samples and continue in the same way. For instance, a few-shot prompt for text generation might include a handful of sample inputs along with the desired outputs, or simply several samples of the kind of text you want. This approach steers the model's output without fine-tuning or extensive training data, which makes it adaptable and efficient for specific tasks or stylistic nuances.
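
When the examples come as input and output pairs, one alternative to packing everything into a single prompt string is to pass each pair as its own chat turn. The following is a minimal sketch of that idea, not the approach used in this notebook. The helper `few_shot_messages` and the sample pairs are hypothetical, and the `(role, content)` tuple form with the roles `system`, `human`, and `ai` is assumed to be accepted by the LangChain chat model; verify this against the version you have installed.

```python
def few_shot_messages(system_text, examples, request):
    """Build a chat message list with one human/ai turn pair per example."""
    messages = [("system", system_text)]
    for example_input, example_output in examples:
        messages.append(("human", example_input))  # sample input
        messages.append(("ai", example_output))    # desired output for that input
    messages.append(("human", request))            # the real request comes last
    return messages


# Made-up pairs for illustration only.
examples = [
    ("Student: Ana, grade: A", "Ana earned an A in my course."),
    ("Student: Raj, grade: A+", "Raj earned an A+ in my course."),
]
request = "Student: John, grade: A+"
messages = few_shot_messages(
    "You write one-sentence course summaries.", examples, request
)

for role, content in messages:
    print(f"{role}: {content}")

# With an initialized chat model, the call would look like this (assumption):
# output = llm.invoke(messages)
```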

In [None]:
print(get_response(llm, """
Generate a positive letter of recommendation for John Smith, a student of mine
for INFO 558 at Washington University, my name is Jeff Heaton. He is applying
for a Master of Science in Computer Science. Just give me the
body text of the letter, no header or footer. Format in markdown.

-----------------
Examples of letters of recommendation, written by me.

To Whom It May Concern:
John earned an A+ in my course Applications of Deep Neural Networks for the
Fall 2019 semester at Washington University in St. Louis. During the semester
I got a chance to know John through several discussions, both about my course
and his research interests. While John did not come from a computer science
background he has demonstrated himself as a capable Python programmer and was
able to express his ideas in code.  My primary career is as a VP of data science
at RGA, a Fortune 500 insurance company.  In this role I know the value of
individuals, such as John, who have a background in finance, understand
advanced machine learning topics, and can code sufficiently well to function
as a data scientist.

John was a student in my class, T81-558: Application of Deep Neural Networks,
for the Spring 2017 semester. This is a technical graduate class which includes
students from the Masters of Science Information Systems, Management,
computer science, and other disciplines. The course teaches students to
implement deep neural networks using Google TensorFlow and Keras in the Python
programming language. Students are expected to complete four computer programs
and complete a final project. John did well in my course and earned an A+ (4.0).

-----------
The details of this student's request follow.

I hope this message finds you well and that you are enjoying the holiday season!
I am John Smith (ID: 1234), a proud alumnus of WashU, having graduated in
January 2021 with a Master’s degree in Quantitative Finance.

During the spring semester of 2020, I had the pleasure of attending your course,
INFO 558: Applications of Deep Neural Networks, which was an elective for my
master's program. I thoroughly enjoyed the content and was deeply engaged
throughout, culminating in an A+ grade.

Since graduating with a 3.99 GPA—top of my major—I have been working as a Senior
Financial Risk Analyst at RGA. My role primarily involves developing automation
tools and programming for strategic analysis and other analytical tasks. To
further enhance my programming skills and knowledge, I am planning to pursue a
part-time Master's in Computer Science while continuing to work at RGA.

I am a great admirer of your work (I’m a regular viewer of your YouTube channel
and have recommended it to my colleagues), and your insights would be invaluable
in my application. I am applying to the following programs:

Georgia Tech, Master of Science in Computer Science
University of Pennsylvania, Master of Computer & Information Technology
Could I possibly ask for your support with a recommendation letter for these
applications? I have attached my resume for your reference and am happy to
provide any additional information you might need.

Thank you very much for considering my request. I look forward to your
positive response.

Warm regards,

John
"""))

Model response:


I am pleased to write this letter of recommendation for John Smith, who was a student in my INFO 558: Applications of Deep Neural Networks course during the Spring 2020 semester at Washington University. John excelled in this technically demanding course, earning an A+ and demonstrating a strong grasp of complex concepts in deep learning.

From the outset, John exhibited a remarkable enthusiasm for the subject matter. His engagement in class discussions and his ability to tackle challenging programming assignments set him apart from his peers. Despite not having a traditional computer science background, John quickly adapted and showcased his programming skills, particularly in Python, which is essential for implementing deep neural networks. His ability to translate theoretical concepts into practical applications was impressive and indicative of his analytical mindset.

In addition to his academic achievements, John has a solid professional background as a Senior Financial Risk Analyst at RGA, where he has been developing automation tools and programming for strategic analysis. This experience not only highlights his technical capabilities but also his ability to apply his knowledge in real-world scenarios, making him an excellent candidate for further studies in computer science.

John's dedication to continuous learning and his passion for technology are evident in his decision to pursue a part-time Master's in Computer Science while maintaining his professional responsibilities. I have no doubt that he will bring

None


## Generating Synthetic Data


LLMs (Large Language Models) can be leveraged to generate synthetic data, which is particularly valuable for testing various systems, including those requiring personal information or demographic diversity. For example, LLMs can create detailed biographies for random careers, providing realistic and varied data for simulations, testing algorithms, or training AI models. In this instance, synthetic biographies could be generated for a software engineer, a pediatric nurse, a financial analyst, a high school science teacher, and a marketing manager, each featuring unique backgrounds and career trajectories to ensure robust and comprehensive testing scenarios.

In [None]:
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.prompts.chat import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)
from langchain_openai import ChatOpenAI
from IPython.display import display_markdown

MODEL = 'gpt-4o-mini'
TEMPERATURE = 0.2

def get_response(llm, prompt):
  messages = [
      SystemMessage(
          content="""
          You are a helpful assistant that generates synthetic data for a person in the career
          field you are given. Provide a short bio for the person, not longer than
          5 sentences. No markdown. Do not mention the job title specifically."""
      ),
      HumanMessage(content=prompt),
  ]

  response = llm.invoke(messages)
  return response.content

# Initialize the OpenAI LLM; the API key is read from OPENAI_API_KEY
llm = ChatOpenAI(
  model=MODEL,
  temperature=TEMPERATURE,
  n=1,
  max_tokens=256)  # caps reply length; longer replies are cut off mid-sentence

We begin by defining the five career types for which we will generate data.

In [None]:
CAREER = [
    "software engineer",
    "pediatric nurse",
    "financial analyst",
    "high school science teacher",
    "marketing manager"
]


Next, we generate a single bio to test the prompt.

In [None]:
print(get_response(llm, "software engineer"))

Alex is a passionate technology enthusiast with a strong background in computer science. With over five years of experience in developing innovative software solutions, Alex has worked on a variety of projects ranging from mobile applications to complex web platforms. They thrive in collaborative environments and enjoy tackling challenging problems with creative approaches. In addition to coding, Alex is dedicated to mentoring junior developers and sharing knowledge within the tech community. Outside of work, they enjoy contributing to open-source projects and exploring the latest advancements in artificial intelligence.
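
The system prompt sets three constraints: at most five sentences, no markdown, and no mention of the job title. Because an LLM does not always follow instructions, it can be worth checking generated rows before using them. The sketch below is one possible check. The function `check_bio` is hypothetical, and its sentence splitter is deliberately naive: it splits after `.`, `!`, or `?`, so an abbreviation such as "Dr." would be counted as a sentence end.

```python
import re


def check_bio(bio, career, max_sentences=5):
    """Return a list of constraint violations for one generated bio."""
    problems = []
    # Naive split: a sentence ends at ., !, or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", bio.strip()) if s]
    if len(sentences) > max_sentences:
        problems.append(f"too many sentences: {len(sentences)}")
    if career.lower() in bio.lower():
        problems.append("mentions the job title")
    if re.search(r"[*#`]", bio):
        problems.append("contains markdown characters")
    return problems


# Made-up bios for illustration.
good = "Alex builds mobile apps. They mentor junior developers."
bad = "Sam is a software engineer. " * 6

print(check_bio(good, "software engineer"))
print(check_bio(bad, "software engineer"))
```

A bio that fails the check could be regenerated or dropped before it is written to the data file.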


Now we generate a CSV file full of these random bios.

In [None]:
import csv
import random
from tqdm import tqdm  # Progress bar library

FILENAME = "jobs.csv"

# Writing to the CSV file; newline='' is the setting the csv module recommends
with open(FILENAME, 'w', newline='') as csvfile:
    csvwriter = csv.writer(csvfile)

    # Use tqdm to show progress bar
    for i in tqdm(range(50), desc="Generating Careers"):
        career_choice = random.choice(CAREER)  # Randomly select a career
        csvwriter.writerow([i+1, get_response(llm, career_choice)])


Generating Careers: 100%|██████████| 50/50 [01:52<00:00,  2.24s/it]
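
The file written above stores only an ID and the bio, so the career that produced each bio is not recorded. If the data will later be used for a task such as classification, keeping that label is useful. The following sketch is a variation, not the notebook's original code: it adds a header row and a career column, and seeds the random choice so that the sequence of careers is repeatable. The function `fake_bio` is a hypothetical stand-in for `get_response(llm, career)` so that the example runs without an API call, and it writes to an in-memory buffer rather than to `jobs.csv`.

```python
import csv
import io
import random

CAREERS = ["software engineer", "pediatric nurse", "financial analyst"]


def fake_bio(career):
    """Hypothetical stand-in for get_response(llm, career); makes no API call."""
    return f'A professional who says "hello", works hard, and hikes ({len(career)}).'


def write_bios(out, n_rows, seed=42):
    """Write a header plus n_rows of id, career, bio to a file-like object."""
    rng = random.Random(seed)  # seeded generator: same career picks on every run
    writer = csv.writer(out)
    writer.writerow(["id", "career", "bio"])
    for i in range(n_rows):
        career = rng.choice(CAREERS)
        writer.writerow([i + 1, career, fake_bio(career)])


buffer = io.StringIO()
write_bios(buffer, 3)
print(buffer.getvalue())
```

Seeding fixes only the career selection; the text returned by a real LLM can still vary between runs.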


We can now inspect the first few lines of the generated file.

In [None]:
with open(FILENAME, 'r') as file:
    for _ in range(5):
        line = file.readline()
        if line:
            print(line.strip())
        else:
            break

1,"Dr. Emily Carter is a dedicated healthcare professional with over a decade of experience in patient care. She graduated with honors from a prestigious medical school and completed her residency at a leading hospital. Known for her compassionate approach, she specializes in treating chronic illnesses and emphasizes preventive care. Outside of her practice, Emily is actively involved in community health initiatives and enjoys mentoring aspiring medical students. In her free time, she loves hiking and exploring new culinary experiences."
2,"Born in a small town in Texas, she developed a fascination with space at an early age, often gazing at the stars through her childhood telescope. After earning a degree in aerospace engineering, she joined a prestigious research program that focused on developing new technologies for space exploration. Her dedication and innovative spirit led her to participate in several high-profile missions, where she conducted experiments in microgravity and con
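
Reading the file line by line works for a quick look, but a bio may contain commas, quotation marks, or even line breaks, all of which the CSV format wraps in quotes. For further processing, the `csv` module parses these cases correctly. The sketch below uses made-up sample rows in the same `id,bio` layout and an in-memory buffer; the helper `read_bios` is hypothetical. With a real file, the handle would come from `open(FILENAME, newline="")` instead.

```python
import csv
import io

# Made-up rows in the same id,bio layout as the generated file. The second
# bio contains an escaped quote and a line break inside the quoted field.
sample = (
    '1,"She is a mentor, and a hiker."\r\n'
    '2,"He said ""hello"".\nThen he left."\r\n'
)


def read_bios(handle, limit=5):
    """Parse up to `limit` rows of id,bio from an open file-like object."""
    rows = []
    for row_id, bio in csv.reader(handle):
        rows.append((int(row_id), bio))
        if len(rows) >= limit:
            break
    return rows


rows = read_bios(io.StringIO(sample, newline=""))
for row_id, bio in rows:
    print(row_id, repr(bio))
```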

We can download the generated CSV.

In [None]:
# files.download is only available inside Google CoLab
if COLAB:
    from google.colab import files

    files.download(FILENAME)

