<a href="https://colab.research.google.com/github/ethvedbitdesjan/AI-ML_Training/blob/main/LLMIntro.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
!pip install -U transformers
!pip install -U bitsandbytes



In [45]:
!pip install -U sentence-transformers

Collecting sentence-transformers
  Downloading sentence_transformers-3.0.1-py3-none-any.whl (227 kB)
Installing collected packages: sentence-transformers
Successfully installed sentence-transformers-3.0.1


In [1]:
import huggingface_hub
from google.colab import userdata
huggingface_hub.login(userdata.get('HF_TOKEN'))

The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: read).
Your token has been saved to /root/.cache/huggingface/token
Login successful


In [2]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import matplotlib.pyplot as plt
from IPython.display import display, Markdown

torch.random.manual_seed(0)
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    device_map="cuda",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)



Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


In [3]:
def get_model_response(prompt, max_new_tokens=100, temperature=1.0):
    assert temperature > 0.0, "Temperature must be greater than 0.0"
    assert max_new_tokens > 0, "Max new tokens must be greater than 0"
    messages = [
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": f"{prompt}"},
    ]
    generation_args = {
        "max_new_tokens": max_new_tokens,
        "return_full_text": False,
        "temperature": temperature,
        "do_sample": True,
    }

    output = pipe(messages, **generation_args)
    return output[0]['generated_text']

In [22]:
display(Markdown(get_model_response("Write poem on machine learning")))

 **The Dance of Neurons**


In silicon groves where data doth wind,

Algorithms in whispers, to insights they inclined.

A field of stars – a neural net – does sprawl,

Each node a neuron, in memory they recall.


Backpropagation, a path it seeks,

To tune the weights where knowledge peaks.

Through layers deep where hidden

# LLMs and LLM Modulo Frameworks: A Comprehensive Introduction

## I. Introduction to Large Language Models (LLMs)

In [6]:
# ### A. Definition of LLMs
llm_definition_prompt = "Provide a concise definition of Large Language Models (LLMs)."
llm_definition = get_model_response(llm_definition_prompt, max_new_tokens=50)
display(Markdown(f"**LLM Definition:** {llm_definition}"))

**LLM Definition:**  Large Language Models (LLMs) are sophisticated artificial intelligence tools that can understand and generate human language from a vast dataset. They are capable of processing, contextualizing, and predicting text based on patterns they've

In [7]:
# ### B. Brief overview of how LLMs work

# %%
llm_overview_prompt = "Explain in simple terms how Large Language Models work."
llm_overview = get_model_response(llm_overview_prompt, max_new_tokens=100)
display(Markdown(f"**How LLMs Work:** {llm_overview}"))

**How LLMs Work:**  Large Language Models, or LLMs, are sophisticated computer programs that understand and generate human-like text. Think of them like advanced spellers and writers combined. They digest huge amounts of written words and texts to learn how language works — from basic grammar to more complex conversational cues. Once trained, these models can mimic text so well that they can compose essays, write stories, converse with users, and even translate languages. They're

In [8]:
# ### C. Real-world applications and examples

# %%
applications_prompt = "List 5 real-world applications of Large Language Models."
applications = get_model_response(applications_prompt, max_new_tokens=150)
display(Markdown(f"**Real-world Applications of LLMs:**\n{applications}"))

**Real-world Applications of LLMs:**
 Large language models (LLMs) have become increasingly prevalent in a variety of real-world applications. Here are five notable ones:

1. **Content Generation:** LLMs can generate a wide variety of content, including written pieces, such as articles, blog posts, social media content, reports, and scripts. For example, Microsoft's GPT series of models can generate human-like articles on a wide array of topics or assist in marketing content creation.

2. **Natural Language Understanding (NLU):** LLMs play a significant role in understanding human language and are used for various tasks such as sentiment analysis, machine translation, chatbots, and voice assistants

In [9]:
# ### A. Natural language processing tasks

# %%
nlp_tasks = [
    "Writing and composition",
    "Paraphrasing and summarization",
    "Translation between languages"
]

for task in nlp_tasks:
    prompt = f"Demonstrate the capability of an LLM in {task}. Provide a short example."
    response = get_model_response(prompt, max_new_tokens=100)
    display(Markdown(f"**{task}:**\n{response}\n"))

**Writing and composition:**
 One of the most powerful aspects of large language models (LLMs) such as me is their ability to write and compose text. Here's a short example of an LLM's writing and composition skills, focusing on creating an engaging story prompt:

---
In a small, humble village nestled at the foot of a snow-capped mountain range, there lived an old couple named Tom and Anna. Despite their age, they carried with them a twink


**Paraphrasing and summarization:**
 Here's an example of paraphrasing and summarization using an LLM, given the input text below:

```
Originally, in 1972, scientists discovered the HIV virus that leads to AIDS. Despite their numerous efforts, there's no cure yet. While antiretroviral treatment has significantly extended the lives of HIV patients, the long-term battle against this disease continues as we strive to achieve a c


**Translation between languages:**
 Certainly! For demonstration, let's translate a short sentence from English to French using Microsoft's Translator. Please note that while this example will not produce results from a live Large Language Model (LLM), the method depicted simulates the process on Microsoft's platform at the time of writing.

**Original English Sentence:**
"The quick brown fox jumps over the lazy dog."

Based on Microsoft's capabilities


In [10]:
# ### B. Information retrieval and synthesis

# %%
info_synthesis_prompt = "Explain how LLMs excel at information retrieval and synthesis. Give an example."
info_synthesis = get_model_response(info_synthesis_prompt, max_new_tokens=150)
display(Markdown(f"**Information Retrieval and Synthesis:**\n{info_synthesis}"))

**Information Retrieval and Synthesis:**
 Large language models (LLMs) are highly efficient at information retrieval and synthesis due to their deep understanding of natural language and sophisticated algorithms that enable them to process and analyze vast amounts of textual data. These models have the ability to comprehend the context, nuances, and underlying connections between various pieces of information, which allows them to retrieve highly relevant and accurate information based on users' queries. Moreover, they can assimilate and integrate multiple sources of information, producing coherent, synthesized outputs that are valuable for decision-making, learning, and problem-solving.

For instance, suppose a researcher wants to investigate the potential health impacts of vaping. The researcher could

In [11]:
# ### C. Pattern recognition in text

# %%
pattern_recognition_prompt = "Describe how LLMs perform pattern recognition in text. Provide an example."
pattern_recognition = get_model_response(pattern_recognition_prompt, max_new_tokens=150)
display(Markdown(f"**Pattern Recognition in Text:**\n{pattern_recognition}"))

**Pattern Recognition in Text:**
 Large Language Models (LLMs) like GPT (Generative Pre-trained Transformer) perform pattern recognition in text through various processes inherent in their architecture and training. At their core, LLMs leverage complex neural networks that can process sequential data (text) to recognize patterns over different degrees of granularity, from individual words (token recognition) to entire sentence constructions (contextual understanding).


A crucial component of this process is the use of transformers, which replace earlier sequential methods like RNNs and LSTMs. Transformers allow for parallel processing across the entire text and significantly reduce the computational time required for training these models.


One of the

In [12]:
# ### D. In-context learning and task adaptation

# %%
adaptation_prompt = "Explain in-context learning and task adaptation in LLMs. Give an example."
adaptation = get_model_response(adaptation_prompt, max_new_tokens=150)
display(Markdown(f"**In-context Learning and Task Adaptation:**\n{adaptation}"))

**In-context Learning and Task Adaptation:**
 In-context learning and task adaptation in LLMs (Large Language Models) refer to the ability of the model to understand and perform specific tasks based on the context provided within an input. This means that the LLM can learn a new task without being explicitly programmed for that specific task, but rather by understanding the context provided in the input.

For example, imagine you are providing an LLM with a dataset of customer service emails. The model will examine the input, analyze the keywords and patterns used, and understand that it needs to process or respond to customer queries. You might then input a particular example email, and the model will identify itself as a customer service representative, and use the knowledge it gained from the dataset to

In [13]:
# ### E. Demonstration: Show a live example of an LLM performing a writing or summarization task

# %%
demo_text = """
The Internet of Things (IoT) refers to the interconnected network of physical devices, vehicles, home appliances, and other items embedded with electronics, software, sensors, and network connectivity, which enables these objects to collect and exchange data. IoT has applications in various fields including smart homes, healthcare, agriculture, and industrial automation. It promises to make our lives more efficient and convenient, but also raises concerns about privacy and security.
"""

demo_prompt = f"Summarize the following text in 3 bullet points:\n\n{demo_text}"
summary = get_model_response(demo_prompt, max_new_tokens=200)
display(Markdown(f"**Summary of IoT Text:**\n{summary}"))

You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset


**Summary of IoT Text:**
 - The Internet of Things (IoT) is a network of interconnected devices, such as those found in homes, vehicles, appliances, and other items, equipped with advanced capabilities to collect, exchange, and analyze data.

- IoT technology spans numerous sectors like smart home management, healthcare solutions, agricultural advancements, and industrial process automation, offering the potential to enhance personal comfort, health monitoring, crop management, and production efficiency.

- While IoT has the potential to revolutionize convenience and efficiency, it also brings challenges in terms of protecting user privacy and safeguarding against cyber threats, necessitating robust security measures.

## II. Tokenization

**What are Tokens?**

In Natural Language Processing (NLP), tokens are the basic units of text that a model reads and writes.
A tokenizer splits raw text into these units before any further processing.
A token can be a whole word like "cat," "dog," or "run,"
a punctuation mark (".", ",", "!"), or a subword unit, that is, a fragment of a longer word.


**Why Do We Need Tokenizers?**

Computers excel at processing numbers, not raw text.
Tokenizers bridge this gap by converting human-readable text into a numerical representation that machine learning models can understand.
This process is crucial for various NLP tasks, including:
   - **Text Classification:** Categorizing text into predefined categories (e.g., spam detection).
   - **Machine Translation:** Converting text from one language to another.
   - **Text Summarization:** Condensing large texts into shorter versions while preserving key information.
   - **Question Answering:** Enabling computers to answer questions posed in natural language.

In [14]:
# --- Demonstrating Tokenization ---

text = "This is a sample sentence to demonstrate tokenization. It's really interesting!"

# Tokenize the text
tokens = tokenizer(text)

display(Markdown(f"**Original Text:** {text}"))
display(Markdown(f"**Tokens:** {' '.join(tokens.tokens())}")) # Join tokens for better readability
display(Markdown(f"**Token IDs:** {tokens.input_ids}"))

**Original Text:** This is a sample sentence to demonstrate tokenization. It's really interesting!

**Tokens:** ▁This ▁is ▁a ▁sample ▁sentence ▁to ▁demonstrate ▁token ization . ▁It ' s ▁really ▁interesting !

**Token IDs:** [910, 338, 263, 4559, 10541, 304, 22222, 5993, 2133, 29889, 739, 29915, 29879, 2289, 8031, 29991]

**Strengths:**

- **Efficiency:** Tokenizers break down text into manageable units, making it easier for models to process and analyze large volumes of data.
    - *Example:* The sentence "This is a test." is typically broken down into about 5 tokens (four words plus the period), far fewer units than its 15 characters.
- **Generalization:** Well-trained tokenizers can generalize to new, unseen words by breaking them down into subword units.
    - *Example:* A tokenizer trained on the word "walking" might correctly tokenize the unseen word "walked" as "walk" and "##ed", even if "walked" was not in its training data. (The "##" prefix is WordPiece-style notation for a piece that continues a word; the tokenizer used in this notebook instead marks word starts with "▁".)
- **Handling Out-of-Vocabulary Words:** Tokenizers can mitigate the issue of out-of-vocabulary words by using subword tokenization techniques.
    - *Example:* If the word "unbelievable" is not in the tokenizer's vocabulary, it might be broken down into "un", "##believ", "##able", allowing the model to still extract some meaning.
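
The subword fallback described above can be sketched with a greedy longest-match splitter in the WordPiece style. This is a minimal illustration, not the algorithm used by the notebook's tokenizer; the function name `wordpiece_tokenize` and the toy vocabulary are assumptions made up for this example.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]"):
    """Greedy longest-match-first subword split (WordPiece-style)."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a "##" prefix
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk_token]  # no piece fits, so the whole word is unknown
        pieces.append(match)
        start = end
    return pieces

# Hypothetical toy vocabulary, chosen only for this illustration
toy_vocab = {"un", "walk", "##believ", "##able", "##ed", "##ing"}

print(wordpiece_tokenize("unbelievable", toy_vocab))  # ['un', '##believ', '##able']
print(wordpiece_tokenize("walked", toy_vocab))        # ['walk', '##ed']
print(wordpiece_tokenize("xyz", toy_vocab))           # ['[UNK]']
```

Neither "unbelievable" nor "walked" is in the vocabulary as a whole word, yet both are covered by known pieces; only a string with no matching piece falls back to the unknown token.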


**Weaknesses:**

- **Context Sensitivity:** Tokenizers split text without regard to meaning, so a word typically receives the same token whatever its sense; resolving the sense is left to the model.
    - *Example:* The word "bank" has different meanings in "river bank" and "financial bank." A basic tokenizer does not differentiate these.
- **Ambiguity:** Some words have multiple meanings (polysemy), and tokenization does not capture the intended sense.
    - *Example:* The word "bat" could refer to a nocturnal animal or a piece of sports equipment. Tokenization alone does not resolve this ambiguity.
- **Computational Cost:** While tokenization is generally fast, subword schemes do more work per word than simple whitespace splitting, which adds up on large corpora.
    - *Example:* Tokenizing a large corpus of text with a complex subword tokenizer can take noticeably more time and resources than a simpler word-based tokenizer.

In [15]:
# Tokenize an unusual word, a run-together phrase, and a misspelling
for text in ["globalness", "wolfeats fox", "contunent"]:
    tokens = tokenizer(text)

    display(Markdown(f"**Original Text:** {text}"))
    display(Markdown(f"**Tokens:** {' '.join(tokens.tokens())}")) # Join tokens for better readability
    display(Markdown(f"**Token IDs:** {tokens.input_ids}"))

**Original Text:** globalness

**Tokens:** ▁global ness

**Token IDs:** [5534, 2264]

**Original Text:** wolfeats fox

**Tokens:** ▁wol fe ats ▁fo x

**Token IDs:** [20040, 1725, 1446, 1701, 29916]

**Original Text:** contunent

**Tokens:** ▁cont un ent

**Token IDs:** [640, 348, 296]

## III. LLM Limitations: Current Challenges

In [16]:
# ### A. Common sense reasoning and logical inference

# %%
common_sense_prompt = "A boy is taller than his sister, and his sister is taller than their mother. Is the boy taller than his mother?"
common_sense_response = get_model_response(common_sense_prompt, max_new_tokens=100)
display(Markdown(f"**Common Sense Reasoning Example:**\n\nPrompt: {common_sense_prompt}\n\nResponse: {common_sense_response}"))

**Common Sense Reasoning Example:**

Prompt: A boy is taller than his sister, and his sister is taller than their mother. Is the boy taller than his mother?

Response:  Using Occam's Razor, which advises us to not make more assumptions than necessary, we can infer a simple conclusion. If the boy is taller than his sister, and his sister is taller than their mother, logically, the boy must be taller than their mother with no additional assumptions needed. Occam's Razor helps us resolve the problem with the least amount of speculation.

In [17]:
# ### B. Understanding and processing numerical data

# #### 1. Basic arithmetic errors

# %%
arithmetic_prompt = "Is 9.11 greater than 9.9?"
arithmetic_response = get_model_response(arithmetic_prompt, max_new_tokens=50)
display(Markdown(f"**Basic Arithmetic Example:**\n\nPrompt: {arithmetic_prompt}\n\nResponse: {arithmetic_response}"))

correct_answer = 'No'
display(Markdown(f"**Correct Answer:** {correct_answer}"))

**Basic Arithmetic Example:**

Prompt: Is 9.11 greater than 9.9?

Response:  Yes, 9.11 is greater than 9.9 because the digit in the tenths place in 9.11 is 1, which is greater than 9 in the tenths place of 9.9

**Correct Answer:** No
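
One common mitigation for this kind of error is to hand the comparison to ordinary code instead of trusting the generated text. A minimal sketch, assuming the prompt contains exactly the two numbers to compare; the helper name `compare_numbers_in` is hypothetical:

```python
import re
from decimal import Decimal

def compare_numbers_in(prompt):
    """Pull the first two decimal numbers out of the prompt and compare them exactly."""
    a, b = re.findall(r"\d+(?:\.\d+)?", prompt)[:2]
    return Decimal(a) > Decimal(b)

print(compare_numbers_in("Is 9.11 greater than 9.9?"))  # False
```

`Decimal` compares the values numerically, so 9.11 is correctly reported as smaller than 9.9.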

In [18]:
import json

In [19]:
# ### C. Generating consistent, structured output

# %%
structured_output_prompt = """Generate a JSON object representing a book, including the title, author, a list of characters with nested objects for their age and occupation, and a short synopsis. One character must have a quote in their description that uses both single and double quotes."""

structured_output_response = get_model_response(structured_output_prompt, max_new_tokens=150)
display(Markdown(f"**Structured Output Example:**\n\nPrompt: {structured_output_prompt}\n\nResponse:\n```json\n{structured_output_response}\n```"))

# %%
structured_output_prompt = """Generate a JSON object with the following structure:
{
  "name": "A famous scientist",
  "birth_year": Year of birth as int,
  "famous_for": ["Achievement 1", "Achievement 2", "Achievement 3"],
  "quote": "A famous quote by the scientist"
}"""

structured_output_response = get_model_response(structured_output_prompt, max_new_tokens=150)
display(Markdown(f"**Structured Output Example:**\n\nPrompt: {structured_output_prompt}\n\nResponse:\n```json\n{structured_output_response}\n```"))

**Structured Output Example:**

Prompt: Generate a JSON object representing a book, including the title, author, a list of characters with nested objects for their age and occupation, and a short synopsis. One character must have a quote in their description that uses both single and double quotes.

Response:
```json
 ```json

{

  "title": "The Clockwork Orchard",

  "author": "Ivy Langley",

  "characters": [

    {

      "name": "Jerome Blackwood",

      "age": "57",

      "occupation": "Inventor",

      "description": {

        "body": "An aging genius residing in the quiet town of Millwood.",

        "quote": "I always say, 'Time is a canvas for the mind, and we're all just trying to fill it.'"

      }

    },

    {
```

**Structured Output Example:**

Prompt: Generate a JSON object with the following structure:
{
  "name": "A famous scientist",
  "birth_year": Year of birth as int,
  "famous_for": ["Achievement 1", "Achievement 2", "Achievement 3"],
  "quote": "A famous quote by the scientist"
}

Response:
```json
 ```json
{
  "name": "Marie Curie",
  "birth_year": 1867,
  "famous_for": ["Discovered radium", "Discovered polonium", "First woman to win a Nobel Prize"],
  "quote": "Nothing in life is to be feared, it is only to be understood. Now is the time to understand more, so that we may fear less."
}
```
```

## IV. Customizing LLMs for Specific Tasks

### A. Working with custom documents and data

#### 1. Introduction to fine-tuning

Fine-tuning is a process of further training a pre-trained language model on a specific dataset to adapt it to a particular domain or task. This process allows the model to learn domain-specific knowledge and improve its performance on targeted tasks.

##### a. Benefits of fine-tuning

- Improved performance on domain-specific tasks
- Better understanding of domain-specific terminology and concepts
- Ability to generate more relevant and accurate responses for specific use cases
- Potential for smaller, more efficient models tailored to specific applications

##### b. Challenges and costs associated with fine-tuning

- Typically requires a sizeable, high-quality dataset specific to the target domain
- Computationally intensive process, often requiring significant GPU resources
- Risk of overfitting if not done carefully
- Potential loss of general knowledge or capabilities in favor of domain-specific knowledge
- Ongoing maintenance and updates as new data becomes available

#### 2. Alternative approach: Retrieval-Augmented Generation (RAG)

RAG is an approach that combines the strengths of retrieval-based and generation-based models. It allows LLMs to access external knowledge without the need for fine-tuning.

##### a. Simplified explanation of vector stores and knowledge graphs
1. Vector stores:
- Database systems designed to store and efficiently search high-dimensional vectors (embeddings)
- Allow for fast similarity search using techniques like approximate nearest neighbors
- Enable efficient retrieval of relevant information based on semantic similarity

2. Knowledge graphs:
- Structured representation of knowledge using entities (nodes) and relationships (edges)
- Capture complex relationships and hierarchies between concepts
- Allow for sophisticated querying and inference
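
To make the vector-store idea above concrete, here is a minimal in-memory sketch that ranks stored items by cosine similarity. It does an exact brute-force search rather than the approximate nearest-neighbor search real vector stores use, and the class name `TinyVectorStore` and the hand-made 3-dimensional vectors are assumptions for illustration only.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

class TinyVectorStore:
    """Stores (vector, payload) pairs and returns the k most similar payloads."""

    def __init__(self):
        self.items = []

    def add(self, vector, payload):
        self.items.append((vector, payload))

    def search(self, query, k=2):
        scored = [(payload, cosine(query, vector)) for vector, payload in self.items]
        scored.sort(key=lambda pair: pair[1], reverse=True)  # most similar first
        return scored[:k]

store = TinyVectorStore()
# Hand-made "embeddings", purely illustrative
store.add([1.0, 0.0, 0.0], "doc about cats")
store.add([0.9, 0.1, 0.0], "doc about kittens")
store.add([0.0, 0.0, 1.0], "doc about tax law")

results = store.search([1.0, 0.05, 0.0], k=2)
for payload, score in results:
    print(f"{score:.3f}  {payload}")
```

The query vector points in nearly the same direction as the two animal documents, so those are returned and the unrelated document is left out.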

In [20]:
from sentence_transformers import SentenceTransformer
sentences = ["This is an example sentence"]

embeddings_model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')
embeddings = embeddings_model.encode(sentences)
print(embeddings.shape)

(1, 384)


In [21]:
def get_embeddings(sentences):
    embeddings = embeddings_model.encode(sentences)
    return embeddings

In [23]:
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Thin wrapper around get_embeddings (real sentence-transformer embeddings, not simulated)
def get_embedding(texts):
    embeddings = get_embeddings(texts)
    return embeddings

def rag_example(documents, query):
    # Generate embeddings for documents and query
    doc_embeddings = get_embeddings(documents)
    query_embedding = get_embedding([query])[0]

    # Calculate cosine similarity between query and documents
    similarities = cosine_similarity([query_embedding], doc_embeddings)[0]

    # Find the most similar document
    most_similar_index = np.argmax(similarities)
    most_similar_doc = documents[most_similar_index]

    return most_similar_doc

In [41]:
documents = [
    "The capital of France is London.",
    "The Eiffel Tower is located in Brussels.",
    "Paris is the capital of the United Kingdom.",
    "The Louvre Museum is in Rome."
]

query = "What is the capital of France?"
lm_response = get_model_response(query, max_new_tokens=100)
print(f"LLM Response without custom data: {lm_response}")

result = rag_example(documents, query)
print(f"Query: {query}")
print(f"Most relevant document: {result}")

prompt = f"The world has changed, the new updated information will be provided to you. Answer all queries as if this given information is the truth, Given the relevant documents: {result}, answer the query {query}."

rag_response = get_model_response(prompt, max_new_tokens=100)
print(f"RAG Response: {rag_response}")

LLM Response without custom data:  The capital of France is Paris. Paris is not only the capital but also the largest city in France. It serves as the country's political, cultural, and economic center, famous for landmarks such as the Eiffel Tower, the Louvre Museum, Notre-Dame Cathedral, and Champs-Élysées.
Query: What is the capital of France?
Most relevant document: The capital of France is London.
RAG Response:  The given document states that the capital of France is London. Therefore, in alignment with the provided updated information, the capital of France is London.


In [46]:
def knowledge_graph_example(triples, query):
    # Simple function to simulate a knowledge graph query
    # triples: list of (subject, predicate, object) tuples
    # query: a string of the form "<subject> <predicate>"

    # Normalize the query (very simplified): lowercase and collapse whitespace
    normalized_query = ' '.join(query.lower().split())

    # Compare against "<subject> <predicate>" so that multi-word subjects
    # such as "Eiffel Tower" match as well
    results = [(subj, pred, obj) for (subj, pred, obj) in triples
               if f"{subj} {pred}".lower() == normalized_query]

    return results

# Example usage
kg_triples = [
    ("Paris", "is capital of", "United Kingdom"),
    ("Eiffel Tower", "is located in", "Brussels"),
    ("Louvre", "is located in", "Rome"),
    ("France", "has capital", "London")
]

kg_query = "Paris is capital of"

#llm response without custom data

kg_response = get_model_response(kg_query, max_new_tokens=100)
print(f"Knowledge Graph Response without relevant documents: {kg_response}")


kg_result = knowledge_graph_example(kg_triples, kg_query)
print(f"Knowledge Graph Query: {kg_query}")
print(f"Result: {kg_result}")

prompt = f"The world has changed, the new updated information will be provided to you. Answer all queries as if this given information is the truth, answer all queries according to this info only and don't give other info. Given the relevant triples: {kg_result}, answer the query {kg_query}."

kg_response = get_model_response(prompt, max_new_tokens=100)
print(f"Knowledge Graph Response with relevant documents: {kg_response}")

Knowledge Graph Response without relevant documents:  France.
Knowledge Graph Query: Paris is capital of
Result: [('Paris', 'is capital of', 'United Kingdom')]
Knowledge Graph Response with relevant documents:  Paris is the capital of United Kingdom


### B. Generating structured output for specific applications

LLMs may struggle with consistently producing structured output due to their inherent variability and the open-ended nature of language generation. This can lead to parsing errors, inconsistent formatting, or invalid data structures.
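
A quick way to detect these failures is to try parsing the response. The sketch below strips an optional Markdown code fence and returns `None` when the remainder is not valid JSON; the helper name `parse_json_response` and the two sample strings are assumptions for illustration (the second mimics a response cut off by `max_new_tokens`).

```python
import json
import re

def parse_json_response(response):
    """Return the parsed JSON object, or None if the text is not valid JSON."""
    text = response.strip()
    # Drop an optional Markdown fence: three backticks (plus "json") at the start,
    # and three backticks at the end
    text = re.sub(r"^`{3}(?:json)?\s*", "", text)
    text = re.sub(r"\s*`{3}$", "", text)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

fence = "`" * 3
complete = fence + 'json\n{"name": "Marie Curie", "birth_year": 1867}\n' + fence
truncated = fence + 'json\n{"title": "The Clockwork Orchard", "characters": [{'

print(parse_json_response(complete))   # {'name': 'Marie Curie', 'birth_year': 1867}
print(parse_json_response(truncated))  # None
```

A `None` result can be used as a signal to retry the generation or to raise the token limit.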

#### 3. Solution: Constrained decoders and Context-Free Grammars (CFGs)

**Simple explanation of how constrained decoders work**

Constrained decoders limit the possible output tokens at each step of the generation process to ensure that the output follows a specific structure or grammar. This is often implemented by masking the probability distribution over the vocabulary to only allow valid tokens according to the defined constraints.

In [30]:
import torch
import torch.nn.functional as F

def constrained_decoder_example(logits, allowed_tokens):
    # If no logits are passed in, simulate logits from an LLM
    if logits is None:
        vocab_size = 50000
        logits = torch.randn(1, vocab_size)

    # Create a mask for allowed tokens
    mask = torch.zeros_like(logits)
    mask[0, allowed_tokens] = 1

    # Apply the mask to the logits
    masked_logits = logits * mask + (1 - mask) * -1e9

    # Apply softmax to get probabilities
    probs = F.softmax(masked_logits, dim=-1)

    return probs

# Example usage
allowed_tokens = [100, 200, 300, 400, 500]  # Indices of allowed tokens
probs = constrained_decoder_example(None, allowed_tokens)
print("Probabilities for allowed tokens:")
for token, prob in zip(allowed_tokens, probs[0, allowed_tokens]):
    print(f"Token {token}: {prob.item():.4f}")

Probabilities for allowed tokens:
Token 100: 0.0110
Token 200: 0.1399
Token 300: 0.6509
Token 400: 0.0702
Token 500: 0.1279
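
The example above masks a single step. The sketch below extends the same idea across several steps, using a toy vocabulary and a fixed list of allowed tokens per position. The vocabulary, the `allowed_per_step` constraint, and the random stand-in logits are all assumptions for illustration; a real constrained decoder would derive the allowed set from a grammar or schema and take logits from the model.

```python
import random

vocab = ["{", "}", ":", '"name"', '"Alice"', '"Bob"', "hello", "world"]
token_id = {tok: i for i, tok in enumerate(vocab)}

# Hypothetical constraint: the tokens the output format permits at each step
allowed_per_step = [
    ["{"],
    ['"name"'],
    [":"],
    ['"Alice"', '"Bob"'],
    ["}"],
]

rng = random.Random(0)
output = []
for allowed in allowed_per_step:
    # Stand-in for the logits a real model would produce at this step
    logits = [rng.gauss(0.0, 1.0) for _ in vocab]
    allowed_ids = {token_id[t] for t in allowed}
    # Disallowed tokens get -inf so they can never be selected
    masked = [l if i in allowed_ids else float("-inf") for i, l in enumerate(logits)]
    # Greedy choice among the tokens that remain
    next_id = max(range(len(vocab)), key=lambda i: masked[i])
    output.append(vocab[next_id])

print(" ".join(output))
```

Whatever the logits are, the result always has the shape `{ "name" : <one of the allowed names> }`; the model's scores only decide between the permitted alternatives.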


Context-Free Grammars (CFGs) provide a formal way to define the structure of the desired output. They offer several benefits:
- Ensure syntactic correctness of the generated output
- Allow for complex, nested structures
- Can be easily modified to accommodate different output formats
- Enable the generation of a wide variety of structures while maintaining consistency

In [31]:
import random
class CFGNode:
    def __init__(self, type, value=None, children=None):
        self.type = type
        self.value = value
        self.children = children or []

def generate_from_cfg(node):
    # Recursively expand a node into its string form
    if node.type == 'literal':
        return node.value
    elif node.type == 'sequence':
        return ' '.join(generate_from_cfg(child) for child in node.children)
    elif node.type == 'choice':
        return generate_from_cfg(random.choice(node.children))
    else:
        raise ValueError(f"Unknown node type: {node.type}")

# Example CFG for generating a simple JSON-like structure
json_cfg = CFGNode('sequence', children=[
    CFGNode('literal', value='{'),
    CFGNode('literal', value='"name":'),
    CFGNode('choice', children=[
        CFGNode('literal', value='"Alice"'),
        CFGNode('literal', value='"Bob"'),
        CFGNode('literal', value='"Charlie"')
    ]),
    CFGNode('literal', value=','),
    CFGNode('literal', value='"age":'),
    CFGNode('choice', children=[
        CFGNode('literal', value='25'),
        CFGNode('literal', value='30'),
        CFGNode('literal', value='35')
    ]),
    CFGNode('literal', value='}')
])


generated_output = generate_from_cfg(json_cfg)
print("Generated structured output:")
print(generated_output)

Generated structured output:
{ "name": "Alice" , "age": 30 }
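
The grammar above is flat. To illustrate the "complex, nested structures" benefit, here is a small sketch of a recursive grammar written as production rules, with a depth limit so that generation terminates. The grammar, the `expand` function, and the `max_depth` parameter are hypothetical and independent of the `CFGNode` class above.

```python
import json
import random

# Hypothetical grammar: each nonterminal maps to a list of alternative productions
grammar = {
    "value":  [["object"], ["number"], ["string"]],
    "object": [["{", '"child"', ":", "value", "}"]],
    "number": [["1"], ["2"], ["3"]],
    "string": [['"a"'], ['"b"']],
}

def expand(symbol, rng, depth=0, max_depth=3):
    """Recursively expand a symbol into a string of terminals."""
    if symbol not in grammar:
        return symbol  # terminal: emit as-is
    productions = grammar[symbol]
    if depth >= max_depth:
        # Past the depth limit, drop the recursive alternative so generation stops
        productions = [p for p in productions if "object" not in p] or productions
    production = rng.choice(productions)
    return " ".join(expand(s, rng, depth + 1, max_depth) for s in production)

sample = expand("object", random.Random(0))
print(sample)
print(json.loads(sample))  # parses because the grammar only yields well-formed JSON
```

Because every string comes from the grammar, each sample parses as JSON no matter how deeply the objects nest.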


## V. LLMs for Logical and Planning Tasks

### A. Strengths of LLMs in suggesting potential solutions

Large Language Models (LLMs) excel at generating creative and diverse solutions to complex problems. They can:
- Quickly propose multiple approaches to a given task
- Draw upon a vast knowledge base to suggest novel solutions
- Adapt to various domains and problem types
- Provide explanations and rationales for suggested solutions

In [32]:
def get_llm_solution(prompt):
    # Thin wrapper around get_model_response with a larger token budget;
    # swap in a hosted LLM API call here if you are not running the model locally
    return get_model_response(prompt, max_new_tokens=200)

planning_prompt = "Create a step-by-step plan for organizing a small community event."
llm_solution = get_llm_solution(planning_prompt)
print("LLM-generated plan:")
print(llm_solution)

LLM-generated plan:
 Organizing a small community event can be a great way to bring people together and promote community spirit. By following these steps, you can ensure that your event is successful and enjoyable for everyone involved.

1. Set Goals and Objectives: Before you begin planning, it's essential to define the event's purpose. Determine your goals, objectives, and desired outcomes for the event. Do you want to raise awareness for a particular cause or issue? Would you like to provide entertainment or educational opportunities for your members? Establishing a clear purpose helps guide the decision-making process.

2. Establish a Planning Team: Involve volunteers and community members with different skills and expertise. By working together, you will have a broader range of perspectives, which will help with creative ideas and practical solutions.

3. Determine Budget: Allocate a budget for


### B. Limitations in ensuring correctness and logical consistency

Despite their strengths, LLMs face challenges in guaranteeing the correctness and logical consistency of their outputs:
- May generate plausible-sounding but incorrect information
- Can struggle with complex logical reasoning or multi-step planning
- Might produce inconsistent or contradictory steps in a plan
- May lack grounded knowledge of real-world feasibility or practical constraints

### C. Introduction to verifiers

Verifiers are systems or processes designed to check the correctness, consistency, and feasibility of LLM-generated plans or solutions.

Verifiers can:
- Check for logical inconsistencies or contradictions
- Ensure all necessary steps are included
- Verify that the plan adheres to given constraints or requirements
- Identify potential issues or risks in the proposed plan
- Suggest corrections or improvements to enhance the plan's effectiveness
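
As one illustration of a consistency check, the sketch below verifies that certain steps appear in a sensible order, for example that a venue is booked before invitations go out. This is a minimal sketch: the function name `check_step_order`, the keyword-matching approach, and the sample plan are all assumptions made for illustration, not part of the notebook's verifier.

```python
def check_step_order(steps, must_precede):
    """Return a list of issues for ordering constraints that the plan violates.

    steps: list of step descriptions, in plan order.
    must_precede: list of (earlier_keyword, later_keyword) pairs, in lowercase.
    """
    def first_index(keyword):
        # Index of the first step mentioning the keyword, or None if absent.
        for i, step in enumerate(steps):
            if keyword in step.lower():
                return i
        return None

    issues = []
    for earlier, later in must_precede:
        i, j = first_index(earlier), first_index(later)
        if i is None:
            issues.append(f"Missing step mentioning '{earlier}'.")
        elif j is None:
            issues.append(f"Missing step mentioning '{later}'.")
        elif i > j:
            issues.append(f"'{earlier}' should come before '{later}'.")
    return issues

# Hypothetical plan with the invitation step placed too early.
plan_steps = [
    "Send invitations to guests",
    "Set a budget",
    "Book a venue",
]
constraints = [("budget", "venue"), ("venue", "invitations")]
for issue in check_step_order(plan_steps, constraints):
    print(f"- {issue}")
```

Keyword matching is crude and will miss paraphrases, but it shows how a verifier can encode constraints that the LLM is not guaranteed to respect.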

### D. The iterative process of plan generation and verification

The process typically involves:
1. Generate an initial plan using the LLM
2. Pass the plan through a verifier to check for issues
3. If issues are found, provide feedback to the LLM
4. LLM generates an improved plan based on the feedback
5. Repeat steps 2-4 until a satisfactory plan is produced
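
The five steps above can be sketched as a bounded loop. This is a minimal sketch under stated assumptions: `refine_plan` is a hypothetical helper, `generate` and `verify` are passed in as plain callables, and `toy_generate` and `toy_verify` are stand-ins so the example runs without a model. In this notebook, `generate` could be any function that maps a prompt to a plan, and `verify` any function that maps a plan to a list of issues.

```python
def refine_plan(task, generate, verify, max_rounds=3):
    """Generate a plan, then regenerate with verifier feedback until it passes.

    generate: callable taking a prompt string and returning a plan string.
    verify: callable taking a plan string and returning a list of issue strings.
    Returns (plan, remaining_issues).
    """
    plan = generate(f"Create a step-by-step plan for: {task}")
    issues = verify(plan)
    rounds = 0
    # Stop when the verifier is satisfied or the round budget is spent.
    while issues and rounds < max_rounds:
        feedback = "\n".join(f"- {issue}" for issue in issues)
        plan = generate(
            f"Improve the following plan for '{task}' by addressing these issues:\n"
            f"{feedback}\n\nOriginal plan:\n{plan}\n\nImproved plan:"
        )
        issues = verify(plan)
        rounds += 1
    return plan, issues

def toy_generate(prompt):
    # Hypothetical stand-in for the LLM: adds a budget step only when asked to.
    if "budget" in prompt.lower():
        return "1. Set a budget\n2. Book a venue\n3. Invite guests"
    return "1. Book a venue\n2. Invite guests"

def toy_verify(plan):
    # Hypothetical stand-in verifier with a single keyword check.
    return [] if "budget" in plan.lower() else ["No mention of budget considerations."]

plan, issues = refine_plan("Plan a birthday party", toy_generate, toy_verify)
print(plan)
print("Remaining issues:", issues)
```

The `max_rounds` cap matters in practice: a model may never satisfy a strict verifier, so the loop returns the last plan along with whatever issues remain instead of running indefinitely.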

In [35]:
def simple_verifier(plan):
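    # A deliberately simple verifier: keyword checks on the plan text.
    issues = []
    # Ignore blank lines so that paragraph spacing is not counted as steps.
    steps = [line for line in plan.split("\n") if line.strip()]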

    if len(steps) < 3:
        issues.append("Plan is too short. Need at least 3 steps.")

    if not any("budget" in step.lower() for step in steps):
        issues.append("No mention of budget considerations.")

    if not any("promote" in step.lower() or "advertise" in step.lower() for step in steps):
        issues.append("No step for promoting or advertising the event.")

    if not any("number of people" in step.lower() for step in steps):
        issues.append("No mention of the number of people attending the event.")

    return issues

initial_issues = simple_verifier(llm_solution)
print("\nInitial verification results:")
for issue in initial_issues:
    print(f"- {issue}")


Initial verification results:
- No mention of the number of people attending the event.


In [34]:
def interactive_planning_activity():
    print("Welcome to the Interactive Planning Activity!")
    print("We'll use an LLM to create a plan, then verify and improve it.")

    # Step 1: Get user input
    task = input("Enter a planning task (e.g., 'Plan a birthday party'): ")

    # Step 2: Generate initial plan
    planning_prompt = f"Create a step-by-step plan for: {task}"
    initial_plan = get_llm_solution(planning_prompt)
    print("\nInitial LLM-generated plan:")
    print(initial_plan)

    # Step 3: Verify the plan
    issues = simple_verifier(initial_plan)
    print("\nVerification results:")
    for issue in issues:
        print(f"- {issue}")

    # Step 4: Improve the plan based on verification results
    if issues:
        improvement_prompt = f"Improve the following plan for '{task}' by addressing these issues:\n"
        for issue in issues:
            improvement_prompt += f"- {issue}\n"
        improvement_prompt += f"\nOriginal plan:\n{initial_plan}\n\nImproved plan:"

        improved_plan = get_llm_solution(improvement_prompt)
        print("\nImproved plan based on verification:")
        print(improved_plan)
    else:
        print("\nThe initial plan passed the verification without issues!")

    # Discussion
    print("\nDiscussion points:")
    print("1. How did the verifier help improve the plan?")
    print("2. What other checks could we add to the verifier?")
    print("3. How might this process be useful in real-world applications?")

interactive_planning_activity()

Welcome to the Interactive Planning Activity!
We'll use an LLM to create a plan, then verify and improve it.
Enter a planning task (e.g., 'Plan a birthday party'): Plan a birthday party

Initial LLM-generated plan:
 Step 1: Set a Budget
Decide on an overall budget for the party. Consider factors such as number of guests, venue, entertainment and decorations.

Step 2: Choose a Date and Time
Coordinate with the birthday person to find a date and time that works for them and their guests.

Step 3: Make a Guest List
Determine the guest list size and send out invitations (paper or digital) at least two weeks in advance, specifying the date, time, and venue.

Step 4: Pick a Theme (Optional)
Decide on a decor and color scheme. This can be based on the birthday person's likes, or you can choose a random theme that suits the overall party vibe.

Step 5: Plan the Menu
Determine whether you will be hosting the party indoors or outdoors. Create a menu

Verification results:
- No step for promoting