# 📓 The GenAI Revolution Cookbook

**Title:** Unlocking the Potential of Small Language Models for AI Builders

**Description:** Discover how Small Language Models can revolutionize your AI projects with efficient, scalable solutions. Learn to build, deploy, and optimize SLMs in real-world applications.

---

*This Jupyter notebook contains executable code examples. Run the cells below to try out the code yourself!*



# Building a Memory-Aware Chatbot with Small Language Models

## Introduction

Small Language Models (SLMs) trade some capability for much lower compute and memory requirements, which makes them a practical choice when computational resources are limited. In this tutorial, we will guide you through building a memory-aware chatbot with SLMs using the Hugging Face Transformers library: installing the dependencies, tokenizing text, loading a model, summarizing input, and evaluating results. This hands-on guide is designed to help AI Builders integrate SLMs into existing systems and prepare them for production environments.

## Installation

To get started, install the necessary libraries. We'll use the Hugging Face Transformers library, which provides pre-trained models and tokenizers, together with PyTorch, which the code below relies on when it requests PyTorch tensors (`return_tensors='pt'`).

In [None]:
# Install Hugging Face Transformers and PyTorch (the tensor backend used in this tutorial)
!pip install transformers torch

## Project Setup

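Before diving into the code, let's set up our environment variables. The public checkpoints used in this tutorial download without credentials, so this step only matters if your deployment calls a gated model or an external API. Avoid hard-coding secrets in the notebook; read them from the environment or prompt for them at runtime.

In [None]:
import os
from getpass import getpass

# Prompt for the key only if it is not already set, so the secret never lands in the notebook
if 'API_KEY' not in os.environ:
    os.environ['API_KEY'] = getpass('Enter your API key: ')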

## Step-by-Step Build

### Tokenization

Tokenization is a crucial step in processing text data. We'll use a pre-trained tokenizer from the Hugging Face library to tokenize our input text.

In [None]:
from transformers import AutoTokenizer

# Load a pre-trained tokenizer for the 'distilbert-base-uncased' model
tokenizer = AutoTokenizer.from_pretrained('distilbert-base-uncased')

# Encode a sample text into token IDs, adding the special tokens ([CLS] and [SEP]) the model expects
tokens = tokenizer.encode("Sample text for tokenization", add_special_tokens=True)
print(tokens)
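
To make the idea concrete, here is a minimal sketch of greedy longest-match-first subword splitting, the general approach used by WordPiece tokenizers such as DistilBERT's. The vocabulary `TOY_VOCAB` and the functions `wordpiece_split` and `toy_tokenize` are hypothetical and exist only for illustration; the real tokenizer ships a vocabulary of roughly 30,000 entries and also handles normalization and punctuation.

```python
# Toy vocabulary (hypothetical); '##' marks a piece that continues a word.
TOY_VOCAB = {"token", "##ization", "##ize", "sample", "text", "for", "[UNK]"}

def wordpiece_split(word, vocab=TOY_VOCAB):
    """Split one lowercase word into subword pieces, longest match first."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it appears in the vocabulary
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece fits: the whole word maps to the unknown token
            return ["[UNK]"]
        pieces.append(match)
        start = end
    return pieces

def toy_tokenize(text):
    """Lowercase, split on whitespace, then apply subword splitting to each word."""
    tokens = []
    for word in text.lower().split():
        tokens.extend(wordpiece_split(word))
    return tokens

print(toy_tokenize("Sample text for tokenization"))
# → ['sample', 'text', 'for', 'token', '##ization']
```

Splitting rare words into known pieces is what lets a fixed-size vocabulary cover open-ended text without mapping every unseen word to `[UNK]`.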

### Model Integration

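Next, we'll load a pre-trained model for sequence classification. A classifier assigns a label to each input (for example, an intent or a sentiment); it does not generate text. Because `distilbert-base-uncased` ships without a classification head, the head is randomly initialized on load, and Transformers prints a warning to that effect. Fine-tune the model on labeled data before relying on its predictions.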

In [None]:
from transformers import AutoModelForSequenceClassification

# Load the 'distilbert-base-uncased' model for sequence classification tasks
model = AutoModelForSequenceClassification.from_pretrained('distilbert-base-uncased')
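
A sequence-classification model returns one raw score (logit) per label, and a softmax turns those scores into probabilities. The sketch below shows only that last step in plain Python; the logit values and the label names are invented for illustration and do not come from the model loaded above.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtract the max for numerical stability before exponentiating
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_label(logits, class_names):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]

# Invented values for illustration only
sample_logits = [0.2, 1.7]
class_names = ["negative", "positive"]

label, prob = predict_label(sample_logits, class_names)
print(f"{label} ({prob:.2f})")
```

With a randomly initialized head, the probabilities are close to arbitrary, which is why fine-tuning comes before evaluation.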

### Summarization Function

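We'll create a function to summarize text, which our chatbot uses to produce concise responses. Text generation requires an encoder-decoder or causal model, and the DistilBERT classifier loaded earlier does not support `generate()`. The cell below therefore loads a separate distilled summarization checkpoint; the specific checkpoint is an assumption made here so that the code runs, and any sequence-to-sequence summarization model would work.

In [None]:
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Assumption: this distilled BART checkpoint is swapped in because generate()
# needs a sequence-to-sequence model; replace it with any summarization model you prefer.
SUMMARIZER_NAME = 'sshleifer/distilbart-cnn-12-6'
summary_tokenizer = AutoTokenizer.from_pretrained(SUMMARIZER_NAME)
summary_model = AutoModelForSeq2SeqLM.from_pretrained(SUMMARIZER_NAME)

def summarize_text(text):
    """
    Summarizes the input text using a pre-trained sequence-to-sequence model.

    Args:
        text (str): The text to be summarized.

    Returns:
        str: The summarized text.
    """
    # Tokenize the input into PyTorch tensors, truncating to the model's maximum input length
    inputs = summary_tokenizer(text, return_tensors='pt', truncation=True)

    # Generate a summary with beam search and constraints on output length
    summary_ids = summary_model.generate(
        inputs['input_ids'],
        attention_mask=inputs['attention_mask'],
        max_length=50,
        min_length=25,
        length_penalty=2.0,
        num_beams=4,
        early_stopping=True
    )

    # Decode the generated token IDs back into text and return the result
    return summary_tokenizer.decode(summary_ids[0], skip_special_tokens=True)

# Example usage of the summarize_text function
print(summarize_text("Your long text goes here."))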

### Evaluation

To ensure our model performs well, we'll evaluate its accuracy using a simple metric.

In [None]:
# Example of evaluating model performance
correct_predictions = 80  # Example value
total_predictions = 100   # Example value

# Calculate accuracy as a percentage
accuracy = (correct_predictions / total_predictions) * 100

# Print the model accuracy
print(f"Model Accuracy: {accuracy}%")
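
In practice the two counts come from comparing predictions with reference labels. The sketch below shows that calculation; the function name `accuracy_percent` and the sample label lists are made up for illustration and are not outputs of the model above.

```python
def accuracy_percent(predictions, labels):
    """Return the percentage of predictions that match their reference labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty evaluation set")
    correct = sum(pred == label for pred, label in zip(predictions, labels))
    return 100.0 * correct / len(labels)

# Made-up labels for illustration only
reference_labels = ["greeting", "question", "question", "goodbye", "greeting"]
predicted_labels = ["greeting", "question", "greeting", "goodbye", "greeting"]

print(f"Model Accuracy: {accuracy_percent(predicted_labels, reference_labels):.1f}%")
# → Model Accuracy: 80.0%
```

Accuracy fits classification outputs. For generated summaries, an overlap-based metric or human review is usually a better fit.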

## Full End-to-End Application

Now that we've built each component, let's put them together behind a single entry point for the chatbot. The function below calls `summarize_text`, so run the earlier cells first.

In [None]:
# Chatbot entry point; relies on summarize_text defined in the Summarization Function cell
def chatbot_response(user_input):
    """
    Generates a chatbot response using a pre-trained model.

    Args:
        user_input (str): The user's input text.

    Returns:
        str: The chatbot's response.
    """
    # Summarize the user's input
    response = summarize_text(user_input)
    return response

# Example usage of the chatbot_response function
print(chatbot_response("Tell me about the latest advancements in AI."))
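
The function above treats each message in isolation. One common way to add memory is to keep a rolling window of recent turns and prepend it to the next input. The sketch below is one possible approach, not part of the original pipeline: the class `ConversationMemory`, its `max_turns` parameter, the helper `chat_with_memory`, and the stand-in `echo_model` function are all hypothetical. In the notebook you would pass `summarize_text` (or any other text-to-text function) in place of `echo_model`.

```python
from collections import deque

class ConversationMemory:
    """Keeps the most recent turns so they can be fed back to the model as context."""

    def __init__(self, max_turns=3):
        # A deque with maxlen drops the oldest turn automatically when full
        self.turns = deque(maxlen=max_turns)

    def add_turn(self, user_input, response):
        self.turns.append((user_input, response))

    def build_prompt(self, user_input):
        """Prepend stored turns to the new user input."""
        history = [f"User: {u}\nBot: {r}" for u, r in self.turns]
        return "\n".join(history + [f"User: {user_input}"])

def chat_with_memory(user_input, memory, respond):
    """Generate a reply from the history-augmented prompt, then record the turn."""
    prompt = memory.build_prompt(user_input)
    response = respond(prompt)
    memory.add_turn(user_input, response)
    return response

def echo_model(prompt):
    """Hypothetical stand-in for a real model: reports how many lines of context it saw."""
    return f"(saw {len(prompt.splitlines())} lines of context)"

memory = ConversationMemory(max_turns=2)
for message in ["Hi there", "What is an SLM?", "Why use one?"]:
    print(chat_with_memory(message, memory, echo_model))
```

Capping the window with `max_turns` keeps the prompt within the model's input limit, at the cost of forgetting older turns.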

## Testing & Validation

To validate our chatbot, we can run test queries and evaluate the responses. This step ensures that our application meets the desired performance criteria.
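
A lightweight way to do this is to run a fixed set of queries through the chatbot and check basic properties of each reply. The following is a sketch under assumptions: `run_smoke_tests`, the `max_chars` limit, and the `stub_chatbot` stand-in are hypothetical names chosen for illustration; swap in `chatbot_response` to test the real model.

```python
def run_smoke_tests(respond, queries, max_chars=500):
    """Run each query through `respond` and collect basic pass/fail checks."""
    results = []
    for query in queries:
        reply = respond(query)
        is_string = isinstance(reply, str)
        checks = {
            "is_string": is_string,
            "non_empty": is_string and bool(reply.strip()),
            "within_limit": is_string and len(reply) <= max_chars,
        }
        results.append({"query": query, "passed": all(checks.values()), "checks": checks})
    return results

def stub_chatbot(user_input):
    """Hypothetical stand-in so the harness runs without downloading a model."""
    return user_input[:40]

test_queries = [
    "Tell me about the latest advancements in AI.",
    "What is a small language model?",
]

for result in run_smoke_tests(stub_chatbot, test_queries):
    status = "PASS" if result["passed"] else "FAIL"
    print(f"{status}: {result['query']}")
```

These checks only catch structural failures such as empty or oversized replies; judging whether a reply is actually good still needs reference answers or human review.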

## Conclusion

In this tutorial, we've explored the process of building a memory-aware chatbot using Small Language Models, covering tokenization, model integration, summarization, and evaluation. As next steps, consider scaling your application for deployment, optimizing for latency and cost, and exploring additional use cases such as edge computing or mobile applications.