makayo/AI-ML-Assignment-4-Generative-LLM

Exploration of text generation using GPT-2 and Hugging Face Transformers. Includes environment setup, model loading, parameter tuning (temperature, top-p, max tokens), and output analysis for creative language modeling.

MNIST Digit Classification with a Simple Neural Network

Author: M
Machine Learning: Neural Networks, Data Preprocessing, and Model Evaluation

Environment

  • Python 3.11.3
  • TensorFlow 2.13.0
  • NumPy 1.26.0
  • Matplotlib 3.8.0

Project Overview

This project demonstrates a complete deep learning workflow using a simple feedforward neural network to classify handwritten digits from the MNIST dataset. It covers data inspection, preprocessing, model architecture, training, evaluation, and prediction, all implemented in a Jupyter Notebook.

Dataset

The MNIST dataset contains 70,000 grayscale images of handwritten digits (0–9), each 28×28 pixels. Key components used:

x_train: 60,000 training images

y_train: Corresponding digit labels (0–9)

x_test: 10,000 test images

y_test: Test labels for evaluation

Workflow Summary

1. Data Inspection and Preparation

Loaded MNIST dataset using TensorFlow

Verified image shape and label distribution

Inspected raw pixel values and label formats
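
A minimal sketch of this inspection step (variable names follow the dataset description above; the notebook's exact code may differ):

  import numpy as np
  import tensorflow as tf

  # Load MNIST directly from TensorFlow/Keras
  (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

  print(x_train.shape, y_train.shape)   # (60000, 28, 28) (60000,)
  print(x_test.shape, y_test.shape)     # (10000, 28, 28) (10000,)
  print(x_train.dtype, x_train.min(), x_train.max())   # uint8 0 255

  # Label distribution across the ten digit classes
  print(np.unique(y_train, return_counts=True))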

2. Data Preprocessing

Normalized pixel values from 0–255 to [0, 1]

Flattened 28×28 images into 784-length vectors

One-hot encoded labels for multi-class classification
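
A preprocessing sketch under the same assumptions (the *_norm, *_flat, and *_oh names are illustrative):

  # Scale pixel values from 0-255 down to [0, 1]
  x_train_norm = x_train.astype("float32") / 255.0
  x_test_norm = x_test.astype("float32") / 255.0

  # Flatten each 28x28 image into a 784-length vector
  x_train_flat = x_train_norm.reshape(-1, 784)
  x_test_flat = x_test_norm.reshape(-1, 784)

  # One-hot encode the integer labels for 10 classes
  y_train_oh = tf.keras.utils.to_categorical(y_train, num_classes=10)
  y_test_oh = tf.keras.utils.to_categorical(y_test, num_classes=10)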

3. Visualization

Displayed original, normalized, and flattened images

Visualized one-hot encoded label as a bar chart

Explained differences between formats:

Raw: 2D grayscale image with pixel values 0–255

Normalized: Same image scaled to [0, 1]

Flattened: Reshaped into a 784-length vector

One-hot encoded: Label converted to a binary vector for classification
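
One possible way to produce these visualizations, reusing the illustrative variables from the sketches above:

  import matplotlib.pyplot as plt

  i = 0  # index of the example to visualize

  fig, axes = plt.subplots(1, 4, figsize=(14, 3))
  axes[0].imshow(x_train[i], cmap="gray")
  axes[0].set_title("Raw (0-255)")
  axes[1].imshow(x_train_norm[i], cmap="gray")
  axes[1].set_title("Normalized [0, 1]")
  axes[2].imshow(x_train_flat[i].reshape(1, -1), aspect="auto", cmap="gray")
  axes[2].set_title("Flattened (1 x 784)")
  axes[3].bar(range(10), y_train_oh[i])
  axes[3].set_title(f"One-hot label (digit {y_train[i]})")
  plt.tight_layout()
  plt.show()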

4. Model Architecture

Input layer: 784 features

Hidden layers: 128 and 64 neurons with ReLU activation

Output layer: 10 neurons with Softmax activation

Compiled with Adam optimizer and categorical crossentropy loss
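
A Keras sketch matching this architecture (TensorFlow is assumed per the Environment section; the exact call style may differ from the notebook):

  model = tf.keras.Sequential([
      tf.keras.layers.Input(shape=(784,)),
      tf.keras.layers.Dense(128, activation="relu"),
      tf.keras.layers.Dense(64, activation="relu"),
      tf.keras.layers.Dense(10, activation="softmax"),
  ])

  model.compile(
      optimizer="adam",
      loss="categorical_crossentropy",  # labels are one-hot encoded
      metrics=["accuracy"],
  )
  model.summary()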

5. Model Training

Trained for 5 epochs with batch size of 32

Used 20% of training data for validation

Monitored accuracy and loss during training
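
The training step could then look roughly like this, using the model and preprocessed arrays from the sketches above:

  history = model.fit(
      x_train_flat, y_train_oh,
      epochs=5,
      batch_size=32,
      validation_split=0.2,   # hold out 20% of the training data for validation
  )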

6. Evaluation

Final test accuracy: 97.28%

Evaluated model on unseen test data

Plotted accuracy and loss curves to assess performance
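
A sketch of the evaluation and plotting step under the same assumptions:

  test_loss, test_acc = model.evaluate(x_test_flat, y_test_oh, verbose=0)
  print(f"Test accuracy: {test_acc:.4f}")

  # Accuracy and loss curves for training vs. validation
  fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
  ax1.plot(history.history["accuracy"], label="train")
  ax1.plot(history.history["val_accuracy"], label="validation")
  ax1.set_title("Accuracy")
  ax1.legend()
  ax2.plot(history.history["loss"], label="train")
  ax2.plot(history.history["val_loss"], label="validation")
  ax2.set_title("Loss")
  ax2.legend()
  plt.show()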

7. Prediction

Selected a random test image

Compared true label vs predicted label

Displayed image with prediction result
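
A sketch of the selection and comparison, with idx as an illustrative random index into the test set:

  idx = np.random.randint(len(x_test))
  probs = model.predict(x_test_flat[idx:idx + 1], verbose=0)
  predicted = int(np.argmax(probs))
  print("True:", y_test[idx], "Predicted:", predicted)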

8. Visual Prediction Test

Used Matplotlib to show the digit image

Annotated with true and predicted labels
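
The display step might then be as simple as:

  plt.imshow(x_test[idx], cmap="gray")
  plt.title(f"True: {y_test[idx]}  Predicted: {predicted}")
  plt.axis("off")
  plt.show()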

Confirmed model’s ability to generalize to unseen data

Results

The model achieved 97.28% accuracy on the test set, demonstrating strong generalization. Visual inspection of predictions confirms the model’s ability to correctly classify unseen digits. Training and validation curves show stable learning with no major signs of overfitting.

GPT-2 Text Generation with Hugging Face Transformers

Author: MARK YOSINAO
Generative AI — Language Modeling, Prompt Engineering, and Output Analysis


Environment

  • Python 3.11.3
  • Transformers 4.34.0
  • PyTorch 2.1.0
  • Jupyter Notebook

Project Overview

This project explores generative language modeling using GPT-2 from Hugging Face. The goal is to understand how different generation parameters—temperature, top-p (nucleus sampling), and max token length—affect the creativity, coherence, and diversity of AI-generated text. The entire workflow is implemented in a Jupyter Notebook using Hugging Face Transformers and PyTorch.


Prompt

All experiments were conducted using the poetic prompt:
"The stars whispered secrets to the ocean"

This prompt was chosen for its abstract and imaginative tone, ideal for observing how generation parameters influence narrative style.


Workflow Summary

1. Environment Setup

  • Installed Hugging Face transformers library
  • Verified PyTorch installation (CPU backend)
  • Imported GPT-2 model and tokenizer from Hugging Face Hub
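
A quick way to confirm the setup, assuming the versions listed under Environment:

  # pip install transformers torch   (run once, in a shell or notebook cell)
  import torch
  import transformers

  print(transformers.__version__)    # e.g. 4.34.0
  print(torch.__version__)           # e.g. 2.1.0
  print(torch.cuda.is_available())   # False on a CPU-only setup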

2. Model Loading

  • Loaded gpt2 using GPT2LMHeadModel and GPT2Tokenizer
  • Initialized tokenizer and model into a text generation pipeline
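
A minimal loading sketch using the classes named above:

  from transformers import GPT2LMHeadModel, GPT2Tokenizer, pipeline

  tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
  model = GPT2LMHeadModel.from_pretrained("gpt2")

  # Wrap the model and tokenizer in a text-generation pipeline
  generator = pipeline("text-generation", model=model, tokenizer=tokenizer)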

3. Text Generation Function

  • Defined a reusable generate_text() function
  • Accepted prompt, temperature, top-p, and max token length as inputs
  • Enabled do_sample=True for stochastic sampling
  • Returned decoded output from generated token sequence
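
A sketch of what such a generate_text() helper might look like (the notebook's exact signature and defaults may differ; pad_token_id is set explicitly because GPT-2 has no pad token by default):

  def generate_text(prompt, temperature=0.7, top_p=0.9, max_new_tokens=100):
      """Generate a continuation of `prompt` with stochastic sampling."""
      inputs = tokenizer(prompt, return_tensors="pt")
      outputs = model.generate(
          **inputs,
          do_sample=True,                 # sample instead of greedy decoding
          temperature=temperature,        # higher = more random token choices
          top_p=top_p,                    # nucleus sampling cutoff
          max_new_tokens=max_new_tokens,  # length of the generated continuation
          pad_token_id=tokenizer.eos_token_id,
      )
      return tokenizer.decode(outputs[0], skip_special_tokens=True)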

4. Temperature Experiment

  • Ran generation with temperature values: 0.2, 0.7, and 1.0
  • Observed how randomness affects word choice and tone
  • Compared outputs for coherence and creativity
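
For example, the temperature sweep can be run with the generate_text() sketch above:

  prompt = "The stars whispered secrets to the ocean"

  for temperature in (0.2, 0.7, 1.0):
      print(f"--- temperature={temperature} ---")
      print(generate_text(prompt, temperature=temperature))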

5. Max New Tokens Experiment

  • Tested output length using 50, 100, and 150 tokens
  • Evaluated pacing, completeness, and narrative drift
  • Noted how longer outputs allow deeper storytelling
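
Similarly for output length, reusing the same prompt and helper:

  for max_new_tokens in (50, 100, 150):
      print(f"--- max_new_tokens={max_new_tokens} ---")
      print(generate_text(prompt, max_new_tokens=max_new_tokens))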

6. Top-p (Nucleus Sampling) Experiment

  • Ran generation with top-p values: 0.5, 0.9, and 1.0
  • Analyzed diversity of vocabulary and sentence structure
  • Compared focused vs. free-flowing outputs
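
And for nucleus sampling:

  for top_p in (0.5, 0.9, 1.0):
      print(f"--- top_p={top_p} ---")
      print(generate_text(prompt, top_p=top_p))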

7. Output Analysis

  • Used consistent prompt across all experiments
  • Documented results in notebook and video walkthrough
  • Highlighted trade-offs between control and creativity

Results

  • Temperature and top-p significantly influence tone and imagination
  • Lower values yield safe, repetitive outputs
  • Higher values increase creativity but reduce coherence
  • Max token length controls pacing and narrative complexity
  • Best balance observed at:
    • Temperature = 0.7
    • Top-p = 0.9
    • Max new tokens = 100
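
With the generate_text() sketch above, the reported best-balance settings correspond to a call like:

  print(generate_text(
      "The stars whispered secrets to the ocean",
      temperature=0.7,
      top_p=0.9,
      max_new_tokens=100,
  ))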

Conclusion

This project demonstrates how small changes in generation parameters can dramatically alter the behavior of a language model. By tuning temperature, top-p, and token length, we can guide GPT-2 to produce anything from structured prose to surreal poetry. The experiments offer insight into prompt engineering and the creative potential of generative AI.

