AI Engineering Examples

A collection of practical AI engineering examples, demos, and proof-of-concepts for modern AI applications. This repository serves as a playground for experimenting with cutting-edge AI techniques and preparing for AI engineering roles.

🚀 Quick Start

# Clone the repository
git clone <repository-url>
cd ai-engineering-examples

# Create and activate virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements/requirements.txt

# Run examples
python examples/speculative_decoding/improved_speculative_decoding.py

# Run interactive demo
streamlit run demos/speculative_decoding_demo.py

📁 Repository Structure

ai-engineering-examples/
├── examples/                    # Production-ready AI engineering examples
│   └── speculative_decoding/   # Speculative decoding implementations
│       ├── improved_speculative_decoding.py
│       └── README.md
├── demos/                      # Interactive demos and visualizations
│   └── speculative_decoding_demo.py
├── playground/                 # Learning & experimentation sandbox
│   ├── pythonic_examples.py   # Pythonic programming exercises
│   └── README.md
├── requirements/               # Dependency management
│   └── requirements.txt
├── venv/                      # Virtual environment (not tracked)
├── .gitignore                 # Ignore unnecessary files
└── README.md                  # This file

🎯 Examples

1. Speculative Decoding

Location: examples/speculative_decoding/

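An implementation of speculative decoding, a technique in which a smaller "draft" model proposes several tokens ahead and a larger "target" model verifies them. Because the target model can check a run of proposed tokens in a single forward pass, each accepted token costs less than generating it with the target model alone. This can significantly speed up text generation when the draft model is fast and its proposals are usually accepted.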
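The propose-then-verify loop can be sketched with toy models. This is a minimal illustration under stated assumptions, not the code in improved_speculative_decoding.py: it uses greedy acceptance (a draft token is kept only if it equals the target's own choice), and the names `speculative_step`, `generate`, `target_model`, and `draft_model` are hypothetical. The "models" are plain functions over integer tokens so the sketch runs without PyTorch.

```python
from typing import Callable, List, Tuple

# Assumption: a "model" is any function mapping a token sequence to its next token.
NextToken = Callable[[List[int]], int]


def speculative_step(prefix: List[int], draft: NextToken, target: NextToken,
                     k: int) -> Tuple[List[int], int]:
    """Run one round. Returns (tokens to append, number of draft tokens accepted)."""
    # 1. The draft model proposes k tokens autoregressively.
    ctx = list(prefix)
    proposed = []
    for _ in range(k):
        tok = draft(ctx)
        proposed.append(tok)
        ctx.append(tok)

    # 2. The target model checks each proposal in order. A real implementation
    #    scores all k positions in one batched forward pass; this loop calls the
    #    target once per position only to keep the sketch readable.
    ctx = list(prefix)
    new_tokens: List[int] = []
    for tok in proposed:
        expected = target(ctx)
        if tok != expected:
            new_tokens.append(expected)  # replacement token from the target
            return new_tokens, len(new_tokens) - 1
        new_tokens.append(tok)
        ctx.append(tok)

    # 3. Every proposal was accepted, so the target contributes one extra token.
    new_tokens.append(target(ctx))
    return new_tokens, k


def generate(prompt: List[int], draft: NextToken, target: NextToken,
             k: int, max_new_tokens: int) -> Tuple[List[int], float]:
    """Generate tokens and report the fraction of draft proposals accepted."""
    out = list(prompt)
    accepted = proposed = 0
    while len(out) - len(prompt) < max_new_tokens:
        new_tokens, n_ok = speculative_step(out, draft, target, k)
        out.extend(new_tokens)
        accepted += n_ok
        proposed += k
    return out[: len(prompt) + max_new_tokens], accepted / proposed


def greedy_generate(prompt: List[int], model: NextToken, n: int) -> List[int]:
    """Plain one-token-at-a-time decoding, used as the reference output."""
    out = list(prompt)
    for _ in range(n):
        out.append(model(out))
    return out


# Toy models: the target counts upward modulo 10; the draft agrees except after a 4.
def target_model(ctx: List[int]) -> int:
    return (ctx[-1] + 1) % 10


def draft_model(ctx: List[int]) -> int:
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10


tokens, acceptance_rate = generate([0], draft_model, target_model, k=3, max_new_tokens=8)
print(tokens, f"acceptance rate: {acceptance_rate:.2%}")
```

With greedy acceptance the output always matches what the target model would have produced on its own; only the number of target calls changes. Sampling-based variants accept or reject probabilistically instead, which this sketch does not cover.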

Features:

✅ Multiple model configurations (same model, different models)
✅ Caching for performance optimization
✅ Sophisticated rejection strategies
✅ Memory usage monitoring
✅ Performance comparison with standard decoding
✅ GPU support

Usage:

python examples/speculative_decoding/improved_speculative_decoding.py

Key Insights:

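Acceptance rates vary by model combination: 100% when draft and target are the same model, 95.56% with different models
Aggressive speculation (5 tokens) achieved the highest throughput: 11.03 tokens/sec
Different draft and target models produce real rejections, which lowered throughput to 6.08 tokens/sec in this run
Memory overhead: ~111 MB compared with standard decoding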

2. Interactive Demo with Real-Time Visualization

Location: demos/speculative_decoding_demo.py

A Streamlit-based interactive demo that visualizes the speculative decoding process in real-time with live token generation and performance metrics.

🎨 Real-Time Features:

Live Token Generation: Watch tokens being generated step-by-step as they happen
Color-Coded Tokens:
- 🟢 Green: Accepted tokens (✓)
- 🔴 Red: Rejected tokens (✗) with replacement shown
- 🟡 Yellow: Replacement tokens from target model
Real-Time Metrics: Live updates of elapsed time, tokens generated, and tokens/sec
Performance Comparison: Side-by-side comparison with standard decoding
Interactive Parameter Tuning: Adjust temperature, speculative tokens, and model combinations
Memory Usage Monitoring: Real-time resource tracking
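The live metrics listed above (elapsed time, tokens generated, tokens/sec) reduce to a small amount of bookkeeping. The following is a rough sketch of one way to track them; `ThroughputTracker` and its method names are hypothetical and are not taken from the demo's source.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ThroughputTracker:
    """Counts generated tokens and derives tokens/sec on demand."""
    start: float = field(default_factory=time.perf_counter)
    tokens: int = 0

    def record(self, n: int = 1) -> None:
        # Call once per decoding step with the number of tokens just emitted.
        self.tokens += n

    def snapshot(self, now: Optional[float] = None) -> Dict[str, float]:
        # `now` can be injected for testing; by default the wall clock is read.
        now = time.perf_counter() if now is None else now
        elapsed = now - self.start
        rate = self.tokens / elapsed if elapsed > 0 else 0.0
        return {"elapsed_s": elapsed, "tokens": self.tokens, "tokens_per_sec": rate}
```

A UI loop would call `record()` after each decoding step and redraw its metric widgets from `snapshot()`. Speculative decoding emits a variable number of tokens per step, which is why `record` takes a count.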

Usage:

streamlit run demos/speculative_decoding_demo.py

Note:

The speculative decoding demo is under active development. It currently offers improved color contrast, real-time token updates, and detailed step-by-step token visualization. If you have suggestions for better real-time visualization or usability, or want to contribute, please open an issue or pull request.

🎮 Playground

Location: playground/

A dedicated space for learning, experimentation, and practice. This includes:

Pythonic Programming Exercises: Solutions to Educative.io's "Pythonic Programming Tips for Software Engineers" course
Future Practice Files: Additional learning exercises and experiments
Safe Experimentation: Try new features without affecting production examples

Usage:

# Run Pythonic examples
python playground/pythonic_examples.py

# Add new practice files
touch playground/new_practice.py

🛠️ Technologies Used

PyTorch: Deep learning framework
Transformers: Hugging Face model library
Streamlit: Interactive web applications
Gradio: Alternative demo framework
Plotly: Interactive visualizations
psutil: System monitoring

🎨 Demo Features

Real-Time Speculative Decoding Demo

The interactive demo showcases:

Live Token Visualization: Watch tokens being generated in real-time with color coding
Performance Comparison: Side-by-side comparison with standard decoding
Parameter Tuning: Adjust temperature, speculative tokens, and model combinations
Visual Metrics: Charts showing acceptance rates, speed, and memory usage
Model Selection: Choose from different model combinations
Real-Time Metrics: Live updates of generation progress and performance

Color-Coded Token Display

🟢 Accepted Tokens: Green with checkmark (✓) - Draft model prediction was correct
🔴 Rejected Tokens: Red with X (✗) - Draft model prediction was wrong, replaced with target model's prediction
🟡 Replacement Tokens: Yellow - Target model's replacement for rejected draft token
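The color scheme above maps naturally onto a small rendering helper. The sketch below is an assumption about how such a display could be built, not the demo's actual code: `Status`, `STYLE`, `render_token`, and `render_tokens` are hypothetical names, and the hex colors are arbitrary picks for green, red, and yellow.

```python
import html
from enum import Enum
from typing import Iterable, Tuple


class Status(Enum):
    ACCEPTED = "accepted"        # draft prediction matched the target
    REJECTED = "rejected"        # draft prediction was wrong
    REPLACEMENT = "replacement"  # target's token that replaces a rejected one


# (background, text color, marker) per status; color values are illustrative.
STYLE = {
    Status.ACCEPTED: ("#2e7d32", "#ffffff", "✓"),
    Status.REJECTED: ("#c62828", "#ffffff", "✗"),
    Status.REPLACEMENT: ("#fdd835", "#000000", ""),
}


def render_token(text: str, status: Status) -> str:
    """Return an HTML span for one token, escaping the token text."""
    background, foreground, mark = STYLE[status]
    label = html.escape(text) + (f" {mark}" if mark else "")
    return (f'<span style="background:{background};color:{foreground};'
            f'padding:2px 4px;border-radius:3px">{label}</span>')


def render_tokens(tokens: Iterable[Tuple[str, Status]]) -> str:
    """Join rendered tokens into one HTML string."""
    return " ".join(render_token(text, status) for text, status in tokens)
```

In a Streamlit app, a string like this can be displayed with `st.markdown(..., unsafe_allow_html=True)`. Escaping the token text matters because model output may contain characters such as `<` that would otherwise be read as markup.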

🚀 Getting Started for AI Engineering

For Interviews and POCs

Study the Examples: Understand the implementation patterns
Run the Demos: See the techniques in action with real-time visualization
Modify Parameters: Experiment with different configurations
Add New Examples: Extend with your own AI engineering patterns

For Production Use

Optimize for Scale: Consider using vLLM for production inference
Add Monitoring: Implement proper logging and metrics
Error Handling: Add robust error handling for production environments
Testing: Add unit tests and integration tests
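As one illustration of the monitoring and error-handling points above, a generation call can be wrapped so that failures are logged and retried rather than crashing the caller. This is a minimal sketch using only the standard library; `safe_generate` and the `generate_fn` callable are hypothetical and stand in for whatever inference function a project actually uses.

```python
import logging
import time
from typing import Callable, Optional

logger = logging.getLogger("generation")


def safe_generate(generate_fn: Callable[[str], str], prompt: str,
                  retries: int = 1) -> Optional[str]:
    """Call generate_fn(prompt), logging latency and retrying on failure.

    Returns the generated text, or None if every attempt raised.
    """
    attempts = retries + 1
    for attempt in range(1, attempts + 1):
        start = time.perf_counter()
        try:
            text = generate_fn(prompt)
        except Exception:
            # logger.exception records the traceback alongside the message.
            logger.exception("generation failed (attempt %d/%d)", attempt, attempts)
            continue
        logger.info("generated %d chars in %.3fs", len(text),
                    time.perf_counter() - start)
        return text
    return None
```

Catching every exception is a deliberate simplification here; a production service would usually retry only errors known to be transient and let the rest propagate.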

📊 Performance Benchmarks

Speculative Decoding Results

Configuration            Acceptance Rate   Tokens/sec   Memory (MB)   Speedup
Same Model               100%              8.75         1507.6        1.0x
Different Models         95.56%            6.08         2510.5        0.7x
Conservative (1 token)   100%              8.34         2474.1        0.95x
Aggressive (5 tokens)    100%              11.03        3330.7        1.26x

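The Speedup column is consistent with each configuration's tokens/sec divided by the Same Model row. That baseline is inferred from the numbers and is not stated in the table, so treat the check below as a sketch of the arithmetic rather than a description of how the benchmark script computes it.

```python
# Tokens/sec figures copied from the benchmark table.
tokens_per_sec = {
    "Same Model": 8.75,
    "Different Models": 6.08,
    "Conservative (1 token)": 8.34,
    "Aggressive (5 tokens)": 11.03,
}

# Assumption: the Same Model row is the baseline for the Speedup column.
baseline = tokens_per_sec["Same Model"]
speedup = {name: tps / baseline for name, tps in tokens_per_sec.items()}

for name, ratio in speedup.items():
    print(f"{name:<24} {ratio:.2f}x")
```

A ratio below 1.0x means the configuration was slower than the baseline: in these results, drafting with a different model cost more time than its accepted tokens saved.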
🔧 Development

Adding New Examples

Create a new directory in examples/
Add your implementation
Include a README explaining the technique
Add any dependencies to requirements/requirements.txt
Consider creating a demo in demos/

Best Practices

Documentation: Always include clear documentation
Performance: Monitor and optimize for performance
Visualization: Include visualizations when possible
Modularity: Keep code modular and reusable
Testing: Add tests for critical functionality

🤝 Contributing

This repository is designed for learning and experimentation. Feel free to:

Add new AI engineering examples
Improve existing implementations
Create new interactive demos
Suggest new techniques to explore

📚 Learning Resources

🎯 Use Cases

This repository is perfect for:

AI Engineering Interviews: Demonstrate practical knowledge with real-time demos
POC Development: Rapid prototyping of AI features
Learning: Understanding modern AI techniques with visual feedback
Research: Experimenting with new approaches
Teaching: Visual demonstrations of AI concepts

📄 License

This project is for educational and research purposes. Please respect the licenses of the underlying libraries and models used.

Happy AI Engineering! 🚀

mac2bua/ai-engineering-examples
