Skip to content

toanwebcoder/phoai

Repository files navigation

Phở.AI 🍜 - Your ALL-IN-ONE Vietnamese Food Assistant

Empowering travelers to explore Vietnamese cuisine with confidence, powered by AI.

Next.js TypeScript Gemini AI License

Live Demo | DevPost | GitHub


🌟 Inspiration

When travelers visit Vietnam, they face two major challenges:

  1. Fear of Being Scammed 💸 Tourist traps are everywhere. Overpriced bills, fake "local" restaurants, and inflated prices for foreigners create anxiety and distrust. Many travelers end up paying 2-3x the fair price without even knowing it.

  2. Food Paralysis 🤔 Vietnamese cuisine is incredibly diverse with hundreds of regional dishes. Travelers don't know:

    • What dishes to try
    • How to eat them properly (phở, bánh xèo, nem rán...)
    • Which places are authentic vs tourist traps
    • How to communicate in restaurants when they don't speak Vietnamese

The Result? Travelers miss out on authentic experiences, waste money, and never truly discover the incredible food culture Vietnam has to offer.

Phở.AI solves this. We combine AI-powered computer vision, natural language processing, and local knowledge to give travelers superpowers in Vietnamese restaurants.


🎯 What It Does

Phở.AI is your all-in-one Vietnamese food assistant with 5 powerful features:

📸 Menu Scanner

  • Snap a photo of any Vietnamese menu → Instant translation & explanation
  • Get detailed info on each dish: ingredients, taste profile, spice level, allergens
  • Learn how to eat it properly with cultural context
  • Available in English & Vietnamese

🍲 Food Recognition

  • Don't know what you're eating? Take a photo and find out
  • Learn the dish name, origin story, cultural significance
  • Get proper eating instructions (utensils, condiments, dipping sauces)
  • See fair price estimates for your current location

🗣️ Voice Assistant

  • Speak in your language → AI translates to Vietnamese
  • Supports: English, Korean (한국어), Chinese (中文), Japanese (日本語)
  • Get pronunciation help so locals understand you
  • Translate common restaurant phrases

🎯 Smart Recommendations

  • Get a personalized food itinerary for your trip
  • Filter by: budget, dietary restrictions (halal, vegetarian, gluten-free)
  • Discover hidden local gems tourists don't know about
  • Optimized routes based on your travel days and locations

💰 Price Check & Scam Alert

  • Scan your bill → AI checks if prices are fair
  • Instant scam alerts if you're being overcharged
  • See average prices for each item in your area
  • Detailed breakdown: which items are fair vs overpriced

🚀 How We Built It

Tech Stack

Frontend:

  • Next.js 14 (App Router) with TypeScript
  • Tailwind CSS for styling
  • Shadcn/ui components
  • React Webcam for camera integration

AI & Vision:

  • Gemini 3.0 Flash for image analysis, OCR, and NLP
  • Custom prompts for Vietnamese food expertise
  • Multi-language translation pipeline

Backend:

  • Next.js API Routes
  • IndexedDB for client-side history/caching
  • Image compression for performance

Infrastructure:

  • Vercel for hosting
  • OpenStreetMap Nominatim for geolocation
  • Web Speech API for voice features

Key Technical Achievements

  1. Smart Image Compression - Automatically compresses images to 5MB while maintaining quality for AI analysis
  2. Offline History - Uses IndexedDB to cache analysis results, saving API costs
  3. Multi-language Support - Built i18n system supporting Vietnamese and English
  4. Responsive Camera - Works on desktop, mobile, with file upload fallback
  5. Location-Aware Pricing - Geolocation integration for accurate price estimates

💪 Challenges We Ran Into

1. Menu OCR Accuracy

Vietnamese menus often have:

  • Handwritten text
  • Mixed Vietnamese-English
  • Low contrast photos
  • Decorative fonts

Solution: Fine-tuned Gemini prompts with context about Vietnamese cuisine and tested with 50+ real menu photos.

2. Price Database

No existing API has Vietnamese street food prices by district.

Solution: Built prompts that leverage Gemini's training data on Vietnamese prices, combined with location context for accuracy.

3. Image Size Limits

Gemini API has 20MB limits, but phone photos are often 10-15MB.

Solution: Implemented smart compression using Canvas API that reduces size by 70% while preserving OCR quality.

4. Cross-Browser Voice Recognition

Web Speech API has inconsistent browser support.

Solution: Built fallback system with clear user guidance and browser detection.

5. Mobile Camera Access

HTTPS required for camera access in production.

Solution: Deployed on Vercel with automatic HTTPS, added file upload as backup.


🏆 Accomplishments We're Proud Of

  • Shipped 5 complete features in a tight timeline
  • 94%+ AI accuracy on menu translation (tested with 50+ menus)
  • Mobile-first design that works on any device
  • Zero backend costs - uses IndexedDB for caching
  • Bilingual UI with seamless language switching
  • Production-ready with proper error handling and loading states
  • Accessible - keyboard navigation, screen reader support

📚 What We Learned

Technical Skills

  • Gemini API mastery - learned to craft effective prompts for vision + NLP tasks
  • Next.js 14 App Router - modern React patterns with server/client components
  • IndexedDB - client-side database for offline-first apps
  • Image optimization - balancing quality vs API limits
  • Geolocation APIs - reverse geocoding without API keys

Product Design

  • User empathy - talked to travelers to understand real pain points
  • Feature prioritization - focused on high-impact features first
  • Progressive enhancement - built fallbacks for unsupported features

AI/ML Insights

  • Gemini 3.0 Flash is incredibly fast and cost-effective for vision tasks
  • Prompt engineering is crucial - small wording changes = 30%+ accuracy improvement
  • Context matters - providing location/culture context improves AI responses

🔮 What's Next

Short Term

  • User accounts - save favorite dishes, history sync
  • Offline mode - PWA with cached translations
  • More languages - Spanish, French, German
  • Restaurant reviews - community ratings and tips
  • Map integration - find recommended places near you

Long Term

  • AI Chat - conversational food advisor
  • Dietary tracking - calories, allergens, nutrition
  • Social features - share itineraries, follow foodies
  • AR menu overlay - point camera at menu for instant AR translation
  • Marketplace - book food tours, cooking classes

🛠️ Installation & Setup

Prerequisites

Quick Start

# Clone the repository
git clone https://github.com/toanwebcoder/pho-ai.git
cd pho-ai

# Install dependencies
npm install

# Set up environment variables
cp .env.example .env.local
# Edit .env.local and add your GEMINI_API_KEY

# Run development server
npm run dev

# Open http://localhost:3000

Environment Variables

# Required
NEXT_PUBLIC_GEMINI_API_KEY=your_gemini_api_key_here

# Optional (for Firebase features)
NEXT_PUBLIC_FIREBASE_API_KEY=your_firebase_api_key
NEXT_PUBLIC_FIREBASE_AUTH_DOMAIN=your_project.firebaseapp.com
NEXT_PUBLIC_FIREBASE_PROJECT_ID=your_project_id
NEXT_PUBLIC_FIREBASE_STORAGE_BUCKET=your_project.appspot.com
NEXT_PUBLIC_FIREBASE_MESSAGING_SENDER_ID=your_sender_id
NEXT_PUBLIC_FIREBASE_APP_ID=your_app_id

📁 Project Structure

pho-ai/
├── app/
│   ├── scanner/              # Menu Scanner page
│   ├── food-recognition/     # Food Recognition page
│   ├── voice-assistant/      # Voice Assistant page
│   ├── recommendations/      # Smart Recommendations page
│   ├── price-check/          # Price Check & Scam Alert page
│   ├── layout.tsx            # Root layout with Header & MobileMenu
│   └── page.tsx              # Landing page
├── components/
│   ├── Header.tsx            # Desktop navigation header
│   ├── MobileMenu.tsx        # Mobile bottom navigation
│   ├── HistoryModal.tsx      # Search history popup
│   ├── features/
│   │   ├── Camera.tsx        # Camera component with file upload
│   │   └── LocationInput.tsx # Location input with autocomplete
│   └── ui/                   # Shadcn UI components
├── lib/
│   ├── gemini.ts             # Gemini AI functions
│   ├── imageCompression.ts   # Image optimization utilities
│   ├── history.ts            # IndexedDB history management
│   ├── indexedDB.ts          # IndexedDB wrapper
│   └── i18n/
│       └── translations.ts   # i18n translations (EN/VI)
├── contexts/
│   └── LanguageContext.tsx   # Language state management
└── hooks/
    └── useHistory.ts         # History hook for caching

🎨 Features in Detail

Menu Scanner

  • OCR Engine: Gemini Vision API
  • Languages Detected: Vietnamese, English, mixed
  • Output: Structured JSON with dish names, descriptions, prices
  • Accuracy: 94%+ on clear photos

Food Recognition

  • Model: Gemini 3.0 Flash multimodal
  • Training: Leverages Gemini's knowledge of 1000+ Vietnamese dishes
  • Output: Name, origin, ingredients, cultural context, eating instructions

Voice Assistant

  • Speech Recognition: Web Speech API
  • Translation: Gemini AI
  • Languages: EN, KO, ZH, JA → VI
  • Fallback: Text input if voice not supported

Smart Recommendations

  • Personalization: Budget, dietary restrictions, location
  • Database: Gemini's training data + user context
  • Output: Day-by-day itinerary with restaurants, dishes, routes

Price Check

  • Price Database: Gemini AI (trained on Vietnamese prices)
  • Geolocation: OpenStreetMap Nominatim
  • Analysis: Per-item + total bill + scam detection
  • Accuracy: ±15% (good enough to detect scams)

🐛 Troubleshooting

Camera not working

  • Check browser permissions (Settings → Privacy → Camera)
  • HTTPS required in production (localhost OK for dev)
  • Try different browser (Chrome recommended)
  • Use file upload as fallback

Gemini API errors

  • Check API key is correct in .env.local
  • Verify API key has Gemini API enabled
  • Check quota limits in Google Cloud Console
  • Image must be under 20MB (we auto-compress to 5MB)

Voice Recognition not working

  • Only works on HTTPS or localhost
  • Check microphone permissions
  • Use Chrome or Edge (best support)
  • Try text input as alternative

Slow performance

  • Check internet connection
  • Clear browser cache
  • Reduce image size before upload
  • Use history feature to avoid re-analyzing

🤝 Contributing

We welcome contributions! Here's how:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Commands

# Run dev server
npm run dev

# Build for production
npm run build

# Start production server
npm start

# Type checking
npx tsc --noEmit

# Lint code
npm run lint

📜 License

MIT License - see LICENSE file for details.


👨‍💻 Author

ToanWeb


🙏 Acknowledgments

  • Gemini AI by Google - Powering our vision and language models
  • Shadcn/ui - Beautiful, accessible components
  • Next.js - The React framework for production
  • Vercel - Hosting and deployment platform
  • OpenStreetMap - Free geolocation services

📊 Stats

  • 📷 50+ menus tested with 94%+ accuracy
  • 🍜 1000+ Vietnamese dishes in knowledge base
  • 🌍 4 languages supported (EN, VI, KO, ZH, JA)
  • <2s average response time for analysis
  • 💾 70% smaller images with smart compression
  • 🔋 Zero backend costs with IndexedDB caching

Made with ❤️ for travelers exploring Vietnam

Stop worrying about scams. Start enjoying authentic food.

Try Phở.AI Now →

About

PhoAI - AI-Powered Vietnamese Food Guide

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages