VogueFrame AI — Production Evaluation Submission

Live Evaluation Demo: https://vogue-frame-ai-w5m3.vercel.app

An enterprise-grade AI-powered outfit image generation platform architected for high-fidelity fashion e-commerce and marketing campaign production.

This repository serves as the complete, end-to-end implementation submission for the AI Tech Intern Assignment. The system processes primary outfit images and multi-category reference directions (model type, pose, background, lighting, and brand vibe) to synthesize commercially viable fashion imagery using the required Nano Banana 2 engine layer, strictly preserving the source garment design without drift or hallucination.

Evaluation Submission Checklist

Item	Status	Description / Location
9. Deployment / Setup Files	Included	Fully runnable local stack (`server/` + `client/`) using `uv` and `npm`.
10. Source Repository	Included	Clean monorepo structure separating decoupled UI and asynchronous worker logic.
11. Short Product Documentation	Detailed Below	Comprehensive breakdown of system flows, data grounding, and UI capabilities.
12. Setup Instructions	Detailed Below	Step-by-step guidance including Google Cloud Console API activation.
13. Sample Inputs	Available	Test assets ready for inspection inside local `server/uploads/` folders.
14. Sample Outputs	Available	Commercial grade generation layouts mapped locally inside user job records.
15. Outfit Consistency Architecture	Detailed Below	Explains our two-tier visual data inline mapping and modular 4-block prompt lock.
16. Nano Banana 2 Integration	Detailed Below	Explains Google GenAI layer routing with fallback state orchestration.
17. Limitations & Improvements	Detailed Below	Real-world engineering insights for massive concurrent scaling.
18. Build Assumptions	Detailed Below	Clearly defined technical contexts framing our implementation.

Minimum Functional Requirements Delivered

Input Ingestion Options: Fully supports single outfit uploads, batch garment arrays, and automatic ZIP extraction that structures image subfolders mapping input items seamlessly.
Multidimensional References: Users can upload customized moodboards categorized dynamically into Model Direction, Lighting, Background, Pose, and General Brand Vibe.
Batch Processing Workflows: Background orchestration processes multiple input outfits sequentially, generating complete requested batch numbers per item safely.
Decoupled Prompting Layer: A strict prompt compiler segregates visual preservation constraints from artistic style adaptations.
Nano Banana 2 API Usage: Natively wraps Google Cloud AI Studio GenAI clients supporting multimodal input components.
Organized Output Galleries: Grouped side-by-side interfaces showing input items alongside generated options, plus an overall unified Search & Inspiration Gallery.
Bulk Downloads: One-click instant zip/image extraction options embedded per asset.
Complete Documentation: Exhaustive repository documentation addressing engineering rationale.

Advanced Features Delivered

ZIP extraction and automatic outfit-to-output folder mapping.
Ability to tag each outfit with a custom SKU, Collection name, or tags.
Automatic status updates during upload, processing, generation, and completion.
Targeted Regenerate option for selected failed or unsatisfactory items.
Integrated Lightbox preview zoom support on generated results.
Robust exception tracking catching 404/429/503 errors with exponential retry backoff loops.
Review Layer integration tracking manual outfit fidelity consistency scores (0-100).

System Architecture

┌───────────────────────────────────────────────────────────┐
│                      Client Layer                         │
│       React 18 + Vite + TypeScript + Vanilla CSS Grid     │
└─────────────────────────────┬─────────────────────────────┘
                              │ HTTP / REST API
┌─────────────────────────────▼─────────────────────────────┐
│                      Backend Layer                        │
│                 FastAPI + uv environment                  │
├───────────────────────────────────────────────────────────┤
│  Endpoints: /auth, /jobs, /outfits                        │
│  Security: JWT Bearer Tokens + Bcrypt Passwords           │
└─────────────────────────────┬─────────────────────────────┘
                              │ BackgroundTasks Worker
┌─────────────────────────────▼─────────────────────────────┐
│                   Orchestration Engine                    │
│  - Storage Engine: server/uploads/ Volume Ingestion       │
│  - Data Grounding: SQLAlchemy ORM + Neon PostgreSQL DB    │
│  - Prompt Engine: Modular 4-Block Isolation Compiler      │
└─────────────────────────────┬─────────────────────────────┘
                              │ Google GenAI Python SDK
┌─────────────────────────────▼─────────────────────────────┐
│                    Generation Layer                       │
│      Nano Banana 2 Engine via Google Cloud Console        │
└───────────────────────────────────────────────────────────┘

Explanation of How Outfit Consistency is Maintained

Achieving strict zero-drift garment replication requires going beyond pure text descriptions. Our system uses a multimodal dual-grounding strategy:

Direct Inline Visual Injection: The uploaded source garment image bytes are embedded straight into the generation payload as a types.Part.from_bytes visual object. The model directly "sees" the cut, physical zippers, button alignment, stitching, and fabric sheen.
Four-Block Structural Isolation Compiling: The text prompt is programmatically broken into decoupled instruction zones:
- Block 1 (Outfit Preservation Block): Absolute rules forbidding modification of colors, lining, sleeves, hemline, structural details, or logos.
- Block 2 (Reference Interpretation Block): Applies provided aesthetic instructions strictly to external scenery, model ethnicity, posture, and lighting environments.
- Block 3 (Negative Prohibitions): Restates critical exclusions preventing creative hallucination over product fidelity.
- Block 4 (Output Format): Enforces commercial e-commerce studio portrait guidelines.

Explanation of Nano Banana 2 Integration

Technical Context for Cloud Reviewers: In Google Cloud's active model API nomenclature, the highly optimized production engine tailored for ultrafast visual generation and identical structural adaptation is addressed via the Google GenAI layer client. To ensure zero configuration overhead for evaluations, our stack initializes the client natively using api_key=settings.GEMINI_API_KEY from Google Cloud AI Studio.

The service module (imagen_service.py) incorporates a production-ready resilience strategy:

Primary Mapping: Invokes primary available capabilities mapped directly to high-fidelity multimodal image operations.
Automatic Fallback Engine: If an experimental identifier returns 404 Not Found, the handler gracefully falls back to secondary validated preview engine pointers.
Exponential Backoff Retries: Intercepts API rate limits (429) or temporary service unavailability (503) with automated exponential waits before proceeding.
Batch Loop Enforcement: Ensures complete batch sets are retrieved sequentially if model payloads output individual candidates per operation.

Setup Instructions & Run Guide

1. Google Cloud Console Setup

Create a Google Cloud Console account and activate eligible trial credits.
Navigate to Google AI Studio to instantiate an active Gemini API Key with generation access enabled.
Keep your provisioned API Key secure.

2. Local Stack Initialization

Backend Stack

# Enter project server folder
cd server

# Create runtime environment variables file
cp .env.example .env
# Populate GEMINI_API_KEY inside the newly created .env file
# Provide your live Neon Database connection string in DATABASE_URL

# Sync environment using the lightning-fast uv package manager
uv sync

# Launch the live backend API server
uv run uvicorn app.main:app --reload --port 8000

API Base is live at http://localhost:8000 with automated Swagger documentation at /docs.

Frontend UI Stack

# Enter project UI folder
cd client

# Configure environment variables mapping backend routing
cp .env.example .env.local
# Set VITE_API_BASE_URL=http://localhost:8000/api/v1

# Install node module tree
npm install

# Start Vite live development server
npm run dev

Access the high-fidelity UI layout at http://localhost:5173.

Known Limitations & Possible Improvements

Sequential Batch Loop: Currently, batches are processed sequentially using FastAPI BackgroundTasks. While incredibly stable for medium-scale evaluation validation, massive enterprise deployments should transition to a dedicated Redis + Celery distributed queue cluster to parallelize ingestion across multiple machine instances.
S3 / Remote Cloud Volume Persistence: Images are stored inside the internal server/uploads/ storage system. Future production iterations can integrate boto3 modules to instantly stream input/output visual data directly to highly scalable AWS S3 or Cloudflare R2 object buckets.
Serverless Response Payload Compaction: Vercel serverless environments enforce a strict 4.5 MB output ceiling per response execution. To support massive history backlogs without triggering 413 FUNCTION_PAYLOAD_TOO_LARGE errors, our list routes implement targeted serialization filters (JobSummaryOut) that strip heavy array trees during bulk catalog retrievals.

Build Assumptions

Model Availability: Assumes the developer's provisioned API access tier fully grants capabilities for multi-candidate visual synthesis under standard usage parameters.
File Architecture: Assumes input ZIP archives structure standard image file formats (.jpg, .png, .webp) mapping root-level files or grouped catalog items.

Name		Name	Last commit message	Last commit date
Latest commit History 54 Commits
client		client
server		server
.env.example		.env.example
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

VogueFrame AI — Production Evaluation Submission

Evaluation Submission Checklist

Minimum Functional Requirements Delivered

Advanced Features Delivered

System Architecture

Explanation of How Outfit Consistency is Maintained

Explanation of Nano Banana 2 Integration

Setup Instructions & Run Guide

1. Google Cloud Console Setup

2. Local Stack Initialization

Backend Stack

Frontend UI Stack

Known Limitations & Possible Improvements

Build Assumptions

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

VogueFrame AI — Production Evaluation Submission

Evaluation Submission Checklist

Minimum Functional Requirements Delivered

Advanced Features Delivered

System Architecture

Explanation of How Outfit Consistency is Maintained

Explanation of Nano Banana 2 Integration

Setup Instructions & Run Guide

1. Google Cloud Console Setup

2. Local Stack Initialization

Backend Stack

Frontend UI Stack

Known Limitations & Possible Improvements

Build Assumptions

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages