ProgrammersCube/Image-Generation-Agent

Image Generation Agent (LangGraph + LangChain)

Image Generation Agent is a Streamlit application that turns a topic into one or more images using an API-based image model. It uses a LangGraph pipeline to build and optionally refine prompts, then calls the image generation endpoint to render results. Because generation runs through a remote API, no heavy local model is required, which keeps the app lightweight and responsive.

Key Features

  • Topic-to-image generation with a LangGraph agent flow
  • Optional prompt refinement using an LLM
  • Configurable image size and count
  • Model selection with fallback when access is restricted
  • Clean Streamlit UI for fast iteration

Installation

System Requirements

  • Windows 10/11, macOS, or Linux
  • Python 3.10+
  • Internet connection for API calls

Setup

  1. Create and activate a virtual environment.
  2. Install dependencies.
  3. Configure environment variables.
python -m venv .venv
# Windows
.\.venv\Scripts\activate
# macOS/Linux
source .venv/bin/activate
python -m pip install -r requirement.txt

Create a .env file in the project root:

OPENAI_API_KEY="your_api_key_here"
OPENAI_TEXT_MODEL="gpt-4o-mini"
OPENAI_IMAGE_MODEL="gpt-image-1"
OPENAI_BASE_URL="https://api.openai.com/v1"

Demo Video

https://www.loom.com/share/7d79a0fff53046af9b3114fa4b39ce21

Usage

Run the App

python -m streamlit run app.py

Configuration Options

  • OPENAI_API_KEY: API key for the image and text endpoints
  • OPENAI_TEXT_MODEL: model used to refine prompts
  • OPENAI_IMAGE_MODEL: default image model
  • OPENAI_BASE_URL: API base URL (optional)
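
As a rough illustration of how these four variables could be read at startup, here is a minimal sketch using only the standard library. The `Settings` class and `load_settings` function are hypothetical names, and the fallback values simply mirror the sample .env above; they are not confirmed defaults of the app. It also assumes the .env file has already been loaded into the environment (for example with python-dotenv's `load_dotenv()`, if the project uses it).

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the documented configuration options."""
    api_key: str
    text_model: str
    image_model: str
    base_url: str


def load_settings(env=None) -> Settings:
    """Read the documented variables from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY", "").strip()
    if not api_key:
        # The key is the only option without a sensible fallback.
        raise RuntimeError("OPENAI_API_KEY is not set")
    return Settings(
        api_key=api_key,
        # Fallbacks mirror the sample .env; assumed, not confirmed app defaults.
        text_model=env.get("OPENAI_TEXT_MODEL", "gpt-4o-mini"),
        image_model=env.get("OPENAI_IMAGE_MODEL", "gpt-image-1"),
        base_url=env.get("OPENAI_BASE_URL", "https://api.openai.com/v1").rstrip("/"),
    )
```

Failing early on a missing key surfaces configuration problems before the first API call rather than in the middle of a generation request.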

Examples

Topic: “a futuristic city skyline”
Style: “cinematic lighting, digital art”
Mood: “mysterious, vibrant”
Extra: “ultra-detailed, wide angle, high resolution”

The app builds a prompt like:

a futuristic city skyline, style: cinematic lighting, digital art, mood: mysterious, vibrant, ultra-detailed, wide angle, high resolution
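
One way to produce a prompt in that shape is sketched below. The function name `build_prompt` and the rule of skipping empty fields are assumptions made for illustration; the app's actual builder may differ.

```python
def build_prompt(topic: str, style: str = "", mood: str = "", extra: str = "") -> str:
    """Join the form fields into one comma-separated draft prompt."""
    parts = [topic.strip()]
    if style.strip():
        parts.append(f"style: {style.strip()}")
    if mood.strip():
        parts.append(f"mood: {mood.strip()}")
    if extra.strip():
        # Extra details are appended as-is, without a label.
        parts.append(extra.strip())
    return ", ".join(parts)


prompt = build_prompt(
    topic="a futuristic city skyline",
    style="cinematic lighting, digital art",
    mood="mysterious, vibrant",
    extra="ultra-detailed, wide angle, high resolution",
)
print(prompt)
```

With the example inputs, this prints the same prompt string shown above.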

Technical Documentation

Architecture Overview

flowchart LR
    A[User Topic Input] --> B[LangGraph: Build Prompt]
    B --> C[LangGraph: Optional LLM Refinement]
    C --> D[Image API Request]
    D --> E[Streamlit UI Output]

Core Components

  • Prompt Builder: Combines topic, style, mood, and extra details into a draft prompt.
  • Prompt Refiner: Uses a text model to rewrite the draft prompt when enabled.
  • Image Generator: Calls the image API and returns image outputs.
  • UI Layer: Streamlit forms for inputs and image display.
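
The way these components chain together can be sketched in plain Python, without depending on LangGraph itself. Each node takes the current state and returns a partial update, which is also how LangGraph nodes behave. The names `make_pipeline`, `build`, `refine`, and `generate` are hypothetical, and the refiner and image generator are stubs standing in for the real LLM and image API calls.

```python
from typing import Callable, Dict, Optional

State = Dict[str, object]
Node = Callable[[State], State]


def make_pipeline(build: Node, generate: Node, refine: Optional[Node] = None) -> Node:
    """Chain the nodes in flowchart order; skip refinement when it is disabled."""
    def run(state: State) -> State:
        state = {**state, **build(state)}
        if refine is not None and state.get("refine"):
            state = {**state, **refine(state)}
        return {**state, **generate(state)}
    return run


# Stub nodes for illustration only.
def build(state: State) -> State:
    return {"prompt": f"{state['topic']}, style: {state['style']}"}


def refine(state: State) -> State:
    return {"prompt": f"{state['prompt']} (refined)"}  # stands in for the text model


def generate(state: State) -> State:
    return {"images": [f"<image for: {state['prompt']}>"]}  # stands in for the image API


pipeline = make_pipeline(build, generate, refine)
result = pipeline({"topic": "a lighthouse", "style": "watercolor", "refine": True})
print(result["prompt"])
```

In a LangGraph version, each function would presumably be registered as a node on a state graph, with the refinement step reached through a conditional edge.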

API References

The app calls a standard image generation endpoint at:

POST /v1/images/generations
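
For reference, a request to this endpoint could be assembled as in the sketch below, which uses only the standard library and does not send anything. The helper name `build_image_request` is hypothetical, and the app itself may use an SDK or LangChain wrapper instead. The `model`, `prompt`, `size`, and `n` fields correspond to the configurable model, image size, and image count.

```python
import json
import urllib.request


def build_image_request(
    api_key: str,
    prompt: str,
    model: str = "gpt-image-1",
    size: str = "1024x1024",
    n: int = 1,
    base_url: str = "https://api.openai.com/v1",
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the image generation endpoint."""
    payload = {"model": model, "prompt": prompt, "size": size, "n": n}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/images/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_image_request("your_api_key_here", "a futuristic city skyline", n=2)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen(req)` would actually send it. Keeping construction separate from sending makes the payload easy to inspect and test without network access.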

Contribution Guidelines

Reporting Issues

  • Open an issue with steps to reproduce, expected behavior, and actual behavior.
  • Include error logs or screenshots when possible.

Pull Requests

  1. Fork the repository and create a feature branch.
  2. Keep changes focused and small.
  3. Add or update documentation as needed.
  4. Submit a PR with a clear description.

Coding Standards

  • Keep functions small and single-purpose.
  • Use clear, descriptive variable names.
  • Avoid committing secrets or API keys.

License

MIT License. See LICENSE for details.
