This project contains two main scripts: sd_zoom.py for generating zoomed sequences of images using Stable Diffusion, and smart_cli.py for an interactive prompt generation and image creation experience.
- Python 3.8+
- Stable Diffusion WebUI with API access enabled
- FFmpeg (for video compilation)
- Ollama (for smart_cli.py)
- Clone this repository
- Install the required Python packages:
pip install webuiapi ollama - Ensure FFmpeg is installed on your system
- Copy
config_example.pytoconfig.pyand modify it according to your setup:Editcp config_example.py config.pyconfig.pyto set your Stable Diffusion server details and other preferences.
The config.py file contains various settings that you can customize:
SD_SERVERS: List of available Stable Diffusion serversDEFAULT_SERVER: Index of the default server in theSD_SERVERSlist- Default values for negative prompts, steps, CFG scale, zoom factor, etc.
- HR Upscaler settings
- Ollama model for prompt generation
Modify these settings according to your preferences and setup.
python sd_zoom.py --prompt "Your prompt here" [options]
--host: Stable Diffusion WebUI API host (default: from config)--port: Stable Diffusion WebUI API port (default: from config)--prompt: Initial prompt for image generation (required)--negative_prompt: Negative prompt (default: from config)--seed: Seed for image generation (-1 for random, default: -1)--steps: Number of steps for image generation (default: from config)--cfg_scale: CFG scale for image generation (default: from config)--width: Width of generated images (default: 960)--height: Height of generated images (default: 640)--zoom_factor: Zoom factor between iterations (default: from config)--count: Number of images to generate in the sequence (default: 3)--denoising_strength: Denoising strength for img2img (default: from config)--output_prefix: Prefix for output filenames (default: "zoomed_image")--framerate: Framerate for the output video (default: from config)--sleep_between_frames: Sleep duration between frame generations (default: 0)--hr_upscaler: Use HR upscaler (flag, default: False)--hr_scale: HR upscaler scale factor (default: from config)
The Smart CLI provides an interactive experience for generating prompts and creating zoomed image sequences.
python smart_cli.py
- AI-assisted prompt generation and evolution using Ollama
- Interactive user preferences for video length, resolution, and more
- Automatic prefix generation for output files
- Choice between configured Stable Diffusion servers
- Logging of the entire process
- The script displays example prompts for inspiration
- You can describe your desired image or choose a random example
- The AI generates an initial prompt based on your description
- You can provide feedback to evolve the prompt iteratively
- Once satisfied, you can set preferences for the video output
- The script generates the zoomed image sequence and compiles a video
Both scripts create a new folder for each run, containing:
- A sequence of PNG images, each representing a step in the zoom process
- An MP4 video file combining all the images into a smooth zoom animation
- A log file (for smart_cli.py) detailing the entire process
- Adjust the parameters in
config.pyto control the default behavior of the scripts - Experiment with different prompts and settings to achieve various effects
- The smart_cli.py script provides a more user-friendly and AI-assisted approach to creating zoom sequences
Here are some example prompts to inspire your creations:
-
Mystical Forest:
((hyper-detailed)), lush forest, ancient trees, misty atmosphere, sunbeams filtering through leaves, {moss-covered rocks|gnarled roots}, hidden fairy houses, glowing mushrooms, ((depth of field)), ethereal lighting, 8k resolution -
Futuristic Cityscape:
((ultra-realistic)), sprawling cyberpunk metropolis, neon-lit skyscrapers, flying vehicles, holographic billboards, {rain-slicked streets|bustling hover-trains}, intricate architectural details, moody atmospheric lighting, reflective surfaces, 8k resolution -
Underwater Coral Reef:
((photorealistic)), vibrant coral reef, diverse tropical fish, {sea turtles|manta rays}, shimmering water caustics, schools of colorful fish, intricate coral formations, sunlight filtering through water, underwater flora, crystal clear water, deep ocean background, 8k resolution -
Vast Desert Landscape:
((highly detailed)), expansive desert, towering sand dunes, {ancient ruins|mysterious oasis}, dramatic shadows, heat haze, distant mountains, weathered rock formations, lone nomad figure, golden hour lighting, wispy clouds, 8k resolution -
Fractal Vine Mosaic:
((intricate details)), abstract fractal vines, {emerald|ruby|sapphire} mosaic tiles, spiraling patterns, ((infinite recursion)), gleaming metallic accents, {sacred geometry|mandelbrot set}, iridescent color shifts, flowing organic shapes, 8k resolution
Feel free to use these prompts as starting points and modify them to suit your creative vision!
- If you encounter connection errors, ensure your Stable Diffusion WebUI is running and the API is enabled
- For Ollama-related issues, check that Ollama is installed and running correctly on your system
- If the video compilation fails, make sure FFmpeg is properly installed and accessible from the command line
Contributions to improve the scripts or add new features are welcome! Please submit a pull request or open an issue to discuss proposed changes.
This project is licensed under the MIT License - see the LICENSE file for details.