<a href="https://colab.research.google.com/github/iso-ai/isopro_examples/blob/main/examples/workflow_example.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Automating a Meme Generator with Workflow Simulation
This notebook demonstrates how to use `isopro.workflow_simulation` to automate a meme generation workflow. We'll train an agent to:

1. Navigate a meme generator website
2. Upload images
3. Add captions
4. Generate and download memes

And do it all automatically!

## Setup
First, let's import our required libraries and set up our environment:

Download the `meme-generator.mp4` video from https://github.com/iso-ai/isopro_examples/blob/main/helper_files/meme-generator.mp4 and upload it to the Files panel in Google Colab.
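
If you would rather fetch the file from code than through the Colab file browser, one option is to rewrite the GitHub `blob` link into its `raw.githubusercontent.com` form and download that. This is only a sketch and not part of isopro: `blob_to_raw_url` and `download_demo` are hypothetical helper names, and the download step needs network access.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve


def blob_to_raw_url(blob_url: str) -> str:
    """Convert a github.com '/blob/' URL into its raw.githubusercontent.com form."""
    parts = urlparse(blob_url).path.strip("/").split("/")
    # Expected layout: <owner>/<repo>/blob/<branch>/<path...>
    if len(parts) < 5 or parts[2] != "blob":
        raise ValueError(f"Not a GitHub blob URL: {blob_url}")
    owner, repo, _, branch, *file_path = parts
    joined_path = "/".join(file_path)
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{branch}/{joined_path}"


def download_demo(blob_url: str, target: Path) -> Path:
    """Download the demo video to `target` (hypothetical helper; needs network)."""
    target.parent.mkdir(parents=True, exist_ok=True)
    urlretrieve(blob_to_raw_url(blob_url), str(target))
    return target


DEMO_URL = "https://github.com/iso-ai/isopro_examples/blob/main/helper_files/meme-generator.mp4"
print(blob_to_raw_url(DEMO_URL))
```

In Colab you could then call `download_demo(DEMO_URL, Path("/content/meme-generator.mp4"))` so the file lands where the later cells look for it.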

In [None]:
!pip3 install isopro stable-baselines3 gymnasium isozero iso-adverse tiktoken

In [None]:
import os
from pathlib import Path
from isopro import (
    WorkflowSimulator,
    AgentConfig,
    VisualizationConfig,
    ValidationConfig
)
import matplotlib.pyplot as plt
from IPython.display import Image, HTML
from google.colab import userdata

## Configuration
Let's create a fun configuration for our meme generator automation:

In [None]:
# Create output directory
output_dir = Path("/content")
output_dir.mkdir(exist_ok=True)

# Configure the learning agent
agent_config = AgentConfig(
    learning_rate=3e-4,
    pretrain_epochs=10,
    use_demonstration=True,  # Learn from video
    use_reasoning=True,     # Understand UI context
    reward_threshold=0.8    # High standard for success
)

# Configure visualization
viz_config = VisualizationConfig(
    show_ui_elements=True,  # Highlight detected UI elements
    show_cursor=True,      # Show cursor movements
    show_actions=True,     # Display action predictions
    save_frames=True,      # Save key moments
    real_time_display=True # Watch the learning happen
)

# Define success criteria
validation_config = ValidationConfig.from_dict({
    "success_criteria": [
        "correct_element_clicked",    # Clicked the right UI elements
        "text_entered_correctly",     # Added text in right places
        "style_applied_correctly",    # Applied correct styling
        "final_state_achieved"        # Successfully saved meme
    ],
    "error_tolerance": 0.1
})
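
Before handing a dictionary like the one above to `ValidationConfig.from_dict`, it can help to sanity-check it yourself. The sketch below is an assumption about what a reasonable check might look like, not isopro's own validation: `check_validation_dict` is a hypothetical helper, and the rule that `error_tolerance` must lie between 0 and 1 is our guess at the intended range.

```python
def check_validation_dict(cfg: dict) -> list:
    """Return a list of problems found in a validation config dict (empty if none)."""
    problems = []

    criteria = cfg.get("success_criteria")
    if not isinstance(criteria, list) or not criteria:
        problems.append("success_criteria must be a non-empty list")
    elif not all(isinstance(c, str) and c for c in criteria):
        problems.append("every success criterion must be a non-empty string")
    elif len(set(criteria)) != len(criteria):
        problems.append("success_criteria contains duplicates")

    tolerance = cfg.get("error_tolerance")
    is_number = isinstance(tolerance, (int, float)) and not isinstance(tolerance, bool)
    # Assumption: the tolerance is a fraction, so it should fall within [0, 1]
    if not is_number or not 0.0 <= tolerance <= 1.0:
        problems.append("error_tolerance must be a number between 0 and 1")

    return problems


example = {
    "success_criteria": [
        "correct_element_clicked",
        "text_entered_correctly",
        "style_applied_correctly",
        "final_state_achieved",
    ],
    "error_tolerance": 0.1,
}
print(check_validation_dict(example))
```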

## Recording a Demonstration
Before we can train our agent, we need to show it how to make memes. Here's how we record a demonstration:

In [None]:
print("""
To create your demonstration video:
1. Open your screen recorder
2. Navigate to your favorite meme generator website
3. Create 2-3 example memes, showing clearly:
   - How to select templates
   - Where to click for text entry
   - How to position elements
   - How to save/download
4. Save the video as 'meme-generator.mp4'
""")

# Locate the demo video before building the simulator
demo_path = Path("/content/meme-generator.mp4")  # Update this with the path to your video in Google Colab

if demo_path.exists():
    print("✅ Demonstration video found!")

    # Initialize simulator with demonstration video
    simulator = WorkflowSimulator(
        video_path=str(demo_path),
        agent_config=agent_config,
        viz_config=viz_config,
        validation_config=validation_config,
        output_dir=str(output_dir)
    )

    # The duration estimate assumes the video was recorded at 30 fps
    print(f"Duration: {simulator.env.total_frames / 30:.1f} seconds")
else:
    print("❌ Please record your demonstration video first!")
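
The duration printout divides the frame count by a hard-coded 30, which is only right for a 30 fps recording. A minimal sketch of the same arithmetic with the frame rate made explicit is shown below; `video_duration_seconds` is a hypothetical helper, and the frame counts are made-up illustration values.

```python
def video_duration_seconds(total_frames: int, fps: float = 30.0) -> float:
    """Convert a frame count into seconds for a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return total_frames / fps


# 900 frames at the assumed 30 fps, then the same clip if it was really 60 fps
print(f"{video_duration_seconds(900):.1f} seconds")
print(f"{video_duration_seconds(900, fps=60.0):.1f} seconds")
```

If your screen recorder uses a different frame rate, passing it explicitly keeps the reported duration honest.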

## Teaching Our Agent How To Make Memes
Now that we have our demonstration, let's train our agent to become a meme master:

In [None]:
print("🔍 Analyzing demonstration video...")
print(f"- Video duration: {simulator.env.total_frames / 30:.1f} seconds")
print(f"- Detected {len(simulator.env._detect_ui_elements())} UI elements")
print("- Analyzing interaction patterns...")

# Train the agent
print("\n🎓 Learning from demonstration...")
training_results = simulator.train_agents()

# Visualize learning progress
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 5))

# Plot mean reward
ax1.bar(['Mean Reward'], [training_results['mean_reward']],
        color='#66B2FF', alpha=0.7)
ax1.set_ylim(0, 1.0)
ax1.set_title('Learning Accuracy')
ax1.grid(True, alpha=0.3)

# Plot success rate
ax2.bar(['Success Rate'], [training_results['success_rate']],
        color='#99FF99', alpha=0.7)
ax2.set_ylim(0, 1.0)
ax2.set_title('Task Completion Rate')
ax2.grid(True, alpha=0.3)

plt.suptitle("Learning Progress", fontsize=14)
plt.tight_layout()
plt.show()
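
Alongside the bar charts, a one-line text summary can make it easier to compare runs. The sketch below assumes the results dictionary carries the `mean_reward` and `success_rate` keys used above, and it assumes that comparing `mean_reward` with the configured `reward_threshold` of 0.8 is a meaningful check, which the library itself may or may not do. `summarize_training` is a hypothetical helper and the numbers are invented for illustration.

```python
def summarize_training(results: dict, reward_threshold: float = 0.8) -> str:
    """Describe training results relative to a reward threshold."""
    mean_reward = float(results["mean_reward"])
    success_rate = float(results["success_rate"])
    verdict = "meets" if mean_reward >= reward_threshold else "is below"
    return (
        f"Mean reward {mean_reward:.2f} {verdict} the "
        f"{reward_threshold:.2f} threshold; success rate {success_rate:.0%}"
    )


# Made-up numbers, not real training output
print(summarize_training({"mean_reward": 0.85, "success_rate": 0.9}))
```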

## Creating New Memes Using the AI Agent
Let's use our trained agent to generate some memes!

In [None]:
print("\n🎨 Creating new memes using learned behaviors...")

for attempt in range(3):
    print(f"\n✨ Creating meme {attempt + 1}/3")

    try:
        # Start fresh
        observation, info = simulator.reset()
        total_reward = 0
        steps = 0
        done = False
        truncated = False

        # Create meme using learned behaviors
        while not (done or truncated):
            # Analyze current UI state
            ui_elements = simulator.env._detect_ui_elements()
            if ui_elements:
                print(f"\n🔍 Current UI Analysis:")
                for elem in ui_elements:
                    elem_type = elem.get('type', 'unknown')
                    elem_bbox = elem.get('bbox', [])
                    elem_state = elem.get('state', 'unknown')

                    bbox_str = f"[{', '.join(f'{x:.2f}' for x in elem_bbox)}]" if elem_bbox else "[]"
                    print(f"- Found {elem_type} element:")
                    print(f"  Position: {bbox_str}")
                    print(f"  State: {elem_state}")

            # Sample an action from the action space (this is a random sample;
            # swap in the trained agent's prediction to use learned behavior)
            action = simulator.action_space.sample()

            # Prepare the action for the environment
            env_action = {
                'action_type': int(action.get('action_type', 0)),
                'target_element': [0, 0, 0, 0],  # Default bbox
                'parameters': {
                    'text_input': '',
                    'drag_end': [0, 0]
                }
            }

            # If we detected UI elements, target one of them
            if ui_elements:
                target_elem = ui_elements[0]  # For demo, target first element
                env_action['target_element'] = target_elem['bbox']

            # Execute action
            try:
                observation, reward, done, truncated, info = simulator.step(env_action)

                # Show what's happening
                print(f"\n🤖 Step {steps + 1}:")
                print(f"   Action Type: {simulator.env._get_action_type(env_action)}")
                print(f"   Reward: {reward:.2f}")

                if info.get('last_action'):
                    print(f"   Result: {info['last_action']}")

                # Track progress
                total_reward += reward
                steps += 1

            except Exception as step_error:
                print(f"⚠️ Step error: {step_error}")
                steps += 1  # Count failed steps too, so the step cap below can still end the loop

            # Show periodic updates
            if steps % 10 == 0:
                print(f"\n📊 Progress Update:")
                print(f"Steps: {steps}")
                print(f"Current Reward: {total_reward:.2f}")

            # Prevent infinite loops
            if steps >= 100:
                print("⚠️ Maximum steps reached")
                break

        # Show results of this attempt
        print(f"\n📊 Creation Results:")
        print(f"Total Steps: {steps}")
        print(f"Final Score: {total_reward:.2f}")

        # Try to display the created meme
        meme_path = output_dir / f"meme_{attempt + 1}.png"
        if meme_path.exists():
            print("🖼️ Created Meme:")
            display(Image(filename=str(meme_path)))
        else:
            print("ℹ️ No meme image was saved for this attempt")

    except Exception as e:
        print(f"❌ Error during creation: {str(e)}")
        import traceback
        print("Detailed error:")
        print(traceback.format_exc())

    print("\n" + "="*50)

print("""
✨ Creation Attempt Complete!

What we demonstrated:
1. UI Element Detection: The agent identified interactive elements
2. Action Generation: Created appropriate actions for each situation
3. Feedback Loop: Received and responded to environment rewards
4. Learning Application: Applied patterns from the demonstration

For real-world use:
1. Record clear workflow demonstrations
2. Ensure consistent UI interactions
3. Run multiple training episodes
4. Fine-tune the reward thresholds
""")

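The rollout cell follows the Gymnasium-style pattern: `reset()` returns `(observation, info)` and `step()` returns `(observation, reward, done, truncated, info)`. The sketch below isolates that loop with a step cap that also counts failed steps, so a step that keeps raising cannot spin forever. It uses a stand-in `ToyEnv` class, which is hypothetical and unrelated to isopro's real environment.

```python
class ToyEnv:
    """Stand-in environment with a Gymnasium-style reset/step interface (hypothetical)."""

    def __init__(self, goal_steps: int = 5, fail_every: int = 0):
        self.goal_steps = goal_steps
        self.fail_every = fail_every  # Raise on every n-th step call (0 = never)
        self.calls = 0
        self.progress = 0

    def reset(self):
        self.calls = 0
        self.progress = 0
        return {"progress": 0}, {}

    def step(self, action):
        self.calls += 1
        if self.fail_every and self.calls % self.fail_every == 0:
            raise RuntimeError("simulated UI glitch")
        self.progress += 1
        done = self.progress >= self.goal_steps
        reward = 1.0 if done else 0.1
        return {"progress": self.progress}, reward, done, False, {}


def run_episode(env, policy, max_steps: int = 100):
    """Run one episode; every attempted step counts toward max_steps."""
    observation, info = env.reset()
    total_reward, steps = 0.0, 0
    done = truncated = False
    while not (done or truncated) and steps < max_steps:
        steps += 1  # Count the attempt before stepping, so failures are capped too
        try:
            observation, reward, done, truncated, info = env.step(policy(observation))
        except Exception as err:
            print(f"Step {steps} failed: {err}")
            continue
        total_reward += reward
    return total_reward, steps


total, steps = run_episode(ToyEnv(goal_steps=5), policy=lambda obs: 0)
print(f"Finished in {steps} steps with total reward {total:.2f}")
```

Putting the cap in the `while` condition, rather than in a check that a `continue` can skip, is the design choice that keeps the loop bounded.
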
## Analyzing Our Meme Factory
Let's look at some fun statistics about our meme generation:

In [None]:
# Evaluate agent performance
eval_results = simulator.evaluate_agents()

# Show statistics
print("\n📊 Performance Analysis:")
print(f"Success Rate: {eval_results['success_rate']*100:.1f}%")
print(f"Average Time: {eval_results['mean_length']:.1f} steps")
print(f"Quality Score: {eval_results['mean_reward']:.2f}/1.0")

# Visualize where time was spent
# (the .get() defaults are placeholder values used when a key is missing from eval_results)
stages = ["Template Selection", "Text Entry", "Styling", "Saving"]
time_distribution = [
    eval_results.get('template_time', 25),
    eval_results.get('text_time', 35),
    eval_results.get('style_time', 25),
    eval_results.get('save_time', 15)
]

plt.figure(figsize=(10, 8))
plt.pie(time_distribution, labels=stages, autopct='%1.1f%%',
        colors=['#FF9999', '#66B2FF', '#99FF99', '#FFCC99'])
plt.title("Time Spent on Different Stages")
plt.show()
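
The pie chart displays each stage as a share of the total, so the raw stage times do not need to add up to 100. A minimal sketch of that normalization is below; `to_percentages` is a hypothetical helper, and the input values are the fallback numbers from the cell above rather than measured timings.

```python
def to_percentages(values):
    """Scale non-negative values so they sum to 100."""
    if any(v < 0 for v in values):
        raise ValueError("values must be non-negative")
    total = sum(values)
    if total <= 0:
        raise ValueError("values must sum to a positive number")
    return [100.0 * v / total for v in values]


stages = ["Template Selection", "Text Entry", "Styling", "Saving"]
# Fallback values from the pie chart cell, not real measurements
shares = to_percentages([25, 35, 25, 15])
for stage, share in zip(stages, shares):
    print(f"{stage}: {share:.1f}%")
```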

## Conclusion
Congratulations! You've successfully created an automated meme factory! 🎉
Some fun things we learned:

- Our agent can learn to navigate UI elements and create memes
- The power of combining computer vision with reinforcement learning
- How to make our code more entertaining with emojis 😄

## Next Steps
Want to make your meme factory even better? Here are some fun ideas:

- Train on different meme templates
- Add text effects and styling
- Create a meme recommendation system
- Build a Discord bot using this automation
- Generate captions using Claude