# 🚀 Enhanced Mujoco Viewer UI
## By Anbar Althaf & Arhaan Girdhar

**Interactive interface for viewing and controlling trained RL models with improved UI and command execution.**

This notebook provides:
- 🎮 Direct execution of `python sb3.py <environment> <algorithm> -s <model_path>` commands
- 🎛️ Interactive UI with dropdown selections
- 📊 Real-time output display
- 🤖 Automatic model detection

---


In [1]:
# Install and import required libraries
import sys
import subprocess

def install_if_missing(package):
    try:
        __import__(package)
    except ImportError:
        print(f"Installing {package}...")
        subprocess.check_call([sys.executable, "-m", "pip", "install", package])

# Install required packages
packages = ['ipywidgets', 'matplotlib']
for package in packages:
    install_if_missing(package)

print("✅ All packages ready!")


✅ All packages ready!
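
The install helper above passes the same string to `__import__` and to `pip install`, which works for `ipywidgets` and `matplotlib` because their import names match their pip names. As a hedged sketch for packages where the two differ, the check can take an explicit mapping and use `importlib.util.find_spec`, which looks a module up without importing it. The names `is_installed` and `missing_packages` are hypothetical helpers, not part of this notebook.

```python
import importlib.util


def is_installed(import_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(import_name) is not None


def missing_packages(requirements: dict) -> list:
    """Given {pip_name: import_name}, return the pip names whose module is not found."""
    return [
        pip_name
        for pip_name, import_name in requirements.items()
        if not is_installed(import_name)
    ]


# The two packages this notebook installs; both use the same name for pip and import.
required = {"ipywidgets": "ipywidgets", "matplotlib": "matplotlib"}
print("Missing:", missing_packages(required))
```

Only the names reported as missing would then need to be passed to `pip install`.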


In [2]:
# Import required libraries
import gymnasium as gym
from stable_baselines3 import SAC, TD3, A2C
import numpy as np
import matplotlib.pyplot as plt
import ipywidgets as widgets
from IPython.display import display, clear_output, HTML
import time
import os
import sys
import threading
import subprocess
from collections import deque
import warnings
warnings.filterwarnings('ignore')

print("📦 Libraries imported successfully!")
print(f"🐍 Python version: {sys.version}")
print(f"📁 Current directory: {os.getcwd()}")


📦 Libraries imported successfully!
🐍 Python version: 3.10.18 (main, Jun  5 2025, 08:37:47) [Clang 14.0.6 ]
📁 Current directory: /Users/arhaan17/Coding/Movement_Tracking_Mujoco/test3


In [3]:
def execute_sb3_command(model_name, environment="Humanoid-v4", algorithm="A2C", show_available=True):
    """Execute sb3.py with specified parameters"""
    model_path = f"models/{model_name}.zip"
    
    print(f"🚀 Executing: python sb3.py {environment} {algorithm} -s {model_path}")
    print(f"📊 Model: {model_name}")
    print(f"🎮 Environment: {environment}")
    print(f"🤖 Algorithm: {algorithm}")
    print("\n" + "="*60)
    
    # Check if model exists
    if not os.path.exists(model_path):
        print(f"❌ Model not found: {model_path}")
        if show_available and os.path.exists("models"):
            print("\n📁 Available models:")
            models = sorted([f for f in os.listdir("models") if f.endswith('.zip')])
            for i, f in enumerate(models[:20], 1):  # Show first 20
                print(f"   {i:2}. {f}")
            if len(models) > 20:
                print(f"   ... and {len(models)-20} more models")
        return False
    
    try:
        print("🎮 Starting Mujoco viewer...")
        print("⚠️  The viewer will open in a new window")
        print("⏹️  Close the viewer window when done")
        print("🔄  Output will appear below...\n")
        
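        # Execute the command with real-time output.
        # sys.executable runs sb3.py with the same interpreter as the notebook kernel,
        # and -u disables the child's output buffering so lines arrive as they are printed.
        process = subprocess.Popen(
            [sys.executable, "-u", "sb3.py", environment, algorithm, "-s", model_path],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            bufsize=1
        )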
        
        # Print output in real-time
        output_lines = []
        while True:
            output = process.stdout.readline()
            if output == '' and process.poll() is not None:
                break
            if output:
                line = output.strip()
                print(line)
                output_lines.append(line)
        
        rc = process.poll()
        if rc == 0:
            print("\n✅ Execution completed successfully!")
            return True
        else:
            print(f"\n❌ Execution failed with return code: {rc}")
            return False
            
    except KeyboardInterrupt:
        print("\n⏹️ Execution interrupted by user")
        try:
            process.terminate()
        except Exception:
            pass
        return False
    except Exception as e:
        print(f"💥 Error: {str(e)}")
        return False

# ========== MODIFY THESE PARAMETERS ==========
MODEL_NAME = "A2C_8125000"      # Change this to your desired model
ALGORITHM = "A2C"             # A2C, SAC, or TD3
ENVIRONMENT = "Humanoid-v4"   # Environment name
# =============================================

print(f"🎯 Quick Execution Settings:")
print(f"   📊 Model: {MODEL_NAME}")
print(f"   🤖 Algorithm: {ALGORITHM}")
print(f"   🎮 Environment: {ENVIRONMENT}")
print(f"\n📝 To change settings: Edit the parameters above and re-run this cell")
print(f"🚀 To execute: Run the next cell!")


🎯 Quick Execution Settings:
   📊 Model: A2C_8125000
   🤖 Algorithm: A2C
   🎮 Environment: Humanoid-v4

📝 To change settings: Edit the parameters above and re-run this cell
🚀 To execute: Run the next cell!
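
The settings above follow the `<ALGORITHM>_<steps>` naming pattern that the rest of this notebook relies on (for example `A2C_8125000`). As a minimal sketch, assuming that pattern holds for every saved model, a small check can catch a mismatch between `MODEL_NAME` and `ALGORITHM` before the viewer launches. The helper names `parse_model_name` and `settings_are_consistent` are hypothetical and not part of `sb3.py`.

```python
import re

# Assumed naming convention: algorithm prefix, underscore, training step count.
MODEL_NAME_PATTERN = re.compile(r"^(A2C|SAC|TD3)_(\d+)$")


def parse_model_name(model_name):
    """Split a name such as 'A2C_8125000' into (algorithm, training_steps), or return None."""
    match = MODEL_NAME_PATTERN.match(model_name)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def settings_are_consistent(model_name, algorithm):
    """Return True when the model name's prefix matches the selected algorithm."""
    parsed = parse_model_name(model_name)
    return parsed is not None and parsed[0] == algorithm


print(parse_model_name("A2C_8125000"))
print(settings_are_consistent("A2C_8125000", "SAC"))
```

Loading a checkpoint with the wrong algorithm class generally fails, so checking the prefix first gives a clearer error than a failed subprocess.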


In [4]:
# Execute the command with current settings
print("🚀 Executing sb3.py command...\n")
success = execute_sb3_command(MODEL_NAME, ENVIRONMENT, ALGORITHM)

if success:
    print("\n🎉 Command executed successfully!")
else:
    print("\n⚠️  Command failed or was interrupted.")
    print("💡 Tips:")
    print("   - Check that the model file exists in models/ directory")
    print("   - Verify the algorithm matches your model type")
    print("   - Make sure sb3.py is in the current directory")


🚀 Executing sb3.py command...

🚀 Executing: python sb3.py Humanoid-v4 A2C -s models/A2C_8125000.zip
📊 Model: A2C_8125000
🎮 Environment: Humanoid-v4
🤖 Algorithm: A2C

🎮 Starting Mujoco viewer...
⚠️  The viewer will open in a new window
⏹️  Close the viewer window when done
🔄  Output will appear below...

logger.deprecation(
Exception: code expected at most 16 arguments, got 18
Wrapping the env with a `Monitor` wrapper
Wrapping the env in a DummyVecEnv.
logger.deprecation(
logger.warn(
Exception ignored in: <function WindowViewer.__del__ at 0x167cec9d0>
Traceback (most recent call last):
File "/opt/homebrew/Caskroom/miniconda/base/envs/test3/lib/python3.10/site-packages/gymnasium/envs/mujoco/mujoco_rendering.py", line 410, in __del__
File "/opt/homebrew/Caskroom/miniconda/base/envs/test3/lib/python3.10/site-packages/gymnasium/envs/mujoco/mujoco_rendering.py", line 397, in free
TypeError: 'NoneType' object is not callable

✅ Execution completed successfully!
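
The run above streams the child's output line by line and reports success from the return code alone, which is why warnings and an ignored `__del__` exception can still end in a success message. A hedged sketch of the same streaming pattern in isolation follows. `stream_command` is a hypothetical helper, and the demo command is a stand-in for `sb3.py`, which is not reproduced here.

```python
import subprocess
import sys


def stream_command(cmd):
    """Run cmd, echo each output line as it arrives, and return (return_code, lines)."""
    lines = []
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same stream
        text=True,
        bufsize=1,                 # line-buffered on the parent side
    ) as process:
        for raw in process.stdout:  # iteration ends when the child closes stdout
            line = raw.rstrip("\n")  # keep leading indentation, e.g. in tracebacks
            print(line)
            lines.append(line)
    # Leaving the with-block waits for the child, so returncode is set here.
    return process.returncode, lines


# Stand-in child process; -u disables the child's own output buffering.
demo_cmd = [sys.executable, "-u", "-c", "print('step 1'); print('step 2')"]
return_code, captured = stream_command(demo_cmd)
print("Return code:", return_code)
```

Returning the captured lines alongside the return code lets the caller scan for `Exception` or `Traceback` text when a zero exit status is not enough evidence of a clean run.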

## 🎛️ Interactive UI Dashboard

**Enhanced interface with dropdown selections and real-time execution monitoring**
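
The dashboard class below passes per-episode statistics from the child process back to the notebook by printing a JSON list between `STATS_JSON_START` and `STATS_JSON_END` marker lines. As a minimal sketch of that marker protocol, the parser below extracts the payload from a list of output lines. `extract_stats` is a hypothetical helper and the sample output is made-up illustration data, not a real run.

```python
import json


def extract_stats(lines, start="STATS_JSON_START", end="STATS_JSON_END"):
    """Return the JSON payload found between the marker lines, or None if absent or invalid."""
    payload = []
    capturing = False
    for line in lines:
        text = line.strip()
        if text == start:
            capturing = True
            payload = []
        elif text == end and capturing:
            try:
                return json.loads("".join(payload))
            except json.JSONDecodeError:
                return None
        elif capturing:
            payload.append(text)
    return None


# Illustrative output only; the values are invented for this example.
sample_output = [
    "Episode 1/1: Reward: 10.50, Steps: 20, Time: 0.3s",
    "STATS_JSON_START",
    '[{"episode": 1, "reward": 10.5, "steps": 20, "time": 0.3}]',
    "STATS_JSON_END",
]
stats = extract_stats(sample_output)
rewards = [episode["reward"] for episode in stats]
print(rewards)
```

Printing structured data between sentinel lines keeps the child script free of any dependency on the notebook while still letting ordinary log lines share the same stream.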


In [5]:
class EnhancedMujocoViewer:
    def __init__(self):
        self.current_process = None
        self.run_statistics = {
            'episode_rewards': [],
            'episode_steps': [],
            'cumulative_rewards': [],
            'episode_times': [],
            'model_name': '',
            'algorithm': '',
            'environment': ''
        }
        self.algorithm_descriptions = {
            'A2C': {
                'name': 'Advantage Actor-Critic',
                'description': 'Policy gradient method with value function baseline. Fast but less sample efficient.',
                'best_for': 'Quick training, simpler tasks',
                'pros': ['Fast computation', 'Stable on simple tasks', 'Low memory usage'],
                'cons': ['Sample inefficient', 'Struggles with complex continuous control', 'Can be unstable']
            },
            'SAC': {
                'name': 'Soft Actor-Critic',
                'description': 'Maximum entropy off-policy algorithm. Excellent for continuous control tasks.',
                'best_for': 'Humanoid walking, robotic control, complex continuous tasks',
                'pros': ['Highly sample efficient', 'Excellent exploration', 'Very stable', 'Great for continuous control'],
                'cons': ['More complex', 'Slightly slower per step']
            },
            'TD3': {
                'name': 'Twin Delayed Deep Deterministic Policy Gradient',
                'description': 'Improved DDPG with twin critics and delayed policy updates.',
                'best_for': 'Continuous control, robotic manipulation',
                'pros': ['Good for continuous control', 'More stable than DDPG', 'Deterministic policy'],
                'cons': ['Less exploration than SAC', 'Sensitive to hyperparameters']
            }
        }
        self.setup_ui()
        
    def get_available_models(self):
        """Get available models organized by algorithm"""
        models = {'A2C': [], 'SAC': [], 'TD3': []}
        
        if not os.path.exists('models'):
            print("⚠️  Models directory not found")
            return models
            
        for file in os.listdir('models'):
            if file.endswith('.zip'):
                for algo in models.keys():
                    if file.startswith(algo):
                        try:
                            # Extract training steps from filename
                            steps_str = file.replace(f'{algo}_', '').replace('.zip', '')
                            steps = int(steps_str)
                            models[algo].append((steps, file.replace('.zip', '')))
                        except ValueError:
                            # Non-numeric suffix: keep the model, listed as 0 steps
                            models[algo].append((0, file.replace('.zip', '')))
        
        # Sort by training steps (descending)
        for algo in models:
            models[algo].sort(reverse=True)
            
        return models
    
    def update_models(self, change):
        """Update model dropdown when algorithm changes"""
        models = self.get_available_models()
        algo = change['new']
        
        if models[algo]:
            model_options = [f"{steps:,} steps" for steps, _ in models[algo][:15]]
            self.model_dropdown.options = model_options
        else:
            self.model_dropdown.options = [f'No {algo} models found']
        
        # Update algorithm description
        self.update_algorithm_description(algo)

    def update_algorithm_description(self, algo):
        """Update algorithm description display"""
        desc = self.algorithm_descriptions[algo]
        
        pros_html = '</li><li>'.join(desc['pros'])
        cons_html = '</li><li>'.join(desc['cons'])
        
        html_content = f'''
        <div style="padding: 15px; background: linear-gradient(135deg, #e8f5e8 0%, #f0f8ff 100%); border-radius: 8px; margin: 10px 0;">
            <h4 style="margin: 0 0 10px 0; color: #2e7d32;">🤖 {desc['name']} ({algo})</h4>
            <p style="margin: 5px 0; font-size: 14px;"><strong>Description:</strong> {desc['description']}</p>
            <p style="margin: 5px 0; font-size: 14px;"><strong>🎯 Best for:</strong> {desc['best_for']}</p>
            
            <div style="display: flex; gap: 15px; margin-top: 10px;">
                <div style="flex: 1;">
                    <strong style="color: #2e7d32;">✅ Pros:</strong>
                    <ul style="margin: 5px 0; padding-left: 20px; font-size: 13px;">
                        <li>{pros_html}</li>
                    </ul>
                </div>
                <div style="flex: 1;">
                    <strong style="color: #d32f2f;">❌ Cons:</strong>
                    <ul style="margin: 5px 0; padding-left: 20px; font-size: 13px;">
                        <li>{cons_html}</li>
                    </ul>
                </div>
            </div>
        </div>
        '''
        self.algo_description.value = html_content

    def update_model_info(self, change):
        """Update model information display"""
        model_text = change['new']
        algo = self.algo_dropdown.value
        
        if 'No' in model_text and 'found' in model_text:
            self.model_info.value = f'<div style="padding: 8px; background-color: #ffebee; border-radius: 3px; color: #c62828;">❌ No {algo} models available</div>'
            return
        
        try:
            steps = int(model_text.split()[0].replace(',', ''))
            model_name = f"{algo}_{steps}"
            model_path = f"models/{model_name}.zip"
            
            if os.path.exists(model_path):
                file_size = os.path.getsize(model_path) / (1024*1024)  # MB
                info_html = f'''
                <div style="padding: 8px; background-color: #e8f5e8; border-radius: 3px;">
                    <b>üìä Model Info:</b><br>
                    üè∑Ô∏è Name: {model_name}<br>
                    üéØ Training Steps: {steps:,}<br>
                    üíæ File Size: {file_size:.1f} MB<br>
                    üìÅ Path: {model_path}
                </div>
                '''
            else:
                info_html = f'<div style="padding: 8px; background-color: #ffebee; border-radius: 3px; color: #c62828;">❌ Model file not found: {model_path}</div>'
                
            self.model_info.value = info_html
        except Exception:
            self.model_info.value = '<div style="padding: 8px; background-color: #fff3e0; border-radius: 3px;">⚠️ Invalid model selection</div>'

    def get_selected_model_name(self):
        """Get the selected model name"""
        model_text = self.model_dropdown.value
        algo = self.algo_dropdown.value
        
        if 'No' in model_text and 'found' in model_text:
            return None
        
        try:
            steps = int(model_text.split()[0].replace(',', ''))
            return f"{algo}_{steps}"
        except Exception:
            return None

    def execute_command(self, button):
        """Execute the sb3.py command with enhanced parameters"""
        model_name = self.get_selected_model_name()
        if not model_name:
            self.status_label.value = '<div style="padding: 10px; background-color: #ffebee; border-radius: 5px; color: #c62828;"><b>❌ Error</b> - No valid model selected</div>'
            return
        
        algo = self.algo_dropdown.value
        env = self.env_dropdown.value
        episodes = self.episodes_input.value
        max_steps = self.max_steps_input.value
        seed = self.seed_input.value if self.seed_input.value > 0 else None
        record_video = self.record_video_checkbox.value
        
        model_path = f"models/{model_name}.zip"
        
        if not os.path.exists(model_path):
            self.status_label.value = f'<div style="padding: 10px; background-color: #ffebee; border-radius: 5px; color: #c62828;"><b>❌ Error</b> - Model file not found: {model_path}</div>'
            return
        
        # Update status
        self.status_label.value = f'<div style="padding: 10px; background-color: #fff3e0; border-radius: 5px; color: #f57c00;"><b>🟡 Running</b> - Executing {model_name} on {env} for {episodes} episodes</div>'
        
        with self.output:
            clear_output(wait=True)
            
            print(f"🚀 Enhanced Mujoco Viewer - Execution Started")
            print(f"{'='*70}")
            print(f"📊 Model: {model_name}")
            print(f"🤖 Algorithm: {algo}")
            print(f"🎮 Environment: {env}")
            print(f"📁 Model Path: {model_path}")
            print(f"🎯 Episodes: {episodes}")
            print(f"⏱️  Max Steps per Episode: {max_steps}")
            if seed:
                print(f"🎲 Random Seed: {seed}")
            else:
                print(f"🎲 Random Seed: Not set (random)")
            print(f"📹 Record Video: {'Yes' if record_video else 'No'}")
            print(f"⏰ Started at: {time.strftime('%H:%M:%S')}")
            print(f"{'='*70}\n")
            
            try:
                # Clear previous statistics
                self.run_statistics = {
                    'episode_rewards': [],
                    'episode_steps': [],
                    'cumulative_rewards': [],
                    'episode_times': [],
                    'model_name': model_name,
                    'algorithm': algo,
                    'environment': env
                }
                
                # Create enhanced sb3.py command with parameters
                python_script = f"""
import gymnasium as gym
from stable_baselines3 import SAC, TD3, A2C
import os
import time
import random
import numpy as np
import json

# Set seed if provided
seed = {seed if seed else 'None'}
if seed is not None:
    random.seed(seed)
    np.random.seed(seed)

# Load model
env_name = '{env}'
algo = '{algo}'
model_path = '{model_path}'
episodes = {episodes}
max_steps = {max_steps}
record_video = {record_video}

print(f"Loading {algo} model from {model_path}...")

# Create environment
if record_video:
    from gymnasium.wrappers import RecordVideo
    env = gym.make(env_name, render_mode='rgb_array')
    env = RecordVideo(env, video_folder='videos', episode_trigger=lambda x: True)
else:
    env = gym.make(env_name, render_mode='human')

# Load model based on algorithm
if algo == 'SAC':
    model = SAC.load(model_path, env=env)
elif algo == 'TD3':
    model = TD3.load(model_path, env=env)
elif algo == 'A2C':
    model = A2C.load(model_path, env=env)
else:
    raise ValueError(f"Unknown algorithm: {algo}")

print(f"Running {episodes} episodes with max {max_steps} steps each...")
print("Press Ctrl+C to stop early\\n")

total_reward = 0
total_steps = 0
episode_data = []

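for episode in range(episodes):
    episode_start_time = time.time()
    # Seed the environment on the first reset so the run is reproducible
    obs, _ = env.reset(seed=seed if episode == 0 else None)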
    episode_reward = 0
    episode_steps = 0
    
    print(f"Episode {{episode + 1}}/{episodes}: ", end="", flush=True)
    
    for step in range(max_steps):
        action, _ = model.predict(obs, deterministic=True)
        obs, reward, terminated, truncated, _ = env.step(action)
        episode_reward += reward
        episode_steps += 1
        
        if terminated or truncated:
            break
    
    episode_time = time.time() - episode_start_time
    total_reward += episode_reward
    total_steps += episode_steps
    
    # Store episode data
    episode_info = dict()
    episode_info['episode'] = episode + 1
    episode_info['reward'] = float(episode_reward)
    episode_info['steps'] = int(episode_steps)
    episode_info['time'] = float(episode_time)
    episode_data.append(episode_info)
    
    print(f"Reward: {{episode_reward:.2f}}, Steps: {{episode_steps}}, Time: {{episode_time:.1f}}s")

# Save statistics for visualization
print("\\nSTATS_JSON_START")
print(json.dumps(episode_data))
print("STATS_JSON_END")

print("\\n📊 Summary:")
print(f"   Average Reward: {{total_reward/episodes:.2f}}")
print(f"   Average Steps: {{total_steps/episodes:.1f}}")
print(f"   Total Episodes: {episodes}")

env.close()
print("✅ Execution completed!")
"""
                # Use the kernel's interpreter; -u keeps the child's output unbuffered for live display
                cmd = [sys.executable, "-u", "-c", python_script]
                
                print(f"🔧 Executing enhanced sb3.py with custom parameters...")
                print(f"🎮 Opening Mujoco viewer...")
                print(f"⚠️  Viewer will open in a separate window")
                print(f"⏹️  Use the Stop button or close the window to end")
                if record_video:
                    print(f"📹 Videos will be saved to 'videos' folder")
                print(f"\n🔄 Output:\n")
                
                # Execute with real-time output
                self.current_process = subprocess.Popen(
                    cmd,
                    stdout=subprocess.PIPE,
                    stderr=subprocess.STDOUT,
                    text=True,
                    bufsize=1,
                    universal_newlines=True
                )
                
                # Monitor output in real-time and capture statistics
                capturing_stats = False
                stats_lines = []
                
                while True:
                    output = self.current_process.stdout.readline()
                    if output == '' and self.current_process.poll() is not None:
                        break
                    if output:
                        line = output.strip()
                        print(line)
                        
                        # Capture statistics JSON
                        if line == "STATS_JSON_START":
                            capturing_stats = True
                        elif line == "STATS_JSON_END":
                            capturing_stats = False
                            # Parse the statistics
                            if stats_lines:
                                try:
                                    import json
                                    stats_json = ''.join(stats_lines)
                                    episode_stats = json.loads(stats_json)
                                    
                                    # Update run statistics
                                    self.run_statistics['episode_rewards'] = [ep['reward'] for ep in episode_stats]
                                    self.run_statistics['episode_steps'] = [ep['steps'] for ep in episode_stats]
                                    self.run_statistics['episode_times'] = [ep['time'] for ep in episode_stats]
                                    self.run_statistics['cumulative_rewards'] = [sum(self.run_statistics['episode_rewards'][:i+1]) for i in range(len(self.run_statistics['episode_rewards']))]
                                    
                                    print(f"\n📊 Statistics captured for visualization!")
                                except Exception as e:
                                    print(f"\n⚠️ Error parsing statistics: {e}")
                        elif capturing_stats:
                            stats_lines.append(line)
                
                rc = self.current_process.poll()
                if rc == 0:
                    print(f"\n✅ Execution completed successfully at {time.strftime('%H:%M:%S')}!")
                    self.status_label.value = '<div style="padding: 10px; background-color: #e8f5e8; border-radius: 5px; color: #2e7d32;"><b>🟢 Completed</b> - Execution finished successfully</div>'
                else:
                    print(f"\n❌ Execution failed with return code: {rc}")
                    self.status_label.value = f'<div style="padding: 10px; background-color: #ffebee; border-radius: 5px; color: #c62828;"><b>❌ Failed</b> - Return code: {rc}</div>'
                    
            except Exception as e:
                print(f"💥 Error: {str(e)}")
                self.status_label.value = f'<div style="padding: 10px; background-color: #ffebee; border-radius: 5px; color: #c62828;"><b>💥 Error</b> - {str(e)}</div>'
            finally:
                self.current_process = None

    def stop_execution(self, button):
        """Stop the current execution"""
        if self.current_process:
            try:
                self.current_process.terminate()
                self.status_label.value = '<div style="padding: 10px; background-color: #ffebee; border-radius: 5px; color: #c62828;"><b>⏹️ Stopped</b> - Execution terminated by user</div>'
                with self.output:
                    print(f"\n⏹️ Execution stopped by user at {time.strftime('%H:%M:%S')}")
            except Exception:
                pass
            self.current_process = None
        else:
            self.status_label.value = '<div style="padding: 10px; background-color: #fff3e0; border-radius: 5px; color: #f57c00;"><b>⚠️ Info</b> - No active execution to stop</div>'

    def refresh_models(self, button):
        """Refresh the available models list"""
        models = self.get_available_models()
        algo = self.algo_dropdown.value
        
        if models[algo]:
            model_options = [f"{steps:,} steps" for steps, _ in models[algo][:15]]
            self.model_dropdown.options = model_options
            self.status_label.value = f'<div style="padding: 10px; background-color: #e8f5e8; border-radius: 5px; color: #2e7d32;"><b>🔄 Refreshed</b> - Found {len(models[algo])} {algo} models</div>'
        else:
            self.model_dropdown.options = [f'No {algo} models found']
            self.status_label.value = f'<div style="padding: 10px; background-color: #fff3e0; border-radius: 5px; color: #f57c00;"><b>⚠️ No Models</b> - No {algo} models found</div>'

    def show_models_summary(self, button):
        """Show a summary of all available models"""
        models = self.get_available_models()
        with self.output:
            clear_output(wait=True)
            print("📊 MODELS SUMMARY")
            print("="*80)
            print()
            
            total_models = 0
            for algo in ['A2C', 'SAC', 'TD3']:
                count = len(models[algo])
                total_models += count
                print(f"🤖 {algo} ({self.algorithm_descriptions[algo]['name']}):")
                print(f"   📈 Models: {count}")
                
                if count > 0:
                    steps_list = [steps for steps, _ in models[algo]]
                    print(f"   📊 Training Range: {min(steps_list):,} - {max(steps_list):,} steps")
                    print(f"   🎯 Best Model: {max(steps_list):,} steps")
                    print(f"   💡 {self.algorithm_descriptions[algo]['best_for']}")
                else:
                    print(f"   ❌ No models found")
                print()
            
            print(f"📋 TOTAL: {total_models} models across all algorithms")
            print("="*80)

    def visualize_performance(self, button):
        """Show performance visualizations for the last run"""
        if not self.run_statistics['episode_rewards']:
            with self.output:
                clear_output(wait=True)
                print("📊 PERFORMANCE VISUALIZATION")
                print("="*60)
                print("❌ No run data available!")
                print("💡 Execute a model run first to see performance graphs.")
                print("="*60)
            return
        
        with self.output:
            clear_output(wait=True)
            
            import matplotlib.pyplot as plt
            import numpy as np
            
            print("📊 PERFORMANCE VISUALIZATION")
            print("="*80)
            print(f"🤖 Model: {self.run_statistics['model_name']}")
            print(f"🔬 Algorithm: {self.run_statistics['algorithm']}")
            print(f"🎮 Environment: {self.run_statistics['environment']}")
            print("="*80)
            
            # Create subplots
            fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(15, 10))
            fig.suptitle(f'Performance Analysis: {self.run_statistics["model_name"]}', fontsize=16, fontweight='bold')
            
            episodes = range(1, len(self.run_statistics['episode_rewards']) + 1)
            
            # 1. Episode Rewards
            ax1.plot(episodes, self.run_statistics['episode_rewards'], 'b-o', linewidth=2, markersize=6)
            ax1.set_title('🎯 Episode Rewards', fontsize=14, fontweight='bold')
            ax1.set_xlabel('Episode')
            ax1.set_ylabel('Reward')
            ax1.grid(True, alpha=0.3)
            ax1.axhline(y=np.mean(self.run_statistics['episode_rewards']), color='r', linestyle='--', 
                       label=f'Mean: {np.mean(self.run_statistics["episode_rewards"]):.2f}')
            ax1.legend()
            
            # 2. Episode Steps
            ax2.bar(episodes, self.run_statistics['episode_steps'], color='green', alpha=0.7)
            ax2.set_title('⏱️ Steps per Episode', fontsize=14, fontweight='bold')
            ax2.set_xlabel('Episode')
            ax2.set_ylabel('Steps')
            ax2.grid(True, alpha=0.3)
            ax2.axhline(y=np.mean(self.run_statistics['episode_steps']), color='r', linestyle='--',
                       label=f'Mean: {np.mean(self.run_statistics["episode_steps"]):.1f}')
            ax2.legend()
            
            # 3. Cumulative Rewards
            cumulative = np.cumsum(self.run_statistics['episode_rewards'])
            ax3.plot(episodes, cumulative, 'purple', linewidth=3)
            ax3.fill_between(episodes, cumulative, alpha=0.3, color='purple')
            ax3.set_title('📈 Cumulative Rewards', fontsize=14, fontweight='bold')
            ax3.set_xlabel('Episode')
            ax3.set_ylabel('Cumulative Reward')
            ax3.grid(True, alpha=0.3)
            
            # 4. Performance Distribution
            ax4.hist(self.run_statistics['episode_rewards'], bins=max(3, len(episodes)//2), 
                    color='orange', alpha=0.7, edgecolor='black')
            ax4.axvline(x=np.mean(self.run_statistics['episode_rewards']), color='r', linestyle='--', linewidth=2,
                       label=f'Mean: {np.mean(self.run_statistics["episode_rewards"]):.2f}')
            ax4.axvline(x=np.median(self.run_statistics['episode_rewards']), color='blue', linestyle='--', linewidth=2,
                       label=f'Median: {np.median(self.run_statistics["episode_rewards"]):.2f}')
            ax4.set_title('📊 Reward Distribution', fontsize=14, fontweight='bold')
            ax4.set_xlabel('Reward')
            ax4.set_ylabel('Frequency')
            ax4.legend()
            ax4.grid(True, alpha=0.3)
            
            plt.tight_layout()
            plt.show()
            
            # Print summary statistics
            print("\n📈 PERFORMANCE SUMMARY")
            print("="*50)
            print(f"📊 Total Episodes: {len(self.run_statistics['episode_rewards'])}")
            print(f"🎯 Average Reward: {np.mean(self.run_statistics['episode_rewards']):.2f}")
            print(f"🏆 Best Episode: {np.max(self.run_statistics['episode_rewards']):.2f}")
            print(f"📉 Worst Episode: {np.min(self.run_statistics['episode_rewards']):.2f}")
            print(f"📏 Reward Std Dev: {np.std(self.run_statistics['episode_rewards']):.2f}")
            print(f"⏱️  Average Steps: {np.mean(self.run_statistics['episode_steps']):.1f}")
            print(f"🔥 Total Steps: {np.sum(self.run_statistics['episode_steps'])}")
            print(f"⚡ Success Rate: {(np.array(self.run_statistics['episode_rewards']) > 0).mean()*100:.1f}%")
            print("="*50)

    def display(self):
        """Display the enhanced UI"""
        # Custom header with gradient styling
        header_html = """
        <div style="background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); 
                    color: white; padding: 25px; text-align: center; 
                    border-radius: 12px; margin-bottom: 25px;
                    box-shadow: 0 8px 32px rgba(0,0,0,0.1);">
            <h1 style="margin: 0; font-size: 2.2em;">🚀 Enhanced Mujoco Viewer</h1>
            <p style="margin: 10px 0 0 0; font-size: 1.1em; opacity: 0.9;">Interactive interface for RL model visualization and control with advanced parameters</p>
        </div>
        """
        display(HTML(header_html))
        
        # Algorithm description section
        algo_desc_header = widgets.HTML('<h3 style="color: #333; margin: 20px 0 10px 0;">📚 Algorithm Information</h3>')
        
        # Model selection section
        selection_header = widgets.HTML('<h3 style="color: #333; margin: 20px 0 10px 0;">🤖 Model Selection</h3>')
        
        selection_row = widgets.HBox([
            self.algo_dropdown,
            self.model_dropdown,
            self.env_dropdown
        ], layout=widgets.Layout(margin='0 0 15px 0'))
        
        # Execution parameters section
        params_header = widgets.HTML('<h3 style="color: #333; margin: 20px 0 10px 0;">⚙️ Execution Parameters</h3>')
        
        params_info = widgets.HTML('''
        <div style="padding: 10px; background-color: #e3f2fd; border-radius: 5px; margin-bottom: 10px; font-size: 13px;">
            <strong>üí° Parameter Guide:</strong><br>
            ‚Ä¢ <strong>Episodes:</strong> Number of complete runs (1-10)<br>
            ‚Ä¢ <strong>Max Steps:</strong> Maximum steps per episode before timeout<br>
            ‚Ä¢ <strong>Seed:</strong> Random seed for reproducibility (0 = random)<br>
            ‚Ä¢ <strong>Record Video:</strong> Save videos to 'videos' folder
        </div>
        ''')
        
        params_row1 = widgets.HBox([
            self.episodes_input,
            self.max_steps_input
        ], layout=widgets.Layout(margin='0 0 10px 0'))
        
        params_row2 = widgets.HBox([
            self.seed_input,
            self.record_video_checkbox
        ], layout=widgets.Layout(margin='0 0 15px 0'))
        
        # Control buttons section
        controls_header = widgets.HTML('<h3 style="color: #333; margin: 20px 0 10px 0;">🎮 Controls</h3>')
        
        buttons_row1 = widgets.HBox([
            self.execute_btn,
            self.stop_btn,
            self.refresh_btn
        ], layout=widgets.Layout(margin='0 0 10px 0'))
        
        buttons_row2 = widgets.HBox([
            self.summary_btn,
            self.visualize_btn
        ], layout=widgets.Layout(margin='0 0 15px 0'))
        
        # Status and info section
        status_header = widgets.HTML('<h3 style="color: #333; margin: 20px 0 10px 0;">📊 Status & Information</h3>')
        
        info_section = widgets.VBox([
            self.status_label,
            self.model_info
        ], layout=widgets.Layout(margin='0 0 15px 0'))
        
        # Output section
        output_header = widgets.HTML('<h3 style="color: #333; margin: 20px 0 10px 0;">📺 Execution Output</h3>')
        
        # Main layout
        main_layout = widgets.VBox([
            algo_desc_header,
            self.algo_description,
            selection_header,
            selection_row,
            params_header,
            params_info,
            params_row1,
            params_row2,
            controls_header,
            buttons_row1,
            buttons_row2,
            status_header,
            info_section,
            output_header,
            self.output
        ])
        
        display(main_layout)
    
    def setup_ui(self):
        """Create the UI components"""
        models = self.get_available_models()
        
        # Algorithm dropdown
        self.algo_dropdown = widgets.Dropdown(
            options=['A2C', 'SAC', 'TD3'],
            value='A2C',
            description='🤖 Algorithm:',
            style={'description_width': '120px'},
            layout=widgets.Layout(width='200px')
        )
        
        # Model dropdown
        initial_models = [f"{steps:,} steps" for steps, _ in models['A2C'][:15]] if models['A2C'] else ['No models found']
        self.model_dropdown = widgets.Dropdown(
            options=initial_models,
            description='📊 Model:',
            style={'description_width': '120px'},
            layout=widgets.Layout(width='300px')
        )
        
        # Environment dropdown
        self.env_dropdown = widgets.Dropdown(
            options=['Humanoid-v4', 'Humanoid-v5', 'HumanoidStandup-v4'],
            value='Humanoid-v4',
            description='🎮 Environment:',
            style={'description_width': '120px'},
            layout=widgets.Layout(width='250px')
        )
        
        # Additional parameters
        self.episodes_input = widgets.IntSlider(
            value=1,
            min=1,
            max=10,
            step=1,
            description='🎯 Episodes:',
            style={'description_width': '120px'},
            layout=widgets.Layout(width='300px')
        )
        
        self.max_steps_input = widgets.IntSlider(
            value=1000,
            min=100,
            max=5000,
            step=100,
            description='⏱️ Max Steps:',
            style={'description_width': '120px'},
            layout=widgets.Layout(width='300px')
        )
        
        self.seed_input = widgets.IntText(
            value=0,
            description='🎲 Seed (0=random):',
            style={'description_width': '120px'},
            layout=widgets.Layout(width='250px')
        )
        
        self.record_video_checkbox = widgets.Checkbox(
            value=False,
            description='📹 Record Video',
            style={'description_width': '120px'},
            layout=widgets.Layout(width='200px')
        )
        
        # Control buttons with better layout
        button_layout = widgets.Layout(width='140px', height='45px')
        
        self.execute_btn = widgets.Button(
            description='🚀 Run Viewer',
            button_style='primary',
            layout=button_layout,
            tooltip='Execute sb3.py with selected model'
        )
        
        self.stop_btn = widgets.Button(
            description='⏹️ Stop',
            button_style='danger',
            layout=button_layout,
            tooltip='Stop current execution'
        )
        
        self.refresh_btn = widgets.Button(
            description='🔄 Refresh',
            button_style='info',
            layout=button_layout,
            tooltip='Refresh available models list'
        )
        
        self.summary_btn = widgets.Button(
            description='📊 Summary',
            button_style='warning',
            layout=button_layout,
            tooltip='Show summary of all available models'
        )
        
        self.visualize_btn = widgets.Button(
            description='📈 Visualize',
            button_style='success',
            layout=button_layout,
            tooltip='Show performance visualizations'
        )
        
        # Status display
        self.status_label = widgets.HTML(
            value='<div style="padding: 10px; background-color: #e3f2fd; border-radius: 5px;"><b>🟢 Ready</b> - Select a model and click Run Viewer</div>'
        )
        
        # Output area with enhanced styling
        self.output = widgets.Output(
            layout={
                'border': '2px solid #4CAF50', 
                'height': '450px', 
                'overflow': 'scroll',
                'padding': '10px',
                'background-color': '#fafafa'
            }
        )
        
        # Model info display
        self.model_info = widgets.HTML(
            value='<div style="padding: 8px; background-color: #f5f5f5; border-radius: 3px;">Select a model to see details</div>'
        )
        
        # Algorithm description display
        self.algo_description = widgets.HTML()
        
        # Bind events
        self.algo_dropdown.observe(self.update_models, names='value')
        self.model_dropdown.observe(self.update_model_info, names='value')
        self.execute_btn.on_click(self.execute_command)
        self.stop_btn.on_click(self.stop_execution)
        self.refresh_btn.on_click(self.refresh_models)
        self.summary_btn.on_click(self.show_models_summary)
        self.visualize_btn.on_click(self.visualize_performance)
        
        # Initial updates
        self.update_model_info({'new': self.model_dropdown.value})
        self.update_algorithm_description('A2C')

# Create the enhanced viewer instance
print("🎛️ Creating Enhanced Mujoco Viewer UI...")
enhanced_viewer = EnhancedMujocoViewer()
print("✅ UI created successfully!")


🎛️ Creating Enhanced Mujoco Viewer UI...
✅ UI created successfully!


In [6]:
# Display the interactive UI dashboard
enhanced_viewer.display() 


VBox(children=(HTML(value='<h3 style="color: #333; margin: 20px 0 10px 0;">📚 Algorithm Information</h3>'), HTM…

## 📚 Parameter Guide

### 🎯 **Episodes** (1-10)
- **Purpose**: Number of complete environment runs
- **Example**: Episodes=3 runs the model 3 times from start to finish
- **Use Case**: Test consistency across multiple runs
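
One way to judge consistency across episodes is to summarize the per-episode total rewards. The sketch below is illustrative only: `summarize_episodes` and the sample rewards are hypothetical, and the "reward > 0" success criterion simply mirrors the success-rate line printed by the viewer's summary.

```python
from statistics import mean, pstdev


def summarize_episodes(episode_rewards):
    """Summarize total rewards collected over several episodes."""
    if not episode_rewards:
        raise ValueError("need at least one episode reward")
    return {
        "episodes": len(episode_rewards),
        "mean_reward": mean(episode_rewards),
        # Population standard deviation; 0.0 when there is a single episode.
        "std_reward": pstdev(episode_rewards),
        # Fraction of episodes whose total reward is positive.
        "success_rate": sum(r > 0 for r in episode_rewards) / len(episode_rewards),
    }


# Hypothetical rewards from a run with Episodes=3.
summary = summarize_episodes([100.0, 200.0, -60.0])
print(summary)
```

A small standard deviation relative to the mean suggests the model behaves similarly from run to run.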

### ⏱️ **Max Steps** (100-5000)
- **Purpose**: Maximum actions per episode before timeout
- **Example**: Max Steps=1000 limits each episode to 1000 steps
- **Use Case**: Prevent infinite runs; humanoid tasks typically need 500-2000 steps
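
The step cap can be pictured as a bounded rollout loop. This is a minimal sketch, not the code in `sb3.py`: `NeverEndingEnv` is a hypothetical stand-in that follows the Gymnasium `reset`/`step` return shapes, and `run_episode` is an assumed helper name.

```python
class NeverEndingEnv:
    """Stand-in environment with Gymnasium-style reset/step that never ends."""

    def reset(self, seed=None):
        # Returns (observation, info).
        return 0.0, {}

    def step(self, action):
        # Returns (observation, reward, terminated, truncated, info).
        return 0.0, 1.0, False, False, {}


def run_episode(env, policy, max_steps):
    """Roll out one episode, stopping at max_steps if the env has not ended."""
    obs, _info = env.reset()
    total_reward, steps = 0.0, 0
    while steps < max_steps:
        obs, reward, terminated, truncated, _info = env.step(policy(obs))
        total_reward += reward
        steps += 1
        if terminated or truncated:
            break
    return total_reward, steps


# Without the cap this environment would run forever.
total, steps = run_episode(NeverEndingEnv(), policy=lambda obs: 0, max_steps=1000)
print(steps, total)
```

With a trained model, `policy` would wrap the model's prediction call instead of returning a constant action.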

### 🎲 **Seed** (0=random)
- **Purpose**: Control randomness for reproducible results
- **Example**: Seed=42 gives identical starting conditions every time
- **Use Case**: Set seed for debugging; use 0 for random variety
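
The "0 = random" convention amounts to mapping 0 to "no seed". The sketch below uses Python's standard `random` module purely for illustration; `resolve_seed` and `sample_start` are hypothetical names, and how `sb3.py` actually consumes the seed is not shown in this notebook. With Gymnasium, the resolved value would typically be passed as `env.reset(seed=...)`.

```python
import random


def resolve_seed(seed_value):
    """Translate the UI convention (0 = random) into an optional seed."""
    return None if seed_value == 0 else seed_value


def sample_start(seed_value, n=3):
    """Draw n pseudo-random 'starting conditions' for illustration."""
    # random.Random(None) seeds itself from system entropy.
    rng = random.Random(resolve_seed(seed_value))
    return [rng.random() for _ in range(n)]


# The same non-zero seed reproduces the same draws.
print(sample_start(42) == sample_start(42))
# Seed 0 resolves to None, so each call is seeded independently.
print(resolve_seed(0))
```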

### 📹 **Record Video**
- **Purpose**: Save episode videos to 'videos' folder
- **Use Case**: Share results or analyze movements frame-by-frame

---

## 🚀 Quick Start
1. Select your **Algorithm** (A2C, SAC, or TD3)
2. Choose a **Model** from the dropdown
3. Adjust parameters as needed
4. Click **🚀 Run Viewer** to start
5. After running, click **📈 Visualize** to see performance graphs!
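
For reference, the selections above end up in a command of the form `python sb3.py <environment> <algorithm> -s models/<model>.zip`, as printed by `execute_sb3_command` earlier in the notebook. The sketch below only assembles that argument list; `build_sb3_command` and the model name `A2C_100000` are hypothetical, and the episode, step, seed, and video options are left out because the way they are forwarded is not shown here.

```python
import sys


def build_sb3_command(model_name, environment="Humanoid-v4", algorithm="A2C"):
    """Assemble the argument list for running sb3.py on a saved model."""
    model_path = f"models/{model_name}.zip"
    # sys.executable keeps the call on the same interpreter as the notebook.
    return [sys.executable, "sb3.py", environment, algorithm, "-s", model_path]


cmd = build_sb3_command("A2C_100000")  # hypothetical model name
print(" ".join(cmd[1:]))
# To launch it from Python: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues if a model name contains spaces.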
