# Atlas Explorer 3.0 - Multicore Performance Analysis

This notebook demonstrates **multicore CPU performance analysis** using the Atlas Explorer 3.0 modular architecture.

## Learning Objectives
- Configure Atlas Explorer for multicore analysis
- Execute parallel workloads across multiple threads
- Analyze thread load balancing and scaling efficiency
- Understand resource contention and cache sharing
- Generate advanced multicore optimization insights

## Prerequisites
- Atlas Explorer credentials configured (`atlasexplorer configure`)
- Package installed with notebook dependencies (`uv pip install -e '.[notebooks]'`)
- Familiarity with single-core analysis (recommended)

In [None]:
# Import Required Libraries for Multicore Analysis
import os
import sys
import json
from pathlib import Path
import pandas as pd
from datetime import datetime

# Atlas Explorer 3.0 modular imports
from atlasexplorer.core.client import AtlasExplorer
from atlasexplorer.core.experiment import Experiment

print("Libraries imported successfully!")
print("Atlas Explorer 3.0 - Ready for multicore analysis")

## Configuration Check

Let's verify that your Atlas Explorer credentials are properly configured for multicore experiments.

In [None]:
# Check Atlas Explorer Configuration for Multicore
print("Checking Atlas Explorer Configuration for Multicore...")
print("=" * 60)

try:
    # Initialize Atlas Explorer client
    ae = AtlasExplorer(channel="development", verbose=True)
    
    print("Configuration Status: READY FOR MULTICORE")
    print(f"Gateway: {ae.config.gateway}")
    print(f"Channel: {ae.config.channel}")
    print(f"Region: {ae.config.region}")
    print(f"API Key: {ae.config.apikey[:8]}...")
    print("Ready for parallel computing analysis!")
    
except Exception as e:
    print("Configuration Error:")
    print(f"Error: {e}")
    print("\nTo fix this, run: atlasexplorer configure")
    raise

## Experiment History

Let's check your previous experiments to see any multicore analysis history.
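
The single-core versus multicore label can also be derived by parsing the thread count out of the core name rather than by substring matching. The following is a minimal sketch, assuming core names embed the count as `(N_thread)` or `(N_threads)`, as in `I8500_(2_threads)`. The helper names `thread_count` and `experiment_type` are hypothetical and not part of Atlas Explorer, and `I8500_(1_thread)` is an assumed name used only for illustration.

```python
import re

# Assumption: the thread count appears as "(N_thread)" or "(N_threads)".
_THREADS_RE = re.compile(r"\((\d+)_threads?\)")


def thread_count(core_name):
    """Return the thread count encoded in a core name, or None if absent."""
    match = _THREADS_RE.search(str(core_name))
    return int(match.group(1)) if match else None


def experiment_type(core_name):
    """Classify a core name as 'Multicore', 'Single-core', or 'Unknown'."""
    count = thread_count(core_name)
    if count is None:
        return "Unknown"
    return "Multicore" if count > 1 else "Single-core"


# Illustrative names only; "I8500_(1_thread)" is an assumed single-thread name.
for name in ["I8500_(2_threads)", "I8500_(1_thread)", "unknown"]:
    print(name, "->", experiment_type(name))
```

Parsing the number avoids misclassifying a name such as `(11_threads)`, which contains the substring `1_thread`.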

In [None]:
# Display Previous Multicore Experiments
print("Previous Atlas Explorer Experiments:")
print("=" * 50)

experiment_dir = Path("myexperiments")
experiments = []

if experiment_dir.exists():
    for exp_path in experiment_dir.iterdir():
        if exp_path.is_dir():
            config_file = exp_path / "config.json"
            if config_file.exists():
                try:
                    with open(config_file) as f:
                        config = json.load(f)
                    
                    mod_time = datetime.fromtimestamp(exp_path.stat().st_mtime)
                    core_config = config.get("core", "unknown")
                    
                    experiments.append({
                        "Experiment": exp_path.name,
                        "Channel": config.get("channel", "unknown"),
                        "Core": core_config,
                        "Type": "Multicore" if "thread" in core_config and "1_thread" not in core_config else "Single-core",
                        "Modified": mod_time.strftime("%Y-%m-%d %H:%M:%S")
                    })
                except (OSError, json.JSONDecodeError, AttributeError, TypeError):
                    # Skip experiments whose config is unreadable or malformed
                    continue

if experiments:
    df = pd.DataFrame(experiments).sort_values("Modified", ascending=False)
    display(df)
    
    multicore_count = sum(1 for exp in experiments if "Multicore" in exp["Type"])
    print(f"\nFound {len(experiments)} total experiments ({multicore_count} multicore)")
else:
    print("No previous experiments found. This will be your first multicore experiment!")
    if not experiment_dir.exists():
        print("Creating 'myexperiments' directory for results...")
    experiment_dir.mkdir(exist_ok=True)

## Multicore Experiment Setup

Now let's set up a multicore experiment using multiple workloads to analyze parallel performance.
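
Relative ELF paths such as `resources/memcpy_rv64.elf` only resolve when the notebook runs from the repository root. As a sketch of one way to make the check independent of the working directory, the hypothetical helper `find_missing` below resolves each path against an explicit root. It is demonstrated on a temporary directory, so the file names are placeholders rather than the real resources.

```python
import tempfile
from pathlib import Path


def find_missing(paths, root="."):
    """Return the entries of `paths` that are not existing files under `root`."""
    base = Path(root)
    return [p for p in paths if not (base / p).is_file()]


# Demonstration against a throwaway directory with one file present
with tempfile.TemporaryDirectory() as tmp:
    resources = Path(tmp) / "resources"
    resources.mkdir()
    (resources / "present.elf").write_bytes(b"")

    demo_paths = ["resources/present.elf", "resources/absent.elf"]
    print(find_missing(demo_paths, root=tmp))  # ['resources/absent.elf']
```

In the notebook, the same call could take `elf_files` and whatever directory holds the `resources` folder.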

In [None]:
# Multicore Experiment Configuration
print("Setting up Multicore Performance Experiment")
print("=" * 55)

# Experiment parameters
elf_files = [
    "resources/mandelbrot_rv64_O0.elf",
    "resources/memcpy_rv64.elf"
]
core_config = "I8500_(2_threads)"  # 2-thread configuration
experiment_name = "multicore_parallel_analysis"
results_dir = "myexperiments"

print("ELF Files:")
for i, elf_file in enumerate(elf_files, 1):
    print(f"   {i}. {elf_file}")
print(f"Core Configuration: {core_config}")
print(f"Experiment Name: {experiment_name}")
print(f"Results Directory: {results_dir}")

# Verify all ELF files exist
missing_files = []
for elf_file in elf_files:
    if not os.path.exists(elf_file):
        missing_files.append(elf_file)
    else:
        print(f"Verified: {elf_file}")

if missing_files:
    print(f"Error: Missing ELF files: {missing_files}")
    print("Make sure you're running from the repository root directory")
    raise FileNotFoundError(f"ELF files not found: {missing_files}")
    
print("\nReady to launch multicore experiment!")
thread_label = core_config.split('_', 1)[1].strip('()').replace('_', ' ')  # "(2_threads)" -> "2 threads"
print(f"This will analyze parallel execution across {thread_label}")

In [None]:
# Launch Multicore Experiment
print("Launching Multicore Performance Analysis...")
print("=" * 50)

try:
    # Create experiment
    experiment = Experiment(experiment_name, ae)
    
    # Configure experiment with multiple workloads
    for elf_file in elf_files:
        experiment.addWorkload(elf_file)
    
    experiment.setCore(core_config)
    experiment.setResultsDir(results_dir)
    
    print("Multicore experiment configured:")
    print(f"   Name: {experiment_name}")
    print(f"   Workloads: {len(elf_files)} parallel tasks")
    for i, elf_file in enumerate(elf_files, 1):
        print(f"     {i}. {os.path.basename(elf_file)}")
    print(f"   Core: {core_config}")
    print(f"   Channel: {ae.config.channel}")
    
    # Run experiment
    print("\nRunning multicore experiment... (this may take 2-5 minutes)")
    print("Analyzing parallel execution, thread load balancing, and resource sharing...")
    experiment.run()
    
    print("\nMulticore experiment completed successfully!")
    print(f"Results saved to: {results_dir}/{experiment_name}/")
    
except Exception as e:
    print(f"Experiment failed: {e}")
    print("Check your network connection and credentials")
    raise

## Multicore Performance Results

Now let's analyze the parallel performance results and thread efficiency.
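
Two quantities underpin this analysis: IPC (instructions divided by cycles) and how evenly the work is spread across threads. The sketch below computes both from per-thread instruction counts. The functions `ipc` and `load_imbalance` are hypothetical helpers, and the numbers are made up for illustration rather than taken from an Atlas Explorer run.

```python
def ipc(instructions, cycles):
    """Instructions per cycle for a run."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return instructions / cycles


def load_imbalance(per_thread_instructions):
    """Ratio of the busiest thread to the mean; 1.0 means perfectly balanced."""
    counts = list(per_thread_instructions)
    if not counts or sum(counts) == 0:
        raise ValueError("need at least one non-zero instruction count")
    mean = sum(counts) / len(counts)
    return max(counts) / mean


# Hypothetical values, not measured results
per_thread = [600_000, 400_000]
cycles = 800_000

print(f"Combined IPC: {ipc(sum(per_thread), cycles):.3f}")
print(f"Load imbalance: {load_imbalance(per_thread):.2f}")
```

A load imbalance well above 1.0 indicates that one thread carries most of the work, which an even split of the combined totals would hide.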

In [None]:
# Extract and Display Multicore Performance Metrics
print("MULTICORE PERFORMANCE ANALYSIS RESULTS")
print("=" * 55)

# Get experiment summary
summary = experiment.getSummary()

# Overall performance metrics
total_cycles = summary.getTotalCycles()
total_instructions = summary.getTotalInstructions()
combined_ipc = summary.getIPC()
l1_icache_hit_rate = summary.getL1InstructionCacheHitRate()
l1_dcache_hit_rate = summary.getL1DataCacheHitRate()

print("OVERALL MULTICORE RESULTS:")
print(f"   Total Cycles: {total_cycles:,}")
print(f"   Instructions Executed: {total_instructions:,}")
print(f"   Combined IPC: {combined_ipc:.3f}")
print("\nSHARED CACHE PERFORMANCE:")
print(f"   L1 Instruction Cache Hit Rate: {l1_icache_hit_rate:.2f}%")
print(f"   L1 Data Cache Hit Rate: {l1_dcache_hit_rate:.2f}%")

# Try to get per-thread metrics if available
print("\nTHREAD-LEVEL ANALYSIS:")
try:
    # This may vary depending on the exact API
    thread_metrics = summary.getThreadMetrics() if hasattr(summary, 'getThreadMetrics') else None
    if thread_metrics:
        for i, thread in enumerate(thread_metrics):
            print(f"   Thread {i}: {thread['instructions']:,} instructions, IPC: {thread['ipc']:.3f}")
    else:
        # Fallback: assume the work splits evenly across workloads. This is a
        # rough estimate, since the ELF files are different programs.
        estimated_instructions_per_thread = total_instructions // len(elf_files)
        estimated_ipc_per_thread = combined_ipc / len(elf_files)

        for i in range(len(elf_files)):
            print(f"   Thread {i} (estimated): ~{estimated_instructions_per_thread:,} instructions, IPC: ~{estimated_ipc_per_thread:.3f}")
        print("   Note: Per-thread metrics estimated from combined results")

except Exception as e:
    print(f"   Warning: Individual thread metrics not available: {e}")
    print("   Combined metrics shown above")

In [None]:
# Parallel Efficiency Analysis
print("PARALLEL EFFICIENCY & SCALING ANALYSIS")
print("=" * 55)

# Calculate parallel efficiency metrics
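num_cores = len(elf_files)  # One workload per hardware thread
# Assumption: each thread retires at most 1 instruction per cycle, so perfect
# scaling gives a combined IPC equal to the thread count.
theoretical_max_ipc = num_cores

# Parallel efficiency: achieved combined IPC as a percentage of that ceiling
parallel_efficiency = (combined_ipc / theoretical_max_ipc) * 100
# Proxy for speedup: equals combined IPC, which assumes a single-thread baseline IPC of 1.0
scaling_factor = combined_ipc

print("PARALLEL COMPUTING METRICS:")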
print(f"   Number of Cores/Threads: {num_cores}")
print(f"   Combined IPC: {combined_ipc:.3f}")
print(f"   Theoretical Maximum IPC: {theoretical_max_ipc:.1f}")
print(f"   Parallel Efficiency: {parallel_efficiency:.1f}%")
print(f"   Scaling Factor: {scaling_factor:.2f}x")

# Performance rating
if parallel_efficiency > 90:
    efficiency_rating = "Excellent"
    efficiency_desc = "Outstanding parallel scaling"
elif parallel_efficiency > 80:
    efficiency_rating = "Very Good"
    efficiency_desc = "Strong parallel performance"
elif parallel_efficiency > 60:
    efficiency_rating = "Good"
    efficiency_desc = "Reasonable parallel scaling with room for improvement"
else:
    efficiency_rating = "Needs Improvement"
    efficiency_desc = "Poor parallel scaling - investigate bottlenecks"

print("\nEFFICIENCY ASSESSMENT:")
print(f"   Rating: {efficiency_rating}")
print(f"   Assessment: {efficiency_desc}")

# Load balancing analysis
print("\nLOAD BALANCING ANALYSIS:")
if num_cores > 1:
    instructions_per_thread = total_instructions // num_cores
    print(f"   Average Instructions per Thread: {instructions_per_thread:,}")

    # Heuristic only: balance is inferred from parallel efficiency, because
    # per-thread instruction counts are not used here
    if parallel_efficiency > 85:
        load_balance_quality = "Perfect"
    elif parallel_efficiency > 70:
        load_balance_quality = "Good"
    else:
        load_balance_quality = "Unbalanced"
    print(f"   Load Balance Quality: {load_balance_quality}")

# Create summary DataFrame
multicore_data = {
    "Metric": [
        "Number of Cores",
        "Total Cycles",
        "Total Instructions",
        "Combined IPC",
        "Parallel Efficiency (%)",
        "Scaling Factor",
        "L1 I-Cache Hit Rate (%)",
        "L1 D-Cache Hit Rate (%)"
    ],
    "Value": [
        f"{num_cores}",
        f"{total_cycles:,}",
        f"{total_instructions:,}",
        f"{combined_ipc:.3f}",
        f"{parallel_efficiency:.1f}",
        f"{scaling_factor:.2f}x",
        f"{l1_icache_hit_rate:.2f}",
        f"{l1_dcache_hit_rate:.2f}"
    ]
}

multicore_df = pd.DataFrame(multicore_data)
display(multicore_df)

## Advanced Multicore Analysis

Let's dive deeper into multicore-specific performance characteristics and optimization opportunities.
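
One way to quantify resource contention is to compare each workload's cycle count when it runs alone with its cycle count when it shares the core with another workload. The sketch below assumes such per-workload cycle counts are available from separate runs. `contention_slowdown` is a hypothetical helper and the figures are invented for illustration.

```python
def contention_slowdown(solo_cycles, corun_cycles):
    """Map each workload name to corun_cycles / solo_cycles.

    A value of 1.0 means the workload was unaffected by its neighbor;
    larger values mean it ran slower when sharing the core.
    """
    report = {}
    for name, solo in solo_cycles.items():
        if solo <= 0:
            raise ValueError(f"solo cycle count for {name!r} must be positive")
        report[name] = corun_cycles[name] / solo
    return report


# Hypothetical cycle counts, not measured results
solo = {"mandelbrot": 1_000_000, "memcpy": 500_000}
corun = {"mandelbrot": 1_100_000, "memcpy": 650_000}

for name, factor in contention_slowdown(solo, corun).items():
    print(f"{name}: {factor:.2f}x slowdown when co-scheduled")
```

A memory-bound workload would typically show the larger factor, which points toward cache or memory bandwidth as the shared bottleneck.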

In [None]:
# Advanced Multicore Performance Insights
print("ADVANCED MULTICORE OPTIMIZATION INSIGHTS")
print("=" * 60)

print("THREAD EFFICIENCY ANALYSIS:")

# Thread efficiency insights
if parallel_efficiency > 90:
    print("   Outstanding thread coordination and minimal overhead")
    print("   Recommendation: Consider scaling to more cores for increased throughput")
elif parallel_efficiency > 80:
    print("   Good parallel scaling with acceptable overhead")
    print("   Recommendation: Minor optimizations could improve efficiency further")
else:
    print("   Significant parallel overhead detected")
    print("   Recommendation: Investigate synchronization, memory contention, or load imbalance")

print("\nCACHE SHARING ANALYSIS:")

# Cache performance under parallel load
if l1_icache_hit_rate > 99 and l1_dcache_hit_rate > 99:
    print("   Excellent cache performance maintained under parallel load")
    print("   Recommendation: Memory access patterns are cache-friendly across threads")
elif l1_icache_hit_rate > 95 and l1_dcache_hit_rate > 95:
    print("   Good cache utilization with minimal thread interference")
    print("   Recommendation: Consider data partitioning optimizations")
else:
    print("   Cache performance degraded under parallel load")
    print("   Recommendation: Optimize data access patterns to reduce cache conflicts")

print("\nSCALING RECOMMENDATIONS:")

if parallel_efficiency > 85:
    print("   Ready for aggressive scaling to 4+ cores")
    print("   Next experiment: Try I8500_(4_threads) configuration")
    print("   Expected outcome: Strong performance gains with more cores")
elif parallel_efficiency > 70:
    print("   Moderate scaling potential to 4 cores")
    print("   Recommendation: Optimize current 2-core performance before scaling up")
    print("   Focus areas: Load balancing and cache optimization")
else:
    print("   Address parallel bottlenecks before scaling up")
    print("   Investigation needed: Thread synchronization and memory access patterns")
    print("   Consider: Workload partitioning strategies")

print("\nMULTICORE OPTIMIZATION STRATEGIES:")
print("   • Thread Affinity: Pin threads to specific cores")
print("   • Data Locality: Minimize cross-thread memory sharing")
print("   • Load Balancing: Ensure equal work distribution")
print("   • Cache Optimization: Reduce false sharing between threads")
print("   • NUMA Awareness: Consider memory topology for larger systems")

print(f"\nComplete multicore analysis saved to: {results_dir}/{experiment_name}/")

## Multicore Analysis Summary

**Excellent work!** You've successfully completed a comprehensive multicore performance analysis using Atlas Explorer 3.0.

### What You Accomplished:
- Configured Atlas Explorer 3.0 for multicore analysis
- Ran parallel workloads across multiple threads
- Analyzed parallel efficiency and scaling characteristics
- Evaluated thread load balancing and cache sharing
- Generated advanced multicore optimization insights

### Advanced Experiments to Try:
1. **Scale Up**: Try `I8500_(4_threads)` for four-thread analysis
2. **Compare Configurations**: Run the same workloads on different core configurations
3. **Optimize Workloads**: Test with `-O3` optimized ELF files
4. **Custom Workloads**: Analyze your own parallel applications
5. **Scaling Studies**: Create performance scaling curves
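
For the scaling study, a scaling curve can be tabulated from cycle counts collected at several thread counts. The following is a minimal sketch, assuming each run completes the same total amount of work and that a single-thread baseline exists. `scaling_table` is a hypothetical helper and the measurements are placeholders, not results.

```python
def scaling_table(cycles_by_threads):
    """Return (threads, speedup, efficiency) rows relative to the 1-thread run.

    cycles_by_threads maps a thread count to the cycles needed to finish the
    same total amount of work at that thread count.
    """
    if 1 not in cycles_by_threads:
        raise ValueError("a single-thread baseline (key 1) is required")
    baseline = cycles_by_threads[1]
    rows = []
    for threads in sorted(cycles_by_threads):
        speedup = baseline / cycles_by_threads[threads]
        rows.append((threads, speedup, speedup / threads))
    return rows


# Placeholder cycle counts, not measured results
measurements = {1: 1_000_000, 2: 550_000, 4: 320_000}

for threads, speedup, efficiency in scaling_table(measurements):
    print(f"{threads} thread(s): speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

Efficiency that falls as the thread count grows is the usual sign of contention or serial sections limiting the benefit of extra threads.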

### Research Opportunities:
- **Parallel Overhead Analysis**: Quantify coordination costs
- **Cache Coherency Studies**: Analyze inter-thread cache behavior
- **NUMA Performance**: Study memory topology effects
- **Workload Characterization**: Profile different parallel patterns

### Continue Learning:
- [Command-line automation](../examples/)
- [Single-core analysis](ae_singlecore_notebook.ipynb)
- [Full documentation](../README.md)
- [Advanced configuration options](../README.md#configuration-guide)

**Congratulations on mastering multicore performance analysis with Atlas Explorer 3.0!**
