# Artificial Bee Colony Optimization for Santa's Workshop Scheduling

## 1. Algorithm Overview
- **Objective**: Assign 5,000 families to 100 workshop days
  - Minimize: `Preference Costs` + `Accounting Penalties` 
  - Constraints: 125 ≤ Daily Occupancy ≤ 300

- **Swarm Intelligence Approach**:
  - Colony of artificial bees (solutions)
  - Three bee types with specialized roles:
    - Employed bees: Local solution refinement
    - Onlooker bees: Selective intensification  
    - Scout bees: Global exploration

## 2. Key Components

### Solution Representation 
- **Encoding**: List of 5,000 integers (days 1-100)
  - Index = Family ID
  - Value = Assigned day

### Fitness Evaluation
- Reuses cost functions from Artificial Immune system solution:
  ```python
  def calculate_affinity(solution):
      return (calculate_preference_cost(solution) + 
              calculate_accounting_penalty(solution))
- These functions precisely encode Santa's workshop's official requirements
- Using identical functions enables fair comparison between GA and AIS


In [9]:
import csv

with open('family_data.csv', mode='r', newline='', encoding='utf-8') as file:
    reader = csv.reader(file)
    headers = next(reader)
    family_data = [list(map(int, line)) for line in reader]

In [10]:
import random
import numpy as np
from typing import List, Tuple
from copy import deepcopy

num_families = 5000
num_days = 100
min_occupancy = 125
max_occupancy = 300

In [11]:
def calculate_preference_cost(solution):
    total_cost = 0
    for family_id, assigned_day in enumerate(solution):
        family = family_data[family_id]
        choices = family[1:-1]
        n_people = family[-1]
        
        # Determining which choice corresponds to the assigned day
        choice_index = choices.index(assigned_day) if assigned_day in choices else len(choices)
        
        # Calculating the cost of gifts
        if choice_index == 0:
            cost = 0
        elif choice_index == 1:
            cost = 50
        elif choice_index == 2:
            cost = 50 + 9 * n_people
        elif choice_index == 3:
            cost = 100 + 9 * n_people
        elif choice_index == 4:
            cost = 200 + 9 * n_people
        elif choice_index == 5:
            cost = 200 + 18 * n_people
        elif choice_index == 6:
            cost = 300 + 18 * n_people
        elif choice_index == 7:
            cost = 300 + 36 * n_people
        elif choice_index == 8:
            cost = 400 + 36 * n_people
        elif choice_index == 9:
            cost = 500 + 235 * n_people 
        else:
            cost = 500 + 434 * n_people
            
        total_cost += cost
        
    return total_cost

def calculate_accounting_penalty(solution):
    daily_occupancy = [0] * num_days
    for family_id, assigned_day in enumerate(solution):
        n_people = family_data[family_id][-1]
        daily_occupancy[assigned_day - 1] += n_people
    
    penalty = 0
    for d in range(num_days):
        Nd = daily_occupancy[d]
        Nd_next = daily_occupancy[d + 1] if d < num_days - 1 else Nd
        penalty += ((Nd - 125) / 400) * Nd**(0.5 + abs(Nd - Nd_next) / 50)
    return penalty

def calculate_affinity(solution):
    return calculate_preference_cost(solution) + calculate_accounting_penalty(solution)

# Solution Validation Function

## `is_valid(solution)`
**Purpose**: Verifies if a solution meets all problem constraints.

### Key Features:
- **Constraint Checking**:
  - Calculates daily occupancy counts
  - Validates all days have 125-300 visitors
- **Return Value**:
  - `True`: Solution meets all occupancy constraints
  - `False`: At least one day violates constraints

In [12]:
def is_valid(solution):
    daily_occupancy = [0] * num_days
    for family_id, day in enumerate(solution):
        daily_occupancy[day - 1] += family_data[family_id][-1]

    return all(min_occupancy <= occupancy <= max_occupancy for occupancy in daily_occupancy)

# Population Initialization Functions

## `initialize_individual() -> List[int]`
Creates a single valid solution using a constraint-aware greedy approach.

### Algorithm Steps:
1. **Sort families** by size (descending) to handle largest groups first
2. **Initialize daily occupancy tracker** (`[0] * 100`)
3. For each family:
   - Extract their preferred days (`choices = family[1:-1]`)
   - Get family size (`n_people = family[-1]`)
   - Find valid preferred days (those that won't exceed `max_occupancy`)
   - Assign to:
     - Random valid preferred day (if available)
     - Random valid fallback day (if no preferred days work)
     - Completely random day (last resort)
   - Update occupancy counts

### Key Features:
- **Constraint Satisfaction**: Guarantees daily occupancy ≤ 300
- **Preference Awareness**: Prioritizes families' preferred days when possible
- **Deterministic Start**: Sorting ensures reproducible initialization

In [13]:
def initialize_individual() -> List[int]:
    """Create one valid individual using greedy approach"""
    solution = []
    daily_occupancy = [0] * num_days
    
    # Sort families by size (largest first)
    sorted_families = sorted(family_data, key=lambda x: x[-1], reverse=True)
    
    for family in sorted_families:
        choices = family[1:-1]
        n_people = family[-1]
        
        # Try preferred days first
        valid_choices = [day for day in choices 
                        if daily_occupancy[day - 1] + n_people <= max_occupancy]
        
        if valid_choices:
            assigned_day = random.choice(valid_choices)
        else:
            # Fallback to any valid day
            valid_days = [day for day in range(1, num_days + 1) 
                         if daily_occupancy[day - 1] + n_people <= max_occupancy]
            assigned_day = random.choice(valid_days) if valid_days else random.randint(1, num_days)
        
        solution.append(assigned_day)
        daily_occupancy[assigned_day - 1] += n_people
    
    return solution

# Local Search Function

## `local_search(solution)`
**Purpose**: Repairs invalid solutions by adjusting family assignments to meet occupancy constraints.

### Key Operations:
1. **Occupancy Calculation**:
   - Tracks daily visitor counts
   - Identifies under/over-occupied days

2. **Constraint Repair**:
   - **For under-occupied days (<125)**:
     - Finds movable families from over-occupied days
     - Transfers them while maintaining other constraints
   - **For over-occupied days (>300)**:
     - Randomly reassigns families to valid days
     - Ensures new assignments don't create other violations

### Behavior Flow:
1. Calculate current daily occupancies
2. For each day:
   - While constraints violated:
     - Find and execute valid family moves
     - Break if no improvements possible
3. Return repaired solution

In [14]:
def local_search(solution):
    daily_occupancy = [0] * num_days
    for family_id, day in enumerate(solution):
        daily_occupancy[day - 1] += family_data[family_id][-1]
    
    for day in range(100):
        while daily_occupancy[day] < min_occupancy or daily_occupancy[day] > max_occupancy:
            if daily_occupancy[day] < min_occupancy:
                # Finding families that can be moved on this day
                for family_id, assigned_day in enumerate(solution):
                    family = family_data[family_id]
                    n_people = family[-1]
                    
                    # We check whether it is possible to move the family on this day
                    if (daily_occupancy[day] + n_people <= max_occupancy and
                        daily_occupancy[assigned_day - 1] - n_people >= min_occupancy):
                        # Moving the family
                        solution[family_id] = day + 1
                        daily_occupancy[assigned_day - 1] -= n_people
                        daily_occupancy[day] += n_people
                        break
                else:
                    break
            
            elif daily_occupancy[day] > max_occupancy:
                for family_id, assigned_day in enumerate(solution):
                    if assigned_day == day + 1:
                        family = family_data[family_id]
                        n_people = family[-1]
                        
                        valid_days = [d for d in range(1, num_days + 1) if daily_occupancy[d - 1] + n_people <= max_occupancy]
                        if valid_days:
                            new_day = random.choice(valid_days)
                            solution[family_id] = new_day
                            daily_occupancy[day] -= n_people
                            daily_occupancy[new_day - 1] += n_people
                            break
                else:
                    break
    
    return solution

# `generate_neighbor(solution, num_changes=5)`

## Purpose
Generates a modified ("neighbor") solution by perturbing a given solution while maintaining feasibility.

## Parameters
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `solution` | `List[int]` | - | Current solution (family-day assignments) |
| `num_changes` | `int` | `5` | Number of family assignments to modify |

## Algorithm Steps

- Creates independent copy to avoid modifying original
- Performs num_changes random family reassignments
- Guarantees returned solution meets all constraints
- Uses existing repair function

## Key Features
1. Controlled Perturbation: Modifies 5±2 families typically (balance between exploration and preservation)

2. Preference Awareness: Prioritizes families' preferred days when possible

3. Validity Guarantee: Always returns feasible solutions

## Usage
Primarily used by:

- Employed bees for local search

- Onlooker bees for solution refinement

In [15]:
def generate_neighbor(solution, num_changes=5):
    """Create neighbor solution by modifying several family assignments"""
    neighbor = solution.copy()
    
    # Try to change several random families
    for _ in range(num_changes):
        family_id = random.randint(0, num_families - 1)
        family = family_data[family_id]
        current_day = neighbor[family_id]
        
        # Prefer moving to other preferred days
        valid_days = [d for d in family[1:-1] if d != current_day]
        if not valid_days:
            valid_days = [d for d in range(1, num_days + 1) if d != current_day]
        
        if valid_days:
            neighbor[family_id] = random.choice(valid_days)
    
    # Ensure solution remains valid
    if not is_valid(neighbor):
        neighbor = local_search(neighbor)
    return neighbor

# Artificial Bee Colony Optimization

## Algorithm Parameters

| Parameter | Description | Recommended Value |
|-----------|-------------|-------------------|
| `colony_size` | Number of candidate solutions in the population | 50-100 |
| `max_iterations` | Maximum number of optimization cycles | 100-200 |
| `abandonment_limit` | Number of failed improvements before solution is abandoned | 10-20 |

## Optimization Phases

### 1. Initialization Phase
- Creates initial population of feasible solutions  
- Calculates initial fitness values (nectar amounts)  
- Initializes trial counters to track solution improvements

### 2. Employed Bees Phase
- Performs local search around each solution  
- Generates and evaluates neighboring solutions  
- Updates solutions when improvements are found  
- Resets trial counters for improved solutions

### 3. Onlooker Bees Phase
- Selects solutions probabilistically based on fitness  
- Focuses search effort on the most promising solutions  
- Uses same neighborhood search as employed bees  
- Maintains balance between exploration and exploitation

### 4. Scout Bees Phase
- Identifies stagnant solutions (reached abandonment limit)  
- Replaces them with new random solutions  
- Maintains population diversity  
- Prevents premature convergence to local optima

## Key Features:

- Automatic balance between exploration (scouts) and exploitation (employed/onlookers)

- Adaptive search intensity based on solution quality

- Guaranteed constraint satisfaction through repair mechanisms



In [17]:
def abc_optimization(colony_size=50, max_iterations=100, abandonment_limit=20):
    """Artificial Bee Colony implementation"""
    # Initialize population
    population = [initialize_individual() for _ in range(colony_size)]
    fitness = [1/(1 + calculate_affinity(sol)) for sol in population]  # Nectar amount
    trials = [0] * colony_size  # Track unsuccessful improvements
    
    best_solution = min(population, key=lambda x: calculate_affinity(x))
    best_cost = calculate_affinity(best_solution)
    cost_history = [best_cost]
    
    for iteration in range(max_iterations):
        # Employed Bees Phase (local improvement)
        for i in range(colony_size):
            neighbor = generate_neighbor(population[i])
            neighbor_cost = calculate_affinity(neighbor)
            
            if neighbor_cost < calculate_affinity(population[i]):
                population[i] = neighbor
                fitness[i] = 1/(1 + neighbor_cost)
                trials[i] = 0
            else:
                trials[i] += 1
        
        # Onlooker Bees Phase (probabilistic selection)
        probs = np.array(fitness) / sum(fitness)
        for _ in range(colony_size):
            idx = np.random.choice(range(colony_size), p=probs)
            neighbor = generate_neighbor(population[idx])
            neighbor_cost = calculate_affinity(neighbor)
            
            if neighbor_cost < calculate_affinity(population[idx]):
                population[idx] = neighbor
                fitness[idx] = 1/(1 + neighbor_cost)
                trials[idx] = 0
            else:
                trials[idx] += 1
        
        # Scout Bees Phase (random exploration)
        for i in range(colony_size):
            if trials[i] >= abandonment_limit:
                population[i] = initialize_individual()
                fitness[i] = 1/(1 + calculate_affinity(population[i]))
                trials[i] = 0
        
        # Update best solution
        current_best = min(population, key=lambda x: calculate_affinity(x))
        current_cost = calculate_affinity(current_best)
        if current_cost < best_cost:
            best_solution = deepcopy(current_best)
            best_cost = current_cost
        cost_history.append(best_cost)
        
        print(f"Iteration {iteration}: Best Cost = {best_cost:.2f}")
    
    return best_solution, cost_history

# Run the ABC optimization
best_solution_abc, abc_history = abc_optimization(
    colony_size=50,
    max_iterations=150,
    abandonment_limit=15
)

print(f"ABC Best Solution Cost: {calculate_affinity(best_solution_abc)}")
print(f"Valid Solution: {is_valid(best_solution_abc)}")

Iteration 0: Best Cost = 1426098677.75
Iteration 1: Best Cost = 1195674340.82
Iteration 2: Best Cost = 1041403444.40
Iteration 3: Best Cost = 703945324.69
Iteration 4: Best Cost = 582307489.61
Iteration 5: Best Cost = 414172245.22
Iteration 6: Best Cost = 355177099.38
Iteration 7: Best Cost = 354961607.19
Iteration 8: Best Cost = 161470283.49
Iteration 9: Best Cost = 130339259.40
Iteration 10: Best Cost = 100071981.22
Iteration 11: Best Cost = 96147112.25
Iteration 12: Best Cost = 63178947.16
Iteration 13: Best Cost = 43325988.13
Iteration 14: Best Cost = 27300776.35
Iteration 15: Best Cost = 24426743.11
Iteration 16: Best Cost = 22984027.84
Iteration 17: Best Cost = 18249003.96
Iteration 18: Best Cost = 14983133.29
Iteration 19: Best Cost = 13533037.79
Iteration 20: Best Cost = 11260893.42
Iteration 21: Best Cost = 10878963.78
Iteration 22: Best Cost = 10253511.12
Iteration 23: Best Cost = 10135209.46
Iteration 24: Best Cost = 9967099.64
Iteration 25: Best Cost = 9799903.14
Iteration 