In [26]:
from musical_festival_lineup.mf_lineup import MFLineupSolution
from musical_festival_lineup.mf_lineup_data import MFLineupData
from musical_festival_lineup.utils import visualize_musical_lineup
import numpy as np
import time

In [27]:
%load_ext autoreload
%autoreload 2

### 1. Defining a solution/individual

The objective is to design the optimal festival lineup by scheduling artists across stages and time slots while:
- Maximizing prime slot popularity
- Ensuring genre diversity among stages
- Minimizing fan conflicts at each time slot

The problem involves creating a festival lineup by deciding which artist plays on which stage and at what time. There are 35 artists, 5 stages, and 7 time slots, and each artist must be scheduled exactly once. All stages have the same number of slots, and all performances happen in the same time blocks.
To represent a solution, we used a list of 35 elements, where each element is an artist ID. The search space is the set of all orderings of the 35 artists in which no artist repeats, i.e. all permutations of the list.
We considered two ways of organising this list:
- **Option 1**: Slot-based grouping.
  The first 5 elements represent the artists performing in Time Slot 1, one on each of the 5 stages.
  The next 5 elements represent Time Slot 2, and so on.
  Example: Positions 0-4 → Slot 1 (Stages 1-5), Positions 5-9 → Slot 2 (Stages 1-5), and so on up to Slot 7.
- **Option 2**: Stage-based grouping.
  The first 7 elements represent the artists performing on Stage 1, one in each time slot.
  The next 7 elements represent Stage 2, and so on.


We chose Option 1 (Slot-based) because it works better with our fitness function, which compares artists performing at the same time on different stages. It also makes it easier to preserve the time slot grouping — which is what we want to maintain across generations — when applying crossover operations in the Genetic Algorithm.
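
The slot-based layout can be illustrated with a minimal sketch. The helper names `position` and `slot_artists` are hypothetical and are not part of `MFLineupSolution`; slots and stages are assumed to be 0-indexed.

```python
# Illustrative sketch of the slot-based encoding (Option 1).
# Helper names are hypothetical, not the notebook's API.
NUM_STAGES = 5
NUM_SLOTS = 7


def position(slot: int, stage: int) -> int:
    """Index in the flat list of the artist at (slot, stage), both 0-based."""
    return slot * NUM_STAGES + stage


def slot_artists(repr_list: list[int], slot: int) -> list[int]:
    """Artists performing at the same time in the given 0-based slot."""
    start = slot * NUM_STAGES
    return repr_list[start:start + NUM_STAGES]


# Any permutation of the 35 artist IDs is a valid individual.
example = list(range(NUM_STAGES * NUM_SLOTS))
print(slot_artists(example, 0))   # [0, 1, 2, 3, 4]
print(slot_artists(example, 6))   # [30, 31, 32, 33, 34]
print(position(slot=1, stage=2))  # 7
```

Because a slot is a contiguous slice of the list, all artists that compete for the same audience sit next to each other, which is what the per-slot fitness terms need.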


In [28]:
data=MFLineupData()
lineup=MFLineupSolution(data=data)
",".join([ str(i) for i in lineup.repr])

'12,4,23,34,24,15,20,9,19,33,3,30,27,7,17,28,21,22,16,25,18,5,26,32,14,13,6,1,8,11,29,10,0,31,2'

In [29]:
visualize_musical_lineup(lineup.repr, data.artists_df)

Unnamed: 0,Slot,Stage 1,Stage 2,Stage 3,Stage 4,Stage 5
0,Slot 1,12: Blue Horizon|Pop|51,4: The Silver Owls|Classical|85,23: Electric Serpents|Electronic|99,34: Parallel Dimension|Electronic|58,24: Shadow Cadence|Jazz|66
1,Slot 2,15: Golden Ember|Rock|61,20: The Sonic Drifters|Rock|88,9: Deep Resonance|Jazz|90,19: Astral Tide|Electronic|69,33: Cosmic Frequency|Rock|53
2,Slot 3,3: Neon Reverie|Electronic|100,30: Turbo Vortex|Rock|53,27: Hypnotic Echoes|Rock|77,7: Static Mirage|Rock|94,17: Nightfall Sonata|Classical|84
3,Slot 4,28: The Polyrhythm Syndicate|Jazz|66,21: Celestial Voyage|Electronic|95,22: Quantum Beat|Hip-Hop|96,16: Mystic Rhythms|Classical|78,25: Rhythm Alchemy|Jazz|94
4,Slot 5,18: Velvet Underground|Rock|72,5: Echo Chamber|Electronic|98,26: Cloud Nine Collective|Pop|97,32: The Bassline Architects|Hip-Hop|61,14: Synthwave Saints|Rock|94
5,Slot 6,13: Lunar Spectrum|Rock|99,6: Aurora Skies|Pop|75,1: Solar Flare|Electronic|78,8: Crimson Harmony|Classical|20,11: Phantom Groove|Hip-Hop|47
6,Slot 7,29: Harmonic Dissonance|Classical|96,10: The Wandering Notes|Jazz|84,0: Midnight Echo|Rock|75,31: The Jazz Nomads|Jazz|64,2: Velvet Pulse|Jazz|35


- Get the artists scheduled in a given slot (slots are 0-indexed, so `slot=6` is the last slot, Slot 7):

In [30]:
lineup._get_slot_repr_list(slot=6)

[29, 10, 0, 31, 2]

### 2. Defining fitness

The quality of a festival lineup is determined by balancing **`three equally`** important objectives, each contributing to the overall score. Because these objectives operate on different scales, **they must be normalized to a common range (between 0 and 1) to ensure equal contribution to the final fitness score.**
<br>
The objectives are as follows:

---

**`Prime Slot Popularity`**:  
The most popular artists should be scheduled in the prime slots (**the last time slot on each stage**). This score is calculated by normalizing the total popularity of artists performing in prime slots against the maximum possible total popularity  
(e.g., if only the highest popularity artists — scoring 100 — were scheduled in those slots).

---

**`Genre Diversity`**:  
A diverse range of **genres across stages in each time slot** enhances the festival experience. This score is obtained by normalizing the number of unique genres in each time slot relative to the maximum possible unique genres  
(e.g., if only distinct genres were scheduled in that slot). The average across all time slots is then taken.

---

**`Conflict Penalty`**:  
**Fan conflicts occur when artists with overlapping audiences perform simultaneously on different stages.**  
This score is calculated by normalizing the total conflict value in each time slot against the worst-case conflict scenario  
(e.g., when all artists with the highest conflict values are scheduled together). The average normalized score across time slots is taken. Since conflicts detract from the lineup quality, this score acts as a penalty.

To manage the information needed for these calculations:
- We created a class called **MFLineupData**, which organises the data related to the artists (such as their popularity, genres and conflicts) and provides functions to calculate the necessary metrics.

- For each of these objectives, we defined a function to calculate the value based on a list of five artist IDs (one slot). Since each goal uses different values and scales, we needed to normalise them. To do this, we worked out the maximum possible value for each objective.


| Objective           | Description                                                 | Calculation Method                                            | Normalization                                                                                                  | Weight |
|---------------------|-------------------------------------------------------------|--------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------|--------|
| Prime Slot Popularity | Measures how popular the artists in the final time slot are | Sum of popularity scores for the 5 artists in the final slot | Divided by the sum of the 5 highest popularity scores in the dataset (one per stage) | 1/3 |
| Genre Diversity | Measures the variety of genres in each time slot | Count of unique genres represented in each slot | Divided by 5, the maximum possible number of distinct genres per slot (the minimum of the number of available genres and the number of stages) | 1/3 |
| Conflict Penalty | Penalizes scheduling artists with overlapping fan bases at the same time | Sum of conflict values between all pairs of artists in a slot | Divided by the sum of the 10 highest conflict values in the dataset, since 5 stages give C(5,2) = 10 artist pairs per slot | 1/3 |


__`Prime Slot Popularity`__

1. Maximum possible total popularity. Since there are 5 stages, this is the sum of the 5 highest popularity scores in the dataset (one per stage).

In [31]:
data.max_popularity_in_prime_slot

493

2. Total popularity of a set of artists: <br>
For this, we defined a function that receives a list of artist IDs and returns the sum of their popularity scores. <br>
Let's test it!



In [32]:
# Summing the popularity of the first 14 artists (IDs 0-13)
data.get_sum_popularity(list(range(14)))

1031

In [33]:
#Summing popularity of [1,2,3,4]
data.get_sum_popularity([1,2,3,4])

298

3. Normalizing the popularity against the maximum possible prime-slot popularity

In [34]:
lineup._get_popularity_normalized([1,2,3,4])

0.6044624746450304
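
The three steps above can be reproduced with a minimal standalone sketch. The popularity values are copied from the lineup table shown earlier (artist IDs 0-34); the function names are hypothetical stand-ins and are not the `MFLineupData` implementation.

```python
# Popularity per artist ID (0-34), copied from the lineup table above.
POPULARITY = [
    75, 78, 35, 100, 85, 98, 75, 94, 20, 90, 84, 47, 51, 99, 94, 61, 78, 84,
    72, 69, 88, 95, 96, 99, 66, 94, 97, 77, 66, 96, 53, 64, 61, 53, 58,
]
NUM_STAGES = 5


def sum_popularity(artist_ids):
    """Total popularity of the given artist IDs (hypothetical helper)."""
    return sum(POPULARITY[i] for i in artist_ids)


# Best case for the prime slot: the NUM_STAGES most popular artists.
max_prime_popularity = sum(sorted(POPULARITY, reverse=True)[:NUM_STAGES])


def normalized_popularity(artist_ids):
    """Popularity of a slot relative to the best possible prime slot."""
    return sum_popularity(artist_ids) / max_prime_popularity


print(max_prime_popularity)              # 493
print(sum_popularity(range(14)))         # 1031
print(sum_popularity([1, 2, 3, 4]))      # 298
print(normalized_popularity([1, 2, 3, 4]))
```

The last value is 298 / 493, which matches the normalized score returned by the notebook.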

__`Genre Diversity`__

1. Get the maximum number of distinct genres per slot. It cannot be greater than the number of stages, which is 5.

In [35]:
# Maximum number of distinct genres per slot
data.max_distinct_genre_per_slot

5

2. Number of distinct genres: <br>
For this, we defined a function that receives a list of artist IDs and returns the number of distinct genres among them. <br>
Let's test it!


In [36]:
data.get_count_distinct_genres([0,1,10,2])

3

3. Normalizing the number of distinct genres against the maximum number of distinct genres per slot

In [37]:
lineup._get_genre_diversity_normalized(artists_ids_list=[0,1,10])

0.6
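
A minimal sketch of the genre-diversity steps, using the genres of a few artists as listed in the lineup table above. The function and constant names are hypothetical and do not reflect the actual `MFLineupData` code.

```python
# Genres of a few artists, copied from the lineup table above.
GENRES = {0: "Rock", 1: "Electronic", 2: "Jazz", 10: "Jazz"}

# All genres that appear in the lineup table.
ALL_GENRES = {"Rock", "Electronic", "Jazz", "Classical", "Pop", "Hip-Hop"}
NUM_STAGES = 5

# A slot holds one artist per stage, so it can never show more distinct
# genres than there are stages (or than exist in the dataset).
MAX_DISTINCT = min(len(ALL_GENRES), NUM_STAGES)


def count_distinct_genres(artist_ids):
    """Number of different genres among the given artists."""
    return len({GENRES[i] for i in artist_ids})


def genre_diversity_normalized(artist_ids):
    """Distinct genres in a slot relative to the maximum possible."""
    return count_distinct_genres(artist_ids) / MAX_DISTINCT


print(MAX_DISTINCT)                             # 5
print(count_distinct_genres([0, 1, 10, 2]))     # 3
print(genre_diversity_normalized([0, 1, 10]))   # 0.6
```

Artists 2 and 10 are both Jazz, which is why four artists yield only three distinct genres.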

__`Conflict Penalty`__

1. Get the maximum possible conflict, i.e. the worst-case conflict scenario, by summing the top K conflict values in the dataset. <br>
 What is K?
 It is the number of artist pairs in a slot: the number of ways to choose 2 artists out of 5, which is C(5,2) = 10.
 So we take the 10 highest conflict values.


In [38]:
data.max_worst_conflit_per_slot

np.float64(10.0)
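
The top-K idea can be sketched as follows. The pairwise conflict values here are made up for illustration (6 toy artists, not the festival dataset), and the variable names are hypothetical.

```python
from itertools import combinations
from math import comb

NUM_STAGES = 5

# Number of artist pairs that share a slot: C(5, 2).
k = comb(NUM_STAGES, 2)

# Toy conflict values for 6 artists, one value per unordered pair.
# These numbers are invented for the example.
n_artists = 6
pair_conflicts = {
    (i, j): (j - i) / (n_artists - 1)
    for i, j in combinations(range(n_artists), 2)
}

# Worst case used for normalization: the sum of the k largest pair values.
worst_case = sum(sorted(pair_conflicts.values(), reverse=True)[:k])

print(k)                    # 10
print(len(pair_conflicts))  # 15
print(worst_case)
```

Summing the ten largest values is simple to compute and guarantees that no slot exceeds the bound, although those ten pairs need not all come from a single group of five artists.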

2. The total conflict value in each slot: <br>
For this, we defined a function that receives a list of artist IDs and returns the sum of the conflicts between all pairs in the list. <br>
Let's test it!


In [40]:
data.get_sum_conflicts([3,1,10])

np.float64(1.8)

In [41]:
data.get_sum_conflicts([32, 29, 25, 13, 30])

np.float64(5.75)

In [42]:
data.get_sum_conflicts([2,10])

np.float64(0.9)

3. Normalizing the conflicts against the worst-case conflict scenario

In [43]:
lineup._get_conflicts_normalized(artists_ids_list=[3,1,10])

np.float64(0.18)

In [44]:
lineup._get_conflicts_normalized(artists_ids_list=[32, 29, 25, 13, 30])

np.float64(0.575)
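
A minimal sketch of the pairwise sum and its normalization. The conflict values below are invented for three toy artists; only the worst-case value of 10.0 is taken from the notebook output. Function names are hypothetical.

```python
from itertools import combinations

# Invented conflict values for three toy artists (one value per pair).
CONFLICTS = {
    frozenset({0, 1}): 0.5,
    frozenset({0, 2}): 0.25,
    frozenset({1, 2}): 0.75,
}

# Worst-case conflict per slot, as reported by the notebook.
MAX_WORST_CONFLICT = 10.0


def sum_conflicts(artist_ids):
    """Sum the conflict of every unordered pair of artists in the list."""
    return sum(CONFLICTS[frozenset(pair)] for pair in combinations(artist_ids, 2))


def conflicts_normalized(artist_ids):
    """Total conflict in a slot relative to the worst-case scenario."""
    return sum_conflicts(artist_ids) / MAX_WORST_CONFLICT


print(sum_conflicts([0, 1]))            # 0.5
print(sum_conflicts([0, 1, 2]))         # 1.5
print(conflicts_normalized([0, 1, 2]))  # 0.15
```

The same division explains the notebook values: a slot total of 1.8 becomes 0.18 and a total of 5.75 becomes 0.575.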

##### 3.2 Other important functions in the solution

- Get the list of artist IDs in a slot; the (0-indexed) slot is passed as a parameter

In [45]:
lineup._get_slot_repr_list(1)

[15, 20, 9, 19, 33]

In [46]:
lineup._get_slot_repr_list(0)

[12, 4, 23, 34, 24]

##### 3.3 Fitness

In [47]:
#Random 1
lineup=MFLineupSolution(data=data)
lineup.repr
lineup.fitness(verbose=True)

Slot of artists: [21, 29, 9, 10, 5]
Slot 0: Conflitcs: 0.49000000000000005, genres: 0.6, sum_popularity: 0
Slot 0: List of  Conflitcs: [np.float64(0.49000000000000005)], List of genres: [0.6], Popularity of the prime slot: 0
Slot of artists: [11, 17, 33, 24, 1]
Slot 1: Conflitcs: 0.3, genres: 1.0, sum_popularity: 0
Slot 1: List of  Conflitcs: [np.float64(0.49000000000000005), np.float64(0.3)], List of genres: [0.6, 1.0], Popularity of the prime slot: 0
Slot of artists: [7, 18, 31, 23, 32]
Slot 2: Conflitcs: 0.37, genres: 0.8, sum_popularity: 0
Slot 2: List of  Conflitcs: [np.float64(0.49000000000000005), np.float64(0.3), np.float64(0.37)], List of genres: [0.6, 1.0, 0.8], Popularity of the prime slot: 0
Slot of artists: [3, 28, 6, 8, 13]
Slot 3: Conflitcs: 0.24, genres: 1.0, sum_popularity: 0
Slot 3: List of  Conflitcs: [np.float64(0.49000000000000005), np.float64(0.3), np.float64(0.37), np.float64(0.24)], List of genres: [0.6, 1.0, 0.8, 1.0], Popularity of the prime slot: 0
Slot of ar

np.float64(1.1180208635178208)

In [48]:
#Random 2
lineup=MFLineupSolution(data=data)
lineup.repr
lineup.fitness(verbose=True)

Slot of artists: [32, 7, 11, 9, 22]
Slot 0: Conflitcs: 0.54, genres: 0.6, sum_popularity: 0
Slot 0: List of  Conflitcs: [np.float64(0.54)], List of genres: [0.6], Popularity of the prime slot: 0
Slot of artists: [25, 4, 19, 20, 24]
Slot 1: Conflitcs: 0.55, genres: 0.8, sum_popularity: 0
Slot 1: List of  Conflitcs: [np.float64(0.54), np.float64(0.55)], List of genres: [0.6, 0.8], Popularity of the prime slot: 0
Slot of artists: [33, 21, 10, 26, 3]
Slot 2: Conflitcs: 0.32, genres: 0.8, sum_popularity: 0
Slot 2: List of  Conflitcs: [np.float64(0.54), np.float64(0.55), np.float64(0.32)], List of genres: [0.6, 0.8, 0.8], Popularity of the prime slot: 0
Slot of artists: [2, 17, 12, 31, 14]
Slot 3: Conflitcs: 0.255, genres: 0.8, sum_popularity: 0
Slot 3: List of  Conflitcs: [np.float64(0.54), np.float64(0.55), np.float64(0.32), np.float64(0.255)], List of genres: [0.6, 0.8, 0.8, 0.8], Popularity of the prime slot: 0
Slot of artists: [0, 34, 13, 8, 1]
Slot 4: Conflitcs: 0.3350000000000001, gen

np.float64(1.0198203419298755)