# Binary Black Hole Evolution: Stable Mass Transfer vs Common Envelope
**Group 3 authors:** 
- Bonasera Elias Maria 
- Casellato Alberto (2139206)
- Garbin Nicola (2156363)
- Tamanna Tasneem 


<!---
## 1. Introduction
The goal of this project is to study the formation of binary black holes through:  
- **Stable Mass Transfer (SMT)**
- **Common Envelope (CE)**

We want to analyze how different initial conditions influence the final properties of binary black holes.
-->

## 1. Goal of the Project

The goal of the project is to understand the differences between binary black hole (BBH) mergers whose progenitor stars evolved via common envelope (CE) and those whose progenitors evolved via stable mass transfer (SMT). We analyze a set of simulated binary black holes, focusing on how the different evolutionary paths affect the final properties of the binary black holes.

The idea is to split the analysis into two parts:

- **Classify BBH mergers based on their evolutionary path.**  
  The dataset contains information about whether a BBH system evolved via common envelope (CE) or stable mass transfer (SMT). This is recorded in **Column 21** of the dataset:  
  - `True`: The system underwent a common envelope phase.  
  - `False`: The system evolved via stable mass transfer.  
  We will analyze the key features that distinguish these two types of BBH systems.

- **Analyze the impact of different physical parameters on the formation of BBHs.**  
  The dataset includes simulations with different values of the **alpha parameter** (α = 0.5, 1, 3, 5), which determines the efficiency of the common envelope process. We will investigate how α affects the properties of BBH mergers.

To perform this analysis, we will use visualization techniques and machine learning methods (e.g., Random Forest) to determine the most significant features influencing BBH evolution.


## 2. Theoretical Background

A **binary star system** is composed of two stars that orbit around their common center of mass. The more massive star is called the **primary star** (or *donor star*), and the other one is the **secondary star** (or *accretor*). A star can lose mass in different ways:

- **Stellar winds** due to the natural evolution.
- **Roche Lobe overflow**: when a star increases in size due to natural evolution, exceeding a certain boundary. This boundary is defined by two teardrop-shaped equipotential surfaces that connect the two stars, within which the stars are confined. Each teardrop is called the **Roche Lobe** of a star, and the point where they touch is the **Lagrangian point L1**. When stellar matter reaches L1, it is no longer gravitationally bound to the donor star and is transferred to the accretor, initiating a process of mass transfer known as **Roche Lobe overflow**.

From the literature, we know that $R_L \propto a \, f(q)$, where:
- $R_L$ is the Roche Lobe radius,
- $a$ is the semi-major axis of the system,
- $q$ is the mass ratio of the two stars.
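
The form of $f(q)$ is not specified above. One widely used choice is the fitting formula of Eggleton (1983); the sketch below assumes that form purely as an illustration. The function name `roche_lobe_radius` is hypothetical, and $q$ is taken here as the mass of the star whose lobe is computed divided by the mass of its companion.

```python
import math

def roche_lobe_radius(a, q):
    """Approximate Roche lobe radius (same units as a), Eggleton (1983) fit.

    a: semi-major axis of the binary.
    q: mass of the star considered divided by the mass of its companion.
    """
    q13 = q ** (1.0 / 3.0)
    q23 = q13 ** 2
    return a * 0.49 * q23 / (0.6 * q23 + math.log(1.0 + q13))

# Equal-mass binary: each lobe is a bit less than 40% of the separation
print(round(roche_lobe_radius(1.0, 1.0), 3))
```

With this fit the Roche lobe grows with $q$ at fixed $a$, so the more massive star has the larger lobe.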

<img src="images/roche_potential_1.png" width="400">

There are two types of responses a star can have to mass loss, depending on its internal structure:

- **Stable Mass Transfer (SMT)**: The star responds to mass loss by adjusting its size, leading to a gradual transfer of mass to the secondary star. This results in minimal mass loss from the system.
- **Common Envelope (CE)**: The primary star expands uncontrollably, leading to an unstable phase where the outer layers engulf both stars, forming a common envelope.

### 2.1 Common Envelope
The common envelope that surrounds the two stars extends beyond the system's orbit. Due to angular momentum conservation, it rotates slower than the two stars inside. This difference in velocity creates friction forces, which extract energy from the system, causing the two stars to spiral inward.

There are two possible outcomes:

- The system transfers enough energy to the common envelope to eject it outward, leaving behind a shrunken orbit and significant mass loss.
- The system fails to transfer enough energy, leading to the merger of both stars inside the envelope, forming a single object surrounded by the ejected material.

The **$\alpha$ parameter** is a free parameter that quantifies the efficiency of energy transfer between the system and the common envelope: it sets the fraction of the orbital energy released during the inspiral that is used to eject the envelope. The larger $\alpha$, the more efficiently the envelope is ejected, so the orbit shrinks less and the system is more likely to survive as a binary.
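
The role of $\alpha$ can be illustrated with the standard energy balance, in which the envelope binding energy $G M_d M_{env} / (\lambda R_d)$ equals $\alpha$ times the change in orbital energy. This is a minimal sketch under that assumption: the structure parameter `lam` ($\lambda$), the function name `final_separation`, and all numerical values are made up for illustration and are not taken from the simulations.

```python
def final_separation(a_i, m_donor, m_core, m_companion, r_donor, alpha, lam=0.5):
    """Post-CE separation from the alpha-lambda energy balance.

    Any consistent set of units works because G cancels out.
    """
    m_env = m_donor - m_core
    bind = m_donor * m_env / (lam * r_donor)            # envelope binding energy / G
    orb_initial = m_donor * m_companion / (2.0 * a_i)   # |initial orbital energy| / G
    orb_final = bind / alpha + orb_initial              # |final orbital energy| / G
    return m_core * m_companion / (2.0 * orb_final)

# Illustrative (made-up) system evaluated for the four alpha values of the dataset
separations = []
for alpha in (0.5, 1, 3, 5):
    a_f = final_separation(a_i=1000.0, m_donor=40.0, m_core=15.0,
                           m_companion=30.0, r_donor=500.0, alpha=alpha)
    separations.append(a_f)
    print(alpha, round(a_f, 1))
```

In this toy example the final separation grows monotonically with $\alpha$, which is the sense in which a larger $\alpha$ means less orbital shrinkage.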

<!--A higher \(\alpha_{\text{CE}}\) generally leads to more efficient envelope ejection, less orbital shrinkage, and a higher likelihood of the binary system surviving the CE phase. However, the exact value of \(\alpha_{\text{CE}}\) is highly uncertain and can vary depending on the specific system and the physics of the CE process.-->

<img src="images/common_envelope_evolution.png" width="400">


<!--
To add to the conclusions:
- CE is a more dissipative process: it produces lower-mass black holes, while SMT produces more massive BHs
-->

In [1]:
from IPython.display import Image
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import matplotlib as mpl
from matplotlib.ticker import MultipleLocator
import numpy as np
import sys
sys.path.append("src")

from data_preprocessing import load_data

### data_preprocessing.py
The Python file `data_preprocessing.py` contains the function **load_data(folder)**, which organizes the data files from a given folder into a nested dictionary.

- It scans each subdirectory of the folder. Each one represents a different value of the alpha parameter.
- It reads the data files contained in each subdirectory, each corresponding to a different metallicity value.
- Each file is loaded into a pandas dataframe, skipping the first two rows, which contain metadata.
- The function returns a nested dictionary whose values are pandas dataframes:
   - First-level keys represent alpha values.
   - Second-level keys represent metallicity values.
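
As a rough illustration of the resulting layout, the sketch below mirrors the key extraction used by `load_data` on a file name and builds a stand-in for the nested dictionary. The helper name `metallicity_key` is hypothetical, and placeholder strings take the place of the dataframes.

```python
def metallicity_key(filename):
    """Extract the metallicity string from a name like 'MTCE_BBHs_0.002.txt'."""
    return filename.split("_")[2].split(".txt")[0]

# Stand-in for the structure returned by load_data (strings replace the dataframes)
example = {
    "A1": {
        metallicity_key("MTCE_BBHs_0.02.txt"): "<DataFrame>",
        metallicity_key("MTCE_BBHs_0.0002.txt"): "<DataFrame>",
    },
}

# Keys are strings, so they are converted to float when a numerical order is needed
ordered_keys = sorted(example["A1"], key=float)
print(ordered_keys)
```

A single table is then reached with two lookups, for example `example["A1"]["0.02"]`.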


In [2]:
import pandas as pd
import os

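def load_data(folder):
    """Load every data file under `folder` into a nested dict: alpha -> metallicity -> DataFrame."""
    dict_dataset = {}
    for filename1 in os.listdir(folder):
        folder_path = os.path.join(folder, filename1)
        # skip stray files: only subdirectories correspond to alpha values
        if not os.path.isdir(folder_path):
            continue
        list_dataset = {}

        for filename2 in os.listdir(folder_path):
            file_path = os.path.join(folder_path, filename2)
            # the first two rows hold metadata; the third row is the column header
            df = pd.read_csv(file_path, delimiter=" ", skiprows=2)
            # "MTCE_BBHs_<Z>.txt" -> "<Z>"
            list_dataset[filename2.split("_")[2].split(".txt")[0]] = df

        dict_dataset[filename1] = list_dataset

    return dict_dataset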

In [3]:
# plot settings
major_ticks_length = 10
minor_ticks_length = major_ticks_length / 2
label_size = 20
mpl.rcParams['axes.linewidth'] = 1.1

## 3. Dataset

The data is provided in the file `stable_MT_vs_CE.tgz`, which contains four folders: A05, A1, A3 and A5. Each folder corresponds to a different value of the **alpha parameter** (α = 0.5, 1, 3, 5), which regulates the efficiency of the common envelope process.  

Inside each folder, there are **12 data files** named `MTCE_BBHs_*.txt`, where `*` is a number representing the stellar metallicity (from **0.0002 to 0.02**) of the simulation.

Each file follows this structure:

- **Row 0:** Header for Row 1.  
- **Row 1:** Two columns:  
  - **Column 0:** Total simulated stellar mass (M☉).  
  - **Column 1:** Number of simulated binary black hole mergers.
- **Row 2:** Header for the data columns.
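
The two metadata values in Row 1 are skipped when the tables are loaded, but they can be read separately. The following is a minimal sketch, assuming the values are whitespace-separated; the helper name `read_metadata` and the contents of the demonstration file (header names and numbers) are made up.

```python
import os
import tempfile

def read_metadata(path):
    """Return (total simulated mass in Msun, number of BBH mergers) from rows 0-1."""
    with open(path) as handle:
        handle.readline()                    # row 0: header of the metadata row
        values = handle.readline().split()   # row 1: the two metadata values
    return float(values[0]), int(float(values[1]))

# Demonstration on a small made-up file with the three-row preamble described above
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "MTCE_BBHs_0.002.txt")
    with open(path, "w") as handle:
        handle.write("total_mass n_mergers\n1.5e7 1234\ncol.0:ID col.1:m1ZAMS/Msun\n")
    total_mass, n_mergers = read_metadata(path)

print(total_mass, n_mergers)
```

Dividing the number of mergers by the total simulated mass would give a merger efficiency per unit stellar mass, which makes runs with different metallicities comparable.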


  
### Selected Features for Analysis:
- **Basic Properties:**
  - `Column 0`: Identifier of the binary system.
  - `Column 1`: Initial mass (ZAMS) of the primary star (M☉).
  - `Column 2`: Initial mass (ZAMS) of the secondary star (M☉).
  - `Column 3`: Mass of the black hole formed from the primary star (M☉).
  - `Column 4`: Mass of the black hole formed from the secondary star (M☉).
  - `Column 5`: Mass of the merger remnant (M☉).

- **Evolutionary and Orbital Properties:**
  - `Column 6`: Delay time (time from the formation of the binary system to BBH merger, in Myr).
  - `Column 7`: Semi-major axis of the binary at the formation of the second black hole (R☉).
  - `Column 8`: Orbital eccentricity. 

- **Supernova:**
  - `Column 9`: Magnitude of the supernova kick (km/s) for the primary black hole.
  - `Column 10`: Magnitude of the supernova kick (km/s) for the secondary black hole.
  - `Column 11`: Cosine of the tilt angle (before and after supernova) for the primary black hole.
  - `Column 12`: Cosine of the tilt angle (before and after supernova) for the secondary black hole.

- **Center-of-Mass Velocities:**
  - `Columns 13-18`: x, y, z components of the center-of-mass velocity after the supernova explosion of the primary and secondary components.

- **Supernova Timings:**
  - `Column 19`: Time at which the primary component undergoes a supernova.
  - `Column 20`: Time at which the secondary component undergoes a supernova.

- **Key Classification Label:**
  - `Column 21`: **Binary evolution path**  
    - `True`: System that underwent a common envelope phase.  
    - `False`: System that evolved via stable mass transfer.
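
As an illustration of how the label separates the two classes, the sketch below splits a small table on `col.21:CE`. The rows are made up; only the column names follow the ones used in the analysis code.

```python
import pandas as pd

# Made-up rows; only the column names are taken from the dataset
df = pd.DataFrame({
    "col.3:m1rem/Msun": [30.1, 8.4, 12.7, 25.0],
    "col.21:CE": [False, True, True, False],
})

# astype(bool) handles both True/False and 1/0 encodings of the label
is_ce = df["col.21:CE"].astype(bool)
df_ce, df_smt = df[is_ce], df[~is_ce]

ce_fraction = len(df_ce) / len(df)
print(len(df_smt), len(df_ce), ce_fraction)
```

The same mask can be reused to compare any other column between the two classes.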

In [4]:
data = load_data("data")

In [5]:
def plot_histogram(axes, data_1, data_2, metallicities, bins=100, xlim=None, xlabel="", title=""):
    """
    Plots histograms for two sets of data with the same bins and xlim.
    """ 
    cmap = cm.plasma
    colors = [cmap(i / len(data_1)) for i in range(len(data_1))]
    if xlim:
        # `bins` bins require `bins + 1` edges
        bins = np.linspace(xlim[0], xlim[1], bins + 1)
    
    # plots
    for i in range(len(data_1)):
        for k,data in enumerate([data_1, data_2]):
            axes[k].hist(data[i], bins=bins, histtype="step", label=f"Z={metallicities[i]}", 
                         edgecolor=colors[i], linewidth=2)
    
    # axes settings
    # axes[0].legend(ncol=2, loc="upper right", fontsize="small")
    axes[0].tick_params(axis='both', which='both', direction="in", top=True, right=True, labelbottom=False)
    axes[1].tick_params(axis='both', which='both', direction="in", top=True, right=True, labelbottom=True)
    
    # axes[0].set_ylabel("$N$ Systems\n (Stable Mass Transfer)", fontsize=label_size)
    # axes[1].set_ylabel("$N$ Systems\n (Common Envelope)", fontsize=label_size)
    axes[1].set_xlabel(xlabel, fontsize=label_size)

    for ax in axes:
        if xlim:
            ax.set_xlim(xlim)
        ax.set_yscale("log")
        ax.tick_params(which="major", length=major_ticks_length, labelsize=label_size)
        ax.tick_params(which="minor", length=minor_ticks_length)

    # fig.suptitle(title, fontsize=label_size+2)
    plt.tight_layout()


def plot_stackplot(fractions_dict, metallicities, colors):
    """
    Plots a stackplot showing fractions of stable mass transfer and common envelope systems.
    """    
    fig, axes = plt.subplots(1, 4, figsize=(6*4, 8), sharex=True, sharey=True, 
                             gridspec_kw={"hspace": 0, "wspace": 0})
    
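    # "A05" encodes alpha = 0.5: float("05") would give 5.0, so handle the leading zero explicitly
    sorted_alpha = sorted(fractions_dict,
                          key=lambda x: float("0." + x[2:]) if x[1] == "0" else float(x[1:]))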
    
    for i,alpha in enumerate(sorted_alpha):
        axes[i].stackplot(
            metallicities,
            fractions_dict[alpha][0],
            fractions_dict[alpha][1],
            labels=["Stable Mass Transfer", "Common Envelope"],
            colors=colors,
            edgecolor="k",
            alpha=0.8
        )    
        axes[i].tick_params(which="major", direction="in", length=major_ticks_length, labelsize=label_size+2)
        axes[i].tick_params(which="minor", direction="in", length=minor_ticks_length)
        
        if i == 0:
            axes[i].legend(loc="upper right", fontsize=label_size)
            axes[i].set_ylabel("Fraction", fontsize=label_size+7, labelpad=20)

        axes[i].set_xlim(min(metallicities), max(metallicities))
        axes[i].set_ylim(0,1.)
        axes[i].set_xscale("log")
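        # show "A05" as 0.5 instead of "05"
        alpha_label = "0." + alpha[2:] if alpha[1] == "0" else alpha[1:]
        axes[i].set_title(f"$\\alpha$ = {alpha_label}", fontsize=label_size+3)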
        axes[i].grid(linestyle="--", which="both", linewidth=1, color="dimgrey")
        # axes[i].set_xlabel("Metallicity", fontsize=label_size)
        # plt.ylabel("Fraction", fontsize=label_size)
        # plt.title(f"CE vs Stable Mass Transfer Fractions ($\\alpha={alpha})$", fontsize=label_size+2)
        
    fig.supxlabel("Metallicity", fontsize=label_size+7, y=0.01)
    fig.suptitle("Fraction of Systems with SMT and CE", fontsize=label_size+10, y=1)
    
    plt.tight_layout()
    plt.savefig("images/stackplot.png", bbox_inches='tight')
    plt.close(fig)

## **Descriptions of the Functions**

### **`plot_histogram` Function**
The function `plot_histogram` generates histograms for two sets of data, representing different evolutionary scenarios (SMT or CE) for different metallicities. 

#### **How it works:**
- It takes, as input, two datasets (`data_1` and `data_2`), a list of corresponding metallicities, and optional parameters such as the number of bins, x-axis limits, labels, and a title.
- The function uses the Plasma colormap to assign a color to different values of metallicity. Lower metallicities are represented with darker colors, and higher metallicities with lighter ones.
- If `xlim` is provided, it defines the range of bins for the histogram.
- The function iterates through each dataset and plots step-style histograms for both data sets on the given axes.
- On the y-axis, a logarithmic scale is applied to better represent the distributions, since they cover several orders of magnitude.
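
The reason for fixing the bin edges from `xlim` is that every histogram in a figure then shares the same bins and can be compared directly. Below is a minimal sketch of that idea with NumPy only; the samples are random stand-ins, and the variable names are not taken from the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up samples standing in for one property in the two classes
smt_values = rng.uniform(10, 160, size=500)
ce_values = rng.uniform(10, 160, size=800)

n_bins, xlim = 50, (10, 160)
# n_bins bins need n_bins + 1 edges
edges = np.linspace(xlim[0], xlim[1], n_bins + 1)

# Both samples are counted on the same edges
smt_counts, _ = np.histogram(smt_values, bins=edges)
ce_counts, _ = np.histogram(ce_values, bins=edges)
print(len(edges), smt_counts.sum(), ce_counts.sum())
```

Passing the same `edges` array to `hist` for both rows of a figure gives bin-by-bin comparable counts.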


### **`plot_stackplot` Function**
The `plot_stackplot` function creates a stack plot that shows the fractions of systems that evolve through Stable Mass Transfer (SMT) and Common Envelope (CE) as a function of metallicity.

#### **How it works:**
- It takes as input a dictionary (`fractions_dict`) containing the fractions of systems for different $\alpha$ values, a list of metallicities, and a color palette.
- A figure with four subplots is generated, where each subplot represents a different value of the $\alpha$ parameter.
- For each value of $\alpha$, the function:
  - Plots a stackplot where the fractions of SMT and CE systems are stacked and indicated with different colors.
  - Adds a legend labelling the SMT and CE components (in the first subplot only).
  - Applies a logarithmic scale to the x-axis, which represents the metallicity.
- The final figure is saved as an image file (`images/stackplot.png`).
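
The expected shape of `fractions_dict` can be sketched as follows: each folder name maps to a pair of lists, SMT fractions and CE fractions, with one entry per metallicity. The counts are made up, and the helper `alpha_value` is a hypothetical way to turn folder names into numbers, assuming that a leading zero stands for a decimal point (so `A05` means 0.5).

```python
def alpha_value(folder):
    """Map a folder name such as 'A05' or 'A3' to its alpha value (assumed naming rule)."""
    digits = folder[1:]
    # A leading zero is read as '0.x', so 'A05' -> 0.5 rather than 5
    if len(digits) > 1 and digits.startswith("0"):
        return float("0." + digits[1:])
    return float(digits)

# Made-up counts per metallicity: (SMT, CE)
counts = {"A5": [(30, 70), (60, 40)], "A05": [(10, 90), (20, 80)]}

fractions_dict = {}
for folder, pairs in counts.items():
    smt = [s / (s + c) for s, c in pairs]
    ce = [c / (s + c) for s, c in pairs]
    fractions_dict[folder] = [smt, ce]

ordered = sorted(fractions_dict, key=alpha_value)
print(ordered)
```

Because the two fractions sum to 1 at each metallicity, the stacked areas always fill the panel up to 1.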



In [6]:
properties_alpha = {}
fractions_alpha = {}
list_metallicites = []

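def alpha_value(name):
    # "A05" encodes alpha = 0.5: float("05") would give 5.0, so handle the leading zero explicitly
    return float("0." + name[2:]) if name[1] == "0" else float(name[1:])

alpha_sorted = sorted(data.keys(), key=alpha_value)
for a in alpha_sorted:
    dict_df = data[a]
    alpha = alpha_value(a)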
    
    masses_smt = []
    masses_ce = []
    masses2_smt = []
    masses2_ce = []
    masses_rate_smt = []
    masses_rate_ce = []
    masses_BH_smt = []
    masses_BH_ce = []
    masses_rate_BH_smt = []
    masses_rate_BH_ce = []
    delay_time_smt = []
    delay_time_ce = []
    list_sma_smt =[]
    list_sma_ce = []
    list_e_smt =[]
    list_e_ce = []
    
    smt_fractions = []
    ce_fractions = []
    
    metallicities = sorted(dict_df.keys(), key=lambda x: float(x))
    list_metallicites = [float(m) for m in metallicities]
    for i,z in enumerate(metallicities):
        df = dict_df[z]
        
        # data selection
        df_smt = df[df["col.21:CE"]==0]
        df_ce = df[df["col.21:CE"]==1]
        smt_count = df_smt.shape[0]
        ce_count = df_ce.shape[0]
        total_count = smt_count + ce_count
        smt_fractions.append(smt_count/total_count)
        ce_fractions.append(ce_count/total_count)
        
        m1_smt = df_smt["col.1:m1ZAMS/Msun"]
        m1_ce = df_ce["col.1:m1ZAMS/Msun"]
        m2_smt = df_smt['col.2:m2ZAMS/Msun']
        m2_ce = df_ce['col.2:m2ZAMS/Msun']
        
        m1_BH_smt = df_smt["col.3:m1rem/Msun"]
        m1_BH_ce = df_ce["col.3:m1rem/Msun"]
        m2_BH_smt = df_smt['col.4:m2rem/Msun']
        m2_BH_ce = df_ce['col.4:m2rem/Msun']
        
        delay_smt = df_smt['col.6:delay_time/Myr']/1000
        delay_ce = df_ce['col.6:delay_time/Myr']/1000
        
        sma_smt = df_smt['col.7:sma/Rsun']
        sma_ce = df_ce['col.7:sma/Rsun']
        
        e_smt = df_smt['col.8:ecc']
        e_ce = df_ce['col.8:ecc']
        
        # mass data
        masses_smt.append(m1_smt)
        masses_ce.append(m1_ce)
        
        # mass 2 data
        masses2_smt.append(m2_smt)
        masses2_ce.append(m2_ce)
        
        # mass rate data
        masses_rate_smt.append(m2_smt/m1_smt)
        masses_rate_ce.append(m2_ce/m1_ce)
        
        # mass BH data
        masses_BH_smt.append(m1_BH_smt)
        masses_BH_ce.append(m1_BH_ce)
        
        # mass rate BH data
        masses_rate_BH_smt.append(m2_BH_smt/m1_BH_smt)
        masses_rate_BH_ce.append(m2_BH_ce/m1_BH_ce)
        
        # delay time data
        delay_time_smt.append(delay_smt)
        delay_time_ce.append(delay_ce)
        
        # semi-major axis data
        list_sma_smt.append(sma_smt)
        list_sma_ce.append(sma_ce)
        
        # eccentricity data
        list_e_smt.append(e_smt)
        list_e_ce.append(e_ce)
    
    properties_alpha[a] = [masses_smt, masses_ce, masses2_smt, masses2_ce, masses_rate_smt, masses_rate_ce, masses_BH_smt, 
                         masses_BH_ce, masses_rate_BH_smt , masses_rate_BH_ce , list_e_smt, list_e_ce]
    fractions_alpha[a] = [smt_fractions, ce_fractions]
    
xlabel_list = [
    "Primary Mass [M$_\\odot$]", 
    "Secondary Mass [M$_\\odot$]", 
    "Mass ratio [$M_2/M_1$]", 
    "Primary BH Mass [M$_\\odot$]", 
    "BH mass ratio [$M_2/M_1$]",
    "Eccentricity"]

titles_list = [
    "ZAMS Mass Distribution of Primary Stars",
    "ZAMS Mass Distribution of Secondary Stars",
    "ZAMS Mass Ratio of Stars",
    "Mass Distribution of Primary BHs",
    "Mass Ratio of BHs",
    "Orbital Eccentricity of BBHs"
]

bins_list=[50, 50, 50, 50, 50, 30]
xlimlist=[[10, 160], [-2, 162], [0.1, 1.05], [0,49.5], [0, 4.9], None]
maxlen=50

for k in range(0, len(properties_alpha["A1"]) - 1, 2):        
    fig, axes = plt.subplots(2, 4, figsize=(28, 12), sharex=True, sharey=True, 
                             gridspec_kw={"hspace": 0, "wspace": 0})

    for i, alpha in enumerate(list(properties_alpha.keys())):
        pr1, pr2 = properties_alpha[alpha][k], properties_alpha[alpha][k+1]
        for j in range(len(pr1)):
            if len(pr1[j])<maxlen:
                pr1[j]=[]
            if len(pr2[j])<maxlen:
                pr2[j]=[]
        plot_histogram([axes[0][i], axes[1][i]], pr1, pr2, metallicities, bins=bins_list[k//2], xlim=xlimlist[k//2])
        
        if i == 0:
            axes[0][i].set_ylabel("$N$ Systems\n (Stable Mass Transfer)", fontsize=label_size+7)
            axes[1][i].set_ylabel("$N$ Systems\n (Common Envelope)", fontsize=label_size+7)
        
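        # show "A05" as 0.5 instead of "05"
        alpha_label = "0." + alpha[2:] if alpha[1] == "0" else alpha[1:]
        axes[0][i].set_title(f"$\\alpha$ = {alpha_label}", fontsize=label_size+3)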

    fig.supxlabel(xlabel_list[k//2], fontsize=label_size+10, y=-0.04)  
    fig.suptitle(titles_list[k//2], fontsize=label_size+15, y=1.05)

   
    handles, labels = axes[0][-1].get_legend_handles_labels()
    fig.legend(handles, labels, loc="center right", fontsize=label_size, title="Metallicity", 
        title_fontsize=18, frameon=False, bbox_to_anchor=(0.98, 0.5))

    plt.subplots_adjust(right=0.88)

    fname = xlabel_list[k//2]
    plt.savefig(f"images/{fname.split(' [')[0]}.png", bbox_inches='tight')
    plt.close(fig)

plot_stackplot(fractions_alpha, list_metallicites, colors=["#298c8c", "#f1a226"])

## 4. Visualization  

### 4.1 Property Distributions  

To analyze the impact of different parameters on the evolution of binary systems, we compare the distribution of key properties across various values of α, considering different metallicities and evolutionary pathways (Stable Mass Transfer, SMT, and Common Envelope, CE).  

Each figure consists of two rows:  

- The first row represents systems that evolve through **Stable Mass Transfer (SMT)**.  
- The second row corresponds to systems undergoing **Common Envelope (CE) evolution**.  
- Each column illustrates a different value of the **α parameter**, which governs the efficiency of energy transfer during CE evolution.  

The histograms within each plot show the distribution of a specific property for different **metallicities**. Lower metallicities are represented by **darker colors**, while higher metallicities appear in **lighter shades**. Histograms with fewer than 50 systems are not drawn. High-metallicity systems are observed only in the common envelope (CE) row, and their number decreases as alpha increases.

Since the α parameter only affects **CE evolution**, the first row (SMT) is expected to remain consistent across different α values. This consistency allows for a direct comparison with the second row (CE), where variations due to α can be observed. 

---

<img src="images/Primary Mass.png">

The figure illustrates the distribution of the ZAMS masses of the primary star. As alpha increases, the number of common envelope (CE) systems with a high-mass primary decreases. When $\alpha \ge 3$, systems with a high primary star mass that evolve through stable mass transfer (SMT) become more numerous than those that undergo CE.

---

<img src="images/Secondary Mass.png">

The figure shows the distribution of the ZAMS masses of the secondary star. As alpha increases, the CE distribution tends to follow the SMT distribution, indicating that an efficient envelope ejection produces systems that resemble SMT binaries.

---

<img src="images/Mass ratio.png">

The figure shows the distribution of the ZAMS mass ratio, defined as the secondary star mass over the primary star mass. For stars with similar masses (mass ratio ≥ 0.6), no systems evolve through stable mass transfer (SMT). As alpha increases, the number of high-metallicity CE systems decreases.

---

<img src="images/Primary BH Mass.png">

The figure shows the distribution of the masses of the primary black holes. For alpha ≥ 3, fewer CE systems have a high primary black hole mass. Among CE systems, those with high metallicity form lower-mass black holes. The primary BH masses of SMT systems are uniformly distributed, with none below 10 M☉.

---

<img src="images/BH mass ratio.png">

The figure shows the distribution of the black hole mass ratio, defined as the mass of the black hole formed from the secondary star over the mass of the black hole formed from the primary star. Because the labels follow the ZAMS ordering, the ratio can exceed 1: the mass ratio tends to invert by the time the binary black hole (BBH) system is formed, especially for alpha ≥ 3. In low-metallicity common envelope (CE) systems, the ratio remains between 0 and 1. For stable mass transfer (SMT), the ratio stays small, which implies that the mass ordering of the system changes little. In contrast, CE systems experience significant mass loss, which can invert the ratio (M2 > M1).

---

<img src="images/Eccentricity.png">

This figure shows the distribution of the orbital eccentricity of binary black holes (BBHs). CE systems span the entire eccentricity range between 0 and 1, while SMT systems are limited to a narrower range (≤ 0.3). Some systems have an eccentricity equal to 1, which indicates that they are no longer bound.

---

### 4.2 Stackplot of Evolution vs Metallicity
Another way to compare the two classes is to plot the fraction of systems that evolved through SMT and CE as a function of metallicity. There is one panel for each value of alpha, and in each panel the two fractions sum to 1. Systems that underwent CE are shown in yellow and those that formed through SMT in green.

<img src="images/stackplot.png">

CE systems are the majority at most metallicities for every value of the alpha parameter. As alpha increases, however, the SMT fraction grows in the metallicity range between $10^{-3}$ and $10^{-2}$ and there exceeds the fraction of CE systems.