In [1]:
import numpy as np

**Objective Function:**
```
Obj(θ) = Σ_i L(y_i, ŷ_i) + Σ_k Ω(f_k)

Where:
- L: Loss function (e.g., squared error, log loss)
- y_i, ŷ_i: True label and predicted value for sample i
- Ω: Regularization term
- f_k: k-th tree
```

**Regularization Term:**
```
Ω(f) = γT + (1/2)λ||w||²

Where:
- T: Number of leaves
- w: Leaf weights
- γ: Penalty per leaf (gamma); equivalently, the minimum loss reduction required to make a split
- λ: L2 regularization (lambda)
```
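
To make the two formulas above concrete, here is a minimal sketch that evaluates Ω for one tree and the full objective for a toy ensemble. It assumes squared error as the loss and represents each tree only by its leaf weights; the helper names `omega` and `objective` and all sample numbers are hypothetical, chosen for illustration only.

```python
import numpy as np


def omega(leaf_weights, gamma=1.0, lambda_reg=1.0):
    """Complexity penalty for one tree: gamma * T + 0.5 * lambda * ||w||^2."""
    w = np.asarray(leaf_weights, dtype=float)
    num_leaves = w.size  # T in the formula
    return gamma * num_leaves + 0.5 * lambda_reg * np.sum(w ** 2)


def objective(y_true, y_pred, trees, gamma=1.0, lambda_reg=1.0):
    """Squared-error loss summed over samples plus omega summed over trees."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    loss = np.sum((y_true - y_pred) ** 2)  # assumed loss: squared error
    penalty = sum(omega(w, gamma, lambda_reg) for w in trees)
    return loss + penalty


# Hypothetical ensemble: each tree is represented only by its leaf weights.
trees = [[0.5, -0.5], [0.2, 0.1, -0.3]]
y_true = [1.0, 0.0, 1.0]
y_pred = [0.8, 0.1, 0.6]

print(f"Omega of first tree: {omega(trees[0]):.4f}")
print(f"Objective: {objective(y_true, y_pred, trees):.4f}")
```

Raising `gamma` penalizes trees with many leaves, while raising `lambda_reg` penalizes large leaf weights.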

**Tree Structure Score (Similarity Score):**
```
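Similarity = (Σ g_i)² / (Σ h_i + λ)

Where:
- g_i: First-order gradient of the loss with respect to ŷ_i
- h_i: Second-order gradient (Hessian) of the loss with respect to ŷ_i
- λ: L2 regularization
- Both sums run over the samples in the leaf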
```
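
The g_i and h_i values depend on the chosen loss. As a rough sketch, the two losses mentioned earlier give the closed forms below, assuming squared error is written as 0.5 * (y - ŷ)² and log loss is differentiated with respect to the raw score (logit) of a binary classifier. The function names are hypothetical.

```python
import numpy as np


def squared_error_grad_hess(y_true, y_pred):
    """g and h for L = 0.5 * (y - y_hat)^2, differentiated w.r.t. y_hat."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    grad = y_pred - y_true          # dL/dy_hat
    hess = np.ones_like(y_pred)     # d2L/dy_hat2 is constant
    return grad, hess


def log_loss_grad_hess(y_true, raw_score):
    """g and h for binary log loss, differentiated w.r.t. the raw score."""
    y_true = np.asarray(y_true, dtype=float)
    p = 1.0 / (1.0 + np.exp(-np.asarray(raw_score, dtype=float)))  # sigmoid
    grad = p - y_true
    hess = p * (1.0 - p)
    return grad, hess


# Illustrative values only.
y = np.array([1.0, 0.0, 1.0])
g_se, h_se = squared_error_grad_hess(y, np.array([0.8, 0.1, 0.6]))
g_ll, h_ll = log_loss_grad_hess(y, np.array([0.0, 0.0, 0.0]))

print("Squared error:", g_se, h_se)
print("Log loss:     ", g_ll, h_ll)
```

With squared error every Hessian equals 1, so Σ h_i in the similarity score is simply the number of samples in the leaf.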

**Gain (Split Quality):**
```
Gain = Similarity_left + Similarity_right - Similarity_parent - γ

Where:
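- γ: Minimum gain threshold; a split is kept only when Gain > 0
- Note: the derivation in the XGBoost paper scales the three similarity terms by 1/2; the constant is dropped here, which only rescales γ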
```
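
A minimal sketch of the gain formula as written above, evaluated for one candidate split. The helper names `similarity` and `split_gain`, the boolean `left_mask`, and the sample values are assumptions made for illustration.

```python
import numpy as np


def similarity(gradients, hessians, lambda_reg=1.0):
    """(sum of gradients)^2 / (sum of hessians + lambda) for one node."""
    return np.sum(gradients) ** 2 / (np.sum(hessians) + lambda_reg)


def split_gain(gradients, hessians, left_mask, gamma=0.0, lambda_reg=1.0):
    """Gain = Similarity_left + Similarity_right - Similarity_parent - gamma."""
    g = np.asarray(gradients, dtype=float)
    h = np.asarray(hessians, dtype=float)
    left = np.asarray(left_mask, dtype=bool)  # True -> sample goes left

    parent = similarity(g, h, lambda_reg)
    left_sim = similarity(g[left], h[left], lambda_reg)
    right_sim = similarity(g[~left], h[~left], lambda_reg)
    return left_sim + right_sim - parent - gamma


# Sample values; the split sends samples 0 and 2 left, 1 and 3 right.
g = np.array([0.5, -0.3, 0.8, -0.2])
h = np.array([0.25, 0.15, 0.4, 0.1])
left_mask = np.array([True, False, True, False])

gain = split_gain(g, h, left_mask, gamma=0.1)
print(f"Gain: {gain:.4f}")
```

This split separates positive from negative gradients, so the children score higher than the parent and the gain is positive. A split that mixes signs on both sides would score lower.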

In [2]:
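# Calculate similarity score for a leaf
def calculate_similarity(gradients, hessians, lambda_reg=1.0):
    """Return (sum of gradients)^2 / (sum of hessians + lambda_reg)."""
    sum_g = np.sum(gradients)  # Σ g_i over the samples in the leaf
    sum_h = np.sum(hessians)   # Σ h_i over the samples in the leaf
    similarity = (sum_g ** 2) / (sum_h + lambda_reg)
    return similarity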


gradients = np.array([0.5, -0.3, 0.8, -0.2])
hessians = np.array([0.25, 0.15, 0.4, 0.1])

similarity = calculate_similarity(gradients, hessians)
print(f"Similarity score: {similarity:.4f}")

Similarity score: 0.3368
