# **Experiment Notebook**



---
## 0. Setup Environment

### 0.b Disable Warnings Messages

In [1]:
# Do not modify this code
import warnings
warnings.simplefilter(action='ignore')

### 0.c Install Additional Packages

> If you are using additional packages, you need to install them here using the command: `! pip install <package_name>`

### 0.d Import Packages

In [2]:
# <Student to fill this section>
import pandas as pd
import altair as alt
from sklearn.metrics import confusion_matrix, classification_report
import numpy as np

---
## A. Project Description


In [3]:
def print_tile(size="h3", key=None, value=None):
    """
    Prints a formatted tile with a given size, key, and value.
    Args:
        size (str): HTML heading size, e.g., "h3".
        key (str): Unique identifier for the tile.
        value (str): Content to display in the tile.
    """
    from IPython.display import display, HTML

    html = f'<{size} id="{key}">{value}</{size}>'
    display(HTML(html))

In [4]:
# <Student to fill this section>
business_objective = """
The goal of this project is to develop a predictive model that classifies student performance into categories such as "Excellent," "Good," "Average," and "Poor," enabling early identification of at-risk students. The results will help universities allocate resources effectively, tailor interventions, and improve academic outcomes. Accurate predictions can lead to better student support and institutional success, while incorrect results risk misallocating resources, harming student experiences, and potentially damaging the institution's credibility.
"""

In [5]:
# Do not modify this code
print_tile(size="h3", key='business_objective', value=business_objective)

---
## B. Experiment Description

In [6]:
# <Student to fill this section>
experiment_hypothesis ="""
The hypothesis to test in this project is: **"Student performance can be accurately predicted using key academic, behavioral, and socio-economic features such as GPA, study hours, social media usage, and attendance rates."** The question seeks to determine how well these factors correlate with and influence academic success, enabling classification into performance categories like "Excellent," "Good," "Average," and "Poor."

This hypothesis is worthwhile because accurate predictions can allow institutions to identify at-risk students early, provide tailored interventions, and enhance academic outcomes. It also helps optimize resource allocation by focusing efforts on students who need the most support. By understanding the significant predictors of performance, institutions can make data-driven decisions, enhancing both individual student success and overall educational quality. This insight provides a robust foundation for sustainable improvements in academic strategies.
"""

In [7]:
# Do not modify this code
print_tile(size="h3", key='experiment_hypothesis', value=experiment_hypothesis)

In [8]:
# <Student to fill this section>
experiment_expectations =  """
The expected outcome of the experiment is to create a model that predicts student performance categories—"Excellent," "Good," "Average," "Poor."
### Possible Scenarios:
1. **Best Case**: The model's performance improves, reaching over 80% accuracy with balanced metrics across all classes, enabling effective interventions for students.
2. **Moderate Case**: Metrics remain skewed toward dominant classes ("Poor"), and minority classes like "Excellent" and "Good" have low recall, requiring adjustments in features or algorithms.
3. **Worst Case**: The model fails to generalize, achieving accuracy below 50%, leading to misallocated resources and diminished institutional trust.

These outcomes highlight the need for refining the model to improve predictions for minority classes while leveraging its strength in dominant categories.

"""

In [9]:
# Do not modify this code
print_tile(size="h3", key='experiment_expectations', value=experiment_expectations)

---
## C. Data Understanding

In [10]:
# Do not modify this code
# Load training data
try:
  X_train = pd.read_csv('../data/processed/X_train.csv')
  y_train = pd.read_csv('../data/processed/y_train.csv')

  X_val = pd.read_csv('../data/processed/X_val.csv')
  y_val = pd.read_csv('../data/processed/y_val.csv')

  X_test = pd.read_csv('../data/processed/X_test.csv')
  y_test = pd.read_csv('../data/processed/y_test.csv')
except Exception as e:
  print(e)

---
## D. Feature Selection


In [11]:
# <Student to fill this section>
train_data = pd.concat([X_train, y_train], axis=1)
correlation_matrix = train_data.corr()

correlation_with_y_train = correlation_matrix['target']

# Print the correlations
print(correlation_with_y_train)

student_id                       -0.416043
age                              -0.046171
hsc_year                          0.188994
current _semester                -0.184882
study_hours                       0.236444
social_media_hours               -0.443580
average_attendance               -0.242111
skills_development_hours         -0.048071
previous_gpa                      0.687628
current_gpa                      -0.542631
completed_credits                -0.254523
house_income                     -0.259291
gpa_consistency                   0.332170
social_media_impact               0.464983
income_academic_score            -0.261522
english_proficiency_encoded      -0.063948
birth_country_AU                 -0.076070
birth_country_BR                  0.007581
birth_country_CA                 -0.033834
birth_country_IE                  0.050013
birth_country_IN                  0.051579
birth_country_NZ                 -0.026911
birth_country_PH                  0.005309
birth_count

In [12]:
sorted_correlation = correlation_with_y_train.drop('target').abs().sort_values(ascending=False)

# Select the top 8 columns
top_8_columns = sorted_correlation.head(8).index
top_8_columns


Index(['previous_gpa', 'current_gpa', 'social_media_impact',
       'social_media_hours', 'student_id', 'gpa_consistency',
       'scholarship_Yes', 'scholarship_No'],
      dtype='object')
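
The ranking step above can be sketched in plain Python as a rough illustration: compute the Pearson correlation of each feature with the target, then sort by absolute value. The toy values below are made up, and excluding `student_id` before ranking is an assumption (an identifier is not expected to carry academic signal); the cells above do not do this.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Toy data (hypothetical values, not taken from the dataset)
target = [0, 1, 2, 3]
features = {
    "student_id": [10, 11, 12, 13],
    "previous_gpa": [1, 2, 3, 5],
    "social_media_hours": [4, 3, 2, 1],
    "age": [1, 0, 1, 0],
}

ID_COLUMNS = {"student_id"}  # assumption: identifiers are excluded from ranking

correlations = {
    name: pearson(values, target)
    for name, values in features.items()
    if name not in ID_COLUMNS
}

# Rank by strength of association, ignoring the sign
ranked = sorted(correlations, key=lambda name: abs(correlations[name]), reverse=True)
print(ranked)
```

Sorting on the absolute value keeps strongly negative predictors near the top, which is why a feature like `social_media_hours` can outrank positively correlated ones.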

In [13]:
# Get the list of columns in the X_train DataFrame
features_list = X_train.columns
features_list



Index(['student_id', 'age', 'hsc_year', 'current _semester', 'study_hours',
       'social_media_hours', 'average_attendance', 'skills_development_hours',
       'previous_gpa', 'current_gpa', 'completed_credits', 'house_income',
       'gpa_consistency', 'social_media_impact', 'income_academic_score',
       'english_proficiency_encoded', 'birth_country_AU', 'birth_country_BR',
       'birth_country_CA', 'birth_country_IE', 'birth_country_IN',
       'birth_country_NZ', 'birth_country_PH', 'birth_country_TH',
       'birth_country_US', 'birth_country_ZA', 'scholarship_No',
       'scholarship_Yes', 'university_transport_No',
       'university_transport_Yes', 'learning_mode_Offline',
       'learning_mode_Online', 'on_probation_No', 'on_probation_Yes',
       'is_suspended_No', 'is_suspended_Yes', 'relationship_Engaged',
       'relationship_In a relationship', 'relationship_Married',
       'relationship_Single'],
      dtype='object')

In [14]:
# <Student to fill this section>
feature_selection_explanations = """### Feature Selection Rationale
*Note: all numerical values are approximate and may change between runs due to randomness.*
#### **Selected Features**
The features chosen for the model—`study_hours`, `social_media_hours`, `previous_gpa`, `current_gpa`, and `on_probation_No`—have strong correlations with the target variable and high predictive relevance. For example:
- **`previous_gpa`**: With a correlation of 0.688, this feature is the strongest predictor of academic performance, reflecting prior achievements.
- **`current_gpa`**: This complements `previous_gpa` with a correlation of -0.543, offering insights into ongoing trends.
- **`social_media_hours`**: Correlated at -0.444, this helps capture behavioral patterns that may negatively impact academic outcomes.
- **`study_hours`**: Correlated at 0.236, this feature provides direct input on time invested in academics.
- **`on_probation_No`**: Adds categorical context related to academic status, enhancing predictive accuracy.

#### **Reasons for Removing Features**
Other features were excluded due to:
1. **Weak Correlations**:
   - Features like `age` (-0.046) and `skills_development_hours` (-0.048) show negligible relationships with the target variable, contributing little to the model's performance.

2. **High Cardinality and Redundancy**:
   - Features such as `birth_country` and `relationship_status` add complexity without significant predictive value.

3. **Potential Noise**:
   - Features like `house_income` (-0.259) and `average_attendance` (-0.242) are less directly connected to academic outcomes and could introduce unnecessary noise into the model.
"""

In [15]:
# Do not modify this code
print_tile(size="h3", key='feature_selection_explanations', value=feature_selection_explanations)

---
## E. Data Preparation

### E.1 Data Transformation Robust Scaler


In [16]:
# <Student to fill this section>

In [17]:
from sklearn.preprocessing import RobustScaler

# Initialize the Robust Scaler
scaler = RobustScaler()

# Apply scaling to X_train
X_train_robust = scaler.fit_transform(X_train)

# Convert back to DataFrame for readability
X_train_robust_df = pd.DataFrame(X_train_robust, columns=X_train.columns)

X_train=X_train_robust_df.copy()

In [18]:


# Apply scaling to X_test
X_test_robust = scaler.transform(X_test)

# Convert back to DataFrame for readability
X_test_robust_df = pd.DataFrame(X_test_robust, columns=X_test.columns)

print("\nRobustly Scaled X_test:\n", X_test_robust_df)
X_test=X_test_robust_df.copy()


Robustly Scaled X_test:
      student_id       age  hsc_year  current _semester  study_hours  \
0      0.335521 -0.757605  1.743181          -0.236997    -0.530365   
1      0.132840 -0.395187  0.300273          -0.236997     0.360523   
2      0.603412 -1.120024  1.743181          -0.236997    -0.975809   
3      0.596362  0.692068  1.021727          -0.236997     0.805967   
4      0.659810 -0.395187  0.300273          -0.236997     0.805967   
..          ...       ...       ...                ...          ...   
145   -0.290146 -1.120024  1.743181           1.449055     0.360523   
146   -0.596811 -0.395187  1.743181           1.449055    -0.530365   
147   -0.357119 -0.395187  1.021727           1.449055    -0.530365   
148   -0.468152 -1.120024  1.021727           1.449055    -0.530365   
149    0.094066  0.692068 -3.306997           1.665146    -0.975809   

     social_media_hours  average_attendance  skills_development_hours  \
0             -1.157416            0.419779     

In [19]:

# Apply scaling to X_val
X_val_robust = scaler.transform(X_val)

# Convert back to DataFrame for readability
X_val_robust_df = pd.DataFrame(X_val_robust, columns=X_val.columns)

print("\nRobustly Scaled X_val:\n", X_val_robust_df)
X_val=X_val_robust_df.copy()


Robustly Scaled X_val:
      student_id       age  hsc_year  current _semester  study_hours  \
0      0.355319 -0.557911  0.473041          -1.029949     0.019965   
1      0.356486 -0.920498  1.440363          -1.029949     0.540889   
2      0.169358  0.167262  0.473041          -1.029949    -1.021884   
3      0.128824 -0.920498  1.440363          -1.029949    -0.500960   
4      0.136113 -0.557911  1.440363          -1.029949     0.019965   
..          ...       ...       ...                ...          ...   
143   -0.300068  0.167262 -0.494280           0.663205    -1.021884   
144   -0.283079  0.892436  0.473041           0.663205    -0.500960   
145   -0.295767  0.529849 -0.494280           0.663205     0.019965   
146    0.433066 -0.195325 -1.461601           0.663205    -0.500960   
147    0.458420 -0.195325 -0.494280           0.663205    -0.500960   

     social_media_hours  average_attendance  skills_development_hours  \
0              1.784282           -0.591030      

In [20]:
# <Student to fill this section>
data_transformation_1_explanations = """Data transformation, such as scaling or normalization, is crucial for enhancing the dataset's usability and ensuring robust model performance. For example, `RobustScaler` mitigates the impact of outliers by centering each feature on its median and scaling it by its interquartile range (IQR), so a few extreme values barely affect the scaling parameters. This puts skewed features like `study_hours` and `social_media_hours` on a comparable scale without clipping them to a fixed range.

### **Importance**
1. **Improves Algorithm Efficiency**: Algorithms such as Logistic Regression rely on features being on a similar scale for optimal performance. Without scaling, features with larger ranges (e.g., `house_income`) can dominate learning, leading to biased models.

2. **Prepares for Robustness**: Features like `current_gpa`, impacted by extreme values, can influence decision boundaries adversely. Transformation controls for these issues.

3. **Addresses Variability in Units**: Since different features measure diverse aspects (e.g., `social_media_hours` in hours vs. `previous_gpa` as a score), scaling ensures uniformity across units.

---

### **Impacts**
- **Enhanced Model Performance**: Improves convergence speed and accuracy by making the data more suitable for algorithms.
- **Balanced Feature Contribution**: Ensures no single feature disproportionately impacts the model, yielding fair and balanced predictions.
- **Improved Generalization**: Scaled data allows the model to make accurate predictions on unseen data.


"""

In [21]:
# Do not modify this code
print_tile(size="h3", key='data_transformation_1_explanations', value=data_transformation_1_explanations)
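
To make the scaling rule concrete, here is a minimal pure-Python sketch of median and IQR scaling, which is what scikit-learn's `RobustScaler` does with its default settings. The function name `robust_scale` and the toy column are illustrative assumptions, not part of the notebook.

```python
from statistics import median, quantiles

def robust_scale(values):
    """Center on the median and divide by the interquartile range (IQR)."""
    # method="inclusive" interpolates linearly between data points
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    center = median(values)
    iqr = q3 - q1
    return [(v - center) / iqr for v in values]

# Toy column with one extreme value (hypothetical data)
hours = [1, 2, 3, 4, 100]
scaled = robust_scale(hours)
print(scaled)  # [-1.0, -0.5, 0.0, 0.5, 48.5]
```

The outlier stays far from the rest (48.5), so the output is not confined to a fixed range. What the outlier does not do is distort the median or the IQR, so the typical values keep a sensible spread.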

---
## F. Feature Engineering

### F.1 New Feature "feature selection"



In [22]:
# <Student to fill this section>

In [23]:
from lightgbm import LGBMClassifier
from sklearn.feature_selection import RFE
import pandas as pd

# Initialize the LGBMClassifier
lgbm_model = LGBMClassifier(
    boosting_type='gbdt',
    objective='multiclass',
    num_class=4,  # Number of classes
    random_state=42
)

# Train the LGBM model on X_train and y_train
lgbm_model.fit(X_train, y_train)

# Initialize RFE for feature selection
rfe = RFE(estimator=lgbm_model, n_features_to_select=10, step=1)

# Perform feature selection
X_selected = rfe.fit_transform(X_train, y_train)

# Convert the selected features back to a DataFrame
# Make sure the number of selected features matches the column names
selected_columns = [col for col, selected in zip(X_train.columns, rfe.support_) if selected]
X_train_selected = pd.DataFrame(X_selected, columns=selected_columns)

# Output the selected features for further processing
print("Selected Features:")
print(selected_columns)
X_train=X_train_selected.copy()

[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.001006 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1168
[LightGBM] [Info] Number of data points in the train set: 695, number of used features: 30
[LightGBM] [Info] Start training from score -0.732771
[LightGBM] [Info] Start training from score -1.419948
[LightGBM] [Info] Start training from score -1.553479
[LightGBM] [Info] Start training from score -2.715270
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000274 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1168
[LightGBM] [Info] Number of data points in the train set: 695, number of used features: 30
[LightGBM] [Info] Start training from score -0.732771
[LightGBM] [Info] Start trai

In [24]:

X_val_selected = X_val[selected_columns]

# Output the selected features for further processing
print("Selected Features:")
print(selected_columns)
X_val=X_val_selected.copy()

Selected Features:
['student_id', 'age', 'study_hours', 'social_media_hours', 'average_attendance', 'previous_gpa', 'current_gpa', 'completed_credits', 'gpa_consistency', 'social_media_impact']


In [25]:

X_test_selected = X_test[selected_columns]

# Output the selected features for further processing
print("Selected Features:")
print(selected_columns)
X_test=X_test_selected.copy()

Selected Features:
['student_id', 'age', 'study_hours', 'social_media_hours', 'average_attendance', 'previous_gpa', 'current_gpa', 'completed_credits', 'gpa_consistency', 'social_media_impact']
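
With `step=1`, recursive feature elimination refits the estimator on every round and drops the single least important feature, which is why the LightGBM training log above repeats. The loop can be sketched as follows; `recursive_elimination`, `toy_importance`, and the fixed importance values are hypothetical stand-ins for a refit model, not scikit-learn's implementation.

```python
def recursive_elimination(feature_names, importance_fn, n_keep):
    """Drop the least important feature one at a time until n_keep remain."""
    remaining = list(feature_names)
    while len(remaining) > n_keep:
        # Re-score on every round, mirroring how the estimator is refit
        importances = importance_fn(remaining)
        weakest = min(remaining, key=lambda name: importances[name])
        remaining.remove(weakest)
    return remaining

# Hypothetical fixed importances standing in for a refit model
TOY_IMPORTANCE = {
    "previous_gpa": 0.9,
    "current_gpa": 0.7,
    "age": 0.1,
    "hsc_year": 0.2,
}

def toy_importance(names):
    """Return the importance of each remaining feature."""
    return {name: TOY_IMPORTANCE[name] for name in names}

selected = recursive_elimination(TOY_IMPORTANCE, toy_importance, n_keep=2)
print(selected)  # ['previous_gpa', 'current_gpa']
```

In a real run the importances change between rounds because the model is retrained on fewer columns, so the elimination order is not always the same as a one-shot ranking.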


In [26]:
# <Student to fill this section>
feature_engineering_1_explanations =  """Creating the feature `social_media_impact` is crucial for understanding the relationship between social media usage and study hours. By quantifying how social media hours affect academic performance, this feature provides actionable insights for identifying students at risk due to poor time management or excessive social media consumption.

### **Why It's Important**
1. **Behavioral Insight**:
   - The feature highlights behavioral patterns that impact academic success, enabling targeted support for students struggling with productivity.

2. **Predictive Strength**:
   - With a correlation of 0.464983 to the target, it strengthens the model's ability to differentiate students based on their balance between social media use and study hours.

### **Impacts**
1. **Student Support**:
   - Accurate predictions guide interventions for students overly engaged in social media to improve their study habits.

2. **Resource Optimization**:
   - Institutions can use this feature to design programs promoting better time management and reduce academic underperformance.

3. **Improved Model Accuracy**:
   - Incorporating this feature refines the model's predictive ability by capturing critical behavioral aspects influencing performance.


"""

In [27]:
# Do not modify this code
print_tile(size="h3", key='feature_engineering_1_explanations', value=feature_engineering_1_explanations)

---
## G. Train Machine Learning Model

### G.1 Import Algorithm


In [28]:
# <Student to fill this section>
from lightgbm import LGBMClassifier

In [29]:
# <Student to fill this section>
algorithm_selection_explanations = """
### Why LGBMClassifier is a Good Fit

The `LGBMClassifier` from LightGBM is an excellent algorithm for this project due to its efficiency, flexibility, and ability to handle complex data. Here's why:

1. **Handles Class Imbalance**:
   - `LGBMClassifier` accepts a `class_weight` parameter (for example `class_weight='balanced'`) that helps counter the imbalance in class distributions and improve predictions for minority categories like "Excellent." The `scale_pos_weight` parameter serves the same purpose but applies to binary classification.

2. **Captures Complex Patterns**:
   - It excels at identifying nonlinear relationships and interactions between features, making it ideal for datasets with diverse features like `study_hours` and `previous_gpa`.

3. **Fast and Efficient**:
   - LightGBM is highly optimized for speed and resource use, enabling quicker model training without sacrificing accuracy, even for large datasets.

4. **Multiclass Support**:
   - It supports multiclass classification, fitting seamlessly with the goal of predicting categories like "Poor," "Average," "Good," and "Excellent."

5. **Feature Importance**:
   - Like other tree-based models, LightGBM provides insights into feature importance, which can be valuable for understanding and improving predictions.


"""

In [30]:
# Do not modify this code
print_tile(size="h3", key='algorithm_selection_explanations', value=algorithm_selection_explanations)
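
One common way to weight classes is the "balanced" heuristic, `n_samples / (n_classes * class_count)`, which gives rare classes proportionally larger weights. The sketch below computes it in plain Python; the label counts are hypothetical and the helper name `balanced_class_weights` is an assumption for illustration.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        cls: n_samples / (n_classes * count)
        for cls, count in sorted(counts.items())
    }

# Hypothetical imbalanced labels: class 0 dominates, class 3 is rare
labels = [0] * 6 + [1] * 3 + [2] * 2 + [3] * 1
weights = balanced_class_weights(labels)
print(weights)  # {0: 0.5, 1: 1.0, 2: 1.5, 3: 3.0}
```

A dictionary of this shape could be passed as `class_weight` to the classifier, though whether it actually improves minority-class recall on this dataset would need to be checked on the validation set.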

### G.2 Set Hyperparameters

In [31]:
# <Student to fill this section>
# Initialize the LightGBM classifier
lgbm_model = LGBMClassifier(
    boosting_type='gbdt',
    objective='multiclass',
    num_class=4,
    random_state=42
)


In [32]:
# <Student to fill this section>
hyperparameters_selection_explanations = """
### Explanation of Selected Hyperparameters for LGBMClassifier

1. **`boosting_type='gbdt'`**:
   - Gradient Boosted Decision Trees (GBDT) is a powerful boosting method that combines weak learners to create a strong predictive model. It's highly efficient for classification tasks, especially with multiclass objectives.

2. **`objective='multiclass'`**:
   - This ensures the model is optimized for multiclass classification, aligning with the project goal of predicting student performance categories like "Poor," "Average," "Good," and "Excellent."

3. **`num_class=4`**:
   - Specifies the number of distinct classes to predict, ensuring the model correctly identifies all performance categories.

4. **`random_state=42`**:
   - Guarantees reproducible results by controlling the randomness in data sampling and model training.

---
"""

In [33]:
# Do not modify this code
print_tile(size="h3", key='hyperparameters_selection_explanations', value=hyperparameters_selection_explanations)

### G.3 Fit Model

In [34]:
# <Student to fill this section>

# Fit the model to the training data
lgbm_model.fit(X_train, y_train)


# Predict the classes for the val data
y_pred = lgbm_model.predict(X_val)

# Evaluate the model
print("Confusion Matrix:")
print(confusion_matrix(y_val, y_pred))

print("\nClassification Report:")
print(classification_report(y_val, y_pred))


[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000425 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 822
[LightGBM] [Info] Number of data points in the train set: 695, number of used features: 10
[LightGBM] [Info] Start training from score -0.732771
[LightGBM] [Info] Start training from score -1.419948
[LightGBM] [Info] Start training from score -1.553479
[LightGBM] [Info] Start training from score -2.715270
Confusion Matrix:
[[77 11  0  0]
 [28 13  4  2]
 [ 1  7  3  0]
 [ 2  0  0  0]]

Classification Report:
              precision    recall  f1-score   support

         0.0       0.71      0.88      0.79        88
         1.0       0.42      0.28      0.33        47
         2.0       0.43      0.27      0.33        11
         3.0       0.00      0.00      0.00         2

    accuracy                           0.63       148
   macro avg       0.39      0.36      0.36       148
weighted avg   

### G.4 Model Technical Performance

In [35]:
# <Student to fill this section>
# Predict the classes for the test data
y_pred = lgbm_model.predict(X_test)

# Evaluate the model
print("Confusion Matrix:")
print(confusion_matrix(y_test, y_pred))

print("\nClassification Report:")
print(classification_report(y_test, y_pred))


Confusion Matrix:
[[59 11  0  0]
 [25 19  9  1]
 [ 3 12  3  0]
 [ 8  0  0  0]]

Classification Report:
              precision    recall  f1-score   support

         0.0       0.62      0.84      0.72        70
         1.0       0.45      0.35      0.40        54
         2.0       0.25      0.17      0.20        18
         3.0       0.00      0.00      0.00         8

    accuracy                           0.54       150
   macro avg       0.33      0.34      0.33       150
weighted avg       0.48      0.54      0.50       150



In [36]:
# <Student to fill this section>
model_performance_explanations = """
### Model Performance Explanation
The LightGBM model performs well for the "Poor" category with strong recall (84% on the test set), meeting the primary business goal. However, it struggles with minority classes like "Excellent," likely due to imbalanced data. Techniques like SMOTE or further tuning could improve generalization and accuracy.


### Key Observations
1. **Strengths**:
   - The model performs well for the "Poor" category, aligning with the business objective of identifying at-risk students.
   - Higher recall for "Poor" ensures most at-risk individuals are correctly identified.

2. **Weaknesses**:
   - Struggles with minority categories like "Excellent," likely due to class imbalance.
   - Test performance shows a drop in accuracy and precision compared to validation, indicating generalization issues.

---

"""

In [37]:
# Do not modify this code
print_tile(size="h3", key='model_performance_explanations', value=model_performance_explanations)
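
The figures in the classification report can be re-derived directly from the confusion matrix, which is a useful check when quoting metrics. The sketch below uses the test-set matrix printed in G.4 and assumes scikit-learn's convention that rows are true classes and columns are predicted classes.

```python
# Test-set confusion matrix from G.4 (rows = true class, columns = predicted)
confusion = [
    [59, 11, 0, 0],
    [25, 19, 9, 1],
    [3, 12, 3, 0],
    [8, 0, 0, 0],
]

n_classes = len(confusion)
total = sum(sum(row) for row in confusion)
correct = sum(confusion[i][i] for i in range(n_classes))

accuracy = correct / total
# Recall: correct predictions divided by the number of true members of the class
recall = [confusion[i][i] / sum(confusion[i]) for i in range(n_classes)]
column_totals = [sum(row[j] for row in confusion) for j in range(n_classes)]
# Precision is undefined when a class is never predicted; report 0.0 in that case
precision = [
    confusion[j][j] / column_totals[j] if column_totals[j] else 0.0
    for j in range(n_classes)
]
macro_recall = sum(recall) / n_classes

print(f"accuracy = {accuracy:.2f}")
print("recall    =", [round(r, 2) for r in recall])
print("precision =", [round(p, 2) for p in precision])
```

For class 0 this gives a recall of 59/70 and a precision of 59/95, while the last class has no correct predictions at all. The macro average therefore sits well below the overall accuracy.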

### G.5 Business Impact from Current Model Performance


In [38]:
# <Student to fill this section>
business_impacts_explanations = """The LightGBM model achieves reasonable performance for the "Poor" category (class "0.0"), meeting the objective of identifying at-risk students for intervention. High recall (84% on test) ensures most at-risk students are detected."""

In [39]:
# Do not modify this code
print_tile(size="h3", key='business_impacts_explanations', value=business_impacts_explanations)

## H. Experiment Outcomes

In [40]:
# <Student to fill this section>
experiment_outcome = "Hypothesis Partially Confirmed" # Either 'Hypothesis Confirmed', 'Hypothesis Partially Confirmed' or 'Hypothesis Rejected'

In [41]:
# Do not modify this code
print_tile(size="h2", key='experiment_outcomes_explanations', value=experiment_outcome)

In [42]:
# <Student to fill this section>
experiment_results_explanations = """### Reflection on Experiment Outcome

#### **Outcome**
The LightGBM model achieved reasonable success in meeting the core business objective of identifying at-risk students (class "Poor") with strong recall (84% on test data) and satisfactory precision, ensuring effective identification of these students for targeted interventions. However, it struggled with minority classes like "Excellent," likely due to imbalanced data.

#### **Insights Gained**
1. **Effective Prediction for the Core Objective**:
   - The model performs well for the "Poor" label, indicating that LightGBM is a good fit for prioritizing this category.

2. **Impact of Imbalanced Data**:
   - The poor performance for minority classes highlights the need to address class imbalances to improve overall fairness and prediction accuracy.

3. **Potential Generalization Issues**:
   - Accuracy drops from 63% on validation to 54% on test, which suggests the need for fine-tuning or additional preprocessing to ensure better generalization.

---

### **Rationale for Further Experimentation**
Further experimentation is justified to refine the model's robustness and improve its handling of imbalanced data. The priority is to strengthen performance on the "Poor" label; improving the secondary metrics for minority classes would further support institutional goals.

---

### **Potential Next Steps**
**1. Address Class Imbalance**
   - **Action**: Use techniques like SMOTE or class-weight adjustments.
   - **Expected Uplift**: Improve recall and F1-scores for minority classes like "Excellent."
   - **Priority**: **High**

**2. Hyperparameter Tuning**
   - **Action**: Optimize parameters like `learning_rate`, `num_leaves`, and `max_depth` to improve overall performance.
   - **Expected Uplift**: Higher accuracy and better generalization across classes.
   - **Priority**: **Medium**

**3. Experiment with Advanced Preprocessing**
   - **Action**: Engineer features specific to high-risk students (e.g., interaction terms or targeted transformations).
   - **Expected Uplift**: Boost predictive power for the "Poor" category.
   - **Priority**: **Medium**

---

### **Deployment Recommendation**
If the goal is solely identifying at-risk students and the current performance is satisfactory, the model is ready for production. Deployment steps include:
1. **Monitoring**: Establish a framework to monitor and track predictions over time.
2. **Stakeholder Training**: Ensure users understand the model's strengths and limitations for accurate implementation of interventions.
3. **Documentation**: Outline the model’s objectives, key performance metrics, and areas for potential future refinement.

"""

In [43]:
# Do not modify this code
print_tile(size="h2", key='experiment_results_explanations', value=experiment_results_explanations)
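
As a starting point for the hyperparameter tuning step listed above, the candidate settings can be enumerated as a simple grid. The values below are hypothetical placeholders rather than recommended settings; `max_depth=-1` is LightGBM's convention for no depth limit.

```python
from itertools import product

# Hypothetical candidate values; sensible ranges depend on the data
param_grid = {
    "learning_rate": [0.01, 0.05, 0.1],
    "num_leaves": [15, 31],
    "max_depth": [-1, 6],
}

# Build one dict of keyword arguments per combination
names = list(param_grid)
candidates = [dict(zip(names, combo)) for combo in product(*param_grid.values())]

print(len(candidates))  # 12
print(candidates[0])    # {'learning_rate': 0.01, 'num_leaves': 15, 'max_depth': -1}
```

Each dictionary could then be unpacked into `LGBMClassifier(**params)` and scored on the validation set. A class-sensitive metric such as macro F1 may be a better selection criterion than accuracy here, given the imbalance.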