# Boost up AI 2025: Drug Discovery Competition

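Topic: Developing a model that predicts inhibition of CYP3A4, an enzyme involved in drug metabolism in the human body   
Goal: Build a regression model that predicts the CYP3A4 inhibition rate (% Inhibition) from a compound's Canonical SMILES, train it on 1,681 compounds, and submit predicted inhibition rates for the test data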

Evaluation metric:
- ① Normalized RMSE
- ② Pearson Correlation Coefficient
- Score = 0.5 x (1 - min(①, 1)) + 0.5 x ②
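
As a quick worked example of the formula above, the sketch below computes the score for three made-up compounds. It assumes that NRMSE is the RMSE divided by the range of the true values and that a negative Pearson correlation is clipped to 0, which is how the scoring cell later in this notebook implements it. The sample values and the function name `toy_score` are illustrative only.

```python
import math

def toy_score(y_true, y_pred):
    """Score = 0.5 * (1 - min(NRMSE, 1)) + 0.5 * Pearson (toy re-implementation)."""
    n = len(y_true)
    # (1) RMSE normalized by the range of the true values (assumed normalization)
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
    nrmse = rmse / (max(y_true) - min(y_true))
    # (2) Pearson correlation, clipped to [0, 1]
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    pearson = min(max(cov / math.sqrt(var_t * var_p), 0.0), 1.0)
    return 0.5 * (1 - min(nrmse, 1)) + 0.5 * pearson

# Made-up inhibition values (%): predictions are compressed toward the middle
y_true = [0.0, 50.0, 100.0]
y_pred = [10.0, 50.0, 90.0]
print(round(toy_score(y_true, y_pred), 4))  # 0.9592
```

Here the predictions are perfectly correlated with the targets (Pearson = 1), so the only penalty comes from the NRMSE term of about 0.082.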

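Techniques used
- ① Molecular Fingerprints
    - Represent a compound's molecular structure as a binary vector
    - Encode structural similarity between molecules numerically so that a machine learning model can learn compound features effectively
    - Morgan Fingerprint (ECFP): a circular fingerprint that captures the environment around each atom
    - MACCS Keys: a key set that flags the presence or absence of 166 predefined substructures

- ② 2D Chemical Descriptors
    - Chemical property values computed with RDKit (e.g., Molecular Weight, LogP (lipophilicity), TPSA (topological polar surface area), number of hydrogen bond donors/acceptors)
    - Supply physicochemical properties that affect the inhibition rate (solubility, polarity, size, etc.) as complementary information to improve model performance

- ③ Mol2Vec Embedding
    - Treats a compound's substructures as words and vectorizes them with the Word2Vec approach
    - Learns continuous, semantic similarity that fingerprints lack, giving a richer representation of structural information

- ④ Ensemble Learning
    - Predict by averaging the predictions of several models
    - Used to offset the biases of different models and lower the risk of overfitting; stability is assessed with OOF predictions generated by K-Fold (OOF, Out-of-Fold: the collection of predictions each model makes on the fold it was not trained on)
    - Ensemble of LightGBM + XGBoost + CatBoost models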
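
To make "structural similarity as a number" concrete, here is a minimal sketch of the Tanimoto coefficient, a commonly used similarity measure for binary fingerprints. It is not part of this notebook's pipeline. The 8-bit vectors are made up for illustration (the Morgan fingerprints used here have 2048 bits), and `tanimoto` is a hypothetical helper.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared on-bits / on-bits set in either vector."""
    both = sum(1 for a, b in zip(fp_a, fp_b) if a and b)
    either = sum(1 for a, b in zip(fp_a, fp_b) if a or b)
    return both / either if either else 0.0

# Two invented 8-bit fingerprints: 3 shared on-bits, 5 on-bits in total
fp_1 = [1, 1, 0, 0, 1, 0, 1, 0]
fp_2 = [1, 0, 0, 0, 1, 0, 1, 1]
print(tanimoto(fp_1, fp_2))  # 0.6
```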

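The **rdkit** library, which can parse and draw chemical structures in Python, is installed and used here.   
Molecules are analyzed through SMILES (a notation that represents a molecule as a string).   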

[Pretrained model]
- Model: Mol2Vec 300dim pretrained embedding
- Source URL: https://github.com/samoturk/mol2vec
- Model download: https://www.kaggle.com/datasets/christang0002/mol2vec-and-protvec?resource=download
- License: MIT License
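
The Mol2Vec idea described above can be sketched without the real model: each substructure identifier is looked up in an embedding table and the vectors are summed into one molecule vector, with an `UNK` vector standing in for identifiers the model has never seen. The 3-dimensional table, the identifiers, and the helper `embed_sentence` below are all made up for illustration. The pretrained model uses 300 dimensions.

```python
# Toy embedding table: substructure identifier -> vector (values are invented)
toy_vectors = {
    "2246728737": [0.5, 0.0, 1.0],
    "864662311": [1.0, 2.0, 0.0],
    "UNK": [0.25, 0.25, 0.25],
}

def embed_sentence(sentence, vectors, unseen="UNK"):
    """Sum the vectors of all identifiers; unknown ones fall back to `unseen`."""
    dim = len(vectors[unseen])
    total = [0.0] * dim
    for word in sentence:
        vec = vectors.get(word, vectors[unseen])
        total = [t + v for t, v in zip(total, vec)]
    return total

# A molecule "sentence": its substructure identifiers, one of them unseen
sentence = ["2246728737", "864662311", "999"]
print(embed_sentence(sentence, toy_vectors))  # [1.75, 2.25, 1.25]
```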

### Library Import

In [11]:
import numpy as np
import pandas as pd
from rdkit import Chem
from rdkit.Chem import AllChem, Descriptors
from sklearn.preprocessing import StandardScaler
from lightgbm import LGBMRegressor
from sklearn.model_selection import KFold
import xgboost as xgb
from catboost import CatBoostRegressor
from gensim.models import Word2Vec
from mol2vec.features import MolSentence, mol2alt_sentence, sentences2vec
from sklearn.model_selection import train_test_split
from sklearn.decomposition import PCA
import lightgbm as lgb
from sklearn.metrics import mean_squared_error

### Load Data

In [12]:
train = pd.read_csv('train.csv')
test = pd.read_csv('test.csv')
submission = pd.read_csv('sample_submission.csv')

print(train.shape, test.shape, submission.shape)

(1681, 3) (100, 2) (100, 2)


### Feature Engineering

In [13]:
# Load the pretrained Mol2Vec model
model_mol2vec = Word2Vec.load('model_300dim.pkl')
print("Model vocabulary size:", len(model_mol2vec.wv))
print("Vector dimension:", model_mol2vec.vector_size)

Model vocabulary size: 21003
Vector dimension: 300


In [14]:
# Morgan fingerprint (radius 2, 2048 bits) as a list of 0/1 ints
def get_fingerprint(smiles):
    try:
        mol = Chem.MolFromSmiles(smiles)
        if mol is None:
            return [0] * 2048

        fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius=2, nBits=2048)
        return [int(x) for x in fp.ToBitString()]

    except Exception as e:
        print(f"Error: {e}")
        return [0] * 2048

# All RDKit 2D descriptors listed in Descriptors.descList
def get_2d_descriptors(smiles):
    try:
        mol = Chem.MolFromSmiles(smiles)
        if mol is None:
            return [0] * len(Descriptors.descList)
        return [desc(mol) for name, desc in Descriptors.descList]
    except Exception:
        return [0] * len(Descriptors.descList)

# Mol2Vec embedding (300 dims).
# Note: any failure here is silently mapped to a zero vector, so an error
# raised inside sentences2vec would go unnoticed.
def get_mol2vec_embedding(smiles, model):
    try:
        mol = Chem.MolFromSmiles(smiles)
        if mol is None:
            return [0]*300
        sentence = mol2alt_sentence(mol, radius=1)
        embedding = sentences2vec([sentence], model, unseen='UNK')
        # Return a plain list so that `fp + desc + mol2` below is list concatenation
        return list(embedding[0])
    except Exception:
        return [0]*300

print("Generating fingerprints...")
train['fingerprint'] = train['Canonical_Smiles'].apply(get_fingerprint)
test['fingerprint'] = test['Canonical_Smiles'].apply(get_fingerprint)

print("Generating 2D descriptors...")
train['desc'] = train['Canonical_Smiles'].apply(get_2d_descriptors)
test['desc'] = test['Canonical_Smiles'].apply(get_2d_descriptors)

print("Generating Mol2Vec embeddings...")
train['mol2vec'] = train['Canonical_Smiles'].apply(lambda x: get_mol2vec_embedding(x, model_mol2vec))
test['mol2vec'] = test['Canonical_Smiles'].apply(lambda x: get_mol2vec_embedding(x, model_mol2vec))


# Concatenate the three feature blocks per compound
X_train = np.array([
    fp + desc + mol2 for fp, desc, mol2 in zip(
        train['fingerprint'], train['desc'], train['mol2vec']
    )
])
X_test = np.array([
    fp + desc + mol2 for fp, desc, mol2 in zip(
        test['fingerprint'], test['desc'], test['mol2vec']
    )
])

y_train = train['Inhibition'].values

print("Final feature shape:", X_train.shape)
print("Test feature shape:", X_test.shape)


Generating fingerprints...
Generating 2D descriptors...
Generating Mol2Vec embeddings...
Final feature shape: (1681, 2565)
Test feature shape: (100, 2565)
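
Because each feature function falls back to an all-zero vector when parsing or featurization fails, it can be worth checking how many rows actually hit that fallback. The following is a minimal sketch with made-up rows; `zero_row_fraction` is a hypothetical helper, not part of the notebook. In the notebook it could be applied to a column such as `train['mol2vec']`.

```python
def zero_row_fraction(rows):
    """Fraction of feature rows whose entries are all zero."""
    rows = list(rows)
    if not rows:
        return 0.0
    zero_rows = sum(1 for row in rows if all(v == 0 for v in row))
    return zero_rows / len(rows)

toy_rows = [
    [0, 0, 0],        # fallback row (e.g., an unparsable SMILES)
    [0.1, -0.3, 0.0],
    [0, 1, 0],
    [0.0, 0.0, 0.0],  # another fallback row
]
print(zero_row_fraction(toy_rows))  # 0.5
```

A fraction close to 1.0 for one feature block would suggest that the block contributes no information and that the fallback path deserves a closer look.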


### Scaling & PCA & Feature Importance

**PCA** is not applied because it made no difference to the results.

In [22]:
# Feature Scaling
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# PCA
#pca = PCA(n_components=300, random_state=42)
#X_train_pca = pca.fit_transform(X_train_scaled)
#X_test_pca = pca.transform(X_test_scaled)

# Feature Importance
model = lgb.LGBMRegressor(
    n_estimators=1000,
    learning_rate=0.02,
    num_leaves=64,
    max_depth=7,
    subsample=0.8,
    colsample_bytree=0.8,
    reg_alpha=0.1,
    reg_lambda=1,
    random_state=42
)
model.fit(X_train_scaled, y_train)

# Sort features by importance (descending)
importances = model.feature_importances_
indices = np.argsort(importances)[::-1]

# Keep the top 300 features
# Note: the importances come from a model fit on the full training set,
# so the fold scores below are not fully independent of this selection.
top_indices = indices[:300]

# Select features
X_train_top = X_train_scaled[:, top_indices]
X_test_top = X_test_scaled[:, top_indices]

print("Selected feature shape:", X_train_top.shape)

[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.027412 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 25380
[LightGBM] [Info] Number of data points in the train set: 1681, number of used features: 960
[LightGBM] [Info] Start training from score 33.221831
Selected feature shape: (1681, 300)
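
The selection step boils down to sorting feature indices by importance and keeping the first k. A dependency-free sketch with invented importance values is shown here. `top_k_indices` and `select_columns` are hypothetical helpers, and ties are broken by the lower index, which may differ from what `np.argsort` does.

```python
def top_k_indices(importances, k):
    """Indices of the k largest importances, most important first."""
    order = sorted(range(len(importances)), key=lambda i: (-importances[i], i))
    return order[:k]

def select_columns(rows, indices):
    """Keep only the chosen feature columns, in the given order."""
    return [[row[i] for i in indices] for row in rows]

importances = [3, 0, 12, 7, 0]  # made-up split counts per feature
keep = top_k_indices(importances, k=2)
print(keep)  # [2, 3]
print(select_columns([[10, 11, 12, 13, 14]], keep))  # [[12, 13]]
```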


### KFold + LightGBM + XGBoost + CatBoost Ensemble

In [23]:
def normalized_rmse(y_true, y_pred):
    rmse = np.sqrt(mean_squared_error(y_true, y_pred))
    return rmse / (np.max(y_true) - np.min(y_true))

def pearson_correlation(y_true, y_pred):
    corr = np.corrcoef(y_true, y_pred)[0, 1]
    return np.clip(corr, 0, 1)

def competition_score(y_true, y_pred):
    nrmse = min(normalized_rmse(y_true, y_pred), 1)
    pearson = pearson_correlation(y_true, y_pred)
    return 0.5 * (1 - nrmse) + 0.5 * pearson


kf = KFold(n_splits=5, shuffle=True, random_state=42)

oof_preds_lgb = np.zeros(X_train_top.shape[0])
oof_preds_xgb = np.zeros(X_train_top.shape[0])
oof_preds_cat = np.zeros(X_train_top.shape[0])

test_preds_lgb = np.zeros(X_test_top.shape[0])
test_preds_xgb = np.zeros(X_test_top.shape[0])
test_preds_cat = np.zeros(X_test_top.shape[0])

fold_scores = []

for fold, (train_idx, val_idx) in enumerate(kf.split(X_train_top)):
    print(f"\n========== Fold {fold+1} ==========")
    
    X_tr, X_val = X_train_top[train_idx], X_train_top[val_idx]
    y_tr, y_val = y_train[train_idx], y_train[val_idx]
    
    # LightGBM
    model_lgb = lgb.LGBMRegressor(
        n_estimators=5000,
        learning_rate=0.01,
        num_leaves=64,
        max_depth=7,
        subsample=0.8,
        colsample_bytree=0.8,
        reg_alpha=0.1,
        reg_lambda=1,
        random_state=fold
    )
    callbacks = [
        lgb.early_stopping(stopping_rounds=100),
        lgb.log_evaluation(period=200)
    ]
    model_lgb.fit(
        X_tr, y_tr,
        eval_set=[(X_val, y_val)],
        callbacks=callbacks
    )
    oof_preds_lgb[val_idx] = model_lgb.predict(X_val)
    test_preds_lgb += model_lgb.predict(X_test_top) / kf.n_splits

    # XGBoost Native API
    dtrain = xgb.DMatrix(X_tr, label=y_tr)
    dval = xgb.DMatrix(X_val, label=y_val)
    dtest = xgb.DMatrix(X_test_top)

    params = {
        'objective': 'reg:squarederror',
        'eval_metric': 'rmse',
        'learning_rate': 0.01,
        'max_depth': 7,
        'subsample': 0.8,
        'colsample_bytree': 0.8,
        'reg_alpha': 0.1,
        'reg_lambda': 1,
        'seed': fold,
        'tree_method': 'hist'
    }

    watchlist = [(dtrain, 'train'), (dval, 'eval')]

    model_xgb = xgb.train(
        params,
        dtrain,
        num_boost_round=5000,
        evals=watchlist,
        early_stopping_rounds=100,
        verbose_eval=200
    )

    oof_preds_xgb[val_idx] = model_xgb.predict(dval)
    test_preds_xgb += model_xgb.predict(dtest) / kf.n_splits

    # CatBoost
    model_cat = CatBoostRegressor(
        iterations=5000,
        learning_rate=0.01,
        depth=7,
        subsample=0.8,
        l2_leaf_reg=3,
        random_seed=fold,
        verbose=200,
        early_stopping_rounds=100
    )
    model_cat.fit(
        X_tr, y_tr,
        eval_set=(X_val, y_val)
    )
    oof_preds_cat[val_idx] = model_cat.predict(X_val)
    test_preds_cat += model_cat.predict(X_test_top) / kf.n_splits

    # Ensemble (weighted average of the 3 models)
    oof_preds_avg = (0.4 * oof_preds_lgb[val_idx] + 
                     0.3 * oof_preds_xgb[val_idx] + 
                     0.3 * oof_preds_cat[val_idx])

    nrmse = normalized_rmse(y_val, oof_preds_avg)
    pearson = pearson_correlation(y_val, oof_preds_avg)
    score = competition_score(y_val, oof_preds_avg)
    fold_scores.append(score)
    
    print(f"Fold {fold+1} - NRMSE: {nrmse:.4f}, Pearson: {pearson:.4f}, Score: {score:.4f}")

# Overall OOF ensemble (same 0.4 / 0.3 / 0.3 weights as the per-fold blend)
oof_preds_final = 0.4 * oof_preds_lgb + 0.3 * oof_preds_xgb + 0.3 * oof_preds_cat

nrmse_total = normalized_rmse(y_train, oof_preds_final)
pearson_total = pearson_correlation(y_train, oof_preds_final)
score_total = competition_score(y_train, oof_preds_final)

print("\n========== Overall OOF Score (Ensemble) ==========")
print(f"NRMSE: {nrmse_total:.4f}")
print(f"Pearson: {pearson_total:.4f}")
print(f"Score: {score_total:.4f}")



[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.006042 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 22627
[LightGBM] [Info] Number of data points in the train set: 1344, number of used features: 272
[LightGBM] [Info] Start training from score 33.391242
Training until validation scores don't improve for 100 rounds
[200]	valid_0's l2: 533.906
[400]	valid_0's l2: 523.548
[600]	valid_0's l2: 517.91
[800]	valid_0's l2: 513.969
[1000]	valid_0's l2: 512.228
[1200]	valid_0's l2: 511.073
[1400]	valid_0's l2: 511.004
Early stopping, best iteration is:
[1353]	valid_0's l2: 510.604
[0]	train-rmse:26.28425	eval-rmse:26.31020




[200]	train-rmse:13.30029	eval-rmse:23.18992
[400]	train-rmse:8.75633	eval-rmse:22.93146
[600]	train-rmse:6.23264	eval-rmse:22.81869
[800]	train-rmse:4.42780	eval-rmse:22.75358
[1000]	train-rmse:3.17932	eval-rmse:22.70963
[1200]	train-rmse:2.27293	eval-rmse:22.69279
[1400]	train-rmse:1.60820	eval-rmse:22.68024
[1545]	train-rmse:1.24933	eval-rmse:22.67776
0:	learn: 26.3599947	test: 26.3425905	best: 26.3425905 (0)	total: 9.51ms	remaining: 47.5s
200:	learn: 21.8056898	test: 23.7407993	best: 23.7407993 (200)	total: 978ms	remaining: 23.3s
400:	learn: 19.6243807	test: 23.2562882	best: 23.2562882 (400)	total: 1.97s	remaining: 22.5s
600:	learn: 17.9946234	test: 22.9837508	best: 22.9837508 (600)	total: 2.97s	remaining: 21.8s
800:	learn: 16.4321092	test: 22.8232610	best: 22.8222870 (799)	total: 3.97s	remaining: 20.8s
1000:	learn: 14.8417352	test: 22.7423410	best: 22.7322926 (960)	total: 4.94s	remaining: 19.8s
1200:	learn: 13.3184792	test: 22.7000731	best: 22.6939365 (1138)	total: 5.9s	remaining:



[200]	train-rmse:13.39907	eval-rmse:23.04316
[400]	train-rmse:8.62310	eval-rmse:22.95449
[600]	train-rmse:6.12949	eval-rmse:22.87738
[717]	train-rmse:5.01691	eval-rmse:22.89558
0:	learn: 26.5884310	test: 25.3956126	best: 25.3956126 (0)	total: 5.54ms	remaining: 27.7s
200:	learn: 21.8052253	test: 23.4899352	best: 23.4899352 (200)	total: 979ms	remaining: 23.4s
400:	learn: 19.6933613	test: 23.1214362	best: 23.1209606 (399)	total: 1.97s	remaining: 22.5s
600:	learn: 18.0285113	test: 22.9658152	best: 22.9615620 (585)	total: 2.94s	remaining: 21.5s
800:	learn: 16.4529471	test: 22.8525545	best: 22.8517626 (799)	total: 3.89s	remaining: 20.4s
1000:	learn: 14.7653137	test: 22.7816788	best: 22.7799637 (975)	total: 4.85s	remaining: 19.4s
1200:	learn: 13.1471595	test: 22.7502645	best: 22.7465335 (1182)	total: 5.81s	remaining: 18.4s
Stopped by overfitting detector  (100 iterations wait)

bestTest = 22.74641732
bestIteration = 1226

Shrink model to first 1227 iterations.
Fold 2 - NRMSE: 0.2303, Pearson:



[200]	train-rmse:13.23077	eval-rmse:24.69615
[400]	train-rmse:8.74420	eval-rmse:24.49544
[600]	train-rmse:6.30814	eval-rmse:24.43737
[800]	train-rmse:4.58402	eval-rmse:24.41945
[806]	train-rmse:4.54308	eval-rmse:24.42013
0:	learn: 26.1535685	test: 27.1571830	best: 27.1571830 (0)	total: 5.53ms	remaining: 27.6s
200:	learn: 21.5736672	test: 24.9389577	best: 24.9389577 (200)	total: 1.06s	remaining: 25.4s
400:	learn: 19.4581374	test: 24.5438693	best: 24.5438693 (400)	total: 2.16s	remaining: 24.7s
600:	learn: 17.8709216	test: 24.3732841	best: 24.3732841 (600)	total: 3.35s	remaining: 24.5s
800:	learn: 16.3229133	test: 24.3406434	best: 24.3392025 (798)	total: 4.37s	remaining: 22.9s
1000:	learn: 14.7432538	test: 24.3288206	best: 24.3169872 (929)	total: 5.37s	remaining: 21.4s
Stopped by overfitting detector  (100 iterations wait)

bestTest = 24.31698718
bestIteration = 929

Shrink model to first 930 iterations.
Fold 3 - NRMSE: 0.2471, Pearson: 0.4497, Score: 0.6013

[LightGBM] [Info] Auto-choosi



[200]	train-rmse:13.10963	eval-rmse:23.30633
[400]	train-rmse:8.57612	eval-rmse:23.21615
[600]	train-rmse:6.13965	eval-rmse:23.12517
[800]	train-rmse:4.45362	eval-rmse:23.09459
[1000]	train-rmse:3.16918	eval-rmse:23.05560
[1122]	train-rmse:2.59254	eval-rmse:23.04821
0:	learn: 26.5514778	test: 25.5314408	best: 25.5314408 (0)	total: 7.77ms	remaining: 38.8s
200:	learn: 21.7573322	test: 23.5809739	best: 23.5777414 (199)	total: 1.06s	remaining: 25.4s
400:	learn: 19.5781127	test: 23.3276493	best: 23.3276493 (400)	total: 2.06s	remaining: 23.6s
600:	learn: 17.9572870	test: 23.1839337	best: 23.1839337 (600)	total: 3.04s	remaining: 22.3s
800:	learn: 16.3340870	test: 23.1167333	best: 23.1086623 (782)	total: 4.07s	remaining: 21.4s
Stopped by overfitting detector  (100 iterations wait)

bestTest = 23.10866233
bestIteration = 782

Shrink model to first 783 iterations.
Fold 4 - NRMSE: 0.2388, Pearson: 0.4394, Score: 0.6003

[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of tes



[200]	train-rmse:13.32878	eval-rmse:24.18544
[400]	train-rmse:8.75197	eval-rmse:23.81320
[594]	train-rmse:6.22716	eval-rmse:23.77598
0:	learn: 26.0950591	test: 27.4082279	best: 27.4082279 (0)	total: 8.98ms	remaining: 44.9s
200:	learn: 21.6408133	test: 24.9121279	best: 24.9121279 (200)	total: 1.06s	remaining: 25.3s
400:	learn: 19.5688666	test: 24.3424868	best: 24.3424868 (400)	total: 2.22s	remaining: 25.5s
600:	learn: 17.9933748	test: 24.0663037	best: 24.0663037 (600)	total: 3.2s	remaining: 23.4s
800:	learn: 16.4391376	test: 23.9068599	best: 23.9061710 (798)	total: 4.13s	remaining: 21.6s
1000:	learn: 14.7072430	test: 23.8326947	best: 23.8313075 (999)	total: 5.09s	remaining: 20.3s
1200:	learn: 13.0960863	test: 23.7947127	best: 23.7931086 (1137)	total: 6.1s	remaining: 19.3s
1400:	learn: 11.6695378	test: 23.7812737	best: 23.7644223 (1371)	total: 7.21s	remaining: 18.5s
Stopped by overfitting detector  (100 iterations wait)

bestTest = 23.7644223
bestIteration = 1371

Shrink model to first 1
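
The out-of-fold bookkeeping used in the loop above can be illustrated without any gradient-boosting library. In this sketch the "model" is a stand-in that just predicts the mean of its training targets, the folds are contiguous rather than shuffled, and all numbers are made up. `oof_mean_predictions` and `blend` are hypothetical helpers. The point is only that every sample receives a prediction from a model that never saw it, and that model outputs are then blended with fixed weights (0.4 / 0.3 / 0.3 in the notebook; the example uses 0.5 / 0.25 / 0.25 so the arithmetic is exact).

```python
def oof_mean_predictions(y, n_splits):
    """Out-of-fold predictions from a mean-of-training-targets stand-in model."""
    n = len(y)
    oof = [0.0] * n
    fold_size = n // n_splits
    for fold in range(n_splits):
        start = fold * fold_size
        stop = n if fold == n_splits - 1 else start + fold_size
        # "Train" on everything outside [start, stop)
        train_targets = [y[i] for i in range(n) if i < start or i >= stop]
        fold_pred = sum(train_targets) / len(train_targets)
        for i in range(start, stop):
            oof[i] = fold_pred  # predicted by a model that never saw sample i
    return oof

def blend(preds, weights):
    """Weighted average of several models' prediction lists."""
    return [sum(w * p[i] for w, p in zip(weights, preds)) for i in range(len(preds[0]))]

y = [10.0, 20.0, 30.0, 40.0]
print(oof_mean_predictions(y, n_splits=2))  # [35.0, 35.0, 15.0, 15.0]
print(blend([[10.0, 20.0], [20.0, 40.0], [30.0, 60.0]], [0.5, 0.25, 0.25]))  # [17.5, 35.0]
```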

### Submission

In [24]:
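# Weighted blend of the three models' test predictions (0.4 / 0.3 / 0.3, as in the CV loop)
submission['Inhibition'] = 0.4 * test_preds_lgb + 0.3 * test_preds_xgb + 0.3 * test_preds_cat
output_path = 'ensemble_submission_cat.csv'
submission.to_csv(output_path, index=False)
print(f"Submission file saved: {output_path}")

Submission file saved: ensemble_submission_cat.csv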


### 📄 Experiment Notes

**[ First Test ]**   
- Feature: Morgan Fingerprint + MACCS (500 top)
- top_indices: 500
- KFold + LightGBM + XGBoost (learning_rate = 0.01)

= Local Score: 0.6125 / NRMSE: 0.2358 / Pearson: 0.4607 

= **Public Score: 0.69314**

---

**[ Second Test ]**   
- Feature: Morgan Fingerprint + MACCS (500 top)
- top_indices: 1500
- KFold + LightGBM + XGBoost (learning_rate = 0.02)

= Local Score: 0.6032 / NRMSE: 0.2380 / Pearson: 0.4445

---

**[ Third Test ]**   
- Feature: Morgan Fingerprint + MACCS (500 top)
- top_indices: 500
- KFold + LightGBM + XGBoost + CatBoost (learning_rate = 0.01)

= Local Score: 0.6147 / NRMSE: 0.2352 / Pearson: 0.4646

---

**[ Fourth Test ]**
- Feature: Morgan Fingerprint + RDKit 2D Descriptors (500 top)
- top_indices: 500
- KFold + LightGBM + XGBoost + CatBoost (learning_rate = 0.01)

= Local Score: 0.6154 / NRMSE: 0.2351 / Pearson: 0.4659

---

**[ Fifth Test ]**
- Feature: Morgan Fingerprint + RDKit 2D Descriptors + Mol2Vec Embedding (500 top)
- top_indices: 500
- KFold + LightGBM + XGBoost + CatBoost (learning_rate = 0.01)

= Local Score: 0.6157 / NRMSE: 0.2350 / Pearson: 0.4663   

= **Public Score: 0.74983**

---

**[ Sixth Test ]**
- Feature: Morgan Fingerprint + RDKit 2D Descriptors + Mol2Vec Embedding (500 top)
- top_indices: 1000
- KFold + LightGBM + XGBoost + CatBoost (Ensemble Weight 0.5:0.2:0.3)

= Local Score: 0.6138 / NRMSE: 0.2355 / Pearson: 0.4632   

---

**[ Seventh Test ]**
- Feature: Morgan Fingerprint + RDKit 2D Descriptors + Mol2Vec Embedding (500 top)
- top_indices: 500
- KFold + LightGBM + XGBoost + CatBoost (Ensemble Weight 0.5:0.2:0.3)

= Local Score: 0.6157 / NRMSE: 0.2350 / Pearson: 0.4663   

---

**[ Eighth Test ]**
- Feature: Morgan Fingerprint + RDKit 2D Descriptors + Mol2Vec Embedding (500 top)
- top_indices: 500
- KFold + LightGBM + XGBoost + CatBoost (Ensemble Weight 0.5:0.2:0.3)
- Hyperparameter (LightGBM, XGBoost): Lr [0.01 > 0.015], max_depth [7 > 9], subsample, colsample_bytree [0.8 > 0.9]
- Hyperparameter (CatBoost): Lr [0.01 > 0.015], depth [7 > 9]

= Local Score: 0.6119 / NRMSE: 0.2359 / Pearson: 0.4597   

---

**[ Ninth Test ]**
- Feature: Morgan Fingerprint + RDKit 2D Descriptors + Mol2Vec Embedding (500 top)
- top_indices: 500
- KFold + LightGBM + XGBoost + CatBoost (Ensemble Weight 0.5:0.2:0.3)
- Hyperparameter (LightGBM): Lr [0.015 > 0.01], max_depth [9 > 7], subsample, colsample_bytree [0.9 > 0.8], num_leaves [64 > 128]
- Hyperparameter (XGBoost): Lr [0.015 > 0.01], max_depth [9 > 7], subsample, colsample_bytree [0.9 > 0.8]

= Local Score: 0.6147 / NRMSE: 0.2353 / Pearson: 0.4646 

---

**[ Tenth Test ]**
- Feature: Morgan Fingerprint + RDKit 2D Descriptors + Mol2Vec Embedding (500 top)
- top_indices: 300
- KFold + LightGBM + XGBoost + CatBoost (Ensemble Weight 0.4:0.3:0.3)
- Hyperparameter (LightGBM): num_leaves [128 > 64]

= Local Score: 0.6192 / NRMSE: 0.2342 / Pearson: 0.4725

= **Public Score: 0.66617**